10 Replies Latest reply: Feb 9, 2013 2:32 PM by Stew Ashton RSS

    Need help in Converting Column Data in Rows.

    675884
      Hi All,

      create table TEST_UNPVT (C1 number, C2 varchar2(10));

      insert into TEST_UNPVT values(1, 'A');
      insert into TEST_UNPVT values(2, 'A|B');
      insert into TEST_UNPVT values(3, 'A|B|C');
      insert into TEST_UNPVT values(4, 'A|B|C|D');
      insert into TEST_UNPVT values(5, 'A|FG|HEF');
      insert into TEST_UNPVT values(6, 'AB|GF|KK');

      Note: there could be multiple Pipe delimeted data in a column. The data is not fixed.


      My requirement is as below:
      C1    C2
      1      A
      2      A
      2     B
      3     A
      3     B
      3     C
      4     A
      4     B
      4     C
      4     D
      5     A
      5     FG
      5     HEF
      6     AB
      6     GF
      6     KK
        • 1. Re: Need help in Converting Column Data in Rows.
          Etbin
          Take a look below {message:id=10737655}

          Regards

          Etbin
          • 2. Re: Need help in Converting Column Data in Rows.
            675884
            The data set that i'm refering to is in billions and the query refered in the above link works good for like say 1000-10000 records.
            The performance for the same has been tested by me.
            • 3. Re: Need help in Converting Column Data in Rows.
              Solomon Yakobson
              XML solution:
              select  c1,
                      x.c2
                from  test_unpvt,
                      xmltable(
                               '
                                for $c at $i in ora:tokenize($str,"\|")
                                where $i != 1
                                return $c
                               '
                               passing '|' || c2 as "str"
                               columns c2 varchar2(10) path '.'
                               ) x
              /
              
                      C1 C2
              ---------- ----------
                       1 A
                       2 A
                       2 B
                       3 A
                       3 B
                       3 C
                       4 A
                       4 B
                       4 C
                       4 D
                       5 A
              
                      C1 C2
              ---------- ----------
                       5 FG
                       5 HEF
                       6 AB
                       6 GF
                       6 KK
              
              16 rows selected.
              
              SQL>
              SY.
              • 4. Re: Need help in Converting Column Data in Rows.
                Solomon Yakobson
                user10440713 wrote:
                The data set that i'm refering to is in billions and the query refered in the above link works good for like say 1000-10000 records.
                Storing multiple values in one column indicates flawed design. Anyway, assuming there are no NULL elements in pipe delimited list (no consecutive pipe characters):
                with t1 as (
                            select  max(
                                        length(c2) - length(replace(c2,'|'))
                                       ) + 1 max_lvl
                              from  test_unpvt
                           ),
                     t2 as (
                            select  c1,
                                    substr(
                                           c2,
                                           instr(
                                                 '|' || c2,
                                                 '|',
                                                 1,
                                                 lvl
                                                ),
                                           instr(
                                                 c2 || '|',
                                                 '|',
                                                 1,
                                                 lvl
                                                )- instr(
                                                         '|' || c2,
                                                         '|',
                                                         1,
                                                         lvl
                                                         )
                                          ) c2
                              from  test_unpvt,
                                   (
                                    select  level lvl
                                      from  t1
                                      connect by level <= max_lvl
                                   )
                           )
                select  *
                  from  t2
                  where c2 is not null
                /
                
                        C1 C2
                ---------- ----------
                         1 A
                         2 A
                         3 A
                         4 A
                         5 A
                         6 AB
                         2 B
                         3 B
                         4 B
                         5 FG
                         6 GF
                
                        C1 C2
                ---------- ----------
                         3 C
                         4 C
                         5 HEF
                         6 KK
                         4 D
                
                16 rows selected.
                
                SQL>
                And if you know max number of elements in pipe delimited list upfront:
                with t as (
                           select  c1,
                                   substr(
                                          c2,
                                          instr(
                                                '|' || c2,
                                                '|',
                                                1,
                                                lvl
                                               ),
                                          instr(
                                                c2 || '|',
                                                '|',
                                                1,
                                                lvl
                                               )- instr(
                                                        '|' || c2,
                                                        '|',
                                                        1,
                                                        lvl
                                                        )
                                         ) c2
                             from  test_unpvt,
                                  (
                                   select  level lvl
                                     from  dual
                                     connect by level <= 4 -- max number of elements
                                  )
                          )
                select  *
                  from  t
                  where c2 is not null
                /
                with t as (
                           select  c1,
                                   substr(
                                          c2,
                                          instr(
                                                '|' || c2,
                                                '|',
                                                1,
                                                lvl
                                               ),
                                          instr(
                                                c2 || '|',
                                                '|',
                                                1,
                                                lvl
                                               )- instr(
                                                        '|' || c2,
                                                        '|',
                                                        1,
                                                        lvl
                                                        )
                                         ) c2
                             from  test_unpvt,
                                  (
                                   select  level lvl
                                     from  dual
                                     connect by level <= 4 -- max number of elements
                                  )
                          )
                select  *
                  from  t
                  where c2 is not null
                /
                
                        C1 C2
                ---------- ----------
                         1 A
                         2 A
                         3 A
                         4 A
                         5 A
                         6 AB
                         2 B
                         3 B
                         4 B
                         5 FG
                         6 GF
                
                        C1 C2
                ---------- ----------
                         3 C
                         4 C
                         5 HEF
                         6 KK
                         4 D
                
                16 rows selected.
                
                SQL>
                SY.
                • 5. Re: Need help in Converting Column Data in Rows.
                  ranit B
                  Try this..
                  Ranit>> select
                    2    c1, regexp_substr(c2,'[^|]+',1,level) cc
                    3    from
                    4    test_unpvt
                    5    connect by regexp_substr(c2,'[^|]+',1,level) is not null
                    6     and
                    7     c2 = prior c2
                    8     and prior sys_guid() is not null
                    9  ORDER BY c1,c2;
                  
                          C1 CC
                  ---------- ----------------------------------------
                           1 A
                           2 A
                           2 B
                           3 C
                           3 A
                           3 B
                           4 B
                           4 A
                           4 C
                           4 D
                           5 FG
                           5 HEF
                           5 A
                           6 AB
                           6 GF
                           6 KK
                  
                  16 rows selected.
                  • 6. Re: Need help in Converting Column Data in Rows.
                    675884
                    Hi SY,
                    Thanks for the solution. But there is one issue with the above query, the max length for a varchar2 column is 4000, so if my column length for c2 data is more then 4000 it will fail.

                    Getting the below error message :
                    ORA-00910: specified length too long for its datatype
                    00910. 00000 - "specified length too long for its datatype"
                    *Cause:    for datatypes CHAR and RAW, the length specified was > 2000;
                    otherwise, the length specified was > 4000.
                    *Action:   use a shorter length or switch to a datatype permitting a
                    longer length such as a VARCHAR2, LONG CHAR, or LONG RAW
                    Error at Line: 56 Column: 43
                    • 7. Re: Need help in Converting Column Data in Rows.
                      ranit B
                      create table TEST_UNPVT (C1 number, C2 varchar2(10));
                      Then why do you create a table like this???
                      Column 'C2' is varchar2 ,if it is more than 4000chars why did you gave a test table with varchar... should have given a CLOB

                      Please try to post your doubts properly. People spend time helping you and then you change your initial requirement.

                      Check this query to know the max length of data present -
                      ranit@XE11GR2>> select c1, max(length(c2)) over()  from test_unpvt;
                      
                              C1 MAX(LENGTH(C2))OVER()
                      ---------- ---------------------
                               1                     8
                               2                     8
                               3                     8
                               4                     8
                               5                     8
                               6                     8
                      
                      6 rows selected.
                      • 8. Re: Need help in Converting Column Data in Rows.
                        Etbin
                        I have no reason to not believe you, but it seems you're exchanging causes with consequences.
                        It's not that (as others have said already) the design is flawed, it just seems that no method seems to perform well enough according to some criteria of yours.
                        I don't mean you're to be blamed for that, but some actions are by the nature of things irreversible (the famous: [url http://www.phrases.org.uk/bulletin_board/20/messages/1094.html]on ne fait pas d omlette sans casser des oeufs used litterally, not in it's Machiavellistic context) and not taking much consideration (don't tell us no one knew there will be billions of rows to deal with) in the first place (at design time) results in having to live with it for now, unless ... , but the first step will almost always be someone has to admit/assess (s)he/someone else was wrong at the time.
                        I'm not trying to say the method indicated is the best either. There's simply no use crying over spilled milk.

                        Regards

                        Etbin
                        • 9. Re: Need help in Converting Column Data in Rows.
                          675884
                          Sorry, to not have posted the correct details.

                          The point is that we are doing data migration and the source system is having data in such a way that a column could be mulivalued with '|' (pipe) and could have multiple rows of data for same record. So, what we need here is to convert the data from rows to columns (also consider the piped data), while doing that we ran into issue with the Listagg function failed if the record value > 4000.

                          Now, we are figuring out another solution to overcome this scenerio.

                          Source Column length = 4000 Chars and the Target Column length = 150 Chars.
                          Give me some time to prepare some test data as could not share client specific information.

                          Sorry Again, and Thanks to all of you for spending your time on this. :)

                          Edited by: user10440713 on 09-Feb-2013 09:35
                          • 10. Re: Need help in Converting Column Data in Rows.
                            Stew Ashton
                            user10440713 wrote:
                            Source Column length = 4000 Chars and the Target Column length = 150 Chars.
                            There are at least three methods for doing this, using either XML, REGEXP functions or a combination of SUBSTR and INSTR.

                            The SUBSTR/INSTR method is definitely the most efficient. Since you have many records, I recommend that method.

                            The problem is that the common solution, used by Tom Kyte and others, works for a maximum length of 3998 bytes.

                            Here is a variant that works with input of 4000 bytes:
                            with indata as (
                              select 'A' rowlabel, 'A|2|3|4|5' x from dual
                              union all
                              select 'B' rowlabel, 'B|7|8|9|10' x from dual
                              union all
                              select 'C' rowlabel, rpad('C', 4000, 'C') x from dual
                              union all
                              select 'D' rowlabel, rpad('D', 2000, '1')  || '|' || rpad('D', 1998, '2')|| '|' x from dual
                              union all
                              select 'E' rowlabel, '|' || rpad('E', 3999, 'E') x from dual
                            )
                            select a.rowlabel, length(b.column_value) "Length", b.column_value from indata a,
                            table(cast(multiset(
                              SELECT trim(SUBSTR(
                                x,
                                (case level
                                  when 1 then 1 
                                  else instr(x, '|', 1, level-1) + 1 
                                end),
                                (case instr(x, '|', 1, level)
                                  when 0 then 4000
                                  else instr(x, '|', 1, level) - 1 -
                                    (case level
                                      when 1 then 0
                                      else instr(x, '|', 1, level-1) 
                                    end)
                                end)
                              ))
                              FROM dual
                              CONNECT BY level <= LENGTH(x) - LENGTH(REPLACE(x, '|', '')) + 1
                            ) as sys.odcivarchar2list)) b;
                            
                            ROWLABEL     Length COLUMN_VALUE                 
                            -------- ---------- ------------------------------
                            A                 1 A                              
                            A                 1 2                              
                            A                 1 3                              
                            A                 1 4                              
                            A                 1 5                              
                            B                 1 B                              
                            B                 1 7                              
                            B                 1 8                              
                            B                 1 9                              
                            B                 2 10                             
                            C              4000 CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC 
                            D              2000 D11111111111111111111111111111 
                            D              1998 D22222222222222222222222222222 
                            D                                                  
                            E                                                  
                            E              3999 EEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
                            Edited by: Stew Ashton on Feb 9, 2013 9:32 PM