Forum Stats

  • 3,827,508 Users
  • 2,260,787 Discussions
  • 7,897,281 Comments

Discussions

Inverse WM_CONCAT()

644238
644238 Member Posts: 5
edited Mar 12, 2009 12:31PM in SQL & PL/SQL
Having the following data:

{color:#0000ff}338033 2 531030 142062
338033 2 531010 142062
338033 2 530010 142062{color}

... I use WM_CONCAT() to get:

{color:#0000ff}338033 2 531030,531010,530010 142062{color}

Now what I am looking for is the way around -> Converting the 1 agregated record to the 3 initial records.

I thought I'll ask before starting to program some PL/SQL loops...

Thx,
Peter

Answers

  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 42,072 Red Diamond
    edited Mar 12, 2009 12:31PM
    Hi, Peter

    No need for PL/SQL. REGEXP_SUBSTR can eaily find the n-th item. Use a counter table to get the integers 1,2, ..., n.
    WITH	cntr	AS
    (
    	SELECT	LEVEL	AS n
    	FROM	dual
    	CONNECT BY	LEVEL <= (
    			      	 SELECT	MAX (REGEXP_COUNT ( concatenated_txt	-- See note below
    				 	    		  , ','
    							  )
    					    ) + 1
    				 FROM	table_x
    				 )
    )
    SELECT	REGEXP_SUBSTR ( concatenated_txt
    		      , '[^,]+'
    		      , 1
    		      , n
    		      )
    FROM	table_x
    JOIN	cntr	ON	n <= REGEXP_COUNT ( concatenated_txt
    		   			  , ','
    					  ) + 1
    ;
    REGEXP_COUNT was intrioduced in Oracle 11.
    An alternative way to find how many commas are in concatenated_txt, which works in any version of Oracle, is:
    LENGTH (concatenated_txt) - LENGTH ( REPLACE ( concatenated_txt
           			    	     	     , ','
    					     )
                                       )
    Edited by: Frank Kulash on Mar 12, 2009 9:19 AM
    Forgott "+1" in join condition
    Frank Kulash
  • Solomon Yakobson
    Solomon Yakobson Member Posts: 19,483 Red Diamond
    with t as (
               select '338033' c1,'2' c2,'531030,531010,530010' aggregated_c3,'142062' c4 from dual
              )
    select  c1,
            c2,
            rtrim(regexp_substr(aggregated_c3 || ',','[^,]*,',1,lvl),',') c3,
            c4
      from  t,
            (
             select  level lvl
               from  dual
               connect by level <= (select max(length(regexp_replace(aggregated_c3 || ',','[^,]'))) from t)
            )
      where lvl <= length(regexp_replace(aggregated_c3 || ',','[^,]'))
    /
    
    C1     C C3                    C4
    ------ - --------------------- ------
    338033 2 531030                142062
    338033 2 531010                142062
    338033 2 530010                142062
    
    SQL> 
    SY.
    Solomon Yakobson
  • Solomon Yakobson
    Solomon Yakobson Member Posts: 19,483 Red Diamond
    Hi Frank,

    If comma-separated list can have null elements solution with REGEXP_SUBSTR pattern '[^,]+' will produce right rows, but in wrong order. So it will not work for the task when we need to get elements and their sequence numbers or we need to get ordered elements.

    SY.
  • 666352
    666352 Member Posts: 1,442
    Solution with model.

    Regards Salim.
    WITH T AS
     (SELECT   '338033' c1,'2' c2,'531030,531010,530020' aggregated_c3,'142062' c4 from dual)
     SELECT   c1, c2, aggregated_c3, c4
       FROM   T
       MODEL
        RETURN UPDATED ROWS
       partition  by (c1)
        DIMENSION BY (0 POSITION)
        MEASURES     (c2,c4,aggregated_c3 ,NVL(LENGTH(REGEXP_REPLACE(aggregated_c3,'[^,]+','')),0)+1 NB_MOT)
        RULES
        (aggregated_c3[FOR POSITION FROM  1 TO NB_MOT[0] INCREMENT 1] =
         REGEXP_SUBSTR(aggregated_c3[0],'[^,]+',1,CV(POSITION)),
         c4[FOR POSITION FROM  1 TO NB_MOT[0] INCREMENT 1] =c4[0]     ,
         c2[FOR POSITION FROM  1 TO NB_MOT[0] INCREMENT 1] =c2[0])
    SQL> WITH T AS
      2   (SELECT   '338033' c1,'2' c2,'531030,531010,530020' aggregated_c3,'142062' c4 from dual)
      3   SELECT   c1, c2, aggregated_c3, c4
      4     FROM   T
      5     MODEL
      6      RETURN UPDATED ROWS
      7     partition  by (c1)
      8      DIMENSION BY (0 POSITION)
      9      MEASURES     (c2,c4,aggregated_c3 ,NVL(LENGTH(REGEXP_REPLACE(aggregated_c3,'[^,]+','')),0)+
    1 NB_MOT)
     10      RULES
     11      (aggregated_c3[FOR POSITION FROM  1 TO NB_MOT[0] INCREMENT 1] =
     12       REGEXP_SUBSTR(aggregated_c3[0],'[^,]+',1,CV(POSITION)),
     13       c4[FOR POSITION FROM  1 TO NB_MOT[0] INCREMENT 1] =c4[0]     ,
     14       c2[FOR POSITION FROM  1 TO NB_MOT[0] INCREMENT 1] =c2[0]);
    
    C1     C AGGREGATED_C3        C4
    ------ - -------------------- ------
    338033 2 531030               142062
    338033 2 531010               142062
    338033 2 530020               142062
    
    SQL> 
    666352
  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 42,072 Red Diamond
    edited Mar 12, 2009 12:27PM
    Hi,
    Solomon Yakobson wrote:
    Hi Frank,

    If comma-separated list can have null elements solution with REGEXP_SUBSTR pattern '[^,]+' will produce right rows, but in wrong order. So it will not work for the task when we need to get elements and their sequence numbers or we need to get ordered elements.

    SY.
    That's a good thing to keep in mind.
    REGEXEP_SUBSTR (txt, '[^,]+', 1, n) 
    is the n-th substring that does not contain a comma. It is not necessarily the substring that comes immediately before the n-th comma. The two will be different if the list contains consecutive commas.
    For example, what is the second item in the list 'Agatha,,,Lucy'?
    'Lucy' is the second item in the list, if it is delimited by one or more commas (that is, if you count ',,,' as a single delimiter). The formula I used is correct in this case.
    The second item is NULL if the list is delimited by exactly one comma. In this case, 'Lucy" is the 4th item. The forumula you posted is correct in this case.

    Peter (OP) did say that the concatenated string had been generated by WM_CONCAT, which only adds non-NULL items. Unless Peter meant the list was originally generated by WM_CONCAT, and then modified, then the distinction doesn't matter in this case.
  • MichaelS
    MichaelS Member Posts: 8,424 Bronze Crown
    Or with the help of a little xquery:
    SQL>  with t as 
    (
      select '338033' c1,'2' c2,'531030,531010,530010' aggregated_c3,'142062' c4 from dual
    )
    --
    --
    select c1, c2, c3 ,c4
      from t, 
           xmltable ('declare function local:tokenize($str, $delim)
                      {
                        if (substring-after ($str, $delim)) then 
                           (substring-before($str, $delim), local:tokenize(substring-after($str, $delim), $delim))
                        else 
                           ($str)
                       };
                       local:tokenize(A, ",")' passing xmlforest(aggregated_c3 a)
                       columns c3 varchar2(10) path '.')
    
    C1     C C3         C4    
    ------ - ---------- ------
    338033 2 531030     142062
    338033 2 531010     142062
    338033 2 530010     142062
    
    3 rows selected.
    MichaelS
  • 644238
    644238 Member Posts: 5
    Amazing guys! Many thanks!! I implemented the solution from Solomon. The other solutions would also work!
This discussion has been closed.