8 Replies Latest reply on Feb 8, 2019 10:03 AM by BEDE

    How to display clob output in many columns

    Costa

      Hi,

      From the "Information" column, I need to split the data and copy it into the respective column (name type, age type etc..) till it encounters semicolon as special character or last word of the sentence

      I have around 1000 such rows. Can anyone help me how to present it like below

       

      with abc as

      (select 123 as ID, '[name type] : Kistrian Davis read; [Age type]: thirtyfive years old; working some where' as Information from dual union all

      select 124 as ID, '[name type] : Patt cummines; [Age type]: twoenty; [Gender type] Female; someone working 213' from dual union all

      select 125 as ID, 'Going anonymous [Gender type] : female; [Age type]: thirtyfive years' as Information from dual union all

      select 123 as ID, '[name type] : Stephan Hawk read; working some where' as Information from dual union all

      select 125 as ID, 'Visiting China this time [Age type] : fourtyseven; [Weight type]: thirtyfive but not sure your weight' as Information from dual

      )

      select * from abc

       

      My Output

       

             

      IDINFORMATIONname typeAge typegender typeweight typeCountry typeEating type
      123[name type] : Kistrian Davis read; [Age type]: thirtyfive years old; working some whereKistrian Davis readthirtyfive years old
      124[name type] : Patt cummines; [Age type]: twoenty; [Gender type] Female; someone working 213Patt cumminestwoentyFemale
      125Going anonymous [Gender type] : female; [Age type]: thirtyfive years thirtyfive yearsfemale
      123[name type] : Stephan Hawk read; working some whereStephan Hawk read
      125Visiting China this time [Age type] : fourtyseven; [Weight type]: thirtyfive but not sure your weight fourtyseven thirtyfive but not sure your weight
        • 1. Re: How to display clob output in many columns
          John Thorton

          You should stop violating First Normal Form by only having single value in any column.

          The data model is seriously flawed.

          If the data was properly normalized the solution would be trivial.

          • 2. Re: How to display clob output in many columns
            mathguy

            We don't know where the data comes from, in this format.

             

            If one received such data in this format, and wanted to normalize it (following your wise advice), they would have to answer EXACTLY the question the OP asked.

            • 3. Re: How to display clob output in many columns
              mathguy

              So - the inputs are CLOB. Can you assume that the individual fragments will always be no more than 4000 bytes, or do you need to anticipate that some will still be CLOB?

               

              Then - are the columns you must create (such as "name type" and "Age type") determined and known in advance, without looking at the data? For example, in your desired output you have "Country type" even though none of the rows have that fragment. The really difficult part would be the opposite: one of the rows has "Location type" but this was not expected; does your solution, then, have to CREATE a new COLUMN for the output? This would make the question much more complicated.

               

              Careless use of capitalization: Why "name type" but "Age type"? Do you need to anticipate the possibility that "name type" may sometimes be written as "Name type", sometimes as "Name Type" and also perhaps as "NAME TYPE"? (And perhaps sometimes with two spaces between the words, instead of one?)  Normally such consolidated data would follow standards (for example: column names are always lower-cap in the CLOB - or whatever other standard); no such standard seems to exist in your inputs. This is another issue that may make your problem harder to solve.

              • 4. Re: How to display clob output in many columns
                BluShadow

                Agree with mathguy, there's so much inconsistency in the format of the data it makes it a mess to handle.

                 

                Anyway, you can pull data out using regular expressions, something like...

                 

                 

                SQL> col information format a20
                SQL> col name_type format a25
                SQL> col age_type format a25
                SQL> col gender_type format a25
                SQL> col weight_type format a35
                SQL> col country_type format a25
                SQL>
                SQL> with abc as
                  2  (select 123 as ID, '[name type] : Kistrian Davis read; [Age type]: thirtyfive years old; working some where' as Information from dual union all
                  3  select 124 as ID, '[name type] : Patt cummines; [Age type]: twoenty; [Gender type] Female; someone working 213' from dual union all
                  4  select 125 as ID, 'Going anonymous [Gender type] : female; [Age type]: thirtyfive years' as Information from dual union all
                  5  select 123 as ID, '[name type] : Stephan Hawk read; working some where' as Information from dual union all
                  6  select 125 as ID, 'Visiting China this time [Age type] : fourtyseven; [Weight type]: thirtyfive but not sure your weight' as Information from dual
                  7  )
                  8  select abc.id
                  9        ,regexp_replace(information||';', '^.*\[name type\] ?:? ?(.+?);.*$|^.*$', '\1', 1, 1, 'i') as name_type
                10        ,regexp_replace(information||';', '^.*\[age type\] ?:? ?(.+?);.*$|^.*$', '\1', 1, 1, 'i') as age_type
                11        ,regexp_replace(information||';', '^.*\[gender type\] ?:? ?(.+?);.*$|^.*$', '\1', 1, 1, 'i') as gender_type
                12        ,regexp_replace(information||';', '^.*\[weight type\] ?:? ?(.+?);.*$|^.*$', '\1', 1, 1, 'i') as weight_type
                13        ,regexp_replace(information||';', '^.*\[country type\] ?:? ?(.+?);.*$|^.*$', '\1', 1, 1, 'i') as country_type
                14  from abc
                15  /

                        ID NAME_TYPE                 AGE_TYPE                  GENDER_TYPE               WEIGHT_TYPE                         COUNTRY_TYPE
                ---------- ------------------------- ------------------------- ------------------------- ----------------------------------- -------------------------
                       123 Kistrian Davis read       thirtyfive years old
                       124 Patt cummines             twoenty                   Female
                       125                           thirtyfive years          female
                       123 Stephan Hawk read
                       125                           fourtyseven                                         thirtyfive but not sure your weight

                 

                1 person found this helpful
                • 5. Re: How to display clob output in many columns
                  Costa

                  I apologize for not writing the content example perfectly.

                  The column "Information" is a free text field in database where users can insert any information in their own way irrespective of any standard format.

                  Thank you very much BlueShadow for providing the solution. If possible, can you please guide and explain me how you use the below expression to find the exact word.

                   

                  '^.*\[gender type\] ?:? ?(.+?);.*$|^.*$', '\1', 1, 1, 'i'

                   

                  My apology once again

                  • 6. Re: How to display clob output in many columns
                    mathguy

                    Costa wrote:

                     

                    The column "Information" is a free text field in database where users can insert any information in their own way irrespective of any standard format.

                     

                     

                    I don't understand. Are you saying that users could insert information in the format    ... (age type): 23 ....   (using parentheses instead of square brackets), or perhaps   <age type> "twenty-three" (no colon separator, and enclosing the value in double-quotes and the type in angled brackets)?

                     

                    If the answer is NO, then what you said is not true. There ARE some standard formats, and you must tell us what they are if you want us to help you with your code. If the answer is YES, then this is 100% crazy and I doubt that any solution is possible.

                     

                    You still didn't answer one of the crucial questions I asked you. Are the columns to be included in the output known ahead of time, before seeing the data? Or do you need to create columns on the fly, based on the actual data in each row?

                    • 7. Re: How to display clob output in many columns
                      BluShadow

                      As mathguy says, if it's completely free text, and the users can enter it manually, then you'll be luck to capture all possibilities.

                       

                       

                      In terms of that regular expression pattern it's looking for the following:

                       

                       

                      '^.*\[name type\] ?:? ?(.+?);.*$|^.*$', '\1', 1, 1, 'i')

                      ||||           ||    ||   ||| |||  |    |    \-------/

                      ||||           ||    ||   ||| |||  |    |       |

                      ||||           ||    ||   ||| |||  |    |       \-- "1,1,'i'" = from the start of the string, first occurrence, case insensitive

                      ||||           ||    ||   ||| |||  |    |

                      ||||           ||    ||   ||| |||  |    \-- "\1" = Replace the matched pattern with the string found in backreference 1 (Note:1)

                      ||||           ||    ||   ||| ||\--/

                      ||||           ||    ||   ||| || |

                      ||||           ||    ||   ||| || \-- "^.*$" = All Characters from the start to the end of the string

                      ||||           ||    ||   ||| ||

                      ||||           ||    ||   ||| |\-- "|" = OR, if the first pattern isn't matched

                      ||||           ||    ||   ||\-/

                      ||||           ||    ||   || |

                      ||||           ||    ||   || \-- ".*$" = any characters up to the end of the string ($)

                      ||||           ||    ||   ||

                      ||||           ||    ||   |\-- ";" = a semicolon character

                      ||||           ||    |\---/

                      ||||           ||    |  |

                      ||||           |\----/  \-- "(.+?)" = 1 or more characters, non greedy (?), the brackets indicate it should be backreferenced (backreference #1)

                      ||||           |   |

                      |||\-----------/   \-- " ?:? ?" = an optional space followed by an optional ":" followed by an optional space

                      |||     |

                      |\/     \-- "\[name type\]" = literally the string "[name type]"

                      | |

                      | \-- ".*" = any characters

                      |

                      \-- "^" = Start of String

                       

                      Note: 1 - if the pattern matches the left side of the OR then backreference 1 will contain the data you want, so the whole matched string is replace with that data

                               - if the pattern matches the right side of the OR then the backreference will be null, so the whole matched string is replaced with nothing

                       

                       

                      (The community platform has removed the space from the start of each explanation line, so everything looks offset by 1 character - can't do much about that)

                      1 person found this helpful
                      • 8. Re: How to display clob output in many columns
                        BEDE

                        One of the good examples how NOT to use a relational database.