1 2 Previous Next 21 Replies Latest reply: Sep 16, 2009 10:11 AM by 708890 RSS

    Alpha numeric sorting in sql

    708890
      Please help me in sorting the following data of a column in ascending or descending order :

      Data
      0B001.c
      0B001.c.1
      0B001.c.10
      0B001.c.11
      0B001.c.11.a
      0B001.c.11.b
      0B001.c.11.c
      0B001.c.11.d
      0B001.c.12
      0B001.c.13
      0B001.c.2
      0B001.c.3
      0B001.c.3.a
      0B001.c.3.b
      0B001.c.3.c
      0B001.c.4
      0B001.c.5
      0B001.c.6
      0B001.c.6.a
      0B001.c.6.b
      0B001.c.6.c
      0B001.c.7
      0B001.c.7.a
      0B001.c.7.b
      0B001.c.7.c
      0B001.c.8
      0B001.c.8.a
      0B001.c.8.b
      0B001.c.8.c
      0B001.c.9
      0B001.d
      0B001.d.1
      0B001.d.2
      0B001.d.3

      expected order:
      0B001.c
      0B001.c.1
      0B001.c.2
      0B001.c.3
      0B001.c.3.a
      0B001.c.3.b
      0B001.c.3.c
      0B001.c.4
      0B001.c.5
      0B001.c.6
      0B001.c.6.a
      0B001.c.6.b
      0B001.c.6.c
      0B001.c.7
      0B001.c.7.a
      0B001.c.7.b
      0B001.c.7.c
      0B001.c.8
      0B001.c.8.a
      0B001.c.8.b
      0B001.c.8.c
      0B001.c.9
      0B001.c.10
      0B001.c.11
      0B001.c.11.a
      0B001.c.11.b
      0B001.c.11.c
      0B001.c.11.d
      0B001.c.12
      0B001.c.13
      0B001.d
      0B001.d.1
      0B001.d.2
      0B001.d.3

      Thanks,
      Chandra
        • 1. Re: Alpha numeric sorting in sql
          Andy Tael-Oracle
          Check this out:

          http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:786026162431

          Tom Kyte usually have the answers.

          --Andy                                                                                                                                                                                                                                                                                                       
          • 2. Re: Alpha numeric sorting in sql
            666352
            Wrong post

            Edited by: Salim Chelabi on 2009-06-25 11:39
            • 3. Re: Alpha numeric sorting in sql
              Frank Kulash
              Hi, Chandra,

              Welcome to the forum!

              If you have a column txt that is a . delimited list, and the 3rd item in that list has to be sorted as a number (where 1<9<10), then:
              ORDER BY  REGEXP_SUBSTR ( txt               -- items 1 and 2 sort as strings
                                      , '([^.]+\.){2}'
                             )
              ,       TO_NUMBER ( REGEXP_SUBSTR ( txt     -- item 3 sorts as NUMBER
                                              , '[^.]+'
                                      , 1
                                      , 3
                                      )
                             )
              ,       REGEXP_REPLACE ( txt               -- items 4 an up sort as strings
                                       , '([^.]+\.){3}'
                              )
              Regular expressions require Oracle 10 (or higher).

              If txt happens to have 3 (or fewer) items, nothing bad happens. The last expression (or the last two) will simply return NULL, and the earlier expressions will sort correctly.

              In any version of Oracle, you can use INSTR to find where the dots are, and SUBSTR to extract the sections relative to them (for example, up to the 2nd dot).
              • 4. Re: Alpha numeric sorting in sql
                SanjayRs
                Chandra,
                with num as ( select '0B001.c' n from dual 
                union all select '0B001.c.1' from dual 
                union all select '0B001.c.10' from dual 
                union all select '0B001.c.11' from dual 
                union all select '0B001.c.11.a' from dual 
                union all select '0B001.c.11.b' from dual 
                union all select '0B001.c.11.c' from dual 
                union all select '0B001.c.11.d' from dual 
                union all select '0B001.c.12' from dual 
                union all select '0B001.c.13' from dual 
                union all select '0B001.c.2' from dual 
                union all select '0B001.c.3' from dual 
                union all select '0B001.c.3.a' from dual 
                union all select '0B001.c.3.b' from dual 
                union all select '0B001.c.3.c' from dual 
                union all select '0B001.c.4' from dual 
                union all select '0B001.c.5' from dual 
                union all select '0B001.c.6' from dual 
                union all select '0B001.c.6.a' from dual 
                union all select '0B001.c.6.b' from dual 
                union all select '0B001.c.6.c' from dual 
                union all select '0B001.c.7' from dual 
                union all select '0B001.c.7.a' from dual 
                union all select '0B001.c.7.b' from dual 
                union all select '0B001.c.7.c' from dual 
                union all select '0B001.c.8' from dual 
                union all select '0B001.c.8.a' from dual 
                union all select '0B001.c.8.b' from dual 
                union all select '0B001.c.8.c' from dual 
                union all select '0B001.c.9' from dual 
                union all select '0B001.d' from dual 
                union all select '0B001.d.1' from dual 
                union all select '0B001.d.2' from dual 
                union all select '0B001.d.3' from dual 
                )
                select n 
                from num
                order by regexp_substr(n,'.[a-z]'),
                nvl(to_number(replace(regexp_substr(n,'.[0-9]+',1,2),'.') ),0)
                /
                
                N
                ------------
                0B001.c
                0B001.c.1
                0B001.c.2
                0B001.c.3.c
                0B001.c.3.b
                0B001.c.3.a
                0B001.c.3
                0B001.c.4
                0B001.c.5
                0B001.c.6.c
                0B001.c.6
                0B001.c.6.a
                0B001.c.6.b
                0B001.c.7
                0B001.c.7.b
                0B001.c.7.c
                0B001.c.7.a
                0B001.c.8.c
                0B001.c.8
                0B001.c.8.a
                0B001.c.8.b
                0B001.c.9
                0B001.c.10
                0B001.c.11
                0B001.c.11.a
                0B001.c.11.b
                0B001.c.11.c
                0B001.c.11.d
                0B001.c.12
                0B001.c.13
                0B001.d
                0B001.d.1
                0B001.d.2
                0B001.d.3
                
                34 rows selected.
                SS
                • 5. Re: Alpha numeric sorting in sql
                  phani marella
                  "Chandra..sorry to deviate your post"

                  Hi Frank and sanjay..

                  I am trying to get this expression..
                  ORDER BY  REGEXP_SUBSTR ( txt          
                                          , '([^.]+\.){2}'               ---- Could you please breifly tell what does these symbols represent in this expression
                                 )
                  ,       TO_NUMBER ( REGEXP_SUBSTR ( txt     -- Could you please breifly tell what does these symbols represent in this expression
                                                  , '[^.]+'
                                          , 1
                                          , 3
                                          )
                                 )
                  ,       REGEXP_REPLACE ( txt               -- Could you please breifly tell what does these symbols represent in this expression
                                           , '([^.]+\.){3}'
                                  )
                  • 6. Re: Alpha numeric sorting in sql
                    Frank Kulash
                    Hi,
                    phani marella wrote:
                    "Chandra..sorry to deviate your post"
                    Actually, Chandra might be wondering the same thing; but it's thoughtful of you to mention that.

                    (1)
                    ORDER BY  REGEXP_SUBSTR ( txt          
                    , '([^.]+\.){2}'               ---- Could you please breifly tell what does these symbols represent in this expression
                                   )
                    [^.]
                    means the set of all characters except the dot.
                    + immediately after that means a consecutive group of one or more such characters.
                    \. means a dot. Outside of square brackets, . is a wild-card. To signify the character, the symbol . has to be "escaped" with a \ immediately before it.
                    (<expr>){2}
                    means exactly two repititions of <expr>.
                    So this means exactly 2 groups of (any number of non-dots followed by a dot).

                    (2)
                    ,       TO_NUMBER ( REGEXP_SUBSTR ( txt     -- Could you please breifly tell what does these symbols represent in this expression
                                                    , '[^.]+'
                                            , 1
                                            , 3
                                            )
                                   )
                    See (1), above, for the meaning of
                    [^.]+
                    The 3rd argument to REGEXP_SUBSTR (1 in this example) is the position in txt where the search will begin. 1 is the default, so I didn't have to specify it in the other call to REGEXP_SUBSTR. In fact, I wouldn't bother specifying it here, either, except that I need to in order to pass the next argument.
                    The 4th argument to REGEXP_SUBSTR (3 in this example) means find the 3rd sub-string that matches the given pattern.

                    (3)
                    ,       REGEXP_REPLACE ( txt               -- Could you please breifly tell what does these symbols represent in this expression
                    , '([^.]+\.){3}'
                                    )
                    See (1), above, for the symbols.
                    REGEXP_REPLACE replaces the sub-string that matches the pattern in the 2nd argument with the 3rd argument. I didn't pass a 3rd argument, so it defaults to NULL. So the pattern (the first three items) will be replaced with NULL, that is, removed.

                    Esoteric Point
                    In an earlier message I said that if there happend to be only one or two items in the list, then this last expression would be NULL, but all the necessary sorting would be done by the first expression. That's not quite true. If the list contains only one or two items, this last expression will not find the pattern to be replaced, so it will return txt itself, unchanged, so the ORDER BY clause is like this:
                    ORDER BY  x, NULL, x
                    All the necessary sorting was indeed done by the first expression, and the effect is the same as
                    ORDER BY  x
                    • 7. Re: Alpha numeric sorting in sql
                      708890
                      Frank,

                      Thanks for your response, this solution works perfect. However if my dataset has values like

                      6A005.b.6.c.1.a
                      6A005.b.6.c.1.b
                      6A005.b.6.a.3
                      6A005.b.6.a.3.a
                      6A005.b.6.a.3.c

                      I seem to get an error like invalid number , i am trying hard to understand the logic and extend it to these test cases.

                      Thanks again..
                      Chandra
                      • 8. Re: Alpha numeric sorting in sql
                        Frank Kulash
                        Hi,
                        user10875403 wrote:
                        Thanks for your response, this solution works perfect. However ...
                        I seem to get an error like invalid number ,
                        I wish more people were so easily satisfied!

                        I don't have any error with that data.

                        It would help if you could post a complete, self-contained script that showed exactly what the problem is.
                        An example of such a script is:
                        CREATE TABLE     num
                        AS
                        SELECT  '6A005.b.6.c.1.a'     AS txt     FROM dual     UNION ALL
                        SELECT  '6A005.b.6.c.1.b'     AS txt     FROM dual     UNION ALL
                        SELECT  '6A005.b.6.a.3'          AS txt     FROM dual     UNION ALL
                        SELECT  '6A005.b.6.a.3.a'     AS txt     FROM dual     UNION ALL
                        SELECT  '6A005.b.6.a.3.c'     AS txt     FROM dual
                        ;
                        
                        SELECT       txt
                        FROM       num
                        ORDER BY  REGEXP_SUBSTR ( txt               -- items 1 and 2 sort as strings
                                                , '([^.]+\.){2}'
                                       )
                        ,       TO_NUMBER ( REGEXP_SUBSTR ( txt     -- item 3 sorts as NUMBER
                                                        , '[^.]+'
                                                , 1
                                                , 3
                                                )
                                       )
                        ,       REGEXP_REPLACE ( txt               -- items 4 and up sort as strings
                                                 , '([^.]+\.){3}'
                                        )
                        ;
                        The output from the script above (which looks perfect to me) is:
                        TXT
                        ---------------
                        6A005.b.6.a.3
                        6A005.b.6.a.3.a
                        6A005.b.6.a.3.c
                        6A005.b.6.c.1.a
                        6A005.b.6.c.1.b
                        i am trying hard to understand the logic and extend it to these test cases.
                        Did you see my last message? I may have posted it while you were writing your message.

                        The solutionI posted assumed that the 3rd item was the only item that had to be sorted like a number. In this new set of data, it looks like the 5th item is a number, too. If that's the case, you'll have to change my solution some.
                        To get the n-th item from the list, use:
                        REGEXP_SUBSTR ( txt
                                   , '[^.]+'
                                   , 1
                                   , n
                                   )
                        Use this as the argument to TO_NUMBER (like I did for the 3rd item) when appropriate.
                        • 9. Re: Alpha numeric sorting in sql
                          Hoek
                          A "regexp-less" way:
                          SQL> with t as (
                            2  select '0B001.c' col from dual union all
                            3  select '0B001.c.1' from dual union all
                            4  select '0B001.c.10' from dual union all
                            5  select '0B001.c.11' from dual union all
                            6  select '0B001.c.11.a' from dual union all
                            7  select '0B001.c.11.b' from dual union all
                            8  select '0B001.c.11.c' from dual union all
                            9  select '0B001.c.11.d' from dual union all
                           10  select '0B001.c.12' from dual union all
                           11  select '0B001.c.13' from dual union all
                           12  select '0B001.c.2' from dual union all
                           13  select '0B001.c.6.b' from dual union all
                           14  select '0B001.d.2' from dual union all
                           15  select '0B001.c.3' from dual union all
                           16  select '0B001.c.6.c' from dual union all
                           17  select '0B001.c.7' from dual union all
                           18  select '0B001.c.7.a' from dual union all
                           19  select '0B001.c.7.b' from dual union all
                           20  select '0B001.c.7.c' from dual union all
                           21  select '0B001.c.8' from dual union all
                           22  select '0B001.c.8.a' from dual union all
                           23  select '0B001.c.8.b' from dual union all
                           24  select '0B001.c.8.c' from dual union all
                           25  select '0B001.c.9' from dual union all
                           26  select '0B001.d' from dual union all
                           27  select '0B001.d.1' from dual union all
                           28  select '0B001.c.3.a' from dual union all
                           29  select '0B001.c.3.b' from dual union all
                           30  select '0B001.c.3.c' from dual union all
                           31  select '0B001.c.4' from dual union all
                           32  select '0B001.c.5' from dual union all
                           33  select '0B001.c.6' from dual union all
                           34  select '0B001.c.6.a' from dual union all
                           35  select '0B001.d.3' from dual
                           36  )
                           37  select col
                           38  from   t
                           39  order by
                           40         substr(col, 1, instr(col, '.', 1, 1)-1)
                           41  ,      case 
                           42           when instr(col, '.', 1, 1) > 0
                           43           then substr( col
                           44                      , instr(col, '.', 1, 1)+1
                           45                      , case
                           46                          when instr(col, '.', 1, 2) = 0 
                           47                          then length(col) - instr(col, '.', 1, 1) 
                           48                          else instr(col, '.', 1, 2)-1 - instr(col, '.', 1, 1)
                           49                        end     
                           50                      ) 
                           51           else null
                           52         end
                           53  ,      case 
                           54           when instr(col, '.', 1, 2) > 0
                           55           then to_number(substr( col
                           56                                , instr(col, '.', 1, 2)+1
                           57                                , case
                           58                                    when instr(col, '.', 1, 3) = 0 
                           59                                    then length(col) - instr(col, '.', 1, 2) 
                           60                                    else instr(col, '.', 1, 3)-1 - instr(col, '.', 1, 2) 
                           61                                  end     
                           62                                ) 
                           63                         )
                           64           else 0 
                           65         end
                           66  ,      case when instr(col, '.', 1, 3) > 0
                           67              then
                           68                substr(col, instr(col, '.', 1, 3)+1, length(col))
                           69               else null 
                           70         end;
                          
                          COL
                          ------------
                          0B001.c
                          0B001.c.1
                          0B001.c.2
                          0B001.c.3.a
                          0B001.c.3.b
                          0B001.c.3.c
                          0B001.c.3
                          0B001.c.4
                          0B001.c.5
                          0B001.c.6.a
                          0B001.c.6.b
                          0B001.c.6.c
                          0B001.c.6
                          0B001.c.7.a
                          0B001.c.7.b
                          0B001.c.7.c
                          0B001.c.7
                          0B001.c.8.a
                          0B001.c.8.b
                          0B001.c.8.c
                          0B001.c.8
                          0B001.c.9
                          0B001.c.10
                          0B001.c.11.a
                          0B001.c.11.b
                          0B001.c.11.c
                          0B001.c.11.d
                          0B001.c.11
                          0B001.c.12
                          0B001.c.13
                          0B001.d
                          0B001.d.1
                          0B001.d.2
                          0B001.d.3
                          
                          34 rows selected.
                          • 10. Re: Alpha numeric sorting in sql
                            708890
                            Frank,

                            Sorry for the confusion, Yes your original solution worked perfect even for the new dataset like :

                            6A005.b.6.c.1.a
                            6A005.b.6.c.1.b
                            6A005.b.6.a.3
                            6A005.b.6.a.3.a
                            6A005.b.6.a.3.c

                            even though the fifth term is a number here I havent changed your original solution ( not replaced 3 to 5 ) and still it works fine.
                            However there is one thing that I observed, please see the below unsorted data set :

                            2B006.b
                            2B006.b.1
                            2B006.b.1.a
                            2B006.b.1.c.1
                            2B006.b.1.b
                            2B006.b.1.c.2.b
                            2B006.b.1.d
                            2B006.b.1.b.1
                            2B006.b.1.b.2
                            2B006.b.1.c
                            2B006.b.1.c.2
                            2B006.b.1.c.2.a

                            after I run the original solution that you gave me ,

                            SELECT ECCN_CD, global_id_nbr
                            FROM export_control_classification
                            WHERE upper(eccn_cd) like '2B006.B%'
                            ORDER BY  REGEXP_SUBSTR ( ECCN_CD
                            *, '([^.]+\.){2}'*
                            *)*
                            *, TO_NUMBER ( REGEXP_SUBSTR ( ECCN_CD*
                            *, '[^.]+'*
                            *, 1*
                            *, 3*
                            *)*
                            *)*
                            *, REGEXP_REPLACE ( ECCN_CD*
                            *, '([^.]+\.){3}'*
                            *)*

                            The sorted output is like this :

                            2B006.b.1
                            2B006.b.1.a
                            2B006.b.1.b
                            2B006.b.1.b.1
                            2B006.b.1.b.2
                            2B006.b.1.c
                            2B006.b.1.c.1
                            2B006.b.1.c.2
                            2B006.b.1.c.2.a
                            2B006.b.1.c.2.b
                            2B006.b.1.d
                            2B006.b.2
                            +2B006.b+

                            However the requirement is that 2B006.b should be at the top of the list, can you please help.

                            Thank you again very much for taking your time and helping me. I must admit that with my limited SQL knowledge I wouldn't have been able to crack this by myself.

                            Thanks
                            Chandra
                            • 11. Re: Alpha numeric sorting in sql
                              Frank Kulash
                              Hi, Chandra,
                              user10875403 wrote:
                              ...
                              even though the fifth term is a number here I havent changed your original solution ( not replaced 3 to 5 ) and still it works fine.
                              If the numeric columns always contain the same number of digits (and in this example, they are all one digit), then it doesn't matter if you sort them as strings or as numbers. It's only when comparing strings with different numbers of digits that the distinction is important. For example:
                              '1' < `2` < '3', and 1 < 2 < 3 (See note below)
                              '10' < '20' < '30', and 10 < 20 < 30
                              '100' < '200' < '300', and 100 < 200 < 300
                              All the examples above have the same number iof digits. However
                              '100' < '20' < '3' but 3 < 20 < 100
                              after I run the original solution that you gave me ,

                              SELECT ECCN_CD, global_id_nbr
                              FROM export_control_classification
                              WHERE upper(eccn_cd) like '2B006.B%'
                              ORDER BY  REGEXP_SUBSTR ( ECCN_CD
                              *, '([^.]+\.){2}'*
                              *)*
                              *, TO_NUMBER ( REGEXP_SUBSTR ( ECCN_CD*
                              *, '[^.]+'*
                              *, 1*
                              *, 3*
                              *)*
                              *)*
                              *, REGEXP_REPLACE ( ECCN_CD*
                              *, '([^.]+\.){3}'*
                              *)*
                              Doe the code above look right to you?
                              Whenever you post code on this site, type these 6 characters:

                              &#123;code&#125;

                              (small letters only, inside curly brackets) before and after sections of formatted text, to present the text exactly as you posted it. This is always important for code, which should always be formatted and indented, but it is especially important when posting regular expressions, because otherwise the square brackets will be treated as some kind of markup.
                              However the requirement is that 2B006.b should be at the top of the list, can you please help.
                              You're essentially saying
                              ORDER BY  x
                              ,         TO_NUMBER (y)
                              ,         z
                              If y or z happen to be NULL, then will (by default) be sorted as infinitely large values, always last in ascending order.
                              What you want is to override that default, so that NULLs come first:
                              ORDER BY  x
                              ,         TO_NUMBER (y)          NULLS FIRST
                              ,         z                   NULLS FIRST
                              Edited by: Frank Kulash on Jun 26, 2009 1:26 PM
                              This site won't lshow a 2 in single-quotes here, even if I use &#123;code&#125; tags. It displays one single-quote, but not the 2 or the other single-quote. I used accent marks instead.
                              • 12. Re: Alpha numeric sorting in sql
                                708890
                                Frank,

                                Apologies for posting the code the way I did, going forward I will remember your advise. I tried by giving the NULLS FIRST for y and z. However the result is still the same

                                2B006.b.1
                                2B006.b.1.a
                                2B006.b.1.b
                                2B006.b.1.b.1
                                2B006.b.1.b.2
                                2B006.b.1.c
                                2B006.b.1.c.1
                                2B006.b.1.c.2
                                2B006.b.1.c.2.a
                                2B006.b.1.c.2.b
                                2B006.b.1.d
                                2B006.b.2
                                2B006.b

                                Am I missing something ?

                                Thanks
                                Chandra
                                • 13. Re: Alpha numeric sorting in sql
                                  Frank Kulash
                                  Hi, Chandra,

                                  Whenver you have a problem with some code, post your code. I understand your description, but you may have made a typing mistake, or something could have gotten lost when you cut and pasted my code.
                                  • 14. Re: Alpha numeric sorting in sql
                                    708890
                                    Frank,

                                    Here is the code :
                                    SELECT ECCN_CD, global_id_nbr
                                    
                                    FROM export_control_classification 
                                    
                                    WHERE upper(eccn_cd) like '2B006.B%' 
                                    
                                    ORDER BY  REGEXP_SUBSTR ( ECCN_CD
                                    
                                                            , '([^.]+\.){2}'
                                    
                                                )
                                    
                                    ,      TO_NUMBER ( REGEXP_SUBSTR ( ECCN_CD
                                    
                                                                , '[^.]+'
                                    
                                                        , 1
                                    
                                                          , 3
                                    
                                                                )
                                    
                                                     ) NULLS FIRST
                                    
                                    ,       REGEXP_REPLACE ( ECCN_CD
                                    
                                                             , '([^.]+\.){3}'
                                    
                                                       ) NULLS FIRST
                                    1 2 Previous Next