Forum Stats

  • 3,875,518 Users
  • 2,266,939 Discussions
  • 7,912,241 Comments

Discussions

Need to have sequence number for rows based on a custom logic.

User_XY6RS
User_XY6RS Member Posts: 1 Green Ribbon

I have raw data in this format:


I need to have sequences like this using Oracle SQL:


Background: Order_Release_Gid starts from LINE and goes to PORT B. This Order will be broken into 7 movement pices as:

Line to Factory Yard

Factory Yard - Factory Yard

Factory Yard - EY1

EY1 -EY1 ...etc

Please provide your sql to solve this. I am also trying and if I will get the solution will post here.


Thank You.

Answers

  • BluShadow
    BluShadow Member, Moderator Posts: 42,580 Red Diamond

    Surely you just want a CASE statement?

    ,case when shipment_source_location = 'DOMAIN.LINE' and shipment_destination = 'DOMAIN.FACTORY YARD' then 1
          when shipment_source_location = 'DOMAIN.FACTORY YARD' and shipment_destination = 'DOMAIN.FACTORY YARD' then 2
    ...
    ...
    end as sequence
     
    

    or a lookup table

    create table shipment_move_sequence as
      select 'DOMAIN.LINE' as src, 'DOMAIN.FACTORY YARD' as dst, 1 as seq from dual union all
      select 'DOMAIN.FACTORY YARD', 'DOMAIN.FACTORY YARD', 2 from dual union all
    .. etc.
    

    and then join on that lookup table

    join shipment_move_sequence ms on (    ms.src = rawdata.shipment_source_location
                                       and ms.dst = rawdata.shipment_destination)
    

    to then give you the seq number you want.

  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 42,783 Red Diamond

    Hi, @User_XY6RS

    Whenever you have a question, please post a little sample data (CREATE TABLE and INSERT statements for all tables involved, relevant columns only) so the people who want to help you can re-create the problem and test their ideas. Also post the exact results you want from that data, and explain why you want those results from that data. Always post your complete Oracle version (e.g. 18.4.0.0.0).

    If you want to map the numbers 1-7 to rows based on the shipment_source_loca and shipment_destination columns, then you should have a lookup table with those two columns, plus the desired result. (You could also use a CASE expression, but a lookup table is better.) If you need to generate consecutive integers 1, 2, 3, ... even when some steps are missing, then start with the lookup table, then use the analytic ROW_NUMBER function.

  • mathguy
    mathguy Member Posts: 10,914 Black Diamond

    Absolutely crucial: Do what Mr. Kulash asked you to do. So it is clear: Tell us your Oracle version, and post CREATE TABLE and INSERT statements to present the sample data in a usable format (so we can test - we can't test on images).

    Additional questions: I assume that, other than the origin and the destination, you don't know ahead of time where the stops are (EY_1, PORT A etc.) BluShadow assumed these are known and never change, but I believe your problem is harder than that, correct?

    Then - can you guarantee that the route doesn't cycle? For example - go from Dock to Port A, do some processing there, then return to Dock for some reason? Of course, if there are cycles, then the problem may simply be unsolvable; either way, please clarify.

    And, will there always be a PROCESSING step at absolutely every stop along the route? Or will, in some cases, a transport (say to Port B) be followed immediately by another transport, with no processing at Port B? The problem can be solved either way, but if we know beforehand (and guaranteed) that there is PROCESSING at every intermediate step, then the problem is easier.

  • BluShadow
    BluShadow Member, Moderator Posts: 42,580 Red Diamond


    BluShadow assumed these are known and never change, but I believe your problem is harder than that, correct?

    Well OP did say:

    Background: Order_Release_Gid starts from LINE and goes to PORT B. This Order will be broken into 7 movement pices as:

    Line to Factory Yard

    Factory Yard - Factory Yard

    Factory Yard - EY1

    EY1 -EY1 ...etc

    ... which kind of indicates these are 7 known movements that can be used.

    If it's not a fixed list and the items can move all over the place, then you'd have to rely on other data, such as the time of movement etc. to give an order sequence. But that doesn't seem to be what the OP is asking for.

    Of course we could both be wrong. 😁

  • mathguy
    mathguy Member Posts: 10,914 Black Diamond

    @BluShadow

    If it's not a fixed list and the items can move all over the place, then you'd have to rely on other data, such as the time of movement etc. to give an order sequence.

    Not necessarily. The data as presented may already suffice (in fact, if anything, it is redundant - the origin and the final destination don't need to be present in the data either, they can be "calculated" from other data).

    Namely, that would be the case, for example, if the answers to the questions I asked to the OP are what one would expect them to be.

    What you need for each "order" ("orders" are separate entities, independent of each other) is a set of rows, each representing either a "transport" step (destination different from origin) or a "processing" step (done in place). Processing steps may even be allowed in some places but not required in other places. The only condition is that there are no loops, and there is only one "processing" step at each place; even "processing" at the first origin and/or at the final destination can be handled. Everything required to figure out the exact sequence is already encoded in the data I presented so far. No need for dates or any other info of any kind.

  • BluShadow
    BluShadow Member, Moderator Posts: 42,580 Red Diamond

    @mathguy So what you're saying is that, on the assumption that the transportation happens in the order of processing then 'transporting' (non processing) at each step, then we can recurse the steps to follow where it's gone and just give a sequence to each step... something like...

    SQL> with recurse(order_release_gid
      2              ,order_source_location
      3              ,order_destination
      4              ,shipment_gid
      5              ,shipment_source_location
      6              ,shipment_destination
      7              ,transport_mode_gid
      8              ,seq
      9              ) as
     10    (
     11    -- first get the starting point
     12    select r.order_release_gid
     13          ,r.order_source_location
     14          ,r.order_destination
     15          ,r.shipment_gid
     16          ,r.shipment_source_location
     17          ,r.shipment_destination
     18          ,r.transport_mode_gid
     19          ,1 as seq
     20    from   raw_data r
     21    where  r.shipment_source_location = r.order_source_location
     22    and    r.transport_mode_gid = 'PROCESS'
     23    union all
     24    -- follow the path of 'transporting' and processing
     25    select rc.order_release_gid
     26          ,rc.order_source_location
     27          ,rc.order_destination
     28          ,r.shipment_gid
     29          ,r.shipment_source_location
     30          ,r.shipment_destination
     31          ,r.transport_mode_gid
     32          ,rc.seq+1 as seq
     33    from   recurse rc
     34           join    raw_data r
     35           on      r.order_release_gid = rc.order_release_gid -- assume primary key
     36              and    (   (    rc.shipment_source_location != rc.shipment_destination
     37                          and r.shipment_source_location = rc.shipment_destination
     38                          and r.transport_mode_gid = 'PROCESS'
     39                         )
     40                      or (    rc.shipment_source_location = rc.shipment_destination
     41                          and r.shipment_source_location = rc.shipment_destination
     42                          and r.transport_mode_gid != 'PROCESS'
     43                         )
     44                     )
     45    )
     46  select *
     47  from   recurse
     48  /
    
    ORDER_RELEASE_GID   ORDER_SOURC ORDER_DESTINA SHIPMENT_GID SHIPMENT_SOURCE_LOC SHIPMENT_DESTINATIO TRANSPORT        SEQ
    ------------------- ----------- ------------- ------------ ------------------- ------------------- --------- ----------
    DOMAIN.20221111_001 DOMAIN.LINE DOMAIN.PORT_B DOMAIN.87823 DOMAIN.LINE         DOMAIN.FACTORY YARD PROCESS            1
    DOMAIN.20221111_001 DOMAIN.LINE DOMAIN.PORT_B DOMAIN.87824 DOMAIN.FACTORY YARD DOMAIN.FACTORY YARD PROCESS            2
    DOMAIN.20221111_001 DOMAIN.LINE DOMAIN.PORT_B DOMAIN.87243 DOMAIN.FACTORY YARD DOMAIN.EY_1         TRANSPORT          3
    DOMAIN.20221111_001 DOMAIN.LINE DOMAIN.PORT_B DOMAIN.87244 DOMAIN.EY_1         DOMAIN.EY_1         PROCESS            4
    DOMAIN.20221111_001 DOMAIN.LINE DOMAIN.PORT_B DOMAIN.87825 DOMAIN.EY_1         DOMAIN.PORT A       TRANSPORT          5
    DOMAIN.20221111_001 DOMAIN.LINE DOMAIN.PORT_B DOMAIN.87826 DOMAIN.PORT A       DOMAIN.PORT A       PROCESS            6
    DOMAIN.20221111_001 DOMAIN.LINE DOMAIN.PORT_B DOMAIN.87829 DOMAIN.PORT A       DOMAIN.PORT B       VESSEL             7
    
    
    7 rows selected.
    

    ... and then that is independent of any fixed lookup of source and destination, but just relies on the order source being specified (with a 'PROCESS') so that we know where to start.

    ... and like you say... assuming there are no 'loops' in the data so that something ends up back where it was before and get's re-processed (which based on my experience of having goods delivered, happens quite often, especially if a delivery attempt was made and the goods were taken back to the depot in order to attempt redelivery on another day)

  • mathguy
    mathguy Member Posts: 10,914 Black Diamond

    @BluShadow What I said is strictly about the problem statement, so we should be able to discuss it outside any code (in any language). I did not read the code you wrote.

    We are probably still not on the same page. In what I described, the order source does not need to be specified in a separate column, it can be inferred from the data. It should be the only node that appears in the shipment source column, but does not appear in the shipment destination column at all, OR it appears only in a "processing" row (meaning, a row where the destination is the same as the source). Note that the "processing" qualification and such may appear in the data if the business user needs it, but it is not required for solving the problem; for our needs, "processing" is just a qualifier for rows where the shipping source and shipping destination are the same, while we will call rows where the destination is not the source "transport" rows. (This does not match the OP's usage, where he has different descriptions - "transport", "vessel" and possibly others; for us, they are all "transport" steps.)

    Suppose the pairs (shipping source, shipping destination) for an order are as below:

    S D
    - -
    M N
    A D
    M M
    D F
    A A
    F M
    

    There is only one node that appears in the S column, but does not appear in the D column except in a "processing" row (which again, by definition, means a row where the source and the destination are the same). That node is A. That must be the shipment source. All I need to be able to infer that is the data above, nothing else. (Of course, the existence and uniqueness of such a node is essential - without that, the problem as described by me cannot be solved.)

    Since there is a "processing" row at A, that should be the first step in the whole sequence. If there was no such processing row at A, the first step would be A to (wherever), in our case (A, D). If there is processing at A, then the row (A, D) is the second in the sequence. Then we look to see if there is a processing step at D; in this case there isn't, but if there was, it would be sequenced immediately after (A, D). Then we look for the next "transport" step, which in this case is (D, F) and assign to it the next sequential number. Etc.

    This works if there are no cycles (other than the processing steps), and if there is at most one processing step at every node. This is in addition to the condition that there is exactly one order source and one order destination (identified similarly to order source, as described earlier). If all these conditions are satisfied, then we only need the data in S and D to assign the sequential numbers.

    I won't attempt to write any code until the OP confirms that this is indeed his or her problem. Perhaps this is what your code does already; I see that you are referencing other columns in it, but maybe that can be changed so it only references the shipment source and destination columns.

  • BluShadow
    BluShadow Member, Moderator Posts: 42,580 Red Diamond

    Yeah, I think we are on the same page. You're just considering that the starting point doesn't have to be specified in the data because it can be inferred from the source/destination data, which is fair enough... but as the OP said that was the "raw data" I was assuming that information is available anyway, so no need to go working it out. 😉

    And as for the "Processing", certainly that can be considered where the source/destination is the same.

    Example of above recursing, but with cut down data...

    Here's the new raw data (without the order source and order destination, or the transport mode specified)...

    SQL> select * from raw_data2
      2  /
    
    ORDER_RELEASE_GID   SHIPMENT_GID SHIPMENT_SOURCE_LOC SHIPMENT_DESTINATIO
    ------------------- ------------ ------------------- -------------------
    DOMAIN.20221111_001 DOMAIN.87244 DOMAIN.EY_1         DOMAIN.EY_1
    DOMAIN.20221111_001 DOMAIN.87243 DOMAIN.FACTORY YARD DOMAIN.EY_1
    DOMAIN.20221111_001 DOMAIN.87825 DOMAIN.EY_1         DOMAIN.PORT A
    DOMAIN.20221111_001 DOMAIN.87826 DOMAIN.PORT A       DOMAIN.PORT A
    DOMAIN.20221111_001 DOMAIN.87829 DOMAIN.PORT A       DOMAIN.PORT B
    DOMAIN.20221111_001 DOMAIN.87823 DOMAIN.LINE         DOMAIN.FACTORY YARD
    DOMAIN.20221111_001 DOMAIN.87824 DOMAIN.FACTORY YARD DOMAIN.FACTORY YARD
    
    7 rows selected.
    

    And then recursing that based on the rules:

    a) starting point is the shipment_source_location that does not appear in the shipment_destination

    b) if the current location is the same as the destination the we are 'processing' and the next will be the same source location but with a different destination.

    c) if the current location is different to the destination then we are 'travelling' and the next will have a source and destination the same as our current destination

    SQL> with recurse(order_release_gid
      2              ,shipment_gid
      3              ,shipment_source_location
      4              ,shipment_destination
      5              ,seq
      6              ) as
      7    (
      8    -- first get the starting point
      9    select r.order_release_gid
     10          ,r.shipment_gid
     11          ,r.shipment_source_location
     12          ,r.shipment_destination
     13          ,1 as seq
     14    from   raw_data2 r
     15    where  r.shipment_source_location not in (select distinct shipment_destination from raw_data2)
     16    union all
     17    -- follow the path of 'transporting' and processing
     18    select rc.order_release_gid
     19          ,r.shipment_gid
     20          ,r.shipment_source_location
     21          ,r.shipment_destination
     22          ,rc.seq+1 as seq
     23    from   recurse rc
     24           join    raw_data2 r
     25           on      r.order_release_gid = rc.order_release_gid -- assume primary key
     26              and    (   (    rc.shipment_source_location != rc.shipment_destination
     27                          and r.shipment_source_location = rc.shipment_destination
     28                          and r.shipment_source_location = r.shipment_destination
     29                         )
     30                      or (    rc.shipment_source_location = rc.shipment_destination
     31                          and r.shipment_source_location = rc.shipment_destination
     32                          and r.shipment_source_location != r.shipment_destination
     33                         )
     34                     )
     35    )
     36  select *
     37  from   recurse
     38  /
    
    ORDER_RELEASE_GID   SHIPMENT_GID SHIPMENT_SOURCE_LOC SHIPMENT_DESTINATIO        SEQ
    ------------------- ------------ ------------------- ------------------- ----------
    DOMAIN.20221111_001 DOMAIN.87823 DOMAIN.LINE         DOMAIN.FACTORY YARD          1
    DOMAIN.20221111_001 DOMAIN.87824 DOMAIN.FACTORY YARD DOMAIN.FACTORY YARD          2
    DOMAIN.20221111_001 DOMAIN.87243 DOMAIN.FACTORY YARD DOMAIN.EY_1                  3
    DOMAIN.20221111_001 DOMAIN.87244 DOMAIN.EY_1         DOMAIN.EY_1                  4
    DOMAIN.20221111_001 DOMAIN.87825 DOMAIN.EY_1         DOMAIN.PORT A                5
    DOMAIN.20221111_001 DOMAIN.87826 DOMAIN.PORT A       DOMAIN.PORT A                6
    DOMAIN.20221111_001 DOMAIN.87829 DOMAIN.PORT A       DOMAIN.PORT B                7
    
    7 rows selected.
    

    Of course this is all supposition depending on what data the OP really has and hoping there are no 'loops'

  • mathguy
    mathguy Member Posts: 10,914 Black Diamond

    @BluShadow

    This appears to be another one of the many abandoned threads.

    Too bad. The question was interesting. I was waiting to post the following solution for the problem as I explained it:

    create table t (s, d) as
        select 'M', 'N' from dual union all
        select 'A', 'D' from dual union all
        select 'M', 'M' from dual union all
        select 'D', 'F' from dual union all
        select 'A', 'A' from dual union all
        select 'F', 'M' from dual
    ;
    
    with
      prep (s, d, rn) as (
        select s, d, row_number() over (partition by s order by case when s = d then 1 else 2 end)
        from   t
      )
    select  s, d, level as seq
    from    prep
    start   with s not in (select d from t where d != s) and rn = 1
    connect by nocycle s = prior d and (rn = 1 or prior s = prior d)
    ;
    
    S   D   SEQ
    --- --- ---
    A   A     1
    A   D     2
    D   F     3
    F   M     4
    M   M     5
    M   N     6
    
    BluShadow