1 2 Previous Next 25 Replies Latest reply on May 4, 2015 8:44 PM by BrendanP

    How to use Recursive Subquery Factoring (RSF) to Implement Dijkstra’s shortest path algorithm?

    Blais

      Can someone who knows Recursive Subquery Factoring (RSF) can tell me if my Dijkstra’s shortest path algorithm is implement correctly and efficiently? Below you will found my SQL query and a trial dataset.

      Below my query, a trial dataset and table definition to contain the data:

      /*Dijkstra Shortest path*/

      WITH UsefulArcs AS

        (SELECT SRC, DST, DISTANCE FROM ARCS

        ),

        Paths (SrcArc, DestArc, SrcPath, DestPath, Path, OpDist, DistPath, DistPathMin) AS

        (SELECT SRC AS SrcArc ,

          DST AS DestArc ,

          SRC AS SrcPath ,

          DST AS DestPath ,

          SRC

          ||'>'

          || DST AS Path ,

          ''

          ||DISTANCE AS OpDist ,

          DISTANCE AS DistPath,

          DISTANCE AS DistPathMin

        FROM UsefulArcs

        WHERE SRC=:prmSource

        UNION ALL

        SELECT a.SRC AS SrcArc,

          a.DST AS DestArc,

          Paths.SrcPath ,

          a.DST AS DestPath ,

          Paths.Path

          ||'>'

          ||a.DST AS Path,

          Paths.OpDist

          ||'+'

          ||a.DISTANCE AS OpDist ,

          Paths.DistPath    +a.DISTANCE                          AS DistPath,

          MIN(Paths.DistPath+a.DISTANCE) OVER (PARTITION BY a.DST) AS DistPathMin

        FROM UsefulArcs a

        INNER JOIN Paths

        ON (Paths.DestArc        =a.SRC)

        WHERE Paths.DistPathMin=Paths.DistPath

        ) SEARCH BREADTH FIRST BY DestPath

        --SEARCH DEPTH FIRST BY DestPath

        SET Order01

      SELECT *

      FROM Paths

      WHERE SrcPath          =:prmSource

      AND DestPath           =:prmSink

      AND DistPath =

        (SELECT MIN(DistPath) FROM Paths WHERE SrcPath=:prmSource AND DestPath =:prmSink

        )

      ORDER BY DistPath;

      Set prmSource=1 and prmSink=10.

       

      Dijkstra’s algorithm definition: http://en.wikipedia.org/wiki/Dijkstra%27s_algorithm


      Problems data: http://www.ime.unicamp.br/~andreani/MS515/capitulo7.pdf

       

      CREATE TABLE "ARCS"

         (   "SRC" NUMBER(3,0),

             "DST" NUMBER(3,0),

             "DISTANCE" NUMBER(3,0)

         ) ;

        CREATE TABLE "NODES"

         (   "NODES" NUMBER(3,0)

         ) ;

      REM INSERTING into ARCS

      SET DEFINE OFF;

      Insert into ARCS (SRC,DST,DISTANCE) values ('1','2','2');

      Insert into ARCS (SRC,DST,DISTANCE) values ('1','3','4');

      Insert into ARCS (SRC,DST,DISTANCE) values ('1','4','3');

      Insert into ARCS (SRC,DST,DISTANCE) values ('2','5','7');

      Insert into ARCS (SRC,DST,DISTANCE) values ('3','5','3');

      Insert into ARCS (SRC,DST,DISTANCE) values ('4','5','4');

      Insert into ARCS (SRC,DST,DISTANCE) values ('2','6','4');

      Insert into ARCS (SRC,DST,DISTANCE) values ('3','6','2');

      Insert into ARCS (SRC,DST,DISTANCE) values ('4','6','1');

      Insert into ARCS (SRC,DST,DISTANCE) values ('2','7','6');

      Insert into ARCS (SRC,DST,DISTANCE) values ('3','7','4');

      Insert into ARCS (SRC,DST,DISTANCE) values ('4','7','5');

      Insert into ARCS (SRC,DST,DISTANCE) values ('5','8','1');

      Insert into ARCS (SRC,DST,DISTANCE) values ('6','8','6');

      Insert into ARCS (SRC,DST,DISTANCE) values ('7','8','3');

      Insert into ARCS (SRC,DST,DISTANCE) values ('5','9','4');

      Insert into ARCS (SRC,DST,DISTANCE) values ('6','9','3');

      Insert into ARCS (SRC,DST,DISTANCE) values ('7','9','3');

      Insert into ARCS (SRC,DST,DISTANCE) values ('8','10','3');

      Insert into ARCS (SRC,DST,DISTANCE) values ('9','10','4');

      REM INSERTING into NODES

      SET DEFINE OFF;

      Insert into NODES (NODES) values ('1');

      Insert into NODES (NODES) values ('2');

      Insert into NODES (NODES) values ('3');

      Insert into NODES (NODES) values ('4');

      Insert into NODES (NODES) values ('5');

      Insert into NODES (NODES) values ('6');

      Insert into NODES (NODES) values ('7');

      Insert into NODES (NODES) values ('8');

      Insert into NODES (NODES) values ('9');

      Insert into NODES (NODES) values ('10');

       

        CREATE UNIQUE INDEX "ARCS_PK1" ON "ARCS" ("SRC", "DST")

        ;

        CREATE UNIQUE INDEX "NOEUDS_PK1" ON "NODES" ("NODES")

        ;

        ALTER TABLE "ARCS" MODIFY ("SRC" NOT NULL ENABLE);

        ALTER TABLE "ARCS" MODIFY ("DST" NOT NULL ENABLE);

        ALTER TABLE "ARCS" MODIFY ("DISTANCE" NOT NULL ENABLE);

        ALTER TABLE "ARCS" ADD CONSTRAINT "ARCS_PK" PRIMARY KEY ("SRC", "DST") ENABLE;

        ALTER TABLE "NODES" MODIFY ("NODES" NOT NULL ENABLE);

        ALTER TABLE "NODES" ADD CONSTRAINT "NOEUDS_PK" PRIMARY KEY ("NODES") ENABLE;

       

        ALTER TABLE "ARCS" ADD CONSTRAINT "ARCS_FK1" FOREIGN KEY ("SRC")

      REFERENCES "NODES" ("NODES") ENABLE;

        ALTER TABLE "ARCS" ADD CONSTRAINT "ARCS_FK2" FOREIGN KEY ("DST")

      REFERENCES "NODES" ("NODES") ENABLE;

        • 1. Re: How to use Recursive Subquery Factoring (RSF) to Implement Dijkstra’s shortest path algorithm?
          James Su

          There's one thing you can improve. If prmSink already exists in the path you can prune all the branches longer than that known path.

          1 person found this helpful
          • 2. Re: How to use Recursive Subquery Factoring (RSF) to Implement Dijkstra’s shortest path algorithm?
            _jum

            This is no implementation of /*Dijkstra Shortest path*/ imo, but a "simple" brute force algorithm, wich checks all possible ways from prmSource.

             

            • 3. Re: How to use Recursive Subquery Factoring (RSF) to Implement Dijkstra’s shortest path algorithm?
              James Su

              There's a restriction with RSF: in the recursive part you can only see the last level of data.

              So the Paths.DistPathMin=Paths.DistPath condition only works for the paths having the same length.

               

               

              If we add this row:

              Insert into ARCS (SRC,DST,DISTANCE) values (1,5,4);

              Now path >1>5>8>10> is the best, but the query wil still search along >1>3>5>8>10> and >1>4>5>8>10> because when it sees "5" at level 3, the "5" at level 2 is invisible and can't be included in the calculation of DistPathMin.

               

               

              In order to see the level 2 "5" node we can add fake nodes to bring the data from all previous level:

               

               

              WITH UsefulArcs AS

                (SELECT SRC, DST, DISTANCE FROM ARCS

                 UNION SELECT SRC,SRC,0 FROM ARCS  ------- add fake nodes

                 UNION SELECT DST,DST,0 FROM ARCS

                ),

                Paths (SrcArc, SrcPath, DestPath, Path, OpDist, DistPath, DistPathMin,LVL,CNT,found_dist) AS

                (SELECT SRC AS SrcArc,

                     SRC AS SrcPath ,

                     DST AS DestPath ,

                     CAST('>'||SRC||'>'||DST||'>' AS VARCHAR2(4000)) AS Path ,

                     ''

                     ||DISTANCE AS OpDist ,

                     DISTANCE AS DistPath,

                     DISTANCE AS DistPathMin

                     ,1 AS LVL

                     ,COUNT(*) OVER() AS CNT

                     ,MIN(CASE WHEN DST=:prmSink THEN DISTANCE END) OVER() AS found_dist ---- the best found distance

                FROM UsefulArcs

                WHERE SRC=:prmSource AND SRC<>DST

                UNION ALL

                SELECT a.SRC AS SrcArc,

                  Paths.SrcPath ,

                  a.DST  AS DestPath ,

                  Paths.Path

                  ||CASE WHEN a.DST<>a.SRC THEN a.DST||'>' END AS Path,

                  Paths.OpDist

                  ||CASE WHEN a.DST <>a.SRC THEN '+'||a.DISTANCE END AS OpDist ,

                  Paths.DistPath    +a.DISTANCE                    AS DistPath,

                  MIN(Paths.DistPath+a.DISTANCE) OVER (PARTITION BY a.DST) AS DistPathMin

                 ,LVL+1

                 ,COUNT(CASE WHEN a.DST <>a.SRC THEN a.src END) OVER() AS CNT

                 ,MIN(CASE WHEN a.DST=:prmSink AND (paths.found_dist IS NULL OR Paths.DistPath+a.DISTANCE<paths.found_dist)

                           THEN Paths.DistPath+a.DISTANCE ---- if new path found and it's better than replace the old one

                           ELSE paths.found_dist

                      END) OVER() AS found_dist

                FROM UsefulArcs a

                     JOIN Paths

                         ON (Paths.DestPath        =a.SRC)

                            AND (Paths.SrcArc<>Paths.DestPath AND INSTR(Paths.Path,'>'||a.DST||'>')=0 ---- add only new destinations

                                 OR a.DST=a.SRC    ---- or it's the end

                                )

                WHERE Paths.DistPathMin=Paths.DistPath

                      AND cnt>0 ------- if no more new nodes then it's the end

                      AND paths.DestPath<>:prmSink   ---- if we reach prmSink then no need to continue on this path

                      AND (paths.found_dist IS NULL  --- not found yet

                           OR paths.found_dist>=Paths.DistPath+a.DISTANCE --- if not better than the path found then stop

                          )

                )

                CYCLE path,lvl SET cycle_flag TO 'Y' DEFAULT 'N'  ------- cycle may appear because of outer join may keep the same path from last level

              SELECT DISTINCT path,OpDist||'='||DistPath as distance

              FROM Paths

              WHERE SrcPath          =:prmSource

                    AND DestPath     =:prmSink

                    AND DistPath = found_dist

              ;

               

               

               

               

              But this raises another issue when nodes will be repeated at all levels.

              Actually in the final select we only need the latest level of data, but unfortunately there's no option to discard the older levels during the recursive process.

              If we rewrite this RSF into a PL/SQL loop then it's possible to keep only the latest level and implement this algorithm.

              1 person found this helpful
              • 4. Re: How to use Recursive Subquery Factoring (RSF) to Implement Dijkstra’s shortest path algorithm?
                Etbin

                Something to play with

                 

                with

                arcs as

                (select '1' src,'2' dst,2 d from dual union all

                select '1','3',4 from dual union all

                select '1','4',3 from dual union all

                select '2','5',7 from dual union all

                select '3','5',3 from dual union all

                select '4','5',4 from dual union all

                select '2','6',4 from dual union all

                select '3','6',2 from dual union all

                select '4','6',1 from dual union all

                select '2','7',6 from dual union all

                select '3','7',4 from dual union all

                select '4','7',5 from dual union all

                select '5','8',1 from dual union all

                select '6','8',6 from dual union all

                select '7','8',3 from dual union all

                select '5','9',4 from dual union all

                select '6','9',3 from dual union all

                select '7','9',3 from dual union all

                select '8','10',3 from dual union all

                select '9','10',4 from dual

                ),

                shortest_path(src,dst,path,dist,stop,step) as

                (select src,dst,path,dist,null,1

                   from (select src,:dst dst,src || '~' || dst path,d,min(d) over () dist

                           from arcs

                          where src = :src

                        )

                  where d = dist

                union all

                select a.src,

                        p.dst,p.path || '~' || a.dst || case when a.dst = p.dst then '+' end,

                        case when a.d = min(a.d) over ()

                             then p.dist + a.d

                             when a.dst = p.dst

                             then p.dist + a.d

                        end,

                        last_value(case when a.dst = p.dst then 'stop' end) ignore nulls over (order by null),

                        step + 1

                   from arcs a,

                        shortest_path p

                  where a.src = regexp_substr(p.path,'\d+$')

                    and p.dst != regexp_substr(p.path,'\d+$')

                    and p.dist is not null

                    and p.stop is null

                )

                select rtrim(path,'+') path,dist

                  from shortest_path

                where stop is not null

                   and instr(path,'+') > 0

                 

                PATHDIST
                1~2~6~9~1013

                 

                Regards

                 

                Etbin

                • 5. Re: How to use Recursive Subquery Factoring (RSF) to Implement Dijkstra’s shortest path algorithm?
                  James Su

                  You pick the closest node each time, which does not necessarily lead to the shortest path. Actually it may not lead to the destination at all.

                  • 6. Re: Re: How to use Recursive Subquery Factoring (RSF) to Implement Dijkstra’s shortest path algorithm?
                    Etbin

                    That's why something to play with stands for.

                    I just proposed a query (using recursive subquery factoring) that (in some cases) produces a solution, so I thought others might be interested in - I was playing too .

                    I'm here mainly to learn and to stay exercised as providing a solution of some kind is (for a non-native speaker) much more simple than debating about how it might have been achieved and a forum IMHO is certainly not the place to exhibit proofs whether a certain algorithm is correct or not.

                    As I'm typing this already I can share my reasoning:

                    • at each step I pick the nodes reachable from the starting node or from the last node on the current shortest path.
                    • If the nodes include the destination node I pick the destination node and stop else I concatenate the nearest node to the shortest path incrementing the distance.

                    I don't claim it solves a general case leaving to others to find examples when it doesn't work.

                     

                    Regards

                     

                    Etbin

                     

                    As there's not much traffic an easy couter example where my query would give an incorrect result

                     

                        1

                       / \

                      2   3

                    /      \

                    2___4___3

                    \        \

                      5        2

                       \        \

                        \________4


                    with

                    arcs as

                    (select '1' src,'2' dst,2 d from dual union all

                    select '1','3',3 from dual union all

                    select '2','3',4 from dual union all

                    select '2','4',5 from dual union all

                    select '3','4',2 from dual

                    )


                     

                    PATH5DIST
                    1~2~47


                    • 8. Re: How to use Recursive Subquery Factoring (RSF) to Implement Dijkstra’s shortest path algorithm?
                      BrendanP

                      This solution finds all minimum distance routes to each node. I stuck with the lettered nodes in the data set link.

                       

                      NODE              LEV PATH                            DST_PATH
                      ---------- ---------- ------------------------------ ----------
                      B                  1 A,B                                    2
                      C                  1 A,C                                    4
                      D                  1 A,D                                    3
                      E                  2 A,D,E                                  7
                                          2 A,C,E                                  7
                      F                  2 A,D,F                                  4
                      G                  2 A,D,G                                  8
                                          2 A,B,G                                  8
                                          2 A,C,G                                  8
                      H                  3 A,D,E,H                                8
                                          3 A,C,E,H                                8
                      I                  3 A,D,F,I                                7
                      J                  4 A,C,E,H,J                              11
                                          4 A,D,F,I,J                              11
                                          4 A,D,E,H,J                              11
                      
                      
                      15 rows selected.
                      
                      
                        1  WITH arcs_both_ways AS (
                        2  SELECT src, dst, distance
                        3    FROM arcs
                        4  UNION ALL
                        5  SELECT dst, src, distance
                        6    FROM arcs
                        7  ), paths (node, path, dst_path, rnk, nodes_visited, lev) AS (
                        8  SELECT a.dst, a.src || ',' || a.dst, a.distance, 1, a.src || ','  || Listagg (a.dst, ',') WITHIN GROUP (ORDER BY a.dst) OVER (), 1
                        9    FROM arcs_both_ways a
                      10  WHERE a.src            = :prmSource
                      11  UNION ALL
                      12  SELECT a.dst,
                      13          p.path || ',' || a.dst,
                      14          p.dst_path + a.distance,
                      15          Rank () OVER (PARTITION BY a.dst ORDER BY p.dst_path + a.distance),
                      16          p.nodes_visited || ',' || Listagg (a.dst, ',') WITHIN GROUP (ORDER BY a.dst) OVER (),
                      17          p.lev + 1
                      18    FROM paths p
                      19    JOIN arcs_both_ways a
                      20      ON a.src = p.node
                      21    AND p.rnk = 1
                      22  WHERE Instr (',' || p.path || ',',  ',' || a.dst || ',') = 0
                      23  ), paths_ranked AS (
                      24  SELECT lev, node, path, dst_path, Rank () OVER (PARTITION BY node ORDER BY dst_path) rnk_tot
                      25    FROM paths
                      26    WHERE rnk = 1
                      27  )
                      28  SELECT node, lev, path, dst_path
                      29    FROM paths_ranked
                      30    WHERE rnk_tot = 1
                      31*  ORDER BY 1, 2
                      
                      

                       

                      130415 0705:

                      Small tidy-up, and use Oracle's cycle direction to replace my instr. Will try to add a comment later today, but suggest to OP to try reversing his source and sink.

                       

                      WITH arcs_both_ways AS (
                      SELECT src, dst, distance
                        FROM arcs
                       UNION ALL
                      SELECT dst, src, distance
                        FROM arcs
                      ), paths (node, path, dst_path, rnk, lev) AS (
                      SELECT a.dst, a.src || ',' || a.dst, a.distance, 1, 1
                        FROM arcs_both_ways a
                      WHERE a.src             = :prmSource
                       UNION ALL
                      SELECT a.dst, 
                              p.path || ',' || a.dst, 
                              p.dst_path + a.distance, 
                              Rank () OVER (PARTITION BY a.dst ORDER BY p.dst_path + a.distance),
                              p.lev + 1
                        FROM paths p
                        JOIN arcs_both_ways a
                          ON a.src = p.node
                         AND p.rnk = 1
                      ) CYCLE node SET IS_CYCLE TO '1' DEFAULT '0'
                      , paths_ranked AS (
                      SELECT lev, node, path, dst_path, Rank () OVER (PARTITION BY node ORDER BY dst_path) rnk_tot
                        FROM paths
                        WHERE rnk = 1
                      )
                      SELECT node, lev, path, dst_path
                        FROM paths_ranked
                        WHERE rnk_tot = 1
                        ORDER BY 1, 2
                      
                      • 9. Re: How to use Recursive Subquery Factoring (RSF) to Implement Dijkstra’s shortest path algorithm?
                        James Su

                        Your query is very similar to OP's and the rank() has the same problem that it can't see the same dst which appears prior to the last level. This isn't as efficient as Dijkstra's algorithm but still gives the correct result.

                        • 10. Re: How to use Recursive Subquery Factoring (RSF) to Implement Dijkstra’s shortest path algorithm?
                          James Su

                          It's a very interesting thread, I will spend a day to study it!

                          • 11. Re: How to use Recursive Subquery Factoring (RSF) to Implement Dijkstra’s shortest path algorithm?
                            Blais

                            You are right. I will work to imagine how to do that using RSF. I will also implement the first insight you give me.

                            • 12. Re: How to use Recursive Subquery Factoring (RSF) to Implement Dijkstra’s shortest path algorithm?
                              Blais

                              I know that this is not Dijkstra’s. It’s the reason of this post. Below is the "brute force" shortest path. This enumerates 107 paths. In comparison, the first query that I propose only enumerate 27 paths. It's less but it can be better as @James Su mention. We can add a global cut after the first solution is found. We also have to found how to compare branches from different level.

                               

                              /*Shortest path*/

                              WITH UsefulArcs AS

                                (SELECT SRC, DST, DISTANCE FROM ARCS

                                ),

                                Paths (SrcArc, DestArc, SrcPath, DestPath, Path, OpDist, DistPath, DistPathMin) AS

                                (SELECT SRC AS SrcArc ,

                                  DST AS DestArc ,

                                  SRC AS SrcPath ,

                                  DST AS DestPath ,

                                  SRC

                                  ||'>'

                                  || DST AS Path ,

                                  ''

                                  ||DISTANCE AS OpDist ,

                                  DISTANCE AS DistPath,

                                  DISTANCE AS DistPathMin

                                FROM UsefulArcs

                                UNION ALL

                                SELECT a.SRC AS SrcArc,

                                  a.DST AS DestArc,

                                  Paths.SrcPath ,

                                  a.DST AS DestPath ,

                                  Paths.Path

                                  ||'>'

                                  ||a.DST AS Path,

                                  Paths.OpDist

                                  ||'+'

                                  ||a.DISTANCE AS OpDist ,

                                  Paths.DistPath    +a.DISTANCE                          AS ,

                                  MIN(Paths.DistPath+a.DISTANCE) OVER (PARTITION BY a.DST) AS DistPathMin

                                FROM UsefulArcs a

                                INNER JOIN Paths

                                ON (Paths.DestArc=a.SRC)

                                ) SEARCH BREADTH FIRST BY DestPath

                                --SEARCH DEPTH FIRST BY DestPath

                                SET Order01

                              SELECT *

                              FROM Paths

                              WHERE SrcPath=:prmSource

                              AND DestPath =:prmSink

                              AND DistPath =

                                (SELECT MIN(DistPath) FROM Paths WHERE SrcPath=:prmSource AND DestPath =:prmSink

                                )

                              ORDER BY DistPath;

                              • 13. Re: How to use Recursive Subquery Factoring (RSF) to Implement Dijkstra’s shortest path algorithm?
                                jihuyao

                                Blais wrote:

                                 

                                You are right. I will work to imagine how to do that using RSF. I will also implement the first insight you give me.

                                 

                                It had better to quickly find one path to destination and use its distance as reference to eliminate all other paths with longer distances for next level process.  Otherwise one may take a trip around the would and then back to the destination .  Before that it is hard to drop any path for further blind searching.

                                 

                                But for large collection of nodes and far away between start and destination points, at any intermediate point (M), you can pick the one with MIN distance in EACH SAME group and continue searching, eg for all rows in such group.  Such aggregate function during recursive process will be expensive but it can certainly save PGA for longer process.

                                 

                                (S)>A>B>(M) 

                                (S)>B>A>(M)

                                • 14. Re: How to use Recursive Subquery Factoring (RSF) to Implement Dijkstra’s shortest path algorithm?
                                  James Su

                                  How do you pick the M point? If it's from the neighbours then this is already described in the algorithm.

                                  1 2 Previous Next