1 2 Previous Next 25 Replies Latest reply on May 4, 2015 8:44 PM by BrendanP

# How to use Recursive Subquery Factoring (RSF) to Implement Dijkstra’s shortest path algorithm?

Can someone who knows Recursive Subquery Factoring (RSF) can tell me if my Dijkstra’s shortest path algorithm is implement correctly and efficiently? Below you will found my SQL query and a trial dataset.

Below my query, a trial dataset and table definition to contain the data:

/*Dijkstra Shortest path*/

WITH UsefulArcs AS

(SELECT SRC, DST, DISTANCE FROM ARCS

),

Paths (SrcArc, DestArc, SrcPath, DestPath, Path, OpDist, DistPath, DistPathMin) AS

(SELECT SRC AS SrcArc ,

DST AS DestArc ,

SRC AS SrcPath ,

DST AS DestPath ,

SRC

||'>'

|| DST AS Path ,

''

||DISTANCE AS OpDist ,

DISTANCE AS DistPath,

DISTANCE AS DistPathMin

FROM UsefulArcs

WHERE SRC=:prmSource

UNION ALL

SELECT a.SRC AS SrcArc,

a.DST AS DestArc,

Paths.SrcPath ,

a.DST AS DestPath ,

Paths.Path

||'>'

||a.DST AS Path,

Paths.OpDist

||'+'

||a.DISTANCE AS OpDist ,

Paths.DistPath    +a.DISTANCE                          AS DistPath,

MIN(Paths.DistPath+a.DISTANCE) OVER (PARTITION BY a.DST) AS DistPathMin

FROM UsefulArcs a

INNER JOIN Paths

ON (Paths.DestArc        =a.SRC)

WHERE Paths.DistPathMin=Paths.DistPath

) SEARCH BREADTH FIRST BY DestPath

--SEARCH DEPTH FIRST BY DestPath

SET Order01

SELECT *

FROM Paths

WHERE SrcPath          =:prmSource

AND DestPath           =:prmSink

AND DistPath =

(SELECT MIN(DistPath) FROM Paths WHERE SrcPath=:prmSource AND DestPath =:prmSink

)

ORDER BY DistPath;

Set prmSource=1 and prmSink=10.

Dijkstra’s algorithm definition: http://en.wikipedia.org/wiki/Dijkstra%27s_algorithm

Problems data: http://www.ime.unicamp.br/~andreani/MS515/capitulo7.pdf

CREATE TABLE "ARCS"

(   "SRC" NUMBER(3,0),

"DST" NUMBER(3,0),

"DISTANCE" NUMBER(3,0)

) ;

CREATE TABLE "NODES"

(   "NODES" NUMBER(3,0)

) ;

REM INSERTING into ARCS

SET DEFINE OFF;

Insert into ARCS (SRC,DST,DISTANCE) values ('1','2','2');

Insert into ARCS (SRC,DST,DISTANCE) values ('1','3','4');

Insert into ARCS (SRC,DST,DISTANCE) values ('1','4','3');

Insert into ARCS (SRC,DST,DISTANCE) values ('2','5','7');

Insert into ARCS (SRC,DST,DISTANCE) values ('3','5','3');

Insert into ARCS (SRC,DST,DISTANCE) values ('4','5','4');

Insert into ARCS (SRC,DST,DISTANCE) values ('2','6','4');

Insert into ARCS (SRC,DST,DISTANCE) values ('3','6','2');

Insert into ARCS (SRC,DST,DISTANCE) values ('4','6','1');

Insert into ARCS (SRC,DST,DISTANCE) values ('2','7','6');

Insert into ARCS (SRC,DST,DISTANCE) values ('3','7','4');

Insert into ARCS (SRC,DST,DISTANCE) values ('4','7','5');

Insert into ARCS (SRC,DST,DISTANCE) values ('5','8','1');

Insert into ARCS (SRC,DST,DISTANCE) values ('6','8','6');

Insert into ARCS (SRC,DST,DISTANCE) values ('7','8','3');

Insert into ARCS (SRC,DST,DISTANCE) values ('5','9','4');

Insert into ARCS (SRC,DST,DISTANCE) values ('6','9','3');

Insert into ARCS (SRC,DST,DISTANCE) values ('7','9','3');

Insert into ARCS (SRC,DST,DISTANCE) values ('8','10','3');

Insert into ARCS (SRC,DST,DISTANCE) values ('9','10','4');

REM INSERTING into NODES

SET DEFINE OFF;

Insert into NODES (NODES) values ('1');

Insert into NODES (NODES) values ('2');

Insert into NODES (NODES) values ('3');

Insert into NODES (NODES) values ('4');

Insert into NODES (NODES) values ('5');

Insert into NODES (NODES) values ('6');

Insert into NODES (NODES) values ('7');

Insert into NODES (NODES) values ('8');

Insert into NODES (NODES) values ('9');

Insert into NODES (NODES) values ('10');

CREATE UNIQUE INDEX "ARCS_PK1" ON "ARCS" ("SRC", "DST")

;

CREATE UNIQUE INDEX "NOEUDS_PK1" ON "NODES" ("NODES")

;

ALTER TABLE "ARCS" MODIFY ("SRC" NOT NULL ENABLE);

ALTER TABLE "ARCS" MODIFY ("DST" NOT NULL ENABLE);

ALTER TABLE "ARCS" MODIFY ("DISTANCE" NOT NULL ENABLE);

ALTER TABLE "ARCS" ADD CONSTRAINT "ARCS_PK" PRIMARY KEY ("SRC", "DST") ENABLE;

ALTER TABLE "NODES" MODIFY ("NODES" NOT NULL ENABLE);

ALTER TABLE "NODES" ADD CONSTRAINT "NOEUDS_PK" PRIMARY KEY ("NODES") ENABLE;

ALTER TABLE "ARCS" ADD CONSTRAINT "ARCS_FK1" FOREIGN KEY ("SRC")

REFERENCES "NODES" ("NODES") ENABLE;

ALTER TABLE "ARCS" ADD CONSTRAINT "ARCS_FK2" FOREIGN KEY ("DST")

REFERENCES "NODES" ("NODES") ENABLE;

• ###### 1. Re: How to use Recursive Subquery Factoring (RSF) to Implement Dijkstra’s shortest path algorithm?

There's one thing you can improve. If prmSink already exists in the path you can prune all the branches longer than that known path.

• ###### 2. Re: How to use Recursive Subquery Factoring (RSF) to Implement Dijkstra’s shortest path algorithm?

This is no implementation of /*Dijkstra Shortest path*/ imo, but a "simple" brute force algorithm, wich checks all possible ways from prmSource.

• ###### 3. Re: How to use Recursive Subquery Factoring (RSF) to Implement Dijkstra’s shortest path algorithm?

There's a restriction with RSF: in the recursive part you can only see the last level of data.

So the Paths.DistPathMin=Paths.DistPath condition only works for the paths having the same length.

Insert into ARCS (SRC,DST,DISTANCE) values (1,5,4);

Now path >1>5>8>10> is the best, but the query wil still search along >1>3>5>8>10> and >1>4>5>8>10> because when it sees "5" at level 3, the "5" at level 2 is invisible and can't be included in the calculation of DistPathMin.

In order to see the level 2 "5" node we can add fake nodes to bring the data from all previous level:

WITH UsefulArcs AS

(SELECT SRC, DST, DISTANCE FROM ARCS

UNION SELECT SRC,SRC,0 FROM ARCS  ------- add fake nodes

UNION SELECT DST,DST,0 FROM ARCS

),

Paths (SrcArc, SrcPath, DestPath, Path, OpDist, DistPath, DistPathMin,LVL,CNT,found_dist) AS

(SELECT SRC AS SrcArc,

SRC AS SrcPath ,

DST AS DestPath ,

CAST('>'||SRC||'>'||DST||'>' AS VARCHAR2(4000)) AS Path ,

''

||DISTANCE AS OpDist ,

DISTANCE AS DistPath,

DISTANCE AS DistPathMin

,1 AS LVL

,COUNT(*) OVER() AS CNT

,MIN(CASE WHEN DST=:prmSink THEN DISTANCE END) OVER() AS found_dist ---- the best found distance

FROM UsefulArcs

WHERE SRC=:prmSource AND SRC<>DST

UNION ALL

SELECT a.SRC AS SrcArc,

Paths.SrcPath ,

a.DST  AS DestPath ,

Paths.Path

||CASE WHEN a.DST<>a.SRC THEN a.DST||'>' END AS Path,

Paths.OpDist

||CASE WHEN a.DST <>a.SRC THEN '+'||a.DISTANCE END AS OpDist ,

Paths.DistPath    +a.DISTANCE                    AS DistPath,

MIN(Paths.DistPath+a.DISTANCE) OVER (PARTITION BY a.DST) AS DistPathMin

,LVL+1

,COUNT(CASE WHEN a.DST <>a.SRC THEN a.src END) OVER() AS CNT

,MIN(CASE WHEN a.DST=:prmSink AND (paths.found_dist IS NULL OR Paths.DistPath+a.DISTANCE<paths.found_dist)

THEN Paths.DistPath+a.DISTANCE ---- if new path found and it's better than replace the old one

ELSE paths.found_dist

END) OVER() AS found_dist

FROM UsefulArcs a

JOIN Paths

ON (Paths.DestPath        =a.SRC)

AND (Paths.SrcArc<>Paths.DestPath AND INSTR(Paths.Path,'>'||a.DST||'>')=0 ---- add only new destinations

OR a.DST=a.SRC    ---- or it's the end

)

WHERE Paths.DistPathMin=Paths.DistPath

AND cnt>0 ------- if no more new nodes then it's the end

AND paths.DestPath<>:prmSink   ---- if we reach prmSink then no need to continue on this path

OR paths.found_dist>=Paths.DistPath+a.DISTANCE --- if not better than the path found then stop

)

)

CYCLE path,lvl SET cycle_flag TO 'Y' DEFAULT 'N'  ------- cycle may appear because of outer join may keep the same path from last level

SELECT DISTINCT path,OpDist||'='||DistPath as distance

FROM Paths

WHERE SrcPath          =:prmSource

AND DestPath     =:prmSink

AND DistPath = found_dist

;

But this raises another issue when nodes will be repeated at all levels.

Actually in the final select we only need the latest level of data, but unfortunately there's no option to discard the older levels during the recursive process.

If we rewrite this RSF into a PL/SQL loop then it's possible to keep only the latest level and implement this algorithm.

• ###### 4. Re: How to use Recursive Subquery Factoring (RSF) to Implement Dijkstra’s shortest path algorithm?

Something to play with

with

arcs as

(select '1' src,'2' dst,2 d from dual union all

select '1','3',4 from dual union all

select '1','4',3 from dual union all

select '2','5',7 from dual union all

select '3','5',3 from dual union all

select '4','5',4 from dual union all

select '2','6',4 from dual union all

select '3','6',2 from dual union all

select '4','6',1 from dual union all

select '2','7',6 from dual union all

select '3','7',4 from dual union all

select '4','7',5 from dual union all

select '5','8',1 from dual union all

select '6','8',6 from dual union all

select '7','8',3 from dual union all

select '5','9',4 from dual union all

select '6','9',3 from dual union all

select '7','9',3 from dual union all

select '8','10',3 from dual union all

select '9','10',4 from dual

),

shortest_path(src,dst,path,dist,stop,step) as

(select src,dst,path,dist,null,1

from (select src,:dst dst,src || '~' || dst path,d,min(d) over () dist

from arcs

where src = :src

)

where d = dist

union all

select a.src,

p.dst,p.path || '~' || a.dst || case when a.dst = p.dst then '+' end,

case when a.d = min(a.d) over ()

then p.dist + a.d

when a.dst = p.dst

then p.dist + a.d

end,

last_value(case when a.dst = p.dst then 'stop' end) ignore nulls over (order by null),

step + 1

from arcs a,

shortest_path p

where a.src = regexp_substr(p.path,'\d+\$')

and p.dst != regexp_substr(p.path,'\d+\$')

and p.dist is not null

and p.stop is null

)

select rtrim(path,'+') path,dist

from shortest_path

where stop is not null

and instr(path,'+') > 0

 PATH DIST 1~2~6~9~10 13

Regards

Etbin

• ###### 5. Re: How to use Recursive Subquery Factoring (RSF) to Implement Dijkstra’s shortest path algorithm?

You pick the closest node each time, which does not necessarily lead to the shortest path. Actually it may not lead to the destination at all.

• ###### 6. Re: Re: How to use Recursive Subquery Factoring (RSF) to Implement Dijkstra’s shortest path algorithm?

That's why something to play with stands for.

I just proposed a query (using recursive subquery factoring) that (in some cases) produces a solution, so I thought others might be interested in - I was playing too .

I'm here mainly to learn and to stay exercised as providing a solution of some kind is (for a non-native speaker) much more simple than debating about how it might have been achieved and a forum IMHO is certainly not the place to exhibit proofs whether a certain algorithm is correct or not.

As I'm typing this already I can share my reasoning:

• at each step I pick the nodes reachable from the starting node or from the last node on the current shortest path.
• If the nodes include the destination node I pick the destination node and stop else I concatenate the nearest node to the shortest path incrementing the distance.

I don't claim it solves a general case leaving to others to find examples when it doesn't work.

Regards

Etbin

As there's not much traffic an easy couter example where my query would give an incorrect result

1

/ \

2   3

/      \

2___4___3

\        \

5        2

\        \

\________4

with

arcs as

(select '1' src,'2' dst,2 d from dual union all

select '1','3',3 from dual union all

select '2','3',4 from dual union all

select '2','4',5 from dual union all

select '3','4',2 from dual

)

 PATH5 DIST 1~2~4 7

• ###### 7. Re: How to use Recursive Subquery Factoring (RSF) to Implement Dijkstra’s shortest path algorithm?

Re: Analytic functions in Model clause

• ###### 8. Re: How to use Recursive Subquery Factoring (RSF) to Implement Dijkstra’s shortest path algorithm?

This solution finds all minimum distance routes to each node. I stuck with the lettered nodes in the data set link.

```NODE              LEV PATH                            DST_PATH
---------- ---------- ------------------------------ ----------
B                  1 A,B                                    2
C                  1 A,C                                    4
D                  1 A,D                                    3
E                  2 A,D,E                                  7
2 A,C,E                                  7
F                  2 A,D,F                                  4
G                  2 A,D,G                                  8
2 A,B,G                                  8
2 A,C,G                                  8
H                  3 A,D,E,H                                8
3 A,C,E,H                                8
I                  3 A,D,F,I                                7
J                  4 A,C,E,H,J                              11
4 A,D,F,I,J                              11
4 A,D,E,H,J                              11

15 rows selected.

1  WITH arcs_both_ways AS (
2  SELECT src, dst, distance
3    FROM arcs
4  UNION ALL
5  SELECT dst, src, distance
6    FROM arcs
7  ), paths (node, path, dst_path, rnk, nodes_visited, lev) AS (
8  SELECT a.dst, a.src || ',' || a.dst, a.distance, 1, a.src || ','  || Listagg (a.dst, ',') WITHIN GROUP (ORDER BY a.dst) OVER (), 1
9    FROM arcs_both_ways a
10  WHERE a.src            = :prmSource
11  UNION ALL
12  SELECT a.dst,
13          p.path || ',' || a.dst,
14          p.dst_path + a.distance,
15          Rank () OVER (PARTITION BY a.dst ORDER BY p.dst_path + a.distance),
16          p.nodes_visited || ',' || Listagg (a.dst, ',') WITHIN GROUP (ORDER BY a.dst) OVER (),
17          p.lev + 1
18    FROM paths p
19    JOIN arcs_both_ways a
20      ON a.src = p.node
21    AND p.rnk = 1
22  WHERE Instr (',' || p.path || ',',  ',' || a.dst || ',') = 0
23  ), paths_ranked AS (
24  SELECT lev, node, path, dst_path, Rank () OVER (PARTITION BY node ORDER BY dst_path) rnk_tot
25    FROM paths
26    WHERE rnk = 1
27  )
28  SELECT node, lev, path, dst_path
29    FROM paths_ranked
30    WHERE rnk_tot = 1
31*  ORDER BY 1, 2

```

130415 0705:

Small tidy-up, and use Oracle's cycle direction to replace my instr. Will try to add a comment later today, but suggest to OP to try reversing his source and sink.

```WITH arcs_both_ways AS (
SELECT src, dst, distance
FROM arcs
UNION ALL
SELECT dst, src, distance
FROM arcs
), paths (node, path, dst_path, rnk, lev) AS (
SELECT a.dst, a.src || ',' || a.dst, a.distance, 1, 1
FROM arcs_both_ways a
WHERE a.src             = :prmSource
UNION ALL
SELECT a.dst,
p.path || ',' || a.dst,
p.dst_path + a.distance,
Rank () OVER (PARTITION BY a.dst ORDER BY p.dst_path + a.distance),
p.lev + 1
FROM paths p
JOIN arcs_both_ways a
ON a.src = p.node
AND p.rnk = 1
) CYCLE node SET IS_CYCLE TO '1' DEFAULT '0'
, paths_ranked AS (
SELECT lev, node, path, dst_path, Rank () OVER (PARTITION BY node ORDER BY dst_path) rnk_tot
FROM paths
WHERE rnk = 1
)
SELECT node, lev, path, dst_path
FROM paths_ranked
WHERE rnk_tot = 1
ORDER BY 1, 2
```
• ###### 9. Re: How to use Recursive Subquery Factoring (RSF) to Implement Dijkstra’s shortest path algorithm?

Your query is very similar to OP's and the rank() has the same problem that it can't see the same dst which appears prior to the last level. This isn't as efficient as Dijkstra's algorithm but still gives the correct result.

• ###### 10. Re: How to use Recursive Subquery Factoring (RSF) to Implement Dijkstra’s shortest path algorithm?

It's a very interesting thread, I will spend a day to study it!

• ###### 11. Re: How to use Recursive Subquery Factoring (RSF) to Implement Dijkstra’s shortest path algorithm?

You are right. I will work to imagine how to do that using RSF. I will also implement the first insight you give me.

• ###### 12. Re: How to use Recursive Subquery Factoring (RSF) to Implement Dijkstra’s shortest path algorithm?

I know that this is not Dijkstra’s. It’s the reason of this post. Below is the "brute force" shortest path. This enumerates 107 paths. In comparison, the first query that I propose only enumerate 27 paths. It's less but it can be better as @James Su mention. We can add a global cut after the first solution is found. We also have to found how to compare branches from different level.

/*Shortest path*/

WITH UsefulArcs AS

(SELECT SRC, DST, DISTANCE FROM ARCS

),

Paths (SrcArc, DestArc, SrcPath, DestPath, Path, OpDist, DistPath, DistPathMin) AS

(SELECT SRC AS SrcArc ,

DST AS DestArc ,

SRC AS SrcPath ,

DST AS DestPath ,

SRC

||'>'

|| DST AS Path ,

''

||DISTANCE AS OpDist ,

DISTANCE AS DistPath,

DISTANCE AS DistPathMin

FROM UsefulArcs

UNION ALL

SELECT a.SRC AS SrcArc,

a.DST AS DestArc,

Paths.SrcPath ,

a.DST AS DestPath ,

Paths.Path

||'>'

||a.DST AS Path,

Paths.OpDist

||'+'

||a.DISTANCE AS OpDist ,

Paths.DistPath    +a.DISTANCE                          AS ,

MIN(Paths.DistPath+a.DISTANCE) OVER (PARTITION BY a.DST) AS DistPathMin

FROM UsefulArcs a

INNER JOIN Paths

ON (Paths.DestArc=a.SRC)

) SEARCH BREADTH FIRST BY DestPath

--SEARCH DEPTH FIRST BY DestPath

SET Order01

SELECT *

FROM Paths

WHERE SrcPath=:prmSource

AND DestPath =:prmSink

AND DistPath =

(SELECT MIN(DistPath) FROM Paths WHERE SrcPath=:prmSource AND DestPath =:prmSink

)

ORDER BY DistPath;

• ###### 13. Re: How to use Recursive Subquery Factoring (RSF) to Implement Dijkstra’s shortest path algorithm?

Blais wrote:

You are right. I will work to imagine how to do that using RSF. I will also implement the first insight you give me.

It had better to quickly find one path to destination and use its distance as reference to eliminate all other paths with longer distances for next level process.  Otherwise one may take a trip around the would and then back to the destination .  Before that it is hard to drop any path for further blind searching.

But for large collection of nodes and far away between start and destination points, at any intermediate point (M), you can pick the one with MIN distance in EACH SAME group and continue searching, eg for all rows in such group.  Such aggregate function during recursive process will be expensive but it can certainly save PGA for longer process.

(S)>A>B>(M)

(S)>B>A>(M)

• ###### 14. Re: How to use Recursive Subquery Factoring (RSF) to Implement Dijkstra’s shortest path algorithm?

How do you pick the M point? If it's from the neighbours then this is already described in the algorithm.

1 2 Previous Next