## Forum Stats

• 3,733,057 Users
• 2,246,687 Discussions

Discussions

#### Howdy, Stranger!

It looks like you're new here. If you want to get involved, click one of these buttons!

# How to use Recursive Subquery Factoring (RSF) to Implement Dijkstra’s shortest path algorithm?

edited May 2015

Can someone who knows Recursive Subquery Factoring (RSF) can tell me if my Dijkstra’s shortest path algorithm is implement correctly and efficiently? Below you will found my SQL query and a trial dataset.

Below my query, a trial dataset and table definition to contain the data:

/*Dijkstra Shortest path*/

WITH UsefulArcs AS

(SELECT SRC, DST, DISTANCE FROM ARCS

),

Paths (SrcArc, DestArc, SrcPath, DestPath, Path, OpDist, DistPath, DistPathMin) AS

(SELECT SRC AS SrcArc ,

DST AS DestArc ,

SRC AS SrcPath ,

DST AS DestPath ,

SRC

||'>'

|| DST AS Path ,

''

||DISTANCE AS OpDist ,

DISTANCE AS DistPath,

DISTANCE AS DistPathMin

FROM UsefulArcs

WHERE SRC=:prmSource

UNION ALL

SELECT a.SRC AS SrcArc,

a.DST AS DestArc,

Paths.SrcPath ,

a.DST AS DestPath ,

Paths.Path

||'>'

||a.DST AS Path,

Paths.OpDist

||'+'

||a.DISTANCE AS OpDist ,

Paths.DistPath    +a.DISTANCE                          AS DistPath,

MIN(Paths.DistPath+a.DISTANCE) OVER (PARTITION BY a.DST) AS DistPathMin

FROM UsefulArcs a

INNER JOIN Paths

ON (Paths.DestArc        =a.SRC)

WHERE Paths.DistPathMin=Paths.DistPath

) SEARCH BREADTH FIRST BY DestPath

--SEARCH DEPTH FIRST BY DestPath

SET Order01

SELECT *

FROM Paths

WHERE SrcPath          =:prmSource

AND DestPath           =:prmSink

AND DistPath =

(SELECT MIN(DistPath) FROM Paths WHERE SrcPath=:prmSource AND DestPath =:prmSink

)

ORDER BY DistPath;

Set prmSource=1 and prmSink=10.

Dijkstra’s algorithm definition: http://en.wikipedia.org/wiki/Dijkstra%27s_algorithm

Problems data: http://www.ime.unicamp.br/~andreani/MS515/capitulo7.pdf

CREATE TABLE "ARCS"

(   "SRC" NUMBER(3,0),

"DST" NUMBER(3,0),

"DISTANCE" NUMBER(3,0)

) ;

CREATE TABLE "NODES"

(   "NODES" NUMBER(3,0)

) ;

REM INSERTING into ARCS

SET DEFINE OFF;

Insert into ARCS (SRC,DST,DISTANCE) values ('1','2','2');

Insert into ARCS (SRC,DST,DISTANCE) values ('1','3','4');

Insert into ARCS (SRC,DST,DISTANCE) values ('1','4','3');

Insert into ARCS (SRC,DST,DISTANCE) values ('2','5','7');

Insert into ARCS (SRC,DST,DISTANCE) values ('3','5','3');

Insert into ARCS (SRC,DST,DISTANCE) values ('4','5','4');

Insert into ARCS (SRC,DST,DISTANCE) values ('2','6','4');

Insert into ARCS (SRC,DST,DISTANCE) values ('3','6','2');

Insert into ARCS (SRC,DST,DISTANCE) values ('4','6','1');

Insert into ARCS (SRC,DST,DISTANCE) values ('2','7','6');

Insert into ARCS (SRC,DST,DISTANCE) values ('3','7','4');

Insert into ARCS (SRC,DST,DISTANCE) values ('4','7','5');

Insert into ARCS (SRC,DST,DISTANCE) values ('5','8','1');

Insert into ARCS (SRC,DST,DISTANCE) values ('6','8','6');

Insert into ARCS (SRC,DST,DISTANCE) values ('7','8','3');

Insert into ARCS (SRC,DST,DISTANCE) values ('5','9','4');

Insert into ARCS (SRC,DST,DISTANCE) values ('6','9','3');

Insert into ARCS (SRC,DST,DISTANCE) values ('7','9','3');

Insert into ARCS (SRC,DST,DISTANCE) values ('8','10','3');

Insert into ARCS (SRC,DST,DISTANCE) values ('9','10','4');

REM INSERTING into NODES

SET DEFINE OFF;

Insert into NODES (NODES) values ('1');

Insert into NODES (NODES) values ('2');

Insert into NODES (NODES) values ('3');

Insert into NODES (NODES) values ('4');

Insert into NODES (NODES) values ('5');

Insert into NODES (NODES) values ('6');

Insert into NODES (NODES) values ('7');

Insert into NODES (NODES) values ('8');

Insert into NODES (NODES) values ('9');

Insert into NODES (NODES) values ('10');

CREATE UNIQUE INDEX "ARCS_PK1" ON "ARCS" ("SRC", "DST")

;

CREATE UNIQUE INDEX "NOEUDS_PK1" ON "NODES" ("NODES")

;

ALTER TABLE "ARCS" MODIFY ("SRC" NOT NULL ENABLE);

ALTER TABLE "ARCS" MODIFY ("DST" NOT NULL ENABLE);

ALTER TABLE "ARCS" MODIFY ("DISTANCE" NOT NULL ENABLE);

ALTER TABLE "ARCS" ADD CONSTRAINT "ARCS_PK" PRIMARY KEY ("SRC", "DST") ENABLE;

ALTER TABLE "NODES" MODIFY ("NODES" NOT NULL ENABLE);

ALTER TABLE "NODES" ADD CONSTRAINT "NOEUDS_PK" PRIMARY KEY ("NODES") ENABLE;

ALTER TABLE "ARCS" ADD CONSTRAINT "ARCS_FK1" FOREIGN KEY ("SRC")

REFERENCES "NODES" ("NODES") ENABLE;

ALTER TABLE "ARCS" ADD CONSTRAINT "ARCS_FK2" FOREIGN KEY ("DST")

REFERENCES "NODES" ("NODES") ENABLE;

• Member Posts: 1,100 Silver Trophy
edited April 2015

There's one thing you can improve. If prmSink already exists in the path you can prune all the branches longer than that known path.

• Member Posts: 494 Silver Badge
edited April 2015

This is no implementation of /*Dijkstra Shortest path*/ imo, but a "simple" brute force algorithm, wich checks all possible ways from prmSource.

• Member Posts: 1,100 Silver Trophy
edited April 2015

There's a restriction with RSF: in the recursive part you can only see the last level of data.

So the Paths.DistPathMin=Paths.DistPath condition only works for the paths having the same length.

Insert into ARCS (SRC,DST,DISTANCE) values (1,5,4);

Now path >1>5>8>10> is the best, but the query wil still search along >1>3>5>8>10> and >1>4>5>8>10> because when it sees "5" at level 3, the "5" at level 2 is invisible and can't be included in the calculation of DistPathMin.

In order to see the level 2 "5" node we can add fake nodes to bring the data from all previous level:

WITH UsefulArcs AS

(SELECT SRC, DST, DISTANCE FROM ARCS

UNION SELECT SRC,SRC,0 FROM ARCS  ------- add fake nodes

UNION SELECT DST,DST,0 FROM ARCS

),

Paths (SrcArc, SrcPath, DestPath, Path, OpDist, DistPath, DistPathMin,LVL,CNT,found_dist) AS

(SELECT SRC AS SrcArc,

SRC AS SrcPath ,

DST AS DestPath ,

CAST('>'||SRC||'>'||DST||'>' AS VARCHAR2(4000)) AS Path ,

''

||DISTANCE AS OpDist ,

DISTANCE AS DistPath,

DISTANCE AS DistPathMin

,1 AS LVL

,COUNT(*) OVER() AS CNT

,MIN(CASE WHEN DST=:prmSink THEN DISTANCE END) OVER() AS found_dist ---- the best found distance

FROM UsefulArcs

WHERE SRC=:prmSource AND SRC<>DST

UNION ALL

SELECT a.SRC AS SrcArc,

Paths.SrcPath ,

a.DST  AS DestPath ,

Paths.Path

||CASE WHEN a.DST<>a.SRC THEN a.DST||'>' END AS Path,

Paths.OpDist

||CASE WHEN a.DST <>a.SRC THEN '+'||a.DISTANCE END AS OpDist ,

Paths.DistPath    +a.DISTANCE                    AS DistPath,

MIN(Paths.DistPath+a.DISTANCE) OVER (PARTITION BY a.DST) AS DistPathMin

,LVL+1

,COUNT(CASE WHEN a.DST <>a.SRC THEN a.src END) OVER() AS CNT

,MIN(CASE WHEN a.DST=:prmSink AND (paths.found_dist IS NULL OR Paths.DistPath+a.DISTANCE<paths.found_dist)

THEN Paths.DistPath+a.DISTANCE ---- if new path found and it's better than replace the old one

ELSE paths.found_dist

END) OVER() AS found_dist

FROM UsefulArcs a

JOIN Paths

ON (Paths.DestPath        =a.SRC)

AND (Paths.SrcArc<>Paths.DestPath AND INSTR(Paths.Path,'>'||a.DST||'>')=0 ---- add only new destinations

OR a.DST=a.SRC    ---- or it's the end

)

WHERE Paths.DistPathMin=Paths.DistPath

AND cnt>0 ------- if no more new nodes then it's the end

AND paths.DestPath<>:prmSink   ---- if we reach prmSink then no need to continue on this path

OR paths.found_dist>=Paths.DistPath+a.DISTANCE --- if not better than the path found then stop

)

)

CYCLE path,lvl SET cycle_flag TO 'Y' DEFAULT 'N'  ------- cycle may appear because of outer join may keep the same path from last level

SELECT DISTINCT path,OpDist||'='||DistPath as distance

FROM Paths

WHERE SrcPath          =:prmSource

AND DestPath     =:prmSink

AND DistPath = found_dist

;

But this raises another issue when nodes will be repeated at all levels.

Actually in the final select we only need the latest level of data, but unfortunately there's no option to discard the older levels during the recursive process.

If we rewrite this RSF into a PL/SQL loop then it's possible to keep only the latest level and implement this algorithm.

• Member Posts: 8,968 Gold Crown
edited April 2015

Something to play with

with

arcs as

(select '1' src,'2' dst,2 d from dual union all

select '1','3',4 from dual union all

select '1','4',3 from dual union all

select '2','5',7 from dual union all

select '3','5',3 from dual union all

select '4','5',4 from dual union all

select '2','6',4 from dual union all

select '3','6',2 from dual union all

select '4','6',1 from dual union all

select '2','7',6 from dual union all

select '3','7',4 from dual union all

select '4','7',5 from dual union all

select '5','8',1 from dual union all

select '6','8',6 from dual union all

select '7','8',3 from dual union all

select '5','9',4 from dual union all

select '6','9',3 from dual union all

select '7','9',3 from dual union all

select '8','10',3 from dual union all

select '9','10',4 from dual

),

shortest_path(src,dst,path,dist,stop,step) as

(select src,dst,path,dist,null,1

from (select src,:dst dst,src || '~' || dst path,d,min(d) over () dist

from arcs

where src = :src

)

where d = dist

union all

select a.src,

p.dst,p.path || '~' || a.dst || case when a.dst = p.dst then '+' end,

case when a.d = min(a.d) over ()

then p.dist + a.d

when a.dst = p.dst

then p.dist + a.d

end,

last_value(case when a.dst = p.dst then 'stop' end) ignore nulls over (order by null),

step + 1

from arcs a,

shortest_path p

where a.src = regexp_substr(p.path,'\d+\$')

and p.dst != regexp_substr(p.path,'\d+\$')

and p.dist is not null

and p.stop is null

)

select rtrim(path,'+') path,dist

from shortest_path

where stop is not null

and instr(path,'+') > 0

 PATH DIST 1~2~6~9~10 13

Regards

Etbin

• Member Posts: 1,100 Silver Trophy
edited April 2015

You pick the closest node each time, which does not necessarily lead to the shortest path. Actually it may not lead to the destination at all.

• Member Posts: 8,968 Gold Crown
edited April 2015

That's why something to play with stands for.

I just proposed a query (using recursive subquery factoring) that (in some cases) produces a solution, so I thought others might be interested in - I was playing too .

I'm here mainly to learn and to stay exercised as providing a solution of some kind is (for a non-native speaker) much more simple than debating about how it might have been achieved and a forum IMHO is certainly not the place to exhibit proofs whether a certain algorithm is correct or not.

As I'm typing this already I can share my reasoning:

• at each step I pick the nodes reachable from the starting node or from the last node on the current shortest path.
• If the nodes include the destination node I pick the destination node and stop else I concatenate the nearest node to the shortest path incrementing the distance.

I don't claim it solves a general case leaving to others to find examples when it doesn't work.

Regards

Etbin

As there's not much traffic an easy couter example where my query would give an incorrect result

1

/ \

2   3

/      \

2___4___3

\        \

5        2

\        \

\________4

with

arcs as

(select '1' src,'2' dst,2 d from dual union all

select '1','3',3 from dual union all

select '2','3',4 from dual union all

select '2','4',5 from dual union all

select '3','4',2 from dual

)

 PATH5 DIST 1~2~4 7

• Member Posts: 3,513 Bronze Crown
edited April 2015

• Member Posts: 383 Bronze Badge
edited April 2015

This solution finds all minimum distance routes to each node. I stuck with the lettered nodes in the data set link.

```NODE              LEV PATH                            DST_PATH
---------- ---------- ------------------------------ ----------
B                  1 A,B                                    2
C                  1 A,C                                    4
D                  1 A,D                                    3
E                  2 A,D,E                                  7
2 A,C,E                                  7
F                  2 A,D,F                                  4
G                  2 A,D,G                                  8
2 A,B,G                                  8
2 A,C,G                                  8
H                  3 A,D,E,H                                8
3 A,C,E,H                                8
I                  3 A,D,F,I                                7
J                  4 A,C,E,H,J                              11
4 A,D,F,I,J                              11
4 A,D,E,H,J                              11

15 rows selected.

1  WITH arcs_both_ways AS (
2  SELECT src, dst, distance
3    FROM arcs
4  UNION ALL
5  SELECT dst, src, distance
6    FROM arcs
7  ), paths (node, path, dst_path, rnk, nodes_visited, lev) AS (
8  SELECT a.dst, a.src || ',' || a.dst, a.distance, 1, a.src || ','  || Listagg (a.dst, ',') WITHIN GROUP (ORDER BY a.dst) OVER (), 1
9    FROM arcs_both_ways a
10  WHERE a.src            = :prmSource
11  UNION ALL
12  SELECT a.dst,
13          p.path || ',' || a.dst,
14          p.dst_path + a.distance,
15          Rank () OVER (PARTITION BY a.dst ORDER BY p.dst_path + a.distance),
16          p.nodes_visited || ',' || Listagg (a.dst, ',') WITHIN GROUP (ORDER BY a.dst) OVER (),
17          p.lev + 1
18    FROM paths p
19    JOIN arcs_both_ways a
20      ON a.src = p.node
21    AND p.rnk = 1
22  WHERE Instr (',' || p.path || ',',  ',' || a.dst || ',') = 0
23  ), paths_ranked AS (
24  SELECT lev, node, path, dst_path, Rank () OVER (PARTITION BY node ORDER BY dst_path) rnk_tot
25    FROM paths
26    WHERE rnk = 1
27  )
28  SELECT node, lev, path, dst_path
29    FROM paths_ranked
30    WHERE rnk_tot = 1
31*  ORDER BY 1, 2

```

130415 0705:

Small tidy-up, and use Oracle's cycle direction to replace my instr. Will try to add a comment later today, but suggest to OP to try reversing his source and sink.

```WITH arcs_both_ways AS (
SELECT src, dst, distance
FROM arcs
UNION ALL
SELECT dst, src, distance
FROM arcs
), paths (node, path, dst_path, rnk, lev) AS (
SELECT a.dst, a.src || ',' || a.dst, a.distance, 1, 1
FROM arcs_both_ways a
WHERE a.src             = :prmSource
UNION ALL
SELECT a.dst,
p.path || ',' || a.dst,
p.dst_path + a.distance,
Rank () OVER (PARTITION BY a.dst ORDER BY p.dst_path + a.distance),
p.lev + 1
FROM paths p
JOIN arcs_both_ways a
ON a.src = p.node
AND p.rnk = 1
) CYCLE node SET IS_CYCLE TO '1' DEFAULT '0'
, paths_ranked AS (
SELECT lev, node, path, dst_path, Rank () OVER (PARTITION BY node ORDER BY dst_path) rnk_tot
FROM paths
WHERE rnk = 1
)
SELECT node, lev, path, dst_path
FROM paths_ranked
WHERE rnk_tot = 1
ORDER BY 1, 2
```
• Member Posts: 1,100 Silver Trophy
edited April 2015

Your query is very similar to OP's and the rank() has the same problem that it can't see the same dst which appears prior to the last level. This isn't as efficient as Dijkstra's algorithm but still gives the correct result.

• Member Posts: 1,100 Silver Trophy
edited April 2015

It's a very interesting thread, I will spend a day to study it!

• Member Posts: 32 Bronze Badge
edited April 2015

You are right. I will work to imagine how to do that using RSF. I will also implement the first insight you give me.

• Member Posts: 32 Bronze Badge
edited April 2015

I know that this is not Dijkstra’s. It’s the reason of this post. Below is the "brute force" shortest path. This enumerates 107 paths. In comparison, the first query that I propose only enumerate 27 paths. It's less but it can be better as @James Su mention. We can add a global cut after the first solution is found. We also have to found how to compare branches from different level.

/*Shortest path*/

WITH UsefulArcs AS

(SELECT SRC, DST, DISTANCE FROM ARCS

),

Paths (SrcArc, DestArc, SrcPath, DestPath, Path, OpDist, DistPath, DistPathMin) AS

(SELECT SRC AS SrcArc ,

DST AS DestArc ,

SRC AS SrcPath ,

DST AS DestPath ,

SRC

||'>'

|| DST AS Path ,

''

||DISTANCE AS OpDist ,

DISTANCE AS DistPath,

DISTANCE AS DistPathMin

FROM UsefulArcs

UNION ALL

SELECT a.SRC AS SrcArc,

a.DST AS DestArc,

Paths.SrcPath ,

a.DST AS DestPath ,

Paths.Path

||'>'

||a.DST AS Path,

Paths.OpDist

||'+'

||a.DISTANCE AS OpDist ,

Paths.DistPath    +a.DISTANCE                          AS ,

MIN(Paths.DistPath+a.DISTANCE) OVER (PARTITION BY a.DST) AS DistPathMin

FROM UsefulArcs a

INNER JOIN Paths

ON (Paths.DestArc=a.SRC)

) SEARCH BREADTH FIRST BY DestPath

--SEARCH DEPTH FIRST BY DestPath

SET Order01

SELECT *

FROM Paths

WHERE SrcPath=:prmSource

AND DestPath =:prmSink

AND DistPath =

(SELECT MIN(DistPath) FROM Paths WHERE SrcPath=:prmSource AND DestPath =:prmSink

)

ORDER BY DistPath;

• Member Posts: 462
edited April 2015
```Blais wrote:

You are right. I will work to imagine how to do that using RSF. I will also implement the first insight you give me.
```

It had better to quickly find one path to destination and use its distance as reference to eliminate all other paths with longer distances for next level process.  Otherwise one may take a trip around the would and then back to the destination .  Before that it is hard to drop any path for further blind searching.

But for large collection of nodes and far away between start and destination points, at any intermediate point (M), you can pick the one with MIN distance in EACH SAME group and continue searching, eg for all rows in such group.  Such aggregate function during recursive process will be expensive but it can certainly save PGA for longer process.

(S)>A>B>(M)

(S)>B>A>(M)

• Member Posts: 1,100 Silver Trophy
edited April 2015

How do you pick the M point? If it's from the neighbours then this is already described in the algorithm.

• Member Posts: 1,100 Silver Trophy
edited April 2015

You are right, it does look like a poem and it's very enjoyable reading experience. Thank you!

• Member Posts: 462
edited April 2015

hmm, my understanding is at start point it can move to any neighbors in all directions but each node can be only in the path once, eg, only instr(path, next neighbor)=0 is valid path at next level.

So there are 4 paths from S to M (S>A>M, S>B>M, S>A>B>M, S>B>A>M).  And the later 2 can be reduced to one for further searching.  Such intermediate points are possibly existing at level 3 and beyond.

S --------- A -------------

|             |                |

|_______B ----------- M ------------------------- ........... ----------------------------- D

• Member Posts: 3,513 Bronze Crown
edited April 2015
```Blais wrote:

I know that this is not Dijkstra’s. It’s the reason of this post.
```

In your initial post you asked: "if my Dijkstra’s shortest path algorithm is implement correctly"?

As far as i know the Dijkstra algorithm only leads to one shortest path and not to all shortest pathes.

In the thread i mentioned above i implemented Dijkstra by using model clause. The solution works for your example too (10-9-6-4-1=11).

As far as i remember i came to the conclusion that it is not possible to implement the Dijkstra by using recursive subquery, but i may be wrong.

(That does not mean that it is not possible to solve shortest path problems at all with this tool!)

• Member Posts: 3,513 Bronze Crown
edited April 2015
```with agraph as (
select src source, dst destination, distance from arcs
union all
select 0, 1, 0 from dual
)
,m as ( -- eval the nodes from the arcs, not from the nodes table
select source n from agraph
union
select destination from agraph
)
, -- calculates all shortest pathes from start node (#1) to all other nodes
-- reachable from the start node
-- if another start node should be choosen just renumerate them so that
-- the new one is #1
pathes as (select nn node, p predecessor, d dist from m
model
reference destinations
on (select source, destination, distance
from agraph)
dimension by (source,destination)
measures (source s, distance dist, destination dest)
main mm
dimension by(n)
measures(n as nn, 0 as p, case n when 0 then 0 else 1000 end as d, 0 as v, 0 as it)
UNIQUE SINGLE REFERENCE
rules UPDATE
iterate (2000) until d[0]=1
(
p[0]=min(nn) keep(dense_rank first order by v,d)[any]
,it[p[0]]=ITERATION_number+1
,p[any] order by d,n=
case
when
cv(n) = dest[p[0],cv(n)] and
d[cv(n)] > dist[p[0],cv(n)] + d[p[0]]
then s[p[0],cv(n)]
else p[cv(n)]
end
,d[any] order by d,n=
case
when
cv(n) = dest[p[0],cv(n)] and
d[cv(n)] > dist[p[0],cv(n)] + d[p[0]]
then dist[p[0],cv(n)] + d[p[0]]
else d[cv(n)]
end
,v[p[0]]=1
,d[0]=min(v)[any]
))

select
ltrim(min(sys_connect_by_path (node, '<-')) keep (dense_rank first order by dist), '-<') path
,max(dist) dist
from pathes
connect by nocycle node = prior predecessor

PATHDIST

10<-9<-6<-4<-1
11

```
• Member Posts: 1,100 Silver Trophy
edited April 2015

This algorithm is for directed graph. And it will only keep the shortest path and never needs to merge.

• Member Posts: 1,100 Silver Trophy
edited April 2015

Aggregation is not allowed in recursive subquery, but analytic function is supported so we can achieve the same goal. Now the only issue is we might end up with too much intermediate data in the final result set. Otherwise it already implements this algorithm.

• Member Posts: 462
edited April 2015

It seems I have different understanding.  To me it is like puzzle palace game, one starts at entrance and try to exit.  So one may either reach to a dead end or luckily get out in different paths sooner or later.

But if one gets some (good) directions at certain points/nodes then it is not puzzle palace game.

• Member Posts: 462
edited April 2015

Remember reading document about using aggregate and/or analytic function in recursive sql.  And based on some posts here it seems the case for 11.2.0.3+.  Do you know if it is the case for 12c?

• Member Posts: 1,100 Silver Trophy
edited April 2015

Aggregation is never supported; analytic function has been supported in document since RSF was first introduced in 11gR2, but it was not working in 11.2.0.1. The first time I saw it working was in 11.2.0.3. It's also working in 12c.

• Member Posts: 462
edited April 2015

I see.  Thanks.  Have it long in mind to test the aggregate function in scalar query using tmp table in recursive sql and just keep slipping so far.  But it is time to install 12c.

• Member Posts: 383 Bronze Badge
edited May 2015

Having had a closer look at the query in Blais' original post, I agree with James Su that mine above is very similar.

If anyone is still interested I wrote more about this approach here "SQL for Shortest Path Problems | A Programmer Writes… (Brendan's Blog)"

Update, 4 May 2015: I posted this follow-up: SQL for Shortest Path Problems 2: A Branch and Bound Approach

This discussion has been closed.