This discussion is archived
13 Replies Latest reply: Jan 4, 2013 2:00 AM by Stew Ashton RSS

How to find the id of the node given the path using connect by?

982585 Newbie
Currently Being Moderated
I have a table like this:

CREATE TABLE tab1 (Id INTEGER, Name VARCHAR2(100), ParentId INTEGER)

Let's say I have the following rows:

Id name ParentId

1 X NULL
2 Y 1
3 Z 2
4 A 3

Now, given the path /X/Y/Z/A, I need to return 4

Is it possible to achieve this using CONNECT BY?

If it helps, I have over simplified the scenario - that a node has only one child. In reality, a node can have many children.

Thanks.
  • 1. Re: How to find the id of the node given the path using connect by?
    Frank Kulash Guru
    Currently Being Moderated
    Hi,

    Welcome to the forum!
    user2888313 wrote:
    I have a table like this:

    CREATE TABLE tab1 (Id INTEGER, Name VARCHAR2(100), ParentId INTEGER)

    Let's say I have the following rows:

    Id name ParentId

    1 X NULL
    2 Y 1
    3 Z 2
    4 A 3

    Now, given the path /X/Y/Z/A, I need to return 4

    Is it possible to achieve this using CONNECT BY?
    I'm not sure what you're asking. Use the pseduo-column LEVEL, or the column id, depending on what you want.
    If it helps, I have over simplified the scenario - that a node has only one child. In reality, a node can have many children.
    No problem; a parent can have any number of children. In fact, a child can have any number of parents, too.

    If you're worried about the test case being over-simplified, then don't over-simplify it. It's easy enough to post an example that includes a multiple children for the same parent.
    Whatever you do, post INSERT stateements for your sample data, and post the exact results you want from that data.
    See the forum FAQ {message:id=9360002}
  • 2. Re: How to find the id of the node given the path using connect by?
    L-MachineGun Pro
    Currently Being Moderated
    user2888313 wrote:
    I have a table like this:

    CREATE TABLE tab1 (Id INTEGER, Name VARCHAR2(100), ParentId INTEGER)
    . . . Etc . . .

    Now, given the path /X/Y/Z/A, I need to return 4
    Like this? I added a child:
    WITH my_tab
         AS (SELECT 1 id, 'X' name, NULL parentid FROM DUAL UNION
             SELECT 2 id, 'Y' name, 1 parentid    FROM DUAL UNION
             SELECT 3 id, 'Z' name, 2 parentid    FROM DUAL UNION
             SELECT 4 id, 'A' name, 3 parentid    FROM DUAL UNION
             SELECT 5 id, 'B' name, 3 parentid    FROM DUAL)
    SELECT *
      FROM ( 
       SELECT t.*, RANK () OVER (ORDER BY LEVEL DESC) rn
                  FROM my_tab t
            START WITH parentid IS NULL
            CONNECT BY PRIOR id = parentid
            )
     WHERE rn = 1
    /
    :p
  • 3. Re: How to find the id of the node given the path using connect by?
    982585 Newbie
    Currently Being Moderated
    Thanks for the suggestions - will follow from now on.

    I am not clear how to use the LEVEL pseudo column. Should I just start from the root, find all paths up to the level I am looking for? Could you please give me the syntax?

    Thanks again.
  • 4. Re: How to find the id of the node given the path using connect by?
    982585 Newbie
    Currently Being Moderated
    Thanks, in this solution I have to construct the query with the parts of the path each time, correct? is there a way to generalize the query for any level in the path?
  • 5. Re: How to find the id of the node given the path using connect by?
    Frank Kulash Guru
    Currently Being Moderated
    Hi,
    user2888313 wrote:
    Thanks for the suggestions - will follow from now on.
    Here's one way to post the sample data, given the CREATE TABLE statement you posted earlier:
    -- Basic data:
    INSERT INTO tab1 (id, name, parentid) VALUES (1,  'X', NULL);
    INSERT INTO tab1 (id, name, parentid) VALUES (2,  'Y', 1);
    INSERT INTO tab1 (id, name, parentid) VALUES (3,  'Z', 2);
    INSERT INTO tab1 (id, name, parentid) VALUES (4,  'A', 3);
    
    -- To test branching (i.e., multiple children for the same parent):
    INSERT INTO tab1 (id, name, parentid) VALUES (11, 'P', 1);
    INSERT INTO tab1 (id, name, parentid) VALUES (12, 'Q', 1);
    INSERT INTO tab1 (id, name, parentid) VALUES (13, 'R', 12);
    Alternatively, you could post a WITH clause, as someone did above.
    I am not clear how to use the LEVEL pseudo column. Should I just start from the root, find all paths up to the level I am looking for? Could you please give me the syntax?
    Sorry, I'm still not clear what you want, or why you want it.
    Do you want 4 because 'A' is the 4th generation in this family tree (that is, because 'A' has 3 ancestors), or do you want 4 because id=4 is on the same row as name='A'? In the former case, use LEVEL; in the latter, use the id column.
    This query shows both:
    SELECT  SYS_CONNECT_BY_PATH (name, '/')          AS name_path
    ,     LEVEL                                AS lvl
    ,     id
    FROM     tab1
    START WITH     parentid     IS NULL
    CONNECT BY     parentid     = PRIOR id
    ;
    Output from the expanded sample data:
    NAME_PATH                   LVL         ID
    -------------------- ---------- ----------
    /X                            1          1
    /X/Y                          2          2
    /X/Y/Z                        3          3
    /X/Y/Z/A                      4          4
    /X/P                          2         11
    /X/Q                          2         12
    /X/Q/R                        3         13
  • 6. Re: How to find the id of the node given the path using connect by?
    982585 Newbie
    Currently Being Moderated
    I want '4' because id=4 is on the same row as name='A.

    I want the query to select just the Id Column when the 'path' matches, meaning, the query should just return one row in the result set. Is there a way to filter/match on the SYS_CONNECT_BY_PATH?

    I can do the following query, but I think it will be inefficient, since all paths must be enumerated first:

    WITH my_tab AS (
    SELECT SYS_CONNECT_BY_PATH (name, '/')     AS name_path
    ,     LEVEL                    AS lvl
    ,     id
    FROM      tab1
    WHERE LEVEL <= 4
    START WITH     parentid     IS NULL
    CONNECT BY     parentid     = PRIOR id
    )
    SELECT id from my_tab where name_path = '/X/Y/Z/A'
  • 7. Re: How to find the id of the node given the path using connect by?
    Frank Kulash Guru
    Currently Being Moderated
    Hi,
    user2888313 wrote:
    I want '4' because id=4 is on the same row as name='A.

    I want the query to select just the Id Column when the 'path' matches, meaning, the query should just return one row in the result set. Is there a way to filter/match on the SYS_CONNECT_BY_PATH?
    I see now! A string such as '/X/Y/Z/A' will be an input to the query, not the output.
    I can do the following query, but I think it will be inefficient, since all paths must be enumerated first:
    That's right; CONNECT BY is slow under any circumstances, and there's often no way to keep it from going down all paths. Recursive WITH clauses (new in Oracle 11.2) are better at that.
    WITH my_tab AS (
    SELECT SYS_CONNECT_BY_PATH (name, '/')     AS name_path
    ,     LEVEL                    AS lvl
    ,     id
    FROM      tab1
    WHERE LEVEL <= 4
    START WITH     parentid     IS NULL
    CONNECT BY     parentid     = PRIOR id
    )
    SELECT id from my_tab where name_path = '/X/Y/Z/A'
    How about
    VARIABLE  name_path     VARCHAR2 (20)
    EXEC     :name_path := '/X/Y/Z/A';
    
    SELECT     :name_path
    ,     id
    FROM     tab1
    WHERE     name     = SUBSTR ( :name_path
                      , 1 + INSTR (:name_path, '/', -1)
                    )
    ;
    ?
  • 8. Re: How to find the id of the node given the path using connect by?
    982585 Newbie
    Currently Being Moderated
    Looks like the assumption is that the full path is stored in 'name' column. But I am not doing that. I only store the name of that particular node in 'name' column and not the entire node path.
  • 9. Re: How to find the id of the node given the path using connect by?
    Frank Kulash Guru
    Currently Being Moderated
    Hi,
    user2888313 wrote:
    Looks like the assumption is that the full path is stored in 'name' column...
    No, only a single name, like 'X' or 'A', is stored in the name column, as in the INSERT statements I posted earlier. Given that data, the query I posted produces the results you requested. have you tried it? If it doesn't work for some other data and/or parameter, then post an example.
  • 10. Re: How to find the id of the node given the path using connect by?
    982585 Newbie
    Currently Being Moderated
    Fantastic. Thanks a lot..

    There is one problem though - the query assumes names are unique.

    If the names are not unique, then the query returns multiple rows.

    For example:

    INSERT INTO tab1 (id, name, parentid) VALUES (5, 'A', 2);

    Then the query returns two rows..

    Edited by: user2888313 on Jan 3, 2013 8:55 AM

    Edited by: user2888313 on Jan 3, 2013 9:00 AM
  • 11. Re: How to find the id of the node given the path using connect by?
    Stew Ashton Expert
    Currently Being Moderated
    This alternative could be more efficient, since it just goes down the path you indicate:
    WITH DATA AS (SELECT '/X/Y/Z/A' || '/' INPUT_PATH FROM DUAL)
    SELECT A.*
    FROM TAB1 A, DATA B
    START WITH A.NAME = SUBSTR(
      b.INPUT_PATH,
      2,
      INSTR(b.INPUT_PATH, '/', 1, 2) - 2
    )
    CONNECT BY a.NAME = SUBSTR(
      B.INPUT_PATH,
      INSTR(B.INPUT_PATH, '/', 1, LEVEL) + 1,
      INSTR(b.INPUT_PATH, '/', 1, LEVEL+1) - INSTR(b.INPUT_PATH, '/', 1, LEVEL) -1
    );
    
    ID NAME      PARENTID
    -- --------- --------
     1 X                  
     2 Y                1 
     3 Z                2 
     4 A                3
    If you just want the last row:
    WITH DATA AS (SELECT '/X/Y/Z/A' || '/' INPUT_PATH FROM DUAL)
    SELECT A.*
    FROM TAB1 A, DATA B
    where connect_by_isleaf = 1
    START WITH A.NAME = SUBSTR(
      b.INPUT_PATH,
      2,
      INSTR(b.INPUT_PATH, '/', 1, 2) - 2
    )
    CONNECT BY a.NAME = SUBSTR(
      B.INPUT_PATH,
      INSTR(B.INPUT_PATH, '/', 1, LEVEL) + 1,
      INSTR(b.INPUT_PATH, '/', 1, LEVEL+1) - INSTR(b.INPUT_PATH, '/', 1, LEVEL) -1
    );
    
    ID NAME      PARENTID
    -- --------- --------
     4 A                3
  • 12. Re: How to find the id of the node given the path using connect by?
    982585 Newbie
    Currently Being Moderated
    If you have multiple rows with the same name, the query returns all the leaf nodes with that last node name.
  • 13. Re: How to find the id of the node given the path using connect by?
    Stew Ashton Expert
    Currently Being Moderated
    user2888313 wrote:
    If you have multiple rows with the same name, the query returns all the leaf nodes with that last node name.
    Oops! Of course it does, you're right.

    First, add a row and prove your point:
    INSERT INTO tab1 (id, name, parentid) VALUES (99,  'A', 2);
    
    WITH DATA AS (SELECT '/X/Y/Z/A' || '/' INPUT_PATH FROM DUAL)
    SELECT A.*
    FROM TAB1 A, DATA B
    where connect_by_isleaf = 1
    START WITH A.NAME = SUBSTR(
      b.INPUT_PATH,
      2,
      INSTR(b.INPUT_PATH, '/', 1, 2) - 2
    )
    CONNECT BY a.NAME = SUBSTR(
      B.INPUT_PATH,
      INSTR(B.INPUT_PATH, '/', 1, LEVEL) + 1,
      INSTR(B.INPUT_PATH, '/', 1, LEVEL+1) - INSTR(B.INPUT_PATH, '/', 1, LEVEL) -1
    );
    
    ID NAME      PARENTID
    -- --------- --------
     4 A                3 
    99 A                2
    Next, add a line at the end to correct the bug:
    WITH DATA AS (SELECT '/X/Y/Z/A' || '/' INPUT_PATH FROM DUAL)
    SELECT A.*, level
    FROM TAB1 A, DATA B
    where level = length(input_path) + 1 - length(replace(input_path, '/', ''))
    START WITH A.NAME = SUBSTR(
      b.INPUT_PATH,
      2,
      INSTR(b.INPUT_PATH, '/', 1, 2) - 2
    )
    CONNECT BY a.NAME = SUBSTR(
      B.INPUT_PATH,
      INSTR(B.INPUT_PATH, '/', 1, LEVEL) + 1,
      INSTR(B.INPUT_PATH, '/', 1, LEVEL+1) - INSTR(B.INPUT_PATH, '/', 1, LEVEL) -1
    )
    and parentid = prior id;
    
    ID NAME      PARENTID
    -- --------- --------
     4 A                3
    Regards, Stew

    P.S. I also changed the WHERE clause to make sure I got all the way to the end of the path. Before, I would have returned the 'Z' line if the 'A' line was missing.

    Edited by: Stew Ashton on Jan 4, 2013 10:57 AM

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points