Forum Stats

  • 3,733,253 Users
  • 2,246,738 Discussions
  • 7,856,635 Comments

Discussions

Generating Balanced Tree Using SQL Hierarchical Queries

Bilal
Bilal Member Posts: 480 Bronze Badge
edited May 2011 in SQL & PL/SQL
Hi All,

I have the following hierarchical data having different sub-tree levels:

A0
-A001
--A00101
A1
-A101
A2
-A201
--A20101
---A201010001

A0's sub-tree has 3 levels, A1's sub-tree has 2 levels, and A3's sub-tree has 4 level. I want to generate a balanced tree out of the given data having all sub-tree levels equal to the maximum number of levels available in the whole tree which in this particular case is 4.

I dont know it would be possible with SQL. The script to generate the above mentioned data is as below:

CREATE TABLE codes_tree
(node_id VARCHAR2(10),
parent_node_id VARCHAR2(10)
);
INSERT INTO codes_tree VALUES('A0',NULL);
INSERT INTO codes_tree VALUES('A001','A0');
INSERT INTO codes_tree VALUES('A00101','A001');
---
INSERT INTO codes_tree VALUES('A1',NULL);
INSERT INTO codes_tree VALUES('A101','A1');
---
INSERT INTO codes_tree VALUES('A2',NULL);
INSERT INTO codes_tree VALUES('A201','A2');
INSERT INTO codes_tree VALUES('A20101','A201');
INSERT INTO codes_tree VALUES('A201010001','A20101');

Any help will be highly appreciated.

Thanks ... Best Regards

Edited by: naive2Oracle on May 12, 2011 7:40 PM

Edited by: naive2Oracle on May 12, 2011 7:41 PM
User_JCPG9

Best Answer

  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 40,492 Red Diamond
    edited May 2011 Accepted Answer
    Hi,

    Sure, you can do that in SQL.
    One way is to take the normal hierarchical output, and manipulate the result set so that the leaves are repeated as often as necessary to make all the branches the same length. I only have Oracle 10.2 available right now, so here's a solution that will work in Oracle 10 (and up):
    WITH	original_hierarchy	AS
    (
    	SELECT	node_id
    	,	LEVEL			AS lvl
    	,	CONNECT_BY_ISLEAF	AS isleaf
    	,	ROWNUM			AS rnum
    	FROM	codes_tree
    	START WITH	parent_node_id	IS NULL
    	CONNECT BY	parent_node_id	= PRIOR node_id
    )
    ,	got_max_lvl	AS
    (
    	SELECT	o.*
    	,	MAX (lvl) OVER ()	AS max_lvl
    	FROM	original_hierarchy	o
    )
    SELECT	  LPAD ( ' '
    	       , 3 * ( ( d.lvl 
    		       + NVL (c.rnum, 1)
    		       - 1
    		       )
    		     - 1
    		     )
    	       ) || CASE
    			WHEN c.rnum > 1
    			THEN '*' || d.node_id || '*'
    			ELSE        d.node_id
    		    END		AS display_id
    FROM		  got_max_lvl	d
    LEFT OUTER JOIN	  got_max_lvl	c  ON	d.isleaf	= 1
    				   AND	c.rnum		<= 1 + d.max_lvl - d.lvl
    ORDER BY  d.rnum
    ,	  c.rnum
    ;
    Using Oracle 11.2, it might be better to generate original_hierarchy as above, but then to manipulate it using a recursive WITH clause.
    Analytic functions often interfere with CONNECT BY, so I used a separate sub-query to get max_lvl, to do the CONNECT BY in one sub-querry and the analytic function in a separate sub-query. I'm not sure that's necessary on all versions.

    Output from your sample data:
    DISPLAY_ID
    -------------------------------
    A0
       A001
          A00101
             *A00101*
    A1
       A101
          *A101*
             *A101*
    A2
       A201
          A20101
             A201010001
    Below is a generic version of the same query, which I used to test this on scott.emp:
    DEFINE	table_name	= scott.emp
    DEFINE	id_col		= empno
    DEFINE	parent_id_col	= mgr
    DEFINE	display_col	= ename
    
    WITH	original_hierarchy	AS
    (
    	SELECT	&display_col		AS display_txt
    	,	LEVEL			AS lvl
    	,	CONNECT_BY_ISLEAF	AS isleaf
    	,	ROWNUM			AS rnum
    	FROM	&table_name
    	START WITH	&parent_id_col	IS NULL
    	CONNECT BY	&parent_id_col	= PRIOR &id_col
    )
    ,	got_max_lvl	AS
    (
    	SELECT	o.*
    	,	MAX (lvl) OVER ()	AS max_lvl
    	FROM	original_hierarchy	o
    )
    SELECT	  LPAD ( ' '
    	       , 3 * ( ( d.lvl 
    		       + NVL (c.rnum, 1)
    		       - 1
    		       )
    		     - 1
    		     )
    	       ) || CASE
    			WHEN c.rnum > 1
    			THEN '*' || d.display_txt || '*'
    			ELSE        d.display_txt
    		    END		AS display_id
    FROM		  got_max_lvl	d
    LEFT OUTER JOIN	  got_max_lvl	c  ON	d.isleaf	= 1
    				   AND	c.rnum		<= 1 + d.max_lvl - d.lvl
    ORDER BY  d.rnum
    ,	  c.rnum
    ;
    Output:
    DISPLAY_ID
    -----------------------------
    KING
       JONES
          SCOTT
             ADAMS
          FORD
             SMITH
       BLAKE
          ALLEN
             *ALLEN*
          WARD
             *WARD*
          MARTIN
             *MARTIN*
          TURNER
             *TURNER*
          JAMES
             *JAMES*
       CLARK
          MILLER
             *MILLER*
    Edited by: Frank Kulash on May 13, 2011 6:38 AM
    Added generic version

Answers

  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 40,492 Red Diamond
    edited May 2011
    Hi,

    Thanks for posting the CREATE TABLE and INSERT statements; that's very helpful!

    Don't forget to post the results you want from that data, and an explanation of how you get those results from that data.
    Always say what version of Oracle you're using. This is especially important with CONNECT BY questions, because every version since Oracle 7 has had significant improvements in this area.

    In your sample data, there is no branching; that is, the maximum number of children any node has is 1. Will the data always be like that? If not, include some examples of branching in your sample data and results.
    If there is branching, what will you want if the leaves are at different levels in different branches, as in scott.emp?
    SELECT	LEVEL
    ,	LPAD (' ', 3 * LEVEL) || ename	AS indented_ename
    FROM	scott.emp
    START WITH	mgr	IS NULL
    CONNECT BY	mgr	= PRIOR empno
    ;
    
         LEVEL INDENTED_ENAME
    ---------- --------------------
             1    KING
             2       JONES
             3          SCOTT
             4             ADAMS
             3          FORD
             4             SMITH
             2       BLAKE
             3          ALLEN
             3          WARD
             3          MARTIN
             3          TURNER
             3          JAMES
             2       CLARK
             3          MILLER
    Edited by: Frank Kulash on May 12, 2011 11:12 PM
  • Bilal
    Bilal Member Posts: 480 Bronze Badge
    edited May 2011
    Hi Frank Kulash,

    Thanks for response and guidance.

    I am using Oracle database 11g R2. I am trying to explain what I want, see the manipulated result from your post.
        LEVEL INDENTED_ENAME
    ---------- --------------------
             1    KING
             2       JONES
             3          SCOTT
             4             ADAMS
             3          FORD
             4             SMITH
             2       BLAKE
             3          ALLEN
             4             *ALLEN*
             3          WARD
             4             *WARD*
             3          MARTIN
             4             *MARTIN*
             3          TURNER
             4             *TURNER*
             3          JAMES
             4             *JAMES*
             2       CLARK
             3          MILLER
             4             *MILLER*
    The maximum length of this tree is 4, because ADAMS and SMITH fall at level 4, whereas ALLEN, WARD, etc they fall at level 3. So the tree is not balanced, I want it to get balanced in such a way that all sub-trees having level less than four shall be replicated to next level with the same value until it reach max level. The bold rows in the result are created by me to get them balanced. When I say tree it is the full result, when I say sub-tree then it is a one branch like (KING -> JONES -> SCOTT -> ADAMS).

    I hope this might help.

    Thanks ... Allah Hafiz

    Edited by: naive2Oracle on May 12, 2011 10:06 PM

    Edited by: naive2Oracle on May 12, 2011 10:19 PM
  • Mahir M. Quluzade
    Mahir M. Quluzade Member Posts: 2,377
    try this please
    SELECT	LEVEL
    ,	LPAD ('--', 3 * (LEVEL-1)) || NODE_ID AS TREE
    FROM	codes_tree
    START WITH	parent_node_id	IS NULL
    CONNECT BY	parent_node_id = PRIOR node_id	
  • Bilal
    Bilal Member Posts: 480 Bronze Badge
    edited May 2011
    Hi Mahir,

    Thanks for reply.
    Your query return following result.
         LEVEL TREE
    ---------- --------------------------------------------------
             1 A0
             2  --A001
             3     --A00101
             1 A1
             2  --A101
             1 A2
             2  --A201
             3     --A20101
             4        --A201010001
    What I want is
         LEVEL TREE
    ---------- --------------------------------------------------
             1 A0
             2  --A001
             3     --A00101
             4        --A00101
             1 A1
             2  --A101
             3     --A101
             4        --A101
             1 A2
             2  --A201
             3     --A20101
             4        --A201010001
    See I have made it balanced as every subtree has reached level 4.

    Thanks ... Best Regards

    Edited by: naive2Oracle on May 12, 2011 10:28 PM
  • Mahir M. Quluzade
    Mahir M. Quluzade Member Posts: 2,377
    Hi naive2Oracle

    I understand you,
    you want insert into new row like
      INSERT INTO codes_tree VALUES('A00101','A00101');
    But this is no logical tree view.

    InsiAllah you can understand me.

    Mahir M. Quluzade
  • Bilal
    Bilal Member Posts: 480 Bronze Badge
    Hi Mahir,

    Thanks for reply.

    I don't want to insert records for those sub-trees where level is less than max level, I just want to achieve it using SQL query.

    Thanks ... Best Regards
  • Frank Kulash
    Frank Kulash Member, Moderator Posts: 40,492 Red Diamond
    edited May 2011 Accepted Answer
    Hi,

    Sure, you can do that in SQL.
    One way is to take the normal hierarchical output, and manipulate the result set so that the leaves are repeated as often as necessary to make all the branches the same length. I only have Oracle 10.2 available right now, so here's a solution that will work in Oracle 10 (and up):
    WITH	original_hierarchy	AS
    (
    	SELECT	node_id
    	,	LEVEL			AS lvl
    	,	CONNECT_BY_ISLEAF	AS isleaf
    	,	ROWNUM			AS rnum
    	FROM	codes_tree
    	START WITH	parent_node_id	IS NULL
    	CONNECT BY	parent_node_id	= PRIOR node_id
    )
    ,	got_max_lvl	AS
    (
    	SELECT	o.*
    	,	MAX (lvl) OVER ()	AS max_lvl
    	FROM	original_hierarchy	o
    )
    SELECT	  LPAD ( ' '
    	       , 3 * ( ( d.lvl 
    		       + NVL (c.rnum, 1)
    		       - 1
    		       )
    		     - 1
    		     )
    	       ) || CASE
    			WHEN c.rnum > 1
    			THEN '*' || d.node_id || '*'
    			ELSE        d.node_id
    		    END		AS display_id
    FROM		  got_max_lvl	d
    LEFT OUTER JOIN	  got_max_lvl	c  ON	d.isleaf	= 1
    				   AND	c.rnum		<= 1 + d.max_lvl - d.lvl
    ORDER BY  d.rnum
    ,	  c.rnum
    ;
    Using Oracle 11.2, it might be better to generate original_hierarchy as above, but then to manipulate it using a recursive WITH clause.
    Analytic functions often interfere with CONNECT BY, so I used a separate sub-query to get max_lvl, to do the CONNECT BY in one sub-querry and the analytic function in a separate sub-query. I'm not sure that's necessary on all versions.

    Output from your sample data:
    DISPLAY_ID
    -------------------------------
    A0
       A001
          A00101
             *A00101*
    A1
       A101
          *A101*
             *A101*
    A2
       A201
          A20101
             A201010001
    Below is a generic version of the same query, which I used to test this on scott.emp:
    DEFINE	table_name	= scott.emp
    DEFINE	id_col		= empno
    DEFINE	parent_id_col	= mgr
    DEFINE	display_col	= ename
    
    WITH	original_hierarchy	AS
    (
    	SELECT	&display_col		AS display_txt
    	,	LEVEL			AS lvl
    	,	CONNECT_BY_ISLEAF	AS isleaf
    	,	ROWNUM			AS rnum
    	FROM	&table_name
    	START WITH	&parent_id_col	IS NULL
    	CONNECT BY	&parent_id_col	= PRIOR &id_col
    )
    ,	got_max_lvl	AS
    (
    	SELECT	o.*
    	,	MAX (lvl) OVER ()	AS max_lvl
    	FROM	original_hierarchy	o
    )
    SELECT	  LPAD ( ' '
    	       , 3 * ( ( d.lvl 
    		       + NVL (c.rnum, 1)
    		       - 1
    		       )
    		     - 1
    		     )
    	       ) || CASE
    			WHEN c.rnum > 1
    			THEN '*' || d.display_txt || '*'
    			ELSE        d.display_txt
    		    END		AS display_id
    FROM		  got_max_lvl	d
    LEFT OUTER JOIN	  got_max_lvl	c  ON	d.isleaf	= 1
    				   AND	c.rnum		<= 1 + d.max_lvl - d.lvl
    ORDER BY  d.rnum
    ,	  c.rnum
    ;
    Output:
    DISPLAY_ID
    -----------------------------
    KING
       JONES
          SCOTT
             ADAMS
          FORD
             SMITH
       BLAKE
          ALLEN
             *ALLEN*
          WARD
             *WARD*
          MARTIN
             *MARTIN*
          TURNER
             *TURNER*
          JAMES
             *JAMES*
       CLARK
          MILLER
             *MILLER*
    Edited by: Frank Kulash on May 13, 2011 6:38 AM
    Added generic version
  • Aketi Jyuuzou
    Aketi Jyuuzou Member Posts: 1,072 Bronze Badge
    I like recursive with clause 8-)
    I insist that Using Left Join at recursive with clause is very useful.

    There is similar case which is using Left Join at recursive with clause.
    2205059
    col tree for a30
    
    with t(node_id,parent_node_id) as(
    select 'A0'        ,NULL     from dual union all
    select 'A001'      ,'A0'     from dual union all
    select 'A00101'    ,'A001'   from dual union all
    select 'A1'        ,NULL     from dual union all
    select 'A101'      ,'A1'     from dual union all
    select 'A2'        ,NULL     from dual union all
    select 'A201'      ,'A2'     from dual union all
    select 'A20101'    ,'A201'   from dual union all
    select 'A201010001','A20101' from dual),
    rec(node_id,LV) as(
    select node_id,1
      from t
     where parent_node_id is null
    union all
    select nvl(b.node_id,a.node_id),a.LV+1
      from rec a Left Join t b
        on a.node_id = b.parent_node_id
     where a.LV+1 <= 4)
    search depth first by node_id set rn
    CYCLE LV SET IsLoop TO 'Y' DEFAULT 'N'
    select LV,
    case when LV >= 2
         then LPad(' ',LV*3-5) || '--'
    end || node_id as tree
    from rec order by rn;
    
    LV  TREE
    --  -------------------
     1  A0
     2   --A001
     3      --A00101
     4         --A00101
     1  A1
     2   --A101
     3      --A101
     4         --A101
     1  A2
     2   --A201
     3      --A20101
     4         --A201010001
    My SQL articles of OTN-Japan :-)
    http://www.oracle.com/technetwork/jp/articles/otnj-sql-image3-1-323602-ja.html
    Aketi Jyuuzou
  • Aketi Jyuuzou
    Aketi Jyuuzou Member Posts: 1,072 Bronze Badge
    edited May 2011
    Nice one "Frank Kulash"

    Hehehe I have made recursive with clause version :-)
    with rec(empno,ename,LV) as(
    select empno,ename,1
      from scott.emp
     where mgr is null
    union all
    select b.empno,
    nvl(b.ename,'*' || a.ename || '*'),a.LV+1
      from rec a Left Join scott.emp b
        on a.empno = b.mgr
     where LV+1 <= 4)
    search depth first by empno set rn
    select LPad(' ',(LV-1)*2) || ename as DISPLAY_ID
     from rec order by rn;
    
    DISPLAY_ID
    ---------------
    KING
      JONES
        SCOTT
          ADAMS
        FORD
          SMITH
      BLAKE
        ALLEN
          *ALLEN*
        WARD
          *WARD*
        MARTIN
          *MARTIN*
        TURNER
          *TURNER*
        JAMES
          *JAMES*
      CLARK
        MILLER
          *MILLER*
This discussion has been closed.