This discussion is archived
1 2 Previous Next 16 Replies Latest reply: Dec 1, 2012 9:47 PM by Ayham

# Sorting and Grouping -Two months in this query

Currently Being Moderated
Hi All,

many thanks for jeneesh

i am doing project for construction company, i face this problem in grouping points according to relation between these points the
Relation is from 1 to 100. If the point between this rang that mean there is relation between these points.

this question already solve but the results not correct when the table has more data.

SQL - sorting and grouping.
from jeneesh and many thanks for him.

This example for more clarifications

for example i have these points
``````id   location         percentage   comments
1     loc 1,2          20%                that mean point  1 and 2 close to each other by 20%
2     loc 1,3          40%              that mean point 1 and 3 close to each other byy 40%
3     Loc 8,6          25%               that mean point 8 and 6 close to each other by 25%
4     Loc  6,10        20%
5     LOC 11,10        10 %
6     LOC 15,14         0%``````
Also , we can see the relation between these points as follwoing
- points 1,2,3 in one group why becuase 1,2 has relation and 1,3 has relation that mean 1,3 also has hidden relation.
- Points 6,8,10,11 in second group there are relations between them .
- but no relation between 1 or 2 or 3 with any point of 6,8,9,10,11
- as well as no relation between 15, 14 that mean 14 in third group and 15 in fourth group.

whati need?

to group the points that has relation according to percentage value ascending

The most important part is to group the points. SO , the below query the gropuing is not correct.

I have the follwoing table with data
``````drop table temp_value;
create table temp_value(id number(10),location varchar2(20), percentage number(9));

insert into temp_value values  (1,'LOC 1,2',10);
insert into  temp_value values (2,'LOC 1,3',0);
insert into  temp_value values (3,'LOC 1,4',0);
insert into  temp_value values (4,'LOC 1,5',0);
insert into  temp_value values (5,'LOC 1,6',0);
insert into  temp_value values (6,'LOC 2,3',0);
insert into  temp_value  values(7,'LOC 2,4',0);
insert into  temp_value values (8,'LOC 2,5',30);
insert into  temp_value values (9,'LOC 2,6',0);
insert into  temp_value values (10,'LOC 3,4',0);
insert into  temp_value values (11,'LOC 3,5',0);
insert into  temp_value values (12,'LOC 4,5',40);
insert into  temp_value values (13,'LOC 4,6',0);
insert into  temp_value values (14,'LOC 6,7',40);
insert into  temp_value values (15,'LOC 7,2',0);
insert into  temp_value values (16,'LOC 8,2',60);
insert into  temp_value values (17,'LOC 8,3',0);
insert into  temp_value values (18,'LOC 3,1',0);
insert into  temp_value values (19,'LOC 9,6',30);
insert into  temp_value values (20,'LOC 11,2',0);
insert into  temp_value values (22,'LOC 12,3',10);
insert into  temp_value values (23,'LOC 19,3',0);
insert into  temp_value values (24,'LOC 17,3',0);
insert into  temp_value values (24,'LOC 20,3',0);``````
when i used this query , the results is not correct

`````` with t as
(select percentage,loc1,loc2,sum(case when percentage = 0 then 1
when loc1 in (l1,l2) then 0
when loc2 in (l1,l2) then 0
when l1 is null and l2 is null then 0
else 1
end) over(order by rn) sm
from (     select id,location,percentage,
regexp_substr(location,'\d+',1,1) LOC1,
regexp_substr(location,'\d+',1,2)  LOC2,
lag(regexp_substr(location,'\d+',1,1))
over(order by percentage desc) l1,
lag(regexp_substr(location,'\d+',1,2))
over(order by percentage desc) l2,
row_number() over(order by percentage desc) rn
from temp_value
order by percentage desc
)
)
select loc,min(sm)+1 grp
from(
select loc,rownum rn,sm
from(
select percentage,decode(rn,1,loc1,loc2) loc,sm
from t a,
(select 1 rn from dual union all
select 2 from dual ) b
order by percentage desc,decode(rn,1,loc1,loc2) asc
)
)
group by loc
order by min(sm),min(rn);``````
the results

SQL> /
``````LOC                         GRP
-------------------- ----------
2                             1
8                             1
6                             2
7                             2
4                             3
5                             3
9                             4
1                             5
12                            6
3                             6
11                           13

LOC                         GRP
-------------------- ----------
19                           14
17                           15
20                           22

14 rows selected.``````
SQL>

but the correct is
``````Location        group No
2                  1
8                  1
4                  1
5                  1
1                  1
6                  2
7                  2
9                  2
12                 3
3                  3
19                 4
17                 5
20                 6``````
many thanks in advance.

Edited by: Ayham on Nov 30, 2012 3:07 AM
• ###### 1. Re: Sorting and Grouping
Currently Being Moderated
Waiting help
• ###### 2. Re: Sorting and Grouping
Currently Being Moderated
Still need help.
• ###### 3. Re: Sorting and Grouping
Currently Being Moderated
First of all it would be much better to split the loc into two columns. I did this in my example (loc1,loc2)
My solution gives you the groups for the corrolated locations only.
The locations with no correlations at all, must be added accordintaly.
``````with connects as (
select distinct
loc1
,loc2
,dense_rank() over (order by connect_by_root(loc1)) grp
from
temp_value
percentage != 0
connect by nocycle
(prior loc2 = loc1
or
prior loc1 = loc2
or
prior loc1 = loc1
or
prior loc2 = loc2)
and
percentage != 0
)

select
loc
,dense_rank() over (order by grp) grp
from (
select
loc
,max(grp) keep (dense_rank first order by grp) grp
from (
select
loc1 loc
,grp
from connects
union
select
loc2
,grp
from connects
)
group by
loc
)
order by 2,1

LOC GRP
1 1
2 1
4 1
5 1
8 1
12 2
3 2
6 3
7 3
9 3 ``````
Edited by: chris227 on 30.11.2012 10:34
• ###### 4. Re: Sorting and Grouping
Currently Being Moderated
Thank you very much ... I will test it and back to you. Now I am out but really I am so happy to see your reply.
• ###### 5. Re: Sorting and Grouping
Currently Being Moderated
yes, many thanks but how can i sort it based on percentage Descending
like this
``````Loc  Grp
2      1
8      1
4      1
5      1
1      1
12    2
3      2
6      3
7      3
9      3``````
regards
• ###### 6. Re: Sorting and Grouping
Currently Being Moderated
Hi,
Ayham wrote:
yes, many thanks but how can i sort it based on percentage Descending
You can use the analytic COUNT function for that. Since you want to PARTITION BY grp, and grp is itself the results of an analytic function, then you have to compute grp in a sub-query, such as got_grp below:
``````with connects as (
select distinct
loc1
,loc2
,dense_rank() over (order by connect_by_root(loc1)) grp
from temp_value
percentage != 0
connect by nocycle
(prior loc2 = loc1
or
prior loc1 = loc2
or
prior loc1 = loc1
or
prior loc2 = loc2)
and
percentage != 0
)
,  got_grp     AS
(
select
loc
,dense_rank() over (order by grp) grp
from (
select
loc
,max(grp) keep (dense_rank first order by grp) grp
from (
select
loc1 loc
,grp
from connects
union
select
loc2
,grp
from connects
)
group by
loc
)
)
SELECT       loc
,       grp
FROM       got_grp
ORDER BY  COUNT (*) OVER (PARTITION BY grp)   DESC
,            grp
,       loc
;``````
• ###### 7. Re: Sorting and Grouping
Currently Being Moderated
Thanks,

i want the sorting for each group DESC not all groups to gather
when i used your query i get
``````SQL> with connects as (
2  select distinct
3   loc1
4  ,loc2
5  ,dense_rank() over (order by connect_by_root(loc1)) grp
6  from temp_value
8  percentage != 0
9  connect by nocycle
10  (prior loc2 = loc1
11  or
12  prior loc1 = loc2
13  or
14  prior loc1 = loc1
15  or
16  prior loc2 = loc2)
17  and
18  percentage != 0
19  )
20  ,  got_grp AS
21  (
22     select
23      loc
24      ,dense_rank() over (order by grp) grp
25      from (
26      select
27       loc
28       ,max(grp) keep (dense_rank first order by grp) grp
29       from (
30       select
31        loc1 loc
32        ,grp
33        from connects
34        union
35        select
36         loc2
37         ,grp
38         from connects
39         )
40         group by
41         loc
42        )
43  )
44  SELECT       loc
45  ,    grp
46  FROM         got_grp
47  ORDER BY  COUNT (*) OVER (PARTITION BY grp)   DESC
48  ,            grp
49  ,    loc
50  ;``````
The output is
``````LOC                         GRP
-------------------- ----------
1                             1
2                             1
4                             1
5                             1
8                             1
6                             3
7                             3
9                             3
12                            2
3                             2

10 rows selected.``````
but i want it like this
``````Loc  Grp
2      1
8      1
4      1
5      1
1      1
12    2
3      2
6      3
7      3
9      3``````
So , the sorting for each group Separate based on the percentage column.

many thanks

Edited by: Ayham on Nov 30, 2012 9:43 AM
• ###### 8. Re: Sorting and Grouping
Currently Being Moderated
Hi,
Ayham wrote:
... The output is
``````LOC                         GRP
-------------------- ----------
1                             1
2                             1
4                             1
5                             1
8                             1
6                             3
7                             3
9                             3
12                            2
3                             2

10 rows selected.``````
but i want it like this
``````Loc  Grp
2      1
8      1
4      1
5      1
1      1
12    2
3      2
6      3
7      3
9      3``````
Sorry, I don't understand.

Do you want the output sorted by the number of members in each group? If so, then why do you want grp=2, which has only 2 members, to come before grp=3, which has 3 members? Is it because you want them in <b>ASC</b>ending order? Then why do you want grp=1, which has 5 members, to come first, not last?
• ###### 9. Re: Sorting and Grouping
Currently Being Moderated
I want only output sorted by the number of members in each group. But for the sequence of groups for 1,2 and 3 which coming first no problem. That mean can be 3 Then 1. then 2 or 2 then 1 then 3.
Many thanks for you.
• ###### 10. Re: Sorting and Grouping
Currently Being Moderated
Ayham wrote:
yes, many thanks but how can i sort it based on percentage Descending
like this
Easy with the blueprint given, study and understand it ;-) Just add percentage to the projection.
``````with connects as (
select distinct
loc1
,loc2
,dense_rank() over (order by connect_by_root(loc1)) grp
,percentage per
from
temp_value
percentage != 0
connect by nocycle
(prior loc2 = loc1
or
prior loc1 = loc2
or
prior loc1 = loc1
or
prior loc2 = loc2)
and
percentage != 0
)

select
loc
,dense_rank() over (order by grp) grp
from (
select
loc
,max(grp) keep (dense_rank first order by grp) grp
,max(per) keep (dense_rank last order by per nulls first) per
from (
select
loc1 loc
,grp
,per
from connects
union
select
loc2
,grp
,per
from connects
)
group by
loc
)
order by 2,per desc,1

LOC     GRP
2     1
8     1
4     1
5     1
1     1
12     2
3     2
6     3
7     3
9     3``````
Edited by: chris227 on 30.11.2012 10:33
• ###### 11. Re: Sorting and Grouping - Two months in this query
Currently Being Moderated
Ok, i thinked one this again and i found out how to include the points with no percentage also.
In your example expected output you missed the 11 therefore we got 7 groups now.
The order of group 2 and 3 is ambiguous, because the highest percentage in these groups is the same.
``````with connects as (
select distinct
loc1
,loc2
,connect_by_root(loc1) grp
,percentage per
from
temp_value
--percentage != 0
connect by nocycle
(prior loc2 = loc1
or
prior loc1 = loc2
or
prior loc1 = loc1
or
prior loc2 = loc2)
and
percentage != 0
and
prior percentage != 0
)

select
loc
,dense_rank() over (order by decode(per,0,1,0), grp) grp
from (
select
loc
,max(grp) keep (dense_rank first order by per desc, grp) grp
,max(per) keep (dense_rank last order by per nulls first) per
from (
select
loc1 loc
,grp
,per
from connects
union
select
loc2
,grp
,per
from connects
)
group by
loc )
order by 2,per desc,1

LOC     GRP
2     1
8     1
4     1
5     1
1     1
12     2
3     2
6     3
7     3
9     3
11     4
17     5
19     6
20     7``````
Think we are done now;-)
Edited by: chris227 on 30.11.2012 16:46

Edited by: chris227 on 30.11.2012 17:12
order corrected

Edited by: chris227 on 30.11.2012 17:15
simplification, no need for rank in connects subquery

Edited by: chris227 on 30.11.2012 17:26
• ###### 12. Re: Sorting and Grouping - Two months in this query
Currently Being Moderated
really i am grateful for you.
Thank you very much . Yes this is what i need.
• ###### 13. Re: Sorting and Grouping - Two months in this query
Currently Being Moderated
I think it's worth pointing out how general the underlying problem is here: Given a set of nodal links, obtain the sets of distinct subnetworks formed thereby.

Chris's solution relies on the fact that any node in a given subnetwork can be paired with all others in the same by means of a tree-walk, and that the minimum uid of the paired nodes is the same across the subnetwork if you include self-pairs. Thus the problem can be solved by tree-walking starting from all nodes, then grouping by the root-nodes.

My implementation uses the same idea, but differs in the details; in particular I use recursive subquery factoring and break out the nodes at the start.

How it works

ln1: Normalise the data (!)
nod: Get the nodes
rsf: Anchor from all nodes; recursively join the links, skipping the one that led to the current node, and using CYCLE to handle other loops, and passing the root nodes through each iteration
Main: Group by root, take min (uid) and turn into a ranking as subnetwork identifier

I am ordering purely by node number, as not sure about the percentage stuff above.
``````WITH ln1 AS (
SELECT      Regexp_Substr (location, '\d+', 1, 1) lfr,
Regexp_Substr (location, '\d+', 1, 2) lto,
id,
percentage
FROM temp_value
), lnk AS (
SELECT lfr, lto, id, percentage
FROM ln1
UNION
SELECT lto, lfr, id, percentage
FROM ln1
), nod AS (
SELECT DISTINCT lfr n
FROM lnk
), rsf (n, id, n_root) AS (
SELECT n, 0 id, n n_root
FROM nod
UNION ALL
SELECT l.lto, l.id, r.n_root
FROM rsf r
JOIN lnk l
ON l.lfr = r.n
AND l.id != r.id
AND l.percentage > 0
) CYCLE n SET is_cycle TO '1' DEFAULT '0'
SELECT Dense_Rank () OVER (ORDER BY Min (n)) grp, n_root loc
FROM rsf
GROUP BY n_root
ORDER BY 1, 2
/
Solution

GRP  LOC
---------- ----
1    1
1    2
1    4
1    5
1    8
2    3
2   12
3    6
3    7
3    9
4   11
5   17
6   19
7   20

14 rows selected.``````
• ###### 14. Re: Sorting and Grouping - Two months in this query
Currently Being Moderated
Many thanks for you .

But now i have another problem.

when i execute the query for 20 rows that fine.
but when i execute it for 4000 row it take more time.

the Question is how can i make it fast?

many thanks for all.
1 2 Previous Next

#### Legend

• Correct Answers - 10 points