For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!
Interested in getting your voice heard by members of the Developer Marketing team at Oracle? Check out this post for AppDev or this post for AI focus group information.
Hi,
Sorry, I didn't understand the problem before.
It seems that you want to order by effort_no, with every sample_id forming a different group: pk_id plays no role in this problem.
You would like to do:
, COUNT (DISTINCT species) OVER ( PARTITION BY sample_id ORDER BY effort_no , habitat_type -- ? ) AS cnt
but you can't. You'll get the error ORA-30487 if you try to use an analytic function with both DISTINCT and ORDER BY.
Instead of a scalar sub-query, like you posted, I would be inclined to do a self-join:
SELECT a.sample_id , a.habitat_type , a.effort_no , COUNT (DISTINCT b.species) AS cnt FROM roll_test a JOIN roll_test b ON a.sample_id = b.sample_id AND a.effort_no >= b.effort_no GROUP BY a.sample_id, a.effort_no, a.habitat_type ORDER BY a.sample_id, a.effort_no, a.habitat_type ;
This assumes that every row has a distinct combination of the four columns referenced above: sample_id, effort_no, hapitat_type and species. If not, it might be faster to GROUP by those columns first, in a sub-query, to make them unique.
A composite index on sample_id and effort_no would probably make this faster.
I'd be interested in hearing if the self-join above is significantly faster or slower than your solution.
create table aaaa(PK_ID,SAMPLE_ID,HABITAT_TYPE,EFFORT_NO,SPECIES) as select 1,1, 'Riffle', 1, 'A' from dual union select 2,1, 'Riffle', 1, 'B' from dual union select 3,1, 'Riffle', 1, 'C' from dual union select 4,1, 'Riffle', 2, 'B' from dual union select 5,1, 'Riffle', 2, 'D' from dual union select 6,1, 'Pool' , 3, 'C' from dual union select 7,1, 'Pool' , 3, 'D' from dual union select 8,1, 'Pool' , 3, 'E' from dual union select 9,1, 'Riffle', 4, 'F' from dual union select 10,2, 'Pool' , 1, 'A' from dual;
select SAMPLE_ID,HABITAT_TYPE,EFFORT_NO, sum(sum(willSum)) over(partition by SAMPLE_ID order by min(PK_ID)) as cnt from (select PK_ID,SAMPLE_ID,HABITAT_TYPE,EFFORT_NO,SPECIES, case Row_Number() over(partition by SAMPLE_ID,SPECIES order by PK_ID) when 1 then 1 else 0 end as willSum from aaaa) group by SAMPLE_ID,HABITAT_TYPE,EFFORT_NO order by SAMPLE_ID,min(PK_ID);
SAMPLE_ID HABITA EFFORT_NO CNT --------- ------ --------- --- 1 Riffle 1 3 1 Riffle 2 4 1 Pool 3 5 1 Riffle 4 6 2 Pool 1 1
similar threads 593241 569685
A bit late, but in the context of the similar threads mentioned and to complete the picture, here is how this could be done using the MODEL clause:
with roll_test as ( select 1 as pk_id, 1 as sample_id, 'Riffle' as habitat_type, 1 as effort_no, 'A' as species from dual union all select 2 as pk_id, 1 as sample_id, 'Riffle' as habitat_type, 1 as effort_no, 'B' as species from dual union all select 3 as pk_id, 1 as sample_id, 'Riffle' as habitat_type, 1 as effort_no, 'C' as species from dual union all select 4 as pk_id, 1 as sample_id, 'Riffle' as habitat_type, 2 as effort_no, 'B' as species from dual union all select 5 as pk_id, 1 as sample_id, 'Riffle' as habitat_type, 2 as effort_no, 'D' as species from dual union all select 6 as pk_id, 1 as sample_id, 'Pool' as habitat_type, 3 as effort_no, 'C' as species from dual union all select 7 as pk_id, 1 as sample_id, 'Pool' as habitat_type, 3 as effort_no, 'D' as species from dual union all select 8 as pk_id, 1 as sample_id, 'Pool' as habitat_type, 3 as effort_no, 'E' as species from dual union all select 9 as pk_id, 1 as sample_id, 'Riffle' as habitat_type, 4 as effort_no, 'F' as species from dual union all select 10 as pk_id, 2 as sample_id, 'Pool' as habitat_type, 1 as effort_no, 'A' as species from dual ) select SAMPLE_ID,HABITAT_TYPE,EFFORT_NO, cnt from roll_test model return updated rows dimension by ( rownum as rn, max(pk_id) over (partition by sample_id, habitat_type, effort_no) as pk_id, sample_id, habitat_type, effort_no) measures (species, 0 cnt) rules upsert all (cnt[0, any, any, any, any] = count(distinct species)[any, pk_id<=CV(), CV(), any, any]);
Regards, Randolf
Oracle related stuff: http://oracle-randolf.blogspot.com/
SQLTools++ for Oracle: http://www.sqltools-plusplus.org:7676/ http://sourceforge.net/projects/sqlt-pp/
Thanks guys for all your answers !
While playing around with the queries, I found a weird behaviour if I try to use grouping sets in the query. Is this the normal behaviour or I stumbled across an Oracle bug ?
The following 2 queries work as expected:
Query 1 - works as expected:
SELECT sample_id, SUM (SUM (willsum)) OVER (PARTITION BY sample_id ORDER BY MIN (pk_id) ) AS cnt FROM (SELECT pk_id, sample_id, habitat_type, effort_no, species, CASE ROW_NUMBER () OVER (PARTITION BY sample_id, species ORDER BY pk_id) WHEN 1 THEN 1 ELSE 0 END AS willsum FROM roll_test) GROUP BY sample_id ORDER BY sample_id, MIN (pk_id);
Query 2 - works as expected:
SELECT sample_id, habitat_type, effort_no, SUM (SUM (willsum)) OVER (PARTITION BY sample_id ORDER BY MIN (pk_id) ) AS cnt FROM (SELECT pk_id, sample_id, habitat_type, effort_no, species, CASE ROW_NUMBER () OVER (PARTITION BY sample_id, species ORDER BY pk_id) WHEN 1 THEN 1 ELSE 0 END AS willsum FROM roll_test) GROUP BY sample_id, habitat_type, effort_no ORDER BY sample_id, MIN (pk_id);
Query 3 - applpying grouping sets to combine the above 2 queries returns wrong results:
SELECT grouping( sample_id ), grouping( habitat_type ), sample_id, habitat_type, effort_no, SUM (SUM (willsum)) OVER (PARTITION BY sample_id ORDER BY MIN (pk_id) ) AS cnt FROM (SELECT pk_id, sample_id, habitat_type, effort_no, species, CASE ROW_NUMBER () OVER (PARTITION BY sample_id, species ORDER BY pk_id) WHEN 1 THEN 1 ELSE 0 END AS willsum FROM roll_test) GROUP BY grouping sets ( sample_id, ( sample_id, habitat_type, effort_no ) ) ORDER BY sample_id, MIN (pk_id);
Anybody can explain this behaviour of grouping sets ?
Thanks again !
Sorry, my bad, I found what was wrong with Query 3. Here is the correct query with grouping sets:
SELECT grouping( sample_id ), grouping( habitat_type ), sample_id, habitat_type, effort_no, SUM (SUM (willsum)) OVER (PARTITION BY grouping( sample_id ), grouping( habitat_type ), sample_id ORDER BY MIN (pk_id) ) AS cnt FROM (SELECT pk_id, sample_id, habitat_type, effort_no, species, CASE ROW_NUMBER () OVER (PARTITION BY sample_id, species ORDER BY pk_id) WHEN 1 THEN 1 ELSE 0 END AS willsum FROM roll_test) GROUP BY grouping sets ( sample_id, ( sample_id, habitat_type, effort_no ) ) ORDER BY sample_id, MIN (pk_id);
Thanks !