7 Replies Latest reply on Dec 20, 2009 9:06 AM by Aketi Jyuuzou

# Having trouble of thinking of a query to accomplish this

I'm trying to do a count of which GMCODE has the longest consecutive values equal to 'T' in the SETTING column and then list out all the GMCODES in descending order (from most to least). So here is some data for the sake of explaining what I want a little bit better...

Oracle : 10.2g

KEY
----------------------------------------
GMCODE - code of the game
SETTING - boolean value of true or false
EVNT_ID - id specific to an event that happened on during a specific sequence (SEQ)
SEQ - unique value which is never duplicated and which states the order in which everything occurs (kind of like a timestamp)
------------------------------------------
``````        GMCODE                   SETTING                 EVNT_ID                              SEQ
486609                      F                      1                            150
486609                      F                      4                            151
486609                      T                      1                             152
486609                      T                      2                             153
486609                      T                      4                             154
486609                      T                     10                            155
486609                      F                      1                             156
253048                      T                      6                             201
253048                      F                      4                              202
253048                      T                      2                             203
253048                      F                      2                             204
253048                      F                      7                             205
486778                      F                      4                             245
486778                      T                      4                             246
486778                      T                      1                              247
486778                      F                      4                             248``````
So what i would want in the output would be
``````GAMECODE      COUNT(SETTING)
486609                         4
486778                         2
253048                         1``````
So to explain in a little more detail, from SEQ 152-155 the SETTING is equal to 'T' (SEQ 156 has a SETTING of 'F so we end the count there), so in turn I get a cout of 4 for gamecode 486609

Edited by: user652714 on Dec 9, 2009 8:11 AM
• ###### 1. Re: Having trouble of thinking of a query to accomplish this
``````SQL>WITH t AS (
2          SELECT 486609 AS GMCODE, 'F' AS SETTING, 1 AS EVNT_ID, 150 AS SEQ FROM DUAL UNION ALL
3          SELECT 486609, 'F', 4, 151 FROM DUAL UNION ALL
4          SELECT 486609, 'T', 1, 152 FROM DUAL UNION ALL
5          SELECT 486609, 'T', 2, 153 FROM DUAL UNION ALL
6          SELECT 486609, 'T', 4, 154 FROM DUAL UNION ALL
7          SELECT 486609, 'T', 10, 155 FROM DUAL UNION ALL
8          SELECT 486609, 'F', 1, 156 FROM DUAL UNION ALL
9          SELECT 253048, 'T', 6, 201 FROM DUAL UNION ALL
10          SELECT 253048, 'F', 4, 202 FROM DUAL UNION ALL
11          SELECT 253048, 'T', 2, 203 FROM DUAL UNION ALL
12          SELECT 253048, 'F', 2, 204 FROM DUAL UNION ALL
13          SELECT 253048, 'F', 7, 205 FROM DUAL UNION ALL
14          SELECT 486778, 'F', 4, 245 FROM DUAL UNION ALL
15          SELECT 486778, 'T', 4, 246 FROM DUAL UNION ALL
16          SELECT 486778, 'T', 1, 247 FROM DUAL UNION ALL
17          SELECT 486778, 'F', 4, 248 FROM DUAL)
18  SELECT   gmcode, MAX(seq_length) AS longest_sequence
19      FROM (SELECT   gmcode, COUNT(*) AS seq_length
20                FROM (SELECT gmcode, setting, evnt_id, seq, SUM(new_group) OVER(ORDER BY seq) AS group_no
21                        FROM (SELECT   gmcode, setting, evnt_id, seq,
22                                       DECODE(LAG(setting) OVER(ORDER BY seq), setting, 0, 1) AS new_group
23                                  FROM t
24                              ORDER BY seq))
25               WHERE setting = 'T'
26            GROUP BY gmcode, group_no)
27  GROUP BY gmcode
28  ORDER BY 2 DESC, 1;

GMCODE LONGEST_SEQUENCE
---------- ----------------
486609                4
486778                2
253048                1``````
Urs
• ###### 2. Re: Having trouble of thinking of a query to accomplish this
Hi,

I think you'll need two sub-queries for this problem:
``````WITH     got_rnum     AS
(
SELECT     gmcode
,     setting
,     seq
,     ROW_NUMBER () OVER ( PARTITION BY  gmcode
ORDER BY          seq
)      AS rnum
,     CASE
WHEN  setting = 'F'
OR    LAG (setting) OVER ( PARTITION BY gmcode
ORDER BY       seq
) = 'T'
THEN  NULL
ELSE  1
END               AS grp_start
FROM     table_x
)
,     got_streak_length     AS
(
SELECT  gmcode
,     1 + rnum - LAST_VALUE ( grp_start * rnum IGNORE NULLS)
OVER ( PARTITION BY  gmcode
ORDER BY          seq
) AS streak_length
FROM    got_rnum
WHERE     setting     = 'T'
)
SELECT       gmcode
,       MAX (streak_length)     AS most_consecutive_ts
FROM       got_streak_length
GROUP BY  gmcode
ORDER BY  gmcode
;``````
We need to find the maximum streak_length.
Wheteher we use the aggregate or analytic MAX, that means individual streak_lengths have to be available in the main query. They're not in the table, so we'll have to compute them in a sub-query (got_streak_length).
How can we get the streak_length? By subtracting the current row number from the row number where the streak begaan. That assumes we already have the row number and the starting point of the streak available. They''re not in the table, so we'll have to compute them in a sub-query (got_rnum).

If you can count on seq being consecutive integers for each gmcode, then we might be able to do this with one sub-query.

Edited by: Frank Kulash on Dec 9, 2009 12:46 PM
See Boneist's solution (below) for how to do the one-subquery solution even if seq is not consecutive integers.
• ###### 3. Re: Having trouble of thinking of a query to accomplish this
This is amazing stuff Frank...however if i would like to add one additional filter to this, could this be done:

So saying i had one additional column(the rest of the data is the same) called CHANGED, how would i tweak the query so that I can do the same task ask I had intially stated, but with the additional filter of, where the count of 'T' in the CHANGED column occurs at least 2 times within the consecutive streak of 'T' values in the SETTING column (i realize that may have been confusing)... so modified data would look like:
``````        GMCODE                   SETTING                 EVNT_ID                              SEQ      CHANGED
486609                      F                      1                            150                     T
486609                      F                      4                            151                     T
486609                      T                      1                             152                    F
486609                      T                      2                             153                    T
486609                      T                      4                             154                     F
486609                      T                     10                            155                     T
486609                      F                      1                             156                    T
253048                      T                      6                             201                     T
253048                      F                      4                              202                    F
253048                      T                      2                             203                     T
253048                      F                      2                             204                     F
253048                      F                      7                             205                     F
486778                      F                      4                             245                     T
486778                      T                      4                             246                    F
486778                      T                      1                              247                    F
486778                      F                      4                             248                     F``````
So the result set would look like this, this time around:
``````GAMECODE      COUNT(SETTING)
486609                         4``````
So this time gmcode 486609 would be the ONLY output inthe result set because it was the only gmecode with a streak where the CHANGED column had at least two values equal to 'T' (where seq 153 and 155).

The reason the other two gmcodes are eliminated from the results set is because within there streaks of the SETTING column being set to 'T', the CHANGED column only had a value of 'T' once within those streaks. (253048 is obviously eliminated, 486778 is eliminated because within it's streak of two consecutive rows of 'T' for the SETTING column only one of the CHANGED columns values is equal to 'T'
• ###### 4. Re: Having trouble of thinking of a query to accomplish this
Hi,

Sure; that's just a matter of adding a HAVING clause to the main query.
Since that HAVING clause needs to test the changed column, changed has to be included in both sub-queries.
``````WITH     got_rnum     AS
(
SELECT     gmcode
,     setting
,     seq
,     changed
,     ROW_NUMBER () OVER ( PARTITION BY  gmcode
ORDER BY          seq
)      AS rnum
,     CASE
WHEN  setting = 'F'
OR    LAG (setting) OVER ( PARTITION BY gmcode
ORDER BY       seq
) = 'T'
THEN  NULL
ELSE  1
END               AS grp_start
FROM     table_x
)
,     got_streak_length     AS
(
SELECT  gmcode
,     changed
,     1 + rnum - LAST_VALUE ( grp_start * rnum IGNORE NULLS)
OVER ( PARTITION BY  gmcode
ORDER BY          seq
) AS streak_length
FROM    got_rnum
WHERE     setting     = 'T'
)
SELECT       gmcode
,       MAX (streak_length)     AS most_consecutive_ts
FROM       got_streak_length
GROUP BY  gmcode
HAVING       COUNT (CASE WHEN changed = 'T' THEN 1 END)     >= 2
ORDER BY  gmcode
;``````
• ###### 5. Re: Having trouble of thinking of a query to accomplish this
Using Aketi's "Sense of Tabibitosan" (do a search in the forums for tabibitosan if you're not sure what it is!):
``````with my_tab as (select 486609 gmcode, 'F' setting, 1 event_id, 150 seq from dual union all
select 486609 gmcode, 'F' setting, 4 event_id, 151 seq from dual union all
select 486609 gmcode, 'T' setting, 1 event_id, 152 seq from dual union all
select 486609 gmcode, 'T' setting, 2 event_id, 153 seq from dual union all
select 486609 gmcode, 'T' setting, 4 event_id, 154 seq from dual union all
select 486609 gmcode, 'T' setting, 10 event_id, 155 seq from dual union all
select 486609 gmcode, 'F' setting, 1 event_id, 156 seq from dual union all
select 253048 gmcode, 'T' setting, 6 event_id, 201 seq from dual union all
select 253048 gmcode, 'F' setting, 4 event_id, 202 seq from dual union all
select 253048 gmcode, 'T' setting, 2 event_id, 203 seq from dual union all
select 253048 gmcode, 'F' setting, 2 event_id, 204 seq from dual union all
select 253048 gmcode, 'F' setting, 7 event_id, 205 seq from dual union all
select 486778 gmcode, 'F' setting, 4 event_id, 245 seq from dual union all
select 486778 gmcode, 'T' setting, 4 event_id, 246 seq from dual union all
select 486778 gmcode, 'T' setting, 1 event_id, 247 seq from dual union all
select 486778 gmcode, 'F' setting, 4 event_id, 248 seq from dual), -- end of mimicking your data
tabibitosan as (select gmcode,
setting,
event_id,
seq,
dense_rank() over (order by seq)
- row_number() over (partition by gmcode, setting order by seq) grp
from   my_tab),
results as (select gmcode,
count(gmcode) total_num
from   tabibitosan
where  setting = 'T'
group by gmcode, grp)
select gmcode,
max(total_num) max_cons_count
from   results
group by gmcode;

GMCODE MAX_CONS_COUNT
---------- --------------
486778              2
486609              4
253048              1``````
• ###### 6. Re: Having trouble of thinking of a query to accomplish this
With extra requirement:
``````with my_tab as (select 486609 gmcode, 'F' setting, 1 event_id, 150 seq, 'T' changed from dual union all
select 486609 gmcode, 'F' setting, 4 event_id, 151 seq, 'T' changed from dual union all
select 486609 gmcode, 'T' setting, 1 event_id, 152 seq, 'F' changed from dual union all
select 486609 gmcode, 'T' setting, 2 event_id, 153 seq, 'T' changed from dual union all
select 486609 gmcode, 'T' setting, 4 event_id, 154 seq, 'F' changed from dual union all
select 486609 gmcode, 'T' setting, 10 event_id, 155 seq, 'T' changed from dual union all
select 486609 gmcode, 'F' setting, 1 event_id, 156 seq, 'T' changed from dual union all
select 253048 gmcode, 'T' setting, 6 event_id, 201 seq, 'T' changed from dual union all
select 253048 gmcode, 'F' setting, 4 event_id, 202 seq, 'F' changed from dual union all
select 253048 gmcode, 'T' setting, 2 event_id, 203 seq, 'T' changed from dual union all
select 253048 gmcode, 'F' setting, 2 event_id, 204 seq, 'F' changed from dual union all
select 253048 gmcode, 'F' setting, 7 event_id, 205 seq, 'F' changed from dual union all
select 486778 gmcode, 'F' setting, 4 event_id, 245 seq, 'T' changed from dual union all
select 486778 gmcode, 'T' setting, 4 event_id, 246 seq, 'F' changed from dual union all
select 486778 gmcode, 'T' setting, 1 event_id, 247 seq, 'F' changed from dual union all
select 486778 gmcode, 'F' setting, 4 event_id, 248 seq, 'F' changed from dual), -- end of mimicking your data
tabibitosan as (select gmcode,
setting,
event_id,
seq,
decode(changed, 'T', 1) changed,
dense_rank() over (order by seq)
- row_number() over (partition by gmcode, setting order by seq) grp
from   my_tab),
results as (select gmcode,
count(changed) over (partition by gmcode, grp) count_changed,
grp
from   tabibitosan
where  setting = 'T'),
final_res as (select gmcode,
count_changed,
count(gmcode) total_num
from   results
group by gmcode, count_changed, grp)
select gmcode,
max(total_num) max_cons_count
from   final_res
where  count_changed >= 2
group by gmcode;

GMCODE MAX_CONS_COUNT
---------- --------------
486609              4``````
• ###### 7. Re: Having trouble of thinking of a query to accomplish this
Hahaha I have used sense of Tabibitosan B-)
``````create table GMTable AS
SELECT 486609 AS GMCODE, 'F' AS SETTING, 150 AS SEQ FROM DUAL UNION ALL
SELECT 486609, 'F', 151 FROM DUAL UNION ALL
SELECT 486609, 'T', 152 FROM DUAL UNION ALL
SELECT 486609, 'T', 153 FROM DUAL UNION ALL
SELECT 486609, 'T', 154 FROM DUAL UNION ALL
SELECT 486609, 'T', 155 FROM DUAL UNION ALL
SELECT 486609, 'F', 156 FROM DUAL UNION ALL
SELECT 253048, 'T', 201 FROM DUAL UNION ALL
SELECT 253048, 'F', 202 FROM DUAL UNION ALL
SELECT 253048, 'T', 203 FROM DUAL UNION ALL
SELECT 253048, 'F', 204 FROM DUAL UNION ALL
SELECT 253048, 'F', 205 FROM DUAL UNION ALL
SELECT 486778, 'F', 245 FROM DUAL UNION ALL
SELECT 486778, 'T', 246 FROM DUAL UNION ALL
SELECT 486778, 'T', 247 FROM DUAL UNION ALL
SELECT 486778, 'F', 248 FROM DUAL;

select GMCODE,cnt
from (select GMCODE,count(*) as cnt,
Row_Number() over(partition by GMCODE order by count(*) desc) as rn
from (select GMCODE,SETTING,
Row_Number() over(partition by GMCODE,SETTING order by SEQ)
-Row_Number() over(partition by GMCODE         order by SEQ) as distance
from GMTable)
where SETTING = 'T'
group by GMCODE,distance)
where rn=1
order by cnt desc;

GMCODE        CNT
---------  ---------
486609          4
486778          2
253048          1``````