SQL & PL/SQL

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Count nested table column in SQL

JackKSep 22 2021 — edited Sep 22 2021

Hi All,
I am using Oracle 19c.

create or replace type t_numbers is table of number;

create table test_tab (
  id    number(22),
  nums  t_numbers
) nested table nums store as nt_tt_nums;

INSERT INTO test_tab VALUES (1, t_numbers(1,2,3,4));
INSERT INTO test_tab VALUES (2, t_numbers(1,2,3,4,5));
INSERT INTO test_tab VALUES (3, t_numbers(1,2,3,4,5,6));

How may I count elements in nums column?
Below statement does not work because of ORA-00904: "NUMS"."COUNT": invalid identifier

SELECT id, nums.count AS has_items
 FROM test_tab;

Best regards,
Jacek

Cleaning:

drop table test_tab;
drop type t_numbers;

This post has been answered by mathguy on Sep 22 2021

Jump to Answer

Hoek

Hi Wateenmooiedag,

it's cloudy over here btw ;)

Have you tried using LAG:

MHO%xe> select t
  2  ,      userid
  3  ,      case when trunc( mod( (t-lag(t) over ( partition by userid order by userid, t))*24*60, 60 ) ) >= 1
  4              then 1
  5              else 0
  6         end cnt
  7  from   clickstream;

T                       USERID        CNT
------------------- ---------- ----------
05-09-2009 10:00:00          2          0
05-09-2009 10:00:24          2          0
05-09-2009 10:01:23          2          0
05-09-2009 10:02:40          2          1
05-09-2009 00:58:24          7          0
05-09-2009 02:30:33          7          1

6 rijen zijn geselecteerd.

( It reminded me of this link:http://asktom.oracle.com/tkyte/Misc/DateDiff.html )

486393

It is not exactly what I had in mind.

When I add two new rows:

insert into clickstream values (to_date('5-9-2009 10:02:50','dd-mm-yyyy hh24:mi:ss') , 2);
insert into clickstream values (to_date('5-9-2009 10:05:40','dd-mm-yyyy hh24:mi:ss') , 2);

It should return:

T                       USERID        CNT
------------------- ---------- ----------
05-09-2009 10:00:00          2          0
05-09-2009 10:00:24          2          0
05-09-2009 10:01:23          2          0
05-09-2009 10:02:40          2          1
05-09-2009 10:02:50          2          1
05-09-2009 10:05:40          2          2
05-09-2009 00:58:24          7          0
05-09-2009 02:30:33          7          1

Because user 2 started a new session on 10:02:40 and another one on 10:05:40 .

By the way it's cloudy here too but that should not suprise you! :) (wateenmooiedag means what a beautiful day in Dutch)

Edited by: wateenmooiedag on Sep 6, 2009 3:33 AM

Hoek

Aha, OK, that calls for some adjustment , indeed.

One little thing:
Shouldn't

insert into clickstream values (to_date('5-9-2009 10:02:40','dd-mm-yyyy hh24:mi:ss') , 2);

(from your extra inserts)
be

insert into clickstream values (to_date('5-9-2009 10:02:50','dd-mm-yyyy hh24:mi:ss') , 2);

when looking at your desired output?

And what exactly makes the requirement for CNT = 2 on 10:05:40?
And how does 10:02:50 being 1 relate to that (since it differs only 10 secs from the previous t)?
Because I interpreted the requirement like:
"For at least a difference of 60 seconds (one minute) between the current and previous t we will set CNT to 1."
But that apparently isn't enough.

486393

It should indeed be: insert into clickstream values (to_date('5-9-2009 10:02:_50_'...

Every session should have a unique number (SESSIONID) assigned (unique per userid).

I should have renamed column CNT to SESSIONID.

Hoek

If the number has to be unique but it has no further meaning or purpose then maybe using row_number() is OK here:

MHO%xe> select t
  2  ,      userid
  3  ,      nvl(last_value(nullif(sessionid, 0) ignore nulls) over ( partition by userid order by userid, t),0) sessionid
  4  from ( select t
  5         ,      userid 
  6  --       ,      trunc( mod( (t-lag(t) over ( partition by userid order by userid, t))*24*60, 60 ) ) minutes
  7         ,      case when trunc( mod( (t-lag(t) over ( partition by userid order by userid, t))*24*60, 60 )) >= 1
  8                     then row_number() over ( partition by userid order by userid, t)
  9                     else 0
 10                end sessionid
 11         from   clickstream
 12       );

T                       USERID  SESSIONID
------------------- ---------- ----------
05-09-2009 10:00:00          2          0
05-09-2009 10:00:24          2          0
05-09-2009 10:01:23          2          0
05-09-2009 10:02:40          2          4
05-09-2009 10:02:50          2          4
05-09-2009 10:05:40          2          6
05-09-2009 00:58:24          7          0
05-09-2009 02:30:33          7          2

8 rijen zijn geselecteerd.

I've made a few assumptions here regarding setting sessionid to 4 on 10:02:50:
Once a session is identified, then keep that session id for the next records, regardless of the difference in minutes/seconds until the difference is at least a minute again.

1 person found this helpful

486393

That looks very good!! I added a dense_rank() to improve the sessionid:

select t
,      userid
,      -1 + (dense_rank() over (partition by userid order by sessionid)) sessionid
from
(
select t
,      userid
,      nvl(last_value(nullif(sessionid, 0) ignore nulls) over 
       ( partition by userid order by userid, t),0) sessionid
from ( select t
     ,      userid 
     ,      case when trunc( mod( (t-lag(t) over ( partition by userid order by userid, t))*24*60, 60 )) >= 1
                    then row_number() over ( partition by userid order by userid, t)
                    else 0
            end sessionid
     from   clickstream
)
)
/

You can read here more: http://www.asterdata.com/resources/downloads/whitepapers/sqlmr.pdf , chapter 1.2 .

Edited by: wateenmooiedag on Sep 6, 2009 5:27 AM

Hoek

Yes ofcourse, dense_rank() tops it off nicely, didn't think of that one.

Interesting link by the way, thanks Tuinstoel.
It states SQL needs 'an expensive self-join' but by using analytics we don't need a self-join at all.

MHO%xe> select t
  2  ,      userid
  3  ,      -1 + (dense_rank() over (partition by userid order by sessionid)) sessionid
  4  from ( select t
  5         ,      userid
  6         ,      nvl(last_value(nullif(sessionid, 0) ignore nulls) over 
  7                ( partition by userid order by userid, t),0) sessionid
  8         from ( select t
  9                ,      userid 
 10                ,      case when trunc( mod( (t-lag(t) over ( partition by userid order by userid, t))*24*60, 60 )) >= 1
 11                               then dense_rank() over ( partition by userid order by userid, t)
 12                               else 0
 13                       end sessionid
 14                from   clickstream
 15              )
 16       );
Verstreken: 00:00:00.25

Uitvoeringspan
----------------------------------------------------------
Plan hash value: 1413192182

--------------------------------------------------------------------------------------
| Id  | Operation              | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |             |     8 |   280 |     5  (40)| 00:00:01 |
|   1 |  WINDOW SORT           |             |     8 |   280 |     5  (40)| 00:00:01 |
|   2 |   VIEW                 |             |     8 |   280 |     4  (25)| 00:00:01 |
|   3 |    WINDOW BUFFER       |             |     8 |   280 |     4  (25)| 00:00:01 |
|   4 |     VIEW               |             |     8 |   280 |     4  (25)| 00:00:01 |
|   5 |      WINDOW SORT       |             |     8 |   176 |     4  (25)| 00:00:01 |
|   6 |       TABLE ACCESS FULL| CLICKSTREAM |     8 |   176 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement

It would be very interesting to see how this query would look like when using the MODEL-clause or some XML function, by the way...
I think this requirement can be met using less code that way.

hopes a few other well-known users will drop in here ;)

MichaelS

Answer

Looks to me you can do a bit simpler:

SQL>  select userid, t, nvl (sum (s) over (partition by userid order by t), 0) sessionid
from (select userid, 
             t, 
             case when t - lag(t) over (partition by userid order by t) > 1 / (24 * 60) then 1 end s
      from clickstream
      order by userid, t)
/
    USERID T                      SESSIONID
---------- --------------------- ----------
         2 05.09.2009 10:00:00            0
         2 05.09.2009 10:00:24            0
         2 05.09.2009 10:01:23            0
         2 05.09.2009 10:02:40            1
         2 05.09.2009 10:02:50            1
         2 05.09.2009 10:05:40            2
         7 05.09.2009 00:58:24            0
         7 05.09.2009 02:30:33            1

8 rows selected.

Marked as Answer by 486393 · Sep 27 2020

Hoek

Nice and very elegant, Michael!
It looks so easy, but it just never came to my mind...
memorizing the '> 1 / (24 * 60)' - part

Aketi Jyuuzou

I like using interVal ;-)

create table streamT(USERID,TimeVal) as
select 2,to_date('2009-09-05 10:00:00','YYYY/MM/DD HH24:MI:SS') from dual union
select 2,to_date('2009-09-05 10:00:24','YYYY/MM/DD HH24:MI:SS') from dual union
select 2,to_date('2009-09-05 10:01:23','YYYY/MM/DD HH24:MI:SS') from dual union
select 2,to_date('2009-09-05 10:02:40','YYYY/MM/DD HH24:MI:SS') from dual union
select 2,to_date('2009-09-05 10:02:50','YYYY/MM/DD HH24:MI:SS') from dual union
select 2,to_date('2009-09-05 10:05:40','YYYY/MM/DD HH24:MI:SS') from dual union
select 7,to_date('2009-09-05 00:58:24','YYYY/MM/DD HH24:MI:SS') from dual union
select 7,to_date('2009-09-05 02:30:33','YYYY/MM/DD HH24:MI:SS') from dual;

select USERID,to_char(TimeVal,'yyyy-mm-dd hh24:mi:ss') as TimeVal,
count(willCnt) over(partition by USERID order by TimeVal) as SESSIONID
from (select USERID,TimeVal,
      case when TimeVal-interVal '1' minute >=
                Lag(TimeVal) over(partition by USERID order by TimeVal)
           then 1 end as willCnt
        from streamT)
order by USERID,TimeVal;

 USERID  TimeVal              SESSIONID
-------  -------------------  ---------
      2  2009-09-05 10:00:00          0
      2  2009-09-05 10:00:24          0
      2  2009-09-05 10:01:23          0
      2  2009-09-05 10:02:40          1
      2  2009-09-05 10:02:50          1
      2  2009-09-05 10:05:40          2
      7  2009-09-05 00:58:24          0
      7  2009-09-05 02:30:33          1

1 person found this helpful

1 - 10

Added on Sep 22 2021

26 comments

3,408 views

3 mentions

SQL & PL/SQL

Count nested table column in SQL

Comments

Post Details