Forum Stats

  • 3,750,345 Users
  • 2,250,159 Discussions
  • 7,866,944 Comments

Discussions

finding the min date of different dataset and pivoting

Devx
Devx Member Posts: 130 Blue Ribbon
edited Mar 7, 2020 10:01PM in SQL & PL/SQL

Hi everyone,

i have a scenario where i want to find the min dates from multiple tables that contain different date field.  let me explain further. please consider the following data

  

 

with summ as

(

SELECT 8521 AS cid, 'partial' as status,    to_date('2/16/2015 8:54:00 AM', 'mm/dd/yyyy hh:mi:ss am') as ev_dt,  123 as v_id from dual union all

SELECT 1428 AS cid, 'partial' as status,to_date('2/15/2015 5:28:00 PM', 'mm/dd/yyyy hh:mi:ss am') as ev_dt,      123 as v_id from dual union all

SELECT 7364 AS cid, 'complete' as status ,to_date('2/16/2015 2:15:00 AM', 'mm/dd/yyyy hh:mi:ss am') as ev_dt,    123 as v_id from dual union all

SELECT 8965 AS cid, 'complete' as status ,to_date('2/15/2015 8:03:00 PM', 'mm/dd/yyyy hh:mi:ss am') as ev_dt,    123 as v_id from dual union all

SELECT 6361 AS cid, 'complete' as status ,to_date('2/15/2015 8:17:00 PM', 'mm/dd/yyyy hh:mi:ss am') as ev_dt,    123 as v_id from dual union all

SELECT 6935 AS cid, 'partial' as status , to_date('2/16/2015 1:57:00 AM', 'mm/dd/yyyy hh:mi:ss am') as ev_dt,    123 as v_id from dual

)

,orders as

(

SELECT 8521 AS cid,  'attending' as data_val, to_date('2/16/2015 8:17:00 AM', 'mm/dd/yyyy hh:mi:ss am') as or_dt,  123 as v_id from dual union all

SELECT 8521 AS cid,  'phone' as data_val,      to_date('2/16/2015 8:17:00 AM', 'mm/dd/yyyy hh:mi:ss am') as or_dt,  123 as v_id from dual union all

SELECT 8521 AS cid,  'md' as data_val,         to_date('2/16/2015 8:17:00 AM', 'mm/dd/yyyy hh:mi:ss am') as or_dt,  123 as v_id from dual union all

SELECT 1428 AS cid, 'attending' as data_val, to_date('2/15/2015 4:59:00 PM', 'mm/dd/yyyy hh:mi:ss am') as or_dt,      123 as v_id from dual union all

SELECT 1428 AS cid, 'phone' as data_val,    to_date('2/15/2015 4:59:00 PM', 'mm/dd/yyyy hh:mi:ss am') as or_dt,      123 as v_id from dual union all

SELECT 1428 AS cid, 'md' as data_val,       to_date('2/15/2015 4:59:00 PM', 'mm/dd/yyyy hh:mi:ss am') as or_dt,      123 as v_id from dual union all

SELECT 8965 AS cid, 'attending' as data_val,  to_date('2/15/2015 8:01:00 PM', 'mm/dd/yyyy hh:mi:ss am') as or_dt,    123 as v_id from dual union all

SELECT 8965 AS cid,   'phone' as data_val, to_date('2/15/2015 8:01:00 PM', 'mm/dd/yyyy hh:mi:ss am') as or_dt,    123 as v_id from dual union all

SELECT 8965 AS cid, 'md' as data_val,  to_date('2/15/2015 8:01:00 PM', 'mm/dd/yyyy hh:mi:ss am') as or_dt,    123 as v_id from dual union all

SELECT 6935 AS cid, 'attending' as data_val,  to_date('2/16/2015 1:57:00 AM', 'mm/dd/yyyy hh:mi:ss am') as or_dt,    123 as v_id from dual union all

SELECT 6935 AS cid,  'phone' as data_val,  to_date('2/16/2015 1:57:00 AM', 'mm/dd/yyyy hh:mi:ss am') as or_dt,    123 as v_id from dual union all

SELECT 6935 AS cid,  'md' as data_val, to_date('2/16/2015 1:57:00 AM', 'mm/dd/yyyy hh:mi:ss am') as or_dt,    123 as v_id from dual

)

,detail as

(

  SELECT 8521 AS cid,  'date_Read' as data_val, to_date('3/17/2015 8:17:00 AM', 'mm/dd/yyyy hh:mi:ss am') as date_Read,  123 as v_id from dual union all

SELECT 1428 AS cid, 'date_Read' as data_val, to_date('2/24/2015 4:59:00 PM', 'mm/dd/yyyy hh:mi:ss am') as date_Read,      123 as v_id from dual union all

SELECT 8965 AS cid, 'date_Read' as data_val,  to_date('3/31/2015 8:01:00 PM', 'mm/dd/yyyy hh:mi:ss am') as date_Read,    123 as v_id from dual union all

SELECT 6935 AS cid,  'date_Read' as data_val, to_date('2/25/2015 1:57:00 AM', 'mm/dd/yyyy hh:mi:ss am') as date_Read,    123 as v_id from dual

)

i want to join all 3 tables (summ, orders,detail) by cid, v_id and i want to find the following.

from summ table i want to find the row with min ev_dt where status is complete.  from orders table i want to find the row with min or_dt

and from detail table i want to find out the row with min date_Read.  my output should look like this

vi_id        first_complete                  first_ordered                       first_read

123          2/15/2015 8:03:00 PM    2/15/2015 4:59:00 PM         2/24/2015 4:59:00 PM

my query looks like this but it is not completed

select * from (

SELECT  row_number () over (partition by v_id , status order by ev_dt) as first_completed,

                row_number () over (partition by v_id order by ord_dt) as first_ordered,

                row_number () over (partition by v_id order by date_Read) as first_read

  FROM summ a

   JOIN orders b

     on (a.v_id = b._vid)

  JOIN detail c

    on (a.v_id = c.v_id)

) where first_completed = 1 or (first_ordered = 1 and status='complete')

      or first_read = 1

can someone help me modify my query so that i can get the output describe above? im using oracle 11g

thanks in advance

Tagged:

Answers

  • mathguy
    mathguy Member Posts: 9,967 Gold Crown
    edited Mar 7, 2020 8:43PM

    First you must make up your mind. You said this:

    i want to join all 3 tables (summ, orders,detail) by cid, v_id and i want to find the following.

    but then everything else in your post joins by v_id only. So, which is it?  Below I will assume that you are only joining by V_ID. If you need to join on both columns, it's an easy modification to the query below.

    Then: You are going about it the wrong way from the start. You should aggregate each table separately first, and then join the results (which will be much smaller than the original tables). Don't join first - there is a lot of data in the base tables that is not needed in the end result, so don't join unneeded rows from one table to rows from the other two. And, standard aggregation suffices - you don't need analytic functions for this task.

    Here it is:

    select v_id, s.first_complete, o.first_ordered, d.first_readfrom   (select v_id, min(ev_dt) as first_complete from summ where status = 'complete' group by v_id) s       full outer join       (select v_id, min(or_dt) as first_ordered  from orders group by v_id) o using (v_id)       full outer join       (select v_id, min(date_read) as first_read from detail group by v_id) d using (v_id);      V_ID  FIRST_COMPLETE        FIRST_ORDERED         FIRST_READ       ----------  --------------------  --------------------  --------------------       123  2/15/2015 8:03:00 PM  2/15/2015 4:59:00 PM  2/24/2015 4:59:00 PM
  • alvinder
    alvinder Member Posts: 389 Silver Badge
    edited Mar 7, 2020 7:41PM

    Something along the lines.

    You mentioned that tables are joined on cid and v_id.

    But if i join them on cid and v_id then the dates you mention are for cid 1428 which is partial.

    Please advise if the data is correct.

    So i have removed the join for cid.

    Alvinder

    with summ as(SELECT 8521 AS cid, 'partial' as status,    to_date('2/16/2015 8:54:00 AM', 'mm/dd/yyyy hh:mi:ss am') as ev_dt,  123 as v_id from dual union allSELECT 1428 AS cid, 'partial' as status,to_date('2/15/2015 5:28:00 PM', 'mm/dd/yyyy hh:mi:ss am') as ev_dt,      123 as v_id from dual union allSELECT 7364 AS cid, 'complete' as status ,to_date('2/16/2015 2:15:00 AM', 'mm/dd/yyyy hh:mi:ss am') as ev_dt,    123 as v_id from dual union allSELECT 8965 AS cid, 'complete' as status ,to_date('2/15/2015 8:03:00 PM', 'mm/dd/yyyy hh:mi:ss am') as ev_dt,    123 as v_id from dual union allSELECT 6361 AS cid, 'complete' as status ,to_date('2/15/2015 8:17:00 PM', 'mm/dd/yyyy hh:mi:ss am') as ev_dt,    123 as v_id from dual union allSELECT 6935 AS cid, 'partial' as status , to_date('2/16/2015 1:57:00 AM', 'mm/dd/yyyy hh:mi:ss am') as ev_dt,    123 as v_id from dual),orders as(SELECT 8521 AS cid,  'attending' as data_val, to_date('2/16/2015 8:17:00 AM', 'mm/dd/yyyy hh:mi:ss am') as or_dt,  123 as v_id from dual union allSELECT 8521 AS cid,  'phone' as data_val,      to_date('2/16/2015 8:17:00 AM', 'mm/dd/yyyy hh:mi:ss am') as or_dt,  123 as v_id from dual union allSELECT 8521 AS cid,  'md' as data_val,         to_date('2/16/2015 8:17:00 AM', 'mm/dd/yyyy hh:mi:ss am') as or_dt,  123 as v_id from dual union allSELECT 1428 AS cid, 'attending' as data_val, to_date('2/15/2015 4:59:00 PM', 'mm/dd/yyyy hh:mi:ss am') as or_dt,      123 as v_id from dual union allSELECT 1428 AS cid, 'phone' as data_val,    to_date('2/15/2015 4:59:00 PM', 'mm/dd/yyyy hh:mi:ss am') as or_dt,      123 as v_id from dual union allSELECT 1428 AS cid, 'md' as data_val,       to_date('2/15/2015 4:59:00 PM', 'mm/dd/yyyy hh:mi:ss am') as or_dt,      123 as v_id from dual union allSELECT 8965 AS cid, 'attending' as data_val,  to_date('2/15/2015 8:01:00 PM', 'mm/dd/yyyy hh:mi:ss am') as or_dt,    123 as v_id from dual union allSELECT 8965 AS cid,   'phone' as data_val, to_date('2/15/2015 8:01:00 PM', 'mm/dd/yyyy hh:mi:ss am') as or_dt,    123 as v_id from dual union allSELECT 8965 AS cid, 'md' as data_val,  to_date('2/15/2015 8:01:00 PM', 'mm/dd/yyyy hh:mi:ss am') as or_dt,    123 as v_id from dual union allSELECT 6935 AS cid, 'attending' as data_val,  to_date('2/16/2015 1:57:00 AM', 'mm/dd/yyyy hh:mi:ss am') as or_dt,    123 as v_id from dual union allSELECT 6935 AS cid,  'phone' as data_val,  to_date('2/16/2015 1:57:00 AM', 'mm/dd/yyyy hh:mi:ss am') as or_dt,    123 as v_id from dual union allSELECT 6935 AS cid,  'md' as data_val, to_date('2/16/2015 1:57:00 AM', 'mm/dd/yyyy hh:mi:ss am') as or_dt,    123 as v_id from dual),detail as( SELECT 8521 AS cid,  'date_Read' as data_val, to_date('3/17/2015 8:17:00 AM', 'mm/dd/yyyy hh:mi:ss am') as date_Read,  123 as v_id from dual union allSELECT 1428 AS cid, 'date_Read' as data_val, to_date('2/24/2015 4:59:00 PM', 'mm/dd/yyyy hh:mi:ss am') as date_Read,      123 as v_id from dual union allSELECT 8965 AS cid, 'date_Read' as data_val,  to_date('3/31/2015 8:01:00 PM', 'mm/dd/yyyy hh:mi:ss am') as date_Read,    123 as v_id from dual union allSELECT 6935 AS cid,  'date_Read' as data_val, to_date('2/25/2015 1:57:00 AM', 'mm/dd/yyyy hh:mi:ss am') as date_Read,    123 as v_id from dual)SELECT distinct     a.v_id,    MIN(ev_dt) first_Complete_dt, --   a.cid,    FIRST_VALUE(or_dt) OVER(PARTITION BY a.v_id        ORDER BY            or_dt    ) first_ordered_dt,    FIRST_VALUE(date_read) OVER(PARTITION BY a.v_id        ORDER BY            or_dt    ) First_Date_Rd    --MIN(date_read),FROM    summ    a,    orders  b,    detail  cWHERE        a.v_id = b.v_id AND    a.v_id = c.v_id --    AND    a.cid = b.cid--and a.cid= c.cidand a.status='complete'group by a.v_id --, a.cid , or_dt,date_read;
  • Devx
    Devx Member Posts: 130 Blue Ribbon
    edited Mar 7, 2020 8:12PM

    thank you for your replies.  my apologies for the confusion. yes, the join is on cid and v_id.  in my query i forgot to add cid on the join.  in your query, should i just add the join with cid or it will required further modification.  thanks

  • mathguy
    mathguy Member Posts: 9,967 Gold Crown
    edited Mar 7, 2020 8:45PM

    You didn't just forget to add cid to the joins - you didn't include it in the "desired output" either. At the very least, the output should have a column for CID, if you group and join on it, right?

    This is where the FULL OUTER joins in my reply become even more important (depending, though, on your complete and detailed problem statement). In some cases, for some CID, you only have data in one or two of the three tables - not in all three of them. If you need to show all available data for ALL pairs (CID, V_ID) - even when data is missing from one of the tables - then you need FULL OUTER JOIN as I have written it. (And that's also when the USING clause is more convenient than an ON clause for the join.) On the other hand, if you only need the output for (CID, V_ID) that have at least one row (with 'complete' status in the first table), then you will need to change the joins to INNER JOIN.

    Here is the modified query and the full output with the requested change. Note that you need to add CID to the earlier query in several places: the SELECT and GROUP BY clauses of every subquery, to the USING clause of the joins, and also to the outer SELECT (main query).

    select cid, v_id, s.first_complete, o.first_ordered, d.first_readfrom   (select cid, v_id, min(ev_dt) as first_complete from summ where status = 'complete' group by cid, v_id) s       full outer join       (select cid, v_id, min(or_dt) as first_ordered from orders group by cid, v_id) o using (cid, v_id)       full outer join       (select cid, v_id, min(date_read) as first_read from detail group by cid, v_id) d using (cid, v_id);       CID        V_ID  FIRST_COMPLETE        FIRST_ORDERED         FIRST_READ           ----------  ----------  --------------------  --------------------  -----------------------      8521         123                        2/16/2015 8:17:00 AM  3/17/2015 8:17:00 AM         1428         123                        2/15/2015 4:59:00 PM  2/24/2015 4:59:00 PM         8965         123  2/15/2015 8:03:00 PM  2/15/2015 8:01:00 PM  3/31/2015 8:01:00 PM         6935         123                        2/16/2015 1:57:00 AM  2/25/2015 1:57:00 AM         6361         123  2/15/2015 8:17:00 PM                                                         7364         123  2/16/2015 2:15:00 AM   

    If you use INNER JOIN instead of FULL OUTER JOIN everywhere, you will only get the row for CID = 8965.

  • Devx
    Devx Member Posts: 130 Blue Ribbon
    edited Mar 7, 2020 8:59PM

    thanks again. actually, i dont want to display the cid in the output because i will get multiple rows and i am looking for one row only which has all 3 dates.  that is the reason i was using analytics function instead of min aggregation.  i was thinking of joining all the tables together with cid and visitid and then do analytics function to get one row with the min dates from each table and acquire the output i mentioned above

  • alvinder
    alvinder Member Posts: 389 Silver Badge
    edited Mar 7, 2020 9:26PM

    The issue is if you join on CID as well.

    then the output you are expecting is for CID=1428 and the status is "Partial" for that. Not "Complete. Maybe i am missing something.

    As @mathguy mentioned you will get the rows which is displayed in bold in the output.

    Alvinder

  • Devx
    Devx Member Posts: 130 Blue Ribbon
    edited Mar 7, 2020 9:59PM

    I think the confusion here is that you are relating the dates to specific cid. to clarify futher, this is what i am looking for.

    i want to join all 3 tables by cid, v_id.  from that result set then i want to find the min date(ev_dt) where status is complete. then from the same  result set (3 tables joined together) i want the row with  min or_dt . then from same result set, i want to find min date_read. 

    thats is the reason i join all 3 tables first and then from the result set i was trying to find the min dates using analytics functions

    if you use min and group by on each table separate, then i guess it is more difficult.  let me now if this is clear now. thanks

  • mathguy
    mathguy Member Posts: 9,967 Gold Crown
    edited Mar 7, 2020 10:01PM

    Please explain what you believe it means to “join on cid and v_id“, since obviously you are not using the technical term. I have no idea what you mean.

    Good luck with your task.