This discussion is archived
12 Replies Latest reply: Feb 6, 2013 4:48 AM by 861381 RSS

How to improve 'select query with subselects'?

861381 Newbie
Currently Being Moderated
Oracle Database 11g Release 11.1.0.7.0 - 64bit Production
PL/SQL Release 11.1.0.7.0
Hi there,

I'm a C# developer so I really have troubles with understanding PL/SQL. I use a lot of subselects in my queries because I find it easy to understand, however, I need to learn to make use of the power of queries. I made this little library sample to illustrate the problem I have in a smaller scale. I have multiple subselects looking in the same table, running through the same amount of columns to obtain data. This is a waste of performance and resources.

Idea of the query:
Information about a book
- Location of this book
- Number of unique readers of this book
- Number of days it was rent
- Number of reservation of this book

Test data
with 
book as (
select 1 as id, 'reference0001' as reference, 'titel0001' as title, 'description0001' as description, 'author0001' as author, 1 as locid from dual union all
select 2 as id, 'reference0002' as reference, 'titel0002' as title, 'description0002' as description, 'author0002' as author, 1 as locid from dual union all
select 3 as id, 'reference0003' as reference, 'titel0003' as title, 'description0003' as description, 'author0003' as author, 1 as locid from dual union all
select 4 as id, 'reference0004' as reference, 'titel0004' as title, 'description0004' as description, 'author0004' as author, 2 as locid from dual union all
select 5 as id, 'reference0005' as reference, 'titel0005' as title, 'description0005' as description, 'author0005' as author, 2 as locid from dual union all
select 6 as id, 'reference0006' as reference, 'titel0006' as title, 'description0006' as description, 'author0006' as author, 2 as locid from dual union all
select 7 as id, 'reference0007' as reference, 'titel0007' as title, 'description0007' as description, 'author0007' as author, 3 as locid from dual union all
select 8 as id, 'reference0008' as reference, 'titel0008' as title, 'description0008' as description, 'author0008' as author, 3 as locid from dual
),
location as (
select 1 as id, '100.1' as roomnumber from dual union all
select 2 as id, '100.2' as roomnumber from dual union all
select 3 as id, '100.3' as roomnumber from dual
),
client as (
select 1 as id, 'client001' as name from dual union all
select 2 as id, 'client002' as name from dual union all
select 3 as id, 'client003' as name from dual
),
book_history as (
select 1 as bookid, 1 as clientid, 10 as days, to_date('06-06-2012') as rentdate from dual union all
select 3 as bookid, 1 as clientid, 5 as days, to_date('06-06-2012') as rentdate  from dual union all
select 2 as bookid, 2 as clientid, 5 as days, to_date('07-06-2012') as rentdate  from dual union all
select 1 as bookid, 2 as clientid, 5 as days, to_date('07-06-2012') as rentdate  from dual union all
select 2 as bookid, 1 as clientid, 5 as days, to_date('08-06-2012') as rentdate  from dual union all
select 2 as bookid, 1 as clientid, 5 as days, to_date('08-06-2012') as rentdate  from dual union all
select 3 as bookid, 1 as clientid, 3 as days, to_date('09-06-2012') as rentdate  from dual
),
reservation_history as (
select 1 as bookid, 1 as clientid, to_date('04-06-2012') as reservationdate from dual union all
select 3 as bookid, 1 as clientid, to_date('04-06-2012') as reservationdate  from dual union all
select 2 as bookid, 2 as clientid, to_date('05-06-2012') as reservationdate  from dual union all
select 1 as bookid, 2 as clientid, to_date('05-06-2012') as reservationdate  from dual union all
select 2 as bookid, 1 as clientid, to_date('06-06-2012') as reservationdate  from dual union all
select 2 as bookid, 1 as clientid, to_date('06-06-2012') as reservationdate  from dual union all
select 3 as bookid, 1 as clientid, to_date('07-06-2012') as reservationdate  from dual
)
Normally my queries look like this
select  b.id,
        b.reference,
        b.title,
        b.description,
        b.author,
        l.roomnumber,
        (
           select count(distinct bh.clientid)
           from   book_history bh
           where  bh.bookid = b.id
           and    bh.rentdate >= to_date('01-01-2012')
           and    bh.rentdate < to_date('01-01-2013')
         ) as different_readers,
        nvl((
           select sum(bh.days)
           from   book_history bh
           where  bh.bookid = b.id
           and    bh.rentdate >= to_date('01-01-2012')
           and    bh.rentdate < to_date('01-01-2013')
         ),0) as days_of_rent,
        (
          select count(1)
          from reservation_history rh
          where rh.bookid = b.id
          and   rh.reservationdate >= to_date('01-01-2012')
          and   rh.reservationdate < to_date('01-01-2013')
        ) as times_reserved
from    book b
join    location l
  on     b.locid = l.id  
I tried to improve it to this
  select b.id,
          b.reference,
          b.title,
          b.description,
          b.author,
       l.roomnumber,
       nvl(t1.clients, 0) as different_readers,
       nvl(t1.days, 0) as days_of_rent,
       (
          select count(1)
          from reservation_history rh
          where rh.bookid = b.id
          and   rh.reservationdate >= to_date('01-01-2012')
          and   rh.reservationdate < to_date('01-01-2013')
        ) as times_reserved
from   book b
left join location l
       on b.locid = l.id
left join (select bh.bookid as bookid,
                  count(distinct bh.clientid) as clients,
                  sum(bh.days) as days
           from   book_history bh
           where  bh.rentdate >= to_date('01-01-2012')
           and    bh.rentdate < to_date('01-01-2013')
           group by bh.bookid
           ) t1
        on t1.bookid = b.id  
I am told that I should try to move out the group by from the last left join or try to use partitioning.
Now this is a bit too much for my understanding. When I use partitioning
select distinct 
       b.id,
          b.reference,
          b.title,
          b.description,
          b.author,
       l.roomnumber,
       nvl(count(distinct bh.clientid) over (partition by bh.bookid), 0) as different_readers,
       nvl(sum(bh.days) over (partition by bh.bookid),0) as days_of_rent,
       (
          select count(1)
          from reservation_history rh
          where rh.bookid = b.id
          and   rh.reservationdate >= to_date('01-01-2012')
          and   rh.reservationdate < to_date('01-01-2013')
        ) as times_reserved
from   book b
left join location l
       on b.locid = l.id
left join book_history bh
       on b.id = bh.bookid
      and bh.rentdate >= to_date('01-01-2012')
      and bh.rentdate < to_date('01-01-2013')
I suppressed same result rows by using distinct, however I am told to avoid using distinct as this slows down a lot.
Is there any other solution I can use to achieve the same?

I hope my question is clear enough to be answered! Thanks in advance for your time!

Kind regards,

Metroickha

Edited by: 858378 on 6-feb-2013 2:37
  • 1. Re: How to improve 'select query with subselects'?
    jeneesh Guru
    Currently Being Moderated
    Is this not a simple GROUP BY - Coz the non aggregated columns in your SELECT LIST are from the master tables BOOK and LOCATION. Hopefully the relation to these tables will be many to one for the table BOOK_HSITORY
    select b.id,
              b.reference,
              b.title,
              b.description,
              b.author,
           l.roomnumber,
           count(distinct bh.clientid) different_readers,
           nvl(sum(bh.days),0) as days_of_rent
    from   book b
    left join location l
           on b.locid = l.id
    left join book_history bh
           on b.id = bh.bookid
          and bh.rentdate >= to_date('01-01-2012','dd-mm-yyyy')
          and bh.rentdate < to_date('01-01-2013','dd-mm-yyyy')
    group by  b.id,
              b.reference,
              b.title,
              b.description,
              b.author,
           l.roomnumber
    order by b.id; 
    
    ID REFERENCE     TITLE     DESCRIPTION     AUTHOR     ROOMNUMBER DIFFERENT_READERS DAYS_OF_RENT
    -- ------------- --------- --------------- ---------- ---------- ----------------- ------------
     1 reference0001 titel0001 description0001 author0001 100.1                      2           15 
     2 reference0002 titel0002 description0002 author0002 100.1                      2           15 
     3 reference0003 titel0003 description0003 author0003 100.1                      1            8 
     4 reference0004 titel0004 description0004 author0004 100.2                      0            0 
     5 reference0005 titel0005 description0005 author0005 100.2                      0            0 
     6 reference0006 titel0006 description0006 author0006 100.2                      0            0 
     7 reference0007 titel0007 description0007 author0007 100.3                      0            0 
     8 reference0008 titel0008 description0008 author0008 100.3                      0            0 
    
     8 rows selected 
    Edited by: jeneesh on Feb 6, 2013 3:45 PM
  • 2. Re: How to improve 'select query with subselects'?
    BluShadow Guru Moderator
    Currently Being Moderated
    Thanks for providing the data in a usable format, that really helps.
    858378 wrote:
    I am told that I should try to move out the group by from the last left join or try to use partitioning.
    Now this is a bit too much for my understanding. When I use partitioning
    select b.id,
              b.reference,
              b.title,
              b.description,
              b.author,
    l.roomnumber,
    count(distinct bh.clientid) over (partition by bh.bookid) as different_readers,
    nvl(sum(bh.days) over (partition by bh.bookid),0) as days_of_rent
    from   book b
    left join location l
    on b.locid = l.id
    left join book_history bh
    on b.id = bh.bookid
    and bh.rentdate >= '01-01-2012'
    and bh.rentdate &lt; '01-01-2013'
    I get multiple rows with the same values (as expected). So what do I need to add to remove the multiple same rows?
    Use a DISTINCT...
    SQL> ed
    Wrote file afiedt.buf
    
      1  with book as (
      2      select 1 as id, 'reference0001' as reference, 'titel0001' as title, 'description0001' as description, 'author0001' as author, 1 as locid from dual union all
      3      select 2 as id, 'reference0002' as reference, 'titel0002' as title, 'description0002' as description, 'author0002' as author, 1 as locid from dual union all
      4      select 3 as id, 'reference0003' as reference, 'titel0003' as title, 'description0003' as description, 'author0003' as author, 1 as locid from dual union all
      5      select 4 as id, 'reference0004' as reference, 'titel0004' as title, 'description0004' as description, 'author0004' as author, 2 as locid from dual union all
      6      select 5 as id, 'reference0005' as reference, 'titel0005' as title, 'description0005' as description, 'author0005' as author, 2 as locid from dual union all
      7      select 6 as id, 'reference0006' as reference, 'titel0006' as title, 'description0006' as description, 'author0006' as author, 2 as locid from dual union all
      8      select 7 as id, 'reference0007' as reference, 'titel0007' as title, 'description0007' as description, 'author0007' as author, 3 as locid from dual union all
      9      select 8 as id, 'reference0008' as reference, 'titel0008' as title, 'description0008' as description, 'author0008' as author, 3 as locid from dual
     10      ),
     11    location as (
     12      select 1 as id, '100.1' as roomnumber from dual union all
     13      select 2 as id, '100.2' as roomnumber from dual union all
     14      select 3 as id, '100.3' as roomnumber from dual
     15      ),
     16    client as (
     17      select 1 as id, 'client001' as name from dual union all
     18      select 2 as id, 'client002' as name from dual union all
     19      select 3 as id, 'client003' as name from dual
     20      ),
     21    book_history as (
     22      select 1 as bookid, 1 as clientid, 10 as days, to_date('06-06-2012','DD-MM-YYYY') as rentdate from dual union all
     23      select 3 as bookid, 1 as clientid, 5 as days, to_date('06-06-2012','DD-MM-YYYY') as rentdate  from dual union all
     24      select 2 as bookid, 2 as clientid, 5 as days, to_date('07-06-2012','DD-MM-YYYY') as rentdate  from dual union all
     25      select 1 as bookid, 2 as clientid, 5 as days, to_date('07-06-2012','DD-MM-YYYY') as rentdate  from dual union all
     26      select 2 as bookid, 1 as clientid, 5 as days, to_date('08-06-2012','DD-MM-YYYY') as rentdate  from dual union all
     27      select 2 as bookid, 1 as clientid, 5 as days, to_date('08-06-2012','DD-MM-YYYY') as rentdate  from dual union all
     28      select 3 as bookid, 1 as clientid, 3 as days, to_date('09-06-2012','DD-MM-YYYY') as rentdate  from dual
     29      )
     30  select  distinct
     31          b.id
     32         ,b.reference
     33         ,b.title
     34         ,b.description
     35         ,b.author
     36         ,l.roomnumber
     37         ,count(distinct bh.clientid) over (partition by b.id) as different_readers
     38         ,nvl(sum(bh.days) over (partition by b.id),0) as days_of_rent
     39  from    book b
     40          join location l on (b.locid = l.id)
     41          left outer join book_history bh on (   bh.bookid = b.id
     42                                             and bh.rentdate >= to_date('01-01-2012','DD-MM-YYYY')
     43                                             and bh.rentdate < to_date('01-01-2013','DD-MM-YYYY')
     44                                             )
     45* order by 1
    SQL> /
    
            ID REFERENCE     TITLE     DESCRIPTION     AUTHOR     ROOMN DIFFERENT_READERS DAYS_OF_RENT
    ---------- ------------- --------- --------------- ---------- ----- ----------------- ------------
             1 reference0001 titel0001 description0001 author0001 100.1                 2           15
             2 reference0002 titel0002 description0002 author0002 100.1                 2           15
             3 reference0003 titel0003 description0003 author0003 100.1                 1            8
             4 reference0004 titel0004 description0004 author0004 100.2                 0            0
             5 reference0005 titel0005 description0005 author0005 100.2                 0            0
             6 reference0006 titel0006 description0006 author0006 100.2                 0            0
             7 reference0007 titel0007 description0007 author0007 100.3                 0            0
             8 reference0008 titel0008 description0008 author0008 100.3                 0            0
    
    8 rows selected.
    
    SQL>
    also ensure you treat dates as dates and specify a format mask when using TO_DATE.
  • 3. Re: How to improve 'select query with subselects'?
    Karthick_Arp Guru
    Currently Being Moderated
    Basically your table BOOK and BOOK_HISTORY has one-to-many relation. So to join them and get the count from BOOK_HISTORY for each ID in BOOK table, You have to aggregate your BOOK_HISTORY table for the ID and then do a join with BOOK. Now this join after aggregation becomes one-to-one relation which is perfect for getting your output.

    So i think your second query meets the requirement in an optimal way.

    You cant use analytical function as an aggregation is required on BOOK_HISTORY. So analytical function is kind of not relevant and it would create cartesian product and thats the reason you see multiple rows in the third query.

    One general suggestion. Dont pass date as string, use TO_DATE along with Format (second parameter). When you pass it as string oracle tries to do implicit conversion based on the NLS paramter. And this can fail if the NLS setting is different.

    Check this out...
    SQL> alter session set nls_date_format = 'DD-MM-YYYY';
     
    Session altered.
     
    SQL> select * from dual where trunc(sysdate) = '06-02-2013'
      2  /
     
    D
    -
    X
     
    SQL> alter session set nls_date_format = 'YYYY-MM-DD'
      2  /
     
    Session altered.
     
    SQL> select * from dual where trunc(sysdate) = '06-02-2013'
      2  /
    select * from dual where trunc(sysdate) = '06-02-2013'
                                              *
    ERROR at line 1:
    ORA-01830: date format picture ends before converting entire input string
     
    Edited by: Karthick_Arp on Feb 6, 2013 2:24 AM
  • 4. Re: How to improve 'select query with subselects'?
    Purvesh K Guru
    Currently Being Moderated
    You can use a Distinct to supress Duplicate rows.

    However, I feel that the Original Query with sub-queries is different from the later versions. Since there is no Left Join involved in the Original Query.

    Although, the results do not differ since there may not be presence of data that is applicable only to Book but not with Location. But in case of such presence of data, you will retrieve additional rows that would have been suppressed in your Original query. So, please do have a check on it.
  • 5. Re: How to improve 'select query with subselects'?
    861381 Newbie
    Currently Being Moderated
    The solution you gave worked for this example. I updated the example which looks more like the problem I currently have. I added reservation history. I appreciated your quick answer!
  • 6. Re: How to improve 'select query with subselects'?
    861381 Newbie
    Currently Being Moderated
    I am told that using distinct can greatly reduce a query performance, is this correct? Your solution is helpful. Thanks for the quick reply.
  • 7. Re: How to improve 'select query with subselects'?
    861381 Newbie
    Currently Being Moderated
    Thanks for the to_date advise!

    However I do not understand the difference between BOOK joined by BOOK_HISTORY ... or BOOK_HISTORY joined by BOOK ... the result table of these joins are the same?
  • 8. Re: How to improve 'select query with subselects'?
    jeneesh Guru
    Currently Being Moderated
    Still Analytics is not the one required..
    with book_hist_cnt as
    (
    select bh.bookid,count(distinct bh.clientid) dist_bk_cnt,
           sum(bh.days) bk_days
    from   book_history bh
    where  bh.rentdate >= to_date('01-Jan-2012','dd-Mon-yyyy')
    and    bh.rentdate < to_date('01-Jan-2013','dd-Mon-yyyy')
    group by bh.bookid
    ),
    res_hist_cnt as
    (
    select rh.bookid,count(1) res_cnt
    from reservation_history rh
    where  rh.reservationdate >= to_date('01-Jan-2012','dd-Mon-yyyy')
    and   rh.reservationdate < to_date('01-Jan-2013','dd-Mon-yyyy')
    group by rh.bookid
    )
    select  b.id,b.reference,b.title,b.description,b.author,l.roomnumber,
            nvl(bh.dist_bk_cnt,0) dist_bk_cnt,nvl(bh.bk_days,0) bk_days,
            nvl(rh.res_cnt,0) res_cnt
    from    book b
    join    location l
      on     b.locid = l.id
    left outer join  book_hist_cnt bh
      on bh.bookid = b.id
    left outer join  res_hist_cnt rh
      on rh.bookid = b.id
    order by 1  ;
    
    ID REFERENCE     TITLE     DESCRIPTION     AUTHOR     ROOMNUMBER DIST_BK_CNT BK_DAYS RES_CNT
    -- ------------- --------- --------------- ---------- ---------- ----------- ------- -------
     1 reference0001 titel0001 description0001 author0001 100.1                2      15       2 
     2 reference0002 titel0002 description0002 author0002 100.1                2      15       3 
     3 reference0003 titel0003 description0003 author0003 100.1                1       8       2 
     4 reference0004 titel0004 description0004 author0004 100.2                0       0       0 
     5 reference0005 titel0005 description0005 author0005 100.2                0       0       0 
     6 reference0006 titel0006 description0006 author0006 100.2                0       0       0 
     7 reference0007 titel0007 description0007 author0007 100.3                0       0       0 
     8 reference0008 titel0008 description0008 author0008 100.3                0       0       0 
    
     8 rows selected 
  • 9. Re: How to improve 'select query with subselects'?
    jeneesh Guru
    Currently Being Moderated
    858378 wrote:
    I am told that using distinct can greatly reduce a query performance, is this correct? Your solution is helpful. Thanks for the quick reply.
    I feel, DISTINCT and Analytics are not required..I have just amended your second query to avoid two scans on the history table.. (Please have a look at my previous post)..

    This will be the simplest and efficient way, since you are joining two detail tables - which are not directly related..
  • 10. Re: How to improve 'select query with subselects'?
    Karthick_Arp Guru
    Currently Being Moderated
    However I do not understand the difference between BOOK joined by BOOK_HISTORY ... or BOOK_HISTORY joined by BOOK ... the result table of these joins are the same?
    BOOK joined by BOOK_HISTORY -> one-to-many
    BOOK joined by Aggregated BOOK_HISTORY -> one-to-one

    I was talking about that.

    You can write your query like this
    SQL> with book_history_agg
      2  as
      3  (
      4  select bookid,
      5         count(distinct clientid) as clients,
      6         sum(days) as days
      7    from book_history
      8   where rentdate >= to_date('01-01-2012', 'DD-MM-YYYY')
      9     and rentdate < to_date('01-01-2013', 'DD-MM-YYYY')
     10   group
     11      by bookid
     12  ), reservation_history_agg
     13  as
     14  (
     15  select bookid
     16       , count(1) times_reserved
     17    from reservation_history
     18   where reservationdate >= to_date('01-01-2012', 'DD-MM-YYYY')
     19     and reservationdate < to_date('01-01-2013', 'DD-MM-YYYY')
     20   group
     21      by bookid
     22  )
     23  select b.id
     24       , b.reference
     25       , b.title
     26       , b.description
     27       , b.author
     28       , l.roomnumber
     29       , bh.clients
     30       , bh.days
     31       , rh.times_reserved
     32    from book b
     33    left join location l
     34      on b.locid = l.id
     35    left join book_history_agg bh
     36      on bh.bookid = b.id
     37    left join reservation_history_agg rh
     38      on rh.bookid = b.id
     39  /
     
            ID REFERENCE     TITLE     DESCRIPTION     AUTHOR     ROOMN    CLIENTS       DAYS TIMES_RESERVED
    ---------- ------------- --------- --------------- ---------- ----- ---------- ---------- --------------
             1 reference0001 titel0001 description0001 author0001 100.1          2         15              2
             3 reference0003 titel0003 description0003 author0003 100.1          1          8              2
             2 reference0002 titel0002 description0002 author0002 100.1          2         15              3
             5 reference0005 titel0005 description0005 author0005 100.2
             6 reference0006 titel0006 description0006 author0006 100.2
             4 reference0004 titel0004 description0004 author0004 100.2
             8 reference0008 titel0008 description0008 author0008 100.3
             7 reference0007 titel0007 description0007 author0007 100.3
     
    8 rows selected.
     
    SQL> 
  • 11. Re: How to improve 'select query with subselects'?
    861381 Newbie
    Currently Being Moderated
    Thank you. I think the question is answered. I understand and like your solution, which is a good thing! However, it seems slower compared to the other two solutions I made.
  • 12. Re: How to improve 'select query with subselects'?
    861381 Newbie
    Currently Being Moderated
    I understand the relation part now! Thanks. I also understand your solution. It was really helpfull and thanks for the insights!

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points