Forum Stats

  • 3,839,728 Users
  • 2,262,531 Discussions
  • 7,901,050 Comments

Discussions

not in

hemant_k
hemant_k Member Posts: 415 Bronze Badge
edited Nov 24, 2021 7:20AM in SQL & PL/SQL

hi

I have a query where in I have used 'not in ' construct

It takes a lot of time ..

any other way to get the result faster?

e.g.

select custno from cust_mast where custno not in (select custno from other_mast)

please help

Best Answer

Answers

  • User_H3J7U
    User_H3J7U Member Posts: 1,062 Gold Trophy

    minus.

    If custno can be null, the result will be different.

  • BEDE
    BEDE Oracle Developer Member Posts: 2,454 Gold Trophy

    There is the alternative of using not exists.

    select custno from cust_mast cm

    where not exists in (select 1 from other_mast om where cm.custno=om.custno)

    ;

    Eventually you may try the hint no_unnest, which may work reasonably if you have an index on custno in other_mast:

    select custno from cust_mast cm

    where not exists in (select /*+ no_unnest */ 1 from other_mast om where cm.custno=om.custno)

    ;

    Anyway, an index on custno in other_mast will help.

  • BluShadow
    BluShadow Member, Moderator Posts: 42,150 Red Diamond

    The problem with "not in" is that for some thing to be determined as not being in a set of things, then everything in that set of things has to be checked. It's not like "in" or "exists" where you can stop looking as soon as you find a match.

    As BEDE says, having an index on custno will help.

    It depends what you're ultimately trying to achieve. MINUS, as user_<whatever> suggests is an option, but if you're actual query contains other columns then you could use a left outer join e.g.

    select m.custno, m.other_column ... etc.
    from cust_mast m
         left outer join other_mast o on (o.custno = m.custno)
    where o.custno is null -- where m.custno is not in o.custno set
    


  • Jonathan Lewis
    Jonathan Lewis Member Posts: 10,012 Blue Diamond


    It's not possible to an appropriate answer to this question, because you haven't supplied know enough information. It's possible that the query is already running as quickly as it could - without seeing the execution plan, knowing the version, know what indexes exist, whether or not custno is declared NOT NULL on either table, and what uniqueness constraints are in place, the volume of data and what you're doing to get the data into a client program from the server we can't say much about "fastest".

    BluShadow has described why the the operation is necessary a relatively expensive one - but maybe Oracle is running extremely efficiently and the result set appears only slowly because you're fetching it one row at a time over a slow networks.

    Regards

    Jonathan Lewis

  • mathguy
    mathguy Member Posts: 10,602 Blue Diamond

    The problem with "not in" is that for some thing to be determined as not being in a set of things, then everything in that set of things has to be checked. It's not like "in" or "exists" where you can stop looking as soon as you find a match.

    That makes no sense.

    For any given condition, some rows will satisfy it and some won't, and the condition must be evaluated for ALL rows. For an "in" or "exists" conditions, you can stop looking as soon as you find a match - so that will save time for the rows that DO satisfy the condition, but it won't save any time for those that don't. To say with certainty that a row fails the condition, you still must check all the values in the IN list. Similarly for a "not in" or "not exists" condition, **you can stop looking as soon as you find a match** - that will save time for the rows that DO NOT satisfy the condition, but it won't save any time for those that do.

    The work required by a "not in" condition is exactly the same as for the "in" condition that has the same "in" ("not in") list.

  • BluShadow
    BluShadow Member, Moderator Posts: 42,150 Red Diamond

    Apologies Mathguy, I wasn't clear (as usual LOL! 😀)

    If there is an index on the column in question then using IN will be able to use the index to lookup the value pretty efficiently. Whereas using NOT IN cannot use the index as efficiently and usually results in a full scan (depends on uniqueness of the value, so if a primary key it's more likely). At least that's what I was taught at Oracle University.... Can't honestly say I've gone ahead and put it to the test; I usually avoid NOT IN for most queries unless I know it's small sets of data. 😉

  • hemant_k
    hemant_k Member Posts: 415 Bronze Badge

    Deepest regrets first of all

    had ahemorage and was in ICu ...

    have logged in today only...

    okay .. the issue is still eatinh me ... I suspect some issue with datbase or privilees....

    Actually I am trying to execute the query on DB2 written in oracle....

    could that be a reason?

    with main as 
    (
    select * from (
    select a.LBrCode,a.PrdAcctId,a.AcctType,nvl(c.AMLRating,1) AMLRating,a.CustNo,e.CodeDesc actype,a.acctstat,
    case when (a.AcctStat=3 and a.DateClosed>'31-mar-2021') or a.AcctStat<>3 then 'live' else 'cl' end r1
    from  [email protected] a left join  [email protected] c on a.CustNo=c.CustNo
    left join  [email protected] e on a.AcctType=e.Code and e.CodeType=1067
    inner join  [email protected] b on a.LBrCode=b.LBrCode and  trim( substr(a.PrdAcctId,1,8))= trim(b.PrdCd)
    inner join  [email protected] d on a.LBrCode=d.LBrCode and a.PrdAcctId=d.PrdAcctId and d.NameType=10
    where a.AcctType in (4,5,13,17,35,36,37,38,39,61,62,63,64,66) and b.ModuleType in (11,12,13,14,30,31)
    ) where r1='live'
    )
    ,
    bal as (
    select * from (
    select a.LBrCode,a.PrdAcctId,
    case when b.ModuleType in (30,31) then a.Balance1-(a.Balance2-a.Balance3)-(a.Balance4-a.Balance5) 
    else a.Balance4 end totbal,
    rank() over(partition by a.LBrCode,a.PrdAcctId order by a.CblDate desc) rnk
    from  [email protected] a left join  [email protected] b  on a.LBrCode=b.LBrCode and  trim( substr(a.PrdAcctId,1,8))= trim(b.PrdCd)
    where a.CblDate <=  '31-mar-2021'
    ) x where rnk=1
    )
    --
    select label,
    sum(low_cust_cnt	)low_cust_cnt,
    sum(low_amt			)low_amt,
    sum(medium_cust_cnt	)medium_cust_cnt,
    sum(medium_amt		)medium_amt,
    sum(high_cust_cnt	)high_cust_cnt,
    sum(high_amt        )high_amt
    from (
    select 
    'total_customers' label,
    count(  case when a.AMLRating=1 then a.CustNo end) low_cust_cnt,
    sum( case when a.AMLRating =1 then b.totbal end ) low_amt,
    count(  case when a.AMLRating=2 then a.CustNo end) medium_cust_cnt,
    sum( case when a.AMLRating =2 then b.totbal end ) medium_amt,
    count(  case when a.AMLRating=3 then a.CustNo end) high_cust_cnt,
    sum( case when a.AMLRating =3 then b.totbal end ) high_amt
    from main a left join bal b on a.LBrCode=b.LBrCode and a.PrdAcctId=b.PrdAcctId
    ) x group by label
    
    
    union all
    select label,
    sum(low_cust_cnt	)low_cust_cnt,
    sum(low_amt			)low_amt,
    sum(medium_cust_cnt	)medium_cust_cnt,
    sum(medium_amt		)medium_amt,
    sum(high_cust_cnt	)high_cust_cnt,
    sum(high_amt        )high_amt
    from (
    select 
    'Company' label,
    count(  case when a.AMLRating=1 then a.CustNo end) low_cust_cnt,
    sum( case when a.AMLRating =1 then b.totbal end ) low_amt,
    count(  case when a.AMLRating=2 then a.CustNo end) medium_cust_cnt,
    sum( case when a.AMLRating =2 then b.totbal end ) medium_amt,
    count(  case when a.AMLRating=3 then a.CustNo end) high_cust_cnt,
    sum( case when a.AMLRating =3 then b.totbal end ) high_amt
    from main a left join bal b on a.LBrCode=b.LBrCode and a.PrdAcctId=b.PrdAcctId
    where a.AcctType in (13,37,61)
    ) x group by label
    union all
    select label,
    sum(low_cust_cnt	)low_cust_cnt,
    sum(low_amt			)low_amt,
    sum(medium_cust_cnt	)medium_cust_cnt,
    sum(medium_amt		)medium_amt,
    sum(high_cust_cnt	)high_cust_cnt,
    sum(high_amt        )high_amt
    from (
    select 
    'Partnership_Firm' label,
    count(  case when a.AMLRating=1 then a.CustNo end) low_cust_cnt,
    sum( case when a.AMLRating =1 then b.totbal end ) low_amt,
    count(  case when a.AMLRating=2 then a.CustNo end) medium_cust_cnt,
    sum( case when a.AMLRating =2 then b.totbal end ) medium_amt,
    count(  case when a.AMLRating=3 then a.CustNo end) high_cust_cnt,
    sum( case when a.AMLRating =3 then b.totbal end ) high_amt
    from main a left join bal b on a.LBrCode=b.LBrCode and a.PrdAcctId=b.PrdAcctId
    where a.AcctType in (17)
    ) x group by label
    union all
    select label,
    sum(low_cust_cnt	)low_cust_cnt,
    sum(low_amt			)low_amt,
    sum(medium_cust_cnt	)medium_cust_cnt,
    sum(medium_amt		)medium_amt,
    sum(high_cust_cnt	)high_cust_cnt,
    sum(high_amt        )high_amt
    from (
    select 
    'Trust' label,
    count(  case when a.AMLRating=1 then a.CustNo end) low_cust_cnt,
    sum( case when a.AMLRating =1 then b.totbal end ) low_amt,
    count(  case when a.AMLRating=2 then a.CustNo end) medium_cust_cnt,
    sum( case when a.AMLRating =2 then b.totbal end ) medium_amt,
    count(  case when a.AMLRating=3 then a.CustNo end) high_cust_cnt,
    sum( case when a.AMLRating =3 then b.totbal end ) high_amt
    from main a left join bal b on a.LBrCode=b.LBrCode and a.PrdAcctId=b.PrdAcctId
    where a.AcctType in (4,35)
    ) x group by label
    union all
    select label,
    sum(low_cust_cnt	)low_cust_cnt,
    sum(low_amt			)low_amt,
    sum(medium_cust_cnt	)medium_cust_cnt,
    sum(medium_amt		)medium_amt,
    sum(high_cust_cnt	)high_cust_cnt,
    sum(high_amt        )high_amt
    from (
    select 
    'Association_of_Persons' label,
    count(  case when a.AMLRating=1 then a.CustNo end) low_cust_cnt,
    sum( case when a.AMLRating =1 then b.totbal end ) low_amt,
    count(  case when a.AMLRating=2 then a.CustNo end) medium_cust_cnt,
    sum( case when a.AMLRating =2 then b.totbal end ) medium_amt,
    count(  case when a.AMLRating=3 then a.CustNo end) high_cust_cnt,
    sum( case when a.AMLRating =3 then b.totbal end ) high_amt
    from main a left join bal b on a.LBrCode=b.LBrCode and a.PrdAcctId=b.PrdAcctId
    where a.AcctType in (38,62,64)
    ) x group by label
    
    
    union all
    select label,
    sum(low_cust_cnt	)low_cust_cnt,
    sum(low_amt			)low_amt,
    sum(medium_cust_cnt	)medium_cust_cnt,
    sum(medium_amt		)medium_amt,
    sum(high_cust_cnt	)high_cust_cnt,
    sum(high_amt        )high_amt
    from (
    select 
    'Others' label,
    count(  case when a.AMLRating=1 then a.CustNo end) low_cust_cnt,
    sum( case when a.AMLRating =1 then b.totbal end ) low_amt,
    count(  case when a.AMLRating=2 then a.CustNo end) medium_cust_cnt,
    sum( case when a.AMLRating =2 then b.totbal end ) medium_amt,
    count(  case when a.AMLRating=3 then a.CustNo end) high_cust_cnt,
    sum( case when a.AMLRating =3 then b.totbal end ) high_amt
    from main a left join bal b on a.LBrCode=b.LBrCode and a.PrdAcctId=b.PrdAcctId
    where a.AcctType in (5,36,39,63,66)
    ) x group by label
    
    
    union all
    --
    select label,
    sum(low_cust_cnt	)low_cust_cnt,
    sum(low_amt			)low_amt,
    sum(medium_cust_cnt	)medium_cust_cnt,
    sum(medium_amt		)medium_amt,
    sum(high_cust_cnt	)high_cust_cnt,
    sum(high_amt        )high_amt
    from (
    select 
    'inOperative_accounts' label,
    count(  case when a.AMLRating=1 then a.CustNo end) low_cust_cnt,
    count( case when a.AMLRating =1 then a.PrdAcctId end ) low_amt,
    count(  case when a.AMLRating=2 then a.CustNo end) medium_cust_cnt,
    count( case when a.AMLRating =2 then a.PrdAcctId end ) medium_amt,
    count(  case when a.AMLRating=3 then a.CustNo end) high_cust_cnt,
    count( case when a.AMLRating =3 then a.PrdAcctId end ) high_amt
    from main a left join bal b on a.LBrCode=b.LBrCode and a.PrdAcctId=b.PrdAcctId
    where a.acctStat in (4,5)
    ) x group by label
    
    
    --
    union all
    select label,
    sum(low_cust_cnt	)low_cust_cnt,
    sum(low_amt			)low_amt,
    sum(medium_cust_cnt	)medium_cust_cnt,
    sum(medium_amt		)medium_amt,
    sum(high_cust_cnt	)high_cust_cnt,
    sum(high_amt        )high_amt
    from (
    select 
    'Pending_BO' label,
    count(  case when AMLRating=1 then CustNo end) low_cust_cnt,
    sum( case when AMLRating =1 then totbal end ) low_amt,
    count(  case when AMLRating=2 then CustNo end) medium_cust_cnt,
    sum( case when AMLRating =2 then totbal end ) medium_amt,
    count(  case when AMLRating=3 then CustNo end) high_cust_cnt,
    sum( case when AMLRating =3 then totbal end ) high_amt
    from (
    select * from (
    select a.lbrcode,a.prdacctid,a.acctstat,a.accttype,b.nametype,nvl(d.amlrating,1) amlrating,a.custno,
    case when (a.AcctStat=3 and a.DateClosed>'30-jun-2021') or a.AcctStat<>3 then 'live' else 'cl' end r1,
    case when c.ModuleType in (30,31) then (y.Balance1-(y.Balance2-y.Balance3)-(y.Balance4-y.Balance5) )
    else y.Balance4 end totbal,
    rank() over (partition by y.LBrCode,y.PrdAcctId order by y.CblDate Desc ) rnk
    from [email protected] a  inner join [email protected]  b on a.lbrcode=b.lbrcode and a.prdacctid=b.prdacctid
    inner join [email protected]  c on a.lbrcode=c.lbrcode and trim(substr(a.prdacctid,1,8))=trim(c.prdcd) and c.moduletype in (11,12,13,14,30,31)
    left join [email protected] d on a.custno=d.custno
    left join [email protected] y on a.lbrcode=y.lbrcode and a.prdacctid=y.prdacctid
    where  a.AcctType in (4,5,13,17,35,36,37,38,39,61,62,63,64,66)
    --and (a.lbrcode,a.prdacctid) not in (select lbrcode,prdacctid from [email protected] where nametype=10)
    ) where r1='live' and nametype!=10  and rnk=1
    )
    )group by label
    
    
    
    
    
    
    select * from (
    select a.lbrcode,a.prdacctid,a.acctstat,a.accttype,b.nametype,nvl(d.amlrating,1) amlrating,
    case when (a.AcctStat=3 and a.DateClosed>'30-jun-2021') or a.AcctStat<>3 then 'live' else 'cl' end r1,
    case when c.ModuleType in (30,31) then (y.Balance1-(y.Balance2-y.Balance3)-(y.Balance4-y.Balance5) )
    else y.Balance4 end totbal,
    rank() over (partition by y.LBrCode,y.PrdAcctId order by y.CblDate Desc ) rnk
    from [email protected] a  inner join [email protected]  b on a.lbrcode=b.lbrcode and a.prdacctid=b.prdacctid
    inner join [email protected]  c on a.lbrcode=c.lbrcode and trim(substr(a.prdacctid,1,8))=trim(c.prdcd) and c.moduletype in (11,12,13,14,30,31)
    left join acbl.D009012 d on a.custno=d.custno
    left join acbl.d010014 y on a.lbrcode=y.lbrcode and a.prdacctid=y.prdacctid
    where  a.AcctType in (4,5,13,17,35,36,37,38,39,61,62,63,64,66)
    and (a.lbrcode,a.prdacctid) not in (select lbrcode,prdacctid from [email protected] where nametype=10)
    ) where r1='live' and nametype!=10  and rnk=1
    
    
    


    Request your attention at a last para here

    the query is taking days to parse on DB2....

    I know it is not a right forum to ask thsi question... stillll

    please help

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 10,012 Blue Diamond

    Sorry to hear that you've been in ICU, and hope you're not being pushed back into work before you're fully recovered.

    Since you're asking only about the last part of the UNION ALL, and since the display of the full SQL statement shows the "not in" clause commented out, is it safe to assume that the full query takes a reasonable time when the "not in" is removed, but a very long time when it present, and that you've satisfied yourself that the ONLY thing consuming a lot of time is that subquery clause.

    I haven't read carefully through the whole query, but it looks as if you've got

    with subquery1 as (),
    subquery2 as ()
              select from subquery1, subquery2 group by ...
    union all select from subquery1, subquery2 group by ....
    union all select from subquery1, subquery2 group by ....
    union all
    etc.
    

    and it's only in the last branch of the union all that you have the NOT IN clause.

    We need to see the execution plan - possibly of a simplified statement, but I think the following may be happening (and it's only a guess at this point):

    • the two subqueries are materializing (into global temporary tables) in the local database because they are used many times so most of the branches of the query are hash joins with aggregation of the two local "tables"
    • the final branch is now between two local tables with a remote correlated subquery, so for every row in the join you are executing a remote select - and it's the repeated execution of the query that's taking the time.

    Possibly you've tried testing the branch with the subquery by itself and not been able to find the problem - but with a single use of the factored subqueries they probably won't materialize, so the test query becomes a "fully remote" query that operates efficiently at DB2. (Again this is a guess about what you've done so far and what might have happened.)


    If all this guesswork is correct then you could test to see what happens when you put the text of the subquery into a factored subquery with the /*+ materialize */ hint, viz:

    with not_in_subq as (
            select /*+ materialize */ lbrcode,prdacctid 
            from [email protected] 
            where nametype=10
    )
    

    then your "not in" predicate is just where (....) not in (select lnrcode, prdacctid from not_on_subq)

    You'll need to include at least one other branch of the UNION ALL when testing this, or you may find that this subquery materializes but the other two don't and again you'll have a distributed query doing something expensive (or you could put the materialize hint into all three factored subqueries.

    If the materialization works you may still find you then have to control the execution plan with a couple more hints - for example forcing an UNNEST and HASH ANTIJOIN - and the options there will be affected by the version of Oracle. (You haven't said which it is).

    First step, though - check the real execution plan (pulled from memory, NOT from "explain plan").


    Regards

    Jonathan Lewis

    hemant_k
  • hemant_k
    hemant_k Member Posts: 415 Bronze Badge

    thanks for showing concern sir

    thanks