1 2 Previous Next 15 Replies Latest reply on Dec 13, 2018 6:40 PM by jaramill

    How to  improve the performance of my query? / My query is running slow.

    mmoayed

      Hello All,

      I am trying to create Procedure in ODI and one task is executing this SQL:

       

      select cust.department,count(cust_id)

      from customer

      where C_date between Start_date and End_date

            and status in (Status1,status2)

            and version = SELECT MIN(cus2.version )

                                         FROM customer cus2

                                         WHERE cust.Cust_id = cust2.Cust_id AND cust2.status  = cust.status )

      group by cust.department

      union

      select cust.department, count(cust_id)

      from customer cust

      where created_date < End_date

            and exists ( select 'x'

                         from customer cust1

                         where cust1.cust_id = cust1.cust_id

                                AND cust.version  < cust1.version

                                and cust1.C_date between Start_date and End_date

                                and cust1.version = ( SELECT MIN(cust2.VERSION)                                                                           

                                                      FROM customer cust2                                                                     

                                                         WHERE cust2.cust_id = cust1.cust_id AND cust2.department != cust.department  )

                         )

      group by cust.department.

       

      Can any one help me to speed this SQL, it take very long time to be executed.

      the table contains more than 100 million records .

       

      What ideas you can suggest  ?

       

      Best regards,

      Mohammed

        • 1. Re: How to  improve the performance of my query? / My query is running slow.
          John Thorton

          mmoayed wrote:

           

          Hello All,

          I am trying to create Procedure in ODI and one task is executing this SQL:

           

          select cust.department,count(cust_id)

          from customer

          where C_date between Start_date and End_date

          and status in (Status1,status2)

          and version = SELECT MIN(cus2.version )

          FROM customer cus2

          WHERE cust.Cust_id = cust2.Cust_id AND cust2.status = cust.status )

          group by cust.department

          union

          select cust.department, count(cust_id)

          from customer cust

          where created_date < End_date

          and exists ( select 'x'

          from customer cust1

          where cust1.cust_id = cust1.cust_id

          AND cust.version < cust1.version

          and cust1.C_date between Start_date and End_date

          and cust1.version = ( SELECT MIN(cust2.VERSION)

          FROM customer cust2

          WHERE cust2.cust_id = cust1.cust_id AND cust2.department != cust.department )

          )

          group by cust.department.

           

          Can any one help me to speed this SQL, it take very long time to be executed.

          the table contains more than 100 million records .

           

          What ideas you can suggest ?

           

          Best regards,

          Mohammed


          Since NOBODY can optimize SQL just by looking at it, we need a few more details.
          http://www.orafaq.com/forum/mv/msg/84315/433888/#msg_433888
          Please refer to URL above & be sure to provide the details requested:
          1) DDL for all tables & indexes
          2) EXPLAIN PLAN
          3) output from SQL_TRACE & tkprof


          • 2. Re: How to  improve the performance of my query? / My query is running slow.
            mathguy

            There are some obvious mistakes in that query. From a quick look (less than ten seconds):  select cust.department in the first line, but then from customer in the second line (there is no alias cust given there) - that will throw a syntax error, unless you have a different table, view or synonym called cust.  Then, a little later, and version = SELECT   -   missing opening parenthesis before SELECT (but the closing parenthesis is present in the code). This would throw two errors - for unbalanced parentheses, and for missing the left parenthesis around a scalar subquery. Perhaps you have other mistakes as well - I just stopped after the second one.

             

            Is COPY & PASTE not working on your machine? Or is this not the query you have in your code (even though you say it is)?

             

            And before you go much further: I doubt that UNION is the right operator. You are UNIONing the results of two aggregate queries. UNION (as opposed to UNION ALL) de-duplicates the final result. So if a department has different counts in the two members of the UNION, that department will have two rows in the output. But if the counts are equal, there will be only one row in the output. Is that really the desired result? Then: The two subqueries will become indistinguishable after UNION. You will have a department, with two different counts (in two rows), and you will no longer know which count comes from the first subquery and which from the second. That makes no sense.

             

            First make sure the query is correct and it does what it is supposed to do. I already demonstrated that what you posted is wrong (syntactically), and I doubt that it does what it is supposed to do. If we fix the syntax and then help you optimize the query, you may still get the wrong results - just faster. Will that help you?

            • 3. Re: How to  improve the performance of my query? / My query is running slow.
              Frank Kulash

              Hi,

               

              See the Forum FAQ: Re: 3. How to improve the performance of my query? / My query is running slow.

               

              It's remarkable that you chose the exact same title for your thread, but the similarity ends there.  For example, the Forum FAQ page begins: "Performance tuning is not always as simple as just showing people a query and them telling you where it needs improving."

               

              Are you sure the query is even doing what you want?  It's producing output like

              DEPARMENT  COUNT(CUST_ID)

              ---------  --------------

              AIN                   100

              AIN                   200

              YVELINES              250

              For 'AIN', you don't know which branch of the UNION produced 100 or 200.

              For 'YVELINES', it's even less clear.  You don't know which branch produced that row, or whether both branches found 'YVELINES' and they had exactly the same count.

              Is that really what you want?

              • 4. Re: How to  improve the performance of my query? / My query is running slow.

                Can any one help me to speed this SQL, it take very long time to be executed.

                The query only takes 1 second to execute now and returns all 100 million rows!

                 

                Why does it need to be faster? How fast does it need to be? What are you doing with those 100 million rows?

                 

                Terms like 'faster' and 'very long time' have no real meaning and aren't useful as tuning goals.

                 

                REQUIREMENTS FIRST:

                 

                1. How long does the query take now?

                2. How many rows does it return?

                3. How often is it executed?

                4. How many different users execute it?

                5. How much faster does it need to be?

                • 5. Re: How to  improve the performance of my query? / My query is running slow.
                  mmoayed

                  Hello ,
                  Thanky for repyling.

                   

                   

                  Well, this SQL is not the exactlly as I am using. I just create same SQL that simulate the same job. I can't give the exact SQL I am using.
                  But it has the same Concept.

                   

                  I hope you can help  with these Informations?

                   

                  Every thing goes fine till I add this SQL in the part of the whole SQL :

                  and cust1.version = ( SELECT MIN(cust2.VERSION)

                                                           FROM customer cust2

                                                           WHERE cust2.cust_id = cust1.cust_id AND cust2.department != cust.department )

                   

                   

                  select cust.department,count(cust_id)

                    from customer

                    where C_date between Start_date and End_date

                          and status in (Status1,status2)

                          and version = (SELECT MIN(cus2.version )

                                        FROM customer cus2

                                        WHERE cust.Cust_id = cust2.Cust_id

                                              AND cust2.status = cust.status

                                        group by cust.department)

                  union

                  select cust.department, count(cust_id)

                  from customer cust

                  where created_date < End_date

                  and exists ( select 'x'

                               from customer cust1

                               where cust1.cust_id = cust1.cust_id

                                     AND cust.version < cust1.version

                                     and cust1.C_date between Start_date and End_date

                                     and cust1.version = ( SELECT MIN(cust2.VERSION)

                                                           FROM customer cust2

                                                           WHERE cust2.cust_id = cust1.cust_id AND cust2.department != cust.department )

                                                           )

                  group by cust.department.

                  • 6. Re: How to  improve the performance of my query? / My query is running slow.
                    Cookiemonster76

                    You're going to have to provide table structures (including full list of indexes) and explain plans.

                     

                    When I see a table called customers that has column called cust_id I would normally assume that column is probably unique. But if it is then the select MIN's make no sense.

                    Which is to say - we can't make useful suggestions on how to rewrite a query without understanding the data model.

                    • 7. Re: How to  improve the performance of my query? / My query is running slow.
                      Stefan Jager

                      Where do uou see a MIN on the customers ID? I only see a MIN on the Customer's version - which could be apllicable, if one customer has multiple versions.

                       

                      What I do see is table aliases mixed up and messed up - I suspect there's a lot of issues coming from that (besides the union, which should be a union all I would imagine).

                      • 8. Re: How to  improve the performance of my query? / My query is running slow.
                        BEDE

                        Just a thought:

                         

                        with cust_ver as (
                          select min(cust.version) min_ver, max(cust.version) max_ver
                            ,count(*)
                            ,cust_id
                            ,department
                          from customer cust
                          group by cust_id, department
                        )
                        select  /*+ leading(tb) use_nl(c) */ c.department, count(cust_id)
                        from customer c
                        join tb on c.cust_id=tb.cust_id and tb.min_ver=c.version and tb.department=c.department
                        where c.status in   (Status1,status2)
                        group by c.department
                        union all

                        select /*+ leading(tb) use_nl(cust) */ cust.department, count(cust_id)
                        from customer cust
                        join tb on c.cust_id=tb.cust_id and tb.min_ver=c.version and tb.department=c.department
                        where created_date < End_date
                        and exists ( select 'x'
                                     from customer cust1
                                     where cust1.cust_id = cust1.cust_id
                                           AND cust.version < cust1.version
                                           and cust1.C_date between Start_date and End_date
                                           and cust1.version = tb.min_ver and tb.cust_id=cust1.cust_id and tb.department=cust1.department
                        group by cust.department

                        ;

                         

                        Still, what I can't figure out is what you actually wish to achieve with the exists (...). Customers having several versions and what? The Hint leading and use_nl I have put thinking that you must have an index on customer(cust_id) which should better be used.


                                            

                        • 9. Re: How to  improve the performance of my query? / My query is running slow.
                          Cookiemonster76

                          and cust1.version = ( SELECT MIN(cust2.VERSION)

                                                                   FROM customer cust2

                                                                   WHERE cust2.cust_id = cust1.cust_id AND cust2.department != cust.department )

                                                                   )

                           

                          Where clause of sub-query joins to outer query on cust_id (and does anti-join to outer-outer query).

                          If cust_id is unique then that sub-query can only ever return 1 or 0 (if the deparments are the same) rows.

                          So doing a min is pointless.

                          And so is doing a sub-query at all.

                          • 10. Re: How to  improve the performance of my query? / My query is running slow.
                            LPNO

                            Hi mmoayed, you can't filter out minimal version line of each customer and do grouping over [department, customer] in the same (sub)sql chunk. BEDE points you in the right direction, even though I spot a little syntax error. As you are not giving table definition we can not test your sql. I think this is the intended sql and it should help you:

                             

                            with cust_ver as (
                              select cust_id, department, min(cust.version) min_ver, max(cust.version) max_ver
                                ,count(*)
                              from customer cust
                              group by cust_id, department
                            ) 
                            select   /*+ leading(tb) use_nl(c) */ 'sq1' src, c.department, count(cust_id)
                            from customer c, cust_ver tb.
                            join tb on c.cust_id=tb.cust_id and tb.min_ver=c.version and tb.department=c.department
                            where c.status in   (Status1,status2)
                            group by c.department
                            union all
                            
                            
                            select /*+ leading(tb) use_nl(cust) */ 'sq2' src, cust.department, count(cust_id)
                            from customer cust
                            join tb on c.cust_id=tb.cust_id and tb.min_ver=c.version and tb.department=c.department
                            where created_date < End_date
                            and exists ( select 'x'
                                         from customer cust1
                                         where cust1.cust_id = cust1.cust_id
                                               AND cust.version < cust1.version
                                               and cust1.C_date between Start_date and End_date
                                               and cust1.version = tb.min_ver and tb.cust_id=cust1.cust_id and tb.department=cust1.department
                            group by cust.department
                            ;
                            
                            

                             

                             

                            I added an SRC-Denominator  column too, as suggested by Frank. This will give you a hint from where a certain rows comes from, for further interpretation of your data.

                            • 11. Re: How to  improve the performance of my query? / My query is running slow.
                              Stefan Jager

                              I assumed (yes, not very smart to assume, I know...) that that was a consequence of the messing with table aliases. But you are correct: If customer_ID is unique, and the customer version is stored in the same table, there can only be one version per customer ID, indeed making the MIN totally useless.

                              • 12. Re: How to  improve the performance of my query? / My query is running slow.

                                Well, this SQL is not the exactlly as I am using.

                                Then any answer you get may NOT be exactly one that really works.

                                 

                                I hope you can help with these Informations?

                                And we hope that YOU 'can help with these':

                                REQUIREMENTS FIRST:

                                 

                                1. How long does the query take now?

                                2. How many rows does it return?

                                3. How often is it executed?

                                4. How many different users execute it?

                                5. How much faster does it need to be?

                                Terms like 'faster' and 'very long time' have no real meaning and aren't useful as tuning goals.

                                 

                                You need to provide SPECIFIC info about the REAL problem.

                                 

                                There is no reason at all you can provide the REAL answers to the questions above - there is NOTHING proprietary about any of them.

                                 

                                What works for a table with 100 rows may not work at all for a table with 100 million rows.

                                 

                                What works for a query executed once an hour may not be appropriate for a query that executes 100 times an hour.

                                • 13. Re: How to  improve the performance of my query? / My query is running slow.
                                  mathguy

                                  mmoayed wrote:

                                   

                                  Hello ,
                                  Thanky for repyling.

                                   

                                   

                                  Well, this SQL is not the exactlly as I am using. I just create same SQL that simulate the same job. I can't give the exact SQL I am using.
                                  But it has the same Concept.

                                   

                                   

                                  What do you mean by "same Concept"?

                                   

                                  I already pointed out a non-trivial issue in your code (the one you posted, anyway). I don't mean the table alias or the missing parenthesis; I mean the UNION.

                                   

                                  Does your actual query have UNION in it? If it does, is it the correct thing to use? (I seriously doubt that.) If you have UNION where you should have UNION ALL instead, that is a simple change to make, then run your query again and see if that fixed your speed issue. It may very well be enough.

                                   

                                  If your actual query doesn't have UNION in it, then it is not same Concept.  You are asking us to help you with optimization, we'll say something like "replace UNION with UNION ALL", and your reply will be, "I have UNION  in what I posted here, but I don't have UNION in my actual code." So then you are wasting everyone's time.

                                   

                                  Well, not mine. Not anymore.   Good luck fixing your problems using this approach.

                                  • 14. Re: How to  improve the performance of my query? / My query is running slow.
                                    mmoayed

                                    Thank you all, Your answers help me how to re-write the SQL using With clause. The SQL now just in seconds executed.

                                     

                                    BR;

                                    Mohammed

                                    1 2 Previous Next