7 Replies Latest reply: Dec 6, 2012 3:25 PM by jihuyao RSS

    How to tune this query ?

    978031
      Hi,
      Check my query and explain plan for this. And let me know how I can reduce the execution time.
      Now it taking 69 sec to give output.
      Records in both tables approx. 90 lacs.
      Query :-
      select u.location,count(tweet_id) tweet_count from tweet_msg_fact t
      inner join user_profile u
      on t.user_profile_sur_key=u.user_profile_sur_key 
      group by u.location;
      here is explain plan for this
      Plan hash value: 1011782060
       
      ------------------------------------------------------------------------------------------------
      | Id  | Operation             | Name           | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
      ------------------------------------------------------------------------------------------------
      |   0 | SELECT STATEMENT      |                |   178K|  6464K|       |   157K  (1)| 00:31:32 |
      |   1 |  HASH GROUP BY        |                |   178K|  6464K|  8432K|   157K  (1)| 00:31:32 |
      |*  2 |   HASH JOIN           |                |   178K|  6464K|  5368K|   155K  (1)| 00:31:11 |
      |   3 |    VIEW               | VW_GBC_9       |   177K|  3285K|       |  7196   (1)| 00:01:27 |
      |   4 |     HASH GROUP BY     |                |   177K|  1037K|    11M|  7196   (1)| 00:01:27 |
      |   5 |      TABLE ACCESS FULL| TWEET_MSG_FACT |   975K|  5715K|       |  5680   (1)| 00:01:09 |
      |   6 |    TABLE ACCESS FULL  | USER_PROFILE   |  9003K|   154M|       |   135K  (1)| 00:27:07 |
      ------------------------------------------------------------------------------------------------
       
      Predicate Information (identified by operation id):
      ---------------------------------------------------
       
         2 - access("ITEM_1"="U"."USER_PROFILE_SUR_KEY")
      Thank you.
      Rohit
        • 1. Re: How to tune this query ?
          Manjusha Muraleedas
          Have you set primary key and forign key for tables user_profile and tweet_msg_fact?

          if no, set primary key for user_profile as user_profile_sur_key .
          and set the above field to forign key to table tweet_msg_fact.user_profile_sur_key
          • 2. Re: How to tune this query ?
            978031
            i have set the p.k & f.k as you said but it is still taking 65.71 sec.
            • 3. Re: How to tune this query ?
              Manik
              If you are licensed to use parallelism can you try this: (I know this would be strongly opposed by other posters as Oracle knows the best and hinting is overriding it but I just want this to be given a try)
                SELECT /*+ PARALLEL(t,8)*/ u.location, COUNT (tweet_id) tweet_count
                  FROM    tweet_msg_fact t
                       INNER JOIN
                          user_profile u
                       ON t.user_profile_sur_key = u.user_profile_sur_key
              GROUP BY u.location;
              Note: Again that 8 number can vary based on your init ora prmts.

              Cheers,
              Manik.

              Edited by: Added Note.
              • 4. Re: How to tune this query ?
                Nikolay Savvinov
                Hi,

                I don't see any indications that your query needs tuning. You are retrieving a large number of rows non-selectively, and according to the plan it only costs you 1 I/O per row. It's a good deal, take it.

                Best regards,
                Nikolay

                P.S. It's not a good idea to name your tables such that they look like Oracle data dictionary views (USER_PROFILE).
                • 5. Re: How to tune this query ?
                  Paul  Horth
                  975028 wrote:
                  i have set the p.k & f.k as you said but it is still taking 65.71 sec.
                  As Nikolay said, you are joining all the rows of one table with all the rows of another. It will take some time.
                  Although PK and FK are good to have anyway, they will not help in this case.

                  As another poster suggested, parallelism may help but be careful it does not eat the resources on your
                  production system.

                  Another suggestion is to use a materialized view to hold your count by location.

                  BTW: please use international English on this forum, 'lacs' is a local term. Please use thousands/millions etc.

                  Edited by: Paul Horth on Dec 5, 2012 7:54 AM

                  Edited by: Paul Horth on Dec 5, 2012 8:31 AM

                  Edited by: Paul Horth on Dec 5, 2012 8:32 AM
                  • 6. Re: How to tune this query ?
                    978031
                    Thanx all for your valuable suggestions.
                    • 7. Re: How to tune this query ?
                      jihuyao
                      Interesting to see how Oracle rewrite the query. Wondering why not use nested loop when the outer source uses a lot temp space and the inner table has very selective index on the join key (almost unique).

                      With that said, it is worth to test it out. Including column u.location with key in the index also helps.
                      select /*+ ordered use_nl(u) */ u.location, sum(v.cnt) 
                      from (select t.user_profile_sur_key, count(*) cnt from tweet_msg_fact t group by t.user_profile_sur_key) v
                      inner join user_profile u on v.user_profile_sur_key=u.user_profile_sur_key 
                      group by u.location
                      ;