1 2 Previous Next 20 Replies Latest reply: Mar 21, 2011 5:23 AM by Timur Akhmadeev RSS

    CBO (optimizer) nest-loop join question

    tx103108
      OS: Red Hat Linux
      DB: 11gR1

      I have gotten two conflicting answers while reading books by Don Burleson and Dan Hotka. It has to do with the CBO and nested-joins:

      One says the CBO will choose the 'smaller' table as the driving table, the other states that the 'larger' table will be the driving table. And both stick by this philosophy as the preferred goal of any SQL Tuning -- that is, one states that the 'smaller' table should be the driving table. The other says the 'larger' table should be the driving table.

      I had always thought that the 'smaller' table should be the driving table. That in a nested loop the driving will not likely use an index even. Who is correct? (I am not going to say who said what, btw). :-)

      But I got to let one of them know they got a 'typo' ... :-)

      Thx.
        • 1. Re: CBO (optimizer) nest-loop join question
          Aman....
          If we take the example of the EMP, Dept table, the EMP table would be picked up by ORacle as the driving table and Dept table as the driven table for two reasons, one, dept is a smaller one. Second, the dept table access would be faster when there would be an index available over it (which is there on the Deptno column of it) .

          HTH
          Aman....
          • 2. Re: CBO (optimizer) nest-loop join question
            Salman Qureshi
            Hi,
            Smaller table should be the driving table.

            Table A has values 1,2,3,4,5,6,7,8,9,10
            Table B is child and has values 1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10

            If Table A is driving table
            For A in 1 to 10 LOOP
                Pick matching values from Table B
            End Loop
            
            --Total loop iterations 10
            If Table B is driving table
            For B in 1 to 20 LOOP
                Pick matching values from Table B
            End Loop
            
            --Total loop iterations 20
            Salman
            • 3. Re: CBO (optimizer) nest-loop join question
              tx103108
              This is interesting...one says the larger table (EMP) and one says the smaller table. Well, shall we do a recount?
              • 4. Re: CBO (optimizer) nest-loop join question
                tx103108
                Ok, time to battle it out ... put your votes in everyone ... bigger or smaller table as driving table (given all other things being equal)...

                :-)
                • 5. Re: CBO (optimizer) nest-loop join question
                  tx103108
                  Let's assume both tables can be accessed by an index here. I understand that if a table has an index that can be used and another does not, then the optimizer would tend to pick the indexed table even if it is the larger of the two tables in a join. I guess I am wondering, as a general rule, should not the smaller table be the driving table....?

                  Thx.
                  • 6. Re: CBO (optimizer) nest-loop join question
                    Jonathan Lewis
                    user601798 wrote:
                    Ok, time to battle it out ... put your votes in everyone ... bigger or smaller table as driving table (given all other things being equal)...

                    :-)
                    Now that's a healthy attitude. (And I really do mean that.)

                    I have to say, though, when you start a question with "Reference A says X and Reference B says the opposite" then asking which one is right is likely to lead to equally contradictory answers.

                    The correct answer is that in the absence of any context they're both wrong, but the one that says "smaller table" is slightly less wrong.

                    First, the size of the table is irrelevant - what matters is the size of (specifically number of rows in) the rowsource that Oracle thinks will be extracted from that table based on the available non-join predicates. Ordering by rowsource is the way that the optimizer chooses the first Join Order to consider.

                    Secondly - this is a simple-minded algorithm that is guaranteed to make mistakes, and the optimizer works its way through many join orders to reduce the chances of error, because it's not just the size of the rowsource that matters it's also the amount of work done to acquire that rowsource, and this is often affected by how widely the data is scattered and how precise the access paths to that data might be.

                    Regards
                    Jonathan Lewis
                    http://jonathanlewis.wordpress.com
                    http://www.jlcomp.demon.co.uk

                    A general reminder about "Forum Etiquette / Reward Points": http://forums.oracle.com/forums/ann.jspa?annID=718

                    If you never mark your questions as answered people will eventually decide that it's not worth trying to answer you because they will never know whether or not their answer has been of any use, or whether you even bothered to read it.

                    It is also important to mark answers that you thought helpful - again it lets other people know that you appreciate their help, but it also acts as a pointer for other people when they are researching the same question, moreover it means that when you mark a bad or wrong answer as helpful someone may be prompted to tell you (and the rest of the forum) what's so bad or wrong about the answer you found helpful.
                    • 7. Re: CBO (optimizer) nest-loop join question
                      jmft2012
                      Mr. Lewis,
                      If you can illustrate that with examples would be great.

                      Sean
                      • 8. Re: CBO (optimizer) nest-loop join question
                        tx103108
                        Yes, a good healthy 'jousting' is good for the brain and the heart... :-)

                        On the First point, my fault (bad semantics). I meant 'result set' not 'table' but good of you to clear that up.

                        On the Second point, you are quite right and it was good of you to clarify.

                        It is an over-simplistic scenario but, as I mentioned, if all other things are 'equal' -- which would include 'access time/work', then I think the small table as the driving table has the advantage.

                        Once again, thanks for your input. I enjoyed you book on the CBO and heard you lecture while in town a year or so ago.
                        Very good on both counts.

                        Cheers!
                        • 9. Re: CBO (optimizer) nest-loop join question
                          tx103108
                          Just as an addendum to this thread --

                          At the risk of being over-simplistic, what would be the Top 5 or 10 things one looks for in an explain plan or execution plan that suggests 'performance trouble'? ...
                          • 10. Re: CBO (optimizer) nest-loop join question
                            sb92075
                            Handle:      user601798
                            Status Level:      Newbie
                            Registered:      Oct 23, 2007
                            Total Posts:      233
                            Total Questions:      34 (19 unresolved)
                            so many questions & so few answers.
                            :-(
                            Just as an addendum to this thread --
                            consider marking this tread as ANSWERED & starting new thread with totally new & different topic

                            do as below so we can know complete Oracle version & OS name since answer may be Version dependent

                            Post via COPY & PASTE complete results of
                            SELECT * from v$version;
                            • 11. Re: CBO (optimizer) nest-loop join question
                              tx103108
                              will do. :-)
                              • 12. Re: CBO (optimizer) nest-loop join question
                                tx103108
                                Thanks for the 'cleanup tip'. Hadn't noticed I had so many questions 'unanswered'. All but a few have been marked as answered...

                                :-)
                                • 13. Re: CBO (optimizer) nest-loop join question
                                  Santosh Kumar
                                  user601798 wrote:
                                  OS: Red Hat Linux
                                  DB: 11gR1

                                  I have gotten two conflicting answers while reading books by Don Burleson and Dan Hotka. It has to do with the CBO and nested-joins:

                                  One says the CBO will choose the 'smaller' table as the driving table, the other states that the 'larger' table will be the driving table. And both stick by this philosophy as the preferred goal of any SQL Tuning -- that is, one states that the 'smaller' table should be the driving table. The other says the 'larger' table should be the driving table.

                                  I had always thought that the 'smaller' table should be the driving table. That in a nested loop the driving will not likely use an index even. Who is correct? (I am not going to say who said what, btw). :-)

                                  But I got to let one of them know they got a 'typo' ... :-)

                                  Thx.
                                  Your 'where' clause in the query should be the deciding factor.

                                  My vote:
                                  Test the different permutations and come to a conclusion.

                                  Regards,
                                  S.K.
                                  • 14. Re: CBO (optimizer) nest-loop join question
                                    Jonathan Lewis
                                    user601798 wrote:

                                    It is an over-simplistic scenario but, as I mentioned, if all other things are 'equal' -- which would include 'access time/work', then I think the small table as the driving table has the advantage.
                                    It is not possible for +"*all* other things to be equal"+. (my emphasis).

                                    If by +'access time/work'+ you mean the total is the same then it doesn't matter which table is first, the time/work is the same either way round.

                                    If you want to say that the +'access time/work'+ for acquiring the first rowsource is the same for both paths, and the +'access time/work'+ for acquiring related rows from the second table is the same FOR EACH DRIVING ROW, then the total +'access time/work'+ will be difference, and it would be better to start with the smaller table. (The example by Salman Qureshi above: Re: CBO (optimizer) nest-loop join question would apply.)

                                    On the other hand, and ignoring any idea of "all other things being equal", smaller tables tend to have smaller indexes, so if your smaller rowsource comes from a smaller table then acquiring those rows may be cheaper than acquiring rows from a larger table - which leads to the observation that (even with perfectly precise indexing):

                                    <ul>
                                    smaller number of rows * larger unit cost to find related rows
                                    </ul>
                                    may produce a larger value than
                                    <ul>
                                    larger number of rows * smaller unit cost to find related rows
                                    </ul>

                                    Regards
                                    Jonathan Lewis
                                    http://jonathanlewis.wordpress.com
                                    http://www.jlcomp.demon.co.uk

                                    A general reminder about "Forum Etiquette / Reward Points": http://forums.oracle.com/forums/ann.jspa?annID=718

                                    If you never mark your questions as answered people will eventually decide that it's not worth trying to answer you because they will never know whether or not their answer has been of any use, or whether you even bothered to read it.

                                    It is also important to mark answers that you thought helpful - again it lets other people know that you appreciate their help, but it also acts as a pointer for other people when they are researching the same question, moreover it means that when you mark a bad or wrong answer as helpful someone may be prompted to tell you (and the rest of the forum) what's so bad or wrong about the answer you found helpful.
                                    1 2 Previous Next