Join confusion (very interesting question)

    Hello Experts,

     

    My company has very strange business logic. As I already explained in this thread , multiple column function based indexes created for each table. In addition, every table contain areacode column which has just one value. I mean, for this column every rows for every table are the same, it is '201'. Also, foreign key and primary key relationship haven't implemented for some spesific reasons. In other words, there are links with each table. However, database don't know about it, because tables have not been created with primary key and foreign key logic. I know this is illogical but I cannot judge them for these meaningless logic.  Anyways, my question is, when I need to join to tables, I have to join multiple columns to each other. For example: table1.col1 = table2.col1 and table1.col2 = table2.col2 I have just create the following sample tables forin order to show you the logic.

     

     

    --Experiment initilization

     

    SQL> create table my_first as select '201' areacode, to_char(rownum+100) account_num, dbms_random.st

    ring('A', 10) name  from dual connect by level <= 5000;

    Table created.

     

    SQL> create table my_sec as select '201' areacode, to_char(rownum+100) account_num, dbms_random.st

    ring('A', 10) name  from dual connect by level <= 5000;

    Table created.

     

    SQL> alter table my_first modify account_num varchar2(12);

    Table altered.


    SQL> alter table my_sec modify account_num varchar2(12);

    Table altered.

     

    SQL> alter table my_first modify areacode varchar2(3);

    Table altered.

     

    SQL> alter table my_sec modify areacode varchar2(3);

    Table altered.

     

    SQL> create index my_first_i on my_first (substr(nls_lower(areacode), 1, 7), substr(nls_lower(account_num),1,15));

    Index created.

     

    SQL> create index my_sec_i on my_sec (substr(nls_lower(areacode), 1, 7), substr(nls_lower(account_num),1,15));

    Index created.

     

    SQL> analyze table my_first compute statistics for all indexed columns for all indexes;

    Table analyzed.

     

    SQL> analyze table my_sec compute statistics for all indexed columns for all indexes;

    Table analyzed.

     

    SQL> exec dbms_stats.gather_table_stats(USER,'MY_FIRST');

    PL/SQL procedure successfully completed.

     

    SQL> exec dbms_stats.gather_table_stats(USER,'MY_SEC');

    PL/SQL procedure successfully completed.

     

    So, when I want to join table_first and table_sec which way should I use and why?

     

    1_ select * from my_first a

    join my_sec b on substr(nls_lower(a.areacode),1,7) = substr(nls_lower(b.areacode),1,7) and substr(nls_lower(a.account_num),1,25) = substr(nls_lower(b.account_num),1,25);

     

    2_ select * from my_first a
    join my_sec b on substr(nls_lower(a.areacode),1,7) = substr(nls_lower(b.areacode),1,7) and substr(nls_lower(a.account_num),1,25) = substr(nls_lower(b.account_num),1,25)
    and substr(nls_lower(a.areacode),1,7) = '201';

     

    3_ select * from my_first a
    join my_sec b on substr(nls_lower(a.areacode),1,7) = substr(nls_lower(b.areacode),1,7) and substr(nls_lower(a.account_num),1,25) = substr(nls_lower(b.account_num),1,25)
    and substr(nls_lower(b.areacode),1,7) = '201';

     

    4_ select * from my_first a
    join my_sec b on substr(nls_lower(a.areacode),1,7) = substr(nls_lower(b.areacode),1,7) and substr(nls_lower(a.account_num),1,25) = substr(nls_lower(b.account_num),1,25)
    and substr(nls_lower(a.areacode),1,7) = '201' and substr(nls_lower(b.areacode),1,7) = '201';

     

    Note: I did my experiment at 10g. In my company 11g is used.

     

    Lastly, do you think should I use the constant column equality (substr(nls_lower(a.areacode),1,7) = '201') in join part? or in where clause like the following?

     

    select * from my_first a join my_sec b on substr(nls_lower(a.areacode),1,7) = substr(nls_lower(b.areacode),1,7) and substr(nls_lower(a.account_num),1,25) = substr(nls_lower(b.account_num),1,25)

    where substr(nls_lower(a.areacode),1,7) = '201' and substr(nls_lower(b.areacode),1,7) = '201';

     

    SQL> select * from v$version
      2  where banner like 'Oracle%';

    BANNER
    ----------------------------------------------------------------
    Oracle Database 10g Express Edition Release 10.2.0.1.0 - Product

     

    SQL> set autotrace traceonly

    SQL> select * from my_first a  join my_sec b on substr(nls_lower(a.areacode),1,7) = substr(nls_lower(b.areacode),1,7) and substr(nls_lower(a.account_num),1,25) = substr(nls_lower(b.account_num),1,25) ;

    5000 rows selected.


    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 1578512774

    -------------------------------------------------------------------------------
    | Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
    -------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT   |          |     1 |    40 |    17   (6)| 00:00:01 |
    |*  1 |  HASH JOIN         |          |     1 |    40 |    17   (6)| 00:00:01 |
    |   2 |   TABLE ACCESS FULL| MY_FIRST |  5000 |    97K|     8   (0)| 00:00:01 |
    |   3 |   TABLE ACCESS FULL| MY_SEC   |  5000 |    97K|     8   (0)| 00:00:01 |
    -------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

       1 - access(SUBSTR(NLS_LOWER("A"."AREACODE"),1,7)=SUBSTR(NLS_LOWER("B"
                  ."AREACODE"),1,7) AND SUBSTR(SUBSTR(NLS_LOWER("A"."ACCOUNT_NUM"),1
    ,25),1

                  ,15)=SUBSTR(NLS_LOWER("B"."ACCOUNT_NUM"),1,15) AND
                  SUBSTR(NLS_LOWER("A"."ACCOUNT_NUM"),1,15)=SUBSTR(SUBSTR(NLS_LOWER(
    "B"."A

                  CCOUNT_NUM"),1,25),1,15) AND SUBSTR(NLS_LOWER("A"."ACCOUNT_NUM"),1
    ,25)=S

                  UBSTR(NLS_LOWER("B"."ACCOUNT_NUM"),1,25))


    Statistics
    ----------------------------------------------------------
             29  recursive calls
              0  db block gets
            382  consistent gets
              0  physical reads
              0  redo size
         226223  bytes sent via SQL*Net to client
           4047  bytes received via SQL*Net from client
            335  SQL*Net roundtrips to/from client
              0  sorts (memory)
              0  sorts (disk)
           5000  rows processed

     

    -----------------------------------------------------------------------------------------


    SQL> select * from my_first a  join my_sec b on substr(nls_lower(a.areacode),1,7) = substr(nls_lower(b.areacode),1,7) and substr(nls_lower(a.account_num),1,25) = substr(nls_lower(b.account_num),1,25) and substr(nls_lower(a.areacode),1,7) = '201';

     

    5000 rows selected.


    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 1578512774

    -------------------------------------------------------------------------------
    | Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
    -------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT   |          |     1 |    40 |    18  (12)| 00:00:01 |
    |*  1 |  HASH JOIN         |          |     1 |    40 |    18  (12)| 00:00:01 |
    |*  2 |   TABLE ACCESS FULL| MY_FIRST |  4967 | 99340 |     9  (12)| 00:00:01 |
    |*  3 |   TABLE ACCESS FULL| MY_SEC   |  5000 |    97K|     9  (12)| 00:00:01 |
    -------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

       1 - access(SUBSTR(NLS_LOWER("A"."AREACODE"),1,7)=SUBSTR(NLS_LOWER("B"
                  ."AREACODE"),1,7) AND SUBSTR(SUBSTR(NLS_LOWER("A"."ACCOUNT_NUM"),1
    ,25),1

                  ,15)=SUBSTR(NLS_LOWER("B"."ACCOUNT_NUM"),1,15) AND
                  SUBSTR(NLS_LOWER("A"."ACCOUNT_NUM"),1,15)=SUBSTR(SUBSTR(NLS_LOWER(
    "B"."A

                  CCOUNT_NUM"),1,25),1,15) AND SUBSTR(NLS_LOWER("A"."ACCOUNT_NUM"),1
    ,25)=S

                  UBSTR(NLS_LOWER("B"."ACCOUNT_NUM"),1,25))
       2 - filter(SUBSTR(NLS_LOWER("A"."AREACODE"),1,7)='201')
       3 - filter(SUBSTR(NLS_LOWER("B"."AREACODE"),1,7)='201')


    Statistics
    ----------------------------------------------------------
              0  recursive calls
              0  db block gets
            374  consistent gets
              0  physical reads
              0  redo size
         226223  bytes sent via SQL*Net to client
           4047  bytes received via SQL*Net from client
            335  SQL*Net roundtrips to/from client
              0  sorts (memory)
              0  sorts (disk)
           5000  rows processed

     

    -----------------------------------------------------------------------------------

     

    SQL> select * from my_first a join my_sec b on substr(nls_lower(a.areacode),1,7) = substr(nls_lower(b.areacode),1,7) and substr(nls_lower(a.account_num),1,25) = substr(nls_lower(b.account_num),1,25) and substr(nls_lower(b.areacode),1,7) = '201';

     

    5000 rows selected.


    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 1578512774

    -------------------------------------------------------------------------------
    | Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
    -------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT   |          |     1 |    40 |    18  (12)| 00:00:01 |
    |*  1 |  HASH JOIN         |          |     1 |    40 |    18  (12)| 00:00:01 |
    |*  2 |   TABLE ACCESS FULL| MY_FIRST |  4967 | 99340 |     9  (12)| 00:00:01 |
    |*  3 |   TABLE ACCESS FULL| MY_SEC   |  5000 |    97K|     9  (12)| 00:00:01 |
    -------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

       1 - access(SUBSTR(NLS_LOWER("A"."AREACODE"),1,7)=SUBSTR(NLS_LOWER("B"
                  ."AREACODE"),1,7) AND SUBSTR(SUBSTR(NLS_LOWER("A"."ACCOUNT_NUM"),1
    ,25),1

                  ,15)=SUBSTR(NLS_LOWER("B"."ACCOUNT_NUM"),1,15) AND
                  SUBSTR(NLS_LOWER("A"."ACCOUNT_NUM"),1,15)=SUBSTR(SUBSTR(NLS_LOWER(
    "B"."A

                  CCOUNT_NUM"),1,25),1,15) AND SUBSTR(NLS_LOWER("A"."ACCOUNT_NUM"),1
    ,25)=S

                  UBSTR(NLS_LOWER("B"."ACCOUNT_NUM"),1,25))
       2 - filter(SUBSTR(NLS_LOWER("A"."AREACODE"),1,7)='201')
       3 - filter(SUBSTR(NLS_LOWER("B"."AREACODE"),1,7)='201')


    Statistics
    ----------------------------------------------------------
             29  recursive calls
              0  db block gets
            382  consistent gets
              0  physical reads
              0  redo size
         226223  bytes sent via SQL*Net to client
           4047  bytes received via SQL*Net from client
            335  SQL*Net roundtrips to/from client
              0  sorts (memory)
              0  sorts (disk)
           5000  rows processed

     

     

    -------------------------------------------------------------------------------------------------------------

    SQL> select * from my_first a  join my_sec b on substr(nls_lower(a.areacode),1,7) = substr(nls_lower(b.areacode),1,7) and substr(nls_lower(a.account_num),1,25) = substr(nls_lower(b.account_num),1,25) and substr(nls_lower(a.areacode),1,7) = '201' and substr(nls_lower(b.areacode),1,7) = '201';

     

    5000 rows selected.


    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 1578512774

    -------------------------------------------------------------------------------
    | Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
    -------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT   |          |     1 |    40 |    18  (12)| 00:00:01 |
    |*  1 |  HASH JOIN         |          |     1 |    40 |    18  (12)| 00:00:01 |
    |*  2 |   TABLE ACCESS FULL| MY_FIRST |  4967 | 99340 |     9  (12)| 00:00:01 |
    |*  3 |   TABLE ACCESS FULL| MY_SEC   |  5000 |    97K|     9  (12)| 00:00:01 |
    -------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

       1 - access(SUBSTR(NLS_LOWER("A"."AREACODE"),1,7)=SUBSTR(NLS_LOWER("B"
                  ."AREACODE"),1,7) AND SUBSTR(SUBSTR(NLS_LOWER("A"."ACCOUNT_NUM"),1
    ,25),1

                  ,15)=SUBSTR(NLS_LOWER("B"."ACCOUNT_NUM"),1,15) AND
                  SUBSTR(NLS_LOWER("A"."ACCOUNT_NUM"),1,15)=SUBSTR(SUBSTR(NLS_LOWER(
    "B"."A

                  CCOUNT_NUM"),1,25),1,15) AND SUBSTR(NLS_LOWER("A"."ACCOUNT_NUM"),1
    ,25)=S

                  UBSTR(NLS_LOWER("B"."ACCOUNT_NUM"),1,25))
       2 - filter(SUBSTR(NLS_LOWER("A"."AREACODE"),1,7)='201')
       3 - filter(SUBSTR(NLS_LOWER("B"."AREACODE"),1,7)='201')


    Statistics
    ----------------------------------------------------------
             29  recursive calls
              0  db block gets
            382  consistent gets
              0  physical reads
              0  redo size
         226223  bytes sent via SQL*Net to client
           4047  bytes received via SQL*Net from client
            335  SQL*Net roundtrips to/from client
              0  sorts (memory)
              0  sorts (disk)
           5000  rows processed

     

    With using WHERE clause

     

    SQL> select * from my_first a join my_sec b on substr(nls_lower(a.areacode),1,7) = substr(nls_lower(b.areacode),1,7) and substr(nls_lower(a.account_num),1,25) = substr(nls_lower(b.account_num),1,25) where substr(nls_lower(a.areacode),1,7) = '201' and substr(nls_lower(b.areacode),1,7) = '201';

    5000 rows selected.


    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 1578512774

    -------------------------------------------------------------------------------
    | Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
    -------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT   |          |     1 |    40 |    18  (12)| 00:00:01 |
    |*  1 |  HASH JOIN         |          |     1 |    40 |    18  (12)| 00:00:01 |
    |*  2 |   TABLE ACCESS FULL| MY_FIRST |  4967 | 99340 |     9  (12)| 00:00:01 |
    |*  3 |   TABLE ACCESS FULL| MY_SEC   |  5000 |    97K|     9  (12)| 00:00:01 |
    -------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

       1 - access(SUBSTR(NLS_LOWER("A"."AREACODE"),1,7)=SUBSTR(NLS_LOWER("B"
                  ."AREACODE"),1,7) AND SUBSTR(SUBSTR(NLS_LOWER("A"."ACCOUNT_NUM"),1
    ,25),1

                  ,15)=SUBSTR(NLS_LOWER("B"."ACCOUNT_NUM"),1,15) AND
                  SUBSTR(NLS_LOWER("A"."ACCOUNT_NUM"),1,15)=SUBSTR(SUBSTR(NLS_LOWER(
    "B"."A

                  CCOUNT_NUM"),1,25),1,15) AND SUBSTR(NLS_LOWER("A"."ACCOUNT_NUM"),1
    ,25)=S

                  UBSTR(NLS_LOWER("B"."ACCOUNT_NUM"),1,25))
       2 - filter(SUBSTR(NLS_LOWER("A"."AREACODE"),1,7)='201')
       3 - filter(SUBSTR(NLS_LOWER("B"."AREACODE"),1,7)='201')


    Statistics
    ----------------------------------------------------------
             29  recursive calls
              0  db block gets
            382  consistent gets
              0  physical reads
              0  redo size
         226223  bytes sent via SQL*Net to client
           4047  bytes received via SQL*Net from client
            335  SQL*Net roundtrips to/from client
              0  sorts (memory)
              0  sorts (disk)
           5000  rows processed

     

     

     

    You remarkable thoughts and ideas are very important for me.

     

    Regards

    Charlie

      • 1. Re: Join confusion (very interesting question)
        Tubby

        I would explore the option of creating virtual columns (for the expressions you are using to join entities) and then using those virtual columns in an index.

         

        Cheers,

        • 2. Re: Join confusion (very interesting question)
          Brian Bontrager
          every table contain areacode column which has just one value. I mean, for this column every rows for every table are the same, it is '201'.

          If area code is always and only 201 you NEVER need to join it because it is effectively a constant and adds no value whatsoever. The only reason I can think of is if it is part of an index, but even then you can use the literal '201' instead of another column if it is really always the same.   Why use upper() or lower() on a string that only contains digits?  If an application you have no control over chooses to do that, its one thing (and your other thread explained the resulting need for the function-based indexes), but there is no reason your own queries need to make the needless function call.

          • 3. Re: Join confusion (very interesting question)

            Tubby wrote:

             

            I would explore the option of creating virtual columns (for the expressions you are using to join entities) and then using those virtual columns in an index.

             

            Cheers,

            Unfortunately, we haven't got create virtual column grant. Whatsmore, we have nothing except select grant.

            • 4. Re: Join confusion (very interesting question)
              Tubby

              NightWing wrote:

               

              Tubby wrote:

               

              I would explore the option of creating virtual columns (for the expressions you are using to join entities) and then using those virtual columns in an index.

               

              Cheers,

              Unfortunately, we haven't got create virtual column grant. Whatsmore, we have nothing except select grant.

              So you have a problem (presumably) to solve and no tools with which to solve it.

               

              I would suggest escalating the issue with someone whom can help you get the tools you need to do the job you have been tasked with.

               

              Cheers,

              • 5. Re: Join confusion (very interesting question)
                Jonathan Lewis

                 

                NightWing wrote:

                 

                Note: I did my experiment at 10g. In my company 11g is used.

                 

                 

                I wouldn't do my experiments on 10g if the company is running 11g - there are so many bug fixes to ANSI syntax/optimisation, and fixes to NLS problems that anything you discover, or any problems you find, on 10g may be irrelevant to 11g.

                 

                Regards

                Jonathan Lewis

                • 6. Re: Join confusion (very interesting question)

                  Brian Bontrager wrote:

                   

                  If area code is always and only 201 you NEVER need to join it because it is effectively a constant and adds no value whatsoever.

                   

                  I absolutely agree with you. However, every indexed columns begins with areacode column like the following. So, I cannot do anything, this is business logic.

                   

                  create index my_first_i on my_first (substr(nls_lower(areacode), 1, 7), substr(nls_lower(column2), 1, 15), substr(nls_lower(column3), 1, 25), ..........);

                   

                   

                   

                  The only reason I can think of is if it is part of an index, but even then you can use the literal '201' instead of another column if it is really always the same.  

                  Yes, it is aways part of an index .  I don't understand what do you mean when you said "you can use the literal '201' instead of another column"?

                   

                  Why use upper() or lower() on a string that only contains digits? 

                   

                   

                  Because, business logic. I also asked my DBAs that why we have to use these functions even on digits? They also accept this is illogical but this is the rule and business logic. By the way digits are also declared as varchar2.

                   

                  For example: account_num varchar2(12)

                  The formula is: length *2 + 1

                  So, every columns indexed like this substr(nls_lower(account_num),1,25) and begins with substr(nls_lower(areacode), 1, 7). Hence, it becomes

                   

                  create index my_first_i on my_first (substr(nls_lower(areacode), 1, 7), substr(nls_lower(account_num),1,15));

                  • 7. Re: Join confusion (very interesting question)

                    Jonathan Lewis wrote:

                     

                     

                    NightWing wrote:

                     

                    Note: I did my experiment at 10g. In my company 11g is used.

                     

                     

                    I wouldn't do my experiments on 10g if the company is running 11g - there are so many bug fixes to ANSI syntax/optimisation, and fixes to NLS problems that anything you discover, or any problems you find, on 10g may be irrelevant to 11g.

                     

                    Regards

                    Jonathan Lewis

                     

                     

                    Thanks Jonathan. Also, what do you think about my join confusion?

                     

                    Regards

                    • 8. Re: Join confusion (very interesting question)

                      Tubby wrote:

                       

                      NightWing wrote:

                       

                      Tubby wrote:

                       

                      I would explore the option of creating virtual columns (for the expressions you are using to join entities) and then using those virtual columns in an index.

                       

                      Cheers,

                      Unfortunately, we haven't got create virtual column grant. Whatsmore, we have nothing except select grant.

                      So you have a problem (presumably) to solve and no tools with which to solve it.

                       

                      I would suggest escalating the issue with someone whom can help you get the tools you need to do the job you have been tasked with.

                       

                      Cheers,

                       

                      Mate, my intention is not to solve the RDBMS issues. I cannot do this. This is the business logic and they never change it. I cannot judge the company logic.

                       

                      I am just tyring to find the best join way for this business logic and scenerio.

                       

                      Regards

                      • 9. Re: Join confusion (very interesting question)
                        Tubby

                        NightWing wrote:

                         

                        Mate, my intention is not to solve the RDBMS issues. I cannot do this. This is the business logic and they never change it. I cannot judge the company logic.

                         

                        I am just tyring to find the best join way for this business logic and scenerio.

                         

                        Regards

                         

                        That's not at all true. The business logic doesn't mandate how to do a technical implementation. Nor is this a problem with the RDBMS. It's 100% a problem with an application design / implementation.

                         

                        If you're not willing or able to change that then so be it.

                         

                        Cheers,

                        • 10. Re: Join confusion (very interesting question)
                          Iordan Iotzov

                          It appears that there is very little difference in the execution
                          plan and statistics among the four statements.

                           

                           

                          The cost of the first one 17(6) is slightly lower than the
                          cost of the other three 18(12). The cost is just an estimate, and therefore
                          does not guarantee how much actual resources would be spent.

                           

                           

                          Looking at the actual resources (consistent gets, recursive calls, etc) used,
                          we can see that each query, except the second one, needed exactly the same
                          amount of consistent gets - 382. Most likely the second query needed fewer consistent
                          gets and recursive calls because it was already parsed.

                           

                           

                          You can measure actual CPU time, elapsed time and other metrics to see any of the queries
                          is better than the rest.

                           

                           

                          It is possible that the first query is the fastest because it does not have to apply
                          filters on MY_FIRST and MY_SECOND table, but I doubt that the difference would
                          be significant.

                           

                           

                          When in doubt – test!

                           

                           

                           

                           

                          Iordan Iotzov

                          • 11. Re: Join confusion (very interesting question)

                            Tubby wrote:

                             

                             

                            It's 100% a problem with an application design / implementation.

                             

                             

                             

                            Yes, absolutely. You just said it what I couldn't.

                             

                             

                             

                            If you're not willing or able to change that then so be it.

                             

                             

                             

                            I think I have been misunderstood. I didn't say that I don't want to change it. I am trying to say that I cannot chnage the application design. Because I have not no create or modify grant. Only DBAs can manipulate the database. So, we have only select grant. What I am trying to learn is, under these circumstances which join way should I use in order to get the best performance.

                             

                            Cheers

                            • 12. Re: Join confusion (very interesting question)

                              At least, can anyone please says what is the differences between the followings?

                               

                              select * from my_first a

                              join my_sec b on substr(nls_lower(a.areacode),1,7) = substr(nls_lower(b.areacode),1,7)

                              and substr(nls_lower(a.account_num),1,25) = substr(nls_lower(b.account_num),1,25)

                              and substr(nls_lower(a.areacode),1,7) = '201' and substr(nls_lower(b.areacode),1,7) = '201';

                               

                               

                              select * from my_first a

                              join my_sec b on substr(nls_lower(a.areacode),1,7) = substr(nls_lower(b.areacode),1,7)

                              and substr(nls_lower(a.account_num),1,25) = substr(nls_lower(b.account_num),1,25)

                              where substr(nls_lower(a.areacode),1,7) = '201' and substr(nls_lower(b.areacode),1,7) = '201';

                              • 13. Re: Join confusion (very interesting question)
                                Dom Brooks

                                > what is the differences between the followings?

                                For an inner join, nothing.

                                ANSI syntax allows for a distinction between join predicates and filter predicates.

                                 

                                For an outer join, the distinction between a join predicate and a filter predicate could be significant, depending on the exact circumstances.

                                • 14. Re: Join confusion (very interesting question)
                                  Jonathan Lewis

                                  NightWing wrote:

                                   

                                  At least, can anyone please says what is the differences between the followings?

                                   

                                  select * from my_first a

                                  join my_sec b on substr(nls_lower(a.areacode),1,7) = substr(nls_lower(b.areacode),1,7)

                                  and substr(nls_lower(a.account_num),1,25) = substr(nls_lower(b.account_num),1,25)

                                  and substr(nls_lower(a.areacode),1,7) = '201' and substr(nls_lower(b.areacode),1,7) = '201';

                                   

                                   

                                  select * from my_first a

                                  join my_sec b on substr(nls_lower(a.areacode),1,7) = substr(nls_lower(b.areacode),1,7)

                                  and substr(nls_lower(a.account_num),1,25) = substr(nls_lower(b.account_num),1,25)

                                  where substr(nls_lower(a.areacode),1,7) = '201' and substr(nls_lower(b.areacode),1,7) = '201';

                                   

                                  I take it you mean "why might some people argue that one form of the query is logically more correct than the other", since both queries end up with the same execution plan and result.

                                   

                                  I think the argument would go along the lines of:  you have join predicates and filter predicates - you should put join predicates in the ON clause and filter predicates in the WHERE clause.

                                  In this case the "correct" form of the query would be:

                                   

                                  select  *

                                  from    my_first a

                                  join    my_sec b

                                  on      substr(nls_lower(a.areacode),1,7) = substr(nls_lower(b.areacode),1,7)

                                  and     substr(nls_lower(a.account_num),1,25) = substr(nls_lower(b.account_num),1,25)

                                  where   substr(nls_lower(a.areacode),1,7) = '201'

                                  ;

                                   

                                   

                                  (You could put the b.areacode predicate in the where clause instead - but your join order implies A is logically the driving table).

                                  Note - even though YOU know the areacode the same for every row in both tables, this is not defined to be true in the database, hence the suitability of the areacode appearing in the join condition.

                                   

                                   

                                  You might ask your DBAs why they create an index with substr(,1,15) and then write their predicates with substr(,1,25) - this doesn't do anything to help the optimizer.

                                   

                                  Regards

                                  Jonathan Lewis

                                  1 2 上一个 下一个