Forum Stats

  • 3,770,919 Users
  • 2,253,183 Discussions
  • 7,875,654 Comments

Discussions

Distributed sql query through multiple databases

User_37KZC
User_37KZC Member Posts: 2
edited Jul 23, 2020 1:04PM in SQL & PL/SQL

Hello,

In order to optimise a distributed query, I used the "Driving_site" hint

select /*+ DRIVING_SITE (s1) */ * from  Table1 s1 WHERE condition in (select att1 from local_table) ; -- query n°1

select /*+ RULE DRIVING_SITE (s2) */  * from  Table2 s2 where  condition in (select att1 from local_table); --query n°2

These two queries work fine but once I join the to obtain the desired result in this way:

select * from

select /*+ DRIVING_SITE (s1) */ * from  Table1 s1 WHERE condition in (select att1 from local_table) ,

select /*+ RULE DRIVING_SITE (s2) */  * from  Table2 s2 where  condition in (select att1 from local_table)

where att_table_1 = att_table_2

I don't get the desired execution_plan and the query is rewritten using both subqueries in  one of the servers, and the query execution runs in an endless way.

Do anyone know the correct fix for such issue?

BEDETubby

Answers

  • Solomon Yakobson
    Solomon Yakobson Member Posts: 18,931 Red Diamond
    edited Jul 22, 2020 9:13AM

    Try CTE and materialize it (keep in mind hint MATERIALIZE is undocumented hint):

    with t1 as (

                select /*+ MATERIALIZE DRIVING_SITE (s1) */ * from  Table1 s1 WHERE condition in (select att1 from local_table) ,

               ),

         t2 as (

                select /*+ MATERIALIZE RULE DRIVING_SITE (s2) */  * from  Table2 s2 where  condition in (select att1 from local_table)

               )

    select  *

      from  t1,

            t2

      where att_table_1 = att_table_2

    /

    Other than that, create two global temporary tables GTT1 & GTT2 and:

    insert

      into gtt1

      select /*+ DRIVING_SITE (s1) */ * from  Table1 s1 WHERE condition in (select att1 from local_table)

    /

    insert

      into gtt2

      select /*+ RULE DRIVING_SITE (s2) */  * from  Table2 s2 where  condition in (select att1 from local_table)

    /

    select  *

      from  gtt1,

            gtt2

      where att_table_1 = att_table_2

    /

    SY.

    BEDE
  • User_37KZC
    User_37KZC Member Posts: 2
    edited Jul 23, 2020 4:40AM

    Hello,

    Trying this hasn't solved the problem.

    The execution plan is using Hash join over nested loops and It's not the best plan to choose.

    I tried to update the statistics for the evolved tables which hasn't worked.

    I tried also to use hint but it's ignored

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,795 Gold Crown
    edited Jul 23, 2020 11:12AM

    How accurate is your description of the problem - is each individual query really a single remote table with single local subquery ?

    Please supply the individual plans (using a call to dbms_xplan.display_cursor() including the 'outline' and 'remote' options in the format,  include the Predicate Information. Please use SQL*Plus (or other text output mechanism) with a suitable pagesize and linesize to avoid line wrap and repeated headings.. Then supply the execution plan for the combined query as you originally wrote it.

    Regards

    Jonathan Lewis

    P.S.  I'm not sure that your RULE hint would be doing anything - though it might make the remote query carry a RULE hint which might have an effect at the remote database.

    In principle your current "join the two queries" code could work the way you want if you add a NO_MERGE to the two separate queries.  (Off the top of my head the MATERIALIZE hint has to be ignored for distributed queries which could be why SY's suggestion didn't work).

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,795 Gold Crown
    edited Jul 23, 2020 11:14AM

    I've just run up a simple mode using your text as a guideline - and discovered a couple of things that surprised me, viz:

    the RULE hint isn't made invalid by the presence of the driving_site() materialize or no_merge hints.

    SY's suggestion of using materalized factored subqueries worked for me.  (As did my suggestion of using the no_merge hint). Interestinglythe presence of the RULE hint in the second subquery (whether materialize'd or no_merge'd) resulted in the both subqueries being optimizer by the RBO.  [Correction: parts of the strategy worked - the two queries produced their separate results, but the driving_site() hint didn't seem to be applied.]

    I ran the test on 12.2.0.1 and 19.3.0.0 - base versions. Which version are you using ?

    Regards

    Jonathan Lewis

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,795 Gold Crown
    edited Jul 23, 2020 1:04PM

    I've been playing around with this a little more.

    It looks like there are two main problems.

    First you can only have one driving site for a query - so when you put two driving_site() hints into the query they are both ignored.

    Secondly - working from the pattern of your query with its two IN subqueries - even if you tell just one of the remote database to become the driving site (and it WILL then do what you want it to do with its tables) it will then have to work out how to handle the other half of the query which, from its perspective, consists of two remote tables. Based on how 19.3 is handling a very simple example is looks as if the optimizer is always going to transform your IN subquery into "join with distinct view" - so we can't necessarily hint whichever is "the other half" to do exactly what you want. On top of that, since both the tables are in remote databases (from the perspective of the driving site) whatever you do may be inefficient.

    Perhaps the only performant option you have left is to create create non-mergable views on both the remote databasess that specify the execution plan you want from each of them, then write a query at the local database that selects and joins the two remote views; alternatively you could write a PL/SQL table function that wraps each of the smaller queries (with their individual driving_site() hints) and then write your main query as a join of the results of the two functions.

    Regards

    Jonathan Lewis

    Tubby