
Will lower cost queries run faster than higher cost ones always?

PLSQL_GUY
PLSQL_GUY Member Posts: 77
edited March 2017 in SQL & PL/SQL

I get the same result-set using 2 methods.

One SQL uses sub-queries, the cost is 3.

The other SQL uses joins and the cost is 2.

Does this mean that the query which uses sub-queries will run 33.33% slower than the query which uses joins?

I.e. If it takes 3 minutes for the query with sub-queries to execute, does this mean that the query with joins will take only 2 minutes, given all other conditions are equal?


Answers

  • Barbara Boehmer
    Barbara Boehmer Member Posts: 4,625
    edited February 2017

    Comparing the cost of two queries is meaningless.  Please see the following article by Tom Kyte.

    https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:313416745628

  • Unknown
    edited February 2017

    Did I understand you correctly?

    I get the same result-set using 2 methods. 

    So you have two queries.

    And you executed both queries.

    And you want to know how long they take compared to each other

    I.e. If it takes 3 minutes for the query with sub-queries to execute, does this mean that the query with joins will take only 2 minutes, given all other conditions are equal?

    But you didn't bother to measure how long they took?

    Did it occur to you that you could have answered your own question if you had just timed the queries?

  • PLSQL_GUY
    PLSQL_GUY Member Posts: 77
    edited February 2017
    Barbara Boehmer wrote: Comparing the cost of two queries is meaningless. Please see the following article by Tom Kyte. https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:313416745628

    Then what is the purpose of the COST column?

  • Nimish Garg
    Nimish Garg Member Posts: 3,185 Gold Trophy
    edited February 2017

    If the database stats are up to date, then yes, the query with the lower cost will (almost always) run faster. But it is not an exact science.

    I would suggest you also look at physical reads and consistent gets in AutoTrace.
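    In SQL*Plus that is just a couple of settings (a sketch; the query is only a placeholder):

    ```sql
    -- Show execution statistics (physical reads, consistent gets, etc.)
    -- without printing the result rows
    SET AUTOTRACE TRACEONLY STATISTICS
    SELECT * FROM table1;   -- substitute your own query here
    SET AUTOTRACE OFF
    ```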

  • AndrewSayer
    AndrewSayer Member Posts: 12,998 Gold Crown
    edited February 2017
    PLSQL_GUY wrote: Barbara Boehmer wrote: Comparing the cost of two queries is meaningless. Please see the following article by Tom Kyte. https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:313416745628 Then what is the purpose of the COST column?

    To ESTIMATE the amount of work a plan will do so that the cheapest plan for a particular query can be chosen.

    As it is based on statistics and math, it could be wrong. Typically this happens when your statistics don't represent your data (maybe there's skew in some column, maybe columns are related, maybe you gathered stats when the table was empty).

    If you want to know how much work really happened, then enable row source execution statistics (set statistics_level=all, or use the gather_plan_statistics hint), execute the query, and then view the execution statistics using a call to dbms_xplan.display_cursor with the appropriate sql_id and a format that includes 'allstats last'.
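    As a minimal sketch of that workflow (the query itself is only a placeholder):

    ```sql
    -- Ask Oracle to collect actual row-source statistics for this execution
    SELECT /*+ gather_plan_statistics */ *
      FROM table1
     WHERE col1 = 'some name';

    -- Then display the plan with the actual statistics for the last
    -- statement executed in this session (NULL, NULL = last cursor)
    SELECT *
      FROM TABLE(dbms_xplan.display_cursor(NULL, NULL, 'ALLSTATS LAST'));
    ```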

  • AndrewSayer
    AndrewSayer Member Posts: 12,998 Gold Crown
    edited February 2017
    Nimish Garg wrote: If Database Stats are upto date, then yes (almost) query with the lower cost will run fast. But it is not an exact science. I would suggest you to look at physical reads and consistent gets too in AutoTrace.

    Up to date stats does not mean representative stats.

    Up to date stats does not mean representative stats.

    Up to date stats does not mean representative stats.
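    For example (a sketch; the table and column names are made up), stats can be freshly gathered and still hide skew unless a histogram is requested on the skewed column:

    ```sql
    -- Up-to-date stats, but with a histogram so the optimizer can see
    -- skew in col1 (254 is the maximum number of histogram buckets)
    BEGIN
      dbms_stats.gather_table_stats(
        ownname    => USER,
        tabname    => 'TABLE1',
        method_opt => 'FOR COLUMNS col1 SIZE 254');
    END;
    /
    ```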

  • BluShadow
    BluShadow Member, Moderator Posts: 40,986 Red Diamond
    edited February 2017
    PLSQL_GUY wrote: Barbara Boehmer wrote: Comparing the cost of two queries is meaningless. Please see the following article by Tom Kyte. https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:313416745628 Then what is the purpose of the COST column?

    In principle the cost column is more important to the database internally.

    When you submit a query, the database optimizer may produce several different plans internally (they are essentially different "paths" for how it can join and restrict data). For each of those plans for that one query, the optimizer will determine a cost based on the statistics related to the tables and data, and then it will pick the plan that has the lowest cost. So the cost is useful internally for the optimizer to pick the best plan for a single query, but it's not really useful for us as developers/performance tuners to compare two different queries, as the cost is only really relevant to different plans of the same query.

    I haven't followed the AskTom link, but if it's the one I think it is, Tom says that he wishes they hadn't included the cost in the explain plan output, as it causes people to mistakenly believe they can use it to compare different queries.

    Sure, you can use it to see when a query is obviously going wrong, e.g. a cost in the thousands or millions is typically indicative of a slow query, but you'd also determine that from the other information in the plan, such as the rows returned, the estimated time taken, the amount of I/O, the nature of the joins being used, etc.

  • PLSQL_GUY
    PLSQL_GUY Member Posts: 77
    edited February 2017

    I think Tom Kyte could be right.

    I did this test:

    I got the same result-set from 2 different queries. One used sub-queries (SQL1):

    SELECT *
      FROM table1
     WHERE col1 = (SELECT col1
                     FROM table2
                    WHERE col2 = (SELECT col2
                                    FROM table3
                                   WHERE col3 = 99999
                                     AND col4 = 'some name'));

    The cost of this query is given as 3.

    I used joins to get the same output using another query (SQL2):

    SELECT table1.*
      FROM table1
      JOIN table2 ON (table2.col1 = table1.col1)
      JOIN table3 ON (table3.col2 = table2.col2
                  AND table3.col3 = 99999
                  AND table3.col4 = 'some name');

    The cost of this query is given as 6.

    But if I run the 2 queries using SET TIMING ON, I get:

    SQL> CREATE TABLE t_1 AS SQL1; -- 0.062 seconds

    SQL> CREATE TABLE t_2 AS SQL2; -- 0.047 seconds

    So, the query which cost more ran 24% quicker than the query which cost less.

  • PLSQL_GUY
    PLSQL_GUY Member Posts: 77
    edited February 2017
    rp0428 wrote: Did I understand you correctly? I get the same result-set using 2 methods. So you have two queries. And you executed both queries. And you want to know how long they take compared to each other. I.e. If it takes 3 minutes for the query with sub-queries to execute, does this mean that the query with joins will take only 2 minutes, given all other conditions are equal? But you didn't bother to measure how long they took? Did it occur to you that you could have answered your own question if you had just timed the queries?

    But in development we don't have enough data to test real-world production loads. We have only a small amount of data in the tables. Both queries executed in sub-second time. So, how can we compare sub-second times like 0.047 seconds and 0.062 seconds? It will give misleading results, right?

    Only if we run against millions of rows in all 3 tables will we get running times of minutes, and only then can we get an idea of which is actually faster, right?
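    One common way to manufacture that volume in development (a sketch; the table name and multiplier are made up) is a row-generator cross join:

    ```sql
    -- Multiply a small table into millions of rows with a row generator
    CREATE TABLE table1_big AS
    SELECT t.*
      FROM table1 t
     CROSS JOIN (SELECT level AS n
                   FROM dual
                CONNECT BY level <= 100000) g;   -- 100,000 copies of each row

    -- Gather stats so the optimizer sees the new volume
    EXEC dbms_stats.gather_table_stats(USER, 'TABLE1_BIG')
    ```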

  • Billy Verreynne
    Billy Verreynne Member Posts: 28,267 Red Diamond
    edited February 2017

    Your clock time includes I/O. Physical I/O is usually a lot slower than logical I/O. Comparing 2 queries also means comparing their I/O - where the "better" query could be doing physical I/O because the data is not yet in the buffer cache, and the "bad" query could be hitting the cache instead. This can result in more time being consumed by the "better" query (doing less I/O) than the "bad" query (doing more I/O), as the "bad" query's I/O happens to be from memory and not disk.

    Unless the I/O between the 2 queries is handled identically (from disk and from cache), wall-clock timing the 2 queries for comparison is silly and meaningless.

  • AndrewSayer
    AndrewSayer Member Posts: 12,998 Gold Crown
    edited February 2017
    PLSQL_GUY wrote: rp0428 wrote: Did I understand you correctly? I get the same result-set using 2 methods. So you have two queries. And you executed both queries. And you want to know how long they take compared to each other. I.e. If it takes 3 minutes for the query with sub-queries to execute, does this mean that the query with joins will take only 2 minutes, given all other conditions are equal? But you didn't bother to measure how long they took? Did it occur to you that you could have answered your own question if you had just timed the queries? But in development we don't have enough data to test real world production loads. We have only very few data in the rows. Both queries executed in sub-second time. So, how can we compare sub-second times like .047 seconds and .062 seconds? It will give misleading results, right? Only if we run against millions of rows in all 3 tables will we get running times of minutes and then only we can get an idea of which is actually faster, right?

    Yes.

    So load up your tables with representative sample data. Although you probably don't want response times in the minutes, right?

  • Martin Preiss
    Martin Preiss Member Posts: 2,381 Gold Trophy
    edited February 2017

    Just to add another link to Oracle-related stuff: Cost Is Time: Next Generation: "The cost estimate generated by the Cost-Based Optimizer (CBO) has always been a time estimate, although expressed in a slightly obscure unit, which is number of single block reads."

    The optimizer uses a model to predict the performance of a query - if this model were always correct (and lower cost always meant faster access), then the optimizer would be absolutely flawless: there would be no need to tune queries manually (of course the optimizer would still need sound statistics).

  • Barbara Boehmer
    Barbara Boehmer Member Posts: 4,625
    edited February 2017

    There are a lot of things that can affect the time that it takes a query to run. It is best if you can test under circumstances as realistic as possible, such as loading tables with sufficient sample data. It also helps to eliminate anything else that may consume resources on your computer during testing. Having the right indexes and current statistics can drastically affect query time.

    Also, the first time a query is run, it must be hard parsed and that can consume time. After that, the query is stored in the SGA and does not have to be hard parsed again. I have seen a first query run in 7 seconds and subsequent runs of the same query take only 1 second. If you use bind variables, then each subsequent run of the same query with different values for the bind variables can re-use the same query in the SGA without repeating hard parsing.

    You can somewhat simulate timing with smaller data sets by doing multiple runs of the same query. Tom Kyte once wrote a runstats_pkg that can be used for this. Tom Kyte does not seem to have it posted anywhere that I can find anymore, although there are various versions with modifications. I happen to have a copy of his old code that I have posted below, along with a sample execution below that. You should do multiple runs to confirm that your results are consistent and not just due to variations in things other than your queries. The two queries that I used below are just for demonstration of how to run the code and not necessarily good queries for demonstrating the results.

    -- code that only needs to be run once to grant privileges and create the runstats_pkg:
    -- The following code by Tom Kyte requires that the user executing it
    -- be granted select on v_$statname, v_$mystat, and v_$latch by sys.
    CONNECT sys/sys_password AS SYSDBA
    GRANT SELECT ON v_$statname TO scott
    /
    GRANT SELECT ON v_$mystat TO scott
    /
    GRANT SELECT ON v_$latch TO scott
    /
    CONNECT scott/tiger
    -- beginning of code by Tom Kyte:
    create global temporary table run_stats
    ( runid varchar2(15),
      name varchar2(80),
      value int )
    on commit preserve rows;

    create or replace view stats
    as select 'STAT...' || a.name name, b.value
          from v$statname a, v$mystat b
        where a.statistic# = b.statistic#
        union all
        select 'LATCH.' || name,  gets
          from v$latch;


    create or replace package runstats_pkg
    as
        procedure rs_start;
        procedure rs_middle;
        procedure rs_stop( p_difference_threshold in number default 0 );
    end;
    /

    create or replace package body runstats_pkg
    as

    g_start number;
    g_run1  number;
    g_run2  number;

    procedure rs_start
    is
    begin
        delete from run_stats;

        insert into run_stats
        select 'before', stats.* from stats;

        g_start := dbms_utility.get_time;
    end;

    procedure rs_middle
    is
    begin
        g_run1 := (dbms_utility.get_time-g_start);

        insert into run_stats
        select 'after 1', stats.* from stats;
        g_start := dbms_utility.get_time;

    end;

    procedure rs_stop(p_difference_threshold in number default 0)
    is
    begin
        g_run2 := (dbms_utility.get_time-g_start);

        dbms_output.put_line
        ( 'Run1 ran in ' || g_run1 || ' hsecs' );
        dbms_output.put_line
        ( 'Run2 ran in ' || g_run2 || ' hsecs' );
        dbms_output.put_line
        ( 'run 1 ran in ' || round(g_run1/g_run2*100,2) ||
          '% of the time' );
        dbms_output.put_line( chr(9) );

        insert into run_stats
        select 'after 2', stats.* from stats;

        dbms_output.put_line
        ( rpad( 'Name', 30 ) || lpad( 'Run1', 10 ) ||
          lpad( 'Run2', 10 ) || lpad( 'Diff', 10 ) );

        for x in
        ( select rpad( a.name, 30 ) ||
                to_char( b.value-a.value, '9,999,999' ) ||
                to_char( c.value-b.value, '9,999,999' ) ||
                to_char( ( (c.value-b.value)-(b.value-a.value)), '9,999,999' ) data
            from run_stats a, run_stats b, run_stats c
          where a.name = b.name
            and b.name = c.name
            and a.runid = 'before'
            and b.runid = 'after 1'
            and c.runid = 'after 2'
            and (c.value-a.value) > 0
            and abs( (c.value-b.value) - (b.value-a.value) )
                  > p_difference_threshold
          order by abs( (c.value-b.value)-(b.value-a.value))
        ) loop
            dbms_output.put_line( x.data );
        end loop;

        dbms_output.put_line( chr(9) );
        dbms_output.put_line
        ( 'Run1 latches total versus runs -- difference and pct' );
        dbms_output.put_line
        ( lpad( 'Run1', 10 ) || lpad( 'Run2', 10 ) ||
          lpad( 'Diff', 10 ) || lpad( 'Pct', 10 ) );

        for x in
        ( select to_char( run1, '999,999,999' ) ||
                to_char( run2, '999,999,999' ) ||
                to_char( diff, '999,999,999' ) ||
                to_char( round( run1/run2*100,2 ), '999,999.99' ) || '%' data
            from ( select sum(b.value-a.value) run1, sum(c.value-b.value) run2,
                          sum( (c.value-b.value)-(b.value-a.value)) diff
                    from run_stats a, run_stats b, run_stats c
                    where a.name = b.name
                      and b.name = c.name
                      and a.runid = 'before'
                      and b.runid = 'after 1'
                      and c.runid = 'after 2'
                      and a.name like 'LATCH%'
                    )
        ) loop
            dbms_output.put_line( x.data );
        end loop;
    end;

    end;
    /
    -- end of code by Tom Kyte

    -- example usage of Tom Kyte's runstats_pkg to compare two queries, running each query 1000 times
    -- (you can substitute your own queries and change the number of times run)
    -- you execute runstats_pkg.rs_start, then run loops of the first query,
    -- then execute runstats_pkg.rs_middle, then run loops of the second query,
    -- then execute runstats_pkg.rs_stop and see the results of times, latches, and other statistics:
    SCOTT@orcl_12.1.0.2.0> SET SERVEROUTPUT ON SIZE 1000000
    SCOTT@orcl_12.1.0.2.0> EXECUTE runstats_pkg.rs_start

    PL/SQL procedure successfully completed.

    SCOTT@orcl_12.1.0.2.0> DECLARE
      2    v_dummy NUMBER;
      3  BEGIN
      4    FOR i IN 1 .. 1000
      5    LOOP
      6    SELECT COUNT(*) INTO v_dummy
      7    FROM emp
      8    WHERE deptno IN
      9    (SELECT deptno
    10    FROM dept);
    11    END LOOP;
    12  END;
    13  /

    PL/SQL procedure successfully completed.

    SCOTT@orcl_12.1.0.2.0> EXECUTE runstats_pkg.rs_middle

    PL/SQL procedure successfully completed.

    SCOTT@orcl_12.1.0.2.0> DECLARE
      2    v_dummy NUMBER;
      3  BEGIN
      4    FOR i IN 1 .. 1000
      5    LOOP
      6    SELECT COUNT(*) INTO v_dummy
      7    FROM emp
      8    JOIN dept ON (dept.deptno=emp.deptno);
      9    END LOOP;
    10  END;
    11  /

    PL/SQL procedure successfully completed.

    SCOTT@orcl_12.1.0.2.0> EXECUTE runstats_pkg.rs_stop
    Run1 ran in 8 hsecs
    Run2 ran in 6 hsecs
    run 1 ran in 133.33% of the time

    Name                                Run1      Run2      Diff
    STAT...Requests to/from client        5        4        -1
    LATCH.messages                        0        1        1
    STAT...session cursor cache hi    1,005    1,004        -1
    STAT...cursor authentications          0        1        1
    STAT...free buffer requested          0        1        1
    STAT...heap block compress            16        17        1
    LATCH.simulator hash latch            5        3        -2
    LATCH.cache buffers chains        14,324    14,326        2
    LATCH.object queue header oper        3        1        -2
    LATCH.OS process allocation            0        2        2
    STAT...CPU used by this sessio        7        4        -3
    STAT...recursive cpu usage            7        4        -3
    STAT...consistent gets examina        14        17        3
    STAT...consistent gets examina        14        17        3
    STAT...consistent gets pin (fa    7,015    7,018        3
    STAT...consistent gets pin        7,015    7,018        3
    STAT...cleanout - number of kt        14        17        3
    STAT...active txn count during        14        17        3
    STAT...HSC Heap Segment Block        26        30        4
    STAT...redo entries                  26        30        4
    STAT...Heap Segment Array Inse        26        30        4
    STAT...consistent gets            7,029    7,035        6
    STAT...consistent gets from ca    7,029    7,035        6
    STAT...calls to kcmgcs            2,047    2,056        9
    LATCH.shared pool                      8        17        9
    STAT...consistent changes            54        64        10
    STAT...db block gets                  54        66        12
    STAT...db block gets from cach        54        66        12
    STAT...db block changes              80        95        15
    STAT...session logical reads      7,083    7,101        18
    STAT...CPU used when call star        24        4      -20
    STAT...bytes received via SQL*    1,817    1,787      -30
    STAT...DB time                        61        4      -57
    STAT...undo change vector size    5,608    5,876      268
    STAT...redo size                  7,688    8,344      656
    STAT...logical read bytes from####################  147,456

    Run1 latches total versus runs -- difference and pct
    Run1      Run2      Diff      Pct
    14,654      14,664          10      99.93%

    PL/SQL procedure successfully completed.

  • AndrewSayer
    AndrewSayer Member Posts: 12,998 Gold Crown
    edited February 2017

    An important thing to remember is that for versions below 12c, scalar subqueries can't be unnested by the optimizer, so it doesn't factor in the number of times the scalar subquery might run - it just needs to optimize it once:

    alter session set optimizer_features_enable='11.2.0.3';
    SQL> explain plan for
      2  select (select count(*) from big_table bt2 where bt2.object_name = bt1.object_name)
      3  from big_table bt1;

    Explained.

    SQL> @X

    PLAN_TABLE_OUTPUT
    ---------------------------------------------------------------------------------------------
    Plan hash value: 3753531572

    ---------------------------------------------------------------------------------------
    | Id  | Operation            | Name          | Rows  | Bytes | Cost (%CPU)| Time    |
    ---------------------------------------------------------------------------------------
    |  0 | SELECT STATEMENT    |                |  9349K|  222M| 12643  (1)| 00:00:01 |
    |  1 |  SORT AGGREGATE      |                |    1 |    25 |            |          |
    |*  2 |  INDEX RANGE SCAN  | BIG_TABLE_NAME |  178 |  4450 |    3  (0)| 00:00:01 |
    |  3 |  INDEX FAST FULL SCAN| BIG_TABLE_NAME |  9349K|  222M| 12643  (1)| 00:00:01 |
    ---------------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

      2 - access("BT2"."OBJECT_NAME"=:B1)

    Obviously, we're really going to do that scalar subquery for each row in big_table, but why should that matter?

    In 12c, because Oracle can unnest the scalar subquery, it will cost it according to the number of times it believes it will run (and it seems to factor in the effects of scalar subquery caching, but I'm not sure to what extent):

    SQL> explain plan for
      2  select (select count(*) from big_table bt2 where bt2.object_name = bt1.object_name)
      3  from big_table bt1;

    Explained.

    SQL>
    SQL> @X

    PLAN_TABLE_OUTPUT
    --------------------------------------------------------------------------------------------------
    Plan hash value: 3753531572

    ---------------------------------------------------------------------------------------
    | Id  | Operation            | Name          | Rows  | Bytes | Cost (%CPU)| Time    |
    ---------------------------------------------------------------------------------------
    |  0 | SELECT STATEMENT    |                |  9349K|  222M|    26M  (1)| 00:17:25 |
    |  1 |  SORT AGGREGATE      |                |    1 |    25 |            |          |
    |*  2 |  INDEX RANGE SCAN  | BIG_TABLE_NAME |  178 |  4450 |    3  (0)| 00:00:01 |
    |  3 |  INDEX FAST FULL SCAN| BIG_TABLE_NAME |  9349K|  222M| 12643  (1)| 00:00:01 |
    ---------------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

      2 - access("BT2"."OBJECT_NAME"=:B1)

    Compare this to the join scenario in 12c:

    SQL> explain plan for
      2  select bt2.cnt from (select object_name, count(*) cnt from big_table bt2 group by object_name) bt2
      3  join big_table bt1 on bt2.object_name = bt1.object_name
      4  /

    Explained.

    SQL> @x

    PLAN_TABLE_OUTPUT
    ------------------------------------------------------------------------------------------------------------------
    Plan hash value: 796932027

    --------------------------------------------------------------------------------------------------
    | Id  | Operation               | Name           | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
    --------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT        |                |  9349K|   927M|       | 65988   (1)| 00:00:03 |
    |*  1 |  HASH JOIN              |                |  9349K|   927M|  4664K| 65988   (1)| 00:00:03 |
    |   2 |   VIEW                  |                | 52452 |  4046K|       | 36719   (1)| 00:00:02 |
    |   3 |    HASH GROUP BY        |                | 52452 |  1280K|   286M| 36719   (1)| 00:00:02 |
    |   4 |     INDEX FAST FULL SCAN| BIG_TABLE_NAME |  9349K|   222M|       | 12643   (1)| 00:00:01 |
    |   5 |   INDEX FAST FULL SCAN  | BIG_TABLE_NAME |  9349K|   222M|       | 12643   (1)| 00:00:01 |
    --------------------------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

       1 - access("BT2"."OBJECT_NAME"="BT1"."OBJECT_NAME")

    Note
    -----
       - this is an adaptive plan

    (which is pretty much the same as 11.2.0.3)

  • PLSQL_GUY
    PLSQL_GUY Member Posts: 77
    edited February 2017
    Billy~Verreynne wrote: Your clock time includes I/O. Physical I/O is usually a lot slower than logical I/O. Comparing 2 queries also mean comparing their I/O - where the "better" query could be doing physical I/O cause the data is not yet in the buffer cache, and the "bad" query could be hitting the cache instead. This can result in a more time consumed by the "better" query (doing less I/O), than the "bad" query (doing more I/O), as the "bad" query's I/O happens to be from memory and not disk. Unless the I/O between the 2 queries are identically handled (from disk and from cache), wall time clocking the 2 queries for comparison is silly and meaningless.

    Then how on earth can you do SQL tuning? When tuning a "bad" SQL you will write it differently, then run both old and new SQLs and compare times, right? Won't both queries that bring the same result-set have the same I/O? Because all the tables and rows are the same, right?

  • Billy Verreynne
    Billy Verreynne Member Posts: 28,267 Red Diamond
    edited February 2017

    You do not compare wall clock time - you compare metrics.

    The query using fewer resources (such as I/O and CPU) to get the job done, is the better query.

    This is also why instrumentation is important. I use a method similar to the one Tom uses (as demonstrated by Barbara above) - I collect current session metrics into a temp table at the start of a processing step, determine the delta between that and the metrics at the end of the processing step, and store the results in normal tables. If using PQ, then the PQ virtual performance views are included in the instrumentation data recorded.

    So each step in the process (or even each SQL) can be instrumented - you know exactly what resources, and how much of them, the step consumed; you know what the step spent most of its time on; you even know the execution plans used for the SQLs in that step.

    I cannot stress enough how important proper instrumentation is. It is the difference between knowing and having a wtf!? expression on your face.
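    A minimal sketch of that delta approach (the v$ views are the real Oracle performance views; the snapshot table and the "step" are illustrative):

    ```sql
    -- Snapshot this session's statistics before the step
    CREATE TABLE step_stats AS
    SELECT n.name, s.value
      FROM v$statname n
      JOIN v$mystat   s ON s.statistic# = n.statistic#;

    -- ... run the processing step here ...

    -- Report the per-statistic delta for this session after the step
    SELECT b.name, s.value - b.value AS delta
      FROM step_stats b
      JOIN v$statname n ON n.name = b.name
      JOIN v$mystat   s ON s.statistic# = n.statistic#
     WHERE s.value - b.value > 0
     ORDER BY delta DESC;
    ```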

  • Billy Verreynne
    Billy Verreynne Member Posts: 28,267 Red Diamond
    edited February 2017

    Oh yeah - I/O is not the same, as the work done to find the relevant row data differs from execution plan to execution plan. Table joins can be done as nested loops or hashes, for example. It depends on the amount of data, indexes, etc., as to which one does less work.

    So the source data may be the same. The output (queries do not create physical "result sets") may be the same. The work done to process the source in order to create the output, can differ significantly.

    Wall time is an easy metric to use for comparing queries hitting the same source data and producing the same output. It is also the lazy and incorrect metric to use.

  • Unknown
    edited February 2017
    But in development we don't have enough data to test real world production loads. We have only very few data in the rows. Both queries executed in sub-second time. So, how can we compare sub-second times like .047 seconds and .062 seconds?? It will give misleading results, right?

    You are the one that said you had two queries - one with a cost of 2 and one with a cost of 3 - and asked if one will run 33% slower.

    So why didn't you time them and see if they ran 33% slower?

    Only if we run against millions of rows in all 3 tables will we get running times of minutes and then only we can get an idea of which is actually faster, right?

    So put 'millions of rows in all 3 tables'. What is stopping you?

    At some point you are actually going to have to do some testing yourself. I know you are trying to avoid doing any real work, but that is the best way to get information.

    Try it - it isn't as hard as you seem to think it will be.

  • Cookiemonster76
    Cookiemonster76 Member Posts: 3,410
    edited March 2017
    Barbara Boehmer wrote: Also, the first time a query is run, it must be hard parsed and that can consume time. After that, the query is stored in the SGA and does not have to be hard parsed again. I have seen a first query run in 7 seconds and subsequent runs of the same query take only 1 second.

    Differences on that scale will be due to data caching, not parsing. Hard parses should be avoided, but if one is taking 6 seconds, something's seriously wrong.

    To clarify for the OP - when you run a query for the first time, Oracle will read the data from disk and load it into memory in the SGA. Subsequent runs of the query, or other queries that need the same data, will read the data from memory rather than disk. This is a lot faster, but it means you should always ignore the first run of a query for timing purposes.
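    Which is why a simple timing test usually looks something like this sketch (the query is only a placeholder) - the first run warms the cache, and only the later runs are compared:

    ```sql
    SET TIMING ON
    -- First run: may do physical I/O to load blocks into the buffer cache
    SELECT COUNT(*) FROM table1 WHERE col4 = 'some name';
    -- Second and later runs: mostly logical I/O from the cache - time these
    SELECT COUNT(*) FROM table1 WHERE col4 = 'some name';
    ```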

  • Cookiemonster76
    Cookiemonster76 Member Posts: 3,410
    edited March 2017

    Also worth pointing out - the plan Oracle chooses to run a query against tables with small amounts of data can be completely different to the plan it chooses when there are large amounts of data. Which is why looking at plans generated on tiny dev systems is generally a waste of time.

This discussion has been closed.