Bad estimated time in execution plan

user455466 Member Posts: 32 Blue Ribbon

Hi,

Over the last few months our DB has been getting slow. The admin says he doesn't see anything wrong.

In the execution plan below, how likely is it that Oracle makes such a big mistake estimating the time of this simple query? Could this be a clear sign that the DB has a problem?

SQL*Plus: Release 10.2.0.4.0 - Production on Thu Sep 23 21:12:47 2021

Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
and Real Application Testing options

select count(*) from (
select a.* from COTA.COGS_TCORPORATIVOS_1 a, (select * from COTA.COGS_TCORPORATIVOS_1 where rownum < 10000) b);

  COUNT(*)
----------
1826877294

Elapsed: 00:03:57.93

Execution Plan
----------------------------------------------------------
Plan hash value: 1851713859

------------------------------------------------------------------------------------
| Id  | Operation                | Name            | Rows  | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |                 |     1 |   130K (32)| 00:00:06 |
|   1 |  SORT AGGREGATE          |                 |     1 |            |          |
|   2 |   MERGE JOIN CARTESIAN   |                 |  1866M|   130K (32)| 00:00:06 |
|   3 |    VIEW                  |                 |  9999 |     2   (0)| 00:00:01 |
|*  4 |     COUNT STOPKEY        |                 |       |            |          |
|   5 |      INDEX FAST FULL SCAN| COGS_TCORP_PK_1 |  9999 |     2   (0)| 00:00:01 |
|   6 |    BUFFER SORT           |                 |   186K|   130K (32)| 00:00:06 |
|   7 |     INDEX FAST FULL SCAN | COGS_TCORP_PK_1 |   186K|    13  (31)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter(ROWNUM<10000)


Statistics
----------------------------------------------------------
         70  recursive calls
          0  db block gets
        512  consistent gets
        397  physical reads
          0  redo size
        564  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          6  sorts (memory)
          0  sorts (disk)
          1  rows processed
         


I would appreciate any comment.


Regards, Jose Luis.

Best Answer

  • Paulzip Member Posts: 8,460 Blue Diamond
    edited Sep 27, 2021 10:55PM Accepted Answer

    The TIME column values are a guess, calculated from the cost using system I/O statistics. I don't believe they factor in things like how many I/Os were physical, how fast they were read in, how many other users are on the system, network issues and so on. It might also mean you need to run dbms_stats.gather_system_stats to get more accurate ones, clear the execution plan from the shared pool (see below) and try again. However, there's a multitude of reasons for slow DB performance. The common ones are stale stats, I/O bottlenecks, an under-resourced server, too many concurrent sessions, CPU bottlenecks, lack of usable memory, and so on. Oracle comes with the Automatic Database Diagnostic Monitor (ADDM) for exactly these sorts of diagnostics.
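
    If you do gather system statistics, a minimal sketch of the DBMS_STATS call could look like this (the 'INTERVAL' gathering mode and the 30 minute window are example choices; pick a period with representative load):

    -- gather workload system statistics over a 30 minute window (example values)
    exec DBMS_STATS.GATHER_SYSTEM_STATS(gathering_mode => 'INTERVAL', interval => 30);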

    BTW, your query isn't really diagnosing anything. You're doing a huge Cartesian merge join, so it certainly won't be instant.


    Clear execution plan:

    -- FIND THE CURSOR'S ADDRESS AND HASH VALUE BY MATCHING THE STATEMENT TEXT
    select ADDRESS, HASH_VALUE from V$SQLAREA where SQL_TEXT like '<part of your query>%';
    
    ADDRESS  HASH_VALUE
    ---------------- ----------
    000000085FD77CF0 808321886
    
    -- PURGE THE PLAN FROM SHARED POOL
    exec DBMS_SHARED_POOL.PURGE ('000000085FD77CF0, 808321886', 'C');
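
    For the ADDM suggestion above, the packaged report script is probably the simplest starting point (run it from SQL*Plus and it prompts for an AWR snapshot range; note ADDM requires the Diagnostics Pack licence):

    -- generate an ADDM report interactively
    @?/rdbms/admin/addmrpt.sql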
    


Answers

  • Jonathan Lewis Member Posts: 9,756 Gold Crown

    Oracle has gone through a couple of variations on trying to connect Cost and Time. In your version of Oracle the default behaviour would be (crudely put) - the optimizer thinks a single process on your hardware could do 130,000 single block (physical) read requests in 6 seconds.

    Fiddling the figures a little (since the reported 6 is rounded) that's 20,000 physical reads per second, which is usually a lot more than is really possible. As PaulZip pointed out, this unrealistic conversion is often a consequence of a bad set of "system stats". The general advice from Oracle (see the Optimizer Blog https://blogs.oracle.com/optimizer/post/should-you-gather-system-statistics ) is to not gather them but (delete them if necessary and) use the defaults.
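
    Twenty thousand single-block reads per second works out at roughly 0.05 ms per read, which is optimistic for most storage. A quick way to see what the optimizer is currently assuming is to look at the stored system statistics (a sketch; the values are instance-specific), and if you decide to follow the "use the defaults" advice, deleting the gathered set reverts the optimizer to its built-in NOWORKLOAD defaults:

    -- inspect the current system statistics (SREADTIM, MREADTIM, MBRC, CPUSPEED etc.)
    select pname, pval1 from sys.aux_stats$ where sname = 'SYSSTATS_MAIN';

    -- optional: remove gathered system statistics so the defaults are used
    exec DBMS_STATS.DELETE_SYSTEM_STATS;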

    As Nigel's note indicates, though, that's a pretty big change to make on a production system and one that's likely to cause a degree of nervousness. It's the sort of thing that can make a lot of plans change, and even if many improve it only takes a couple of unlucky changes to cause lots of complaints. (At the same time, it's worth noting that historically people used to set db_file_multiblock_read_count to a large value (128) and the optimizer_index_cost_adj and optimizer_index_caching parameters to values that made indexed access paths much lower in cost. Again it's best to allow these to default but, again, it's a scary change to make on a reasonably healthy production system.)

    Two related points (from a very big topic):

    • Your version of Oracle has a "table preference" for gathering table stats that helps to correct the optimizer's arithmetic about indexes. Search Richard Foote's blog for "table_cached_blocks" (a minimal sketch follows this list).
    • There is a special case for system stats if you're running on Exadata.
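
    A minimal sketch of the table preference mentioned in the first point (the owner and table name are taken from the question; '16' is just an illustrative value, not a recommendation):

    -- set the preference, then re-gather so the index clustering factors are recalculated
    exec DBMS_STATS.SET_TABLE_PREFS('COTA', 'COGS_TCORPORATIVOS_1', 'TABLE_CACHED_BLOCKS', '16');
    exec DBMS_STATS.GATHER_TABLE_STATS('COTA', 'COGS_TCORPORATIVOS_1', cascade => TRUE);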

    Regards

    Jonathan Lewis