1 2 3 Previous Next 31 Replies Latest reply: Dec 14, 2012 9:14 AM by Penky Go to original post RSS
      • 15. Re: Please help with parallel query
        Penky
        won't help :-(
        SQL> alter session set parallel_min_time_threshold = 0;
         
        Session altered
         
        SQL> explain plan for select * from siebel.s_evt_act where todo_cd is null and owner_login = 'XXX';
         
        Explained
         
        SQL> select * from table(dbms_xplan.display());
         
        PLAN_TABLE_OUTPUT
        --------------------------------------------------------------------------------
        Plan hash value: 3995919838
        -------------------------------------------------------------------------------
        | Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
        -------------------------------------------------------------------------------
        |   0 | SELECT STATEMENT  |           |    14 |  8638 |   114K (27)| 00:00:01 |
        |*  1 |  TABLE ACCESS FULL| S_EVT_ACT |    14 |  8638 |   114K (27)| 00:00:01 |
        -------------------------------------------------------------------------------
        Predicate Information (identified by operation id):
        ---------------------------------------------------
           1 - filter("TODO_CD" IS NULL AND "OWNER_LOGIN"='XXX')
        Note
        -----
           - automatic DOP: Computed Degree of Parallelism is 1
        yes, the question is why the estimates are so wrong. estimate of 1 second while the time it takes in serial is 70 seconds...
        • 16. Re: Please help with parallel query
          905562
          Blast! :(

          Edit: Does Oracle know there are 32 CPUs in there? I've seen that overridden before.

          Edited by: mrk on Dec 14, 2012 8:42 AM
          • 17. Re: Please help with parallel query
            Jonathan Lewis
            Give that you're running a Siebel system, it's possible that your optimizer mode has been set to something like first_rows_10, which makes the optimizer produce some unexpected results. For example, the optimizer arithmetic may show that a full tablescan will take 100 seconds, but determine that only the first 20 blocks of the table need to be read to produce the indicated amount of data - so it can show a cost of (say) 114K, but a time of 1 second because it's assuming that you will stop fetching after one second. In these circumstances, parallel execution becomes fairly unlikely.
            show parameter optimizer
            
            alter session set optimizer_mode = all_rows;
            Regards
            Jonathan Lewis
            • 18. Re: Please help with parallel query
              Penky
              Jonathan you are absolutely right, Siebel is using first_rows_10 -> that might make a difference. Does it mean we will never be able to benefit from parallel query in Siebel?
              However, the "test" statements I ran above were tested using SQL client with no modification of optimizer parameters. The default value in this system is still "ALL_ROWS".
              SQL> show parameter optimizer
               
              NAME                                 TYPE        VALUE
              ------------------------------------ ----------- ------------------------------
              optimizer_capture_sql_plan_baselines boolean     FALSE
              optimizer_dynamic_sampling           integer     1
              optimizer_features_enable            string      11.2.0.2
              optimizer_index_caching              integer     0
              optimizer_index_cost_adj             integer     1
              _optimizer_max_permutations          integer     100
              optimizer_mode                       string      ALL_ROWS
              optimizer_secure_view_merging        boolean     TRUE
              optimizer_use_invisible_indexes      boolean     FALSE
              optimizer_use_pending_statistics     boolean     FALSE
              optimizer_use_sql_plan_baselines     boolean     TRUE
              @mrk:

              yes CPUs are recognized:
              SQL> show parameter CPU
               
              NAME                                 TYPE        VALUE
              ------------------------------------ ----------- ------------------------------
              cpu_count                            integer     32
              parallel_threads_per_cpu             integer     2
              resource_manager_cpu_allocation      integer     32
              • 19. Re: Please help with parallel query
                905562
                Wait, I misread your query - can you check what happens if you issue the same large table scan, but without a predicate where it estimates 14 rows out with the session min time modified?

                Perhaps the original one where it expected 7928302 rows out.
                • 20. Re: Please help with parallel query
                  Penky
                  here we go !
                  SQL> explain plan for select * from siebel.s_evt_act;
                   
                  Explained
                   
                  SQL> select * from table(dbms_xplan.display());
                   
                  PLAN_TABLE_OUTPUT
                  --------------------------------------------------------------------------------
                  Plan hash value: 3995919838
                  -------------------------------------------------------------------------------
                  | Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
                  -------------------------------------------------------------------------------
                  |   0 | SELECT STATEMENT  |           |  5305K|  3121M|   125K (34)| 00:00:01 |
                  |   1 |  TABLE ACCESS FULL| S_EVT_ACT |  5305K|  3121M|   125K (34)| 00:00:01 |
                  -------------------------------------------------------------------------------
                  Note
                  -----
                     - automatic DOP: Computed Degree of Parallelism is 1
                   
                  12 rows selected
                   
                  SQL> alter session set parallel_min_time_threshold=0;
                   
                  Session altered
                   
                  SQL> explain plan for select * from siebel.s_evt_act;
                   
                  Explained
                   
                  SQL> select * from table(dbms_xplan.display());
                   
                  PLAN_TABLE_OUTPUT
                  --------------------------------------------------------------------------------
                  Plan hash value: 3995919838
                  -------------------------------------------------------------------------------
                  | Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
                  -------------------------------------------------------------------------------
                  |   0 | SELECT STATEMENT  |           |  5305K|  3121M|   125K (34)| 00:00:01 |
                  |   1 |  TABLE ACCESS FULL| S_EVT_ACT |  5305K|  3121M|   125K (34)| 00:00:01 |
                  -------------------------------------------------------------------------------
                  Note
                  -----
                     - automatic DOP: Computed Degree of Parallelism is 1
                   
                  12 rows selected
                  • 21. Re: Please help with parallel query
                    905562
                    I'm meant this one explain plan for select * from siebel.s_accnt_postn;

                    The message that gave last time was "- automatic DOP: Computed Degree of Parallelism is 1 because of parallel threshold", the other one is a different message " - automatic DOP: Computed Degree of Parallelism is 1".

                    Which suggests it is evaluating them differently, however, setting that aside for a moment I still believe it is likely that the source of this is the bad estimated times, even if you set a threshold of zero it's got a good chance of ignoring you anyway because it doesnt see the benefit in PX.

                    Basically I think automatic DOP and the default degrees are 'working as intended' but the skew in the estimates is causing it to throw undesirable results out.

                    Edit: Have you tried a 10053 trace on one of these "big" queries which it pushes DOP 1 into, just to see if there's any hints in there as to where its estimates are coming from/being driven from?

                    Edited by: mrk on Dec 14, 2012 10:12 AM
                    • 22. Re: Please help with parallel query
                      Penky
                      OK, here's the output for the s_accnt_postn table:

                      SQL> explain plan for select * from siebel.s_accnt_postn;
                      
                      Explained.
                      
                      SQL> select * from table(dbms_xplan.display());
                      
                      PLAN_TABLE_OUTPUT
                      ------------------------------------------------------------------------------------------
                      Plan hash value: 4072445022
                      
                      -----------------------------------------------------------------------------------
                      | Id  | Operation         | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
                      -----------------------------------------------------------------------------------
                      |   0 | SELECT STATEMENT  |               |  7915K|  1343M| 34792  (32)| 00:00:01 |
                      |   1 |  TABLE ACCESS FULL| S_ACCNT_POSTN |  7915K|  1343M| 34792  (32)| 00:00:01 |
                      -----------------------------------------------------------------------------------
                      
                      Note
                      -----
                      
                      PLAN_TABLE_OUTPUT
                      ------------------------------------------------------------------------------------------
                         - automatic DOP: Computed Degree of Parallelism is 1 because of parallel threshold
                      
                      12 rows selected.
                      
                      SQL> alter session set parallel_min_time_threshold=0;
                      
                      Session altered.
                      
                      SQL> explain plan for select * from siebel.s_accnt_postn;
                      
                      Explained.
                      
                      SQL> select * from table(dbms_xplan.display());
                      
                      PLAN_TABLE_OUTPUT
                      ------------------------------------------------------------------------------------------
                      Plan hash value: 4072445022
                      
                      -----------------------------------------------------------------------------------
                      | Id  | Operation         | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
                      -----------------------------------------------------------------------------------
                      |   0 | SELECT STATEMENT  |               |  7915K|  1343M| 34792  (32)| 00:00:01 |
                      |   1 |  TABLE ACCESS FULL| S_ACCNT_POSTN |  7915K|  1343M| 34792  (32)| 00:00:01 |
                      -----------------------------------------------------------------------------------
                      
                      Note
                      -----
                      
                      PLAN_TABLE_OUTPUT
                      ------------------------------------------------------------------------------------------
                         - automatic DOP: Computed Degree of Parallelism is 1
                      
                      12 rows selected.
                      will try the 10053 trace as suggested and let you know!
                      • 23. Re: Please help with parallel query
                        905562
                        Good stuff.

                        That at least shows it's behaving as expected, keep in mind once it has decided that it is over the min_threshold, it then works out the DOP: If your time for a serial plan is 00:00:01, it's going to be a stretch to improve that with PX.
                        • 24. Re: Please help with parallel query
                          Randolf Geist
                          Penky wrote:
                          The 989 MB/s can only be achieved with several concurrent processes accessing the file system at once. One single process will never reach that speed. Therefore, the query in serial takes 70 seconds, and in parallel it takes around 4 or 5 seconds. The asumption of Optimizer that a serial access will take 1 second is wrong.
                          I believe your problem of Auto DOP not using parallelism is the result of running the (from 11.2.0.2 on required) I/O calibration - the results of your calibration probably lead to such TIME calculations. Check my post about the underlying calculation:

                          http://oracle-randolf.blogspot.de/2011/07/cost-is-time-next-generation.html

                          Assuming you have a 8KB block size for example the cost of one of the tablescans above (114K) will be multiplied by the 8KB block size to arrive at the "calculated data volume" (which is unreasonable in my opinion but read the post for more info), so we have something like 912 MB. If you end up with a MAX_PMBPS value of 989MB/sec, this results in a time estimate of less than 1 second.

                          You would need to check the 10053 trace file (see the post for what to look for) to confirm your numbers and then manipulate the MAX_PMBPS to your liking (see MOS Note "Automatic Degree of Parallelism in 11.2.0.2 [ID 1269321.1]") - it is basically the knob you can use to get the desired Auto Parallel Degree. You do this via direct manipulation of the SYS.RESOURCE_IO_CALIBRATE$ table as outlined the MOS Note.

                          Unfortunately you need to restart the instance to have the changes to the MAX_PMBPS effective, which makes playing around with it a bit of a pain.

                          Randolf
                          • 25. Re: Please help with parallel query
                            Penky
                            Randolf,

                            Seems like you hit the nail on the head!
                            Thank you all for your time!
                            I think IO calibration is exactly the issue.
                            I ran the 10053 trace like mrk previously suggested and here's a little snippet:
                            Access path analysis for S_ACCNT_POSTN
                            ***************************************
                            SINGLE TABLE ACCESS PATH 
                              Single Table Cardinality Estimation for S_ACCNT_POSTN[S_ACCNT_POSTN] 
                              Table: S_ACCNT_POSTN  Alias: S_ACCNT_POSTN
                                Card: Original: 7915967.000000  Rounded: 7915967  Computed: 7915967.00  Non Adjusted: 7915967.00
                              Access Path: TableScan
                                Cost:  34792.34  Resp: 34792.34  Degree: 0
                                  Cost_io: 23738.00  Cost_cpu: 8822416533
                                  Resp_io: 23738.00  Resp_cpu: 8822416533
                            kkeCostToTime: using io calibrate stats 
                             maxmbps=989(MB/s) maxpmbps=1004(MB/s) 
                             block_size=8192 mb_io_count=8 mb_io_size=63111 (bytes) 
                             tot_io_size=2094(MB) time=2086(ms)
                            AutoDOP: Consider caching for S_ACCNT_POSTN[S_ACCNT_POSTN](obj#285037) 
                            cost:34792.34 blkSize:8192 objSize:182863.00 marObjSize:173719.85 bufSize:2020096.00 affPercent:80 smallTab:NO affinitized:YES
                            AutoDOP: Table/Index(#285037) access (scan) cost=34792.34 estTime=2085.70 unit=10000.00 dop=2 -> maxdop=2
                              Best:: AccessPath: TableScan
                                     Cost: 34792.34  Degree: 1  Resp: 34792.34  Card: 7915967.00  Bytes: 0
                            
                            ***************************************
                            I guess now the problem is clear. Seems like the SSD Raid can achieve nearly 1GB/s in read, but that is not possible with one process accessing the files. There need to be a bunch of parallel accesses for this to happen.

                            I might try adjusting the values for maxpmbps manually. Although it's not very elegant. Where's the sense of running the IO calibration at all then .... ??

                            Edited by: Penky on Dec 14, 2012 12:32 PM
                            • 26. Re: Please help with parallel query
                              Penky
                              continued:
                              AUTO DOP PLANS EVALUATION
                              ***************************************
                              Compilation completed with Dop: 1.
                                Cost_io:  23738.00  Cost_cpu: 8822416532.72
                                Card:     7915967.00  Bytes:    1409042126.00
                                Cost:     34792.34  Est_time:  2086ms
                              kkopqSetDopReason: Reason why we chose this DOP is: parallel threshold.
                              Serial phase is inexpensive (34792), stick to it.
                              Costing completed. DOP chosen: 1.
                              AutoDop: kkopqSetMaxDopInCursorEnv:In the Cursor Session Env, max DOP is set to 64
                              *****************************
                              Number of Compilations tried: 1
                              *****************************
                              .....

                              ----- Explain Plan Dump -----
                              ----- Plan Table -----
                               
                              ============
                              Plan Table
                              ============
                              ------------------------------------------+-----------------------------------+
                              | Id  | Operation          | Name         | Rows  | Bytes | Cost  | Time      |
                              ------------------------------------------+-----------------------------------+
                              | 0   | SELECT STATEMENT   |              |       |       |   34K |           |
                              | 1   |  TABLE ACCESS FULL | S_ACCNT_POSTN| 7730K | 1344M |   34K |  00:01:10 |
                              ------------------------------------------+-----------------------------------+
                              Predicate Information:
                              ----------------------
                               
                              Content of other_xml column
                              ===========================
                                dop_reason     : parallel threshold
                                dop            : 1
                                px_in_memory   : yes
                                db_version     : 11.2.0.2
                                parse_schema   : SYS
                                plan_hash      : 4072445022
                                plan_hash_2    : 2917888276
                              • 27. Re: Please help with parallel query
                                Randolf Geist
                                Penky wrote:
                                Randolf,

                                Seems like you hit the nail on the head!
                                Thank you all for your time!
                                I think IO calibration is exactly the issue.
                                I ran the 10053 trace like mrk previously suggested and here's a little snippet:
                                I see that the original calculation in my blog post is not entirely correct - you got a MB_IO_COUNT of 8, and I'm not sure where that comes from. What is your DB_FILE_MULTIBLOCK_READ_COUNT parameter set to, and what are the current contents of SYS.RESOURCE_IO_CALIBRATE$?

                                This results in a multiplier of 63KB instead of 8KB for the "calculated data volume", which is in your case 63.111 KB * 34792.34 = 2,195,779 KB / 1024 / 1024 = approx. 2094 MB.

                                Note this line from the trace file:
                                maxmbps=989(MB/s) maxpmbps=1004(MB/s)
                                So it looks like your single process MB/sec were measured as 1004MB/s, whereas the overall throughput was measured at 989 MB/s, I believe. If that's true it already shows that the result of the calibration is questionable.

                                This results in a time estimate of 2094 MB / 1004 MB / s = approx. 2 secs. Two seconds is below your PARALLEL_MIN_TIME_THRESHOLD, so it goes serial.
                                I might try adjusting the values for maxpmbps manually. Although it's not very elegant. Where's the sense of running the IO calibration at all then .... ??
                                The mentioned MOS Note has a link to bug where Oracle already confirms that the results from the calibration can be questionable. You'll have to find a good value yourself, unfortunately.

                                Oracle wants Exadata customers to set the MAX_PMBPS to 200 (MB/s), by the way, to give you an idea of a reasonable value for a powerful I/O subsystem (that's also mentioned in the MOS note).

                                Randolf

                                Edited by: Randolf Geist on Dec 14, 2012 12:55 PM
                                • 28. Re: Please help with parallel query
                                  Penky
                                  Randolf,

                                  db_file_multiblock_read_count is not set at all AFAIK, so it results in defaulting to 128 in 11g. I read somewhere that it's not recommended to set this manually from 10 or 11 onwards.
                                  SQL> show parameter db_file_multiblock
                                   
                                  NAME                                 TYPE        VALUE
                                  ------------------------------------ ----------- ------------------------------
                                  db_file_multiblock_read_count        integer     128
                                   
                                  Here's the content of resource_io_calibrate$

                                  SQL> select * from SYS.RESOURCE_IO_CALIBRATE$;
                                   
                                  START_TIME                                                                       END_TIME                                                                           MAX_IOPS   MAX_MBPS  MAX_PMBPS    LATENCY  NUM_DISKS
                                  -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- ---------- ---------- ---------- ---------- ----------
                                  17.11.12 12:10:59,048000                                                         17.11.12 12:27:32,984000                                                              21569        989       1004         21          1
                                   
                                  • 29. Re: Please help with parallel query
                                    Randolf Geist
                                    Penky wrote:
                                    Randolf,

                                    db_file_multiblock_read_count is not set at all AFAIK, so it results in defaulting to 128 in 11g. I read somewhere that it's not recommended to set this manually from 10 or 11 onwards.
                                    Thanks for the values. That's recommended, correct, but still many sites set the value to something non-default. I'm still not sure where that MB_IO_COUNT =8 comes from, though.

                                    By the way, if you don't want to fiddle with the Auto DOP, you could just stick to old manual DOP. If you set your PARALLEL_DEGREE_POLICY to MANUAL but have the objects marked as PARALLEL you should get a PARALLEL query, provided there is no cheaper serial plan available to the optimizer.

                                    The default DOP is very likely to high (64 per node with your given configuration), so you might want to set the PARALLEL degree to something lower.

                                    You could also play around with ALTER SESSION FORCE PARALLEL QUERY PARALLEL x if you want/can limit this to specific sessions, then you don't even need to mark the objects as PARALLEL, as it might have side effects to other processes you don't want to run in parallel.

                                    Randolf