This discussion is archived
1 2 3 4 Previous Next 45 Replies Latest reply: Nov 24, 2009 8:25 AM by 724046 RSS

Help with Parallelization on our SPARC server...we are drowning!

724046 Newbie
Currently Being Moderated
Hi Guys,

Anyone out there have a DB (We are running a large data mart) on UltraSPARC T2 server. We have
uname -a
SunOS whms3204 5.10 Generic_141414-02 sun4v sparc SUNW,Sun-Blade-T6320

prtconf
System Configuration: Sun Microsystems sun4v
Memory size: 16256 Megabytes
Our Parallel settings are as follows:
SQL> show parameter parallel
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
_log_parallelism                     integer     8
_log_parallelism_max                 integer     12
fast_start_parallel_rollback         string      high
parallel_adaptive_multi_user         boolean     TRUE
parallel_automatic_tuning            boolean     FALSE
parallel_execution_message_size      integer     32768
parallel_instance_group              string      
parallel_max_servers                 integer     64
parallel_min_percent                 integer     0
parallel_min_servers                 integer     16
parallel_server                      boolean     FALSE
parallel_server_instances            integer     1
parallel_threads_per_cpu             integer     2
recovery_parallelism                 integer     0

SQL> show parameter cpu
 
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
cpu_count                            integer     32

SQL> show parameter pool
 
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
buffer_pool_keep                     string      
buffer_pool_recycle                  string      
global_context_pool_size             string      
java_pool_size                       big integer 0
large_pool_size                      big integer 0
olap_page_pool_size                  big integer 0
shared_pool_reserved_size            big integer 25M
shared_pool_size                     big integer 0
streams_pool_size                    big integer 0
Our developers are concerned that we are not taking full advantage of parallelization as the jobs they run take forever to run. All tables and indexes have a degree of 1. The developer at times uses the ALTER SESSION FORCE PARALLEL QUERY which I am not too fond of. Our top waits are generally as follows:
Top 5 Timed Events                                         Avg %Total
~~~~~~~~~~~~~~~~~~                                        wait   Call
Event                                 Waits    Time (s)   (ms)   Time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
PX Deq Credit: send blkd            959,256   1,876,663   1956   92.4      Other
CPU time                                        126,325           6.2           
db file scattered read               29,068         505     17    0.0   User I/O
db file sequential read             145,547         215      1    0.0   User I/O
cursor: pin S wait on X              10,084         196     19    0.0 Concurrenc
          -------------------------------------------------------------       

Wait Events                                DB/Inst: VADP/VADP  Snaps: 265-282
-> s  - second      
-> cs - centisecond -     100th of a second
-> ms - millisecond -    1000th of a second
-> us - microsecond - 1000000th of a second
-> ordered by wait time desc, waits desc (idle events last)
 
                                                                   Avg          
                                             %Time  Total Wait    wait     Waits
Event                                 Waits  -outs    Time (s)    (ms)      /txn
---------------------------- -------------- ------ ----------- ------- ---------
PX Deq Credit: send blkd            959,256  100.7   1,876,663    1956   1,668.3
db file scattered read               29,068     .0         505      17      50.6
db file sequential read             145,547     .0         215       1     253.1
cursor: pin S wait on X              10,084  100.0         196      19      17.5
enq: BF - allocation content            292   62.0         178     611       0.5
control file parallel write          20,400     .0          39       2      35.5
log file parallel write               4,027     .0          33       8       7.0
log buffer space                         71   19.7          16     221       0.1
db file parallel write                5,432     .0          10       2       9.4
os thread startup                        39     .0           5     122       0.1
control file sequential read         35,222     .0           3       0      61.3
The developer is convinced that we can do better this and really make use of the servers horsepower but we as the DBA's think that the processing they are doing is what is causing the bottlenecks.

Anyone...please give me some suggestions.
  • 1. Re: Help with Parallelization on our SPARC server...we are drowning!
    724046 Newbie
    Currently Being Moderated
    perhaps I should post my whole parameter listing :)
    SQL> show parameter
     
    NAME                                 TYPE        VALUE
    ------------------------------------ ----------- ------------------------------
    O7_DICTIONARY_ACCESSIBILITY          boolean     FALSE
    _lgwr_io_slaves                      integer     0
    _log_parallelism                     integer     8
    _log_parallelism_max                 integer     12
    _optimizer_cost_based_transformation string      off
    _optimizer_cost_model                string      CPU
    _realfree_heap_pagesize_hint         big integer 4M
    active_instance_count                integer     
    aq_tm_processes                      integer     0
    archive_lag_target                   integer     0
    asm_diskgroups                       string      
    asm_diskstring                       string      
    asm_power_limit                      integer     1
    audit_file_dest                      string      /JH0001/app/oracle/admin/VADI/adump
    audit_sys_operations                 boolean     FALSE
    audit_syslog_level                   string      
    audit_trail                          string      NONE
    background_core_dump                 string      partial
    background_dump_dest                 string      /JH0001/app/oracle/admin/VADI/bdump
    backup_tape_io_slaves                boolean     FALSE
    bitmap_merge_area_size               integer     0
    blank_trimming                       boolean     FALSE
    buffer_pool_keep                     string      
    buffer_pool_recycle                  string      
    circuits                             integer     
    cluster_database                     boolean     FALSE
    cluster_database_instances           integer     1
    cluster_interconnects                string      
    commit_point_strength                integer     1
    commit_write                         string      
    compatible                           string      10.2.0.1.0
    control_file_record_keep_time        integer     7
    control_files                        string      /JH2056/db/VADI/controlfile/control01.ctl, /JH2056/db/VADI/oradata/control02.ctl, /JH2056/db/VADI/redos/control03.ctl
    core_dump_dest                       string      /JH0001/app/oracle/admin/VADI/cdump
    cpu_count                            integer     32
    create_bitmap_area_size              integer     0
    create_stored_outlines               string      
    cursor_sharing                       string      EXACT
    cursor_space_for_time                boolean     FALSE
    db_16k_cache_size                    big integer 0
    db_2k_cache_size                     big integer 0
    db_32k_cache_size                    big integer 0
    db_4k_cache_size                     big integer 0
    db_8k_cache_size                     big integer 0
    db_block_buffers                     integer     0
    db_block_checking                    string      FALSE
    db_block_checksum                    string      false
    db_block_size                        integer     32768
    db_cache_advice                      string      ON
    db_cache_size                        big integer 0
    db_create_file_dest                  string      
    db_create_online_log_dest_1          string      
    db_create_online_log_dest_2          string      
    db_create_online_log_dest_3          string      
    db_create_online_log_dest_4          string      
    db_create_online_log_dest_5          string      
    db_domain                            string      
    db_file_multiblock_read_count        integer     32
    db_file_name_convert                 string      
    db_files                             integer     200
    db_flashback_retention_target        integer     1440
    db_keep_cache_size                   big integer 0
    db_name                              string      VADI
    db_recovery_file_dest                string      /JH2056/db/VADI/flash_recovery_area
    db_recovery_file_dest_size           big integer 2G
    db_recycle_cache_size                big integer 0
    db_unique_name                       string      VADI
    db_writer_processes                  integer     2
    dbwr_io_slaves                       integer     0
    ddl_wait_for_locks                   boolean     FALSE
    dg_broker_config_file1               string      /JH0001/app/oracle/product/10.2.0.4/dbs/dr1VADI.dat
    dg_broker_config_file2               string      /JH0001/app/oracle/product/10.2.0.4/dbs/dr2VADI.dat
    dg_broker_start                      boolean     FALSE
    disk_asynch_io                       boolean     FALSE
    dispatchers                          string      (PROTOCOL=TCP) (SERVICE=VADIXDB)
    distributed_lock_timeout             integer     60
    dml_locks                            integer     4860
    drs_start                            boolean     FALSE
    event                                string      
    fal_client                           string      
    fal_server                           string      
    fast_start_io_target                 integer     0
    fast_start_mttr_target               integer     300
    fast_start_parallel_rollback         string      high
    file_mapping                         boolean     FALSE
    fileio_network_adapters              string      
    filesystemio_options                 string      SETALL
    fixed_date                           string      
    gc_files_to_locks                    string      
    gcs_server_processes                 integer     0
    global_context_pool_size             string      
    global_names                         boolean     FALSE
    hash_area_size                       integer     0
    hi_shared_memory_address             integer     0
    hs_autoregister                      boolean     TRUE
    ifile                                file        
    instance_groups                      string      
    instance_name                        string      VADI
    instance_number                      integer     0
    instance_type                        string      RDBMS
    java_max_sessionspace_size           integer     0
    java_pool_size                       big integer 0
    java_soft_sessionspace_limit         integer     0
    job_queue_processes                  integer     2
    large_pool_size                      big integer 0
    ldap_directory_access                string      NONE
    license_max_sessions                 integer     0
    license_max_users                    integer     0
    license_sessions_warning             integer     0
    local_listener                       string      
    lock_name_space                      string      
    lock_sga                             boolean     FALSE
    log_archive_config                   string      
    log_archive_dest                     string      /JH0001/app/oracle/admin/VADI/arch
    log_archive_dest_1                   string      
    log_archive_dest_10                  string      
    log_archive_dest_2                   string      
    log_archive_dest_3                   string      
    log_archive_dest_4                   string      
    log_archive_dest_5                   string      
    log_archive_dest_6                   string      
    log_archive_dest_7                   string      
    log_archive_dest_8                   string      
    log_archive_dest_9                   string      
    log_archive_dest_state_1             string      enable
    log_archive_dest_state_10            string      enable
    log_archive_dest_state_2             string      enable
    log_archive_dest_state_3             string      enable
    log_archive_dest_state_4             string      enable
    log_archive_dest_state_5             string      enable
    log_archive_dest_state_6             string      enable
    log_archive_dest_state_7             string      enable
    log_archive_dest_state_8             string      enable
    log_archive_dest_state_9             string      enable
    log_archive_duplex_dest              string      
    log_archive_format                   string      VADI_%t_%s_%r.arc
    log_archive_local_first              boolean     TRUE
    log_archive_max_processes            integer     2
    log_archive_min_succeed_dest         integer     1
    log_archive_start                    boolean     FALSE
    log_archive_trace                    integer     0
    log_buffer                           integer     14278656
    log_checkpoint_interval              integer     0
    log_checkpoint_timeout               integer     1800
    log_checkpoints_to_alert             boolean     FALSE
    log_file_name_convert                string      
    logmnr_max_persistent_sessions       integer     1
    max_commit_propagation_delay         integer     0
    max_dispatchers                      integer     
    max_dump_file_size                   string      UNLIMITED
    max_enabled_roles                    integer     150
    max_shared_servers                   integer     
    nls_calendar                         string      
    nls_comp                             string      
    nls_currency                         string      
    nls_date_format                      string      
    nls_date_language                    string      
    nls_dual_currency                    string      
    nls_iso_currency                     string      
    nls_language                         string      AMERICAN
    nls_length_semantics                 string      BYTE
    nls_nchar_conv_excp                  string      FALSE
    nls_numeric_characters               string      
    nls_sort                             string      
    nls_territory                        string      AMERICA
    nls_time_format                      string      
    nls_time_tz_format                   string      
    nls_timestamp_format                 string      
    nls_timestamp_tz_format              string      
    object_cache_max_size_percent        integer     10
    object_cache_optimal_size            integer     102400
    olap_page_pool_size                  big integer 0
    open_cursors                         integer     1000
    open_links                           integer     4
    open_links_per_instance              integer     4
    optimizer_dynamic_sampling           integer     2
    optimizer_features_enable            string      10.2.0.4
    optimizer_index_caching              integer     0
    optimizer_index_cost_adj             integer     10
    optimizer_mode                       string      ALL_ROWS
    optimizer_secure_view_merging        boolean     TRUE
    os_authent_prefix                    string      ops$
    os_roles                             boolean     FALSE
    parallel_adaptive_multi_user         boolean     TRUE
    parallel_automatic_tuning            boolean     FALSE
    parallel_execution_message_size      integer     32768
    parallel_instance_group              string      
    parallel_max_servers                 integer     64
    parallel_min_percent                 integer     0
    parallel_min_servers                 integer     16
    parallel_server                      boolean     FALSE
    parallel_server_instances            integer     1
    parallel_threads_per_cpu             integer     2
    pga_aggregate_target                 big integer 750M
    plsql_ccflags                        string      
    plsql_code_type                      string      INTERPRETED
    plsql_compiler_flags                 string      INTERPRETED, NON_DEBUG
    plsql_debug                          boolean     FALSE
    plsql_native_library_dir             string      
    plsql_native_library_subdir_count    integer     0
    plsql_optimize_level                 integer     2
    plsql_v2_compatibility               boolean     FALSE
    plsql_warnings                       string      DISABLE:ALL
    pre_11g_enable_capture               boolean     FALSE
    pre_page_sga                         boolean     FALSE
    processes                            integer     300
    query_rewrite_enabled                string      FALSE
    query_rewrite_integrity              string      enforced
    rdbms_server_dn                      string      
    read_only_open_delayed               boolean     FALSE
    recovery_parallelism                 integer     0
    recyclebin                           string      OFF
    remote_archive_enable                string      true
    remote_dependencies_mode             string      TIMESTAMP
    remote_listener                      string      
    remote_login_passwordfile            string      EXCLUSIVE
    remote_os_authent                    boolean     FALSE
    remote_os_roles                      boolean     FALSE
    replication_dependency_tracking      boolean     TRUE
    resource_limit                       boolean     FALSE
    resource_manager_plan                string      
    resumable_timeout                    integer     0
    rollback_segments                    string      
    serial_reuse                         string      disable
    service_names                        string      VADI
    session_cached_cursors               integer     20
    session_max_open_files               integer     10
    sessions                             integer     1105
    sga_max_size                         big integer 2560M
    sga_target                           big integer 2560M
    shadow_core_dump                     string      partial
    shared_memory_address                integer     0
    shared_pool_reserved_size            big integer 25M
    shared_pool_size                     big integer 0
    shared_server_sessions               integer     
    shared_servers                       integer     1
    skip_unusable_indexes                boolean     TRUE
    smtp_out_server                      string      
    sort_area_retained_size              integer     0
    sort_area_size                       integer     0
    spfile                               string      /JH0001/app/oracle/product/10.2.0.4/dbs/spfileVADI.ora
    sql92_security                       boolean     FALSE
    sql_trace                            boolean     FALSE
    sql_version                          string      NATIVE
    sqltune_category                     string      DEFAULT
    standby_archive_dest                 string      ?/dbs/arch
    standby_file_management              string      MANUAL
    star_transformation_enabled          string      true
    statistics_level                     string      TYPICAL
    streams_pool_size                    big integer 0
    tape_asynch_io                       boolean     TRUE
    thread                               integer     0
    timed_os_statistics                  integer     0
    timed_statistics                     boolean     TRUE
    trace_enabled                        boolean     TRUE
    tracefile_identifier                 string      
    transactions                         integer     1215
    transactions_per_rollback_segment    integer     5
    undo_management                      string      AUTO
    undo_retention                       integer     2400
    undo_tablespace                      string      VADI_UNDO
    use_indirect_data_buffers            boolean     FALSE
    user_dump_dest                       string      /JH0001/app/oracle/admin/VADI/udump
    utl_file_dir                         string      
    workarea_size_policy                 string      AUTO
  • 2. Re: Help with Parallelization on our SPARC server...we are drowning!
    sb92075 Guru
    Currently Being Moderated
    The developer is convinced that we can do better this and really make use of the servers horsepower
    but we as the DBA's think that the processing they are doing is what is causing the bottlenecks.
    What is happening at the OS level?

    post results from following command

    uptime

    What is the bottleneck; CPU, RAM, I/O, network?

    ALTER SESSION SET SQL_TRACE=TRUE;
    -- invoke the problem SQL here
    EXEC PROBLEM_PROC
    ALTER SESSION SET SQL_TRACE=FALSE;

    now find the trace file within ./udump folder
    tkprof <trace_file.trc> trace_results.txt explain=<username>/<password>

    post the contents of trace_results.txt back here

    Run benchmark test with parallel enabled & with parallel disabled.
    Some SQL will not benefit from parallelization.
    PARALLEL is not a SILVER BULLET & automagivally make all SQL FAST
  • 3. Re: Help with Parallelization on our SPARC server...we are drowning!
    Tubby Guru
    Currently Being Moderated
    [http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1657271217898]

    Point number 6 in Tom's Original reply is likely the one you'd want to focus on and share with your developer (but the whole thread probably won't hurt to read either). Now keep in mind i have 0 idea what your data mart is doing, when it's doing it, etc...

    I'm simply echoing what Tom has said. If you have many jobs running, all attempting to run in parallel, all trying to consume 100% of the server, they will fight each other, cause contention, and have the opposite of the intended effect.

    If you have a single job and nothing happening on the server, well heck, super parallel might be just dandy.

    It really depends on knowing your system, what's happening, and what needs to happen, and when.
  • 4. Re: Help with Parallelization on our SPARC server...we are drowning!
    724046 Newbie
    Currently Being Moderated
    The query takes hours to run though......

    whms3204:/export/home/oracle
    () VADD> uptime
    4:05pm up 7 day(s), 7:13, 4 users, load average: 12.11, 12.12, 12.24

    Edited by: Gunners on Nov 13, 2009 1:05 PM

    I believe the bottleneck is the CPU and I/O. The DB accesses the same sub-system disk for all its data. Actually this server has 3 DB's. All which access the same disk.

    Edited by: Gunners on Nov 13, 2009 1:05 PM
  • 5. Re: Help with Parallelization on our SPARC server...we are drowning!
    724046 Newbie
    Currently Being Moderated
    Also the developers are all running there stuff in FORCE PARALLEL mode now as well. If I started running the queries I would slow them down as well...is there another means I can get you the info you seek? AWR, ADDM, ASH??
  • 6. Re: Help with Parallelization on our SPARC server...we are drowning!
    Tubby Guru
    Currently Being Moderated
    Gunners wrote:
    The query takes hours to run though......
    Take a horribly written query and run it in parallel. That does nothing to change the underlying problem that the query is horribly written (actually, it may change it in that it actually exacerbates the problem).

    I'm not saying that's your problem, but you've really given us nothing to help with. If you have a particular query that's performing poorly, post a tuning request for that query (there is loads of information required, which you can [FIND HERE|http://forums.oracle.com/forums/thread.jspa?messageID=3749835&]

    The query might have been written poorly, you may be lacking a proper indexing scheme, the list goes on.

    As for the original post dealing with parallel execution, as i outlined before ... it's not just something you can turn on and close your eyes hoping everything works out, it needs to be designed with a good understanding of the system and the server load.
  • 7. Re: Help with Parallelization on our SPARC server...we are drowning!
    sb92075 Guru
    Currently Being Moderated
    4:05pm up 7 day(s), 7:13, 4 users, load average: 12.11, 12.12, 12.24
    If system has fewer than 8 processors, system is CPU bound.

    PARALLEL makes more work for CPUs.
    On a system that can't handle the load, PARALLEL exacerbates the problem.
    PARALLEL adds some (CPU) overhead.

    You can't push a string.
    CPU cycles are a finite resource & you can't consume more than 100%.
    When you try to do so the result is run queue depth start climbing to infinity & beyond.

    I suggest that PARALLEL use be reduced or eliminated.
  • 8. Re: Help with Parallelization on our SPARC server...we are drowning!
    Tubby Guru
    Currently Being Moderated
    Gunners wrote:
    I believe the bottleneck is the CPU and I/O. The DB accesses the same sub-system disk for all its data. Actually this server has 3 DB's. All which access the same disk.

    Edited by: Gunners on Nov 13, 2009 1:05 PM
    This is more concerning, how many databases do you have running on this single machine (your statement that you have 3 DBA's leads me to believe more than one)? *edit, read this wrong, read DB as DBA :P) .... So you answered my questions before i asked them....

    If you have a shared server like this, you almost certainly do not want to make use of parallel operations (there's likely no way for you to know what the other applications are doing on the server and when they are doing it).

    Edited by: Tubby on Nov 13, 2009 7:05 PM

    Read fast = read bad
  • 9. Re: Help with Parallelization on our SPARC server...we are drowning!
    724046 Newbie
    Currently Being Moderated
    Yes we have 3 DB's in total. 2 of them are sized the same with the same init settings and 1 is sized more with the same init setting.

    #1 = 2.5gb
    #2 = 2.5gb
    #3 = 4.0 GB

    Our Server, the SunSPARC T6320 has a 1.2 GHz 1 CPU, 8 core with 16GB of ram.

    As for the Query, they have been tuned and are pretty much not changeable as they are vendor queries.

    I am wondering if CPU_COUNT should be set to 64 and increase the PARALLEL_MAX_SERVERS paramemter, but unsure as to what value.
  • 10. Re: Help with Parallelization on our SPARC server...we are drowning!
    sb92075 Guru
    Currently Being Moderated
    I am wondering if CPU_COUNT should be set to 64
    When all else fails, Read The Fine Manual

    http://download.oracle.com/docs/cd/B19306_01/server.102/b14237/initparams032.htm#CHDBDHHB



    Do not change the value of CPU_COUNT.
  • 11. Re: Help with Parallelization on our SPARC server...we are drowning!
    Hemant K Chitale Oracle ACE
    Currently Being Moderated
    I believe the bottleneck is the CPU and I/O.
    The "uptime" command will not tell you if CPU and/or I/O is a/the bottleneck for that query.
    Monitor CPU and I/O with OS utilities (sar, vmstat, top etc) when the query is running.
    Check V$PX_PROCESS, V$PX_SESSION to see if Parallel Query is being used when the query is running.


    Hemant K Chitale
    http://hemantoracledba.blogspot.com
  • 12. Re: Help with Parallelization on our SPARC server...we are drowning!
    614235 Explorer
    Currently Being Moderated
    Be very, very wary of the Sun T1 or T2 (Niagara) based systems. They are terrible at database performance. Be aware that at 1.2 GHz with 8 cores and 4 threads each core, this is effectively 32 * 300 MHz CPUs. In which case my strongest advice is buy another system and throw this one out. The Niagara processors are well known as being very bad for running real world Oracle database on - in spite of everything Sun may say about them.

    For instance read this Pythian blog entry:
    http://www.pythian.com/blogs/1499/sunfire-t2000-servers-are-best-suited-for-oracle-middletier

    Parallelization is not the way to go for a number of reasons. Not least that there are only 8 real CPU execution cores. So the more parallel threads you create per core, the more they steal real CPU cycles from each other as they share and contend for the same shared CPU core. As you increase the number of threads per CPU core so the average throughput per thread drops, because of the increase in sharing of only one physical resource.

    In simple terms the T2 processor is not suitable for running database server software such as Oracle. In spite of everything Sun will tell you and their published test results, the reality is that the T2 processor is very slow to begin with (1.2 GHz), and gets slower in real terms due to the way it switches between multiple threads (processes) so that each Oracle process will get the equivalent of something like a 200 MHz CPU. We left 200 MHz CPUs behind over 10 years ago.

    Someone else who also had poor performance with Sun T2 based servers got the same reply about the poor scalability and performance:
    Migration from UltraSPARC I,-V to UltraSPARC T1 and T2 results in 2/3 perfo

    You could do a search on the Database forums for "Sun T2" over the past 3 years and will see other people who have seen the same "poor performance" on their database.

    John
  • 13. Re: Help with Parallelization on our SPARC server...we are drowning!
    724046 Newbie
    Currently Being Moderated
    John,

    Thanks for the read. Very interesting article. We always have been concerned over the server, but wanted to do our due dilligence from the DB side to make sure it was not the DB being the issue rather the server just can't handle the types processsing it is doing. Essentially what we are doing is this.

    1. Load data from Informatica into Schema A
    2. Run validation scripts and clean the data up.
    3. After step 2 completes, which is supposed to be running in parallel and takes well over 24 hours, we push the data into Schema B.
    4. Schema B then does some more processing, blah blah blah, and then pushes the data into Schema C which is the final resting spot for which reporting can take place of.

    So right away you can see we are doing heavy reads and heavy inserts and updates. I personally think the Vendor has not tuned the DB objects well enough. Schema A barely has any indexes and Schema B is not any better. We are constantly seeing too many FTS's. The size of the each schema is rather large. Schema A & B is roughly 8 GB's and C is about 15 GB's. I know the size of them may seem small but since they are reading in all the data it ends up being a lot.

    I have opened a SR with support and they are telling me to increase the value of PARALLEL_MAX_SERVERS. To what, well this formula is what they gave me and to be honest it does not make any sense. Tell me if I have interpreted this wrong:

    +>
    Oracle Support wrote:+
    +>From the output uploaded from v$px_process you can see that there are 64 slaves in use and there are 1152 queries that are running in parallel. And there are 338 downgraded to 50% and 1996 queries are downgraded to serial as there are not enough resources available to spawn the parallel slaves.The parallel_max_servers is set to 64 so it only can spawn 64 slaves and rest of them will be downgraded to serial.+

    +>You need to increase the parallel max_servers to higher than 64 so that other queries can take advantage of parallel processing.Make sure that you have enough memory available to spawn these parallel slaves.+

    My response was:
    What do you recommend?

    +>
    Oracle Support wrote:+
    +>The parallel_max_servers is derived based on cpus and other parameters into account. Please review the documentation below for more details on how to set it up.+

    +>Oracle® Database Data Warehousing Guide+
    +>10g Release 2 (10.2)+
    +>Part Number B14223-02+
    +>Chapter 25 Using Parallel Execution+
    +>Tuning General Parameters for Parallel Execution+

    +>Oracle sets PARALLEL_MAX_SERVERS to a default value that is sufficient for most systems. The default value for PARALLEL_MAX_SERVERS is as follows:+

    +>(CPU_COUNT x PARALLEL_THREADS_PER_CPU x (2 if PGA_AGGREGATE_TARGET > 0; otherwise 1) x 5)+

    +>This might not be enough for parallel queries on tables with higher DOP attributes. We recommend users who expects to run queries of higher DOP to set PARALLEL_MAX_SERVERS as follows:+

    +>2 x DOP x NUMBER_OF_CONCURRENT_USERS+

    +>For example, setting PARALLEL_MAX_SERVERS to 64 will allow you to run four parallel queries simultaneously, assuming that each query is using two slave sets with a DOP of eight for each set.+

    +>If the hardware system is neither CPU bound nor I/O bound, then you can increase the number of concurrent parallel execution users on the system by adding more query server processes. When the system becomes CPU- or I/O-bound, however, adding more concurrent users becomes detrimental to the overall performance. Careful setting of PARALLEL_MAX_SERVERS is an effective method of restricting the number of concurrent parallel operations.+

    +>If users initiate too many concurrent operations, Oracle might not have enough query server processes. In this case, Oracle executes the operations sequentially or displays an error if PARALLEL_MIN_PERCENT is set to a value other than the default value of 0 (zero).+

    +>This condition can be verified through the GV$SYSSTAT view by comparing the statistics for parallel operations not downgraded and parallel operations downgraded to serial. For example:+

    +>SELECT * FROM GV$SYSSTAT WHERE name LIKE 'Parallel operation%';+

    So from what I gathered here, they did not exactly recommend a value but based on the formula above:
    +(CPU_COUNT x PARALLEL_THREADS_PER_CPU x (2 if PGA_AGGREGATE_TARGET > 0; otherwise 1) x 5)+
    32x2x(2x5)
    64x10 = 640

    To me this does not make sense as the value seems pretty high. Then they go on by saying to use:
    +2 x DOP x NUMBER_OF_CONCURRENT_USERS+
    Well currently, the vendor developers have set all the objects DOP to 1, essentially turning them off. So...
    2x1x# (the NUMBER_OF_CONCURRENT_USERS varies from time to time. Just to makes life simple, I will go with 4. At times there may only be 1 or 2 and at other times there may be 10, but then the load part is happening I would assume 1 or 2 users.)
    So value here would be 8 (2x1x4)

    Now these two values are way off from each other. Any insight here?
  • 14. Re: Help with Parallelization on our SPARC server...we are drowning!
    614235 Explorer
    Currently Being Moderated
    I cannot help with your Parallel Query settings, as I don't know enough about them. My real point was that the Sun T2 is so much slower in reality than any other system, and that even moving to a PC with a fast Intel or AMD CPU would probably be twice as quick as on the Sun system. You seem to have less than 200 GB of data.

    From the top 5 events you reported at the very start, CPU usage is much higher than any disk I/O wait times:
    Top 5 Timed Events                                         Avg %Total
    ~~~~~~~~~~~~~~~~~~                                        wait   Call
    Event                                 Waits    Time (s)   (ms)   Time Wait Class
    ------------------------------ ------------ ----------- ------ ------ ----------
    PX Deq Credit: send blkd            959,256   1,876,663   1956   92.4      Other
    CPU time                                        126,325           6.2           
    db file scattered read               29,068         505     17    0.0   User I/O
    db file sequential read             145,547         215      1    0.0   User I/O
    cursor: pin S wait on X              10,084         196     19    0.0 Concurrenc
    CPU time was 126,325 seconds while waiting on disk I/O was only 720 (505 + 215). And your "PX Deq Credit" was far more than either of these, at a massive 1,876,663.

    I would be tempted to switch off Parallel Query completely - set all "parallel_*" settings to zero, such as "parallel_min_servers" and "parallel_max_servers" - and see what happens. Is CPU or "db file *" then the top wait event? If it is disk, then you are I/O bound and it is not the T2 CPU that is the problem. If you are CPU bound (CPU time is top), then it is the T2 that is your problem.

    There could be lots of other gotcha's with using the T2 CPU for Oracle. For instance Oracle does busy spinning on internal locks (latches), which uses up CPU cycles checking if a latch becomes free, rather than sleeping and waiting on a queue. Because the T2 is multi-threaded with shared CPU cores, a spinning session will stick on a CPU core, starving the other 3 or 7 threads on that core from using it. So you could see serious negative scalability on the T2 if you hit latch contention in Oracle and busy spinning. I am not saying this is happening to you, but just another example of how one thread on a CPU core is effectively stealing CPU cycles from the other threads that share the same CPU core.

    Solaris is not aware enough of this sharing to report it in any meaningful fashion. If I have 4 threads running on the same CPU core, you cannot find out if they each got 1/4 or 25% of the available CPU cycles or if it was imbalanced with one thread getting 70% and the other 3 getting 10% each. Solaris simply assumes that when a thread is scheduled on a virtual CPU then it gets 100% of the available cycles. This of course is rubbish in a processor like the T2.

    Put another way, if 8 threads run on the same CPU core for 8 seconds, Solaris will report that they each got 8 seconds of execution for 64 seconds of CPU usage total. Yet we know that there is only one physical CPU core in this case, so there can only ever be 8 seconds of CPU usage across those 8 threads, not 64! Likewise Oracle will record 64 seconds of CPU usage by sessions, even though only 8 seconds of real CPU execution was achieved.

    In other words Solaris and as a result Oracle can only record allocated time not execution time on a CPU. Each of the 8 threads was "allocated" onto that CPU core for 8 seconds each (64 allocated seconds total), but that CPU core can only deliver 8 seconds of execution time over an 8 second elapsed period.

    Either way, as I said, I would switch off Parallel Query, rerun my tests while taking the same AWR or Statspack report as you did before, and see if CPU or disk I/O is the top event.

    John
1 2 3 4 Previous Next

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points