14 Replies Latest reply: Nov 16, 2011 11:58 AM by 874555 RSS

    Performance Issue: Deadlocks RAC

    874555
      Hello,

      Our team are recently doing performance tests in our RAC 11gR2 3 nodes Lab under Redhat Linux 5.
      During the tests, there are deadlocks detected in the alert log file;
      ...
      Tue Nov 15 11:22:54 2011
      Global Enqueue Services Deadlock detected. More info in file
       /opt/app/oracle/diag/rdbms/db/db1/trace/db1_lmd0_12563.trc.
      Global Enqueue Services Deadlock detected. More info in file
       /opt/app/oracle/diag/rdbms/db/db1/trace/db1_lmd0_12563.trc.
      Tue Nov 15 11:23:10 2011
      Global Enqueue Services Deadlock detected. More info in file
       /opt/app/oracle/diag/rdbms/db/db1/trace/db1_lmd0_12563.trc.
      Tue Nov 15 11:23:10 2011
      Dumping diagnostic data in directory=[cdmp_20111115113104], requested by (instance=3, osid=13645), summary=[abnormal process termination].
      Tue Nov 15 11:23:37 2011
      Global Enqueue Services Deadlock detected. More info in file
       /opt/app/oracle/diag/rdbms/db/db1/trace/db1_lmd0_12563.trc.
      ....
      About 50% of the queries are failing with this error; Looking into the trace file I have ;
      /opt/app/oracle/diag/rdbms/db/db1/trace/db1_lmd0_12563.trc
      ....
      user session for deadlock lock 0x3d2438198
        sid: 200 ser: 9277 audsid: 1824543 user: 68/SNEADMIN
          flags: (0x45) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
          flags2: (0x40009) -/-/INC
        pid: 65 O/S info: user: grid, term: UNKNOWN, ospid: 5258
          image: oracle@db01
        client details:
          O/S info: user: cceadmin, term: unknown, ospid: 1234
          machine: ca.gency.com program: JDBC Thin Client
          application name: JDBC Thin Client, hash value=2546894660
        current SQL:
        update CCE_ORDER_ITEM_DETAILS set ORDER_ID=:1 , CLUSTER_ID=:2  where ORDER_LINE_ITEM_ID=:3  and CLUSTER_ID=:4 
      DUMP LOCAL BLOCKER: initiate state dump for DEADLOCK
        possible owner[65.5258] on resource TM-0001DF87-00000000
      
      *** 2011-11-15 05:34:48.194
      Submitting asynchronized dump request [28]. summary=[ges process stack dump (kjdglblkrdm1)].
      Global blockers dump end:-----------------------------------
      Global Wait-For-Graph(WFG) at ddTS[0.5] :
      BLOCKED 0x3d2438198 4 wq 2 cvtops x1 TM 0x1df87.0x0(ext 0x0,0x0)[41000-0001-000004A5] inst 1 
      BLOCKER 0x3d7846108 4 wq 2 cvtops x1 TM 0x1df87.0x0(ext 0x0,0x0)[38000-0002-00002C68] inst 2 
      BLOCKED 0x3d7846108 4 wq 2 cvtops x1 TM 0x1df87.0x0(ext 0x0,0x0)[38000-0002-00002C68] inst 2 
      BLOCKER 0x3d75a4fd8 4 wq 2 cvtops x1 TM 0x1df87.0x0(ext 0x0,0x0)[43000-0003-000009C1] inst 3 
      BLOCKED 0x3d75a4fd8 4 wq 2 cvtops x1 TM 0x1df87.0x0(ext 0x0,0x0)[43000-0003-000009C1] inst 3 
      BLOCKER 0x3d2438198 4 wq 2 cvtops x1 TM 0x1df87.0x0(ext 0x0,0x0)[41000-0001-000004A5] inst 1 
      
      .....
      Any idea please , I can provide more information if needed

      Thanks
        • 1. Re: Performance Issue: Deadlocks RAC
          Dom Brooks
          The deadlocks posted seem to be concerned with mode 4 TM locks but the full stack is incomplete.

          Do you have unindexed foregin keys related to the update statement listed?

          Edited by: Dom Brooks on Nov 16, 2011 2:26 PM
          See my message further down - believe that "4" represents mode 5 not mode 4 in the GWFG.
          • 2. Re: Performance Issue: Deadlocks RAC
            874555
            Dom Brooks wrote:
            The deadlocks posted seem to be concerned with mode 4 TM locks but the full stack is incomplete.

            Do you have unindexed foregin keys related to the update statement listed?
            Hi Dom,

            Please tell me what more info you need me to post;
            also, please find below the DDL of the table
            -- Create table
            create table SNEADMIN.CCE_ORDER_ITEM_DETAILS
            (
              order_line_item_id         NUMBER(10) not null,
              order_id                   NUMBER(10) not null,
              cluster_id                 NUMBER(3) not null,
              order_time                 NUMBER not null,
              external_order_id          VARCHAR2(25),
              line_item_id               VARCHAR2(5) not null,
              subscriber_id              NUMBER(10) not null,
              cancelled_ind              NUMBER(2) not null,
              adder_id                   VARCHAR2(255) not null,
              line_item_time             NUMBER not null,
              external_product_id        VARCHAR2(12) not null,
              product_id                 NUMBER not null,
              product_version            VARCHAR2(10) not null,
              content_type               NUMBER(2),
              product_display_name       VARCHAR2(500),
              external_offer_id          VARCHAR2(100),
              ppp_id                     NUMBER,
              ppp_code                   VARCHAR2(9),
              ppp_version                VARCHAR2(10),
              qty                        NUMBER(5),
              available_qty              NUMBER,
              offer_text                 VARCHAR2(1000),
              status                     NUMBER(2),
              unit_price                 NUMBER(12,2),
              currency                   VARCHAR2(3),
              sub_total_amount           NUMBER(12,2),
              tax                        NUMBER(12,2),
              total_value                NUMBER(12,2),
              vendor_cost_price          NUMBER(12,2),
              related_order_line_item_id NUMBER(10),
              related_order_cluster_id   NUMBER(3),
              content_id                 VARCHAR2(120),
              provider_name              VARCHAR2(100),
              item_type                  VARCHAR2(50)
            )
            tablespace SNE_16M_DAT
              pctfree 10
              initrans 2
              maxtrans 255
              storage
              (
                initial 64K
                next 1M
                minextents 1
                maxextents unlimited
              );
            -- Create/Recreate primary, unique and foreign key constraints 
            alter table SNEADMIN.CCE_ORDER_ITEM_DETAILS
              add constraint CCE_ORDER_ITEM_DETAILS_PK primary key (ORDER_LINE_ITEM_ID, CLUSTER_ID)
              using index 
              tablespace SNE_16M_IDX
              pctfree 10
              initrans 3
              maxtrans 255
              storage
              (
                initial 64K
                next 1M
                minextents 1
                maxextents unlimited
              );
            alter table SNEADMIN.CCE_ORDER_ITEM_DETAILS
              add constraint CCE_ORDER_ITEM_DETAILS_OEID_UK unique (EXTERNAL_ORDER_ID, CLUSTER_ID, LINE_ITEM_ID)
              using index 
              tablespace SNE_16M_IDX
              pctfree 10
              initrans 3
              maxtrans 255
              storage
              (
                initial 64K
                next 1M
                minextents 1
                maxextents unlimited
              );
            alter table SNEADMIN.CCE_ORDER_ITEM_DETAILS
              add constraint CCE_ORDER_ITEM_DETAILS_OLID_UK unique (ORDER_ID, CLUSTER_ID, LINE_ITEM_ID)
              using index 
              tablespace SNE_16M_IDX
            ;
            
            alter table SNEADMIN.CCE_ORDER_ITEM_DETAILS
              add constraint CCE_ORDER_ITEM_DETAILS_EOID_FK foreign key (EXTERNAL_ORDER_ID, CLUSTER_ID)
              references SNEADMIN.CCE_ORDER_DETAILS (EXTERNAL_ORDER_ID, CLUSTER_ID);
            alter table SNEADMIN.CCE_ORDER_ITEM_DETAILS
              add constraint CCE_ORDER_ITEM_DETAILS_OID_FK foreign key (ORDER_ID, CLUSTER_ID)
              references SNEADMIN.CCE_ORDER_DETAILS (ORDER_ID, CLUSTER_ID);
            alter table SNEADMIN.CCE_ORDER_ITEM_DETAILS
              add constraint CCE_ORDER_ITEM_DETAILS_ROID_FK foreign key (RELATED_ORDER_LINE_ITEM_ID, RELATED_ORDER_CLUSTER_ID)
              references SNEADMIN.CCE_ORDER_ITEM_DETAILS (ORDER_LINE_ITEM_ID, CLUSTER_ID);
            • 3. Re: Performance Issue: Deadlocks RAC
              Dom Brooks
              See Tom Kyte's article in oramag for a succint description of when you need indexes on foreign keys:
              http://www.oracle.com/technetwork/issue-archive/2010/10-nov/o60asktom-176254.html

              I suspect you need an index on columns (RELATED_ORDER_LINE_ITEM_ID, RELATED_ORDER_CLUSTER_ID) which is self-referential.

              Foreign key locking is one of those issues that can change subtly but significantly between releases so if you're in the lab testing an upgrade, for example, then it's not uncommon to see significant differences in such areas.

              For more information see:
              http://jonathanlewis.wordpress.com/2010/02/15/lock-horror/
              http://richardfoote.wordpress.com/2010/11/10/oracle11g-new-locking-modes-when-policing-fk-constraints-a-wolf-at-the-door/
              • 4. Re: Performance Issue: Deadlocks RAC
                874555
                Dom Brooks wrote:
                See Tom Kyte's article in oramag for a succint description of when you need indexes on foreign keys:
                http://www.oracle.com/technetwork/issue-archive/2010/10-nov/o60asktom-176254.html

                I suspect you need an index on columns (RELATED_ORDER_LINE_ITEM_ID, RELATED_ORDER_CLUSTER_ID) which is self-referential.

                Foreign key locking is one of those issues that can change subtly but significantly between releases so if you're in the lab testing an upgrade, for example, then it's not uncommon to see significant differences in such areas.

                For more information see:
                http://jonathanlewis.wordpress.com/2010/02/15/lock-horror/
                http://richardfoote.wordpress.com/2010/11/10/oracle11g-new-locking-modes-when-policing-fk-constraints-a-wolf-at-the-door/
                Thanks Dom, I suspected as well the missed indexes problem. DO you mean a normal index or unique index on columns (RELATED_ORDER_LINE_ITEM_ID, RELATED_ORDER_CLUSTER_ID) ?
                • 5. Re: Performance Issue: Deadlocks RAC
                  Dom Brooks
                  Whatever's relevant to the data. A normal b*tree index would be fine if this is the root cause.


                  By the way, I know that I'm not familiar with your tables and data etc, but I'm not sure I'm a huge fan of this design.
                  CLUSTER ID is part of your primary key.
                  If it's then updateable, don't you then have to go an update all the associated child records with the old related_cluser_id?
                  Possibly the same with updateable columns in the other foreign keys.
                  • 6. Re: Performance Issue: Deadlocks RAC
                    874555
                    Thank you Dom,
                    How can I know that un-indexed foreign keys are the root cause of the deadlocks?
                    • 7. Re: Performance Issue: Deadlocks RAC
                      Dom Brooks
                      Good question.

                      You need to come up with a test case that demonstrates the problem.
                      Then show that the proposed solution addresses the issue.
                      You might need to do more investigation regarding all the statements involved in the locking scenario.
                      • 8. Re: Performance Issue: Deadlocks RAC
                        Dom Brooks
                        The deadlocks posted seem to be concerned with mode 4 TM locks
                        From the following links:
                        http://jonathanlewis.wordpress.com/2010/02/15/lock-horror/
                        http://www.rachelp.nl/index_kb.php?menu=articles&actie=show&id=15

                        and this information related to the third column in the global wait-for-graph:
                        #define KJUSERNL 0          /* no permissions */   (Null)
                        #define KJUSERCR 1          /* concurrent read */ (Row-S (SS))
                        #define KJUSERCW 2         /* concurrent write */ (Row-X (SX))
                        #define KJUSERPR 3         /* protected read */    (Share)
                        #define KJUSERPW 4          /* protected write */ (S/Row-X (SSX))
                        #define KJUSEREX 5         /* exclusive access */ (Exclusive)
                        It seems I may have misinterpreted the "4" as meaning mode 4 when it should be mode "5"

                        If it is mode 5 that then might support the unindexed self-referential foreign key as I hope the demo below will illustrate.

                        Let me create a table will the relevant fk ( I used a 2 column fk just because it was similar to your situation but it's not relevant that it's 2 columns).
                        SQL> create table t1
                          2  (col1         number       not null
                          3  ,col2         number       not null
                          4  ,col3         varchar2(10) not null
                          5  ,related_col1 number
                          6  ,related_col2 number);
                        
                        Table created.
                        
                        SQL> 
                        SQL> alter table t1 add constraint pk_t1 primary key (col1,col2);
                        
                        Table altered.
                        
                        SQL> 
                        SQL> alter table t1 add constraint fk_t1 foreign key 
                          2  (related_col1,related_col2) references t1 (col1,col2);
                        
                        Table altered.
                        
                        SQL> 
                        SQL> insert into t1 values (1,1,'X',null,null);
                        
                        1 row created.
                        
                        SQL> insert into t1 values (2,2,'X',null,null);
                        
                        1 row created.
                        
                        SQL> commit;
                        Now, show that if you don't update one of the columns involved in the pk, there's no issue with blocking locks:

                        Session 1:
                        Session1>update t1
                          2  set    col3 = 'Y'
                          3  where col1 = 1
                          4  and   col2 = 1;
                        
                        1 row updated.
                        
                        Session1>
                        Session 2:
                        Session2>update t1
                          2  set    col3 = 'Y'
                          3  where col1 = 2
                          4  and   col2 = 2;
                        
                        1 row updated.
                        
                        Session2>
                        But if the update statement updates one of the columns in the primary key which is referenced in the foreign key on the same table:

                        Session 1:
                        Session1>update t1
                          2  set    col2 = 1
                          3  ,      col3 = 'Y'
                          4  where col1 = 1
                          5  and   col2 = 1;
                        
                        1 row updated.
                        
                        Session1>
                        Session 2 hangs:
                        Session2>update t1
                          2  set    col1 = 2
                          3  ,      col3 = 'Y'
                          4  where col1 = 2
                          5  and   col2 = 2;
                        .....
                        hangs
                        And you can see here that session 2 is waiting on a mode 5 TM lock:
                        Session1>l
                          1  select process,
                          2         l.sid,
                          3         type,
                          4         lmode,
                          5         request,
                          6         do.object_name
                          7  from   v$locked_object lo,
                          8         dba_objects do,
                          9         v$lock l
                         10  where  lo.object_id   = do.object_id
                         11  AND    l.sid          = lo.session_id
                         12* AND    do.object_name = 'T1'
                        Session1>/
                        
                        PROCESS           SID TY      LMODE    REQUEST OBJECT_NAM
                        ---------- ---------- -- ---------- ---------- ----------
                        4724:6028        1514 TM          0          5 T1
                        4724:6028        1514 AE          4          0 T1
                        4408:2384         380 TX          6          0 T1
                        4408:2384         380 TM          3          0 T1
                        4408:2384         380 AE          4          0 T1
                        
                        Session1>
                        However, if I create an index on that fk, the update no longer has the blocking issue:
                        Session1>create index i_t1 on t1 (related_col1, related_col2);
                        
                        Index created.
                        
                        Session1>update t1
                          2  set    col2 = 1
                          3  ,      col3 = 'Y'
                          4  where col1 = 1
                          5  and   col2 = 1;
                        
                        1 row updated.
                        
                        Session1>
                        And session 2 no longer blocks:
                        Session2>update t1
                          2  set    col1 = 2
                          3  ,      col3 = 'Y'
                          4  where col1 = 2
                          5  and   col2 = 2;
                        
                        1 row updated.
                        
                        Session2>
                        What of course I've missed is the added complexity which makes this a deadlock scenario involving these mode 5 locks.

                        We can do that artificially using the two sets of statements to make sure the two sessions are also competing for the same rows.
                        .... in session 1
                        Session1>drop index i_t1
                          2  /
                        
                        Index dropped.
                        
                        Session1>
                        Session1>update t1
                          2  set    col3 = 'Y'
                          3  where col1 = 2
                          4  and   col2 = 2;
                        
                        1 row updated.
                        Session1>
                        
                        .... over to session 2
                        
                        Session2>update t1
                          2  set    col3 = 'Y'
                          3  where col1 = 1
                          4  and   col2 = 1;
                        
                        1 row updated.
                        Session2>
                        
                        .... back to session 1
                        
                        Session1>update t1
                          2  set    col2 = 1
                          3  ,      col3 = 'Y'
                          4  where col1 = 1
                          5  and   col2 = 1;
                        
                        
                        .... back to Session 2
                        
                        
                        Session2>update t1
                          2  set    col1 = 2
                          3  ,      col3 = 'Y'
                          4  where col1 = 2
                          5  and   col2 = 2;
                        
                        .... wait a few secs ....
                        update t1
                               *
                        ERROR at line 1:
                        ORA-00060: deadlock detected while waiting for resource
                        
                        
                        Session2>rollback;
                        And the locks involved in the deadlock look like this:
                        SQL> /
                        
                        PROCESS           SID TY      LMODE    REQUEST OBJECT_NAM
                        ---------- ---------- -- ---------- ---------- ----------
                        4724:6028        1514 TX          6          0 T1
                        4724:6028        1514 TM          3          5 T1
                        4724:6028        1514 AE          4          0 T1
                        4408:2384         380 TX          6          0 T1
                        4408:2384         380 TM          3          5 T1
                        4408:2384         380 AE          4          0 T1
                        
                        6 rows selected.
                        
                        SQL> 
                        So your situation might be slightly more complicated and you always have to bear in mind that the deadlock trace files do not tell you everything you need to know about all the statements involved in the deadlock situation. But it certainly points to this being an issue.

                        Edited by: Dom Brooks on Nov 16, 2011 2:39 PM

                        Edited by: Dom Brooks on Nov 16, 2011 2:46 PM
                        Added deadlock demo
                        • 9. Re: Performance Issue: Deadlocks RAC
                          874555
                          Hi Dom,
                          We created the indexes for foreign keys but still getting deadlock errors.
                          • 10. Re: Performance Issue: Deadlocks RAC
                            Dom Brooks
                            Ok.
                            What index did you create?

                            I would delve further into the deadlock trace files and collate as much info as possible about the resources and statements involved.

                            And if you're licensed for Diagnostic pack, you might find that looking at this from another angle helps - the raw ASH data should reveal more about the sessions, statements and locks involved in the bigger picture via GV/V$ACTIVE_SESSION_HISTORY and DBA_HIST_ACTIVE_SESSION_HISTORY.

                            Note that there is extra information in the deadlock file about the objects involved.
                            See:
                            DUMP LOCAL BLOCKER: initiate state dump for DEADLOCK
                              possible owner[65.5258] on resource TM-0001DF87-00000000
                            The '0001DF87' bit is hex for the object involved in the TM lock, this translates to decimal 122759.
                            Use this as object id in dba_objects to verify the object involved.
                            Ditto for another such ids in this or the other traces files.
                            • 11. Re: Performance Issue: Deadlocks RAC
                              874555
                              Thanks Dom, I have seen manay other updates on other tables and now I am creating index on foreign keys. I will update you
                              • 12. Re: Performance Issue: Deadlocks RAC
                                874555
                                No chance :( Still having deadlocks.
                                Please let me know if you need more information I could provide

                                  current SQL:
                                  update CCE_FULFILLMENT_ITEM_PROPS set ITEM_ID=:1 , CLUSTER_ID=:2  where ITEM_PROPS_ID=:3  and CLUSTER_ID=:4
                                DUMP LOCAL BLOCKER: initiate state dump for DEADLOCK
                                  possible owner[47.19424] on resource TM-0001DF1C-00000000
                                
                                *** 2011-11-16 08:22:18.937
                                Submitting asynchronized dump request [28]. summary=[ges process stack dump (kjdglblkrdm1)].
                                Global blockers dump end:-----------------------------------
                                Global Wait-For-Graph(WFG) at ddTS[0.235] :
                                BLOCKED 0x3d2439508 4 wq 2 cvtops x1 TM 0x1df1c.0x0(ext 0x0,0x0)[2F000-0001-00001853] inst 1
                                BLOCKER 0x3d75a66c8 4 wq 2 cvtops x1 TM 0x1df1c.0x0(ext 0x0,0x0)[49000-0002-000001CF] inst 2
                                BLOCKED 0x3d75a66c8 4 wq 2 cvtops x1 TM 0x1df1c.0x0(ext 0x0,0x0)[49000-0002-000001CF] inst 2
                                BLOCKER 0x3d2439508 4 wq 2 cvtops x1 TM 0x1df1c.0x0(ext 0x0,0x0)[2F000-0001-00001853] inst 1
                                • 13. Re: Performance Issue: Deadlocks RAC
                                  Dom Brooks
                                  Out of interest, can you clarify if this is a new application that's running on 11gR2? i.e. has this previously been running without issues on a previous version?

                                  I don't know if it's the same in one of these LMD deadlock trace files but in a single instance deadlock file there's a big warning that deadlocks are the result of poor application design. This isn't necessarily 100% true but can often be the case. I hinted at this when I commented above on the fact that you're updating the cluster_id column in these tables which is then referenced in fk relationships - it's not a design that I'm convinced about.

                                  What's the application doing? What does cluster_id represent?


                                  If there are multiple TM mode 5 deadlock occurrences on multiple objects, it would seem that you have multiple problems (presumably of the same missing fk type).

                                  As you can see from the info in the latest deadlock trace, the statement for the session involved is different - CCE_FULFILLMENT_ITEM_PROPS.
                                  Is this the same object as hex 0001DF1C, decimal object id 122652?

                                  Can you confirm structure of this object and any pk/fks and indexes?
                                  • 14. Re: Performance Issue: Deadlocks RAC
                                    874555
                                    Thank you Dom. There were a table missed foregn key index creation, after creating it, the deadlocks are fixed.

                                    Thanks so much for your great help!