Row lock with for update

SB35
SB35 Member Posts: 60 Green Ribbon

It's version 19.9.0.0.0 of Oracle Exadata. We are seeing the application query below, which used to run around 200-300 times an hour and took roughly 0.5 sec/execution. It then started executing around 1,000 times/hour and the execution time went up to roughly 3 to 4 seconds/execution, flooding the whole database with "row lock contention" application waits.

It seems this statement takes a lock and does some processing at the same time, waiting up to 20 seconds if it cannot get the lock. The row lock contention appears when a lot of sessions are submitted doing the same work and fighting with each other. So we wanted to know: how should we fix such an issue? Should we change it to NOWAIT and let a session skip that processing when another session is already doing it, or should we reduce the WAIT from 20 to around 10?

SELECT ID FROM LOCK_TAB WHERE LOCK_NAME = 'XXXX' FOR UPDATE OF ID WAIT 20
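For reference, a rough sketch of the two variants being considered (the exception handling below is only illustrative): NOWAIT raises ORA-00054 immediately if the row is locked, while WAIT n raises ORA-30006 once the timeout expires.

DECLARE
  l_id lock_tab.id%TYPE;
  row_locked   EXCEPTION;
  wait_expired EXCEPTION;
  PRAGMA EXCEPTION_INIT(row_locked,   -54);     -- ORA-00054: raised by NOWAIT
  PRAGMA EXCEPTION_INIT(wait_expired, -30006);  -- ORA-30006: raised when WAIT n expires
BEGIN
  SELECT id INTO l_id
  FROM   lock_tab
  WHERE  lock_name = 'XXXX'
  FOR UPDATE OF id NOWAIT;      -- or: WAIT 10
  -- ... serialised processing here; COMMIT releases the lock
EXCEPTION
  WHEN row_locked OR wait_expired THEN
    NULL;  -- another session holds the lock: skip, retry, or report back
END;
/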

Best Answer

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,729 Gold Crown
    Accepted Answer

    I think @Solomon Yakobson has supplied a convincing demonstration that the dynamic sampling is a consequence of real-time sampling, and your comment about constant change of status is a hint that the way the application works may be doing enough to trigger real-time stats.

    The only remaining part of the puzzle is whether Oracle simply discards the histogram information because it's looking at real-time stats - and I think I may have seen something about that somewhere on the internet, possibly from @Sayan Malakshinov or @Mohamed Houri.

    Strategically - i.e. long term, rather than immediate hack - the ideal approach is the function-based index with matching query. Among other things this may allow you to drop a very large index which is (a) constantly subject to updates and (b) mostly a waste of space.

    If you can't change the SQL at present and can't find a good way to disable the real-time stats collection for the task, then you could simply attach an SQL plan baseline to each of the two queries, forcing the use of the index you want the optimizer to pick.
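    To illustrate, a minimal sketch of pinning a plan via DBMS_SPM (the sql_id and plan_hash_value below are placeholders for the values you would read from the cursor cache):

    DECLARE
      l_loaded PLS_INTEGER;
    BEGIN
      -- load the plan currently in the cursor cache as an accepted, fixed baseline
      l_loaded := DBMS_SPM.LOAD_PLANS_FROM_CURSOR_CACHE(
                    sql_id          => 'abcd1234efgh5',   -- placeholder sql_id
                    plan_hash_value => 1234567890,        -- placeholder: plan using the wanted index
                    fixed           => 'YES');
      DBMS_OUTPUT.PUT_LINE(l_loaded || ' plan(s) loaded');
    END;
    /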



    Looking at these current actuals, and the details you've given about the gathered histogram, the most robust option, and the one least likely to result in performance swings, is the function-based index approach - possibly with a function that hides the C and MT values and exposes all the rest - and I think I'd go for a global index rather than have Oracle probe 100 different index partitions to find a few rows.
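    As a sketch of that idea (table and column names are only guesses from the rest of the thread): the CASE expression returns NULL for the high-volume values, so those rows are simply not present in the index, and the query has to repeat the identical expression for the index to be usable.

    -- hypothetical global function-based index that excludes the 'C' and 'MT' rows
    CREATE INDEX tran_tab_open_status_idx
      ON tran_tab (CASE WHEN status IN ('C', 'MT') THEN NULL ELSE status END);

    -- matching query: same expression, so only the "interesting" rows are visited
    SELECT *
    FROM   tran_tab
    WHERE  CASE WHEN status IN ('C', 'MT') THEN NULL ELSE status END = 'I';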

    You'll notice that the D doesn't appear in the histogram data - the histogram is a hybrid and the 3,000 D rows have been captured in the ED bucket. (The ED repeat count is 53 but the ED bucket size is 2967). This is a simple indicator of the instability introduced by the "size repeat" option. I could invent scenarios that could result in the row path appearing as a side effect of this type of behaviour.


    Regards

    Jonathan Lewis


Answers

  • Billy Verreynne
    Billy Verreynne Software Engineer Member Posts: 28,570 Red Diamond

    A lock is for the protection of the integrity of a resource.

    You now have an increase in processes wanting to access the same set of resources.

    This obviously means an increase in contention for acquiring locks to access a resource.

    Reducing the time waited for a lock is not going to decrease the demand for locks.

    Two basic approaches to this.

    Decrease the processing time that a process requires to hold a lock on the resource.

    Address lock contention (processes wanting access to the same set of resources) by increasing the number of resources available, or by clever resource scheduling to reduce the likelihood of processes vying for access to the same resource at the same time.

  • Solomon Yakobson
    Solomon Yakobson Member Posts: 18,658 Black Diamond
    edited Jul 1, 2021 10:48AM

    You need to check what type of lock/wait it is. It could be that, with the increased number of concurrent transactions, multiple transactions are now updating the same row. In that case you need to check whether the update is committed right away or the transaction continues and the row stays locked for a long(er) time, and see if that time can be reduced. Or, with the increased number of concurrent transactions, it could be that more transactions are trying to update different rows residing in the same block while MAXTRANS is less than the number of concurrent transactions updating that block, or there is simply insufficient free space in the block to allocate another ITL slot. In such a case you need to rebuild the table with a higher MAXTRANS/INITRANS.
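    A couple of quick checks along those lines (the ASH query assumes the Diagnostics Pack is licensed, and the INITRANS value is just an example):

    -- which flavour of TX enqueue is dominating?
    SELECT event, COUNT(*)
    FROM   v$active_session_history
    WHERE  event LIKE 'enq: TX%'
    GROUP  BY event
    ORDER  BY COUNT(*) DESC;

    -- if it is "enq: TX - allocate ITL entry", give each block more ITL slots
    ALTER TABLE lock_tab INITRANS 20;
    ALTER TABLE lock_tab MOVE;   -- existing blocks only pick up the new setting after a rebuild
    -- indexes go UNUSABLE after the MOVE and need ALTER INDEX ... REBUILD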

    SY.

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,729 Gold Crown

    This looks like the type of code that might be used to "get the next sequential number, use it, commit" to produce an unbroken list of sequential values. If so then you can't avoid queueing.

    However, as SY says, you could ensure you minimise the time between the "select for update" and the "commit".

    One thought that may be relevant - if you don't have to complete all the processing between the select and the commit before releasing the locked row, you could look at strategies that use the dbms_lock package to serialise the working sessions while minimising the time a session has to wait. dbms_lock allows you to release its "user-defined" locks without committing, so your code might, in outline, change to:

    request exclusive "sync" lock
    do some dml
    release exclusive "sync" lock
    do some more dml
    commit;
    

    Sessions would queue on an enqueue/lock of type 'UL', but you might be able to shorten the wait time.
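    A minimal sketch of that outline (the lock name, timeout and placement of the DML are placeholders; note that ALLOCATE_UNIQUE issues its own commit, so call it before starting the transaction):

    DECLARE
      l_handle VARCHAR2(128);
      l_status INTEGER;
    BEGIN
      -- map an application lock name to a lock handle (ALLOCATE_UNIQUE commits)
      DBMS_LOCK.ALLOCATE_UNIQUE(lockname => 'SYNC_LOCK', lockhandle => l_handle);

      -- request the lock in exclusive mode, waiting up to 10 seconds
      l_status := DBMS_LOCK.REQUEST(lockhandle        => l_handle,
                                    lockmode          => DBMS_LOCK.X_MODE,
                                    timeout           => 10,
                                    release_on_commit => FALSE);
      IF l_status = 0 THEN
        -- do the DML that has to be serialised
        l_status := DBMS_LOCK.RELEASE(l_handle);   -- released here, not at commit
        -- do the remaining DML
        COMMIT;
      ELSIF l_status = 1 THEN
        NULL;   -- timed out: another session still holds the 'UL' lock
      END IF;
    END;
    /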

    Example of usage (though not an exact match for your requirement) available at this URL:

    https://jonathanlewis.wordpress.com/2011/01/14/statspack-on-rac/

    Regards

    Jonathan Lewis

  • SB35
    SB35 Member Posts: 60 Green Ribbon
    edited Jul 5, 2021 4:04PM

    Thank you so much @Jonathan Lewis, @Solomon Yakobson. It looks like changing it to 'NOWAIT' or decreasing the WAIT to a lesser value is not going to help us reduce those 'row lock contention' waits.

    The enqueue type at the top of the AWR report is "TX - Transaction (row lock contention)"; we are not seeing many ITL-related waits in the AWR report for this interval.

    I am not familiar with dbms_lock but am trying to understand the logic. The code doesn't seem to be fetching a sequence number and using it; rather it is doing something else. The part of the code from the 'select for update' statement to the 'COMMIT' statement looks like below. I need to check with the team whether we can really release the lock before the COMMIT, and then perhaps we can get rid of a few of the statements.

     
    


  • SB35
    SB35 Member Posts: 60 Green Ribbon

    Thank you @Billy Verreynne ,

    I am not quite clear on this. Can you please elaborate a bit more: in this case, how can I increase the resources to improve the situation and get rid of the row lock contention?

  • Paulzip
    Paulzip Member Posts: 8,407 Blue Diamond
    edited Jul 2, 2021 1:20AM

    The problem with your approach to serialisation in a multi-session environment, with something like WAIT 20 and only one lock allocation ('LOCK_PRC'), is that you have no chance of scalability. With a handful of fairly active sessions you will enqueue sessions almost immediately, with waiters being blocked until the transaction is finished, especially if your deletes or updates take a while. Then the first-come-first-served contention happens all over again as soon as session 1 finishes its transaction. Your whole approach seems wrong to me, because of its synchronous and serialised nature.

    You may be better off utilising an advanced queue for something like this - issuing the commands via an enqueued message and just dequeueing them asynchronously in order; the issue may be any feedback to the UI if your process is relying on that. Personally, I think you need to rethink your model so that sessions aren't fighting over one lock allocation, or go with optimistic locking and don't bother with "for update".
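    For what it's worth, a rough sketch of the optimistic route (table, column and status values are guesses from the rest of the thread): read without FOR UPDATE, make the update conditional on the row still being in the state you read, and treat zero rows updated as losing the race.

    DECLARE
      l_id tran_tab.pid%TYPE := 42;   -- placeholder key
    BEGIN
      UPDATE tran_tab
      SET    status = 'P'             -- placeholder "in progress" status
      WHERE  pid    = l_id
      AND    status = 'I';            -- the state observed when the row was read

      IF SQL%ROWCOUNT = 0 THEN
        ROLLBACK;                     -- another session got there first: re-read and retry, or give up
      ELSE
        COMMIT;
      END IF;
    END;
    /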

  • Billy Verreynne
    Billy Verreynne Software Engineer Member Posts: 28,570 Red Diamond

    As I mentioned, you can try to reduce the time the process needs to lock the resource, or increase the number of resources.

    I have no idea what resources you have defined in your LOCK_TAB table, and what the LOCK_PRC resource is.

    But the options I mentioned are to make the process using LOCK_PROC require less time for its lock duration, by reducing and optimising the workload that needs to be done while LOCK_PRC is locked. Alternatively, design LOCK_PROC_01 to LOCK_PROC_10, for example, in order to increase the number of LOCK_PROC resources and support up to 10 locks to be placed on LOCK_PROC.

    Whether one of these options is possible depends on the reasons why there is currently only 1 LOCK_PROC resource, and just what is done workload-wise while this resource is locked.
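    If the multiple-resource route does turn out to be possible, one hedged sketch (all names invented) is to keep ten lock rows instead of one and let each session grab whichever is free, for example with SKIP LOCKED:

    DECLARE
      l_id lock_tab.id%TYPE;
      CURSOR c IS
        SELECT id
        FROM   lock_tab
        WHERE  lock_name LIKE 'LOCK_PRC_%'   -- hypothetical: ten lock rows instead of one
        FOR UPDATE OF id SKIP LOCKED;
    BEGIN
      OPEN c;
      FETCH c INTO l_id;                     -- first lock row not currently held, if any
      IF c%FOUND THEN
        NULL;   -- run the serialised workload under this lock row
      ELSE
        NULL;   -- every lock row is busy: wait, retry, or report back
      END IF;
      CLOSE c;
      COMMIT;                                -- releases the row lock
    END;
    /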

    The code snippet you posted seems to point to basic serialisation - ensuring that the LOCK_PROC resource workload can only ever be executed one at a time. Thus this indicates the workload is not thread safe (cannot be run in parallel).

    Serialisation destroys performance when scaling a multi-process environment - which is exactly what you have done by increasing this resource usage from 200-300 times per hour to 1,000 times per hour.

  • SB35
    SB35 Member Posts: 60 Green Ribbon

    Thank you @Paulzip .

    I have read in the past that 'for update' is really required for certain processing, such as a bank transaction: if we allow multiple sessions to get hold of the same row/amount and work on it, they may end up updating it to some wrong value which they should not. But as this code doesn't seem to be doing that, is that why you are suggesting optimistic locking? And do you mean that in this case optimistic locking, i.e. just the SELECT without any 'for update', is going to give us better performance compared to 'for update nowait' or 'for update wait 10 or 5', etc.?

  • SB35
    SB35 Member Posts: 60 Green Ribbon

    thank you @Billy Verreynne

    I am still trying to understand this one and what exact resource it is holding up, but actually LOCK_PRC is the name of the procedure itself, which contains the code I posted. It gets called iteratively in the NO_DATA_FOUND exception. So it seems it is only doing the insert/update/delete that is in this code, but yes, after the commit it returns a ref cursor as output, something like below. As this out ref cursor is returned after the commit (and there is no other code after it), it should probably not be considered part of the locked resource, I believe. Please correct me if I am wrong.

        SELECT * FROM tran_tab WHERE status = 'I' AND pid = hid AND lock_id = lockid ORDER BY creat_date;

  • Billy Verreynne
    Billy Verreynne Software Engineer Member Posts: 28,570 Red Diamond

    My comments are from a software engineering perspective - why locking of resources is needed, and how one can deal with it in terms of keeping workloads short and optimal to reduce resource contention, and increasing the number of resources.

    Had a quick look at the code you posted. But beyond seeing the serialisation of LOCK_PRC, it is difficult to provide suggestions on how to scale this. The bigger picture is needed. What business flow processing is addressed? What are the business requirements and constraints?

    Serialised access to a resource is important if the resource is not multi-thread capable. This is not an error.

    What can be an error is when the resource could be designed as multi-thread capable, or when the requirements could be better addressed by something other than serialised access.

    There is no technical solution to your resource contention problem, without knowing and addressing the bigger picture.