SS Enqueue and sort segment request waits on 8.1.7.4 after trying to drop temp tablespace

Bobby Durrett
Bobby Durrett Member Posts: 140 Blue Ribbon

We have a case open with Oracle support but I thought I would throw this out there if that is ok.

We have an 8.1.7.4 database running on HP-UX 11.11 PA-Risc and it looks like any session that wants to use some temp space is hanging on an SS enqueue or in a few cases is waiting on a "sort segment request" wait.
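
In case it helps anyone see the same thing, this is roughly the query we have been using to spot the waiters - a sketch, adjust as needed:

-- sessions currently waiting on an enqueue or on a sort segment request;
-- for 'enqueue' waits the lock type (e.g. SS) is encoded in the high-order bytes of P1
select sid, event, p1, p2, p3, seconds_in_wait
  from v$session_wait
 where event in ('enqueue', 'sort segment request');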

Saturday we tried to move all the users over to a new temporary tablespace and drop the old one but the drop hung and we control-C'd out of it.  We put the users back to the original temp.  Note that both the old and new temp tablespaces are locally managed.

Before trying to drop the old temp tablespace we killed the existing sessions including one that had been running for two weeks and was hung on SMON.

Queries of v$fast_start_transactions and x$ktuxe don't indicate that SMON is rolling back some large transaction.
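
For reference, these were roughly the checks - a sketch; x$ktuxe has to be queried as SYS:

-- transactions still being recovered (empty in our case)
select usn, state, undoblocksdone, undoblockstotal
  from v$fast_start_transactions;

-- dead transactions still holding undo
select ktuxeusn, ktuxeslt, ktuxesqn, ktuxesiz
  from x$ktuxe
 where ktuxecfl = 'DEAD';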

Also, SMON appears to be running this query continually:

SELECT file#, block#, LENGTH

  FROM uet$

WHERE ts# = :1 AND segfile# = :2 AND segblock# = :3 AND ext# = :4

Here are the locks held by SMON for what it's worth:

ADDR             KADDR                   SID TY        ID1        ID2      LMODE    REQUEST      CTIME      BLOCK
---------------- ---------------- ---------- -- ---------- ---------- ---------- ---------- ---------- ----------
C000000028C43CD8 C000000028C43CF8          8 TT         74         16          4          0      59610          0
C000000028C43C68 C000000028C43C88          8 TS         74 -666633304          6          0      59650          0
C000000028C1CB38 C000000028C1CB58          8 ST          0          0          6          0         13          0

ST, Space Management Transaction
TS, Temporary Segment (also TableSpace)
TT, Temporary Table
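
(The listing above came from something along these lines:)

select l.addr, l.kaddr, l.sid, l.type, l.id1, l.id2,
       l.lmode, l.request, l.ctime, l.block
  from v$lock l, v$session s, v$process p, v$bgprocess b
 where l.sid   = s.sid
   and s.paddr = p.addr
   and p.addr  = b.paddr
   and b.name  = 'SMON';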

The same database had a recovery scenario about a month or so back due to some deleted data files.  Also, over the holidays we had to rebuild a huge global index and increased our temp tablespace to get that done, which is why we are trying to shrink it now by creating a new smaller one.  The index creation was taking forever in the existing dictionary managed tablespace so we ended up moving the index to a locally managed one.  Finally, UET$ has about 33,000,000 rows and most of the data is in dictionary managed tablespaces.  Queries to dba_free_space typically take 30 minutes to return - i.e. we know it is messed up and has been this way for a long time.

Fun for us.  If anyone has some insight that would be great.

- Bobby


Answers

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,566 Gold Crown

    Bobby,

    This is dredging up some old memories - but your comment about 33M rows in uet$ is significant.

    The query being run by smon is its attempt to find an extent belonging to a segment because it's dropping that segment - and it's dropping the extents one by one. If you've managed to create some very large segments in the past it's going to take a very long time to tidy up the mess, and while it's happening all sorts of other space management events are going to run into problems.  If you bounce the database smon will just start off doing the same again within a couple of minutes of startup - or possibly straight away.

    There is an event you can set that will stop smon clearing temporary segments - 10061 I think - this may let you work around some of your problems, but you'll then have problems with space that will never go away - and you'll have to get on to Oracle.  It may be an acceptable short-term fix, though - the database will be working, even if it's growing when it shouldn't be.
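
    (From memory the init.ora entry looks something like this - check the exact syntax and level with Oracle before using it:)

    # init.ora - stop smon cleaning up temporary extents; remove when done
    event = "10061 trace name context forever, level 10"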

    (On one occasion I know I deleted about 400,000 rows from uet$ and fixed up all the associated rows in seg$, fet$ and somewhere else - but I can't recall the details)

    Regards

    Jonathan Lewis

  • Bobby Durrett
    Bobby Durrett Member Posts: 140 Blue Ribbon

    Jonathan,

    Thanks for your reply.  We did bounce the database and as you said it didn't help.  I think the final result of our support ticket will be something like what you describe but it is good to get some input outside of support.

    - Bobby

  • Suntrupth
    Suntrupth Member Posts: 482

    Hi Bobby,

    As you are dealing with Temp tablespaces, SMON cleanup of segments can cause issues as it will not service sort segment requests.

    As Jonathan has already pointed out, you can set event 10061 and this will stop SMON from cleaning up temp extents. This will leave used (unavailable) space in the tablespace in which the extents reside. You can then point users to another temporary tablespace if necessary. The database can then be restarted off-peak without 10061 and SMON will clean up the temporary extents.

    A shutdown abort will succeed, but as soon as the database has been started again, SMON will carry on where it left off. Drop the tablespace in which the extents reside. This can be done after disabling SMON with 10061. The user process performing the drop will do the cleanup. However, this appears to be no quicker than SMON.

    Just for your information, you can also check whether SMON is coalescing free extents by running "select count(*) from dba_free_space;" a couple of times at 5-10 minute intervals. If you observe the count dropping, most likely SMON is coalescing as well.

    You can disable this with the 10269 event. If SMON has started to coalesce and you wish to disable it, you will have to set 10269 in the parameter file and restart the database.
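
    (Something like this in the init.ora, from memory - please verify with Support. If 10061 is already set, the event parameters should sit on adjacent lines or be combined on one line separated by a colon:)

    event = "10269 trace name context forever, level 10"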

    Regards,

    Suntrupth

  • Bobby Durrett
    Bobby Durrett Member Posts: 140 Blue Ribbon
    edited January 2014

    Suntrupth,

    Thanks for your reply.  We are getting some similar feedback from Oracle support.  My on-call coworker is working the sev 1 ticket now.  The interesting thing is that I was doing counts on uet$ and fet$ and not seeing any change. If smon is freeing extents it is taking its time on each one.

    Here are uet$ and fet$ counts about 13 hours apart:

    10:42

    select count(*) from SYS.UET$;

      COUNT(*)
    ----------
      35881366

    Elapsed: 00:00:41.03

    select count(*) from SYS.FET$;

      COUNT(*)
    ----------
        934720

    23:15:20

    select count(*) from SYS.UET$;

      COUNT(*)
    ----------
      35881367

    select count(*) from SYS.FET$;

      COUNT(*)
    ----------
        934720

    So, UET$ went up by one row and FET$ stayed the same between 10:42 am and 11:15 pm CST.

    - Bobby

  • Suntrupth
    Suntrupth Member Posts: 482

    Hi Bobby,

    Gathering an errorstack and a 10046 on the SMON session performing the cleanup should provide more details.

    SMON does this: every 5 minutes it looks at FET$ to find any free extents to coalesce, and roughly every 2 hours it checks SEG$ to see if it can clean up any temporary segments.

    This should show how many items need cleaning:

    select count(*) from user$ u, obj$ o where u.user#(+)=o.owner# and o.type#=10 and not exists (select p_obj# from dependency$ where p_obj# = o.obj#) ;

    Best bet would be to set the events, and have them cleaned later.

    Regards,

    Suntrupth

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,566 Gold Crown

    You could try a 10046 trace at level 12 for a few minutes to see how many times smon executes that query in the time, whether the input bind values are changing, and whether smon finds any rows that match the inputs.  It may be that your uet$ has become corrupted and left smon running an infinite loop as a consequence. It may be that smon is simply taking a huge amount of time on each extent because it has to read so many cluster blocks each time it runs the query.
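
    (Roughly, from memory, you would attach to smon with oradebug - try it on a test system first:)

    -- as SYSDBA / internal, find smon's OS process id
    select p.spid
      from v$process p, v$bgprocess b
     where p.addr = b.paddr
       and b.name = 'SMON';

    oradebug setospid <spid>
    oradebug event 10046 trace name context forever, level 12
    -- let it run for a few minutes, then
    oradebug event 10046 trace name context off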

    Regards

    Jonathan Lewis

  • Bobby Durrett
    Bobby Durrett Member Posts: 140 Blue Ribbon

    Suntrupth,

    Here is the output of the query you posted:

    select count(*)
      2  from sys.user$ u, sys.obj$ o
      3  where u.user#(+)=o.owner# and
      4  o.type#=10 and
      5  not exists
      6  (select p_obj# from sys.dependency$
      7  where p_obj# = o.obj#) ;

      COUNT(*)
    ----------
             0

    Thanks again for your reply.

    - Bobby

  • Bobby Durrett
    Bobby Durrett Member Posts: 140 Blue Ribbon

    Jonathan,

    Here is part of the 10046 trace.  It looks like this is the tablespace that we had trouble recreating the global index in.  We had dropped a number of older partitions on the base table and couldn't get the unusable index rebuilt in its dictionary managed tablespace.  According to a coworker of mine the space from the tablespace in question - about 500gig - has never been freed so that makes sense.  Oracle support made reference to a bug related to failed index builds so it may be related.

    Here is the output:

    *** 2014-01-22 02:24:47.928

    *** SESSION ID:(8.1) 2014-01-22 02:24:47.905

    =====================

    PARSING IN CURSOR #2 len=94 dep=1 uid=0 oct=3 lid=0 tim=1598955320 hv=1839874543 ad='34c149b0'

    select file#,block#,length from uet$ where ts#=:1 and segfile#=:2 and segblock#=:3 and ext#=:4

    END OF STMT

    PARSE #2:c=0,e=2,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=0,tim=1598955320

    BINDS #2:

    bind 0: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=08 oacfl2=1 size=24 offset=0

       bfp=80000001000e9bc8 bln=22 avl=02 flg=05

       value=74

    bind 1: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=08 oacfl2=1 size=24 offset=0

       bfp=80000001000e9b98 bln=24 avl=03 flg=05

       value=865

    bind 2: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=08 oacfl2=1 size=24 offset=0

       bfp=80000001000e9b68 bln=24 avl=04 flg=05

       value=261032

    bind 3: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=08 oacfl2=1 size=24 offset=0

       bfp=80000001000e9b38 bln=24 avl=04 flg=05

       value=176590

    EXEC #2:c=0,e=1,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,tim=1598955321

    FETCH #2:c=269,e=267,p=0,cr=52299,cu=0,mis=0,r=0,dep=1,og=4,tim=1598955588

    STAT #2 id=1 cnt=0 pid=0 pos=0 obj=13 op='TABLE ACCESS CLUSTER UET$ '

    STAT #2 id=2 cnt=2 pid=1 pos=1 obj=9 op='INDEX UNIQUE SCAN '

    =====================

    PARSING IN CURSOR #2 len=94 dep=1 uid=0 oct=3 lid=0 tim=1598955588 hv=1839874543 ad='34c149b0'

    select file#,block#,length from uet$ where ts#=:1 and segfile#=:2 and segblock#=:3 and ext#=:4

    END OF STMT

    PARSE #2:c=0,e=0,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,tim=1598955588

    BINDS #2:

    bind 0: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=08 oacfl2=8000000100000001 size=24 offset=0

       bfp=80000001000e9bc8 bln=22 avl=02 flg=05

       value=74

    bind 1: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=08 oacfl2=8000000100000001 size=24 offset=0

       bfp=80000001000e9b98 bln=24 avl=03 flg=05

       value=865

    bind 2: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=08 oacfl2=8000000100000001 size=24 offset=0

       bfp=80000001000e9b68 bln=24 avl=04 flg=05

       value=261032

    bind 3: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=08 oacfl2=8000000100000001 size=24 offset=0

       bfp=80000001000e9b38 bln=24 avl=04 flg=05

       value=176589

    EXEC #2:c=0,e=0,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,tim=1598955588

    FETCH #2:c=269,e=269,p=0,cr=52299,cu=0,mis=0,r=0,dep=1,og=4,tim=1598955857

    STAT #2 id=1 cnt=0 pid=0 pos=0 obj=13 op='TABLE ACCESS CLUSTER UET$ '

    STAT #2 id=2 cnt=2 pid=1 pos=1 obj=9 op='INDEX UNIQUE SCAN '

    =====================

    PARSING IN CURSOR #2 len=94 dep=1 uid=0 oct=3 lid=0 tim=1598955857 hv=1839874543 ad='34c149b0'

    select file#,block#,length from uet$ where ts#=:1 and segfile#=:2 and segblock#=:3 and ext#=:4

    END OF STMT

    PARSE #2:c=0,e=0,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,tim=1598955857

    BINDS #2:

    bind 0: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=08 oacfl2=8000000100000001 size=24 offset=0

       bfp=80000001000e9bc8 bln=22 avl=02 flg=05

       value=74

    bind 1: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=08 oacfl2=8000000100000001 size=24 offset=0

       bfp=80000001000e9b98 bln=24 avl=03 flg=05

       value=865

    bind 2: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=08 oacfl2=8000000100000001 size=24 offset=0

       bfp=80000001000e9b68 bln=24 avl=04 flg=05

       value=261032

    bind 3: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=08 oacfl2=8000000100000001 size=24 offset=0

       bfp=80000001000e9b38 bln=24 avl=04 flg=05

       value=176588

    EXEC #2:c=0,e=0,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,tim=1598955857

    FETCH #2:c=270,e=270,p=0,cr=52299,cu=0,mis=0,r=0,dep=1,og=4,tim=1598956127

    STAT #2 id=1 cnt=0 pid=0 pos=0 obj=13 op='TABLE ACCESS CLUSTER UET$ '

    STAT #2 id=2 cnt=2 pid=1 pos=1 obj=9 op='INDEX UNIQUE SCAN '

  • Bobby Durrett
    Bobby Durrett Member Posts: 140 Blue Ribbon

    I think this is the problematic temporary segment that SMON is struggling to clean up:

    select
      2  segment_name,
      3  segment_type,
      4  bytes/(1024*1024*1024) gigabytes
      5  from dba_segments
      6  where
      7  tablespace_name='NOTTHEREALONE' and
      8  header_file=865;

    SEGMENT_NAME                                                                      SEGMENT_TYPE        GIGABYTES
    --------------------------------------------------------------------------------- ------------------ ----------
    865.261032                                                                        TEMPORARY          521.414948

    The tablespace name I really used corresponded to ts#=74 in v$tablespace.

    We are thinking of rebuilding the other indexes in this tablespace in a new locally managed tablespace and then dropping the existing dictionary managed tablespace.  My only question is whether some corruption will cause the drop tablespace to fail.

    - Bobby

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,566 Gold Crown

    bind 3: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=08 oacfl2=8000000100000001 size=24 offset=0

       bfp=80000001000e9b38 bln=24 avl=04 flg=05

       value=176588

    EXEC #2:c=0,e=0,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,tim=1598955857

    FETCH #2:c=270,e=270,p=0,cr=52299,cu=0,mis=0,r=0,dep=1,og=4,tim=1598956127



    Smon is deleting the extents in reverse order - since this is 8.1 the ela is centiseconds, so you've got 176,588 extents to go at 2.7 seconds per extent; total 4.77M seconds or fifty five days.


    Check this with Oracle, but I think what I did was:


    Take a backup of the database.

    Set event 10061 to stop smon from trying to clear up.

    Bounce the database

    Query uet$ for all the extents in this segment that were adjacent.

    Delete all the uet$ rows for the adjacent ones (couple of hundred thousand) and insert one uet$ entry that described the entire length of the set deleted.

    All the uet$ entries for the segment may have to be renumbered - except I think I got lucky and didn't have to do this because my 200,000 turned into "the last one".

    Update seg$ with the new number of extents.

    This left the segment with a "sane" number of extents for smon to delete properly, so flush the uet$ blocks from the cache and restart.
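
    (Purely as a sketch of the "adjacent extents" step - the predicates would need checking, and none of this should be touched without Support:)

    -- extents whose end block is exactly the start of another extent in the same file
    select a.ext#, a.file#, a.block#, a.length, b.ext# next_ext#
      from sys.uet$ a, sys.uet$ b
     where a.ts#       = 74
       and a.segfile#  = 865
       and a.segblock# = 261032
       and b.ts#       = a.ts#
       and b.segfile#  = a.segfile#
       and b.segblock# = a.segblock#
       and b.file#     = a.file#
       and b.block#    = a.block# + a.length;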

    Regards

    Jonathan Lewis

  • Bobby Durrett
    Bobby Durrett Member Posts: 140 Blue Ribbon

    Jonathan,

    Thanks again for your reply.  It sounds like you were manually coalescing the extents.  We will work it through with support.  There are only a few gigabytes of indexes in the tablespace in question so we may just rebuild them somewhere else and then drop the tablespace rather than messing with the uet$ table.  But, we will get Oracle support's direction either way.

    - Bobby

    p.s. I think I may have a grasp on why each query is taking 2.7 seconds.  It looks to me like the C_FILE#_BLOCK# cluster  that uet$ is part of has 122908 blocks on this system and the trace shows cr=52299 on the fetch so that means that the number of cr blocks scanned is about 42% of the number of blocks in the cluster.  I'm guessing that when you have segments with many extents there is a lot of chaining leading to many blocks of the cluster being visited for each query execution.  I looked at the cluster definition and it has size 250 so maybe that is too small when you have thousands of extents.
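
    (For reference, those figures came from roughly these queries:)

    select blocks
      from dba_segments
     where owner = 'SYS'
       and segment_name = 'C_FILE#_BLOCK#'
       and segment_type = 'CLUSTER';

    select key_size
      from dba_clusters
     where owner = 'SYS'
       and cluster_name = 'C_FILE#_BLOCK#';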

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,566 Gold Crown

    Your comment about the cluster and block visits is correct - if you look at dcore.bsq you'll see the (Oracle 5, maybe 6) assumption still recorded:

    create cluster c_file#_block#(ts# number, segfile# number, segblock# number)

      size 225    /* cluster key ~ 25, sizeof(seg$) ~ 50, 5 * sizeof(uet$) ~ 150 */

      storage (initial 20K)               /* avoid space management during IOR I */

    /

    5 extents per segment on average!

    That doesn't mean there's a limit of 5 rows for any one segment in a block, of course.

    The numbers don't really add up - at 150 bytes per row you get about 50 rows per block, which would make 52,000 blocks (your CRs) equate to about 2.7M rows in uet$ - but the rate at which you're clearing them is too slow at present for you to have deleted down to 176,000 already so (a) you must have been deleting them much more quickly to start with and (b) the current CRs might also include a large number of undo blocks.

    That's all academic, of course; the real problem is how to resolve the mess as rapidly as possible - and I'm not sure you'll be able to get rid of the tablespace without getting rid of the uet$ entries. So if you can't work around deleting them by hand, have you considered exporting the entire database and creating a new one? It might be the safest, fastest thing to do.

    Regards

    Jonathan Lewis

  • Bobby Durrett
    Bobby Durrett Member Posts: 140 Blue Ribbon

    Jonathan,

    Thanks again for your reply.  It all makes good sense.  I'm concerned too about whether dropping the tablespace would clean up uet$ versus doing the manual update.

    It is a 4 terabyte database so export/import would take some time but it would definitely be nice to move to a cleaner environment if we could.

    - Bobby

  • Bobby Durrett
    Bobby Durrett Member Posts: 140 Blue Ribbon

    I think I figured out why our fet$ and uet$ counts didn't go down yesterday.  It looks like after the database restart smon starts over with the last extent even though it isn't on uet$.  I did this query to find the first and last ext# for the segment that smon is trying to clean up:

    select
      2  min(ext#),
      3  max(ext#)
      4  from sys.uet$
      5  where
      6  ts#=74 and
      7  segfile#=865 and
      8  segblock#=261032;

    MIN(EXT#)  MAX(EXT#)
    ---------- ----------
             0      63063

    But, if you look at the smon trace from yesterday it is working on extent 176590:

    *** 2014-01-22 02:24:47.928

    *** SESSION ID:(8.1) 2014-01-22 02:24:47.905

    =====================

    PARSING IN CURSOR #2 len=94 dep=1 uid=0 oct=3 lid=0 tim=1598955320 hv=1839874543 ad='34c149b0'

    select file#,block#,length from uet$ where ts#=:1 and segfile#=:2 and segblock#=:3 and ext#=:4

    END OF STMT

    PARSE #2:c=0,e=2,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=0,tim=1598955320

    BINDS #2:

    bind 0: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=08 oacfl2=1 size=24 offset=0

       bfp=80000001000e9bc8 bln=22 avl=02 flg=05

       value=74

    bind 1: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=08 oacfl2=1 size=24 offset=0

       bfp=80000001000e9b98 bln=24 avl=03 flg=05

       value=865

    bind 2: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=08 oacfl2=1 size=24 offset=0

       bfp=80000001000e9b68 bln=24 avl=04 flg=05

       value=261032

    bind 3: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=08 oacfl2=1 size=24 offset=0

       bfp=80000001000e9b38 bln=24 avl=04 flg=05

       value=176590

    I got the idea of SMON starting over from this blog post:

    https://blogs.oracle.com/AlejandroVargas/entry/after_production_upgrade_thoug

    Here is a relevant quote: "Shutdown abort in this case was a bad idea, on startup it forced smon to scan all extents already set, a slow and painful process... trace of smon showed that it scanned uet$ row by row reviewing all the job that was already completed"

    Hopefully this means that uet$/fet$ are not corrupt and we just need to clean this up through some normal process but we will see.

    - Bobby

  • Suntrupth
    Suntrupth Member Posts: 482
    Accepted Answer
    > The tablespace name I really used corresponded to ts#=74 in v$tablespace.
    >
    > We are thinking of rebuilding the other indexes in this tablespace in a new locally managed tablespace and then dropping the existing dictionary managed tablespace.  My only question is whether some corruption will cause the drop tablespace to fail.

    Hi Bobby,

    I am not sure about your current situation now; I hope it's better.

    I couldn't find time to reply back yesterday. Looking at the huge amount of temporary segment space in your tablespace (74), it got me thinking whether you, or Oracle Support, are aware of the DROP_SEGMENTS event, which users may invoke to clear temporary segments. It drops the temporary segments just like SMON does in the background, with the advantage of not consuming SMON's CPU.

    Note 47400.1 : EVENT: DROP_SEGMENTS - Forcing cleanup of TEMPORARY segments
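
    (From memory the invocation looks like this, with the level being ts# + 1 - so 75 for tablespace 74 - but please check the note before trying it:)

    alter session set events 'immediate trace name DROP_SEGMENTS level 75';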

    There is also a method of patching (making changes to) the dictionary, which is not recommended otherwise. You could seek help from Oracle Support.

    The method is to identify the segment (which you already have) and update the segment type from temporary to 999.

    Note 388619.1: Last resort when SMON takes 100% CPU

    Regards,

    Suntrupth

  • Bobby Durrett
    Bobby Durrett Member Posts: 140 Blue Ribbon

    Suntrupth,

    I'll take a look at those notes.  At this point we are running fine with 10061 and 10269 events set so there isn't a crisis.  But, of course we need to clean things up eventually so we are planning our next move.  But, no one stayed up late last night working on it since the system is running.

    Thanks again for your input.

    - Bobby

  • Bobby Durrett
    Bobby Durrett Member Posts: 140 Blue Ribbon

    I put this question in the SR if anyone wants to see it:

    Do you know whether dropping the tablespace would still cause the SS enqueues? The thing about this is that the index rebuild failed around December 25 and we didn't try to drop the temporary tablespace until January 18th, so it seems that SMON spinning on the cleanup didn't cause the SS enqueues until we tried to drop the temporary tablespace. But it is hard to see how dropping the tablespace is related to the cleanup of the failed index creation's temporary segment. I think it must be that when we tried to drop the temporary tablespace SMON had to clean up the sort segment that was in it, and this cleanup was hung up behind the cleanup of the temporary segment. But then we dropped the temporary tablespace with the 10061 and 10269 events set and yet were still seeing the SS enqueues (I think), so I'm not sure why that would be. If the drop tablespace cleanup doesn't affect anything else we can just run sqlplus with nohup and let it run for two months (we estimate that it will take 55 days).

    Another thing I'm not sure about is why, after dropping the temporary tablespace with the 10061 and 10269 events set and then removing the events and restarting, we are still having application issues. I didn't actually observe the SS enqueues recurring this last time we brought the database up after dropping the temporary tablespace and with the events taken out of the init.ora. But we had complaints from the users, and these went away when we put 10061 and 10269 back in and restarted the database. Strange, especially since the temporary tablespace was locally managed anyway.

    The point here is that there are two seemingly unrelated tablespaces - the dictionary managed index tablespace with a 500 gig temporary segment with 177,000 extents, and a locally managed temporary tablespace that we have dropped. The SMON cleanup of the temporary segment wasn't causing system issues until we tried to drop the temporary tablespace on Saturday, but why are they related, and can we clean up the temporary segment by dropping the index tablespace and letting it take forever to clean up in the background? I.e. will we start having sessions hang on SS enqueues if the server process doing the drop of the index tablespace cleans up uet$ instead of SMON?

    - Bobby

  • Suntrupth
    Suntrupth Member Posts: 482

    Hi Bobby,

    Well, if a process requires a sort segment, it needs the SS enqueue.

    When SMON is trying to clean up your dead transaction (the failed rebuild operation), it may not service other work like cleaning up temporary space or handling sort segment requests.

    If you have removed the events, you are making SMON do your cleanup again. Why not keep these events set and let the server process perform the cleanup so that you do not face any SS enqueues?

    I believe Jonathan had suggested dictionary patching by updating seg$ as described in the note that I referred to previously.

    I don't think you would be able to drop the index tablespace as you still have temporary segments that need cleanup.

    Regards,

    Suntrupth

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,566 Gold Crown

    Bobby,

    A thought via Twitter from Mark Farnham - if you do an "alter tablespace offline" then "drop tablespace including contents and data files", Oracle may have a fast way to eliminate all the entries from uet$ and fet$, because it doesn't have to move rows from uet$ into fet$ and check adjacency - it simply has to "delete where ts# = ".

    Like me, he was working from memory from a long time ago - so can't be certain of details: check with support, model on a spare database, and take a backup of production before trying it if you think it's worth trying.
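
    (In outline, using the anonymised tablespace name from your earlier post, something like the following - and I think the "and datafiles" clause only arrived in 9i, so the files would have to be removed at the OS level afterwards:)

    alter tablespace NOTTHEREALONE offline;
    drop tablespace NOTTHEREALONE including contents;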

    Regards

    Jonathan Lewis

  • Bobby Durrett
    Bobby Durrett Member Posts: 140 Blue Ribbon

    Jonathan,

    Mark's comments about the alter offline and drop ring a bell.  I think early on, researching this on MOS, I got the idea of the tablespace drop from an Oracle support document.  We have a development database of the same version.  I've been thinking about trying to recreate the situation there by creating a table with millions of small extents to fill up UET$, etc.  At this point turning off the SMON cleanup isn't hurting us in production so I may have time to do some testing on our dev database.  Will definitely verify a good backup of prod prior to doing anything radical.

    - Bobby

  • pudge
    pudge Member Posts: 60

    After a bit of out-of-forum "dinosaur" crosstalk with JL, I believe you are trying to solve two problems: 1) get rid of an old TEMP tablespace, and 2) clean up uet$/fet$ efficiently from one or more broken/crashed 999.99999 named temporary objects that are probably residue of the failed completion of creating the final image / renaming the object to the index in some user tablespace.

    From the thread it seems you are prepared to recreate any and all contents of the tablespace #74. That, or saving any source of  data objects (meaning clusters, tables, and iots [which are tables, doggone it, but some people think they are indexes]) and being willing to recreate any dependent objects such as indexes and materialized views that may physically exist in that tablespace.

    IF (and only if) that is true, I believe your likely best option is to take tablespace #74 offline and then drop that tablespace including contents. This should bypass the "careful" (and pessimal) algorithm for multiuser consistency of uet$/fet$ entries that has to be done for dictionary tablespaces that are open in favor of a straight delete by the file numbers of the components of the tablespace. This should work correctly without you having to do any rogue surgery directly on uet$/fet$. Of course, at the end of this operation everything in tablespace #74 will be GONE.

    As for the temporary tablespace (if it is having any effect on uet$/fet$, being the antique permanent sort of tablespace holding temp bits), then you should be able to do the same thing for it.

    Given the luxury of an instance stop, and acquiescence to the caveats above, probably together with a backup if this is production, I'd be inclined to startup mount, alter the datafiles component of the tablespaces in question offline, startup restricted, verify that the datafiles being offline left the tablespaces in question offline, so smon is not trying to molest them, and then execute the alter tablespaces drop including contents on first the user table and then the deprecated temp.

    With the noted huge numbers of rows, this won't be instantaneous, but it should operate at a reasonable speed. This will not shrink the cluster containing uet$/fet$, but when the file numbers are re-used at least you'll re-use that space.

    It's been a while, but as I recall that is the optimal scenario for clearing huge numbers of rows from uet$/fet$ quickly, in a supported fashion, from v6.0 through - hmm - still, for dictionary managed tablespaces. As far as I know they never improved that online dictionary free-extents code, in favor of working on locally managed tablespaces.

    Good luck.

    mwf

  • Bobby Durrett
    Bobby Durrett Member Posts: 140 Blue Ribbon

    Thanks for your reply.  Your recommendation is similar to what we had in mind, but the difference is that it doesn't require any weird events being set to accomplish it.  Our plan was to move all the objects out of the tablespace with the 10061 and 10269 events set to stop SMON from working.  Your idea of taking the tablespace offline before opening the database is nice.  Cleaner for sure.  The temp tablespace is already dropped.  It was locally managed so there is no uet$/fet$ cleanup there.  But, we want a way to clean up the index tablespace after moving all the indexes out of it - only a few gigabytes are in there out of the 500 or so of space allocated, so it shouldn't take long to move the indexes.

    I'm building a small test database now so I'll play with it there.

    - Bobby

  • Bobby Durrett
    Bobby Durrett Member Posts: 140 Blue Ribbon

    From my tests it appears that dropping the tablespace is no more efficient than the smon cleanup.  It runs the same statements against uet$:

    select file#,block#,length from uet$ where ts#=:1 and segfile#=:2 and segblock#=:3 and ext#=:4

    So the drop should take a number of weeks.

    It is possible we could run with the events set for that long and just run the drop tablespace in the background.  I'm just not sure if it would impact something else.

    - Bobby

  • Bobby Durrett
    Bobby Durrett Member Posts: 140 Blue Ribbon

    Well, it looks like we will probably do the following:

    -- do all these steps as SYSDBA

    -- verify that events 10061 and 10269 are set:

    show parameter event

    shutdown immediate

    startup restrict

    update seg$ set type#=999 where type#=3 and file#=865 and block#=261032 and ts#=74;

    -- if only one row updates (it should)

    commit;

    shutdown abort

    -- must be abort

    -- take out events

    startup

    -- check if all seems ok

    shutdown immediate

    startup

    This is from Oracle's document "Last resort when SMON takes 100% CPU (Doc ID 388619.1)"
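
    Before the update we will probably confirm that exactly one row matches, with something like:

    -- expect one row with type# = 3 (temporary)
    select ts#, file#, block#, type#, blocks
      from seg$
     where ts# = 74 and file# = 865 and block# = 261032;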

    Oracle support is encouraging us to upgrade to 11.2.0.4, which makes sense.  But I guess the "last resort" is to change the segment type so SMON ignores it.

    But, I'm not sure how this will affect our issue with SS enqueues and sort segment request waits.

    - Bobby

  • Bobby Durrett
    Bobby Durrett Member Posts: 140 Blue Ribbon

    Hopefully my last update until we make the "last resort" change.  I think I understand the sort waits we were seeing.  I believe there were two situations.

    1  - SMON spinning when we dropped a locally managed temporary tablespace

    Since smon was busy it couldn't clean up the sort segments, and user sessions that needed access to a sort segment hung.

    2 - SMON spinning after bouncing database

    SMON has to do something after a database bounce to let users start using a sort segment.  Looks like a busy SMON will lock out user sessions until SMON "adopts" the sort segment.

    The document I saw relating to case 1 regarding dropping temporary tablespaces, sort segments, and a busy SMON was this:

    Diagnosing BUG 5016142 enqueue ENQ - SS contention (Doc ID 416823.1)

    I'm not 100% sure on 2 but it is similar to this case where SMON is doing transaction rollback after a bounce:

    Bug 1809682 : NEED SORT SEGMENT ... SMON IS BUSY DOING ROLLBACK

    My point here is that we had SMON spinning for three weeks with no sort waits, but when we tried to drop a temporary tablespace they started.  Then after setting the 10061 and 10269 events we were able to drop the temporary tablespace completely.  But, when we took the events back out we saw sort waits again!  But, the second set of waits was because we restarted the database and SMON was spinning from the beginning and never opened up the sort segment to the users.

    - Bobby

This discussion has been closed.