4 Replies Latest reply: Feb 13, 2014 10:18 PM by 972061 RSS

DB look like hang, Only sys user can connect to DB but can't query anything

972061 Newbie
Currently Being Moderated

Dear all,

 

On 13/Feb/2014@8.02AM13/Feb/2014@8.02AM, I'm found problem about oracle database (10.2.0.4) on Solaris10(Sparc) can connect only user sys (user system and all can't)    

sqlplus promp is not immediately return can't connect. It's pending long time and return "connection lost"    

I'm check on alert_SID.log and not found any error/warning    

    

Problem Information:    

1. Not involve about tns configuration or network. Cause of this problem still occur when logon by other user (not sys) at DB Server machine too.   

     and we can connect sys as sysdba from client by tns alias   

2. After connect to database by sys user. We can't query anything (such as select * from dba_tablespaces is freezing)   

3. User system wih role DBA can't connect too.   

4. I'm check udump, bdump, cdump, adump for useful trace information. And not found anything durning this error   

    

My Action:    

- Restart listener, It's still problem   

- Shutdown database by immediate. It's hanging after step "Stopping background process MMNL"   

   so, I'm shutdown abort and startup. Problem is clear   

    

Thu Feb 13 08:28:16 2014   

Shutting down instance: further logons disabled   

Thu Feb 13 08:28:16 2014   

Stopping background process CJQ0   

Thu Feb 13 08:28:16 2014   

Stopping background process QMNC   

Thu Feb 13 08:28:18 2014   

Stopping background process MMNL    <-- this is last step show on alert.log

    

Thu Feb 13 08:30:08 2014   

Shutting down instance (abort)   

License high water mark = 2926   

Instance terminated by USER, pid = 23752   

Thu Feb 13 08:30:24 2014   

Starting ORACLE instance (normal)   

    

Other action before problem.    

We have resize datafile on 7:49AM with complete state and I think not cause of problem.   

Thu Feb 13 07:49:12 2014   

/* OracleOEM */ ALTER DATABASE DATAFILE '/data/appdb/ar_data01.dbf' RESIZE  32767M   

Thu Feb 13 07:49:43 2014   

Completed: /* OracleOEM */ ALTER DATABASE DATAFILE '/data/appdb/ar_data01.dbf' RESIZE  32767M   

    

My investigate   

1 First I found about this note "No Database User Can Login Except Sys And System because Resource Manager Internal_Quiesce Plan Enabled (Doc ID 396970.1)"   

But after check scheduler window. It's not enable.   

I'm already test change Devlop Database to "QUIESCE mode". It's can login only sys. Other user can't login. But not same my situation. Because user sys   

can query and opeartion normal (my situation user sys is hang after query too)   

    

2 I'm check statistic of server (Solaris10) resource during hang. It's not peak anything (cpu, mem, iostat is OK)   

I'm check /var/adm/message, /var/log/syslog. It's not show any information   

    

Please help me investigate. I'm already open SR to Oracle support. But they ask me very simple missing point such as Tns configuration, connection string.. (-_-).    

    

Thank you    

  • 1. Re: DB look like hang, Only sys user can connect to DB but can't query anything
    rp0428 Guru
    Currently Being Moderated

    Could be a lot of things. Logs getting full and waiting for the archive process but a remote location not being available.

     

    Post info about the status of the logging and archiving configuration

  • 2. Re: DB look like hang, Only sys user can connect to DB but can't query anything
    972061 Newbie
    Currently Being Moderated

    Hi rp0428

     

    On lgwr trace in udump before hang. I found only "Warning: log write time xxx" which is normally happend long time ago by not effect system.

    *** 2014-02-13 06:11:31.274

    Warning: log write time 1890ms, size 5KB

    *** 2014-02-13 06:13:13.772

    Warning: log write time 1810ms, size 9KB

     

    About archive log, I'm check destination and free space of archive log. All of it's ok. We not change anything about config

    SQL> show parameter archive

    NAME                                 TYPE                             VALUE
    ------------------------------------ -------------------------------- ------------------------------
    archive_lag_target                   integer                          0
    log_archive_config                   string
    log_archive_dest                     string
    log_archive_dest_1                   string                           LOCATION=/orac/appdb/archive
    log_archive_dest_state_1             string                           enable
    log_archive_duplex_dest              string
    log_archive_format                   string                           %t_%s_%r.dbf
    log_archive_local_first              boolean                          TRUE
    log_archive_max_processes            integer                          2
    log_archive_min_succeed_dest         integer                          1
    log_archive_start                    boolean                          FALSE
    log_archive_trace                    integer                          0
    remote_archive_enable                string                           true
    standby_archive_dest                 string                           ?/dbs/arch
    However we found some post recommend increase number of log_archive_max_process

     

    More about archive, I have manual shell script force apply archive log every 5 Minute by crontab (by "alter system switch logfile" and "alter system archive log all;")

    This script starting around 1 month ago with no problem.

    Do you thinkg It's possible to 1 time apply 2 statement not complete in 5 minute and next interval try to apply and hang?

    However I think if that should be any alert/warning in alert_SID.log in this case.

     

    Please recommend if need more information:

     

    Thank you

  • 3. Re: DB look like hang, Only sys user can connect to DB but can't query anything
    Suntrupth Expert
    Currently Being Moderated

    Did you try gathering a systemstate dump at the time of the issue? This might have given you some clue.

     

     

    Did you check if there were any active transactions in your DB prior to the shutdown?

     

     

    "Stopping background process MMNL" - usually are just informational in nature and not the cause of a hang.

     

     

    "I have manual shell script force apply archive log every 5 Minute ( by

    "alter system switch logfile" and "alter system archive log all;")"

     

     

    - Any reason why you force log switches in such small intervals? Oracle recommends to switch logs with about 15-20 minutes intervals.

     

     

    Did you check if AWR snapshots were captured at the time of the issue? If yes, generate a report and check possible issues.

     

     

    Regards,

    Suntrupth

  • 4. Re: DB look like hang, Only sys user can connect to DB but can't query anything
    972061 Newbie
    Currently Being Moderated

    Hi Suntrupth,

     

    1. I'm never know about systemstate dump before. I'm just learning after problem happen and plan to keep this if found again follow by

    (How to Collect Diagnostics for Database Hanging Issues (Doc ID 452358.1)) <-- Still not clear how to, This doc keep dump systemstate by other session. It's can keep current problem?

     

    2. I can't check anything cause of after login to sys as sysdba at DB Local machine. It's can't query anything. It's freeze after issue statement.

     

    3. I'm log switch every 5 min cause of I'm need send archive log to DR-Site.

    I'm use concept of DataGuard by create clone of Prod by RMAN and set shell script to manual apply log every 5 minute. Use shell script check new log and rysnc send log to DR-Site destination.

    Use shell script at destination to apply archive log. So I will have DR which is sync with Production not less than 5 minute

     

    4. About auto hourly AWR Snapshot I'm found it's incomplete record during DB Hang (7.00-8.30AM). But I think cause we have shutdown abort around 8.28AM.

    So, It's can't keep end time. Please see

     

    Troubleshooting: AWR Snapshot Collection issues (Doc ID 1301503.1)

    select * from DBA_HIST_SNAPSHOT order by snap_id desc;
    select * from WRM$_SNAPSHOT order by 5 desc;
    SNAP_IDINSTANCE_NUMBERBEGIN_INTERVAL_TIMEFLUSH_ELAPSEDSTATUSERROR_COUNTSNAP_FLAG
    40966113/02/2014 05.00.52.248000000 AM+00 00:00:01.800000000
    40967113/02/2014 06.00.53.157000000 AM+00 00:00:01.900000000
    40968113/02/2014 07.00.54.083000000 AM100
    40969113/02/2014 08.30.36.000000000 AM+00 00:00:06.400000000

     

    select * from wrm$_snap_error where dbid = (select dbid from v$database)  order by snap_id;

    I'm not found record which is prefer show error number.

     

    I'm don't know about how to generate report from snapshot. (In this case mean I will generate report from snap_id 40968? )

    Please recommend method or article about this.

     

    Thank you very much

    Best regards,

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points