sqlplus promp is not immediately return can't connect. It's pending long time and return "connection lost"
I'm check on alert_SID.log and not found any error/warning
1. Not involve about tns configuration or network. Cause of this problem still occur when logon by other user (not sys) at DB Server machine too.
and we can connect sys as sysdba from client by tns alias
2. After connect to database by sys user. We can't query anything (such as select * from dba_tablespaces is freezing)
3. User system wih role DBA can't connect too.
4. I'm check udump, bdump, cdump, adump for useful trace information. And not found anything durning this error
- Restart listener, It's still problem
- Shutdown database by immediate. It's hanging after step "Stopping background process MMNL"
so, I'm shutdown abort and startup. Problem is clear
Thu Feb 13 08:28:16 2014
Shutting down instance: further logons disabled
Thu Feb 13 08:28:16 2014
Stopping background process CJQ0
Thu Feb 13 08:28:16 2014
Stopping background process QMNC
Thu Feb 13 08:28:18 2014
Stopping background process MMNL <-- this is last step show on alert.log
Thu Feb 13 08:30:08 2014
Shutting down instance (abort)
License high water mark = 2926
Instance terminated by USER, pid = 23752
Thu Feb 13 08:30:24 2014
Starting ORACLE instance (normal)
Other action before problem.
We have resize datafile on 7:49AM with complete state and I think not cause of problem.
Thu Feb 13 07:49:12 2014
/* OracleOEM */ ALTER DATABASE DATAFILE '/data/appdb/ar_data01.dbf' RESIZE 32767M
Thu Feb 13 07:49:43 2014
Completed: /* OracleOEM */ ALTER DATABASE DATAFILE '/data/appdb/ar_data01.dbf' RESIZE 32767M
1 First I found about this note "No Database User Can Login Except Sys And System because Resource Manager Internal_Quiesce Plan Enabled (Doc ID 396970.1)"
But after check scheduler window. It's not enable.
I'm already test change Devlop Database to "QUIESCE mode". It's can login only sys. Other user can't login. But not same my situation. Because user sys
can query and opeartion normal (my situation user sys is hang after query too)
2 I'm check statistic of server (Solaris10) resource during hang. It's not peak anything (cpu, mem, iostat is OK)
I'm check /var/adm/message, /var/log/syslog. It's not show any information
Please help me investigate. I'm already open SR to Oracle support. But they ask me very simple missing point such as Tns configuration, connection string.. (-_-).
On lgwr trace in udump before hang. I found only "Warning: log write time xxx" which is normally happend long time ago by not effect system.
*** 2014-02-13 06:11:31.274
Warning: log write time 1890ms, size 5KB
*** 2014-02-13 06:13:13.772
Warning: log write time 1810ms, size 9KB
About archive log, I'm check destination and free space of archive log. All of it's ok. We not change anything about config
SQL> show parameter archive
NAME TYPE VALUE
------------------------------------ -------------------------------- ------------------------------
archive_lag_target integer 0
log_archive_dest_1 string LOCATION=/orac/appdb/archive
log_archive_dest_state_1 string enable
log_archive_format string %t_%s_%r.dbf
log_archive_local_first boolean TRUE
log_archive_max_processes integer 2
log_archive_min_succeed_dest integer 1
log_archive_start boolean FALSE
log_archive_trace integer 0
remote_archive_enable string true
standby_archive_dest string ?/dbs/arch
However we found some post recommend increase number of log_archive_max_process
More about archive, I have manual shell script force apply archive log every 5 Minute by crontab (by "alter system switch logfile" and "alter system archive log all;")
This script starting around 1 month ago with no problem.
Do you thinkg It's possible to 1 time apply 2 statement not complete in 5 minute and next interval try to apply and hang?
However I think if that should be any alert/warning in alert_SID.log in this case.
Please recommend if need more information:
Did you try gathering a systemstate dump at the time of the issue? This might have given you some clue.
Did you check if there were any active transactions in your DB prior to the shutdown?
"Stopping background process MMNL" - usually are just informational in nature and not the cause of a hang.
"I have manual shell script force apply archive log every 5 Minute ( by
"alter system switch logfile" and "alter system archive log all;")"
- Any reason why you force log switches in such small intervals? Oracle recommends to switch logs with about 15-20 minutes intervals.
Did you check if AWR snapshots were captured at the time of the issue? If yes, generate a report and check possible issues.
1. I'm never know about systemstate dump before. I'm just learning after problem happen and plan to keep this if found again follow by
(How to Collect Diagnostics for Database Hanging Issues (Doc ID 452358.1)) <-- Still not clear how to, This doc keep dump systemstate by other session. It's can keep current problem?
2. I can't check anything cause of after login to sys as sysdba at DB Local machine. It's can't query anything. It's freeze after issue statement.
3. I'm log switch every 5 min cause of I'm need send archive log to DR-Site.
I'm use concept of DataGuard by create clone of Prod by RMAN and set shell script to manual apply log every 5 minute. Use shell script check new log and rysnc send log to DR-Site destination.
Use shell script at destination to apply archive log. So I will have DR which is sync with Production not less than 5 minute
4. About auto hourly AWR Snapshot I'm found it's incomplete record during DB Hang (7.00-8.30AM). But I think cause we have shutdown abort around 8.28AM.
So, It's can't keep end time. Please see
Troubleshooting: AWR Snapshot Collection issues (Doc ID 1301503.1)
|select * from DBA_HIST_SNAPSHOT order by snap_id desc;|
|select * from WRM$_SNAPSHOT order by 5 desc;|
|40966||1||13/02/2014 05.00.52.248000000 AM||+00 00:00:01.800000||0||0||0|
|40967||1||13/02/2014 06.00.53.157000000 AM||+00 00:00:01.900000||0||0||0|
|40968||1||13/02/2014 07.00.54.083000000 AM||1||0||0|
|40969||1||13/02/2014 08.30.36.000000000 AM||+00 00:00:06.400000||0||0||0|
select * from wrm$_snap_error where dbid = (select dbid from v$database) order by snap_id;
I'm not found record which is prefer show error number.
I'm don't know about how to generate report from snapshot. (In this case mean I will generate report from snap_id 40968? )
Please recommend method or article about this.
Thank you very much