This discussion is archived
13 Replies Latest reply: Jan 8, 2013 4:00 AM by XIC

Oracle Standard Edition on Unix - kernel panic, Out of Memory

Veronica Newbie
Hi,

I have an Oracle Linux machine with nothing but an Oracle Standard Edition database on it.

The machine runs fine for 3-4 weeks. Then "Kernel panic - not syncing: Out of Memory and no killable process" occurs.

There is nothing else installed on the machine, so I assume the database is at fault and eventually consumes all available memory.

I have no DBA to help me. Could you please tell me what I should look into to resolve the problem?
  • 1. Re: Oracle Standard Edition on Unix - kernel panic, Out of Memory
    SalmanQureshi Expert
    Hi,
    You did not mention the Oracle version or the Linux version. How much RAM is on the server, and how much swap space is allocated? What is the SGA size set for the database, and how many concurrent sessions connect to it?
    Select your database/OS combination from this matrix, check the kernel settings recommended for your Oracle/Linux version, and make sure you have set the parameters correctly.

    http://oracle-base.com/articles/linux/articles-linux.php

    Salman
  • 2. Re: Oracle Standard Edition on Unix - kernel panic, Out of Memory
    XIC Newbie
    Hi,
    Can you please share the output of the commands below (if the O/S is Linux):

    1. uname -a
    2. cat /proc/cpuinfo
    3. cat /proc/meminfo
    4. top -20
    5. cat /etc/redhat-release

    You also need to check whether any kernel modification was made recently, and whether the kernel parameters are set properly in the /etc/sysctl.conf file. We need this information to solve the issue.

    Regards,
    XIC
  • 3. Re: Oracle Standard Edition on Unix - kernel panic, Out of Memory
    Veronica Newbie
    Sorry I didn't post any relevant details the first time. Here is the data I've collected. The database has been running for 3 hours now.

    Oracle Version:
    Oracle Database 11g Release 11.2.0.1.0 - Production
    RAM and swap on the machine (free -mt):
                 total       used       free     shared    buffers     cached
    Mem:          4022       3049        973          0        120       2597
    -/+ buffers/cache:        331       3691
    Swap:         6111          0       6111
    Total:       10134       3049       7085
    SGA (select * from V$SGAINFO):
    NAME                             BYTES                  RESIZEABLE 
    -------------------------------- ---------------------- ---------- 
    Fixed SGA Size                   1336960                No         
    Redo Buffers                     11644928               No         
    Buffer Cache Size                218103808              Yes        
    Shared Pool Size                 838860800              Yes        
    Large Pool Size                  16777216               Yes        
    Java Pool Size                   16777216               Yes        
    Streams Pool Size                0                      Yes        
    Shared IO Pool Size              0                      Yes        
    Granule Size                     16777216               No         
    Maximum SGA Size                 1690705920             No         
    Startup overhead in Shared Pool  117440512              No         
    Free SGA Memory Available        587202560         
    SGA target advice (select * from v$sga_target_advice order by sga_size):
    SGA_SIZE               SGA_SIZE_FACTOR        ESTD_DB_TIME           ESTD_DB_TIME_FACTOR    ESTD_PHYSICAL_READS    
    ---------------------- ---------------------- ---------------------- ---------------------- ---------------------- 
    528                    0,5                    523                    1,002                  62738                  
    792                    0,75                   522                    1                      62738                  
    1056                   1                      522                    1                      62738                  
    1320                   1,25                   522                    1                      62738                  
    1584                   1,5                    522                    1                      62738                  
    1848                   1,75                   522                    1                      62738                  
    2112                   2                      522                    1                      62738   
    Concurrent connection peak (select SESSIONS_HIGHWATER from v$license; the SESSIONS_HIGHWATER column is the highest number of concurrent user sessions since the instance started):
    SESSIONS_HIGHWATER     
    ---------------------- 
    124                 
    My parameter settings:
    fs.suid_dumpable = 0
    fs.aio-max-nr = 1048576
    fs.file-max = 6815744
    kernel.shmall = 268435456
    kernel.shmmax = 4294967295
    kernel.shmmni = 4096
    kernel.sem = 250 32000 100 128
    net.ipv4.ip_local_port_range = 9000 65500
    net.core.rmem_default = 262144
    net.core.rmem_max = 4194304
    net.core.wmem_default = 262144
    net.core.wmem_max = 1048586
    1. uname -a
    Linux oralinux 2.6.32-100.34.1.el6uek.i686 #1 SMP Wed May 25 17:28:36 EDT 2011 i686 i686 i386 GNU/Linux
    2. cat /proc/cpuinfo
    processor     : 0
    vendor_id     : GenuineIntel
    cpu family     : 6
    model          : 44
    model name     : Intel(R) Xeon(R) CPU           E5606  @ 2.13GHz
    stepping     : 2
    cpu MHz          : 2133.307
    cache size     : 8192 KB
    fdiv_bug     : no
    hlt_bug          : no
    f00f_bug     : no
    coma_bug     : no
    fpu          : yes
    fpu_exception     : yes
    cpuid level     : 11
    wp          : yes
    flags          : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss nx rdtscp lm constant_tsc up arch_perfmon pebs bts xtopology tsc_reliable nonstop_tsc aperfmperf pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 popcnt aes hypervisor lahf_lm arat
    bogomips     : 4266.61
    clflush size     : 64
    cache_alignment     : 64
    address sizes     : 40 bits physical, 48 bits virtual
    power management:
    3. cat /proc/meminfo
    MemTotal:        4119360 kB
    MemFree:          707644 kB
    Buffers:          158376 kB
    Cached:          2773532 kB
    SwapCached:            0 kB
    Active:          1242268 kB
    Inactive:        1982836 kB
    Active(anon):     833904 kB
    Inactive(anon):   539200 kB
    Active(file):     408364 kB
    Inactive(file):  1443636 kB
    Unevictable:           0 kB
    Mlocked:               0 kB
    HighTotal:       3286984 kB
    HighFree:         132060 kB
    LowTotal:         832376 kB
    LowFree:          575584 kB
    SwapTotal:       6258680 kB
    SwapFree:        6258680 kB
    Dirty:                28 kB
    Writeback:             0 kB
    AnonPages:        293264 kB
    Mapped:           300936 kB
    Shmem:           1079916 kB
    Slab:              86204 kB
    SReclaimable:      48064 kB
    SUnreclaim:        38140 kB
    KernelStack:        2024 kB
    PageTables:        85352 kB
    NFS_Unstable:          0 kB
    Bounce:                0 kB
    WritebackTmp:          0 kB
    CommitLimit:     8318360 kB
    Committed_AS:    1953996 kB
    VmallocTotal:     122880 kB
    VmallocUsed:        4820 kB
    VmallocChunk:      99288 kB
    HugePages_Total:       0
    HugePages_Free:        0
    HugePages_Rsvd:        0
    HugePages_Surp:        0
    Hugepagesize:       2048 kB
    DirectMap4k:       10232 kB
    DirectMap2M:      897024 kB
    4. top -20
    top - 12:25:02 up  2:43,  1 user,  load average: 0.00, 0.00, 0.00
    Tasks: 233 total,   1 running, 232 sleeping,   0 stopped,   0 zombie
    Cpu(s):  1.0%us,  0.3%sy,  0.0%ni, 97.0%id,  0.0%wa,  0.7%hi,  1.0%si,  0.0%st
    Mem:   4119360k total,  3420676k used,   698684k free,   158416k buffers
    Swap:  6258680k total,        0k used,  6258680k free,  2773596k cached
    
      PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND           
     6582 oracle    20   0 1775m  23m  22m S  0.7  0.6   0:01.59 oracle            
     3228 oracle    20   0 1777m  37m  34m S  0.3  0.9   0:06.75 oracle            
     5448 oracle    20   0 1775m  26m  25m S  0.3  0.7   0:11.94 oracle            
     5675 oracle    20   0 1775m  25m  23m S  0.3  0.6   0:08.07 oracle            
     6686 kignatow  20   0  2832 1112  812 R  0.3  0.0   0:00.07 top               
        1 root      20   0  2904 1472 1260 S  0.0  0.0   0:01.70 init              
        2 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kthreadd          
        3 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/0       
        4 root      20   0     0    0    0 S  0.0  0.0   0:00.01 ksoftirqd/0       
        5 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 watchdog/0        
        6 root      20   0     0    0    0 S  0.0  0.0   0:00.27 events/0          
        7 root      20   0     0    0    0 S  0.0  0.0   0:00.00 cpuset            
        8 root      20   0     0    0    0 S  0.0  0.0   0:00.00 khelper           
        9 root      20   0     0    0    0 S  0.0  0.0   0:00.00 netns             
       10 root      20   0     0    0    0 S  0.0  0.0   0:00.00 async/mgr         
       11 root      20   0     0    0    0 S  0.0  0.0   0:00.00 sync_supers       
       12 root      20   0     0    0    0 S  0.0  0.0   0:00.01 bdi-default       
       13 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kintegrityd/0     
       14 root      20   0     0    0    0 S  0.0  0.0   0:00.01 kblockd/0         
       15 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kacpid            
       16 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kacpi_notify      
       17 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kacpi_hotplug     
       18 root      20   0     0    0    0 S  0.0  0.0   0:00.00 ata/0             
       19 root      20   0     0    0    0 S  0.0  0.0   0:00.00 ata_aux           
       20 root      20   0     0    0    0 S  0.0  0.0   0:00.00 ksuspend_usbd     
       21 root      20   0     0    0    0 S  0.0  0.0   0:00.00 khubd             
       22 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kseriod           
       24 root      20   0     0    0    0 S  0.0  0.0   0:00.00 khungtaskd        
       25 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kswapd0           
       26 root      25   5     0    0    0 S  0.0  0.0   0:00.00 ksmd              
       27 root      20   0     0    0    0 S  0.0  0.0   0:00.00 aio/0             
       28 root      20   0     0    0    0 S  0.0  0.0   0:00.00 crypto/0          
       33 root      20   0     0    0    0 S  0.0  0.0   0:00.00 pciehpd           
       35 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kpsmoused         
       36 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kstriped          
       37 root      20   0     0    0    0 S  0.0  0.0   0:00.00 ksnapd            
       38 root      20   0     0    0    0 S  0.0  0.0   0:00.00 usbhid_resumer    
      200 root      20   0     0    0    0 S  0.0  0.0   0:00.01 scsi_eh_0         
      201 root      20   0     0    0    0 S  0.0  0.0   0:00.02 scsi_eh_1         
    5. cat /etc/redhat-release
    Red Hat Enterprise Linux Server release 6.1 (Santiago)
  • 4. Re: Oracle Standard Edition on Unix - kernel panic, Out of Memory
    792055 Newbie
    We had a similar issue. After checking lots of Oracle and system logs, we found that a third-party process was causing the processor to hang, and the database server would ultimately restart (since it was in a RAC cluster). Please check whether there are any error messages in /var/log/messages and the alert.log file.
  • 5. Re: Oracle Standard Edition on Unix - kernel panic, Out of Memory
    Veronica Newbie
    XIC wrote:
    You also need to check whether any kernel modification was made recently, and whether the kernel parameters are set properly in the /etc/sysctl.conf file. We need this information to solve the issue.

    I'm not sure what to look for, but here is the content of the /etc/sysctl.conf file:
    # Kernel sysctl configuration file for Oracle Linux
    #
    # For binary values, 0 is disabled, 1 is enabled.  See sysctl(8) and
    # sysctl.conf(5) for more details.
    
    # Controls IP packet forwarding
    net.ipv4.ip_forward = 0
    
    # Controls source route verification
    net.ipv4.conf.default.rp_filter = 1
    
    # Do not accept source routing
    net.ipv4.conf.default.accept_source_route = 0
    
    # Controls the System Request debugging functionality of the kernel
    kernel.sysrq = 0
    
    # Controls whether core dumps will append the PID to the core filename.
    # Useful for debugging multi-threaded applications.
    kernel.core_uses_pid = 1
    
    # Controls the use of TCP syncookies
    net.ipv4.tcp_syncookies = 1
    
    # Disable netfilter on bridges.
    net.bridge.bridge-nf-call-ip6tables = 0
    net.bridge.bridge-nf-call-iptables = 0
    net.bridge.bridge-nf-call-arptables = 0
    
    # Controls the maximum size of a message, in bytes
    kernel.msgmnb = 65536
    
    # Controls the default maxmimum size of a mesage queue
    kernel.msgmax = 65536
    
    # Controls the maximum shared segment size, in bytes
    kernel.shmmax = 4294967295
    
    # Controls the maximum number of shared memory segments, in pages
    kernel.shmall = 268435456
    fs.file-max = 6815744
    net.core.rmem_default = 262144
    net.core.wmem_default = 262144
    net.core.rmem_max = 4194304
    net.core.wmem_max = 1048576
    fs.aio-max-nr = 1048576
    #kernel.sem = 100  100 
    kernel.sem = 250 32000 100 128
    net.ipv4.ip_local_port_range = 9000 65500
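
One way to cross-check the values reported in reply 3 against this file is sketched below. It is only an illustration: the helper names parse_sysctl and diff_settings are made up for this example, and only a subset of the posted values is included. With the figures from this thread it flags net.core.wmem_max, which appears as 1048586 in reply 3 and as 1048576 in the file; that may simply be a transcription typo.

```python
def parse_sysctl(text):
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Collapse runs of whitespace so '250  32000' equals '250 32000'.
        settings[key.strip()] = " ".join(value.split())
    return settings


def diff_settings(reported, on_disk):
    """Return {key: (reported_value, file_value)} for keys that differ or are missing."""
    keys = sorted(set(reported) | set(on_disk))
    return {
        key: (reported.get(key), on_disk.get(key))
        for key in keys
        if reported.get(key) != on_disk.get(key)
    }


# Values copied from this thread (a subset, for illustration only).
REPORTED = """
kernel.shmmax = 4294967295
kernel.sem = 250 32000 100 128
net.core.wmem_max = 1048586
"""
ON_DISK = """
# Controls the maximum shared segment size, in bytes
kernel.shmmax = 4294967295
#kernel.sem = 100  100
kernel.sem = 250 32000 100 128
net.core.wmem_max = 1048576
"""

if __name__ == "__main__":
    differences = diff_settings(parse_sysctl(REPORTED), parse_sysctl(ON_DISK))
    for key, (reported, on_disk) in differences.items():
        print(f"{key}: reported={reported} file={on_disk}")
```

On a live system the "reported" side could instead come from the output of sysctl -a, which would show whether the running kernel actually matches the file.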
  • 6. Re: Oracle Standard Edition on Unix - kernel panic, Out of Memory
    Dude! Guru
    It seems your system has been running out of low memory pages. There could be all kinds of reasons, including a hardware issue. The parameters you have shown are not the cause of the issue, nor will the system statistics be helpful while the machine is working fine. You may be experiencing a memory leak that builds up over time, perhaps caused by a driver or incompatible hardware.

    Are you mounting or supporting any shared file systems?
    Are you using any kind of virtualization?
    Is the hardware known to work?

    I suggest starting by checking the syslog for any clues that might explain why the OOM killer started to sacrifice processes in order to free up memory for the system.

    Btw, Oracle 11g versions prior to 11.2.0.3 are not certified for Oracle Linux 6. But I would not necessarily blame the Oracle database. The database may have performance issues and fail or not work properly if not configured correctly, but it should not bring the system down. The Oracle database runs in user space and cannot touch low memory, though it may trigger a bug in the system.
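
To watch the low memory figure this reply refers to, something like the following sketch could be run periodically. It assumes a 32-bit kernel that reports LowTotal and LowFree in /proc/meminfo, as the output in reply 3 does; the function names are invented for this example, and the sample figures are the ones posted in reply 3.

```python
def parse_meminfo(text):
    """Map each /proc/meminfo field name to its integer value (kB for most fields)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def low_memory_used_percent(fields):
    """Percentage of low memory in use, or None when the kernel does not
    report LowTotal/LowFree (for example on 64-bit kernels)."""
    total = fields.get("LowTotal")
    free = fields.get("LowFree")
    if not total or free is None:
        return None
    return 100.0 * (total - free) / total


# Figures taken from the /proc/meminfo output posted in reply 3.
SAMPLE = """\
MemTotal:        4119360 kB
LowTotal:         832376 kB
LowFree:          575584 kB
HugePages_Total:       0
"""

if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as handle:
            text = handle.read()
    except OSError:
        text = SAMPLE  # fall back to the thread's figures when not on Linux
    pct = low_memory_used_percent(parse_meminfo(text))
    print("low memory used: n/a" if pct is None else f"low memory used: {pct:.1f}%")
```

Appending this output with a timestamp to a file every few minutes (from cron, for instance) would show whether low memory shrinks steadily over the 3-4 weeks before the panic.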
  • 7. Re: Oracle Standard Edition on Unix - kernel panic, Out of Memory
    BillyVerreynne Oracle ACE
    Veronica wrote:

    I have an Oracle Linux machine with nothing but an Oracle Standard Edition database on it.
    The machine runs fine for 3-4 weeks. Then "Kernel panic - not syncing: Out of Memory and no killable process" occurs.
    There is nothing else installed on the machine, so I assume the database is at fault and eventually consumes all available memory.
    Reasonable assumption - though the leak may occur in the database through no fault of the database itself, but as a result of database abuse.

    Two potential issues come to mind.

    Server memory leakage by local client processes. I saw this some years ago with Perl-DBI on Oracle 9i, where the Perl-DBI lib caused pretty severe memory leakage, with a single Perl process eventually consuming 600+ GB of private process memory. (The work-around we implemented was to cycle the DB connection (disconnect/connect) every 30 minutes in the same process.)

    Then there is also server memory leakage caused by server code - and in Oracle that is often caused by poorly written clients that do not release server resources (Java clients not releasing SQL reference cursors on the server side is a common one), or poorly written PL/SQL user code creating arrays of unscalable sizes in PGA memory (often via abusing bulk processing or DBMS_OUTPUT).
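
The connection-cycling work-around mentioned above could look roughly like this. It is a sketch, not the code used at the time (that was Perl): CyclingConnection and its parameters are hypothetical names, and connect stands in for whatever connect call the real database driver provides.

```python
import time


class CyclingConnection:
    """Hand out a connection and replace it once it is older than max_age_seconds.

    `connect` is any zero-argument callable returning an object with a
    close() method; it is a placeholder for a real driver's connect call.
    """

    def __init__(self, connect, max_age_seconds=30 * 60, clock=time.monotonic):
        self._connect = connect
        self._max_age = max_age_seconds
        self._clock = clock
        self._conn = None
        self._opened_at = None

    def get(self):
        """Return the current connection, reconnecting first if it has expired."""
        now = self._clock()
        expired = self._conn is not None and now - self._opened_at >= self._max_age
        if expired:
            # Disconnecting gives the server a chance to release the memory
            # held on behalf of the old session.
            self._conn.close()
            self._conn = None
        if self._conn is None:
            self._conn = self._connect()
            self._opened_at = now
        return self._conn
```

The client would call get() before each unit of work instead of holding one connection for its whole lifetime. This only hides a leak; it does not fix the code that causes it.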
  • 8. Re: Oracle Standard Edition on Unix - kernel panic, Out of Memory
    Veronica Newbie
    Salman,

    thanks for the link. I've checked the parameters carefully, and I've updated the oracle user's maximum processes limit. They also recommend disabling "secure linux". The machine has secure linux enabled, but I'm not sure how that would be a factor...
  • 9. Re: Oracle Standard Edition on Unix - kernel panic, Out of Memory
    Veronica Newbie
    user12263161 wrote:
    We had a similar issue. After checking lots of Oracle and system logs, we found that a third-party process was causing the processor to hang, and the database server would ultimately restart (since it was in a RAC cluster). Please check whether there are any error messages in /var/log/messages and the alert.log file.
    There is absolutely nothing in the alert.log file, just the usual entries, and then information about the database starting (I started it manually after rebooting the machine). /var/log/messages doesn't preserve information about what happened during the crash; it only has new entries since the system was rebooted. How do I preserve this information when the crash occurs? While the kernel panic is occurring, I can't log in or do anything.
  • 10. Re: Oracle Standard Edition on Unix - kernel panic, Out of Memory
    Veronica Newbie
    Dude wrote:
    It seems your system has been running out of low memory pages. There could be all kinds of reasons, including a hardware issue. The parameters you have shown are not the cause of the issue, nor will the system statistics be helpful while the machine is working fine. You may be experiencing a memory leak that builds up over time, perhaps caused by a driver or incompatible hardware.

    Are you mounting or supporting any shared file systems?
    Are you using any kind of virtualization?
    Is the hardware known to work?
    It's a virtual machine on VMware ESXi. I've asked around about similar situations on Oracle Linux 6 machines. One person claimed something similar happened on one of these machines, but it was an isolated incident. The other OL6 machines I've checked had uptimes of up to two months, and if there were a recurring problem, I would have known.
    Dude wrote:
    I suggest starting by checking the syslog for any clues that might explain why the OOM killer started to sacrifice processes in order to free up memory for the system.
    Nothing much so far in /var/log/messages. Can you tell me how to preserve the syslog so it can be examined after the crash?
    Dude wrote:
    Btw, Oracle 11g versions prior to 11.2.0.3 are not certified for Oracle Linux 6. But I would not necessarily blame the Oracle database. The database may have performance issues and fail or not work properly if not configured correctly, but it should not bring the system down. The Oracle database runs in user space and cannot touch low memory, though it may trigger a bug in the system.
    That makes a lot of sense. But then again, I don't understand why this is happening on this one machine. We installed an operating system on a single virtual machine that served as a template. Then, whenever a virtual machine was required, the template was copied. So all the VMs we have are essentially clones as far as the operating system is concerned. There are 8 such machines, but the issue exists only on the machine with a database. You're right, it runs in user space, but then again, why is this happening...
  • 11. Re: Oracle Standard Edition on Unix - kernel panic, Out of Memory
    Veronica Newbie
    Thanks for the ideas.
  • 12. Re: Oracle Standard Edition on Unix - kernel panic, Out of Memory
    Dude! Guru
    Nothing much so far in /var/log/messages. Can you tell me how to preserve the syslog so it can be examined after the crash?
    If the OOM killer starts killing processes as a last resort to free up low memory for the kernel to function, it should be in the syslog. Perhaps you need to look at archived versions of your messages file in the /var/log directory. You can try, for instance, "grep -i 'killed process' /var/log/messages*".
    It's a virtual machine on a VMware ESXi.
    I think if you had mentioned this in your initial post, the responses would have been different. That's why I asked. The virtualization is probably where your low memory starvation stems from. I've seen related posts on the Internet about users having trouble with ESX and virtual machine memory reclamation. ESX server uses a memory ballooning technique, which might be the culprit. You should get more specific hints and troubleshooting tips by asking this question in the VMware forum.

    It is generally not possible to troubleshoot the kind of problem you are experiencing without knowing your exact hardware, installation and use details.
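
The grep over archived messages files suggested in this reply can also be sketched as a small script, for example to run from cron after a reboot. The search phrases other than 'killed process' are an assumption about what the kernel typically logs, and the function names are made up for this example.

```python
import glob
import re

# 'killed process' comes from the grep in this reply; the other two phrases
# are assumed to be typical of OOM-related kernel log lines.
OOM_PATTERN = re.compile(r"killed process|out of memory|oom-killer", re.IGNORECASE)


def find_oom_lines(lines):
    """Return the lines that look like OOM-killer or out-of-memory messages."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]


def scan_logs(pattern="/var/log/messages*"):
    """Scan every file matching the glob; compressed archives are not handled."""
    hits = {}
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path, errors="replace") as handle:
                found = find_oom_lines(handle)
        except OSError:
            continue  # unreadable without root, or rotated away mid-scan
        if found:
            hits[path] = found
    return hits


if __name__ == "__main__":
    for path, found in scan_logs().items():
        for line in found:
            print(f"{path}: {line}")
```

If the machine panics before syslog can flush to disk, nothing will be found here either; in that case the messages would have to be captured elsewhere, for example on the VM console.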
  • 13. Re: Oracle Standard Edition on Unix - kernel panic, Out of Memory
    XIC Newbie
    Hi,
    Can you please check the values in /etc/sysctl.conf against the Oracle documentation for installing Oracle 11g R2 on Linux? I don't think it is a DB issue. It might be a configuration issue with the operating system kernel parameters for this DB version. Also make sure that the required O/S packages are installed properly. Hopefully the links below can help you.

    http://docs.oracle.com/cd/E11882_01/install.112/e24326.pdf
    http://docs.oracle.com/cd/E11882_01/install.112/e24324.pdf

    Regards,
    XIC
