13 Replies Latest reply: Jan 8, 2013 6:00 AM by XIC

    Oracle Standard Edition on Unix - kernel panic, Out of Memory

    Veronica
      Hi,

      I have an Oracle Linux machine with nothing but an Oracle Standard Edition database on it.

      The machine runs fine for 3-4 weeks. Then a "Kernel panic - not syncing: Out of memory and no killable processes" occurs.

      There is nothing else installed on the machine. Therefore, I assume the database must be at fault and that it eventually consumes all available memory.

      I have no DBA to help. Could you please advise what I should look into to resolve the problem?
        • 1. Re: Oracle Standard Edition on Unix - kernel panic, Out of Memory
          Salman Qureshi
          Hi,
          You did not mention the Oracle version or the Linux version. How much RAM is on the server, and how much swap space is allocated on the machine? What is the size of the SGA set for the database, and how many concurrent sessions connect to the database?
          Select your database/OS from this matrix, check the kernel settings recommended for your Oracle/Linux version, and make sure you have set the parameters correctly.

          http://oracle-base.com/articles/linux/articles-linux.php

          Salman
          • 2. Re: Oracle Standard Edition on Unix - kernel panic, Out of Memory
            XIC
            Hi,
            Can you please share the output of the commands below (if the OS is Linux):

            1. uname -a
            2. cat /proc/cpuinfo
            3. cat /proc/meminfo
            4. top -20
            5. cat /etc/redhat-release

            Also, you need to check whether any kernel modification was made recently and whether the kernel parameters have proper entries in the /etc/sysctl.conf file. We need this information to solve the issue.

            Regards,
            XIC
            • 3. Re: Oracle Standard Edition on Unix - kernel panic, Out of Memory
              Veronica
              Sorry I didn't post any relevant details the first time. Here is the data I've collected. The database has been running for 3 hours now.

              Oracle Version:
              Oracle Database 11g Release 11.2.0.1.0 - Production
              RAM and Swap on the machine(free -mt):
                           total       used       free     shared    buffers     cached
              Mem:          4022       3049        973          0        120       2597
              -/+ buffers/cache:        331       3691
              Swap:         6111          0       6111
              Total:       10134       3049       7085
              SGA (select * from V$SGAINFO):
              NAME                             BYTES                  RESIZEABLE 
              -------------------------------- ---------------------- ---------- 
              Fixed SGA Size                   1336960                No         
              Redo Buffers                     11644928               No         
              Buffer Cache Size                218103808              Yes        
              Shared Pool Size                 838860800              Yes        
              Large Pool Size                  16777216               Yes        
              Java Pool Size                   16777216               Yes        
              Streams Pool Size                0                      Yes        
              Shared IO Pool Size              0                      Yes        
              Granule Size                     16777216               No         
              Maximum SGA Size                 1690705920             No         
              Startup overhead in Shared Pool  117440512              No         
              Free SGA Memory Available        587202560         
              SGA target advice (select * from v$sga_target_advice order by sga_size):
              SGA_SIZE               SGA_SIZE_FACTOR        ESTD_DB_TIME           ESTD_DB_TIME_FACTOR    ESTD_PHYSICAL_READS    
              ---------------------- ---------------------- ---------------------- ---------------------- ---------------------- 
              528                    0,5                    523                    1,002                  62738                  
              792                    0,75                   522                    1                      62738                  
              1056                   1                      522                    1                      62738                  
              1320                   1,25                   522                    1                      62738                  
              1584                   1,5                    522                    1                      62738                  
              1848                   1,75                   522                    1                      62738                  
              2112                   2                      522                    1                      62738   
              Concurrent connection peak (select SESSIONS_HIGHWATER from v$license;
              the SESSIONS_HIGHWATER column represents the highest number of concurrent user sessions since the instance started):
              SESSIONS_HIGHWATER     
              ---------------------- 
              124                 
              My parameter settings:
              fs.suid_dumpable = 0
              fs.aio-max-nr = 1048576
              fs.file-max = 6815744
              kernel.shmall = 268435456
              kernel.shmmax = 4294967295
              kernel.shmmni = 4096
              kernel.sem = 250 32000 100 128
              net.ipv4.ip_local_port_range = 9000 65500
              net.core.rmem_default = 262144
              net.core.rmem_max = 4194304
              net.core.wmem_default = 262144
              net.core.wmem_max = 1048586
              1. uname -a
              Linux oralinux 2.6.32-100.34.1.el6uek.i686 #1 SMP Wed May 25 17:28:36 EDT 2011 i686 i686 i386 GNU/Linux
              2. cat /proc/cpuinfo
              processor     : 0
              vendor_id     : GenuineIntel
              cpu family     : 6
              model          : 44
              model name     : Intel(R) Xeon(R) CPU           E5606  @ 2.13GHz
              stepping     : 2
              cpu MHz          : 2133.307
              cache size     : 8192 KB
              fdiv_bug     : no
              hlt_bug          : no
              f00f_bug     : no
              coma_bug     : no
              fpu          : yes
              fpu_exception     : yes
              cpuid level     : 11
              wp          : yes
              flags          : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss nx rdtscp lm constant_tsc up arch_perfmon pebs bts xtopology tsc_reliable nonstop_tsc aperfmperf pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 popcnt aes hypervisor lahf_lm arat
              bogomips     : 4266.61
              clflush size     : 64
              cache_alignment     : 64
              address sizes     : 40 bits physical, 48 bits virtual
              power management:
              3. cat /proc/meminfo
              MemTotal:        4119360 kB
              MemFree:          707644 kB
              Buffers:          158376 kB
              Cached:          2773532 kB
              SwapCached:            0 kB
              Active:          1242268 kB
              Inactive:        1982836 kB
              Active(anon):     833904 kB
              Inactive(anon):   539200 kB
              Active(file):     408364 kB
              Inactive(file):  1443636 kB
              Unevictable:           0 kB
              Mlocked:               0 kB
              HighTotal:       3286984 kB
              HighFree:         132060 kB
              LowTotal:         832376 kB
              LowFree:          575584 kB
              SwapTotal:       6258680 kB
              SwapFree:        6258680 kB
              Dirty:                28 kB
              Writeback:             0 kB
              AnonPages:        293264 kB
              Mapped:           300936 kB
              Shmem:           1079916 kB
              Slab:              86204 kB
              SReclaimable:      48064 kB
              SUnreclaim:        38140 kB
              KernelStack:        2024 kB
              PageTables:        85352 kB
              NFS_Unstable:          0 kB
              Bounce:                0 kB
              WritebackTmp:          0 kB
              CommitLimit:     8318360 kB
              Committed_AS:    1953996 kB
              VmallocTotal:     122880 kB
              VmallocUsed:        4820 kB
              VmallocChunk:      99288 kB
              HugePages_Total:       0
              HugePages_Free:        0
              HugePages_Rsvd:        0
              HugePages_Surp:        0
              Hugepagesize:       2048 kB
              DirectMap4k:       10232 kB
              DirectMap2M:      897024 kB
              4. top -20
              top - 12:25:02 up  2:43,  1 user,  load average: 0.00, 0.00, 0.00
              Tasks: 233 total,   1 running, 232 sleeping,   0 stopped,   0 zombie
              Cpu(s):  1.0%us,  0.3%sy,  0.0%ni, 97.0%id,  0.0%wa,  0.7%hi,  1.0%si,  0.0%st
              Mem:   4119360k total,  3420676k used,   698684k free,   158416k buffers
              Swap:  6258680k total,        0k used,  6258680k free,  2773596k cached
              
                PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND           
               6582 oracle    20   0 1775m  23m  22m S  0.7  0.6   0:01.59 oracle            
               3228 oracle    20   0 1777m  37m  34m S  0.3  0.9   0:06.75 oracle            
               5448 oracle    20   0 1775m  26m  25m S  0.3  0.7   0:11.94 oracle            
               5675 oracle    20   0 1775m  25m  23m S  0.3  0.6   0:08.07 oracle            
               6686 kignatow  20   0  2832 1112  812 R  0.3  0.0   0:00.07 top               
                  1 root      20   0  2904 1472 1260 S  0.0  0.0   0:01.70 init              
                  2 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kthreadd          
                  3 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/0       
                  4 root      20   0     0    0    0 S  0.0  0.0   0:00.01 ksoftirqd/0       
                  5 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 watchdog/0        
                  6 root      20   0     0    0    0 S  0.0  0.0   0:00.27 events/0          
                  7 root      20   0     0    0    0 S  0.0  0.0   0:00.00 cpuset            
                  8 root      20   0     0    0    0 S  0.0  0.0   0:00.00 khelper           
                  9 root      20   0     0    0    0 S  0.0  0.0   0:00.00 netns             
                 10 root      20   0     0    0    0 S  0.0  0.0   0:00.00 async/mgr         
                 11 root      20   0     0    0    0 S  0.0  0.0   0:00.00 sync_supers       
                 12 root      20   0     0    0    0 S  0.0  0.0   0:00.01 bdi-default       
                 13 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kintegrityd/0     
                 14 root      20   0     0    0    0 S  0.0  0.0   0:00.01 kblockd/0         
                 15 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kacpid            
                 16 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kacpi_notify      
                 17 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kacpi_hotplug     
                 18 root      20   0     0    0    0 S  0.0  0.0   0:00.00 ata/0             
                 19 root      20   0     0    0    0 S  0.0  0.0   0:00.00 ata_aux           
                 20 root      20   0     0    0    0 S  0.0  0.0   0:00.00 ksuspend_usbd     
                 21 root      20   0     0    0    0 S  0.0  0.0   0:00.00 khubd             
                 22 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kseriod           
                 24 root      20   0     0    0    0 S  0.0  0.0   0:00.00 khungtaskd        
                 25 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kswapd0           
                 26 root      25   5     0    0    0 S  0.0  0.0   0:00.00 ksmd              
                 27 root      20   0     0    0    0 S  0.0  0.0   0:00.00 aio/0             
                 28 root      20   0     0    0    0 S  0.0  0.0   0:00.00 crypto/0          
                 33 root      20   0     0    0    0 S  0.0  0.0   0:00.00 pciehpd           
                 35 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kpsmoused         
                 36 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kstriped          
                 37 root      20   0     0    0    0 S  0.0  0.0   0:00.00 ksnapd            
                 38 root      20   0     0    0    0 S  0.0  0.0   0:00.00 usbhid_resumer    
                200 root      20   0     0    0    0 S  0.0  0.0   0:00.01 scsi_eh_0         
                201 root      20   0     0    0    0 S  0.0  0.0   0:00.02 scsi_eh_1         
              5. cat /etc/redhat-release
              Red Hat Enterprise Linux Server release 6.1 (Santiago)
              • 4. Re: Oracle Standard Edition on Unix - kernel panic, Out of Memory
                user12263161
                We had a similar issue. After checking lots of Oracle and system logs, we found that a third-party process was causing the processor to hang, and the database server would ultimately restart (since it was in a RAC cluster). Please check whether you have any error messages in /var/log/messages and the alert.log file.
                • 5. Re: Oracle Standard Edition on Unix - kernel panic, Out of Memory
                  Veronica
                  XIC wrote:
                  Also, you need to check whether any kernel modification was made recently and whether the kernel parameters have proper entries in the /etc/sysctl.conf file. We need this information to solve the issue.

                  I'm not sure what to look for, but here is the content of the /etc/sysctl.conf file:
                  # Kernel sysctl configuration file for Oracle Linux
                  #
                  # For binary values, 0 is disabled, 1 is enabled.  See sysctl(8) and
                  # sysctl.conf(5) for more details.
                  
                  # Controls IP packet forwarding
                  net.ipv4.ip_forward = 0
                  
                  # Controls source route verification
                  net.ipv4.conf.default.rp_filter = 1
                  
                  # Do not accept source routing
                  net.ipv4.conf.default.accept_source_route = 0
                  
                  # Controls the System Request debugging functionality of the kernel
                  kernel.sysrq = 0
                  
                  # Controls whether core dumps will append the PID to the core filename.
                  # Useful for debugging multi-threaded applications.
                  kernel.core_uses_pid = 1
                  
                  # Controls the use of TCP syncookies
                  net.ipv4.tcp_syncookies = 1
                  
                  # Disable netfilter on bridges.
                  net.bridge.bridge-nf-call-ip6tables = 0
                  net.bridge.bridge-nf-call-iptables = 0
                  net.bridge.bridge-nf-call-arptables = 0
                  
                  # Controls the maximum size of a message, in bytes
                  kernel.msgmnb = 65536
                  
                  # Controls the default maxmimum size of a mesage queue
                  kernel.msgmax = 65536
                  
                  # Controls the maximum shared segment size, in bytes
                  kernel.shmmax = 4294967295
                  
                  # Controls the maximum number of shared memory segments, in pages
                  kernel.shmall = 268435456
                  fs.file-max = 6815744
                  net.core.rmem_default = 262144
                  net.core.wmem_default = 262144
                  net.core.rmem_max = 4194304
                  net.core.wmem_max = 1048576
                  fs.aio-max-nr = 1048576
                  #kernel.sem = 100  100 
                  kernel.sem = 250 32000 100 128
                  net.ipv4.ip_local_port_range = 9000 65500
                  • 6. Re: Oracle Standard Edition on Unix - kernel panic, Out of Memory
                    Catch-22
                    It seems your system has been running out of low memory pages. There could be all kinds of reasons, including a hardware issue. The parameters you have shown are not the cause of the issue, nor will the system statistics be helpful while the machine is working fine. You may be experiencing a memory leak that builds up over time, perhaps caused by a driver or incompatible hardware.

                    Are you mounting or supporting any shared file systems?
                    Are you using any kind of virtualization?
                    Is the hardware known to work?

                    I suggest starting by checking the syslog to find out if there are any clues that might explain why the OOM killer started to sacrifice processes in order to free up memory for the system.

                    Btw, Oracle 11g versions prior to 11.2.0.3 are not certified for Oracle Linux 6. But I would not necessarily blame the Oracle database. The database may have performance issues and fail or not work properly if not configured correctly, but it should not bring the system down. The Oracle database runs in user space and cannot touch low memory, though it may trigger a bug in the system.
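
                    To see whether low memory really shrinks over the weeks, something like the following could be run periodically (from cron, for example). It is only a sketch: the function names and the log path /var/tmp/lowmem.log are made up for illustration, and the LowFree/LowTotal fields only appear in /proc/meminfo on 32-bit highmem kernels such as the i686 one shown above.

```shell
#!/bin/sh
# Print one field (value in kB) from meminfo-formatted text on stdin.
# Usage: meminfo_field LowFree < /proc/meminfo
meminfo_field() {
    awk -v key="$1:" '$1 == key { print $2 }'
}

# Append a timestamped LowFree/LowTotal sample to a log file.
# Both arguments are optional; the defaults are illustrative assumptions.
#   $1 = meminfo source (default /proc/meminfo)
#   $2 = log file       (default /var/tmp/lowmem.log, hypothetical path)
log_lowmem() {
    lm_src="${1:-/proc/meminfo}"
    lm_out="${2:-/var/tmp/lowmem.log}"
    lm_free=$(meminfo_field LowFree < "$lm_src")
    lm_total=$(meminfo_field LowTotal < "$lm_src")
    printf '%s LowFree=%s kB LowTotal=%s kB\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$lm_free" "$lm_total" >> "$lm_out"
}
```

                    Calling log_lowmem every ten minutes or so and comparing the LowFree values over a few days should show whether there is a steady decline well before the panic.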
                    • 7. Re: Oracle Standard Edition on Unix - kernel panic, Out of Memory
                      Billy~Verreynne
                      Veronica wrote:

                      I have an Oracle Linux machine with nothing but an Oracle Standard Edition database on it.
                      The machine runs fine for 3-4 weeks. Then a "Kernel panic - not syncing: Out of memory and no killable processes" occurs.
                      There is nothing else installed on the machine. Therefore, I assume the database must be at fault and that it eventually consumes all available memory.
                      Reasonable assumption, though it may happen in the database through no fault of the database itself, as a result of database abuse.

                      Two potential issues come to mind.

                      Server memory leakage by local client processes. I saw this some years ago with Perl-DBI on Oracle 9i, where the Perl-DBI lib caused pretty severe memory leakage, with a single Perl process eventually consuming 600+ GB of private process memory. (The workaround we implemented was to cycle the DB connection (disconnect/connect) every 30 minutes in the same process.)

                      Then there is also server memory leakage caused by server code - and in Oracle that is often caused by poorly written clients that do not release server resources (Java clients not releasing SQL reference cursors on the server side is a common one), or poorly written PL/SQL user code creating arrays of unscalable sizes in PGA memory (often via abusing bulk processing or DBMS_OUTPUT).
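
                      A rough way to spot a single process whose memory keeps growing is to list the largest processes by resident set size at regular intervals and compare the snapshots. This is a minimal sketch, assuming procps-style ps output; largest_by_rss is a made-up helper name. Keep in mind that RSS also counts the shared SGA pages a server process has touched, so it overstates private (PGA) memory.

```shell
#!/bin/sh
# Print the N largest processes by resident set size (RSS, in kB).
# Reads "ps -eo pid,rss,comm" style text on stdin and skips the header line.
# Output columns: RSS PID COMMAND, largest first.
#   $1 = number of processes to show (default 5)
largest_by_rss() {
    lr_n="${1:-5}"
    awk 'NR > 1 { print $2, $1, $3 }' | sort -rn | head -n "$lr_n"
}

# Example call (left commented out so that sourcing this file has no side effects):
# ps -eo pid,rss,comm | largest_by_rss 10
```

                      Appending the output to a file once an hour gives a history that shows which process, if any, grows steadily between reboots.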
                      • 8. Re: Oracle Standard Edition on Unix - kernel panic, Out of Memory
                        Veronica
                        Salman,

                        Thanks for the link. I've checked the parameters carefully, and I've updated the oracle user's maximum processes limit. They also recommend disabling "secure linux" (SELinux). The machine has SELinux enabled, but I'm not sure how that would be a factor...
                        • 9. Re: Oracle Standard Edition on Unix - kernel panic, Out of Memory
                          Veronica
                          user12263161 wrote:
                          We had a similar issue. After checking lots of Oracle and system logs, we found that a third-party process was causing the processor to hang, and the database server would ultimately restart (since it was in a RAC cluster). Please check whether you have any error messages in /var/log/messages and the alert.log file.
                          There is absolutely nothing in the alert.log file, just the usual entries, followed by information about the database starting (I started it manually after rebooting the machine). /var/log/messages doesn't preserve information about what happened during the crash; it only contains new entries since the system was rebooted. How do I preserve this information when the crash occurs? Once the kernel panic occurs, I can't log in to the machine or do anything.
                          • 10. Re: Oracle Standard Edition on Unix - kernel panic, Out of Memory
                            Veronica
                            Dude wrote:
                            It seems your system has been running out of low memory pages. There could be all kinds of reasons, including a hardware issue. The parameters you have shown are not the cause of the issue, nor will the system statistics be helpful while the machine is working fine. You may be experiencing a memory leak that builds up over time, perhaps caused by a driver or incompatible hardware.

                            Are you mounting or supporting any shared file systems?
                            Are you using any kind of virtualization?
                            Is the hardware known to work?
                            It's a virtual machine on VMware ESXi. I've asked around about similar situations on Oracle Linux 6 machines. One person claimed something similar happened on one of these machines, but it was a one-off incident. The other OL6 machines I've checked had uptimes of up to two months, and if there were a recurring problem, I would have known.
                            Dude wrote:
                            I suggest starting by checking the syslog to find out if there are any clues that might explain why the OOM killer started to sacrifice processes in order to free up memory for the system.
                            Nothing much so far in /var/log/messages. Can you offer some help on how to preserve the syslog so it can be examined after the crash?
                            Dude wrote:
                            Btw, Oracle 11g versions prior to 11.2.0.3 are not certified for Oracle Linux 6. But I would not necessarily blame the Oracle database. The database may have performance issues and fail or not work properly if not configured correctly, but it should not bring the system down. The Oracle database runs in user space and cannot touch low memory, though it may trigger a bug in the system.
                            That makes a lot of sense. But then again, I don't understand why this is happening on this one machine. We installed the operating system on a single virtual machine that served as a template. Whenever a virtual machine was required, the template was copied. So all the VMs we have are essentially clones as far as the operating system is concerned. There are 8 such machines, but the issue exists only on the machine with the database. You're right, it runs in user space, but then again, why is this happening...
                            • 11. Re: Oracle Standard Edition on Unix - kernel panic, Out of Memory
                              Veronica
                              Billy~Verreynne wrote:
                              Veronica wrote:

                              I have an Oracle Linux machine with nothing but an Oracle Standard Edition database on it.
                              The machine runs fine for 3-4 weeks. Then a "Kernel panic - not syncing: Out of memory and no killable processes" occurs.
                              There is nothing else installed on the machine. Therefore, I assume the database must be at fault and that it eventually consumes all available memory.
                              Reasonable assumption, though it may happen in the database through no fault of the database itself, as a result of database abuse.

                              Two potential issues come to mind.

                              Server memory leakage by local client processes. I saw this some years ago with Perl-DBI on Oracle 9i, where the Perl-DBI lib caused pretty severe memory leakage, with a single Perl process eventually consuming 600+ GB of private process memory. (The workaround we implemented was to cycle the DB connection (disconnect/connect) every 30 minutes in the same process.)

                              Then there is also server memory leakage caused by server code - and in Oracle that is often caused by poorly written clients that do not release server resources (Java clients not releasing SQL reference cursors on the server side is a common one), or poorly written PL/SQL user code creating arrays of unscalable sizes in PGA memory (often via abusing bulk processing or DBMS_OUTPUT).
                              Thanks for the ideas.
                              • 12. Re: Oracle Standard Edition on Unix - kernel panic, Out of Memory
                                Catch-22
                                Veronica wrote:
                                Nothing much so far in /var/log/messages. Can you offer some help on how to preserve the syslog so it can be examined after the crash?
                                If the OOM killer starts killing processes as a last resort to free up low memory for the kernel to function, it should be in the syslog. Perhaps you need to look at archived versions of your messages file in the /var/log directory. You can try, for instance, "grep -i 'killed process' /var/log/messages*".
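
                                A slightly broader search over the current and rotated message files might look like this. It is a sketch only: oom_lines is a made-up name, and the patterns are an assumption about how the kernel words its OOM messages, so adjust them to what your logs actually contain.

```shell
#!/bin/sh
# Print OOM-killer related lines from the given log files,
# each prefixed with the name of the file it came from.
# The pattern list is an assumption, not an exhaustive set of kernel messages.
oom_lines() {
    grep -H -i -E 'killed process|out of memory|oom-killer' "$@"
}

# Example call (left commented out so that sourcing this file has no side effects):
# oom_lines /var/log/messages*
```

                                If the rotated files are compressed, they would need to be read with zgrep instead of grep.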
                                It's a virtual machine on a VMware ESXi.
                                I think if you had mentioned this in your initial post, the responses would have been different. That's why I asked. The virtualization is probably where your low memory starvation stems from. I've seen related posts on the Internet about users having trouble with ESX and virtual machine memory reclamation. ESX Server uses a memory ballooning technique, which might be the culprit. You should get more specific hints and troubleshooting tips by asking this question in the VMware forum.

                                It is generally not possible to troubleshoot the kind of problem you are experiencing without knowing your exact hardware, installation and use details.
                                • 13. Re: Oracle Standard Edition on Unix - kernel panic, Out of Memory
                                  XIC
                                  Hi,
                                  Can you please check the values in /etc/sysctl.conf against the Oracle documentation for installing Oracle 11g R2 on Linux? I don't think it is a DB issue. It might be a configuration issue with the operating system kernel parameters for this DB version. Also make sure that the required OS packages are installed properly. Hopefully the links below can help you.

                                  http://docs.oracle.com/cd/E11882_01/install.112/e24326.pdf
                                  http://docs.oracle.com/cd/E11882_01/install.112/e24324.pdf

                                  Regards,
                                  XIC