We have been running our PRD Oracle SOA Suite on a 2-node cluster, active-active.
Lately we have been facing errors like this one:
<Jan 2, 2012 5:55:40 AM UTC> <Error> <oracle.soa.bpel.engine> <BEA-000000> <
java.lang.OutOfMemoryError: unable to create new native thread
I have investigated a lot and I know that it is most likely related to thread consumption and the O/S thread stacks, which may be exhausting the memory remaining in the O/S outside the JVM.
My question though is another one.
Where can I set or modify the number of threads (min, max), and how do the threads relate to the connections defined in the Data Sources?
I have read many docs from Oracle but I could not find a relevant one which contains all this info.
Can you please let me know which JVM vendor you are using, e.g. Sun / Oracle HotSpot, Oracle JRockit or IBM VM?
java.lang.OutOfMemoryError: unable to create new native thread
This means that your JVM is running out of native memory / C-Heap, which is required when attempting to create a new Java Thread. This does not necessarily mean that your OS is out of physical / virtual memory unless you are using a 64-bit VM.
There are many scenarios which can trigger this problem. One common cause is using a 32-bit VM along with a Java Heap which is too big (e.g. 2.5 GB) vs. the # of Threads created by your VM / Thread Pool settings.
For a 32-bit VM, the rule is: the bigger your Java Heap, the smaller your C-Heap capacity and the lower the # of Java Threads that can be created by your VM.
Can you please provide the information below:
- JVM vendor and settings (Xms & Xmx, HotSpot vs. JRockit vs IBM, 32-bit vs. 64-bit)
- When the problem is observed, please generate a JVM Thread Dump so we can have a look at all your created and active Java Threads. Please also capture the total process size in MB and the available physical / virtual memory of your OS (Solaris, Windows, AIX etc.)
One dummy question: what do you mean by C-Heap? OK, I got it. You meant the Native Heap (C-Heap).
The problem has occurred 2 times, around 03:00 in the morning. We are trying to find a way to have automatic thread dump generation.
Unfortunately we lack the knowledge in this regard and consultancy is out of the question.
I have read a lot of ways to achieve it but I do not know the most effective one.
Hi Loukas and thanks for the provided information,
Since you also use the HotSpot VM, can you please provide me with your PermGen space settings as well (PermSize / MaxPermSize), e.g. 256 MB? 512 MB?
Given the capacity of your physical server (16 GB), the Linux OS and the fact that you are using a 32-bit VM, I strongly suspect that you are simply running out of VM Native Heap (C-Heap).
The problem is that WebLogic uses a self-tuning Thread Pool approach (since WLS 9.2), which means the # of created Threads may grow and shrink over time depending on your load, platform situation, slowdown conditions etc. You need to ensure that you have enough Native Heap capacity to handle these dynamic Thread creations under any application scenario (happy and non-happy paths).
That being said, your SOA Server 1 Java Heap is big for a 32-bit VM (2304 MB). Typically I do not recommend that my clients go over 2 GB (2048 MB) for a single HotSpot VM with, let's say, 512 MB of PermGen space, since this leaves less space for your Native Heap. You have to find a proper balance; otherwise you need to either scale up your WebLogic domain vertically (add more JVMs / managed servers) or switch to a 64-bit VM.
My recommendations to you:
- Determine how much Java Heap you really need (enable verbose GC or a monitoring tool of your choice) and determine if you can reduce your current setting to 2048 MB. This will re-allocate a 256 MB space segment, along with its memory addresses, to your Native Heap, providing you extra capacity for Thread creation and Native VM operations.
Regarding Thread Dump generation, a simple approach is to generate one every 5 minutes using Linux / UNIX cron (kill -3 <Java Pid>). This is really non-intrusive and will provide you with the # and source of Threads at most 5 minutes before / after your OOM occurrence.
One final point: if the # of threads created is too big, e.g. a few hundred, then please consider using WebLogic Work Managers and Max Thread Constraints. This will prevent WebLogic from creating too many Threads for your application requests and overloading your Native Heap capacity.
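As a rough illustration of the cron-based thread dump suggested above, here is a minimal Python sketch that a cron job could call instead of a raw kill -3. It assumes a Linux /proc filesystem and assumes the managed server was started with -Dweblogic.Name=soa_server1 on its command line; the function names are hypothetical and the marker string should be adjusted to your own environment.

```python
import os
import signal


def find_pids_by_cmdline(marker, proc_root="/proc"):
    """Return sorted PIDs whose command line contains `marker` (Linux only)."""
    needle = marker.encode()
    pids = []
    if not os.path.isdir(proc_root):
        return pids
    for entry in os.listdir(proc_root):
        # Only numeric directories are processes; skip this script itself.
        if not entry.isdigit() or int(entry) == os.getpid():
            continue
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as fh:
                # Arguments in /proc/<pid>/cmdline are separated by NUL bytes.
                args = fh.read().split(b"\0")
        except OSError:
            continue  # process exited or is not readable by this user
        if any(needle in arg for arg in args):
            pids.append(int(entry))
    return sorted(pids)


def request_thread_dump(pid):
    """Send SIGQUIT, the same signal that `kill -3 <pid>` sends."""
    os.kill(pid, signal.SIGQUIT)
```

Calling request_thread_dump(pid) for each PID returned by find_pids_by_cmdline("weblogic.Name=soa_server1") from a job scheduled every 5 minutes would reproduce the cron suggestion. The dump itself is still written by the JVM to its standard output, not by this script.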
I followed the rule for 1/3 of the Heap Max/Min.
So you mean that Native Memory used by the JVM for Code Optimization and for loading the classes and libraries is not enough?
Would the following give me the remaining space left for native heap: Native Space = ( ProcessSize – MaxHeapSize – MaxPermSize)
And how can I find the Process Size in the top command on Linux? I am doing it like this:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
15294 oracle11 25 0 3529m 3.0g 43m S 1 19.3 94:51.67 java
So should the process size be 19.3% of the 16 GB of total physical memory?
I forgot to mention that I do not have verbose GC enabled, only the following settings:
-Xms2304m -Xmx2304m -XX:PermSize=768m -XX:MaxPermSize=768m -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -Xloggc:/opt/oracle11/Middleware/user_projects/domains/soa_domain/servers/soa_server1/logs/gc_soa1.log
I will change it to verbose at the first opportunity. I also use Java VisualVM to monitor and I see it gets very close to the 2.3 GB of the heap size.
Maybe it makes sense to send you a snapshot from the GC log file which also includes a Full GC:
Yes, Native Heap capacity for a 32-bit VM is typically (4 GB limit - Xmx - MaxPermSize). As you can see, the bigger the Max Heap size and Max PermGen space, the smaller the Native capacity and the smaller the # of Java Threads that can be created by this VM. Typically, once your process size is closing in on 3.5 GB, you will be running dangerously close to running out of memory addresses and exposed to OutOfMemoryError for native operations.
PermGen space is applicable for HotSpot VM only and is used mainly for class metadata caching and some other optimizations. Native heap is required for Java Threads creation and other native VM operations.
In your case, you have both a big Java Heap (2304 MB) and a big PermGen space (768 MB), which leaves very little room for Native Heap and Thread creation. This is your problem.
The GC data output is very useful, and yes, please include a Full GC. The memory spaces post Full GC are what you need to determine how you can tune your Java Heap safely. Verbose GC and post Full GC data will also give you the PermGen space footprint.
Please look at options and opportunities of reducing both Max Java Heap and PermGen by 256 MB.
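To make the arithmetic above concrete, here is a small Python sketch of the (4 GB limit - Xmx - MaxPermSize) rule, applied to the current settings from this thread and to the proposed reduction of 256 MB each. The function name is just for illustration, and the 4096 MB figure is the theoretical 32-bit limit quoted above; the address space actually usable by a process is typically lower.

```python
ADDRESS_SPACE_MB = 4096  # theoretical 32-bit limit quoted in the thread


def native_headroom_mb(xmx_mb, max_perm_mb, address_space_mb=ADDRESS_SPACE_MB):
    """Rough upper bound on address space left for Native Heap and thread stacks."""
    return address_space_mb - xmx_mb - max_perm_mb


# Current settings: -Xmx2304m -XX:MaxPermSize=768m
current = native_headroom_mb(2304, 768)

# Proposed settings: both values reduced by 256 MB
proposed = native_headroom_mb(2048, 512)

print("current headroom (MB): ", current)
print("proposed headroom (MB):", proposed)
print("gain (MB):             ", proposed - current)
```

With these inputs the estimate goes from 1024 MB to 1536 MB, i.e. 512 MB of extra room for Thread creation and other native operations.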
Your explanation is really helpful. We plan to decrease the 2 parameters next week, since we are in a freeze right now (no changes allowed on PRD).
The memory allocation of our 2 SOA processes (on 2 different physical machines; today we started the 2nd again since it was down for functional issues with EDN) is more or less the same size:
So we are close to the 3.5GB you mentioned but fortunately the load is spread now and the heap size is really not used more than 1GB.
Now, another question about the thread monitoring. I know that most of the info can be found in the generated thread dumps, once we put kill -3 in cron.
Is there any other way to monitor and maybe set alarms about the thread count, before we examine the option of the work managers?
During the investigation of this issue we also learned that it could be related to the stack size the Linux OS allocates for threads. This can be found with ulimit -s and in our case is 10 MB. But I want to ask you whether this is meant to come out of the physical memory that remains after we calculate (total physical memory - processes memory - I/O caching memory used by Linux). That being said, I understood that each thread in the JVM maps an address space (memory) in the remaining physical RAM, and that thread creation can get stuck once the max number of threads is reached in the physical memory outside the JVMs.
Your feedback on this would also be very welcome.
Last but not least, a question about the logging. With kill -3, everything will be directed to the .out file. In our case we have the default, soa_server1.out, which is also collecting info about the status of the service. Won't it be difficult to split this file whenever we have an issue, in order to extract the lines of the dump?
Can we define a dedicated file for the dump?
When I check in the logging tab of the soa_server1 I see that the following are not selected:
Some explanation about the inner workings of the JVM can be found here: http://middlewaremagic.com/weblogic/?p=6930
JVM and operating system tuning can be found here: http://middlewaremagic.com/weblogic/?p=7083 (look in the system section)
More on operating system tuning can be found here: http://middlewaremagic.com/weblogic/?p=7737
Maybe the posts give you some idea on how to proceed.
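On the question above about monitoring the thread count and raising an alarm without parsing thread dumps, one possible approach on Linux is to read the Threads field from /proc/<pid>/status. The sketch below is only an illustration: the function names are hypothetical and the threshold of 400 is an arbitrary example value, not a recommendation from this thread.

```python
def parse_thread_count(status_text):
    """Extract the 'Threads:' value from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split(":", 1)[1].strip())
    raise ValueError("no 'Threads:' line found")


def read_thread_count(pid):
    """Read the current number of OS threads of a process (Linux only)."""
    with open("/proc/%d/status" % pid) as fh:
        return parse_thread_count(fh.read())


def thread_alarm(count, warn_at=400):
    """Return True when the thread count reaches the (assumed) threshold."""
    return count >= warn_at
```

A cron job or monitoring agent could call read_thread_count(pid) for the soa_server1 process and send a mail when thread_alarm(...) returns True. Note that the number reported by the OS includes the JVM's own internal threads (GC, compiler etc.), not only the WebLogic execute threads.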