We run OSGi tests for our product on Windows Server 2003, Windows Server 2008 and Windows Server 2008 64-bit, all running on VMware. I recently upgraded Java from 1.6.0_20 to 1.6.0_26, and since then our Hudson jobs have been timing out.
Only one test suite occasionally hangs, either after OSGi has started or while the tests are starting. The process just seems to be waiting on something.
You can NOT connect with jstack, jconsole, jvisualvm, etc., so there is no way of finding out what goes wrong inside Java, and there is no normal JVM crash output on stdout.
I was able to force a Windows dump file from the process, but I have no experience with these dump files. I installed Visual Studio with the Windows debugging tools to open the dump file, but I could not understand much of it.
Information that I understood from windbg:
# Child-SP RetAddr Call Site
00 00000000`0008ebf8 00000000`75852bcd wow64cpu!TurboDispatchJumpAddressEnd+0x690
01 00000000`0008ec00 00000000`758cd07e wow64cpu!TurboDispatchJumpAddressEnd+0x484
02 00000000`0008ecc0 00000000`758cc549 wow64!Wow64SystemServiceEx+0x1ce
03 00000000`0008ed10 00000000`77ce84c8 wow64!Wow64LdrpInitialize+0x429
04 00000000`0008f260 00000000`77ce7623 ntdll!RtlResetRtlTranslations+0x1b08
05 00000000`0008f760 00000000`77cd308e ntdll!RtlResetRtlTranslations+0xc63
06 00000000`0008f7d0 00000000`00000000 ntdll!LdrInitializeThunk+0xe
So the WOW64 (Windows on Windows64) translation layer is waiting for something.
I have a hunch that this line of code in one of the first tests has something to do with this:
Is there a way around this? Or should we revert to, say, update 24? And will Java 1.7.0 have the same problem? (We're planning to upgrade when it is released in June.)
Can you send a SIGQUIT signal to the hanging Java Process? The JVM will then print a full thread dump to STDOUT.
Or add a background daemon thread that prints the current thread dumps to STDERR at a regular interval?
Knowing the exact code location may be crucial to tracking the problem down and fixing it properly. Update: On Windows the only way I know to trigger a JVM thread dump is Ctrl-Break (Ctrl-Pause) in the console. So maybe the second approach will help.
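The second approach could look roughly like the sketch below. The class name PeriodicThreadDumper and its two methods are made up for illustration; only Thread.getAllStackTraces() and the other JDK calls are real. It is written without lambdas so that it compiles on Java 6. Note that collecting the stack traces is itself a JVM operation, so if the JVM is wedged badly enough that it does not react to a console dump request, this thread may block as well.

```java
import java.io.PrintStream;
import java.util.Map;

/** Hypothetical helper: periodically prints all thread stacks to a stream. */
final class PeriodicThreadDumper {

    private PeriodicThreadDumper() {
    }

    /** Writes one dump of every live thread's stack to the given stream. */
    static void dumpAllThreads(PrintStream out) {
        Map<Thread, StackTraceElement[]> stacks = Thread.getAllStackTraces();
        out.println("=== Thread dump, " + stacks.size() + " threads ===");
        for (Map.Entry<Thread, StackTraceElement[]> entry : stacks.entrySet()) {
            Thread t = entry.getKey();
            out.println("\"" + t.getName() + "\" daemon=" + t.isDaemon()
                    + " state=" + t.getState());
            for (StackTraceElement frame : entry.getValue()) {
                out.println("    at " + frame);
            }
        }
        out.flush();
    }

    /** Starts a daemon thread that dumps to System.err every intervalMillis. */
    static Thread start(final long intervalMillis) {
        Thread dumper = new Thread(new Runnable() {
            public void run() {
                try {
                    while (true) {
                        Thread.sleep(intervalMillis);
                        dumpAllThreads(System.err);
                    }
                } catch (InterruptedException e) {
                    // Interrupted: stop dumping and let the thread end.
                }
            }
        }, "periodic-thread-dumper");
        // Daemon, so it never keeps the JVM alive once the tests are done.
        dumper.setDaemon(true);
        dumper.start();
        return dumper;
    }
}
```

Calling PeriodicThreadDumper.start(30000L) once at the start of the test run would then print a dump every 30 seconds; the last dump before the hang shows where each thread was.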
Edited by: Ben on 11.07.2011 15:33
No, I tried this before; the process does not respond at all to Ctrl-Pause. Sorry, I should have mentioned this in my first post.
A co-worker had the same problem on Windows 7 but with update 24, so downgrading is not really an option.
I added extra logging and it indeed locks up in:
final String[] fonts = GraphicsEnvironment.getLocalGraphicsEnvironment().getAvailableFontFamilyNames();
This is used by several classes at the startup of the tests.
I've rewritten it a bit to use late initialization, so that concurrent access to the native AWT code is limited.
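A rough sketch of what such late initialization could look like is given here. This is not the actual rewrite; LazyValue and FontFamilies are hypothetical names. The idea is that the AWT call runs at most once, under a lock, and every later caller gets the cached result.

```java
import java.awt.GraphicsEnvironment;
import java.util.concurrent.Callable;

/** Hypothetical helper: computes a value on first use, under a lock, then caches it. */
final class LazyValue<T> {
    private final Callable<T> loader;
    private T value;
    private boolean loaded;

    LazyValue(Callable<T> loader) {
        this.loader = loader;
    }

    /** At most one thread at a time runs the loader; after success the cached value is returned. */
    synchronized T get() {
        if (!loaded) {
            try {
                value = loader.call();
            } catch (Exception e) {
                throw new IllegalStateException("lazy initialization failed", e);
            }
            loaded = true;
        }
        return value;
    }
}

/** Hypothetical wrapper: the single place that asks AWT for the font family names. */
final class FontFamilies {
    private static final LazyValue<String[]> NAMES =
            new LazyValue<String[]>(new Callable<String[]>() {
                public String[] call() {
                    return GraphicsEnvironment.getLocalGraphicsEnvironment()
                            .getAvailableFontFamilyNames();
                }
            });

    private FontFamilies() {
    }

    /** Returns a copy so callers cannot modify the cached array. */
    static String[] names() {
        return NAMES.get().clone();
    }
}
```

Classes that used to call getAvailableFontFamilyNames() directly would call FontFamilies.names() instead, so only the first caller ever enters the native font code.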
With the following simple test program the lock-up never occurs:
    import java.awt.GraphicsEnvironment;
    import java.util.concurrent.BrokenBarrierException;
    import java.util.concurrent.CyclicBarrier;

    public class Main
    {
        static int MAX = 5;
        public static void main(String[] args) throws InterruptedException, BrokenBarrierException
        {
            CyclicBarrier bar = new CyclicBarrier(MAX + 1);
            for (int i = 0; i < MAX; i++)
                new Thread(new TestRunnable(bar)).start();
            bar.await(); // main is the (MAX + 1)th party: releases all workers at once
        }
        private static final class TestRunnable implements Runnable
        {
            private final CyclicBarrier bar;
            public TestRunnable(CyclicBarrier bar)
            {
                this.bar = bar;
            }
            public void run()
            {
                int k = 0;
                try { k = bar.await(); } // k is this thread's arrival index at the barrier
                catch (InterruptedException e1) { }
                catch (BrokenBarrierException e1) { }
                for (int j = 0; j < 500; j++)
                {
                    GraphicsEnvironment e = GraphicsEnvironment.getLocalGraphicsEnvironment();
                    String[] n = e.getAvailableFontFamilyNames();
                    System.out.println(k + " Font families:");
                    for (int i = 0; i < n.length; i++)
                        System.out.println(k + " " + n[i]);
                }
            }
        }
    }
So I'm thinking of a lock-up in the class loaders or in the JNI loading of external libraries.