Developer Studio 12.5 collector crashes with any 64-bit 1.8 JVM on Linux

Hi,

Has anyone else seen the following, and is there a fix available? The collector tool in Oracle Developer Studio crashes with every 64-bit 1.8 JVM I've tried on Linux (updates 65 through 100).

Without -XX:+PreserveFramePointer it crashes almost immediately. With that option set it still crashes; it just takes longer:

#

# A fatal error has been detected by the Java Runtime Environment:

#

#  SIGSEGV (0xb) at pc=0x00007effd095c4ec, pid=59207, tid=139630785566464

#

# JRE version: Java(TM) SE Runtime Environment (8.0_65-b17) (build 1.8.0_65-b17)

# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.65-b01 mixed mode linux-amd64 compressed oops)

# Problematic frame:

# C  [libcollector.so+0x444ec]

#

#

# If you would like to submit a bug report, please visit:

#   http://bugreport.java.com/bugreport/crash.jsp

#

---------------  T H R E A D  ---------------

Current thread is native thread

siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x000017f0164785c5

Registers:

RAX=0x0000000000000004, RBX=0x00007efe535f3fc0, RCX=0x0000000000000001, RDX=0x000017f0164785c5

RSP=0x00007efe535f3cf0, RBP=0x00007efe535f4610, RSI=0x000017f0164785c6, RDI=0x00007efe535f45e0

R8 =0x0000000000000004, R9 =0x00007effcd6ecb70, R10=0x0000000000000008, R11=0x0000000000000246

R12=0x00007efe535f5e80, R13=0x0000000000000000, R14=0x0000000000000004, R15=0x0000000000000000

RIP=0x00007effd095c4ec, EFLAGS=0x0000000000010217, CSGSFS=0x0000000000000033, ERR=0x0000000000000004

  TRAPNO=0x000000000000000e

Top of Stack: (sp=0x00007efe535f3cf0)

0x00007efe535f3cf0:   0000000000000000 0000000000000000

0x00007efe535f3d00:   0000000100000000 0000000000000000

0x00007efe535f3d10:   00000030f5c07c1a 00007efe535f8300

0x00007efe535f3d20:   00000030f5c05e27 00000030f5c05e40

0x00007efe535f3d30:   00000030f5c05e48 00000030f5c05e57

0x00007efe535f3d40:   00007effd06ff738 0000000200000000

0x00007efe535f3d50:   0000000000000000 0000000000000000

0x00007efe535f3d60:   0000000000000000 00001fffd0f24d18

0x00007efe535f3d70:   0000000000000000 0000000000000000

0x00007efe535f3d80:   00007efe535f3dd0 00000000d0b8c090

0x00007efe535f3d90:   00007effd0b89458 00000030f541164d

0x00007efe535f3da0:   000000000000001b 00007effd0d08820

0x00007efe535f3db0:   0000000000000000 0000000000000000

0x00007efe535f3dc0:   000017f0164785c5 ffffffffffffffff

0x00007efe535f3dd0:   00007efe535f4570 000000000000014b

0x00007efe535f3de0:   0000000000002f4a 00007effd0b8c090

0x00007efe535f3df0:   00007effd0b89458 00000030f5c0f710

0x00007efe535f3e00:   0000000000000001 0000000000000000

0x00007efe535f3e10:   0000000000000000 0000000000000002

0x00007efe535f3e20:   0000000000000000 000000000000002f

0x00007efe535f3e30:   00595b771bbd4591 0000000000000008

0x00007efe535f3e40:   0000000000000293 000000000000014b

0x00007efe535f3e50:   0000000000002f4a 00007effd0b8c090

0x00007efe535f3e60:   00007effd0b89458 000000000000014b

0x00007efe535f3e70:   00007effcdef7f4a 00007efe535f4570

0x00007efe535f3e80:   000000000002c80e 00000000000298c4

0x00007efe535f3e90:   0000000000000fde ffffffffffffffff

0x00007efe535f3ea0:   00007efe535f43b0 00000030f58db4dd

0x00007efe535f3eb0:   0000000000000293 0000000000000033

0x00007efe535f3ec0:   0000000000000004 000000000000000e

0x00007efe535f3ed0:   0000000000000004 000000000000000c

0x00007efe535f3ee0:   00007efe535f3fc0 0000000000000000

Instructions: (pc=0x00007effd095c4ec)

0x00007effd095c4cc:   48 c7 85 68 f7 ff ff 00 00 00 00 41 be 04 00 00

0x00007effd095c4dc:   00 b8 04 00 00 00 44 8b c0 48 8d 72 01 48 89 33

0x00007effd095c4ec:   0f b6 02 8d 50 c0 83 fa 27 77 63 48 8d 0d a6 32

0x00007effd095c4fc:   00 00 48 63 d2 48 03 0c d1 ff e1 41 b8 02 00 00

Register to memory mapping:

RAX=0x0000000000000004 is an unknown value

RBX=0x00007efe535f3fc0 is an unknown value

RCX=0x0000000000000001 is an unknown value

RDX=0x000017f0164785c5 is an unknown value

RSP=0x00007efe535f3cf0 is an unknown value

RBP=0x00007efe535f4610 is an unknown value

RSI=0x000017f0164785c6 is an unknown value

RDI=0x00007efe535f45e0 is an unknown value

R8 =0x0000000000000004 is an unknown value

R9 =0x00007effcd6ecb70 is an unknown value

R10=0x0000000000000008 is an unknown value

R11=0x0000000000000246 is an unknown value

R12=0x00007efe535f5e80 is an unknown value

R13=0x0000000000000000 is an unknown value

R14=0x0000000000000004 is an unknown value

R15=0x0000000000000000 is an unknown value

Stack: [0x00007efe534f9000,0x00007efe535fa000],  sp=0x00007efe535f3cf0,  free space=1003k

Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)

C  [libcollector.so+0x444ec]

C  [libcollector.so+0x421e7]  __collector_get_frame_info+0x407

C  [libcollector.so+0x4f023]

C  [libcollector.so+0x3e0a6]
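Two quick sanity checks can be run against a report like the one above: which register holds the faulting address, and where the library was loaded. The sketch below is only an illustration, assuming the log lines are pasted in as text; the helper names are made up for this example and are not part of any Oracle tool. For this report it finds that RDX holds si_addr, which is consistent with the bytes at pc (0f b6 02) appearing to decode as a one-byte load through RDX, and it derives a page-aligned load base for libcollector.so.

```python
import re

# Lines copied verbatim from the crash report above.
HS_ERR_EXCERPT = """\
#  SIGSEGV (0xb) at pc=0x00007effd095c4ec, pid=59207, tid=139630785566464
# C  [libcollector.so+0x444ec]
siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x000017f0164785c5
RAX=0x0000000000000004, RBX=0x00007efe535f3fc0, RCX=0x0000000000000001, RDX=0x000017f0164785c5
RSP=0x00007efe535f3cf0, RBP=0x00007efe535f4610, RSI=0x000017f0164785c6, RDI=0x00007efe535f45e0
"""

HEX = r"(0x[0-9a-fA-F]+)"


def parse_registers(text):
    """Return {register name: value} for every upper-case NAME=0x... pair."""
    pairs = re.findall(r"\b([A-Z][A-Z0-9]{1,2})\s*=\s*" + HEX, text)
    return {name: int(value, 16) for name, value in pairs}


def registers_holding_fault_address(text):
    """Return (si_addr, sorted names of registers whose value equals si_addr)."""
    si_addr = int(re.search(r"si_addr:\s*" + HEX, text).group(1), 16)
    registers = parse_registers(text)
    return si_addr, sorted(n for n, v in registers.items() if v == si_addr)


def library_load_base(text):
    """Subtract the frame offset from pc to get the library's load address."""
    pc = int(re.search(r"\bpc=" + HEX, text).group(1), 16)
    offset = int(re.search(r"\[libcollector\.so\+" + HEX + r"\]", text).group(1), 16)
    return pc - offset


si_addr, holders = registers_holding_fault_address(HS_ERR_EXCERPT)
print(f"si_addr {si_addr:#x} is held by: {holders}")
print(f"libcollector.so load base: {library_load_base(HS_ERR_EXCERPT):#x}")
```

A load base that is not page-aligned would suggest the pc and the frame offset were copied from different reports.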


Answers

  • nevgeniev
    nevgeniev Member Posts: 13
    edited March 2017

    Just for those who are interested: libcollector.so from Solaris Studio 12.3 works just fine, but you have to use er_print from 12.5 to read the results.

  • Nikmolchanov-Oracle
    Nikmolchanov-Oracle Member Posts: 81
    edited March 2017

    Thank you very much for the report. Unfortunately I cannot reproduce the problem on our test systems.

    Could you please provide additional information?

    1. Linux version

    2. H/W information

    If you run "perftools_whichami -a", it will print something like this (I replaced the machine name with "xxx"):

    MACHINE: xxx_x86_64_OL_UEK_7.3

    HOSTNAME: xxx

    OS: OL_UEK_7.3

    OSARCH: Linux

    OSNAME: Linux OL_UEK_7.3 (4.1.12-61.1.23.el7uek.x86_64) -XEN-

    NCPUs: 4

    CPUTYPE: Broadwell-EP Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz

    IMPLEMENTATION: Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz

    FLAVOR: Linux

    VERSION: 7.3

    Also it would be great to profile the same application. Can you try to profile "analyzer"?

    Thanks.

    Nik

  • nevgeniev
    nevgeniev Member Posts: 13
    edited March 2017

    hi,

    MACHINE: xxx_x86_64_RHEL_6.6

    HOSTNAME: xxx

    OS: RHEL_6.6

    OSARCH: Linux

    OSNAME: Linux RHEL_6.6 (2.6.32-504.46.1.el6.x86_64)

    NCPUs: 32

    CPUTYPE: Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz

    IMPLEMENTATION: Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz

    FLAVOR: Linux

    VERSION: 6.6

    SYSTEM: Linux RHEL_6.6 (2.6.32-504.46.1.el6.x86_64) Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz

    TOOLSARCH: intel-Linux

    KERNEL_VERSION: 2.6.32-504.46.1.el6.x86_64

    GCC_VERSION: gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-11)

    GPLUSPLUS_VERSION: g++ (GCC) 4.4.7 20120313 (Red Hat 4.4.7-11)

    GNU_libc_VERSION: <?>

    GNU_libc64_VERSION: <?>

  • nevgeniev
    nevgeniev Member Posts: 13
    edited March 2017

    Also, it looks like it happens specifically with the application I need to profile, which is quite complex and involves non-Java threads as well.

    I've tried running the collector on a simplistic 'false sharing' Java program and everything works just fine.

    Oracle Studio 12.3 didn't crash, but unfortunately it doesn't decode Java 8 stacks.

    If you could provide me with a debug version of libcollector, I should be able to give you exact line numbers from __collector_get_frame_info:

    C  [libcollector.so+0x444ec]

    C  [libcollector.so+0x421e7]  __collector_get_frame_info+0x407

    C  [libcollector.so+0x4f023]

    C  [libcollector.so+0x3e0a6]

  • nevgeniev
    nevgeniev Member Posts: 13
    edited March 2017

    Some more info (function names), based on the output of:

    objdump -d lib/analyzer/amd64/libcollector.so

    C  [libcollector.so+0x444ec]  __collector_getStackTrace

    C  [libcollector.so+0x421e7]  __collector_get_frame_info+0x407

    C  [libcollector.so+0x4f023]  __collector_register_module

    C  [libcollector.so+0x3e0a6]  __collector_hwc_out_of_range
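One caveat when attaching names to offsets from a disassembly of a library without full symbols: each offset is typically attributed to the nearest preceding symbol, which may not be the function that actually contains it. The sketch below shows that lookup under stated assumptions. The only real entry in the table is the start of __collector_get_frame_info, derived from the frame above (0x421e7 - 0x407 = 0x41de0); the other two symbols and their offsets are hypothetical placeholders, and this is not necessarily how objdump or HotSpot resolve names.

```python
import bisect


def nearest_symbol(symbols, offset):
    """Return (name, distance) for the closest symbol starting at or before offset.

    symbols is an iterable of (start_offset, name) pairs. Returns None when the
    offset lies before every known symbol.
    """
    table = sorted(symbols)
    starts = [start for start, _ in table]
    index = bisect.bisect_right(starts, offset) - 1
    if index < 0:
        return None
    start, name = table[index]
    return name, offset - start


SYMBOLS = [
    (0x40000, "hypothetical_symbol_a"),       # placeholder, not from the library
    (0x41DE0, "__collector_get_frame_info"),  # derived from the frame in this thread
    (0x50000, "hypothetical_symbol_b"),       # placeholder, not from the library
]

for frame_offset in (0x421E7, 0x444EC):
    name, distance = nearest_symbol(SYMBOLS, frame_offset)
    print(f"+{frame_offset:#x} -> {name}+{distance:#x}")
```

A small distance such as +0x407 is plausible for code inside the named function. A distance of several kilobytes is a hint that the offset may belong to an unnamed (for example, static) function that follows it, so the names listed above are best treated as approximate until a debug build confirms them.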

  • Nikmolchanov-Oracle
    Nikmolchanov-Oracle Member Posts: 81
    edited March 2017

    Thank you very much for the additional information! We do not have such H/W (E5-2667),

    but we could probably reproduce the problem on our test systems if we have the same application.

  • nevgeniev
    nevgeniev Member Posts: 13
    edited April 2017

    OK,

    It seems I found the problem: it looks like collect fails to parse stack frames from Onload (the Solarflare user-space network driver).

    Can you check it? The Onload version is:

    OpenOnload 201509-u1

    Copyright 2006-2016 Solarflare Communications, 2002-2005 Level 5 Networks

    Built: May 19 2016 12:44:01 (release)

    Kernel module: 201509-u1

  • Nikmolchanov-Oracle
    Nikmolchanov-Oracle Member Posts: 81
    edited April 2017

    Thank you very much for localizing the problem!

    Recently we found and fixed several similar problems in "libcollector.so",

    and probably this one is already fixed, but we have to verify it.

    Do you know which application I can try to profile to reproduce this bug?

    And one more question: I asked my management if we can send you a debug

    version of collect and libcollector.so to verify that the problem is fixed.

    Would you be able to try this development debug version?

    Thanks.

    Nik

  • nevgeniev
    nevgeniev Member Posts: 13
    edited May 2017

    Hi,

    Yes, I would be glad to run a debug version of collect/libcollector.so.

    I'll try to produce a 'demo' program that sends multicast packets and run it under Onload and the collector. The actual program uses TIBCO FTL on top of Onload, but I don't think FTL contributes anything to the bug, as it runs just fine without Onload.

  • Nikmolchanov-Oracle
    Nikmolchanov-Oracle Member Posts: 81
    edited May 2017

    I'm still waiting for the answer from our management on my question, so if you can create a demo program - that would be very useful!

    Thanks.

    Nik

  • Nikmolchanov-Oracle
    Nikmolchanov-Oracle Member Posts: 81
    edited May 2017

    I found out that I cannot send my debug version via email, but there is a chance that a special site can be created for you to download the debug version.

    To create such a site, we need some information from you. Can you send me an email, and I'll reply with an explanation of what information is needed?

    My email: [email protected]

    Thanks.

    Nik

This discussion has been closed.