This discussion is archived
12 Replies. Latest reply: Feb 18, 2013 8:49 AM by Dave Miner

Instability and Poor Performance with Solaris 11 11/11 and 11.1

988819 Newbie
I've upgraded an OpenSolaris install to Solaris 11.1 over time, and ever since I reached Solaris 11 11/11 and then Solaris 11.1 my system has been unstable and slow. ZFS and GDM are the worst affected; I had to disable GDM in both 11 11/11 and 11.1 because it was using too much CPU. Whenever I shut down in either release, the kernel panics. I run this command to shut down:
-----
/usr/sbin/shutdown -y -g 60 -i 5
-----
and it produces the following panic, after which the system restarts automatically instead of completing the shutdown:
-----
TIME UUID SUNW-MSG-ID
Jan 28 2013 23:19:14.682124000 54fbe302-2309-6f14-8d7f-c81e9c3369b7 SUNOS-8000-KL

TIME CLASS ENA
Jan 28 23:18:29.9322 ireport.os.sunos.panic.dump_pending_on_device 0x0000000000000000

nvlist version: 0
version = 0x0
class = list.suspect
uuid = 54fbe302-2309-6f14-8d7f-c81e9c3369b7
code = SUNOS-8000-KL
diag-time = 1359433153 925385
de = fmd:///module/software-diagnosis
fault-list-sz = 0x1
__case_state = 0x1
topo-uuid = 78f32799-20fb-446f-b758-f24f4197b812
fault-list = (array of embedded nvlists)
(start fault-list[0])
nvlist version: 0
version = 0x0
class = defect.sunos.kernel.panic
certainty = 0x64
asru = sw:///:path=/var/crash/opensolaris/.54fbe302-2309-6f14-8d7f-c81e9c3369b7
resource = sw:///:path=/var/crash/opensolaris/.54fbe302-2309-6f14-8d7f-c81e9c3369b7
savecore-succcess = 0
os-instance-uuid = 54fbe302-2309-6f14-8d7f-c81e9c3369b7
panicstr = deadman: timed out after 120 seconds of clock inactivity
panicstack = fffffffffb9fcc56 () | genunix:cyclic_expire+ac () | genunix:cyclic_fire+76 () | unix:cbe_fire+65 () | unix:av_dispatch_autovect+74 () | unix:dispatch_hilevel+1f () | unix:switch_sp_and_call+13 () | unix:do_interrupt+f2 () | unix:cmnint+ba () | unix:mach_cpu_pause+21 () | unix:cpu_pause+7f () | unix:thread_start+8 () |
crashtime = 1359432916
panic-time = January 28, 2013 11:15:16 PM EST EST
(end fault-list[0])

fault-status = 0x1
severity = Major
__ttl = 0x1
__tod = 0x51074dc2 0x28a862e0
-----
Additionally, I've seen a large drop in ZFS performance. I kept the old boot environments for the previous versions, so after upgrading to Solaris 11.1 I booted back into each one and measured throughput with dd:
-----
WRITE:
OpenSolaris SNV134    211 MB/s
Solaris 11 Express    194 MB/s
OpenIndiana 151a7     215 MB/s
Solaris 11 11/11      182 MB/s
Solaris 11.1          150 MB/s

READ:
OpenSolaris SNV134    470 MB/s
Solaris 11 Express    499 MB/s
OpenIndiana 151a7     417 MB/s
Solaris 11 11/11      177 MB/s
Solaris 11.1          276 MB/s
-----
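To put the dd figures in proportion, the change relative to the OpenSolaris SNV134 baseline can be worked out with a short awk calculation. This is only a sketch: the MB/s values are copied from the table above, and `pct_change` is a hypothetical helper name, not part of any Solaris tool.

```shell
#!/bin/sh
# Percent change in dd throughput relative to the OpenSolaris SNV134 baseline.
# pct_change is a hypothetical helper; the MB/s figures come from the table above.
pct_change() {
    # $1 = baseline MB/s, $2 = measured MB/s
    awk -v base="$1" -v cur="$2" \
        'BEGIN { printf "%+.1f%%\n", (cur - base) / base * 100 }'
}

# Write baseline: 211 MB/s; read baseline: 470 MB/s (both SNV134).
echo "write, Solaris 11 11/11: $(pct_change 211 182)"
echo "write, Solaris 11.1:     $(pct_change 211 150)"
echo "read,  Solaris 11 11/11: $(pct_change 470 177)"
echo "read,  Solaris 11.1:     $(pct_change 470 276)"
```

On these numbers, Solaris 11.1 is about 29% slower than SNV134 for writes and about 41% slower for reads, while Solaris 11 11/11 reads are about 62% slower.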
Lastly, there have been a couple of times when just running tests such as dd or bonnie++ on my ZFS pool caused a kernel panic:
-----
TIME UUID SUNW-MSG-ID
Jan 26 2013 18:40:21.947381000 5a9c2174-51bd-6af5-cda3-ceb12d0591bb SUNOS-8000-KL

TIME CLASS ENA
Jan 26 18:39:33.6420 ireport.os.sunos.panic.dump_pending_on_device 0x0000000000000000

nvlist version: 0
version = 0x0
class = list.suspect
uuid = 5a9c2174-51bd-6af5-cda3-ceb12d0591bb
code = SUNOS-8000-KL
diag-time = 1359243621 817586
de = fmd:///module/software-diagnosis
fault-list-sz = 0x1
__case_state = 0x1
topo-uuid = 08cec1f5-1959-c812-85e3-fa1bb969b7a3
fault-list = (array of embedded nvlists)
(start fault-list[0])
nvlist version: 0
version = 0x0
class = defect.sunos.kernel.panic
certainty = 0x64
asru = sw:///:path=/var/crash/opensolaris/.5a9c2174-51bd-6af5-cda3-ceb12d0591bb
resource = sw:///:path=/var/crash/opensolaris/.5a9c2174-51bd-6af5-cda3-ceb12d0591bb
savecore-succcess = 0
os-instance-uuid = 5a9c2174-51bd-6af5-cda3-ceb12d0591bb
panicstr = BAD TRAP: type=e (#pf Page fault) rp=fffffffc801bba00 addr=28 occurred in module "zfs" due to a NULL pointer dereference
panicstack = unix:die+105 () | unix:trap+153e () | unix:cmntrap+e6 () | zfs:arc_hash_remove+28 () | zfs:arc_evict_from_ghost+c0 () | zfs:arc_adjust_ghost+4e () | zfs:arc_adjust+51 () | zfs:arc_reclaim_thread+1aa () | unix:thread_start+8 () |
crashtime = 1359232124
panic-time = January 26, 2013 03:28:44 PM EST EST
(end fault-list[0])

fault-status = 0x1
severity = Major
__ttl = 0x1
__tod = 0x51046965 0x3877e308
-----
What could be causing all these issues -- why are the OpenSolaris and Solaris 11 Express installs faster/more stable? Is it a hardware incompatibility issue? How can I determine the root cause and fix it?

Thanks.

Edited by: RavenShadow on Feb 10, 2013 10:17 AM
