3 Replies Latest reply on Nov 26, 2012 10:26 AM by StewCobley

    Kernel panic when trying to start an OEL5u5 template VM


      I am following a guide to set up various Oracle VM templates. So far I have installed OVS 2.2 and got the OVM Manager working, imported the template for OEL5U5, and created a VM from it. The problem comes when starting that VM.

      The log in the OVMM console shows the following:

      Update VM Status - Running
      Configure CPU Cap
      Set CPU Cap: failed:<Exception: failed:<Exception: ['xm', 'sched-credit', '-d', '32_EM11g_OVM', '-c', '0'] => Error: Domain '32_EM11g_OVM' does not exist.

      File "/opt/ovs-agent-2.3/OVSXXenVMConfig.py", line 2531, in xen_set_cpu_cap
      File "/opt/ovs-agent-2.3/OVSCommons.py", line 92, in run_cmd
      raise Exception('%s => %s' % (args, err))

      The xend.log shows:

      [2012-11-12 16:42:01 7581] DEBUG (DevController:139) Waiting for devices vtpm.
      [2012-11-12 16:42:01 7581] INFO (XendDomain:1180) Domain 32_EM11g_OVM (3) unpaused.
      [2012-11-12 16:42:03 7581] WARNING (XendDomainInfo:1907) Domain has crashed: name=32_EM11g_OVM id=3.
      [2012-11-12 16:42:03 7581] ERROR (XendDomainInfo:2041) VM 32_EM11g_OVM restarting too fast (Elapsed time: 11.377262 seconds). Refusing to restart to avoid loops .
      [2012-11-12 16:42:03 7581] DEBUG (XendDomainInfo:2757) XendDomainInfo.destroy: domid=3
      [2012-11-12 16:42:12 7581] DEBUG (XendDomainInfo:2230) Destroying device model
      [2012-11-12 16:42:12 7581] INFO (image:553) 32_EM11g_OVM device model terminated

      As always, any help is much appreciated.

      Edited by: StewCobley on 13-Nov-2012 06:23
        • 1. Re: Set CPU Cap: failed when trying to start an OEL5u5 template vm
          I have set on_crash="preserve" in the vm.cfg and then run xm create -c to get the console while booting. This is the log of what happens:

          Started domain 32_EM11g_OVM (id=4)
          Bootdata ok (command line is ro root=LABEL=/ )
          Linux version 2.6.18- (mockbuild@ca-build10.us.oracle.com) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-48)) #1 SMP Mon Mar 29 18:27:00 EDT 2010
          BIOS-provided physical RAM map:
          Xen: 0000000000000000 - 0000000180800000 (usable)
          No mptable found.
          Built 1 zonelists. Total pages: 1574912
          Kernel command line: ro root=LABEL=/
          Initializing CPU#0
          PID hash table entries: 4096 (order: 12, 32768 bytes)
          Xen reported: 1600.008 MHz processor.
          Console: colour dummy device 80x25
          Dentry cache hash table entries: 1048576 (order: 11, 8388608 bytes)
          Inode-cache hash table entries: 524288 (order: 10, 4194304 bytes)
          Software IO TLB disabled
          Memory: 6155256k/6299648k available (2514k kernel code, 135548k reserved, 1394k data, 184k init)
          Calibrating delay using timer specific routine.. 4006.42 BogoMIPS (lpj=8012858)
          Security Framework v1.0.0 initialized
          SELinux: Initializing.
          selinux_register_security: Registering secondary module capability
          Capability LSM initialized as secondary
          Mount-cache hash table entries: 256
          CPU: L1 I Cache: 64K (64 bytes/line), D cache 16K (64 bytes/line)
          CPU: L2 Cache: 2048K (64 bytes/line)
          general protection fault: 0000 [1] SMP
          last sysfs file:
          CPU 0
          Modules linked in:
          Pid: 0, comm: swapper Not tainted 2.6.18- #1
          RIP: e030:[<ffffffff80271280>] [<ffffffff80271280>] identify_cpu+0x210/0x494
          RSP: e02b:ffffffff80643f70 EFLAGS: 00010212
          RAX: 0040401000810008 RBX: 0000000000000000 RCX: 00000000c001001f
          RDX: 0000000000404010 RSI: 0000000000000001 RDI: 0000000000000005
          RBP: ffffffff8063e980 R08: 0000000000000025 R09: ffff8800019d1000
          R10: 0000000000000026 R11: ffff88000102c400 R12: 0000000000000000
          R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
          FS: 0000000000000000(0000) GS:ffffffff805d2000(0000) knlGS:0000000000000000
          CS: e033 DS: 0000 ES: 0000
          Process swapper (pid: 0, threadinfo ffffffff80642000, task ffffffff804f4b80)
          Stack: 0000000000000000 ffffffff802d09bb ffffffff804f4b80 0000000000000000
          0000000021100800 0000000000000000 0000000000000000 ffffffff8064cb00
          0000000000000000 0000000000000000
          Call Trace:
          [<ffffffff802d09bb>] kmem_cache_zalloc+0x62/0x80
          [<ffffffff8064cb00>] start_kernel+0x210/0x224
          [<ffffffff8064c1e5>] _sinittext+0x1e5/0x1eb

          Code: 0f 30 b8 73 00 00 00 f0 0f ab 45 08 e9 f0 00 00 00 48 89 ef
          RIP [<ffffffff80271280>] identify_cpu+0x210/0x494
          RSP <ffffffff80643f70>
          <0>Kernel panic - not syncing: Fatal exception

          Clear as mud to me.

          Are there any other logs that will help me?

          I have now deployed another VM from the same template, using the default VM settings rather than adding more memory etc., and I get exactly the same error.
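
For anyone reading a trace like the one above, here is a minimal Python sketch (not from the original poster; the function name `summarize_oops` is hypothetical) that pulls out the faulting function, the RCX value, and whether the first code bytes are an MSR instruction. It assumes the `Code:` listing starts at the faulting instruction when no byte is marked with `<>`, which may not hold for every kernel. Opcode `0f 30` is WRMSR, which takes its MSR index from ECX, and the RCX value here falls in the AMD-specific `0xC001xxxx` range. That makes the trace consistent with the guest kernel faulting on an MSR write during CPU identification, but this reading is an inference and not something the trace alone proves.

```python
import re

# x86 opcodes for the two MSR access instructions.
MSR_OPCODES = {"0f 30": "wrmsr", "0f 32": "rdmsr"}


def summarize_oops(text):
    """Extract a few fields from a Linux 2.6-style oops trace.

    Returns a dict; a field stays None when its line is missing.
    """
    summary = {"function": None, "rcx": None, "instruction": None}

    # e.g. "RIP: e030:[<...>] [<...>] identify_cpu+0x210/0x494"
    rip = re.search(r"RIP\b.*\s(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+", text)
    if rip:
        summary["function"] = rip.group(1)

    # RCX holds the MSR index for rdmsr/wrmsr.
    rcx = re.search(r"RCX:\s*([0-9a-f]{16})", text)
    if rcx:
        summary["rcx"] = int(rcx.group(1), 16)

    # Bytes on the "Code:" line; a faulting byte may be marked like <0f>.
    code = re.search(r"Code:[ \t]*((?:<?[0-9a-f]{2}>?[ \t]*)+)", text)
    if code:
        tokens = code.group(1).split()
        # ASSUMPTION: with no <..> marker, treat the first byte as the one at RIP.
        start = next((i for i, t in enumerate(tokens) if t.startswith("<")), 0)
        first_two = " ".join(t.strip("<>") for t in tokens[start:start + 2])
        summary["instruction"] = MSR_OPCODES.get(first_two)

    return summary


# Three lines copied from the trace in this post.
sample = """\
RIP: e030:[<ffffffff80271280>] [<ffffffff80271280>] identify_cpu+0x210/0x494
RAX: 0040401000810008 RBX: 0000000000000000 RCX: 00000000c001001f
Code: 0f 30 b8 73 00 00 00 f0 0f ab 45 08 e9 f0 00 00 00 48 89 ef
"""
result = summarize_oops(sample)
print(result["function"], hex(result["rcx"]), result["instruction"])
```

The same function can be pointed at a full saved console log; it only looks at the `RIP`, `RCX`, and `Code:` lines and ignores everything else.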

          Edited by: StewCobley on 13-Nov-2012 06:21
          • 2. Re: Set CPU Cap: failed when trying to start an OEL5u5 template vm
            Which version of OVS are you using? 3.1.1? Which template are you struggling with?
            • 3. Re: Set CPU Cap: failed when trying to start an OEL5u5 template vm
              I have had word from Oracle on this, and apparently it is a known issue: my server (Dell 815), our processors (Opterons), our OVS version (2.2), and the kernel version of OEL 5u5 do not work in combination. Change any one of the variables and it works... which is brilliant, except we have to use this server with this version of OVS for the particular 5u5 templates we are installing :D

              We are going to try a workaround with a live CD and upgrade the kernel once the templates are installed.