0 Replies Latest reply: Nov 6, 2012 9:24 AM by TonyM

    Need help diagnosing possible server performance issue(s)

    TonyM
      First off, I'm a DBA and not a Solaris Admin. I don't have root access, so I can't see/get all details. I only have access to what a "normal" user can use for performance tools.

      We have a 2-node Oracle RAC running Solaris 10 and Oracle 11gR2 11.2.0.3. We have 14 databases running on this cluster. Some of the applications are poorly designed. We do daily exports and RMAN backups. Total size for all of the databases on the server is around 1.5TB. Needless to say, it is a busy mess.

      Most of the time things run well enough that nobody complains. Occasionally we have issues where a query will go from taking 2 seconds to 20 seconds. The SysAdmins don't believe there is an OS/system issue. They always consider it an application code problem. My view is that something that runs in 2 seconds 95% of the time and in 20 seconds the other 5% of the time isn't due to application code; it's a resource contention problem. I'm stuck between the application dev/DBAs and the System Administration teams, looking for answers.

      We had some issues this morning with things running slow. Looking solely at some OS performance metrics, is there anything that stands out as a red flag to any of you? The user/sys/idle CPU numbers don't look bad, but I'm not sure about the other columns. To me, it looks like a lot of interrupts. Another thing is that there are hardly ever more than 10 processes running on these servers, even though there are 1400 processes on average.

      -----------------------------------------------------------------------------------------------------------------------
      uname -a for both servers:
      SunOS fin-ss65 5.10 Generic_147440-19 sun4u sparc SUNW,Sun-Fire-V890
      -----------------------------------------------------------------------------------------------------------------------

      Node 1:

      load averages: 18.90, 20.65, 21.63 08:34:43
      1430 processes: 1413 sleeping, 6 running, 1 zombie, 1 stopped, 9 on cpu

      Memory: 64G real, 30G free, 24G swap free

        PID USERNAME THR PRI NICE SIZE RES STATE   TIME   CPU COMMAND
       2834 oracle     2  10    0   0K  0K cpu0  101:25 6.28% oracle
       5092 oracle    11  22    0   0K  0K sleep   2:59 6.23% oracle
      11838 oracle     1  11    0   0K  0K run     0:28 6.14% oracle
      26070 oracle    11  21    0   0K  0K cpu18   7:03 5.12% oracle
       8120 oracle     2  21    0   0K  0K cpu16   0:25 3.44% oracle
      27138 root      11 100  -20   0K  0K sleep  24.0H 3.28% osysmond.bin
      21059 oracle   763  59    0   0K  0K sleep  74.1H 3.23% java
      19309 oracle    26 100  -20   0K  0K sleep 406.4H 2.99% ocssd.bin
      20144 oracle    55  59    0   0K  0K sleep 340.6H 2.28% oraagent.bin
       9013 hrbid      2  42    2   0K  0K sleep  37:19 2.04% oracle
       2638 oracle     2  53    0   0K  0K run     0:54 1.99% oracle
       3638 oracle     2  23    0   0K  0K sleep   1:01 1.95% oracle
      29052 oracle     2  59    0   0K  0K sleep   1:54 1.56% oracle
      12154 oracle     1  22    0   0K  0K run     0:04 1.46% oracle
      13593 oracle     2  11    0   0K  0K run     3:20 1.36% oracle


      CPU minf mjf xcal  intr ithr  csw icsw migr smtx srw syscl usr sys wt idl
        0  705   1 1784  3323 2066 3379  731  980  234   3  6222  50  18  0  31
        1  747   1 1498  2767 1355 4008  794 1084  231   2  6013  48  16  0  35
        2  637   1 1321 14826 1416 3438  710  971  215   3  5859  51  16  0  33
        3  813   1 1327  2462  820 4181  857 1134  233   2  6252  49  15  0  35
       16  615   1 1331  2618 1335 3510  709  969  207   3  5842  51  16  0  33
       17  729   1 1265  2276  812 4131  805 1096  228   2  5984  49  15  0  36
       18  601   1 1181  2148  899 3413  669  941  195   3  5675  52  15  0  33
       19  716   1 1257  2293  829 4179  787 1109  228   2  5969  48  15  0  37

        tty         md0           md1           md2           md3           cpu
      tin tout  kps tps serv  kps tps serv  kps tps serv  kps tps serv  us sy wt id
        0   27  208   6   17  155   4   17  157   4   17  152   4   19  50 16  0 34

      kthr       memory                 page              disk          faults         cpu
      r b w     swap     free  re   mf  pi po fr de sr m0 m1 m2 m3    in    sy    cs us sy id
      2 1 0 42406296 32797288 851 5563 375  8  8  0  0  6  4  4  4 32712 47816 30240 50 16 34
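
      One way to make the "a lot of interrupts" impression concrete is to compare each CPU's intr value in the mpstat output against the median across CPUs. The sketch below does that for the Node 1 numbers pasted above. The function name interrupt_skew and the 3x-median threshold are illustrative assumptions, not a standard rule, and a single mpstat sample cannot show whether the skew is actually a problem.

```python
from statistics import median

# intr column per CPU id, copied from the Node 1 mpstat sample in this post.
node1_intr = {
    0: 3323, 1: 2767, 2: 14826, 3: 2462,
    16: 2618, 17: 2276, 18: 2148, 19: 2293,
}


def interrupt_skew(intr_by_cpu, factor=3.0):
    """Return {cpu: ratio_to_median} for CPUs above factor x the median intr.

    The threshold (factor) is an arbitrary illustrative choice.
    """
    mid = median(intr_by_cpu.values())
    return {
        cpu: round(count / mid, 1)
        for cpu, count in intr_by_cpu.items()
        if count > factor * mid
    }


if __name__ == "__main__":
    print(interrupt_skew(node1_intr))
```

      With these numbers only CPU 2 is flagged. Node 2 shows the same pattern in its mpstat sample (13439 on CPU 2), which may simply mean that one device's interrupts are bound to that CPU.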

      -----------------------------------------------------------------------------------------------------------------------

      Node 2:

      load averages: 12.05, 15.46, 17.72 08:31:01
      1429 processes: 1416 sleeping, 3 running, 1 zombie, 2 stopped, 7 on cpu

      Memory: 64G real, 30G free, 24G swap free


        PID USERNAME THR PRI NICE SIZE RES STATE   TIME   CPU COMMAND
      16033 root      11 100  -20   0K  0K cpu1   65.4H 8.21% osysmond.bin
       2783 oracle    11  21    0   0K  0K run    12:38 7.29% oracle
        388 oracle     1  31    0   0K  0K sleep   0:21 5.04% oracle
        392 oracle     1  32    0   0K  0K cpu0    0:14 3.81% oracle
      29839 oracle     1  53    0   0K  0K sleep   0:39 3.39% oracle
      17883 oracle    26 100  -20   0K  0K sleep 393.5H 2.90% ocssd.bin
      22891 oracle     2  59    0   0K  0K sleep   0:58 2.51% oracle
      11657 oracle     2  59    0   0K  0K sleep  54:00 2.35% oracle
      20082 oracle    55  59    0   0K  0K sleep 343.3H 2.29% oraagent.bin
      10305 oracle     2  59    0   0K  0K sleep  19:25 1.97% oracle
      17501 oracle     2  31    0   0K  0K sleep  41:56 1.71% oracle
      21345 oracle     1  49    0   0K  0K sleep   1:30 1.64% oracle
       4293 oracle     2  59    0   0K  0K sleep   1:10 1.62% oracle
      19823 oracle   756  59    0   0K  0K sleep  83.2H 1.54% java
      12371 oracle     2  59    0   0K  0K sleep  44:03 1.37% oracle


      CPU minf mjf xcal  intr ithr  csw icsw migr smtx srw syscl usr sys wt idl
        0  731   1 1634  3206 1986 3347  674  979  219   2  6760  45  18  0  37
        1  728   1 1343  2618 1268 3998  680 1054  216   3  6578  46  16  0  38
        2  661   1 1220 13439 1373 3439  662  979  216   1  6529  45  16  0  39
        3  780   1 1228  2425  858 4174  729 1113  215   3  6805  47  15  0  38
       16  635   1 1223  2545 1291 3496  658  974  202   2  6490  45  16  0  39
       17  698   1 1156  2211  834 4056  678 1055  208   3  6446  47  15  0  38
       18  640   1 1116  2123  873 3482  644  965  197   1  6489  46  15  0  39
       19  692   1 1160  2243  854 4142  671 1076  209   3  6473  47  15  0  39

        tty         md0           md1           md2           md3           cpu
      tin tout  kps tps serv  kps tps serv  kps tps serv  kps tps serv  us sy wt id
        0   23  200   6   17  150   4   17  152   4   17  147   4   18  46 16  0 38

      kthr       memory                 page              disk          faults         cpu
      r b w     swap     free  re   mf  pi po fr de sr m0 m1 m2 m3    in    sy    cs us sy id
      2 1 0 41176984 32511592 724 5566 234 10 10  0  0  6  4  4  4 30811 52569 30134 46 16 38
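
      The load averages are easier to compare against the CPU count when divided by it. The sketch below does that arithmetic for both nodes, assuming 8 CPUs per node because mpstat lists 8 CPU ids (0-3 and 16-19). The helper name load_per_cpu is made up for illustration. A ratio above 1 is commonly read as more runnable threads than CPUs on average, but it is only a rough indicator and here it sits next to 30-39% idle CPU, so it should not be treated as proof of CPU contention.

```python
N_CPUS = 8  # assumption: one entry per CPU id shown by mpstat (0-3, 16-19)

# 1-, 5-, and 15-minute load averages copied from the top output in this post.
load_averages = {
    "node1": [18.90, 20.65, 21.63],
    "node2": [12.05, 15.46, 17.72],
}


def load_per_cpu(load_avg, n_cpus=N_CPUS):
    """Return the load average divided by the number of CPUs, rounded to 2 places."""
    return round(load_avg / n_cpus, 2)


if __name__ == "__main__":
    for node, loads in load_averages.items():
        ratios = [load_per_cpu(value) for value in loads]
        print(node, ratios)
```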