13 Replies Latest reply: Apr 26, 2013 2:24 AM by user3206995

understanding Strace output

user3206995 Newbie
Hello All

I attempted to ascertain whether I have an issue with my SAN storage.
Therefore I connected to an Oracle 11g database and took the system ID of my process.
I then ran the following command :
strace -cp 26806

I then ran my SQL statement. Once it had returned my values, I cancelled out of strace, and these are the results I got.
[oracle@xxxx tmp]$ strace -cp 26806
Process 26806 attached - interrupt to quit
Process 26806 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 60.76    1.167108          34     34379           read
 25.56    0.490949           3    176282           poll
 10.01    0.192197           0    444382    177753 recvmsg
  2.15    0.041320           0     88878           sendmsg
  1.33    0.025533           0    183723           times
  0.11    0.002168           0     14809           getrusage
  0.05    0.001027           0      2956           write
  0.01    0.000215          31         7           munmap
  0.01    0.000192           2        82           mmap
  0.00    0.000000           0         7           semctl
------ ----------- ----------- --------- --------- ----------------
100.00    1.920709                945505    177753 total

From the output the whole thing took 1.92 secs. Does this mean the SAN is really responding in that time or should I not use strace to determine that?
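
One caveat on reading these numbers: on Linux, the `seconds` column of `strace -c` is by default the system (kernel CPU) time charged to each call, not wall-clock time, so time the process spends blocked waiting on storage does not show up in the 1.92 s total. The `-T` option prints the elapsed time of each individual call instead. The following Python is a minimal sketch that recomputes the totals and per-call averages from the rows posted above; the function names are made up for illustration.

```python
# Sketch: derive per-call averages from the `strace -c` summary quoted above.
# Assumption: rows keep strace's column order
# (% time, seconds, usecs/call, calls, [errors], syscall).

SUMMARY = """\
60.76 1.167108 34 34379 read
25.56 0.490949 3 176282 poll
10.01 0.192197 0 444382 177753 recvmsg
2.15 0.041320 0 88878 sendmsg
1.33 0.025533 0 183723 times
0.11 0.002168 0 14809 getrusage
0.05 0.001027 0 2956 write
0.01 0.000215 31 7 munmap
0.01 0.000192 2 82 mmap
0.00 0.000000 0 7 semctl
"""


def parse_strace_summary(text):
    """Return {syscall: (seconds, calls)} for each data row."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows have 5 fields, or 6 when the errors column is filled in.
        if len(parts) not in (5, 6) or parts[-1] == "total":
            continue
        try:
            seconds, calls = float(parts[1]), int(parts[3])
        except ValueError:
            continue  # header or separator line
        rows[parts[-1]] = (seconds, calls)
    return rows


def usecs_per_call(rows, syscall):
    """Average time per call in microseconds for one syscall."""
    seconds, calls = rows[syscall]
    return seconds / calls * 1e6


rows = parse_strace_summary(SUMMARY)
total_seconds = sum(s for s, _ in rows.values())
total_calls = sum(c for _, c in rows.values())
print(f"total: {total_seconds:.6f} s over {total_calls} calls")
for name in ("read", "poll"):
    print(f"{name}: {usecs_per_call(rows, name):.1f} usecs/call")
```

An average in the tens of microseconds per `read` is kernel time per call, so on its own it does not say how quickly the SAN answered.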

Thanks in advance
  • 1. Re: understanding Strace output
    Dude! Guru
    How do you know if information was read from disk or buffer/memory?

    On Linux, the strace utility uses ptrace to intercept and record the system calls a process makes. Tracing slows the process down, so the timings it reports can be skewed. It is a debugging tool to analyze system calls and can show you what an application is doing.

    SQL performance analysis is normally done from within the database by analyzing the explain plan and wait events, or with SQL Performance Analyzer.
  • 2. Re: understanding Strace output
    user3206995 Newbie
    Yes, that is understood, but it is possible to trace a process, be it an Oracle dedicated server process or any other process, to find where time is being spent.
    I am trying to determine if our disk configuration has an issue.

    The server is a dedicated Oracle server, and we are seeing high numbers in the wa column of vmstat and the await column of iostat.

    Therefore, I decided to use strace to get timings of a known process.

    Yes, we are looking at tuning some SQL, but so far the SQL tuning has not had any impact on the stats mentioned above.

    I was more concerned with the output of strace and the timings mentioned.


    Thanks in advance for your input
  • 3. Re: understanding Strace output
    Dude! Guru
    Where do you see in the strace output that data was read from or written to the disk of your SAN, and not the database buffer cache or memory?
  • 4. Re: understanding Strace output
    user3206995 Newbie
    Is this not a read from disk, or from cache if one is employed on the SAN?
    I could be wrong.

    60.76 1.167108 34 34379 read
  • 5. Re: understanding Strace output
    user3206995 Newbie
    I am also often seeing vmstat output similar to this:


     r  b   swpd    free    buff    cache si so     bi   bo   in    cs us sy id wa st
     0  2 261380 1170524 1064492 50735272  0  0   1864  187 3492 10014  6  2 72 21  0
     2  2 261380 1167516 1064492 50735276  0  0  42672 2385 3066 10298  9  3 66 23  0
     1  2 261380 1167176 1064492 50735276  0  0 127464  667 3762 11153 18  3 57 22  0
     2  2 261380 1179096 1064492 50735280  0  0 106904  631 4341 12024 27  3 48 22  0
     0  2 261380 1180832 1064492 50735280  0  0  26288 1291 4616 12894 27  3 45 25  0
     3  2 261380 1176624 1064492 50735284  0  0   2272  755 3585 10554 10  2 64 24  0
     3  1 261380 1183336 1064496 50735280  0  0   1792  366 2909  9240 47  7 24 22  0
     5  1 261380 1184056 1064496 50735288  0  0   1072 1917 3606 10330 26  4 58 13  0
     2  1 261380 1184056 1064496 50735288  0  0   1088  509 3097  9549 24  3 49 24  0
     1  1 261380 1184016 1064496 50735292  0  0   1056  247 1760  8802  5  1 70 23  0
     3  0 261380 1183520 1064496 50735292  0  0    360  837 2267  9294  2  1 87 10  0
     3  0 261380 1183644 1064496 50735296  0  0    184  286 1913  8613  2  1 95  2  0
     2  0 261380 1183520 1064496 50735296  0  0    264  651 2519  8907  3  1 88  8  0
     2  1 261380 1183576 1064496 50735296  0  0    208  633 3078  9770  8  1 86  6  0
     0  0 261380 1183544 1064496 50735296  0  0    968  947 2928  9898  3  1 81 14  0
     1  1 261380 1183552 1064496 50735304  0  0   1152  173 2285  9450  3  1 76 20  0
     1  2 261380 1183180 1064496 50735304  0  0   1336  712 3730 13027  8 13 58 22  0
     1  1 261380 1184016 1064496 50735308  0  0   1320 1055 2748 10379  5  3 69 24  0
     1  2 261380 1184200 1064496 50735308  0  0   2248 2981 4024 11249 16  4 45 35  0
     2  0 261380 1183836 1064496 50735320  0  0   1088 1151 2229  9907  6  2 77 14  0
     0  0 261380 1184332 1064496 50735320  0  0    192  747 2550 10535  8  5 86  1  0

    Thanks
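
To put numbers like these side by side, the samples can be averaged rather than eyeballed. The Python below is a rough sketch, not a monitoring tool: it takes the header and the first three sample rows from this post (more rows can be pasted in) and reports the average `wa`, the average `b`, and the peak `bi`. The function names are invented for illustration; the column meanings follow the vmstat man page.

```python
# Sketch: summarize a few vmstat samples. Column meanings follow vmstat(8):
# b = processes in uninterruptible sleep, bi = blocks received from a block
# device per second, wa = percentage of CPU time spent waiting for I/O.

HEADER = "r b swpd free buff cache si so bi bo in cs us sy id wa st"

# First three samples from the post above; paste more rows as needed.
SAMPLES = """\
0 2 261380 1170524 1064492 50735272 0 0 1864 187 3492 10014 6 2 72 21 0
2 2 261380 1167516 1064492 50735276 0 0 42672 2385 3066 10298 9 3 66 23 0
1 2 261380 1167176 1064492 50735276 0 0 127464 667 3762 11153 18 3 57 22 0
"""


def parse_vmstat(header, text):
    """Return one {column: int} dict per sample line."""
    cols = header.split()
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != len(cols):
            continue  # skip blank or truncated lines
        rows.append(dict(zip(cols, map(int, fields))))
    return rows


def summarize(rows):
    """Average I/O wait and blocked processes, plus the peak read rate."""
    n = len(rows)
    return {
        "avg_wa": sum(r["wa"] for r in rows) / n,
        "avg_blocked": sum(r["b"] for r in rows) / n,
        "peak_bi": max(r["bi"] for r in rows),
    }


stats = summarize(parse_vmstat(HEADER, SAMPLES))
print(stats)
```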
  • 6. Re: understanding Strace output
    Dude! Guru
    From what I understand these tools can collect I/O statistics and show performance data, but there is no way to tell whether a specific I/O was buffered or physical.

    Btw, your virtual memory management can have a very large impact on performance. Oracle Database, for instance, uses shared memory, and kernel hugepages can make a real performance difference.
  • 7. Re: understanding Strace output
    user3206995 Newbie
    Hi dude,

    thanks for your continued input.
    For sure there appears to be an IO issue at the OS level.

    I guess my issue is to determine whether there is anything one can do at the OS level to improve things.
    Is it possible to analyze how well the OS cache is performing? We do have a large OS cache.


    Will huge pages improve performance, then? I think I checked and they were not being used (Linux 5.9).


    Thanks in advance
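
Whether hugepages are configured can be read from the `HugePages_Total`, `HugePages_Free` and `Hugepagesize` lines of `/proc/meminfo`. The Python below is a small sketch of that check. The sample text is hypothetical and only used when `/proc/meminfo` cannot be read, and the helper names are made up for illustration.

```python
# Sketch: check whether kernel hugepages are configured and allocated by
# reading the HugePages_* counters that Linux exposes in /proc/meminfo.
import os

# Hypothetical sample, used only when /proc/meminfo is not available.
SAMPLE_MEMINFO = """\
MemTotal:     64973164 kB
HugePages_Total:     0
HugePages_Free:      0
Hugepagesize:     2048 kB
"""


def parse_meminfo(text):
    """Return {field: int}, dropping the trailing 'kB' unit where present."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def hugepage_status(info):
    """Report whether a hugepage pool exists and how much of it is taken."""
    total = info.get("HugePages_Total", 0)
    free = info.get("HugePages_Free", 0)
    return {"configured": total > 0, "total": total, "used": total - free}


def read_meminfo(path="/proc/meminfo"):
    """Read the live file on Linux, else fall back to the sample text."""
    if os.path.exists(path):
        with open(path) as fh:
            return fh.read()
    return SAMPLE_MEMINFO


print(hugepage_status(parse_meminfo(read_meminfo())))
```

A `HugePages_Total` of 0 means no hugepage pool has been reserved, so the SGA cannot be using one.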
  • 8. Re: understanding Strace output
    Dude! Guru
    Kernel hugepages can be a substantial improvement. However, so far you have not provided any information about your OS version, hardware configuration and Oracle database configuration. Based on what information should there be any recommendation? No one can troubleshoot your system or make any recommendation based on some SQL query that took 1.92 secs.
  • 9. Re: understanding Strace output
    user3206995 Newbie
    Hi Dude

    Yes, I agree I should have provided a bit more system info, let me give it a go.

    We are running a 2-node Oracle RAC configuration on the following:

    Linux servername 2.6.18-308.el5 #1 SMP Fri Jan 27 17:17:51 EST 2012 x86_64 x86_64 x86_64 GNU/Linux

    64973164k total memory

    Swap: 8193140k total, 279088k used, 7914052k free, 51461224k cached

    2 Physical CPUs
    2 cores per cpu
    4 Logical cpus

    We are using an HP SAN.

    Oracle 11.2.0.3 two-node RAC with Data Guard to a single-instance standby server


    If any other info may be required, please let me know.

    Thanks in advance
  • 10. Re: understanding Strace output
    Dude! Guru
    How are your Oracle SGAs configured?
  • 11. Re: understanding Strace output
    user3206995 Newbie
    pga_aggregate_target 4G

    sga_max_size 20032M
    sga_target 20032M

    Total System Global Area 2,0911E+10 bytes
    Fixed Size 2237488 bytes
    Variable Size 6845107152 bytes
    Database Buffers 1,4026E+10 bytes
    Redo Buffers 38195200 bytes



    Thanks
  • 12. Re: understanding Strace output
    Dude! Guru
    If you have not done so already, I suggest you set up kernel hugepages and configure the database for ASMM. You may have to remove the memory_target and memory_max_target parameters.

    Oracle recommends kernel hugepages when using more than 8 GB. POSIX /dev/shm shared memory uses 4 KB pages, whereas hugepages use 2 MB, resulting in a much smaller page table and better use of the TLB, which can improve performance considerably. Kernel hugepages are reserved at system startup and cannot be swapped to disk. What level of performance increase you will see I do not know, but you might want to configure it anyway.
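
To give a feel for the page-table difference, here is a rough Python sketch of the arithmetic. It assumes the 20032 MB `sga_max_size` quoted earlier in this thread, 2 MB hugepages and 4 KB regular pages. `vm.nr_hugepages` is the kernel setting that reserves the pool, and a real setup would normally add some headroom on top of the figure printed here.

```python
# Sketch: rough hugepage sizing for the SGA quoted in this thread.
# Assumptions: sga_max_size = 20032 MB (from the post above), 2 MB hugepages
# and 4 KB regular pages, as on typical x86_64 Linux.

SGA_MB = 20032
HUGEPAGE_KB = 2048
SMALL_PAGE_KB = 4


def pages_needed(size_mb, page_kb):
    """Number of pages of page_kb needed to map size_mb, rounded up."""
    size_kb = size_mb * 1024
    return -(-size_kb // page_kb)  # ceiling division


huge = pages_needed(SGA_MB, HUGEPAGE_KB)
small = pages_needed(SGA_MB, SMALL_PAGE_KB)
print(f"2 MB hugepages needed: {huge}")
print(f"4 KB pages for the same SGA: {small}")
print(f"vm.nr_hugepages = {huge}  # starting point; real setups add headroom")
```

Each 2 MB hugepage replaces 512 regular 4 KB pages, which is where the smaller page table comes from.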

    To test your disk/SAN performance you can use a benchmark tool like Oracle Orion, which simulates database load.

    http://docs.oracle.com/cd/E11882_01/server.112/e16638/iodesign.htm#BABFCFBC
    http://download.oracle.com/otn/utilities_drivers/orion/Orion_Users_Guide.pdf

    The Orion tool is free, but it is no longer where it used to be, or Oracle has hidden it very well. Google does not find it anymore. However, Orion is included with 11gR2, located in $ORACLE_HOME/bin.

    You also seem to be using the RHEL kernel and not Oracle UEK.
  • 13. Re: understanding Strace output
    user3206995 Newbie
    Hi Dude

    OK, thanks for the tip. I will check it out. I only just noticed that my SGA is 20 GB, and I am not comfortable with that.

    Will have to investigate that one.
    Thanks again for your time and patience
