1 2 Previous Next 18 Replies Latest reply: Oct 30, 2012 7:24 PM by 970981 RSS

    Write I/O performance slow in RAC on SAN

    970981
      Currently building a new 11.2.0.3 RAC on OL6.

      I am using ASM on top of a SAN that is exposing 4x2TB LUNS to the 3 nodes, as the DATA disk group.
      The nodes are communicating over 10gb isci to the SAN.

      async_io=true
      filesystemio_options=setall

      When creating a test table space I am only seeing about 3-5 mb/s write throughput from Oracle. When testing with ioZone tool to write 8k block size files on to the drives
      I am seeing ~115 mb/s throughput.

      I have verified that DBWR is calling io_submit by using strace on the process.

      Not sure where to go from here with troubleshooting?

      Thanks.

      Edited by: mosho88 on Oct 29, 2012 12:43 PM
        • 1. Re: Write I/O performance to slow in RAC on ASM
          sb92075
          user4721711 wrote:
          Currently building a new 11.2.0.3 RAC on OL6.

          I am using ASM on top of a SAN that is exposing 4x2TB LUNS to the 3 nodes, as the DATA disk group.
          The nodes are communicating over 10gb isci to the SAN.

          async_io=true
          filesystemio_options=setall

          When creating a test table space I am only seeing about 3-5 mb/s write throughput from Oracle. When testing with ioZone tool to write 8k block size files on to the drives
          I am seeing ~115 mb/s throughput.
          post SQL that populates the table.
          post SQL & results that shows the 3 - 5 MB/s

          Have you performed the same test without SAM and/or ASM
          • 2. Re: Write I/O performance to slow in RAC on ASM
            970981
            Sql to create the tablespace:

            CREATE BIGFILE TABLESPACE TEST_TBS NOLOGGING
            DATAFILE '+DATA' SIZE 100G
            AUTOEXTEND ON NEXT 10G
            BLOCKSIZE 8K

            I am using the script located at http://oracle8dba.files.wordpress.com/2011/10/asmperf1.doc to gather the timings over a 60 second window.

            Output from one of the runs showing 24 mb/s which is 6mb/s for each of the 4 disks in the group.

            DISKGROUP_NAME READS WRITES READ_MB/s WRITE_MB/s READ_AVG_SPEED_MSEC WRITE_AVG_SPEED_MSEC AVG_RESP_TIME_MSEC
            DATA 652 1628 0.1 24.4 0.26 2.750 2.040


            Plan is on Monday to create a storage area off the SAN and test and ASM diskgroup to filesystem, then test with out ASM + SAN to benchmark that.

            Thanks.
            • 3. Re: Write I/O performance to slow in RAC on ASM
              JohnWatson
              I would try the built-in disc calibration facility. Run a script something like this:
              var al number
              var mi number
              var mm number
              exec dbms_resource_manager.calibrate_io(num_physical_disks=>4,max_iops=>:mi,max_mbps=>:mm,actual_latency=>:al)
              print al
              print mi
              print mm
              That should show the results for a "typical" database workload. Ther results are often surprising. You may find that a daabase on your laptop out performs something an a massive SAN.
              • 4. Re: Write I/O performance to slow in RAC on ASM
                Aman....
                In addition to John's suggestion, you may also want to look at Orion ,
                http://www.oracle.com/technetwork/topics/index-089595.html

                And SLOB
                http://kevinclosson.wordpress.com/2012/02/06/introducing-slob-the-silly-little-oracle-benchmark/

                HTH
                Aman....
                • 5. Re: Write I/O performance to slow in RAC on ASM
                  JohnWatson
                  I just ran exactly that routine on my laptop (which has one disc, not four) with these results:
                  orcl> print al
                  
                          AL
                  ----------
                          29
                  
                  orcl> print mi
                  
                          MI
                  ----------
                          67
                  
                  orcl> print mm
                  
                          MM
                  ----------
                          43
                  
                  orcl>
                  • 6. Re: Write I/O performance to slow in RAC on ASM
                    970981
                    Thanks for the suggestions, I ran the i/o calibration routine, here are the results

                    AL
                    --
                    19

                    MI
                    ----
                    2485

                    MM
                    ---
                    365


                    So it looks like the SAN is capable of much more than 24 mb/s, is it possible that the CREATE TABLESPACE just doesn't push that many writes per second?

                    I am downloading Orion and SLOB to give them a try.

                    Thanks
                    • 7. Re: Write I/O performance to slow in RAC on ASM
                      970981
                      After poking around further it seems that I/O is not slow, but I/O done in response to a CREATE statement, i.e. CREATE TABLESPACE
                      CREATE TABLE results in a "Datafile init write" wait on the SYSTEM tablespace and the creating session blocks on this wait.
                      • 8. Re: Write I/O performance to slow in RAC on ASM
                        JohnWatson
                        Does that mean that performance is OK? I wouldn't think that you will be running CREATE often on the live system.

                        But just in general, I would suggest that you ask for 8 or 16 LUNs for an 8T disc group. That will give ASM much better parallelism (which might correspond more closely to the underlying number of spindles) and will reduce the "required mirror free" space too.
                        Also, what alloction unit are you using? I use 4M now, not the 1M default. And of course you must create your tablespaces with uniform size, perhaps 8M, so that your segment extent boundaries will always align with the file extents.
                        • 9. Re: Write I/O performance to slow in RAC on ASM
                          Catch-22
                          Output from one of the runs showing 24 mb/s which is 6mb/s for each of the 4 disks in the group.
                          ASM is not RAID and functions on the concept of disk failure groups and available free space.

                          I would consider 24 or even 115 mb/s a very slow performance for a modern SAN system. 115 MB/s is what you should see on 1 Gigabit interface, but not 10gb iScsi as you wrote.

                          I suggest to verify your network and SAN configuration. Perhaps you have a bad cable, bad firmware, bad switch or bad configuration of the system.

                          Here is an interesting article to verify async io in Oracle database.
                          http://fritshoogland.wordpress.com/2010/07/21/doing-synchronous-io-on-asm/
                          • 10. Re: Write I/O performance to slow in RAC on ASM
                            970981
                            It does seem like the SAN is capable of much more. When I ran the diagnostic that JohnWatson suggested, the RAC was able to produce these numbers

                            avg response = 19ms

                            max io operations = 2485

                            max mb/s = 365

                            In reading the description of that procedure it seems to only use reads od DB_BLOCK_SIZE ? So maybe it is only write performance that is suffering. After testing more it seems to be pretty stable around 10mb/s per create statement, that is if I can get 3 creates running, one on each node I see about 30mb/s throughput, 2 ~ 20mb/s 1 ~ 10mb/s

                            I did read that article you suggested and performed the tests and verified that asyn i/o is on, I am beginning to think that CREATE statements perform sync i/o and that is just the way it is. But if this is the performance to expect from CREATE TABLESPACE, we are looking at ~50 hours to pre-allocate our 2TB production space...

                            Thanks
                            • 11. Re: Write I/O performance to slow in RAC on ASM
                              Catch-22
                              Also keep in mind that you cannot get 300 MB/s throughput when creating 1 MB of data.
                              • 12. Re: Write I/O performance to slow in RAC on ASM
                                970981
                                This statement has now been running for about 20hrs

                                CREATE BIGFILE TABLESPACE TEST_TBS
                                DATAFILE '+DATA'
                                SIZE 2T

                                and I/O throughput remains steady at ~10-12mb/s

                                The plan today is to try to write to the SAN from one of our single instance servers and benchmark speed from another rack and single node instance.

                                Thanks
                                • 13. Re: Write I/O performance to slow in RAC on ASM
                                  Levi Pereira
                                  Hi,
                                  I have faced some performance I/O issues when using ASM, but on all diagnose that realized problem was on OS or Storage level.

                                  When using ISCI and RAC we must be aware about latency of I/O, on this scenary the latency usually be the villain of the story.

                                  Generally I discard tests sequential reads or writes, because the characteristics of OLPT database is massive amount of writes/reads of small blocks randomly, is not a rule, but you can believe that if the random writes and reads are acceptable then the readings and writes sequential are excelent.

                                  So, one of the big mistakes is to focus the problem on ASM when talking about I/O perfomance, I recommend you first focus on OS and Storage layer, if you make sure that I/O perfomance is acceptable then diagnose ASM. This is because the ASM is only a layer that depends exclusively on the OS and Storage performance.

                                  I recommend you use ORION, because he is a tool for predicting the performance of an Oracle database, unlike other I/O calibration tools, Oracle Orion is expressly designed for simulating Oracle database I/O workloads using the same I/O software stack as Oracle. Orion can also simulate the effect of striping performed by Oracle Automatic Storage Management.

                                  For each type of workload, Orion can run tests using different I/O loads to measure performance metrics such as MBPS, IOPS, and I/O latency. Load is expressed in terms of the number of outstanding asynchronous I/Os. Internally, for each such load level, the Orion software keeps issuing I/O requests as fast as they complete to maintain the I/O load at that level. For random workloads, using either large or small sized I/Os, the load level is the number of outstanding I/Os.

                                  PS: Using Orion to write tests will obliterate all data on the specified LUNS.

                                  http://docs.oracle.com/cd/E11882_01/server.112/e16638/iodesign.htm#PFGRF95222

                                  As you are using RAC you should make two test separately, because ORION does not simulate I/O concurrence on a Cluster.

                                  When you get report of ORION, make a second test using DBMS_RESOURCE_MANAGER.CALIBRATE_IO with all instance active.
                                  This statement has now been running for about 20hrs
                                  
                                  CREATE BIGFILE TABLESPACE TEST_TBS
                                  DATAFILE '+DATA'
                                  SIZE 2T
                                  
                                  and I/O throughput remains steady at ~10-12mb/s
                                  Also try change filesystemio_options=setall to filesystemio_options=asynch :
                                  Because, when setting the filesystemio_options=asynch you are forcing the use of the file system cache and not writing physically on the disk.
                                  This is because with Asynchronous I/O in use, the system call will return immediately once the I/O request has been passed down to the hardware or queued in the operating system, typically before the physical I/O operation has even begun.
                                  In this case, Async I/O is much faster than Direct IO or Concurrent IO, however, it does not verify that the data is physically written to disk for which we rely on the OS.


                                  Regards,
                                  Levi Pereira
                                  • 14. Re: Write I/O performance to slow in RAC on ASM
                                    Catch-22
                                    Why not simply trying a dd command to see what throughput you get:

                                    <pre>
                                    dd if=/dev/zero of=/san/u10/testfile bs=10G count=1
                                    </pre>

                                    You have also the hdparm command:

                                    <pre>
                                    hdparm -tT /dev/sda
                                    </pre>

                                    This should give you some idea what the OS/SAN can approx. achieve.

                                    You can also copy a file between the OS and ASM to see what you get (asmcmd).

                                    It might help you to get a clue whether the problem is OS or database. Do you have any local storage? What are the results create a datafile on a local disk?
                                    1 2 Previous Next