This content has been marked as final. Show 18 replies
user4721711 wrote:post SQL that populates the table.
Currently building a new 18.104.22.168 RAC on OL6.
I am using ASM on top of a SAN that is exposing 4x2TB LUNS to the 3 nodes, as the DATA disk group.
The nodes are communicating over 10gb isci to the SAN.
When creating a test table space I am only seeing about 3-5 mb/s write throughput from Oracle. When testing with ioZone tool to write 8k block size files on to the drives
I am seeing ~115 mb/s throughput.
post SQL & results that shows the 3 - 5 MB/s
Have you performed the same test without SAM and/or ASM
Sql to create the tablespace:
CREATE BIGFILE TABLESPACE TEST_TBS NOLOGGING
DATAFILE '+DATA' SIZE 100G
AUTOEXTEND ON NEXT 10G
I am using the script located at http://oracle8dba.files.wordpress.com/2011/10/asmperf1.doc to gather the timings over a 60 second window.
Output from one of the runs showing 24 mb/s which is 6mb/s for each of the 4 disks in the group.
DISKGROUP_NAME READS WRITES READ_MB/s WRITE_MB/s READ_AVG_SPEED_MSEC WRITE_AVG_SPEED_MSEC AVG_RESP_TIME_MSEC
DATA 652 1628 0.1 24.4 0.26 2.750 2.040
Plan is on Monday to create a storage area off the SAN and test and ASM diskgroup to filesystem, then test with out ASM + SAN to benchmark that.
I would try the built-in disc calibration facility. Run a script something like this:
That should show the results for a "typical" database workload. Ther results are often surprising. You may find that a daabase on your laptop out performs something an a massive SAN.
var al number var mi number var mm number exec dbms_resource_manager.calibrate_io(num_physical_disks=>4,max_iops=>:mi,max_mbps=>:mm,actual_latency=>:al) print al print mi print mm
Thanks for the suggestions, I ran the i/o calibration routine, here are the results
So it looks like the SAN is capable of much more than 24 mb/s, is it possible that the CREATE TABLESPACE just doesn't push that many writes per second?
I am downloading Orion and SLOB to give them a try.
Does that mean that performance is OK? I wouldn't think that you will be running CREATE often on the live system.
But just in general, I would suggest that you ask for 8 or 16 LUNs for an 8T disc group. That will give ASM much better parallelism (which might correspond more closely to the underlying number of spindles) and will reduce the "required mirror free" space too.
Also, what alloction unit are you using? I use 4M now, not the 1M default. And of course you must create your tablespaces with uniform size, perhaps 8M, so that your segment extent boundaries will always align with the file extents.
Output from one of the runs showing 24 mb/s which is 6mb/s for each of the 4 disks in the group.ASM is not RAID and functions on the concept of disk failure groups and available free space.
I would consider 24 or even 115 mb/s a very slow performance for a modern SAN system. 115 MB/s is what you should see on 1 Gigabit interface, but not 10gb iScsi as you wrote.
I suggest to verify your network and SAN configuration. Perhaps you have a bad cable, bad firmware, bad switch or bad configuration of the system.
Here is an interesting article to verify async io in Oracle database.
It does seem like the SAN is capable of much more. When I ran the diagnostic that JohnWatson suggested, the RAC was able to produce these numbers
avg response = 19ms
max io operations = 2485
max mb/s = 365
In reading the description of that procedure it seems to only use reads od DB_BLOCK_SIZE ? So maybe it is only write performance that is suffering. After testing more it seems to be pretty stable around 10mb/s per create statement, that is if I can get 3 creates running, one on each node I see about 30mb/s throughput, 2 ~ 20mb/s 1 ~ 10mb/s
I did read that article you suggested and performed the tests and verified that asyn i/o is on, I am beginning to think that CREATE statements perform sync i/o and that is just the way it is. But if this is the performance to expect from CREATE TABLESPACE, we are looking at ~50 hours to pre-allocate our 2TB production space...
This statement has now been running for about 20hrs
CREATE BIGFILE TABLESPACE TEST_TBS
and I/O throughput remains steady at ~10-12mb/s
The plan today is to try to write to the SAN from one of our single instance servers and benchmark speed from another rack and single node instance.
I have faced some performance I/O issues when using ASM, but on all diagnose that realized problem was on OS or Storage level.
When using ISCI and RAC we must be aware about latency of I/O, on this scenary the latency usually be the villain of the story.
Generally I discard tests sequential reads or writes, because the characteristics of OLPT database is massive amount of writes/reads of small blocks randomly, is not a rule, but you can believe that if the random writes and reads are acceptable then the readings and writes sequential are excelent.
So, one of the big mistakes is to focus the problem on ASM when talking about I/O perfomance, I recommend you first focus on OS and Storage layer, if you make sure that I/O perfomance is acceptable then diagnose ASM. This is because the ASM is only a layer that depends exclusively on the OS and Storage performance.
I recommend you use ORION, because he is a tool for predicting the performance of an Oracle database, unlike other I/O calibration tools, Oracle Orion is expressly designed for simulating Oracle database I/O workloads using the same I/O software stack as Oracle. Orion can also simulate the effect of striping performed by Oracle Automatic Storage Management.
For each type of workload, Orion can run tests using different I/O loads to measure performance metrics such as MBPS, IOPS, and I/O latency. Load is expressed in terms of the number of outstanding asynchronous I/Os. Internally, for each such load level, the Orion software keeps issuing I/O requests as fast as they complete to maintain the I/O load at that level. For random workloads, using either large or small sized I/Os, the load level is the number of outstanding I/Os.
PS: Using Orion to write tests will obliterate all data on the specified LUNS.
As you are using RAC you should make two test separately, because ORION does not simulate I/O concurrence on a Cluster.
When you get report of ORION, make a second test using DBMS_RESOURCE_MANAGER.CALIBRATE_IO with all instance active.
Also try change filesystemio_options=setall to filesystemio_options=asynch :
This statement has now been running for about 20hrs CREATE BIGFILE TABLESPACE TEST_TBS DATAFILE '+DATA' SIZE 2T and I/O throughput remains steady at ~10-12mb/s
Because, when setting the filesystemio_options=asynch you are forcing the use of the file system cache and not writing physically on the disk.
This is because with Asynchronous I/O in use, the system call will return immediately once the I/O request has been passed down to the hardware or queued in the operating system, typically before the physical I/O operation has even begun.
In this case, Async I/O is much faster than Direct IO or Concurrent IO, however, it does not verify that the data is physically written to disk for which we rely on the OS.
Why not simply trying a dd command to see what throughput you get:
dd if=/dev/zero of=/san/u10/testfile bs=10G count=1
You have also the hdparm command:
hdparm -tT /dev/sda
This should give you some idea what the OS/SAN can approx. achieve.
You can also copy a file between the OS and ASM to see what you get (asmcmd).
It might help you to get a clue whether the problem is OS or database. Do you have any local storage? What are the results create a datafile on a local disk?