Anyway :) iostat agrees very well with the storage controller statistics: my storage shows the same result as iostat. That means "someone" really is issuing these 32K I/Os to the storage. My question is: who is splitting Oracle's 1M I/O into 32K chunks? The question is serious, because it is a significant penalty on I/O throughput. We could increase backup speed fourfold if the Oracle process were able to read using 512K requests instead of 32K.
I would proceed with an SR on that.
Just speculating here, but are you using the large pool with RMAN? On an older version on a different OS, I think I noticed that using the large pool made for much smoother and larger I/O (though I could be misremembering, as the problem I was solving at that time was OS-specific I/O buffer fragmentation).
Anyways, it sounds like you have a stripe width vs. AU conflict. Does the ASMCMD volinfo command tell us anything?
Well, it could be a conflict, but look:
[root@lnxru0240vg081 bin]# blockdev --report
RO RA SSZ BSZ StartSec Size Device
rw 256 512 4096 0 2396860186624 /dev/xvdb
rw 256 512 512 32768 1198396539392 /dev/xvdb1
rw 256 512 512 2340683776 1198396539392 /dev/xvdb2
rw 256 512 4096 0 3998614552576 /dev/xvdc
rw 256 512 512 32768 1999273722368 /dev/xvdc1
rw 256 512 512 3904897024 1999273722368 /dev/xvdc2
rw 256 512 4096 0 2094870298624 /dev/xvdd
rw 256 512 512 32768 1047401595392 /dev/xvdd1
rw 256 512 512 2045771776 1047401595392 /dev/xvdd2
rw 256 512 4096 0 2396860186624 /dev/xvde
rw 256 512 512 32768 1198396539392 /dev/xvde1
rw 256 512 512 2340683776 1198396539392 /dev/xvde2
rw 256 512 4096 0 2396860186624 /dev/xvdf
rw 256 512 512 32768 1198396539392 /dev/xvdf1
rw 256 512 512 2340683776 1198396539392 /dev/xvdf2
rw 256 512 4096 0 2396860186624 /dev/xvdg
rw 256 512 512 32768 1198396539392 /dev/xvdg1
rw 256 512 512 2340683776 1198396539392 /dev/xvdg2
rw 256 512 4096 0 2396860186624 /dev/xvdj
rw 256 512 512 32768 1198396539392 /dev/xvdj1
rw 256 512 512 2340683776 1198396539392 /dev/xvdj2
rw 256 512 4096 0 2396860186624 /dev/xvdk
rw 256 512 512 32768 1198396539392 /dev/xvdk1
rw 256 512 512 2340683776 1198396539392 /dev/xvdk2
rw 256 512 4096 0 2396860186624 /dev/xvdl
rw 256 512 512 32768 1198396539392 /dev/xvdl1
rw 256 512 512 2340683776 1198396539392 /dev/xvdl2
rw 256 512 4096 0 3998614552576 /dev/xvdm
rw 256 512 512 32768 1999273722368 /dev/xvdm1
rw 256 512 512 3904897024 1999273722368 /dev/xvdm2
The ASM disk group that stores the tablespace used in the test query is striped over partitions on /dev/xvdb, /dev/xvde, /dev/xvdf, /dev/xvdg, /dev/xvdj, /dev/xvdk and /dev/xvdl, with external redundancy.
As you can see, the partitions on these LUNs are well aligned on a 16MB boundary. Our AU size is 1MB. Our storage strip size is 512 KB.
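The alignment claim can be checked with a little arithmetic. The sketch below assumes that the StartSec column of blockdev --report is expressed in 512-byte sectors; the function names are made up for illustration, and only a few of the StartSec values from the listing are included.

```python
SECTOR_BYTES = 512  # assumption: StartSec is counted in 512-byte sectors
MIB = 1024 * 1024


def start_offset_bytes(start_sector: int) -> int:
    """Byte offset of a partition from the start of its LUN."""
    return start_sector * SECTOR_BYTES


def is_aligned(start_sector: int, boundary_bytes: int) -> bool:
    """True if the partition starts on a multiple of boundary_bytes."""
    return start_offset_bytes(start_sector) % boundary_bytes == 0


# StartSec values copied from the blockdev --report output in this post.
start_sectors = {
    "/dev/xvdb1": 32768,
    "/dev/xvdb2": 2340683776,
    "/dev/xvdc2": 3904897024,
    "/dev/xvdd2": 2045771776,
}

for device, sector in start_sectors.items():
    offset_mib = start_offset_bytes(sector) // MIB
    print(device, offset_mib, "MiB", "aligned:", is_aligned(sector, 16 * MIB))
```

A start that is a multiple of 16MB is also a multiple of the 1MB AU and of the 512 KB strip, so a single check against the largest boundary covers all three.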
Dude, we are on a Xen HVM + PV guest. Physical storage is an IBM V7000 connected through four 8Gb FC ports. I cannot post graphs here, but here is what is interesting:
When we calibrate the storage with the Orion tool, it shows performance close to what we expect: 1.8GB/sec.
We use the following Orion command line and LUN file:
./orion_linux_x86-64 -run advanced -testname sas_20121206 -matrix point -num_small 0 -num_large 1 -size_large 512 -num_disks 35 -type seq -num_streamIO 10 -simulate raid0 -cache_size 0 -duration 120 -verbose
ORION: ORacle IO Numbers -- Version 126.96.36.199.0
Test will take approximately 3 minutes
Larger caches may take longer
Name: /dev/xvdb1 Size: 1198396539392
Name: /dev/xvde1 Size: 1198396539392
Name: /dev/xvdf1 Size: 1198396539392
Name: /dev/xvdg1 Size: 1198396539392
Name: /dev/xvdj1 Size: 1198396539392
Name: /dev/xvdk1 Size: 1198396539392
Name: /dev/xvdl1 Size: 1198396539392
Name: /dev/xvdb2 Size: 1198396539392
Name: /dev/xvde2 Size: 1198396539392
Name: /dev/xvdf2 Size: 1198396539392
Name: /dev/xvdg2 Size: 1198396539392
Name: /dev/xvdj2 Size: 1198396539392
Name: /dev/xvdk2 Size: 1198396539392
Name: /dev/xvdl2 Size: 1198396539392
14 FILEs found.
In fact, when we look at the storage statistics or iostat during these tests (they match perfectly), we see that the storage is receiving ~40000 requests over FC, approximately 6000 per LUN, which gives ~40K per request. This confuses me, because we are supposed to be issuing 512K reads.
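The per-request size is just throughput divided by request rate. A rough sketch of that arithmetic follows, assuming the ~40000 figure is requests per second and that the throughput during the test was about the 1.8GB/sec Orion reported (decimal gigabytes); both are assumptions, and the function names are illustrative only.

```python
KIB = 1024


def average_request_kib(throughput_bytes_per_sec: float,
                        requests_per_sec: float) -> float:
    """Average size of one request, in KiB."""
    return throughput_bytes_per_sec / requests_per_sec / KIB


def fragments_per_read(issued_kib: float, observed_kib: float) -> float:
    """How many observed requests one issued read turns into on average."""
    return issued_kib / observed_kib


throughput = 1.8e9   # assumption: ~1.8 GB/sec as reported by Orion
requests = 40_000    # assumption: ~40000 requests per second over FC

observed = average_request_kib(throughput, requests)
print(f"average request size: {observed:.1f} KiB")
print(f"fragments per 512K read: {fragments_per_read(512, observed):.1f}")
```

Under these assumptions each 512K read reaches the storage as roughly a dozen smaller requests, which is the splitting being asked about.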
With the DB process the picture is a bit different, but still confusing.
While searching the web, I found an interesting report (I am not sure I can post links here, anyway):
If you look through pages 8-9 you will find:
This iostat output shows that, although Oracle was making 1MB read calls, the Linux kernel was breaking these calls into multiple smaller calls – in this case, around 350KB in size. This is a function of the Linux block layer, which was operating correctly based on the default configuration of the system. No tuning of the kernel-block layer took place for this testing because, although more throughput could be obtained by eliminating this request fracturing, as long as this phenomenon was the same for both HBA types then the comparison remained valid. Indeed, it reflected the reality that many end-user environments do not tune the block layer at all.
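Since the report points at the Linux block layer, one thing worth looking at is the per-device request queue limits that the kernel exposes under /sys/block/<device>/queue. The sketch below reads a few of them and derives a rough ceiling on request size. Treat the ceiling formula as an assumption rather than an exact kernel rule, and note that not every kernel exposes every file, so missing ones are skipped. The device names in the usage part are taken from this thread.

```python
from pathlib import Path

# Queue attributes that commonly bound the size of a single request.
QUEUE_FILES = ("max_sectors_kb", "max_hw_sectors_kb",
               "max_segments", "max_segment_size")


def read_queue_limits(device: str, sysfs_root: str = "/sys/block") -> dict:
    """Return the queue limits that exist for a device, as integers."""
    queue_dir = Path(sysfs_root) / device / "queue"
    limits = {}
    for name in QUEUE_FILES:
        path = queue_dir / name
        if path.is_file():
            limits[name] = int(path.read_text().strip())
    return limits


def request_ceiling_kib(limits: dict):
    """Rough upper bound on one request in KiB, or None if unknown.

    Assumption: a request can exceed neither max_sectors_kb nor
    max_segments * max_segment_size.
    """
    candidates = []
    if "max_sectors_kb" in limits:
        candidates.append(limits["max_sectors_kb"])
    if "max_segments" in limits and "max_segment_size" in limits:
        candidates.append(
            limits["max_segments"] * limits["max_segment_size"] // 1024)
    return min(candidates) if candidates else None


if __name__ == "__main__":
    for dev in ("xvdb", "xvde"):
        limits = read_queue_limits(dev)
        print(dev, limits, "ceiling KiB:", request_ceiling_kib(limits))
```

Running this on both the guest and the host would show at which layer, if any, a limit is low enough to explain the small requests.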
Seems that James has some information to share :)
You mean on the VM, or on the host?
On the VM we in fact don't have such a file at all.
[root@lnxru0240vg081 ~]# cat /etc/modprobe.conf
cat: /etc/modprobe.conf: No such file or directory
On the VM host, nothing special:
[root@ovsru0240vh021 ~]# cat /etc/modprobe.conf
alias scsi_hostadapter shpchp
#alias eth0 bnx2
#alias eth1 bnx2
alias eth2 be2net
alias eth3 be2net
alias scsi_hostadapter1 megaraid_sas
alias scsi_hostadapter2 ata_piix
alias scsi_hostadapter3 lpfc
alias scsi_hostadapter4 usb-storage
alias eth0 bnx2
alias eth1 bnx2
install lpfc /sbin/modprobe bnx2 ; /sbin/modprobe -i lpfc ; true
Can you provide more information, e.g. the kernel version (uname -r) of the Dom0 and DomU machines?
If the virtual machine is on Oracle Linux 6, you should have a directory named /etc/modprobe.d/. The reason I'm asking is that storage drivers such as lpfc have settings that can be customized to affect channel count and queue sizes, and thereby I/O size.
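To see whether any such driver settings are in place, the "options" lines in the modprobe configuration can be collected. This is a minimal sketch: the sample text is a hypothetical file with illustrative values, not taken from this system, and the parser only handles plain "options <module> name=value ..." lines.

```python
def parse_module_options(conf_text: str, module: str) -> dict:
    """Collect name=value settings from 'options <module> ...' lines."""
    options = {}
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "options" and fields[1] == module:
            for setting in fields[2:]:
                name, _, value = setting.partition("=")
                options[name] = value
    return options


# Hypothetical /etc/modprobe.d/lpfc.conf content, values for illustration only.
sample = """
# lpfc tuning
options lpfc lpfc_lun_queue_depth=32 lpfc_sg_seg_cnt=256
alias eth2 be2net
"""

print(parse_module_options(sample, "lpfc"))
```

An empty result for lpfc, as with the modprobe.conf shown earlier, would suggest the driver is running with its defaults.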
Also I think this post should probably be moved to the Oracle VM forum for closer topic alignment.