
    Oracle direct path read IO size

    Kirill.Boyko
      Hello!

      The initial discussion started in the Database forum thread "Oracle direct path read IO size".
      Dude suggested that I post this question here.

      To briefly recap our environment:
      Oracle VM 3.1.1. Linux ovsru0240vh021 2.6.39-200.1.9.el5uek #1 SMP Sun Oct 7 20:28:37 PDT 2012 x86_64 x86_64 x86_64 GNU/Linux
      Oracle Linux 6.2. Linux lnxru0240vg081 2.6.39-300.17.1.el6uek.x86_64 #1 SMP Fri Oct 19 11:29:17 PDT 2012 x86_64 x86_64 x86_64 GNU/Linux
      Oracle Database 11.2.0.3, Orion version 11.1.0.7.0

      Storage is connected over FC using an IBM 42D0494 8Gb 2-Port PCIe FC HBA for System x.
      Driver: Emulex LightPulse Fibre Channel SCSI driver 8.3.5.58.2p

      No non-default driver parameters are set; the relevant values are:
      lpfc_use_msi         = 2
      lpfc_hba_queue_depth = 4096
      lpfc_lun_queue_depth = 30
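
      For reference, the current values can be confirmed on the hypervisor from the module parameters exported in sysfs (assuming the lpfc module is loaded), e.g.:

      grep . /sys/module/lpfc/parameters/lpfc_use_msi /sys/module/lpfc/parameters/lpfc_*_queue_depth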


      LUNs are presented to the VMs using DM multipath.
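      The multipath topology on the hypervisor can be verified with, for example:

      multipath -ll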

      The Linux block layer on the Oracle VM hypervisor is tuned the same way for all DM devices:
      add_random     1
      discard_granularity     0
      discard_max_bytes     0
      discard_zeroes_data     0
      hw_sector_size     512
      iostats     1
      logical_block_size     512
      max_hw_sectors_kb     32767
      max_integrity_segments     0
      max_sectors_kb     32767
      max_segments     64
      max_segment_size     65536
      minimum_io_size     512
      nomerges     0
      nr_requests     128
      optimal_io_size     0
      physical_block_size     512
      read_ahead_kb     128
      rotational     1
      rq_affinity     1

      These LUNs are mapped to the Oracle Linux 6.2 VM. The Linux block layer inside the VM is tuned as follows:
      add_random     1
      discard_granularity     0
      discard_max_bytes     0
      discard_zeroes_data     0
      hw_sector_size     512
      iostats     1
      logical_block_size     512
      max_hw_sectors_kb     256
      max_integrity_segments     0
      max_sectors_kb     256
      max_segments     11
      max_segment_size     4096
      minimum_io_size     512
      nomerges     0
      nr_requests     128
      optimal_io_size     0
      physical_block_size     512
      read_ahead_kb     128
      rotational     0
      rq_affinity     1
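
      As a quick sanity check on the per-request ceiling these guest-side limits imply (using one of the xvd* devices from the iostat output below as an example):

      cat /sys/block/xvdb/queue/max_sectors_kb
      echo $(( $(cat /sys/block/xvdb/queue/max_segments) * $(cat /sys/block/xvdb/queue/max_segment_size) ))

      With the values listed above the second command works out to 11 * 4096 = 45056 bytes, i.e. roughly 44 KB per request, assuming the segment limits rather than max_sectors_kb are the binding constraint.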

      When we stress our storage from the VM using:
      ./orion_linux_x86-64 -run advanced -testname sas_20121206 -matrix point -num_small 0 -num_large 1 -size_large 1024 -num_disks 35 -type seq -num_streamIO 20 -simulate raid0 -cache_size 0 -duration 120 -verbose

      We see:
      [root@lnxru0240vg081 ~]# iostat -xm 1
      Device:     rrqm/s     wrqm/s     r/s     w/s     rMB/s     wMB/s     avgrq-sz     avgqu-sz     await     svctm     %util
      xvdb     0     0     7027     1     292.79     0     85.32     67.78     9.93     0.13     92.6
      xvde     0     0     6865     0     286.04     0     85.33     58.41     8.48     0.14     94.1
      xvdf     0     0     6865     0     286.04     0     85.33     59.46     8.63     0.14     95
      xvdg     0     0     6881     3     286.73     0.09     85.33     67.05     9.73     0.14     94.6
      xvdj     0     0     6796     1     283.14     0.01     85.32     52.1     7.54     0.14     94
      xvdk     0     0     6890     0     287.09     0     85.33     74.58     10.88     0.14     95.6
      xvdl     0     0     6784     0     282.69     0     85.34     47.08     6.78     0.13     90.1

      which shows that Linux is sending roughly 48,000 read requests per second to the storage (summing r/s across the seven devices), with an average request size of only about 43 KB. The storage-side statistics confirm this.
      The question is: why does Linux send so many ~43 KB requests when Orion is supposed to issue 1 MB requests?
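
      For reference, avgrq-sz is reported in 512-byte sectors, so the observed request size can be derived directly from the iostat output above:

      echo "85.33 * 512 / 1024" | bc -l

      which gives roughly 42.7 KB per request, nowhere near the 1 MB that Orion submits.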

      In fact, the same thing happens with the Oracle Database during full table scan direct path reads: even though the database issues 1 MB I/Os, Linux splits them into roughly 32 KB chunks. This significantly reduces our throughput.

      Thanks in advance!

      Kirill

        • 1. Re: Oracle direct path read IO size
          user12273962
          Humm.....

          Are you using virtual disks on iSCSI, or are you doing a direct attachment/pass-through to the VM guest? I assume you're using ASM when you say you are using 1M blocks (the default).

          Just a thought, but did you make sure to partition your storage so that the partition table isn't creating a stripe crossing? Also, ASM is its own file system, but if you're using external redundancy, you're still going to be writing blocks at whatever your external redundancy is set to do.
          • 2. Re: Oracle direct path read IO size
            Kirill.Boyko
            In our case the disks are attached to the hypervisor using FC. We do use ASM, and we aligned our partitions on 32M, but Orion does not depend on ASM.
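            The alignment can be re-checked on the hypervisor with something like the following (the device name is only a placeholder for one of our multipath LUNs):

            parted -s /dev/mapper/mpatha unit MiB print

            The partition start should come out on the 32M boundary.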
            • 3. Re: Oracle direct path read IO size
              user12273962
              Sorry, I was reading more than one post at a time when I responded.

              I ran a quick test on one of my test database servers using random read I/O and got the following for a 1024k chunk. My storage is sharing block I/O across multiple databases, so "lat" is a little high.

              ran (large): VLun = 0 Size = 966363439104
              ran (large): Index = 0 Avg Lat = 6410.63 us Count = 1560
              ran (large): nio=1560 nior=1560 niow=0 req w%=0 act w%=0
              ran (large): my 1 oth 0 iops 155 lat 6411 us, bw = 155.93 MBps dur 10.00 s size 1024 K, min lat 0 us, max lat 102588 us READ

              I used the following.

              ./orion -run advanced -size_large 1024 -type rand -matrix detailed -duration 10

              Even though I specified the size_large option, Orion still did 8k random reads.

              ran (small): VLun = 0 Size = 966363439104
              ran (small): Index = 0 Avg Lat = 4578.20 us Count = 2181
              ran (small): Index = 1 Avg Lat = 4683.84 us Count = 2133
              ran (small): Index = 2 Avg Lat = 4823.58 us Count = 2069
              ran (small): Index = 3 Avg Lat = 4656.20 us Count = 2146
              ran (small): Index = 4 Avg Lat = 4341.40 us Count = 2299
              ran (small): nio=10828 nior=10828 niow=0 req w%=0 act w%=0
              ran (small): my 5 oth 0 iops 1083 lat 4611 us, bw = 8.46 MBps dur 10.00 s size 8 K, min lat 0 us, max lat 189411 us READ

              Maybe that is what is happening to you as well. I'm not an Orion expert though, and it might have to do with my use of the "rand" option.

              The difference in "bw" would tell me that the 1024k chunk read was definitely different than the 8k read.
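
              If the goal is to exercise only the large 1 MB I/Os, a point-matrix run with the small workload disabled (the same flags used in the original post) should skip the 8k pass; something along these lines, untested on my side:

              ./orion -run advanced -testname large_only -matrix point -num_small 0 -num_large 1 -size_large 1024 -type rand -duration 10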

              I would think your storage stats would only show reads and writes based on the actual block allocation on the storage. I assume you are not using a 1M block allocation on the storage, so you would not see a block any bigger than the allocation size, and therefore no truly 1024k block read/write stats.