I have a question on which I have seen various points of view. I'm hoping you might be able to give me better insight so I can either confirm my own sanity or accept a new paradigm for laying out the file system for best performance.
Here are the givens:
Unix systems (AIX, HP-UX, Solaris, and/or Linux).
Hardware RAID on a large SAN (in this case, RAID-5 striped over more than 100 physical disks).
(We are using AIX 6.1 with CIO turned on for the database files).
Each physical volume is literally striped over at least 100 physical disks (spindles).
Each Logical Volume is also striped over at least 100 spindles (all the same spindles for each lvol).
Oracle software binaries are on their own separate physical volume.
Oracle backups, exports, flash-back-query, etc., are on their own separate physical volume.
Oracle database files, including all tablespaces, redo logs, undo ts, temp ts, and control files, are on their own separate physical volume (made up of logical volumes that are each striped over at least 100 physical disks/spindles).
The question is whether it makes any sense (and WHY) to break up the physical volume used for the Oracle database files into multiple logical volumes. At what point does it make sense to create individual logical volumes for each datafile or file type, rather than putting them all in a single logical volume?
Does this do anything at all for performance? If the volumes are logical, what difference does it make to put them into individual logical volumes that are striped across the same one hundred (+) disks?
Basically ALL database files are in a single physical volume (LUN), but does it help (and WHY) to break up the physical volume into several logical volumes for placing each of the individual data files (e.g., separating system ts, from sysaux, from temp, from undo, from data, from indexes, etc.) if the physical volume is created on a RAID-5 (or RAID-10) disk array on a SAN that literally spans across hundreds of high-speed disks?
If this does make sense, why?
From a physical standpoint, there are only 4 hardware paths for each LUN, so what difference does it make to create multiple 'logical' volumes for each datafile, or for separating types of data files?
From an I/O standpoint, the operating system's multithreading can only drive the pathways that physically exist, whatever the operating system options (e.g., multicore CPUs using SMT, simultaneous multithreading). But I believe those threads are still bound by physical paths, not by logical volumes.
I look forward to hearing back from you.
If you only have four paths, you only have four paths. Logical volumes are for maintenance purposes.
But keep in mind those disks are on a lot of separate physical shelves and, depending on your hardware, may sit behind different controllers. What you see as 4 paths may not be the whole story.
From a performance standpoint, my first thought would be to get rid of RAID-5.
The second would be to check how the SAN's cache has been configured. The default split between read cache and write cache is often an unseen culprit.
Like most performance-related questions, it depends. While I've only recently gotten into Oracle administration directly, I've supported Oracle DBAs as a UNIX admin for the better part of 15 years. Historically, they tend to like lots of little disks instead of one big one.
More germane to your question is personal experience. I have seen Oracle become disk-blocked on a fairly beefy system that wasn't striped on the OS side. The disks were SAN-attached EMC systems, striped via metavolumes inside the box. The core lesson: regardless of how many stripes are in the disk array, if you present them as one LUN to the system, there will be only one disk queue.
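To put some rough numbers on that one-disk-queue point, here's a back-of-envelope sketch using Little's law. The queue depth and service time below are assumed, illustrative figures, not measurements from any real system:

```python
# Back-of-envelope sketch of the "one LUN = one disk queue" point.
# Queue depth (32) and service time (5 ms) are assumed illustrative
# numbers, not measurements.

def max_iops(luns: int, queue_depth: int, service_time_s: float) -> float:
    """Little's law: concurrency = throughput x latency, so the I/O
    ceiling is (luns * queue_depth) / service_time."""
    return luns * queue_depth / service_time_s

# One big LUN: at most 32 outstanding I/Os at 5 ms each.
print(max_iops(1, 32, 0.005))   # 6400.0
# The same storage presented as 8 LUNs: 8 queues of 32.
print(max_iops(8, 32, 0.005))   # 51200.0
```

Note the distinction this arithmetic turns on: carving one LUN into multiple logical volumes does not add OS disk queues; presenting multiple LUNs does.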
I also saw Oracle channel-block on a different but similarly configured system when all the disks were presented down one of two possible channels (you have to love HP's vgimport: it'll import all the ctds on two channels, but it'll use the first channel it sees as primary). That one surprised me; I would have expected us to be a couple of years away from hitting it.
Regardless of disk (apparent or actual) or channel blocks, you handle it the same way we did in the old FWD SCSI days: break the I/O over multiple disks/channels and stripe on the OS side.
How much does this apply to your situation? No way of knowing. You can configure it any way that works and keep an eyeball on the disk stats via sar. If the avwait and/or avserv times start to climb, it might be time to break up the one mega-LUN into several smaller ones...
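To make the sar-watching concrete, here's a toy sketch that flags devices whose wait or service times look elevated. The sample lines mimic HP-UX-style `sar -d` columns, and the 20 ms thresholds are invented for illustration; real output and sensible thresholds vary by platform and workload:

```python
# Toy filter for HP-UX-style `sar -d` lines with columns:
# device, %busy, avque, r+w/s, blks/s, avwait, avserv.
# Sample data and 20 ms thresholds are made up for illustration.

SAMPLE = """\
c0t6d0   45   1.2   120    960    4.1    8.7
c4t0d1   98   9.8   310   2480   38.5   27.2
"""

AVWAIT_MS_LIMIT = 20.0
AVSERV_MS_LIMIT = 20.0

def flag_hot_disks(text: str) -> list[str]:
    """Return devices whose avwait or avserv exceeds the limits."""
    hot = []
    for line in text.splitlines():
        dev, _busy, _avque, _rw, _blks, avwait, avserv = line.split()
        if float(avwait) > AVWAIT_MS_LIMIT or float(avserv) > AVSERV_MS_LIMIT:
            hot.append(dev)
    return hot

print(flag_hot_disks(SAMPLE))   # ['c4t0d1']
```

In practice you'd pipe real `sar -d` output through something like this, or just eyeball the columns; the point is simply that sustained growth in those two columns is the signal to split the mega-LUN.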
My $.02; hope that helps.
Thanks for your reply damorgan.
We have dual HBAs in our servers as standard equipment, along with dual controllers.
I totally agree with the idea of getting rid of RAID-5, but that is not my choice.
We have a very large (massive) data center, and the decision to use RAID-5 was at the discretion of our unix team some time ago. Their idea is one-size-fits-all. When I questioned it, I was rebuffed. After all, what do I know? I've only been a sysadmin for 10 years (on HP-UX and Solaris, not AIX), and an Oracle DBA for nearly 20.
For whatever it is worth, they also mirror their RAID-5, so in essence it is RAID-51 (mirrored RAID-5).
Anyway, as for the hardware paths, from my understanding there are only 4 physical paths going from the servers to the switches, to the SAN, and back. Their claim (the unix team's) is that using multiple logical volumes within a single physical volume increases the number of 'threads' available to pull data from the stripe. This is the part I don't understand; it may be specific to AIX.
So if each logical volume is a stripe within a physical volume, and each physical volume is striped across more than one hundred disks, I still don't understand how multiple logical volumes can increase I/O throughput. If we only have four paths to those 100+ spindles, then even if AIX's multipathing and SMT somehow added parallelism, the effect on I/O would have to be negligible.
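The four-path ceiling argument can be put in numbers. A sketch, assuming a round figure of 400 MB/s usable per 4 Gb FC path and 300 MB/s of demand per busy logical volume (both invented for illustration): however many logical volumes are carved out, aggregate throughput flattens at paths times per-path bandwidth:

```python
# Sketch of the path ceiling: aggregate throughput is bounded by the
# number of physical paths times per-path bandwidth, no matter how many
# logical volumes sit on top. 400 MB/s per path and 300 MB/s of demand
# per LV are assumed round numbers, not measurements.

def achievable_mb_s(demand_mb_s: float, paths: int, per_path_mb_s: float) -> float:
    """Throughput is the smaller of offered demand and the path ceiling."""
    return min(demand_mb_s, paths * per_path_mb_s)

PATHS = 4
PER_PATH = 400.0   # assumed usable MB/s per path

# Demand scales with busy LVs, but the achievable rate caps at 1600 MB/s.
for lvols in (1, 4, 16):
    demand = lvols * 300.0
    print(lvols, achievable_mb_s(demand, PATHS, PER_PATH))
```

Under these assumptions the 16-LV case gets exactly the same 1600 MB/s as any layout that saturates the paths, which is the intuition behind "more LVs can't add throughput past the paths" (though, as noted earlier in the thread, extra LUNs can still help by adding disk queues below that ceiling).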
Two years ago, I personally set up three LUNs on a pair of Sun V480s (RAC'd) connected to a Sun Storage 3510 SAN: one LUN for Oracle binaries, one for database datafiles, and one for backups and archivelogs. I put all my datafiles in a single logical volume on one LUN and had fantastic performance for a very intense database that literally had 12,000 to 16,000 simultaneously active connections using WebSphere connection pools. While that was a Sun system and now I'm dealing with an AIX p6 570, I can't imagine the concepts being that much different, especially when the servers are basically comparable.
Any comments or feedback appreciated.
Edited by: ji li on Jan 28, 2013 7:51 AM