6 Replies Latest reply on Jan 9, 2013 3:40 PM by Barry Schamach

    OK to use fdisk/100% "SOLARIS System" partition for RAID6 Virtual Drive?

    Barry Schamach
      Solaris newb, here - I am configuring an x4270 with 16 135 GB drives. Basic approach is

      D0, D1: RAID 1 (Boot volume, Solaris, Oracle Software)
      D2-D13: RAID 6 (Oracle dB files)
      D14, D15: global spares

      After configuring the RAIDs w/WebBIOS Utility, I am now trying to format/partition the RAID 6 Virtual Drive, which shows up as 1.327 TB 'Optimal' in the MegaRAID Storage Manager. After hunting around the ether for advice on how to do this, I came across http://docs.oracle.com/cd/E23824_01/html/821-1459/disksxadd-50.html#disksxadd-54639

      "Creating a Solaris fdisk Partition That Spans the Entire Drive"

      which is painfully simple: after 'format', just do an 'fdisk' and accept the default 100% "SOLARIS System" partition. After doing this, partition>print and prtvtoc show this:

      partition> print
      Current partition table (original):
      Total disk cylinders available: 59125 + 2 (reserved cylinders)

      Part Tag Flag Cylinders Size Blocks
      0 unassigned wm 0 0 (0/0/0) 0
      1 unassigned wm 0 0 (0/0/0) 0
      2 backup wu 0 - 59124 1.33TB (59125/0/0) 2849529375
      3 unassigned wm 0 0 (0/0/0) 0
      4 unassigned wm 0 0 (0/0/0) 0
      5 unassigned wm 0 0 (0/0/0) 0
      6 unassigned wm 0 0 (0/0/0) 0
      7 unassigned wm 0 0 (0/0/0) 0
      8 boot wu 0 - 0 23.53MB (1/0/0) 48195
      9 unassigned wm 0 0 (0/0/0) 0

      # prtvtoc /dev/dsk/c0t1d0s2
      * /dev/dsk/c0t1d0s2 partition map
      * Dimensions:
      * 512 bytes/sector
      * 189 sectors/track
      * 255 tracks/cylinder
      * 48195 sectors/cylinder
      * 59127 cylinders
      * 59125 accessible cylinders
      * Flags:
      * 1: unmountable
      * 10: read-only
      * Unallocated space:
      * First Sector Last
      * Sector Count Sector
      * 48195 2849481180 2849529374
      * First Sector Last
      * Partition Tag Flags Sector Count Sector Mount Directory
      2 5 01 0 2849529375 2849529374
      8 1 01 0 48195 48194
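
As a sanity check on the numbers above, here is a minimal Python sketch that recomputes the sector figures from the geometry prtvtoc reports. The helper name `slice_from_cylinder` and the "data" slice starting at cylinder 1 are illustrative assumptions, not something format or prtvtoc produce.

```python
# Geometry values copied from the prtvtoc output in the post.
BYTES_PER_SECTOR = 512
SECTORS_PER_TRACK = 189
TRACKS_PER_CYLINDER = 255
ACCESSIBLE_CYLINDERS = 59125

sectors_per_cylinder = SECTORS_PER_TRACK * TRACKS_PER_CYLINDER
total_sectors = ACCESSIBLE_CYLINDERS * sectors_per_cylinder


def slice_from_cylinder(start_cyl, end_cyl):
    """Return (first_sector, sector_count, last_sector) for an inclusive cylinder range."""
    first = start_cyl * sectors_per_cylinder
    count = (end_cyl - start_cyl + 1) * sectors_per_cylinder
    return first, count, first + count - 1


# Slice 2 (backup) covers every accessible cylinder; slice 8 (boot) is cylinder 0 only.
backup_slice = slice_from_cylinder(0, ACCESSIBLE_CYLINDERS - 1)
boot_slice = slice_from_cylinder(0, 0)

# Hypothetical data slice that skips cylinder 0 and uses the rest of the disk.
data_slice = slice_from_cylinder(1, ACCESSIBLE_CYLINDERS - 1)

size_tib = total_sectors * BYTES_PER_SECTOR / 2**40
boot_mib = boot_slice[1] * BYTES_PER_SECTOR / 2**20

print("sectors/cylinder:", sectors_per_cylinder)
print("backup slice:", backup_slice, f"({size_tib:.3f} TiB)")
print("boot slice:", boot_slice, f"({boot_mib:.2f} MiB)")
print("data slice from cylinder 1:", data_slice)
```

The data slice computed this way lines up with the "Unallocated space" row in the prtvtoc output: everything after cylinder 0 is still free.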

      My question: is there anything inherently wrong with this default partitioning? Database is for OLTP & fairly small (<200 GB), with about 140 GB being LOB images.


        • 1. Re: OK to use fdisk/100% "SOLARIS System" partition for RAID6 Virtual Drive?
          Bjoern Rost
          I do have two suggestions:

          - create a new partition (I always use number 6) that starts at cylinder 1, not 0. This may not be needed in every single case (only really important for disks you boot off) but it is a good practice nonetheless
          - do not use raid 6 for databases - see http://www.baarf.com/

          raid 6 does not perform as well as raid 1 or raid 10. Depending on your workload, you may get away with it but I'd simply refuse to put a database on anything with parity data. And while you did not specify the performance requirements for this system, you mention that you only need less than 200GB anyway, so you don't even need all the space a raid6 would provide.
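
To put rough numbers on that trade-off, here is a small Python sketch comparing the two layouts for the 12 drives in question. The write-penalty figures (2 back-end I/Os per small random write for RAID 10, 6 for RAID 6) are the usual textbook rule of thumb, and the per-drive IOPS value is a made-up placeholder. A controller with write-back cache will behave differently, so treat this as an illustration and not a benchmark.

```python
# Figures from the original post: D2-D13 is 12 drives of a nominal 135 GB each.
DRIVES = 12
DRIVE_GB = 135

# Rule-of-thumb back-end I/Os per small random write (assumption, not measured).
WRITE_PENALTY = {"RAID 10": 2, "RAID 6": 6}

# Hypothetical per-drive random IOPS, chosen only to make the ratio visible.
ASSUMED_IOPS_PER_DRIVE = 150


def usable_gb(level, drives=DRIVES, drive_gb=DRIVE_GB):
    """Nominal usable capacity before any formatting overhead."""
    if level == "RAID 6":
        return (drives - 2) * drive_gb   # two drives' worth of parity
    if level == "RAID 10":
        return (drives // 2) * drive_gb  # every drive is mirrored
    raise ValueError(f"unsupported level: {level}")


def random_write_iops(level, drives=DRIVES, per_drive=ASSUMED_IOPS_PER_DRIVE):
    """Rough ceiling on small random writes, ignoring controller cache."""
    return drives * per_drive // WRITE_PENALTY[level]


for level in ("RAID 6", "RAID 10"):
    print(level, usable_gb(level), "GB usable,",
          random_write_iops(level), "random write IOPS (rough)")
```

Either layout leaves nominal capacity well above the stated database size of under 200 GB. The difference shows up in the random-write estimate.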

          • 2. Re: OK to use fdisk/100% "SOLARIS System" partition for RAID6 Virtual Drive?
            Barry Schamach
            Thanks, Bjoern; good info. And the BAARF movement is certainly food for thought - at the 1st opportunity I am going to fill my drive bays and go RAID10. (In addition to sending RMAN backups to a Barracuda, I store an image dump on the host, making RAID10 a little too close for comfort, space-wise).
            • 3. Re: OK to use fdisk/100% "SOLARIS System" partition for RAID6 Virtual Drive?
              First off, RAID-5 or RAID-6 is fine for database performance unless you have some REALLY strict and REALLY astronomical performance requirements. Requirements that someone with lots of money is willing to pay to meet.

              You're running a single small x86 box with only onboard storage.

              So no, you're not operating in that type of environment.

              Here's what I'd do, based upon a whole lot of experience with Solaris 10 and not so much with Solaris 11, and also assuming this box is going to be around for a good long time as an Oracle DB server:

              1. Don't use SVM for your boot drives. Use the onboard RAID controller to make TWO 2-disk RAID-1 mirrors. Use these for TWO ZFS root pools. Why two? Because if you use live upgrade to patch the OS, you want to create a new boot environment in a separate ZFS pool. If you use live upgrade to create new boot environments in the same ZFS pool, you wind up with a ZFS clone/snapshot hell. If you use two separate root pools, each new boot environment is a pool-to-pool actual copy that gets patched, so there are no ZFS snapshot/clone dependencies between the boot environments. Those snapshot/clone dependencies can cause a lot of problems with full disk drives if you wind up with a string of boot environments, and at best they can be a complete pain in the buttocks to clean up - assuming live upgrade doesn't mess up the clones/snapshots so badly you CAN'T clean them up (yeah, it has been known to do just that...). You do your first install with a ZFS rpool, then create rpool2 on the other mirror. Each time you do an lucreate to create a new boot environment from the current boot environment, create the new boot environment in the rpool that ISN'T the one the current boot environment is located in. That makes for ZERO ZFS dependencies between boot environments (at least in Solaris 10. Although with separate rpools, I don't see how that could change....), and there's no software written that can screw up a dependency that doesn't exist.

              2. Create a third RAID-1 mirror either with the onboard RAID controller or ZFS. Use those two drives for home directories. You do NOT want home directories located on an rpool within a live upgrade boot environment. If you put home directories inside a live upgrade boot environment, 1) that can be a LOT of data that gets copied, 2) if you have to revert back to an old boot environment because the latest OS patches broke something, you'll also revert every user's home directory back.

              3. That leaves you 10 drives for a RAID-6 array for DB data. 8 data and two parity. Perfect. I'd use the onboard RAID controller if it supports RAID-6, otherwise I'd use ZFS and not bother with SVM.

              This also assumes you'd be pretty prompt in replacing any failed disks as there are no global spares. If there would be significant time before you'd even know you had a failed disk (days or weeks), let alone getting them replaced, I'd rethink that. In that case, if there were space I'd probably put home directories in the 10-disk RAID-6 drive, using ZFS to limit how big that ZFS file system could get. Then use the two drives freed up for spares.

              But if you're prompt in recognizing failed drives and getting them replaced, you probably don't need to do that. Although you might want to just for peace of mind if you do have the space in the RAID-6 pool.
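
The two drive budgets described above can be checked with a few lines of Python. This is only bookkeeping for the 16 bays mentioned in the original post; the dictionary keys are labels invented for the example.

```python
TOTAL_BAYS = 16  # from the original post: an x4270 with 16 drives

# Option A: dedicated home-directory mirror, no global spares.
with_home_mirror = {
    "rpool (RAID-1)": 2,
    "rpool2 (RAID-1)": 2,
    "home (RAID-1)": 2,
    "db (RAID-6)": 10,
    "global spares": 0,
}

# Option B: home directories live in the RAID-6 array, freeing two spares.
with_spares = {
    "rpool (RAID-1)": 2,
    "rpool2 (RAID-1)": 2,
    "db + home (RAID-6)": 10,
    "global spares": 2,
}


def drives_used(layout):
    """Sum the drives in a layout and fail loudly if it does not fill the chassis."""
    used = sum(layout.values())
    if used != TOTAL_BAYS:
        raise ValueError(f"layout uses {used} drives, expected {TOTAL_BAYS}")
    return used


# A 10-drive RAID-6 array stores data on 8 drives' worth of space.
RAID6_DRIVES = 10
raid6_data_drives = RAID6_DRIVES - 2

for name, layout in (("A", with_home_mirror), ("B", with_spares)):
    print("option", name, "uses", drives_used(layout), "drives,",
          layout["global spares"], "spares")
print("RAID-6 data drives:", raid6_data_drives)
```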

              And yes, using four total disks for two OS root ZFS pools seems like overkill. But you'll be happy when four years from now you've had no problems doing OS upgrades when necessary, with minimal downtime needed for patching, and with the ability to revert to a previous OS patch level with a simple "luactivate BENAME; init 6" command.

              If you have two or more of these machines set up like that in a cluster with Oracle data on shared storage you could then do OS patching and upgrades with zero database downtime. Use lucreate to make new boot envs on each cluster member, update each new boot env, then do rolling "luactivate BENAME; init 6" reboots on each server, moving on to the next server after the previous one is back and fully operational after its reboot to a new boot environment.
              • 4. Re: OK to use fdisk/100% "SOLARIS System" partition for RAID6 Virtual Drive?
                Barry Schamach
                Thanks, "5287726", for the very thorough reply. A few questions/observations:
                1. Don't use SVM for your boot drives.
                Based on the rest of #1, I think your main point is: do not use ONLY hardware RAID for the boot drives, that slapping the extra layer of redundancy (via the mirrored ZFS pool) will make dealing with upgrades a whole lot easier.
                2. ...You do NOT want home directories located on an rpool within a live upgrade boot environment.
                Yes, excellent point, and one that I actually learned after my initial post: for numerous reasons (including the live upgrade scenario that you mention), it is never good to run the Oracle software on the same filesystem as the OS.
                3. ...otherwise I'd use ZFS and not bother with SVM
                My RAID controller does indeed support RAID-6, so I'll go that route. But re: using ZFS only: I've been burned by having 'software RAID-only' (the database files on mirrored ZFS pools w/no RAID parity): last year, a power outage (combined with a useless UPS) irrevocably corrupted the ZFS data pool: fortunately it was not in production, but I had major egg on my face: my x86 server was the only one that had problems - all the Windows and Solaris HW RAID servers came back on line just fine. This left me wary of any 'ZFS-only' configurations...anecdotally, among my limited peer group, I have come across nobody willing to put all their eggs in the ZFS basket.

                RE prompt recognition of drive failures
                I think I'd be more comfortable with some global spares in the hopper - @ the 'Standard' support level, Oracle usually takes a few days to get replacement disks delivered.

                Thanks again for the insight, much appreciated!
                • 5. Re: OK to use fdisk/100% "SOLARIS System" partition for RAID6 Virtual Drive?
                  Don't mirror the root pools with ZFS - use the 4 disks in two separate hardware RAID-1 LUNs, put a separate ZFS pool on each one. (One will be created automatically on install - IIRC that one will be called "rpool").

                  Make another ZFS pool (rpool2 maybe) on the other two-disk RAID-1 array. When you do a live upgrade, if the current boot environment (OS install) exists in rpool, create the new boot environment with lucreate on rpool2. If the current boot environment exists on rpool2, create the new boot environment on rpool. That way the live upgrade process will just copy data between pools - it won't create a mess of file system snapshots and clones.
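
The alternation rule is simple enough to sketch in Python. The function names here are hypothetical, and the script only builds a command string; it does not run anything. The `-n` (new boot environment name) and `-p` (target ZFS root pool) options are how I recall lucreate being used on Solaris 10, so check them against the lucreate man page on your release before relying on them.

```python
ROOT_POOLS = ("rpool", "rpool2")  # pool names as used in this thread


def target_pool(current_pool):
    """Return the root pool that does NOT hold the current boot environment."""
    if current_pool not in ROOT_POOLS:
        raise ValueError(f"unknown root pool: {current_pool}")
    return ROOT_POOLS[1] if current_pool == ROOT_POOLS[0] else ROOT_POOLS[0]


def lucreate_command(new_be_name, current_pool):
    """Build (but do not execute) an lucreate command aimed at the other pool.

    Assumption: -n names the new boot environment and -p selects the ZFS
    root pool to create it in. Verify against your system's documentation.
    """
    return f"lucreate -n {new_be_name} -p {target_pool(current_pool)}"


# Example: the running boot environment lives in rpool, so the new one goes to rpool2.
print(lucreate_command("BENAME", "rpool"))
```

Once the new boot environment is patched, the "luactivate BENAME; init 6" step mentioned earlier in the thread switches over to it.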

                  You do wind up using four hard drives to support a single OS installation, but it's easily maintainable while still allowing you to use live upgrade to minimize downtime for patching and updates. If you put all those live upgrade boot environments that can accumulate over a long server lifetime into one ZFS rpool, things can get REALLY nasty to maintain.

                  IMO it's better to use hardware RAID when possible so disk replacement is easier - pull out the dead one, put in the new one, and watch the hardware RAID controller fix it.
                  • 6. Re: OK to use fdisk/100% "SOLARIS System" partition for RAID6 Virtual Drive?
                    Barry Schamach
                    That's an important clarification - creating two separate rpools for the OS is what gives you the live upgrade flexibility (mirroring a single root pool would not provide the same benefits)...

                    Thanks again for all the good info. These oracle forums are very hit and miss, in my experience...somewhat analogous to a public toilet: to a large degree, the experience is defined by who was in there before you.