This discussion is archived
1 2 Previous Next 17 Replies Latest reply: Apr 22, 2013 7:15 PM by Avi Miller RSS

Looking for a filesystem for OLTP DB Workload on OEL6

athlonrob Newbie
Currently Being Moderated
Good Afternoon!

We're running OEL 6.4 on top of OVM 3.1.1 on a set of Oracle servers in support of our EBS R12 upgrade. Our test databases are currently residing on Btrfs, allowing us to take and use snapshots rapidly and easily, but we've been attempting to improve performance and have found even throwing 25,000 IOPS SSDs at it from our EMC VNX storage array leaves us maxing out at 6,000 IOPS during the upgrade process, which is a highly OLTP-type workload.

We suspect Btrfs is to blame and that if we move away from Btrfs to something else, we'll see some dramatic performance improvements.

We hate to give up snapshots, however. We're evaluating "dumb" filesystems like ext2, then snapshotting with either our VNX or LVM, and I'm also trying to look at OCFS2 with its REFLINK capability.

Does anybody have any suggestions? Will a REFLINK on each datafile introduce a significant performance degradation? Are we better off running on ext2?

The database in question is an 11gR2 database.

Thoughts?

Rob
  • 1. Re: Looking for a filesystem for OLTP DB Workload on OEL6
    Dude! Guru
    Currently Being Moderated
    The recommended and preferred fileystem for Oracle database is the ASM Cluster Filesystem. ASM is not a software RAID, which some people erroneously assume. It offers additional features on top of redundancy options provided by your SAN storage. If ASM is currently not an option, I suggest OCFS2, XFS or EXT3.
    We hate to give up snapshots
    What are you planing to do with BTRFS snapshots? Snapshots are a quick way to undo changes, but it's not a backup and does not duplicate any data. If you loose the parent filesystem of a snapshot, the snapshot is useless. Any snapshot, or any modification of data as part of a snapshot results in data fragmentation. Whether you can actually use any snapshot to undo or recover Oracle database failure is very questionable. It is not meant to be a filesystem for Oracle datafiles, or at least not currently supported.
  • 2. Re: Looking for a filesystem for OLTP DB Workload on OEL6
    athlonrob Newbie
    Currently Being Moderated
    I'm hesitant to bring ASM into this environment; I had some unexplained crashes with it about two years back and it left a bad taste in my mouth. Additionally, nobody else in the organization has familiarity with it right now. I'm also not quite sure about ASM support for PVM guests in OVM; I know RAC requires HVM guests, which we're trying to avoid.

    We're using the btrfs snapshots (and have been for about nine months now) to snapshot a point in time for our upgrade process, then roll back to that point in time if we need to change the procedure a bit. As you said, it is definitely not a backup solution; it's a means to quickly roll back to a prior step in the upgrade plan if something goes wrong. Backups take hours, snapshots take seconds. Using a snapshot for a backup or a backup for a snapshot doesn't quite make sense.

    We've been successfully taking database snapshots and rolling them back. We just shut the database down, snapshot the filesystems, and bring the database back online.

    I don't believe ASM has such functionality; although OCFS2 does, after a fashion, via reflink copies.

    Thank you for your input... I will take a look at ASM again and continue my hunt with your thoughts in mind!

    Rob
  • 3. Re: Looking for a filesystem for OLTP DB Workload on OEL6
    Dude! Guru
    Currently Being Moderated
    You may find the following link interesting. It should further answer your doubts or questions:

    Oracle 11g, OL 6 and ext4
    Re: Oracle 11g, OL 6 and ext4
  • 4. Re: Looking for a filesystem for OLTP DB Workload on OEL6
    athlonrob Newbie
    Currently Being Moderated
    Dude,

    Good information in there.

    If we snapshot an ASM LUN from the SAN side while the system is offline, run on it for a while, shut down, then roll back, is ASM going to get mad at us?

    Rob
  • 5. Re: Looking for a filesystem for OLTP DB Workload on OEL6
    Dude! Guru
    Currently Being Moderated
    I suggest to find out how snapshot technology works in your setup. If you have a copy-on-write snapshot, like BTRFS, you will probably not want to use it for any Oracle database data files. Every snapshot you create will cause data fragmentation for any data that changes and most likely drop your storage performance the more snapshots you create. And as previously mentioned, a snapshot requires all parent snapshots to be useful for data recovery. Using database Flashback and restore points or PITR is going to be more useful.
  • 6. Re: Looking for a filesystem for OLTP DB Workload on OEL6
    Avi Miller Guru
    Currently Being Moderated
    Rob wrote:
    I know RAC requires HVM guests, which we're trying to avoid.
    This is incorrect: RAC requires PVM guests and ONLY supports ASM for database storage on physically mapped LUNs. Anything else is not supported on Oracle VM. You should not be using btrfs for database storage as it is a) neither supported nor certified and b) as you've discovered, woefully slow.
    I don't believe ASM has such functionality; although OCFS2 does, after a fashion, via reflink copies.
    Yes, it does. ASM can use the Flashback Recovery Area to reproduce the same concept, i.e. you can snapshot your database at a point in time. You should bring your DBAs up to speed on what is possible on ASM. There are lots of examples of how to use the FRA with EBS to do upgrade testing. You should post in the EBS forums for more info on this.
  • 7. Re: Looking for a filesystem for OLTP DB Workload on OEL6
    Avi Miller Guru
    Currently Being Moderated
    Dude wrote:
    And as previously mentioned, a snapshot requires all parent snapshots to be useful for data recovery.
    What? btrfs snapshots are independent -- they do not require parent snapshots for recovery. In fact, once a snapshot has been taken, it's literally indistinguishable from the original. You can remove/change/burn with fire the original and the snapshot will continue to operate. That's the whole point of snapshots. :)
  • 8. Re: Looking for a filesystem for OLTP DB Workload on OEL6
    alvaromiranda Explorer
    Currently Being Moderated
    Hello,

    My advise being an Oracle APPS dba and Linux admin is:

    Leave prod for the role it have, PROD, and do in a separate machine the other tasks you want,

    A normal setup that involve snapshots can be.

    Prod with ASM

    using Dataguard, setup a StandBy db, with ASM, then you can duplicate the DB to OCFS2, and fire snapshots on those datafiles on top of OCFS2 with reflink.

    In anything goes wrong with the snapshots, prod is still up. If the snapshots have performance impact, those won't impact prod, nor will impact the stand by db on ASM.

    Alvaro.
  • 9. Re: Looking for a filesystem for OLTP DB Workload on OEL6
    Avi Miller Guru
    Currently Being Moderated
    Rob wrote:
    We're using the btrfs snapshots (and have been for about nine months now) to snapshot a point in time for our upgrade process, then roll back to that point in time if we need to change the procedure a bit. As you said, it is definitely not a backup solution; it's a means to quickly roll back to a prior step in the upgrade plan if something goes wrong. Backups take hours, snapshots take seconds. Using a snapshot for a backup or a backup for a snapshot doesn't quite make sense.
    Considering you're already running an unsupported and uncertified environment, you may as well upgrade to the latest 3.8 playground kernel from public-yum.oracle.com which has a much newer btrfs version. I would also recommend mounting your data volume with nodatacow: https://btrfs.wiki.kernel.org/index.php/FAQ#Can_copy-on-write_be_turned_off_for_data_blocks.3F for improved performance.
  • 10. Re: Looking for a filesystem for OLTP DB Workload on OEL6
    athlonrob Newbie
    Currently Being Moderated
    What part of our environment is uncertified or unsupported?
  • 11. Re: Looking for a filesystem for OLTP DB Workload on OEL6
    Avi Miller Guru
    Currently Being Moderated
    Rob wrote:
    What part of our environment is uncertified or unsupported?
    Using btrfs as the filesystem to store Oracle Database data is both uncertified and unsupported. See Supported and Recommended File Systems on Linux [ID 236826.1] for more details. Oracle Linux, Filesystem & I/O Type Supportability [ID 279069.1] is old, but has some good tips as well (though hasn't been updated for Oracle Linux 6 yet, I see). I've pinged the support teams responsible for each of these to review and update them for OL/btrfs as well.
  • 12. Re: Looking for a filesystem for OLTP DB Workload on OEL6
    Dude! Guru
    Currently Being Moderated
    Avi Miller wrote:
    Dude wrote:
    And as previously mentioned, a snapshot requires all parent snapshots to be useful for data recovery.
    What? btrfs snapshots are independent -- they do not require parent snapshots for recovery. In fact, once a snapshot has been taken, it's literally indistinguishable from the original. You can remove/change/burn with fire the original and the snapshot will continue to operate. That's the whole point of snapshots. :)
    Doesn't BTRFS use copy-on-write technology? As far as I understand, creating a snapshot does not involve any copying of data and as such is similar to a restore point marker. When mounting a snapshot or subvolume, any file system changes are recorded in metadata, which contains information about what data blocks were changed and the differential data itself. Snapshot and subvolumes are independent files systems and have their own copy of a B-tree. When modifying data, such actions are again recorded and affect only the currently mounted volume. But unless data was modified, there is still only one copy of a data block on disk and hence the snapshot won't help if such data is lost due to a bad disk block or disk error.

    SAN and virtualization products use similar snapshot technologies. As far as I'm aware, this is great for undoing data changes by deleting snapshots or mounting different versions of subvolumes, but its not a solution for full data recovery and not a suitable solution to recover Oracle databases.

    Is this incorrect, please let me know where I'm mistaken. Thanks!
  • 13. Re: Looking for a filesystem for OLTP DB Workload on OEL6
    athlonrob Newbie
    Currently Being Moderated
    Look for the Oracle 11gR2 on Linux certification document. I don't have the number handy (on my iPad at home, feeding kids), but btrfs was listed on there as a supported filesystem. Moreover, I specifically requested clarification from support about nine months ago before implementing this configuration.

    As long as you're on the UEK with 11gR2, btrfs seems to be supported... although it's slower than a sloth in Antarctica. :-(

    I think we're going to roll with ASM using VNX snapshots for snapshotting. It does a ROW instead of COW, which is supposed to be faster. We'll see what kind of numbers we get against the SSD pool with that when we test Friday.

    Rob
  • 14. Re: Looking for a filesystem for OLTP DB Workload on OEL6
    athlonrob Newbie
    Currently Being Moderated
    Dude-

    Snapshots aren't meant for direct backups. They can be a component of a backup system (you can snapshot, bring services online, then backup the snapshot, and destroy the snapshot), but they are not backups. They're for quick point-in-time rollbacks.

    For instance, we snapshot right before critical points of our upgrade process. If they fail, we roll back and adjust fire accordingly. When we're trying to optimize a certain part of the process, we'll snapshot, try plan A, roll back, try plan B, roll back, try plan C, and implement the best of the three.

    Btrfs snapshots, amazingly (coming from ZFS here) do not rely on each other. To roll back, you delete the snapshotted subvolume and snapshot the snapshot into the subvol you just deleted. It is quite ingenious. A garbage collection process then fires up to reclaim space.

    If you've run ZFS on Solaris, you're probably familiar with the basics, although btrfs is definitely a different implementation. We were spoiled with ZFS and can't imagine going back to non-snapshotted days for big upgrade projects.

    Rob
1 2 Previous Next

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points