6 Replies Latest reply on Jul 8, 2019 7:47 AM by Dude!

    High performance db with Oracle RAC on hyper converged infrastructure

    Cvetan Grigorov

      Hi,

      I am doing a PoC of a RAC architecture built with two node servers with local NVMe disks, which I want to use as the main disks for the database disk group.

       

      I was searching for a solution that can provide high I/O for a database consolidation project.

      I don’t have requirements about space. What I found were very expensive SAN solutions using all-flash storage systems from different vendors: EMC, Hitachi, IBM.

      After some investigation I found solutions like flashgrid.io. These use the Oracle RAC stretched cluster concept: the nodes are equipped with local NVMe disks and use a high-speed interconnect (40 or 100 Gbps Ethernet, or InfiniBand), and the disks are then shared between nodes using iSCSI.

       

      According to the specifications, the average performance of one NVMe disk is about 500K IOPS read and 350K IOPS write, which are really big numbers. Over 10 Gbps Ethernet you can achieve around 80K IOPS on an iSCSI-shared disk using the latest RDMA technologies.
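      As a rough sanity check on those numbers, here is a back-of-the-envelope sketch (illustrative only): the hard ceiling on IOPS for a link is its bandwidth divided by the I/O size, and everything iSCSI/TCP adds on top only pushes the real number below that ceiling.

```python
# Back-of-the-envelope IOPS ceiling for a network link.
# Illustrative only: ignores TCP/iSCSI protocol overhead, latency and queue depth.

def iops_ceiling(link_gbps: float, io_size_bytes: int) -> int:
    """Upper bound on IOPS that raw link bandwidth allows for a given I/O size."""
    link_bytes_per_sec = link_gbps * 1e9 / 8
    return int(link_bytes_per_sec / io_size_bytes)

# 10 Gbps Ethernet with 8 KiB database blocks:
print(iops_ceiling(10, 8192))    # ~152K IOPS ceiling, so ~80K observed is plausible
# 100 Gbps Ethernet, same block size:
print(iops_ceiling(100, 8192))   # ~1.5M IOPS ceiling
```

      With 100 Gbps links the wire stops being the bottleneck, which is why the stretched-cluster vendors pair local NVMe with 100 Gbps or InfiniBand interconnects.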

       

      I am trying to explore this concept, so I built a lab in VirtualBox with two nodes.

      Each node has one NVMe disk defined. I am using a third node to provide shared storage for the voting disks, OCR and quorum disks.

      I shared the local NVMe disks using iSCSI, then used udev rules to make the names of the local and shared disks consistent.
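      For reference, a minimal sketch of the kind of udev rule I mean (the serial, WWID, alias and ownership values below are made-up placeholders; match on whatever stable attribute your disks actually expose):

```
# /etc/udev/rules.d/99-asm-disks.rules -- example only, all identifiers are placeholders
# Local NVMe disk, matched by its serial number:
KERNEL=="nvme*n1", ENV{ID_SERIAL}=="SAMPLE_NVME_SERIAL_1", SYMLINK+="asmdisks/data01", OWNER="grid", GROUP="asmadmin", MODE="0660"
# The same disk as seen by the remote node over iSCSI, matched by its WWID:
KERNEL=="sd*", ENV{ID_WWN}=="0xsamplewwid0001", SYMLINK+="asmdisks/data01", OWNER="grid", GROUP="asmadmin", MODE="0660"
```

      This way both nodes see the disk under the same /dev/asmdisks alias, regardless of whether the path is local PCIe or iSCSI.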

       

      I managed to install Grid Infrastructure and to create a disk group using the shared local disks. I am continuing to test.

      I am posting this because I found almost zero information about this concept.

       

      The next step is to install a DB using the new disk group and to test redundancy and availability.

       

      I still don’t know whether I can use this kind of disk sharing, where a local disk is exported as an iSCSI target: the first node uses the local block device, while the second node accesses the same disk through iSCSI.

       

      regards

      Cvetan

        • 1. Re: High performance db with Oracle RAC on hyper converged infrastructure
          Dude!

          iSCSI is not the best option when you are looking for performance. The protocol is known for high latency and is not efficient for databases, which typically transfer small amounts of data. TCP/IP is a networking protocol and was designed for a different purpose than disk storage, unlike SCSI, Fibre Channel or InfiniBand. I'm not saying iSCSI cannot work, but the combination of iSCSI and NVMe or SSD doesn't really make sense when your goal is performance.

           

          Whether or not you can share iSCSI LUNs depends on whether you configured the iSCSI target in shared mode, and of course you will need a cluster filesystem, such as OCFS2, or use ASM.

           

          The first node uses the local block device; the second node accesses the same disk through iSCSI.

          I'm not sure this is technically possible. I guess not - you cannot have different bus controllers access the same device, from an electrical standpoint. In my experience with iSCSI, once a device or volume is assigned to an iSCSI LUN, iSCSI is the only access path. Keep in mind that iSCSI provides raw block devices and is not network file sharing. The latter, like NFS, will allow local access as well as shared network access. You can also use NFS for a RAC cluster, if I remember correctly (Oracle Direct NFS), but it will give you limited performance.
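          For what it's worth, Direct NFS is configured through an oranfstab file; a minimal sketch (the server name, addresses and paths below are made-up placeholders):

```
# $ORACLE_HOME/dbs/oranfstab -- example only, names and addresses are placeholders
server: nfs-filer01
local: 192.168.10.1
path: 192.168.10.100
export: /vol/oradata mount: /u02/oradata
```

          The database then bypasses the kernel NFS client and issues I/O over the listed paths itself, which helps, but it is still I/O over IP.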

          • 2. Re: High performance db with Oracle RAC on hyper converged infrastructure
            Billy~Verreynne

            My busiest (2.5 billion rows processed yesterday) and largest (32+ TB) RAC runs on an older ODA (Oracle Database Appliance) X5-2.

             

            The new ODA X7-2 (S) model has SSD only (small model) - and is pretty fast.

             

            ODA is relatively cheap (with its integrated storage), and fast. The full h/w and s/w stacks are certified, and supported, by Oracle.

             

            This should be your primary consideration IMO.

            • 3. Re: High performance db with Oracle RAC on hyper converged infrastructure
              Billy~Verreynne

              BTW, some years ago I built a RAC with 3 Supermicro storage (NAS) servers, using iSER (iSCSI over InfiniBand RDMA).

               

              I/O throughput is okay-ish, but meh in comparison with ODA.

              • 4. Re: High performance db with Oracle RAC on hyper converged infrastructure
                Cvetan Grigorov

                Hi,

                Thank you for your replies!

                 

                Adding more information.

                 

                In the lab environment the system seems to work pretty well.

                I managed to install a RAC database and Grid Infrastructure using storage built from the local disks shared over iSCSI. I followed the stretched cluster concept.

                The Grid Infrastructure has been working well so far, but I need time to do more tests.

                 

                I am now doing some tests for high availability and reservation.

                If all the tests succeed, I will confirm and share the specification.

                 

                About iSCSI: I am planning to use iSER over 100 Gbps Ethernet. I chose Mellanox Ethernet adapters which support RoCE and iSER, and I plan to interconnect the nodes with direct-attach cables.
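                For anyone following along: with open-iscsi the transport is selected per interface, so switching from plain iSCSI to iSER is a matter of an iface definition like this sketch (the iface name and hardware address are placeholders):

```
# /var/lib/iscsi/ifaces/iser0 -- example only, values are placeholders
iface.iscsi_ifacename = iser0
iface.transport_name = iser
iface.hwaddress = aa:bb:cc:dd:ee:ff
```

                Logging in through this iface then carries the same SCSI commands over RDMA instead of TCP, which is the whole point of the RoCE-capable adapters.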

                 

                The Oracle hardware is OK, but I prefer to use standard hardware.

                 

                Regards,

                cvetan

                • 5. Re: High performance db with Oracle RAC on hyper converged infrastructure
                  Billy~Verreynne

                  The Oracle h/w (as in the recommended ODA) is standard Intel-based h/w - no different from Intel-based servers from HP, Dell, or whoever.

                   

                  Just a word of warning - I/O is the most expensive operation for a database. I have all too frequently seen CPUs waiting on I/O in clusters (dating back to Oracle Parallel Server in the 90's), simply because the I/O layer is not fast enough and does not scale fully. And pushing I/O calls over IP is not fast - the IP protocols TCP and UDP were never designed for this.

                   

                  The only exception, where I have almost never seen idle CPUs waiting on I/O, is on the ODA platforms we have. I do not hesitate to recommend ODA - I have built my share of RACs from scratch these last 15 years, and based on that, ODA is the easiest and quickest to set up, delivers excellent performance, and turns out to be the cheapest.

                   

                  Also, I never had to file kernel bug and firmware issue reports (such as soft CPU lockups) with multiple vendors, only for them never to get satisfactorily resolved as the vendors pointed fingers and blamed each other. On ODA, the ENTIRE h/w and s/w stack is supported, and certified, by a single vendor - Oracle.

                  • 6. Re: High performance db with Oracle RAC on hyper converged infrastructure
                    Dude!

                    How do you evaluate performance? How do you compare your results to other possible solutions? Of course you can use iSCSI and see acceptable results, depending on what you expect, but that doesn't mean it is the only or the best solution.

                     

                    iSCSI is a poor man's SAN and is subject to TCP/IP networking. What are the requirements? Do you need a cluster where the shared storage has to rely on TCP/IP networking? Because if not, you might be investing in technology that is more expensive than necessary and only gives you suboptimal performance.

                     

                    I suggest verifying your idea about a cluster that uses local storage plus iSCSI as shared storage. You can configure ASM to have failure groups comprised of local devices and shared iSCSI devices, but as far as I know these need to be different devices. Also keep in mind that ASM writes in a round-robin fashion, and performance will not be consistent across different types of failure groups.
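                    To make the failure-group point concrete, here is a sketch of the kind of disk group definition involved (the disk group name, device paths and compatibility attribute are made-up placeholders):

```sql
-- Example only: a normal-redundancy disk group with one failure group per node,
-- so each extent is mirrored across nodes rather than within a single node.
CREATE DISKGROUP data NORMAL REDUNDANCY
  FAILGROUP node1 DISK '/dev/asmdisks/node1_data01'
  FAILGROUP node2 DISK '/dev/asmdisks/node2_data01'
  ATTRIBUTE 'compatible.asm' = '19.0';
```

                    In a layout like this, every write lands on both a local device and a remote iSCSI device, so the slower path sets the pace.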

                     

                    I was searching for a solution that can provide high I/O for a database consolidation project.

                    I don’t have requirements about space. What I found were very expensive SAN solutions using all-flash storage systems from different vendors: EMC, Hitachi, IBM.

                    It's not difficult to find the most expensive solutions. My experience is that you can always find something expensive and spend 500K on hardware and service contracts, when you might accomplish the same by investing 10K if you understand what you need and what you don't.