High availability matters, so automating your virtual IP failover is critical: it keeps your applications running and lets your users keep working without impact if something happens to your main cloud servers.

This article explains how you can automate the VirtualIP failover process on Oracle Cloud Infrastructure (OCI) using Linux Corosync/Pacemaker along with the OCI command line interface (CLI).

The main goal is to show how you can set up an OCI Secondary IP to fail over automatically in case of downtime.

Getting Started

Let's first understand the components that will be used. As mentioned above, three main components allow your Virtual IP (OCI Secondary IP) to fail over automatically:

  • Corosync is a Group Communication System with additional features for implementing high availability within applications.
  • Pacemaker is an open source high availability resource manager suitable for both small and large clusters.
  • OCI CLI is responsible for integrating the Linux Corosync/Pacemaker VirtualIP IPaddr2 resource with the Oracle Cloud Infrastructure vNIC Secondary IP.


Requirements

  • At least 2 Oracle Linux 7.x OCI instances (VM or BM shapes). Other Linux distributions can be used as well.
  • Corosync/Pacemaker
  • OCI CLI
  • Floating IP (e.g., 172.0.0.10/24)

Preparing your OCI Instances for VirtualIP Failover

Once your Oracle Linux instances have been provisioned, set up the OCI CLI as explained in the public documentation, then install and configure your Corosync/Pacemaker cluster along with its requirements (stonith, quorum, resources, constraints, etc.). After configuring the cluster and the OCI CLI, set up your VirtualIP resource. Below is a quick example of how to create a VirtualIP resource on Corosync/Pacemaker from the command line. The same process can be done through the web browser UI as well.

$ sudo pcs resource create Cluster_VIP ocf:heartbeat:IPaddr2 ip=172.0.0.10 cidr_netmask=24 op monitor interval=20s

NOTE: ‘cidr_netmask=24’ in the Pacemaker command assumes the subnet size is /24; adjust it to match your own subnet.

 

The next step is to select one of the Oracle Linux Corosync/Pacemaker OCI nodes and assign it a new OCI Secondary IP address (172.0.0.10 is used here, based on VCN 172.0.0.0/16) using the OCI Console, as explained in the public documentation. That OCI Secondary IP will be used as the Corosync/Pacemaker floating IP.
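If you prefer the command line to the Console, the same assignment can probably be done with the OCI CLI. The following is a minimal sketch, assuming the `oci` binary is on your PATH and already configured; the OCID is a placeholder, and the helper name `assign_vip` is made up for illustration.

```shell
# Placeholder values: replace the OCID with the VNIC OCID of the node
# that should hold the floating IP first.
VNIC_OCID="ocid1.vnic.oc1.phx.NODE1-vNIC-OCID"
VIP="172.0.0.10"

# Hypothetical helper: assign the secondary private IP to a VNIC.
# It wraps the same "oci network vnic assign-private-ip" subcommand
# that is later embedded in the IPaddr2 resource agent.
assign_vip() {
    oci network vnic assign-private-ip --vnic-id "$1" --ip-address "$2"
}

# Uncomment to run on a host with a configured OCI CLI:
# assign_vip "$VNIC_OCID" "$VIP"
```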

Integrating Linux Corosync/Pacemaker with OCI CLI

Now that your Oracle Linux Corosync/Pacemaker OCI instances are up and running along with the OCI CLI, identify the Oracle Cloud IDs (OCIDs) of the Virtual Network Interface Cards (VNICs) attached to your nodes; they will be used to automate the failover process. More details about how to do that can be found here. Write down your cluster nodes' VNIC OCIDs, update the commands below with your own OCIDs, and run them on ALL NODES to update the Corosync/Pacemaker IPaddr2 resource.
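As a rough illustration of how the VNIC OCIDs might be looked up, the sketch below shows two options. It assumes `oci` is on your PATH, that the instance OCID shown is replaced with your own, and that the legacy v1 instance metadata endpoint is reachable from the instance; the function names are made up for this example.

```shell
# Placeholder: the OCID of one cluster node's compute instance.
INSTANCE_OCID="ocid1.instance.oc1.phx.NODE1-INSTANCE-OCID"

# Option 1: ask the OCI CLI for the VNICs attached to an instance.
# The JSON output includes each VNIC's OCID and private IP.
list_instance_vnics() {
    oci compute instance list-vnics --instance-id "$1"
}

# Option 2: query the instance metadata service from the node itself.
# Assumes the v1 metadata endpoint is enabled on the instance.
local_vnic_metadata() {
    curl -s http://169.254.169.254/opc/v1/vnics/
}

# Uncomment to run:
# list_instance_vnics "$INSTANCE_OCID"
# local_vnic_metadata
```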

$ sudo sed -i '64i\##### OCI vNIC variables\' /usr/lib/ocf/resource.d/heartbeat/IPaddr2

$ sudo sed -i '65i\server="`hostname -s`"\' /usr/lib/ocf/resource.d/heartbeat/IPaddr2

$ sudo sed -i '66i\node1vnic="ocid1.vnic.oc1.phx.NODE1-vNIC-OCID"\' /usr/lib/ocf/resource.d/heartbeat/IPaddr2

$ sudo sed -i '67i\node2vnic="ocid1.vnic.oc1.phx.NODE2-vNIC-OCID"\' /usr/lib/ocf/resource.d/heartbeat/IPaddr2

$ sudo sed -i '68i\vnicip="172.0.0.10"\' /usr/lib/ocf/resource.d/heartbeat/IPaddr2

$ sudo sed -i '614i\##### OCI/IPaddr Integration\' /usr/lib/ocf/resource.d/heartbeat/IPaddr2

$ sudo sed -i '615i\        if [ $server = "node1" ]; then\' /usr/lib/ocf/resource.d/heartbeat/IPaddr2

$ sudo sed -i '616i\                /root/bin/oci network vnic assign-private-ip --unassign-if-already-assigned --vnic-id $node1vnic  --ip-address $vnicip \' /usr/lib/ocf/resource.d/heartbeat/IPaddr2

$ sudo sed -i '617i\        else \' /usr/lib/ocf/resource.d/heartbeat/IPaddr2

$ sudo sed -i '618i\                /root/bin/oci network vnic assign-private-ip --unassign-if-already-assigned --vnic-id $node2vnic  --ip-address $vnicip \' /usr/lib/ocf/resource.d/heartbeat/IPaddr2

$ sudo sed -i '619i\        fi \' /usr/lib/ocf/resource.d/heartbeat/IPaddr2
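Because the commands above insert text at fixed line numbers (64 and 614), and those positions may not line up the same way in every version of the IPaddr2 agent, it may be worth keeping a backup copy of the file and inspecting the result on each node. The sketch below is one possible check; the helper names are hypothetical.

```shell
# Path used throughout the article.
RA=/usr/lib/ocf/resource.d/heartbeat/IPaddr2

# Print the two inserted ranges (64-68 and 614-619), each line
# prefixed with its line number, so they can be reviewed in context.
show_inserted_lines() {
    awk '(NR >= 64 && NR <= 68) || (NR >= 614 && NR <= 619) { printf "%d: %s\n", NR, $0 }' "$1"
}

# Parse the script without executing it; a non-zero exit status
# indicates a shell syntax error.
check_syntax() {
    sh -n "$1"
}

# Uncomment on a cluster node:
# show_inserted_lines "$RA"
# check_syntax "$RA" && echo "syntax OK"
```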

 

NOTE:

  1. Replace ocid1.vnic.oc1.phx.NODE1-vNIC-OCID and ocid1.vnic.oc1.phx.NODE2-vNIC-OCID with your own OCI vNIC OCIDs.
  2. Replace the /root/bin/ path if you have installed the OCI CLI in a different location.
  3. Replace the "node1" and "node2" hostname entries with your own cluster node hostnames.
  4. The above example is for 2 Oracle Linux 7.4 Corosync/Pacemaker cloud nodes; it can be adjusted for clusters with more than 2 nodes.
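For clusters with more than 2 nodes, one possible adjustment is to replace the if/else test with a case statement keyed on the short hostname. The following is only a sketch: `node3`, `node3vnic`, and the function name `vnic_for_host` are hypothetical and not part of the original setup. Note that, unlike the if/else version (which treats every host other than node1 as node2), this variant fails for an unknown hostname.

```shell
# Values mirror the variables the article inserts into IPaddr2.
node1vnic="ocid1.vnic.oc1.phx.NODE1-vNIC-OCID"
node2vnic="ocid1.vnic.oc1.phx.NODE2-vNIC-OCID"
node3vnic="ocid1.vnic.oc1.phx.NODE3-vNIC-OCID"   # hypothetical third node

# Print the VNIC OCID for a short hostname; return non-zero
# when the hostname is not one of the known cluster nodes.
vnic_for_host() {
    case "$1" in
        node1) echo "$node1vnic" ;;
        node2) echo "$node2vnic" ;;
        node3) echo "$node3vnic" ;;
        *)     return 1 ;;
    esac
}

# Example lookup for the local node (commented out because it
# depends on your real hostnames):
# vnic_for_host "$(hostname -s)"
```

Inside IPaddr2, the looked-up OCID would then be passed as the --vnic-id argument of the same assign-private-ip command used in the 2-node example.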

Testing the VirtualIP Failover

Your configuration is done, and now it's time to test the VirtualIP failover. You can do that by simulating a crash, disabling the node where the virtual IP was started, or simply moving the Corosync/Pacemaker VirtualIP resource from one node to another through the command line.

The example below assumes your VirtualIP (Cluster_VIP) resource is running on node1, so to move it to node2 run the following command:

$ sudo pcs resource move Cluster_VIP node2
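After the move, you may want to confirm that the floating IP is actually configured on node2, and then remove the location constraint that "pcs resource move" leaves behind so the cluster can place the resource freely again. The sketch below assumes the resource name and IP used in this article; the helper names are made up for illustration.

```shell
# Names taken from the article: resource Cluster_VIP, floating IP 172.0.0.10.
RESOURCE="Cluster_VIP"
VIP="172.0.0.10"

# Succeed if the given IPv4 address is configured on a local interface.
vip_is_local() {
    ip -o -4 addr show | grep -qF " $1/"
}

# Remove the location constraint created by "pcs resource move".
clear_move_constraint() {
    sudo pcs resource clear "$1"
}

# On node2, after the move (uncomment to run on a cluster node):
# sudo pcs status
# vip_is_local "$VIP" && echo "$VIP is active on $(hostname -s)"
# clear_move_constraint "$RESOURCE"
```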

Demonstration

Watch this Automatic VirtualIP Failover on OCI video (4 minutes) to see how the failover happens automatically during downtime or a resource migration without impacting end user access.