
Hello Oracle Community,

Sharing a recent experience with enabling X11 on Exadata, after running into the same confusion back in 2016 and again in 2018.

 

First, we ran into an issue with enabling X11 on Exadata compute nodes in May-June 2016, which was ultimately resolved by installing additional RPMs to support X11.

We had worked with several hardware and software versions of Exadata prior to that and had never needed to install any additional RPMs.

There were also other issues preventing DBAs from running X11-based tools like dbca or runInstaller.

Some of them were due to X11 forwarding not being enabled in the sshd_config file, to firewall rules, or to issues with the DISPLAY variable, especially when running X11-based tools over a VPN connection.
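A quick sanity check along those lines could look as follows. This is a minimal sketch - the hostname is illustrative, and xdpyinfo assumes the X11 utility RPMs are in place:

# grep -i x11forwarding /etc/ssh/sshd_config   <== on the compute node; should show "X11Forwarding yes"
$ ssh -X oracle@exadb01                        <== from your workstation, requesting X11 forwarding
$ echo $DISPLAY                                <== should be set, e.g. localhost:10.0
$ xdpyinfo | head -3                           <== fails quickly if the X11 path is broken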

 

We had an SR opened for that issue and were told that, due to some recent changes to the Exadata image, several X11 RPMs were no longer part of the default image.

 

Fast forward 2+ years and we are in a very similar situation in 2018.

More specifically, Exadata machines that were deployed with the image version 18.1.4.0 from January 2018 were "missing" a few X11 RPMs, while Exadata machines deployed with the image version 18.1.6.0 from May 2018 didn't require any additional X11 RPMs.
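As a quick check, one could verify whether a given image already carries the X11 support packages before opening an SR. A hedged sketch - the grep pattern is only illustrative, and the authoritative RPM list is in the MOS note referenced below:

# rpm -qa | grep -iE 'xorg-x11|xauth'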

 

The moral of this story, as I see it, is that this well-tested Oracle Support DocID is still your good friend when it comes to troubleshooting X11 on Exadata compute nodes:

Unable to run graphical tools (gui) (runInstaller, dbca and dbua) on Exadata 12.1.2.1.0 - 18.1.X.X.X (Doc ID 1969308.1)

 

Thank you for reading, Slava Urbanovich

Hello,

Sharing a recent experience with having to replace an IB switch in an Exalogic machine due to a hardware failure.

It started as fairly simple maintenance related to a hardware failure on redundant equipment - in this case, an IB switch in an Exalogic machine.

 

Hardware failures do happen, and IB switches are no exception to that rule. Also, IB switches in Exalogic machines aren't customer-replaceable units (CRUs).

Thus everything looks fairly simple, right? With ASR, OEM or something else reporting a hardware failure, an SR gets created, initial troubleshooting and RCA are done, and an Oracle Field Engineer is scheduled to replace the failed IB switch.

So far so good.

 

At the agreed-upon time, the Oracle FE shows up at your datacenter, replaces the IB switch, matches the firmware on the replacement switch to the healthy one, configures ILOM and management on the new IB switch, declares "Mission Accomplished" (remember that banner on the carrier?) and leaves.

 

You also run an Exachk report and it comes back clean - life is good, right? Wrong. Very wrong, unfortunately.

 

Even though the IB switch itself indeed isn't customer serviceable, restoring the configuration on it is.

Essentially, if you - the customer - haven't followed this MOS document immediately after an Exalogic IB switch replacement, you are exposed to a big problem:

Exalogic Infiniband Switch Replacement - Follow-up Actions (Restoration) (Doc ID 2218689.1)

 

How big of a problem, one may ask? Think of a complete outage of all VMs running on that Exalogic machine, all at once.

This is what just happened to one of our customers.

 

The moral of the story here is very simple - don't accept the FE's report that everything is good, since everything is good only from the HW perspective.

Your freshly replaced IB switch still has no configuration on it, and in the case of an IB fail-over, planned or unplanned, your Exalogic customers could get agitated very quickly.

 

Make sure to follow all the steps from this document and perform a fail-over test for at least one of the VMs.

Hope this article saves you from a major Exalogic outage or at least allows you to recover from it as quickly as possible.

Thank you for reading, Slava Urbanovich

Hello Oracle Community,

There are several scenarios when telnet could be a useful tool on Exadata or ODA.

For example, when there is a need to see why a specific server in the environment isn't responding to Exadata as expected.

More specific examples could be related to troubleshooting email alerts on the Exadata compute nodes or storage cells, or to validating the ports needed for an OEM setup or even X11.

 

One immediate thought could be: "Let's just run the # telnet <hostname> <port> command and be done."

 

That is easier said than done on Exadata though, as telnet is not installed as part of the standard Exadata image.

Also, in many cases you may not be permitted to install telnet or any other additional tools on Exadata due to the policies and restrictions in place.

 

Luckily though, the following command could be used in place of telnet with ease:

# echo > /dev/tcp/<hostname>/<port>; echo $?;

 

Since the method above relies only on the shell's built-in /dev/tcp redirection and the basic "echo" command, the technique works equally well on the compute nodes and on the storage cells.

The result should be 0 (zero) when the requested host responds on the port number provided.
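And since several ports often need to be validated at once - say, for an OEM agent, a listener and an SMTP relay - the same trick could be wrapped in a small loop. A minimal sketch, with a purely illustrative hostname and port list:

# for p in 25 1521 3872; do (echo > /dev/tcp/oem.acme.com/$p) 2>/dev/null && echo "port $p open" || echo "port $p closed"; done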

 

Coming back to our more specific example of testing an SMTP relay, and whether or not it is accepting requests from our Exadata machine's compute nodes or storage cells, the syntax could be as follows:

# echo > /dev/tcp/smtp.acme.com/25; echo $?;

0

 

If you see 0 (zero) for the output, things are looking up and your SMTP relay smtp.acme.com is indeed responding on port 25, as expected.

It is that simple.
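One caveat worth noting: if a firewall silently drops the packets for the port being tested, the echo redirection may hang for a while instead of failing right away. On OL6 and later, wrapping the test in the standard timeout utility avoids that - again a sketch with an illustrative hostname, where 0 still means success and 124 indicates a timeout:

# timeout 3 bash -c 'echo > /dev/tcp/smtp.acme.com/25'; echo $?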

 

Thank you for reading, Slava Urbanovich

Hello Oracle Community reader,

This article describes my experience deploying Bare Metal and Virtualized pre-X6 ODAs without needing a GUI / X11 interface.

I have used it for 40+ ODA deployments, both Bare Metal (BM) and Virtualized (ODAVP), so I feel comfortable saying that it works well.

The (re)deployment process for the X6-2 and X7-2 ODAs is different though, as it is web-browser based rather than X11-based, although similar logic applies there as well.

 

The standard ODA deployment guide describes the process as requiring a VNC viewer to complete the cluster deployment after the "oakcli configure firstnet" step and the placement of the end user bundle on the ODA.

At the same time, I had to deploy ODAs (both Bare Metal and Virtualized) in environments where the use of X11 - and of VNC even more so - was constrained.

That could happen due to the network design and firewalls being set up between network segments, for example.

X11 processes and applications may require TCP ports 6000-6063 to be opened, as per the documentation, while the more typical port range is 6000-6003.

Additionally, a VNC viewer could be prohibited in some environments, as it is not all that secure.

 

The workaround for those possible constraints around GUI / X11 steps during ODA deployment is quite simple actually.

The entire process doesn't require any additional time; it could even be more streamlined, as the ODA's onecommand.params file with the configuration information could be created ahead of deployment time.

 

At a high level, the process looks as follows:

Step 1. Download the ODA offline configurator, which is simply a Java-based tool supplied by Oracle that walks through all the same steps as you would see during a standard ODA deployment:

Oracle Database Appliance Manager Configurator

Step 2. Complete the ODA offline configurator process and create the onecommand.params file with the configuration information required to deploy the ODA.

             The ODA configurator could be run on practically any MS Windows server or laptop; it doesn't have to reside in the target network for the new ODA deployment either.

Step 3. Complete ODA (re)imaging and "oakcli configure firstnet"

Step 4. Transfer (or copy/paste) the file created by the ODA offline configurator to the ODA that needs to be deployed.

             ! Make sure that the filename is /opt/oracle/oak/onecmd/onecommand.params

Step 5. Run the ODA command /opt/oracle/oak/onecmd/GridInst.pl -l for BM or /opt/oracle/oak/onecmd/GridInst.pl -l -o for a Virtualized ODA deployment.

             That command will validate that the onecommand.params file is in place and readable, and it will provide the list of steps required to deploy the ODA.

Step 6. The key to deploying the ODA without having to use the X11-based GUI is to use step #0 to validate the content of the onecommand.params file.

             The GridInst.pl tool could be used to run either individual steps or a range of steps in one go.

            You could use a syntax like ./GridInst.pl -s 0 for BM or ./GridInst.pl -o -s 0 for ODAVP.

Step 7. Once the onecommand.params file has been validated by performing step #0 of GridInst.pl, you could complete the deployment by running either individual steps or a range of steps; see the consolidated sketch after this list.

               For example: ./GridInst.pl -r 1-24 to complete the deployment steps 1 through 24.
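Putting steps 4 through 7 together, a complete Bare Metal session could look like the sketch below; the node name is illustrative, and for ODAVP the -o flag would be added to each GridInst.pl call:

# scp onecommand.params root@oda-node0:/opt/oracle/oak/onecmd/onecommand.params
# /opt/oracle/oak/onecmd/GridInst.pl -l        <== list the available deployment steps
# /opt/oracle/oak/onecmd/GridInst.pl -s 0      <== validate the onecommand.params content
# /opt/oracle/oak/onecmd/GridInst.pl -r 1-24   <== run the remaining deployment steps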

More information about using ./GridInst.pl could be obtained from the MOS site, for example from this MOS DocID:

ODA (Oracle Database Appliance): Deployment & Cleanup Steps (Doc ID 1409835.1)

 

The overall ODA (re)deployment process will differ for different hardware releases and could be estimated at 1-2 hrs, provided that no significant troubleshooting is needed.

Thank you for reading, Slava Urbanovich

Hello,

We have observed a fairly severe issue after Bare Metal ODA patching from bundle 12.1.2.9.0 to bundle 12.1.2.12.0 (the latest available among the 12.1 bundles at the time of this writing).

Namely, none of the 11.2.0.4 databases would start once the Infrastructure & GI components of the ODA had been patched via the "oakcli update -patch 12.1.2.12.0 -server" command.

 

Error messages indicated that the DB control file could not be identified.

We were still able to see each 11.2.0.4 DB's control file via asmcmd, exactly where it was expected to be.

At the same time, either on a CRS restart or after the "srvctl start instance" command, the DB instance would show up in the "ps -ef | grep pmon" output for a few moments and then disappear.

 

After looking into the DB alert logs, the following error message was spotted at the times when the attempts to start the 11.2.0.4 databases were made:

ORA-600 [kfdJoin3] [20] [65535] [100]

 

Also, luckily enough, this MOS DocID was posted in mid-February 2018 and it describes the exact same issue ... and the solution!

Database Instance Will Not Start ORA-600 [kfdJoin3] [20] [65535] [100] After Patch 12.1.2.12.0 Including the ODA (Doc ID 2341707.1)

 

The solution came down to completing the ODA patching up to and including the RDBMS homes, and then downloading the one-off patch p18948177_112040_Linux-x86-64.zip, which simply needed to be applied per its README.

It is important to note that this one-off patch should NOT be applied to the GI homes on the ODA, since those homes would already be at 12.1 and this one-off patch is specific to the 11.2.0.4 homes.

 

Once we applied the patch via opatch apply, all 11.2.0.4 databases started as expected.
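For reference, the apply itself was a standard opatch exercise. Below is a minimal sketch - the 11.2.0.4 home path is illustrative, and the patch README remains the authority:

$ export ORACLE_HOME=/u01/app/oracle/product/11.2.0.4/dbhome_1
$ unzip p18948177_112040_Linux-x86-64.zip && cd 18948177   <== one-off patches typically unzip into a directory named after the patch number
$ $ORACLE_HOME/OPatch/opatch apply                         <== run as the RDBMS home owner, with the databases in that home stopped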

Hope this article helps if you end up in the situation we found ourselves in after applying ODA bundle patch 12.1.2.12.0 to an appliance where 11.2.0.4 databases are still running.

Thank you for reading, Slava

This is to share the results of our recent research on whether and when Oracle Database 12.2 will be supported on the ODA platform.

 

As of January 6, 2018, the answer is two-pronged - if you are lucky enough to have an ODA X7-2, then Database 12.2 is already supported, and it could be downloaded and deployed with, or upgraded to, ODA bundle 12.2.1.1.0 as needed.

 

For everybody else - that is, for the owners of the ODA X6-2 and older - Oracle DB 12.2 is still a thing of the future, as there is no ODA bundle at the moment that brings 12.2 to the older ODA hardware.

While different estimates have been given for when that will happen, February-March seems to be the most likely time frame.

Admittedly, 2018 has managed to start with one really Big Bang event, as ever more dramatic security vulnerabilities were disclosed for practically every smartphone, tablet, laptop, PC, on-prem server or cloud server currently in use.

To make things even worse, those newly disclosed vulnerabilities would not leave any trace if exploited.

Finally, as if all of that wasn't enough, the core of the problem lies within modern Intel chip design - meaning that a real fix would require a hardware replacement, not a software bug fix!

 

Yes, we are talking about the Meltdown and Spectre vulnerabilities here, as every responsible Exadata owner could be scrambling to understand how much this Modern Intel Chip Armageddon (MICA) could impact their environments.

While the jury is still out on the final determination of the impact, there are some strong indicators that things may not be as ugly as they sound.

 

For starters, Oracle Engineered Systems are expected to be treated as appliances or, as the Oracle marketing department would say, "Hardware and Software Engineered to Work Together".

This is a rather important line here, as on appliances such as the ODA you could barely put any third-party package - or even an Oracle Linux kernel fix - without written permission from Oracle Support.

 

The ZDLRA machine isn't far behind the ODA with that requirement, followed by Exadata and then Exalogic.

Indeed, customers are allowed to install third-party products on Exadata compute nodes, although, once again, not on the storage cells or IB switches.

 

In short - the majority of Exadata environments would only have software from Oracle Corp and possibly from other trusted vendors.

That alone greatly diminishes the risk of somebody exploiting the Meltdown and Spectre vulnerabilities on Exadata and other Oracle Engineered Systems.

 

Additionally, some Oracle Engineered Systems, for example Exadata and the ODA, support Ksplice.

Ksplice is a major differentiator between Red Hat Enterprise Linux (RHEL) and Oracle Enterprise Linux (OEL) that could come in handy in situations when the OS kernel needs to be updated without incurring downtime.

While Ksplice won't by itself address Meltdown or Spectre or any other individual vulnerability, activating Ksplice would allow Exadata owners to apply future kernel fixes and updates without downtime.

More on Ksplice could be found in this MOS article:

HOWTO: Install ksplice kernel updates for Exadata Database Nodes (Doc ID 2207063.1)
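Once the Ksplice Uptrack client is installed per that note, the day-to-day usage is minimal. A sketch, assuming the client has been configured with a valid access key:

# uptrack-upgrade -y    <== apply all available Ksplice kernel updates in memory, no reboot
# uptrack-show          <== list the Ksplice updates currently applied
# uptrack-uname -r      <== report the effective kernel version, including Ksplice updates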

 

As for the actual remediation of the current issues, it looks like it will first come in the form of software workarounds in the OS kernel.

The MOS documents below could be used to track progress on fix availability, and customers that need to be more proactive could create a security / vulnerability SR for each Oracle Engineered System type in their environment. That way, each customer with security / vulnerability SRs will be notified as soon as an approved fix / workaround is available.

 

How to research Common Vulnerabilities and Exposures (CVE) for Exadata packages (Doc ID 2256887.1)

Responses to common Exadata security scan findings (Doc ID 1405320.1)

 

While not much could be done to address the Meltdown and Spectre vulnerabilities on Exadata at the time of this writing, it could be a good idea - and a reminder - to review the section of the Exachk report that outlines discovered security gaps, as well as to consider reviewing and strengthening Exadata / Exalogic / ZDLRA / ODA security beyond the levels provided by their typical deployments.

 

As one could say, these are the days to Keep Calm and Keep Securing Exadata (among other systems).

This topic just came up during a customer's ODA patching with ODA bundle 12.1.2.9.0, which was still the latest as of mid-February 2017.

 

Apparently, the RDBMS homes on the ODAs were not getting their OJVM component updated.

That is still "by design" as OJVM patches are not currently covered by the ODA bundle patches.

More details could be found here: OJVM Patches on Oracle Database Appliance (ODA) (Doc ID 2100820.1)

 

We had opened an SR to clarify the matter and were told via that SR that:

"... There is an enhancement request submitted to include OJVM patches to ODA patchsets to ensure all the standards are being followed ..."  

 

Additionally, we were told via the same Oracle Support SR that OJVM patch updates could still be added to ODAs, following the normal patch conflict resolution approaches.

For example, per this MOS document: Oracle Recommended Patches -- "Oracle JavaVM Component Database PSU" (OJVM PSU) Patches (Doc ID 1929745.1)

Also, there is a possible caveat that OJVM patches may not be applied on ODAs in a rolling fashion, and they may need to be rolled back / removed during future patching on the ODAs where they were installed outside of the oakcli-based process.
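Before adding an OJVM PSU outside of the oakcli flow, a conflict check against each RDBMS home is a prudent first step. A minimal sketch - the home path and the unzipped patch directory are illustrative:

$ export ORACLE_HOME=/u01/app/oracle/product/11.2.0.4/dbhome_1
$ cd /tmp/ojvm_psu_patch                                   <== the directory where the OJVM PSU was unzipped
$ $ORACLE_HOME/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -ph ./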

 

Hope this article helps to clarify the matter.

Feel free to ask questions, as always.

 

Slava Urbanovich

Sr. Principal Engineer – Oracle Engineered Systems, Cloud Solutions


c | (847) 691-8843

p | (847) 983-2958   

surbanovich@forsythe.com

After having to fish for the availability (and history) of some Exadata features more than once, I have decided to put them together in a somewhat shortened and high-level format.

This publication is intended to track the new features that come with every new Exadata software and hardware release and to keep their timeline in one, hopefully easy-to-read, article.

 

For the most part, bug fixes and smaller enhancements coming at the lowest patch level won't be covered here.

For example, new features coming with Exadata storage software 12.1.2.3.0 will be covered in this article but not the bug fixes delivered with Exadata storage software patch 12.1.2.3.3.

You see the logic, right? With all that being said, let's get started:

 

The following features were added as new for Oracle Exadata Database Machine 12c Release 1 (12.1.2.3.0) <== This is the current / latest release as of Nov 07, 2016

The following features were added as new for Oracle Exadata Database Machine 12c Release 1 (12.1.2.2.0)

Think for a moment about how much cloud technology was behind the Chicago Cubs' fantastic winning game of 2016 and throughout the season.

Content-rich sites like MLB.com and others, with all their highlights, pictures, real-time stats, updates, news flashes and more, demand a lot of compute power for sure.

Not to mention all the storage needed for exponentially increasing traffic during the World Series, for example.

 

Here is what some of the industry experts have to say about it, and the numbers are truly staggering.

According to "Cloud collaboration is a hit for MLB's rich media management":

 

"MLB.com uses cloud collaboration and homegrown digital asset management to create and manage sports highlights for 2.61 billion annual visitors."

Note: That is 2.61 billion with a B, or more than a third of Earth's population!

 

"Because there's a separate production group for each game played by the 30 major-league teams during the 162-game regular season, there can be up to 15 four-person teams creating, archiving and distributing rich media and digital assets every night in the studio from April through September. That content is displayed on MLB.com and its team sites, and among partners like Yahoo Sports and WatchESPN.com, so an effective digital asset management system backed up by enterprise collaboration is critical.

And it's made even more important because it's not just baseball highlights."

 

More detail is described at https://www.oracle.com/corporate/pressrelease/mlb-network-leads-off-with-oracle-to-chronicle-baseball-history-100815.htm… ,

"MLB Network Leads Off with Oracle to Chronicle Baseball History and Expand Programming" and below are some interesting excerpts from it:

 

- Over the course of a single year, MLB Network, the ultimate television destination for baseball fans, collects and stores more than 120,000 hours of content from its daily TV programming, game telecasts and replays. It all resides in an active archive built on Oracle DIVArchive and Oracle’s StorageTek SL8500 modular library system, that’s easily searched and mined to allow the network to create differentiated content every day.

Initially, MLB Network built its digital asset management system to log live games and take advantage of its vast historical library. Dubbed the DIAMOND System, it includes a custom set of metadata tools to search, retrieve, and create collections and lists of elements from its video footage.  Using Oracle DIVArchive as part of its solution, MLB Network can manage and access  every baseball game played each season, with as many as seven different video feeds per game, catalog over 600 hours of live video content daily and archive  over 7 petabytes of information annually.

 

“Baseball fans are some of the most fervent and dedicated on Earth, and today’s game is rooted in the ability and opportunity to catalog every play in every game that takes place during each season,” said Tab Butler, director, post production and media management, MLB Network. “With Oracle DIVArchive and Oracle’s StorageTek SL8500 modular library system, we have been able to cater to baseball fans’ needs by meticulously housing every highlight and being able to bring every angle into our daily programming. As a result, we’ve been able to not only chronicle baseball history as it’s happening, but also grow our programming schedule significantly and drive our business forward by providing our fans with the content they truly desire on a daily basis.”

 

MLB Network’s archive resides on Oracle’s StorageTek SL8500 modular library system with 3,000 slots, which takes in about 50 TB of content and swaps out 400-500 cartridges daily. To support continued growth over the next seven years, MLB Network is also evaluating expanding to a second StorageTek SL8500 modular library system and migrating to Oracle’s StorageTek T10000D enterprise drives which would give the network a 10:1 cartridge space reduction and overall higher throughput over its current environment."

 

The technology behind the game is as amazing as the game itself ... Oracle's modular StorageTek library system swaps 400-500 cartridges daily!

That is one gigantic and busy storage system behind the MLB.com ecosystem!

 

Finally, as you could have imagined, technology doesn't stand still, with more innovations and additions made earlier in 2016 as well.

Oracle Announces New Cloud Storage Options At NAB 2016 | StorageReview.com - Storage Reviews

 

"24 hours of video can require up to 86TB of capacity and this will only go up as cameras become better. In order to tackle this influx of data Oracle has integrated DIVA with Storage Cloud Serve – Archive Storage which gives customers an incremental cloud extension tier to on-premises archive solutions, such as Oracle’s StorageTek SL8500 and SL3000 tape libraries. This combined system allows customers to tier storage to both on-premises and the cloud giving users the performance and economics needed.

Oracle has also made updates to some of its other products that support storage and archiving media:

  • Oracle DIVAnet 2.0: With the latest release of Oracle DIVAnet, media and entertainment organizations can securely share and back up rich media assets globally by connecting up to 10 Oracle DIVArchive systems through a single global namespace. Access control by site and by application is now enabled in DIVAnet 2.0 to provide additional security for valuable video assets.
  • Oracle DIVArchive 7.4: Oracle’s top content storage management application now has incorporated Oracle Database 12c and has added support for Oracle Linux, further enhancing the benefits of running Oracle software on Oracle hardware.
  • AXF Explorer: With the release of Oracle AXF Explorer, Oracle DIVA customers gain access to the SMPTE standard open format for sharing and managing content among disparate storage systems.

Oracle’s DIVA solutions are powered by Oracle Database 12c and are used by the top media companies. In fact the top 10 media companies run on Oracle software. One example Oracle cites is Major League Baseball (MLB). Using Oracle DIVArchive, MLB moves more that 600 hours of video content every day and 7PB of data annually."

 

Thank you for reading

Update on Dec 01, 2016:

Oracle has provided a solution for the "Dirty COW" (CVE-2016-5195) vulnerability on Exadata, which is delivered as the complete patch 12.1.2.3.3.161109.

It is important to apply 12.1.2.3.3 with the 161109 build date as there were earlier build dates for this release and they do not resolve the "Dirty COW" vulnerability.

 

Exadata 12.1.2.3.3 release and patch (24441458) (Doc ID 2181366.1)

 

Version: 12.1.2.3.3

Patches:
Patch 24441458 - Storage server and InfiniBand switch software (12.1.2.3.3.161109)
Patch 24669306 - Database server bare metal / domU ULN exadata_dbserver_12.1.2.3.3_x86_64_base OL6 channel ISO image (12.1.2.3.3.161109)
Patch 24669307 - Database server dom0 ULN exadata_dbserver_dom0_12.1.2.3.3_x86_64_base OVM3 channel ISO image (12.1.2.3.3.161109)

Notes:
Recommended. See Note 1270094.1 for additional fixes that address critical issues.
Supplemental README: Note 2181366.1
12.1.2.3.3 was updated from 12.1.2.3.3.161013 to 12.1.2.3.3.161109 to include important fixes. See the fix list in patch 24441458 for details.

===================

Original article below

===================

Since almost all Exadata, Exalogic and ZDLRA machines in the world run on Oracle Linux, they could be vulnerable to CVE-2016-5195, which allows privilege elevation during Copy-On-Write operations - hence the "Dirty COW" nickname.

 

This vulnerability became widely known in mid-October 2016, and according to sources like Risk Assessment | Ars Technica UK, the current state of patch development is as follows:

"The underlying bug was patched this week by the maintainers of the official Linux kernel. Downstream distributors are in the process of releasing updates that incorporate the fix. Red Hat has classified the vulnerability as "important.""

 

The dangers of this vulnerability are also described on Risk Assessment | Ars Technica UK as follows:

"As their names describe, privilege-escalation or privilege-elevation vulnerabilities allow attackers with only limited access to a targeted computer to gain much greater control. The exploits can be used against Web hosting providers that provide shell access, so that one customer can attack other customers or even service administrators. Privilege-escalation exploits can also be combined with attacks that target other vulnerabilities. A SQL injection weakness in a website, for instance, often allows attackers to run malicious code only as an untrusted user. Combined with an escalation exploit, however, such attacks can often achieve highly coveted root status."

 

Oracle has already categorized this vulnerability as of October 21, 2016, and RPMs that include the fix for this CVE have been released as well.

Please refer to this page for up-to-the-minute updates (and the patched RPMs) for this vulnerability: linux.oracle.com | CVE-2016-5195

Depending on your Exadata bundle patch level, the Oracle Linux version would be either 6 (hopefully) or 5, so please look for the appropriate errata links on the page above.
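As a quick, hedged check of whether a given node is already running a kernel that carries the fix, the RPM changelog could be queried; the package name kernel-uek applies to UEK kernels, and an empty result means the installed package predates the fix:

# uname -r                                                 <== the kernel version currently running
# rpm -q --changelog kernel-uek | grep -i CVE-2016-5195    <== present only if the fixed package is installed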

 

Update on 10/27:

The main page that tracks this vulnerability linux.oracle.com | CVE-2016-5195 is getting more updates pretty much daily now.

The updated RPMs mentioned there could be found at https://oss.oracle.com/sources/ under the Oracle 5 or Oracle 6 links.

 

Additionally, for the current Exadata, Exalogic and ZDLRA patch levels, which assume OEL 6, the following MOS document could be used as well:

 

Oracle Linux 6: Reference Index of Security Vulnerability Bug fixes, CVE IDs and Oracle Linux Errata (Doc ID 2112930.1)

...

Customers may find status of fixes for CVEs for Oracle Linux through our Unbreakable Linux Network (ULN). Please refer to Oracle Support Document 1593465.1 "Unbreakable Linux Network (ULN) Administrative Features for Errata and CVEs"

This listing is sorted by the date of publication by Oracle.

 

 

Errata Date    Component           CVE ID                       Errata
25-Oct-2016    Kernel-2.6.32       CVE-2016-5195 (dirty COW)    ELSA-2016-2105
21-Oct-2016    Kernel-UEK-2.6.39   CVE-2016-5195 (dirty COW)    ELSA-2016-3634
21-Oct-2016    Kernel-UEK-3.8.13   CVE-2016-5195 (dirty COW)    ELSA-2016-3633
21-Oct-2016    Kernel-UEK-4.1.12   CVE-2016-5195 (dirty COW)    ELSA-2016-3632

...

There has been one well-known and widely used security feature that was not available on Exadata, at least not until earlier in 2016.

 

It may be worth a separate discussion why InfiniBand (IB) partitioning was not available on Exadata until 2016, but regardless of the reasons, it is now possible to have IB partitioning enabled on any previously deployed Exadata machine. IB partitioning is also available for new Exadata deployments, and a number of improvements and bug fixes were made in that area of the Oracle Exadata Deployment Assistant (OEDA).

Minimum Exadata storage software requirements still apply, of course, so you may need to patch your Exadata machine(s) first.

 

Here is why IB partitioning is important in many cases and why you may want to consider it seriously.

With IB partitioning, it is now possible to prevent Exadata Oracle RAC nodes from one cluster from communicating over the InfiniBand (IB) fabric (used as the Exadata RAC cluster interconnect) with nodes belonging to any other Oracle RAC cluster on the same Exadata machine, or even with remote RAC nodes on other IB "daisy-chained" Exadata machines.

 

Exadata machines, since their early incarnations going back to the 2008+ era, were fast - extremely fast actually - yet not extremely cheap in terms of hardware, software and service costs.

As a result of extreme performance at a somewhat premium price, Exadata machines were oftentimes, well ... shared ... between different lines of business within the same company or even between several external customers. Such sharing had been handled with either several Oracle RAC clusters or with several ASM disk groups created for the same cluster, on either Bare Metal or Virtualized Exadata machines.

This approach does indeed provide a certain level of isolation, although all node-to-node and node-to-storage-cell communication had been occurring over the same InfiniBand (IB) partition, with the same (default!) IB partition key that was shared by ALL Exadata machines in the world!

The good news with the new IB partitioning capability is that cluster nodes (BM and VM) could be forced to communicate only with the nodes from the same RAC cluster and/or with a dedicated set of storage cells.

What makes it even better is that this new "world order" could be equally enforced on any previously deployed Exadata as well as any new Exadata deployment as of April 2016.

 

Once IB partitioning is configured and enforced, the RAC nodes can only communicate with members of the same IB partition, without even being aware of any other nodes on the same Exadata machine.

The same is true for "compute node-to-storage cell" IB communication: once storage IB partition keys are created and assigned, the compute nodes belonging to any set of storage IB keys will be allowed to communicate only with the storage cells that were assigned the same IB P-key.

 

This is clearly a concept that Facebook would not tolerate, but when it comes to Exadata cluster interconnect communication it makes perfect sense indeed.

This is a major Exadata security enhancement that was made rather silently, yet it should be considered seriously for Exadata environments where data from multiple business lines / customers could be co-hosted.

 

Below is a quote from the MOS document that describes this "walled English garden" approach to cluster interconnect communication quite well:

"Every Node within the infiniBand fabric has a partition key table which may be viewed under /sys/class/infiniband/mlx4_0/ports/[1-2]/pkeys. Every Queue Pair(QP) of the node has an index (P_Key) associated with it that maps to an entry in that table. Whenever a packet is sent from the QP’s send queue, the indexed P_Key is attached with it. Whenever a packet is received on the QP’s receive queue, the indexed P_Key is compared with that of the incoming packet. If it does not match, the packet is silently discarded. The receiving Channel Adapter does not know it arrived and the sending Channel Adapter gets no acknowledgement as well that it was received. The sent packet simply gets manifested as a lost packet. It is only when the P_Key of the incoming packet matches the indexed P_Key of the QP’s receive queue, a handshake is made and the packet is accepted and an acknowledgment is sent to the sending channel adapter. This is how only members of the same partition are able to communicate with each other and not with hosts that are not members of that partition (which means those hosts that does not have that P_Key in their partition table)."