This discussion is archived
8 Replies Latest reply: May 3, 2013 4:41 PM by 1007059

RDS

BillyVerreynne Oracle ACE
I'm struggling to find a clear and definitive description of what RDS needs in terms of a software stack and how to configure it.

It runs over Infiniband. The documentation seems to indicate that IPoIB must be enabled (makes sense, as IP addressing and interfaces are needed).

But what else does RDS run on? Can it run via RDMA (seen mention in some web notes of rds-rdma)? How does this differ from running it via IP (using rds-ip)?

Basically, what is needed in the kernel (specific modules) in order to load RDS for running via IB?

The most detailed document I've found thus far was from IBM - and outdated as it referred to an older OFED version (which was very buggy in my experience) for a much older Linux kernel than RHEL/OL5 (never mind RHEL/OL6).

Reference URLs and comments will be much appreciated.
  • 1. Re: RDS
    Dude! Guru
    Here is what I can find about the topic. Perhaps some of it is useful.

    https://oss.oracle.com/projects/rds/
    RDS is now included in the Linux kernel ... Currently supported transports include TCP sockets and IB Verbs Reliable Connected connections ... RDS/IB listens for incoming connections on port 18634. RDS uses RDMA private connect parameter data, both when initiating and accepting a connection.

    http://www.oracle.com/technetwork/database/clustering/tech-generic-linux-new-086754.html
    RDS over IB is supported .... Oracle only supports InfiniBand HCA with Mellanox chip set

    You will probably need to install rds-tools:
    # yum info rds-tools
    Various tools for support of the RDS API. RDS is specific to InfiniBand and iWARP networks and does not work on non-RDMA hardware.

    Using RDS over InfiniBand:
    http://www.dell.com/downloads/global/power/ps2q07-20070279-Mahmood.pdf

    What is RDS RDMA?
    https://oss.oracle.com/pipermail/rds-devel/2007-November/000205.html

    More related references including InfiniBand, RDMA and RHEL 6 from the IBM site:
    http://publib.boulder.ibm.com/infocenter/lnxinfo/v3r0m0/topic/performance/howtos/infinibandintro.htm
  • 2. Re: RDS
    BillyVerreynne Oracle ACE
    Thanks. Been messing about with it on dev servers that have HCAs and are wired to an IB switch.

    rds-tcp loads on top of the rds kernel module, which seems to indicate a pure IP NIC implementation. (It also caused a hard kernel reset when running a ping: the target server crashed without so much as an error in the kernel log.)

    rds-rdma loads on top of the ib-core kernel module, indicating RDS over IB via the RDMA (Remote Direct Memory Access) protocol. This is the same layer that the SCSI IB protocol (SRP) runs over.

    But I'm not planning to look any further at this over the weekend... all work and no play makes one a dull boy. ;-)
  • 3. Re: RDS
    BillyVerreynne Oracle ACE
    Got rds to work properly on Oracle Linux 5.9.

    The driver stack looks as follows:
    rds       // RDS module
    rds_rdma  // Infiniband driver and not IP driver
    rdma_cm   // RDMA connection manager
    ib_core   // Infiniband Core module
    The problem I ran into was this:
    [root@... ~]# cat /etc/modprobe.d/rds.conf
    install rds /sbin/modprobe --ignore-install rds && /sbin/modprobe rds_tcp && /sbin/modprobe rds_rdma
    This really messes up modprobe and results (in my case) in over 19,000 modprobe processes and unknown symbol errors. The fix was simply to comment out this instruction.
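
    A quick way to check whether the four modules in the stack above are actually loaded is to look for them in /proc/modules. The following is only a sketch: the function name missing_rds_modules is made up for illustration, and the optional file argument exists so the check can be tried against a saved copy of /proc/modules.

```shell
# Print each module from the stack above that is NOT listed in a
# /proc/modules-style file (the first column is the module name).
# missing_rds_modules is a hypothetical helper name, not part of rds-tools.
missing_rds_modules() {
    modules_file="${1:-/proc/modules}"
    for mod in rds rds_rdma rdma_cm ib_core; do
        # Compare against the first field exactly, so "rds" does not
        # accidentally match "rds_rdma" or "rds_tcp".
        if ! awk -v m="$mod" '$1 == m { found = 1 } END { exit !found }' "$modules_file"; then
            echo "$mod"
        fi
    done
}
```

    Running missing_rds_modules with no argument reads /proc/modules on the local machine; no output means all four modules are present.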
  • 4. Re: RDS
    user12028852 Newbie
    Can you tell me whether you used the OFED packages or a yum install of the InfiniBand support packages to get the InfiniBand/RDS drivers? Also, which kernel are you using?

    Thanks
    Alan
  • 5. Re: RDS
    BillyVerreynne Oracle ACE
    Kernel version:
    2.6.32-300.39.2.el5uek

    OFED drivers:
    ofa-2.6.32-300.39.2.el5uek.x86_64

    Misc:
    rds-tools.x86_64
    infiniband-diags.x86_64

    There is a range of RPM dependencies that will result in additional packages being installed automatically.

    Oracle support recommended the following to me (kind of includes the kitchen sink too ;-) ):
    # yum install opensm opensm-devel opensm-libs opensm-static openib ibutils ibutils-devel ibutils-libs infiniband-diags libcxgb3 libcxgb3-libs libibcm libibcm-devel libibcm-static libibcommon libibcommon-devel libibcommon-static libibmad libibmad-devel libibmad-static libibumad libibumad-devel libibumad-static libibverbs libibverbs-devel libibverbs-static libibverbs-utils libipathverbs libipathverbs-static libmlx4 libmlx4-static libmthca libmthca-static librdmacm librdmacm-devel librdmacm-static librdmacm-utils libsdp mstflint perftest qlvnictools qperf srptools tvflash libnes libnes-static

    As mentioned, I ran into issues doing a modprobe rds_rdma, and I've found /etc/modprobe.d/rds.conf to be the culprit.
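
    The fix of commenting out the install line can be scripted roughly as follows. This is a sketch under assumptions: the function name comment_out_install_rds and the default backup path are made up for illustration, and the backup is deliberately kept outside /etc/modprobe.d in case the installed modprobe reads every file in that directory.

```shell
# Comment out any active "install rds ..." line in a modprobe config file.
# Arguments (both optional, defaults are assumptions):
#   $1  config file to edit      (default /etc/modprobe.d/rds.conf)
#   $2  where to keep the backup (default /root/rds.conf.orig)
comment_out_install_rds() {
    conf="${1:-/etc/modprobe.d/rds.conf}"
    backup="${2:-/root/rds.conf.orig}"
    [ -f "$conf" ] || { echo "no such file: $conf" >&2; return 1; }
    # Keep an untouched copy before rewriting the file.
    cp -p "$conf" "$backup" || return 1
    # Prefix matching lines with "# "; the trailing whitespace in the pattern
    # keeps lines such as "install rds_tcp ..." untouched.
    sed 's/^[[:space:]]*install[[:space:]][[:space:]]*rds[[:space:]]/# &/' "$backup" > "$conf"
}
```

    To undo the change, copy the backup file back over the config file.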
  • 6. Re: RDS
    tmh Newbie
    I have experienced the same thing and have an open service request with Oracle about this.
    This has happened to me 3 times when upgrading from an existing 5.8 to 5.9.

    After the update and reboot, modprobe runs thousands of times until the server runs out of memory and won't continue.


    If something comes from my SR I'll see if I can update again.
  • 7. Re: RDS
    BillyVerreynne Oracle ACE
    Have you tried the fix that I described as working in my case?

    Try the following. Rename the file /etc/modprobe.d/rds.conf to /etc/modprobe.d/rds.conf-backup and reboot. If this still does not work, rename the file back to its original name.

    BTW, are you trying to run RDS over IP or IB?
  • 8. Re: RDS
    1007059 Newbie
    The original problem, with modprobe generating 19,000 errors and running the system out of memory, is a race condition in the boot loader.
    I've created an SR for this issue and have a patched RPM from Oracle Support. There is a fairly simple workaround. Let me dig up my SR.


    "
    There is a conflict between rds.conf and ofa-2.6.32-300.32.2.el5uek.conf.
    remove any one of them, boot will works well.
    It seems the "alias rds rds_rdma" make "install rds" run failure.
    "
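
    To see whether a system has the pair of settings described in the quote above, the config directory can be scanned for both kinds of line. This is only a sketch: the function name find_rds_conflicts is made up for illustration, and it does nothing more than report which files contain an "install rds" command and which contain an "alias rds ..." line.

```shell
# Report files in a modprobe config directory that contain an "install rds"
# command alongside files that contain an "alias rds ..." line.
# find_rds_conflicts is a hypothetical helper; the directory argument is
# optional and defaults to /etc/modprobe.d.
find_rds_conflicts() {
    conf_dir="${1:-/etc/modprobe.d}"
    install_files=$(grep -lE '^[[:space:]]*install[[:space:]]+rds[[:space:]]' "$conf_dir"/* 2>/dev/null)
    alias_files=$(grep -lE '^[[:space:]]*alias[[:space:]]+rds[[:space:]]' "$conf_dir"/* 2>/dev/null)
    if [ -n "$install_files" ] && [ -n "$alias_files" ]; then
        echo "possible conflict:"
        echo "install rds in: $install_files"
        echo "alias rds in: $alias_files"
        return 1
    fi
    echo "no install/alias pair for rds found in $conf_dir"
}
```

    A non-zero return status means both kinds of line were found; which file to remove or edit is still a decision for the administrator.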


    I use RDS for high performance Oracle RAC clusters.
