8 Replies Latest reply: May 3, 2013 6:41 PM by 1007059

    RDS

    Billy~Verreynne
      I'm struggling to find a clear and definitive description of what RDS needs in terms of a software stack and how to configure it.

      It runs over InfiniBand. Documentation seems to indicate that IPoIB must be enabled (makes sense, as IP addressing and an interface are needed).

      But what else does RDS run on? Can it run via RDMA (seen mention in some web notes of rds-rdma)? How does this differ from running it via IP (using rds-ip)?

      Basically, what is needed in the kernel (specific modules) in order to load RDS for running via IB?

      The most detailed document I've found thus far was from IBM - and outdated as it referred to an older OFED version (which was very buggy in my experience) for a much older Linux kernel than RHEL/OL5 (never mind RHEL/OL6).

      Reference URLs and comments will be much appreciated.
        • 1. Re: RDS
          Catch 22
          Here is what I can find about the topic. Perhaps it is some useful info.

          https://oss.oracle.com/projects/rds/
          RDS is now included in the Linux kernel ... Currently supported transports include TCP sockets and IB Verbs Reliable Connected connections ... RDS/IB listens for incoming connections on port 18634. RDS uses RDMA private connect parameter data, both when initiating and accepting a connection.

          http://www.oracle.com/technetwork/database/clustering/tech-generic-linux-new-086754.html
          RDS over IB is supported .... Oracle only supports InfiniBand HCA with Mellanox chip set

          You will probably need to install rds-tools:
          # yum info rds-tools
          Various tools for support of the RDS API. RDS is specific to InfiniBand and iWARP networks and does not work on non-RDMA hardware.

          Using RDS over InfiniBand:
          http://www.dell.com/downloads/global/power/ps2q07-20070279-Mahmood.pdf

          What is RDS RDMA?
          https://oss.oracle.com/pipermail/rds-devel/2007-November/000205.html

          More related references including InfiniBand, RDMA and RHEL 6 from the IBM site:
          http://publib.boulder.ibm.com/infocenter/lnxinfo/v3r0m0/topic/performance/howtos/infinibandintro.htm
          • 2. Re: RDS
            Billy~Verreynne
            Thanks. Been messing about with it on dev servers that have HCAs and are wired to an IB switch.

            rds-tcp loads on top of the rds kernel module, which seems to indicate a pure IP NIC implementation. (It also caused a hard kernel reset when running a ping: the target server crashed without so much as an error in the kernel log.)

            rds-rdma loads on top of the ib-core kernel module, indicating RDS over IB via the RDMA (Remote Direct Memory Access) protocol. This is the same layer that the SCSI IB protocol (SRP) runs on.

            But not planning at the moment to look any further at this over the weekend... all work and no play makes one a dull boy. ;-)
            • 3. Re: RDS
              Billy~Verreynne
              Got rds to work properly on Oracle Linux 5.9.

              The driver stack looks as follows:
              rds       // RDS module
              rds_rdma  // Infiniband driver and not IP driver
              rdma_cm   // RDMA connection manager
              ib_core   // Infiniband Core module
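A quick way to confirm that the stack listed above is actually loaded is to look for the four module names in /proc/modules. The following is a minimal sketch, not part of the original setup: the function name check_rds_stack is made up for illustration, and it assumes the modules appear under exactly the names listed.

```shell
# Report which modules of the RDS-over-IB stack are loaded.
# Usage: check_rds_stack [modules_file]   (defaults to /proc/modules)
# check_rds_stack is a hypothetical helper name, not an rds-tools command.
check_rds_stack() {
    modules_file="${1:-/proc/modules}"
    missing=0
    for mod in ib_core rdma_cm rds rds_rdma; do
        # /proc/modules lists one module per line; the name is the first field.
        if awk -v m="$mod" '$1 == m { found = 1 } END { exit !found }' "$modules_file"; then
            echo "$mod: loaded"
        else
            echo "$mod: missing"
            missing=1
        fi
    done
    # Non-zero status if at least one module of the stack is absent.
    return "$missing"
}
```

Called without an argument it reads /proc/modules; a non-zero return status means at least one of the four modules is missing.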
              The problem I ran into was this:
              [root@... ~]# cat /etc/modprobe.d/rds.conf
              install rds /sbin/modprobe --ignore-install rds && /sbin/modprobe rds_tcp && /sbin/modprobe rds_rdma
              This really messes up modprobe and results (in my case) in over 19,000 modprobe processes and unknown symbol errors. The fix was to simply comment out this instruction.
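Commenting out the instruction can be scripted along these lines. This is only a sketch under assumptions: disable_rds_install is a hypothetical helper name, and the backup path is passed explicitly so the copy can live outside /etc/modprobe.d, as a precaution in case modprobe also reads files there that do not end in .conf.

```shell
# Comment out any active "install rds ..." line in a modprobe config file.
# Usage: disable_rds_install CONF_FILE BACKUP_FILE
#   e.g. disable_rds_install /etc/modprobe.d/rds.conf /root/rds.conf.orig
# disable_rds_install is a hypothetical helper name used for illustration.
disable_rds_install() {
    conf="$1"
    backup="$2"
    [ -f "$conf" ] || { echo "no such file: $conf" >&2; return 1; }
    [ -n "$backup" ] || { echo "backup path required" >&2; return 1; }
    # Keep an untouched copy before rewriting the file.
    cp -p "$conf" "$backup" || return 1
    # Prefix the matching line with "# "; lines for other modules
    # (rds_tcp, rds_rdma) and already commented lines are left alone.
    sed 's/^[[:space:]]*install[[:space:]]\{1,\}rds[[:space:]]/# &/' "$backup" > "$conf"
}
```

A reboot (or unloading and reloading the modules) is still needed afterwards for the change to take effect.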
              • 4. Re: RDS
                user12028852
                Can you tell me if you used OFED packages or yum install infiniband support to get the infiniband/rds drivers? Also, which kernel are you using?

                Thanks
                Alan
                • 5. Re: RDS
                  Billy~Verreynne
                  Kernel version:
                  2.6.32-300.39.2.el5uek

                  OFED drivers:
                  ofa-2.6.32-300.39.2.el5uek.x86_64

                  Misc:
                  rds-tools.x86_64
                  infiniband-diags.x86_64

                  There is a range of RPM dependencies that will result in additional packages being installed automatically.

                  Oracle support recommended the following to me (kind of includes the kitchen sink too ;-) ):
                  # yum install opensm opensm-devel opensm-libs opensm-static openib ibutils ibutils-devel ibutils-libs infiniband-diags libcxgb3 libcxgb3-libs libibcm libibcm-devel libibcm-static libibcommon libibcommon-devel libibcommon-static libibmad libibmad-devel libibmad-static libibumad libibumad-devel libibumad-static libibverbs libibverbs-devel libibverbs-static libibverbs-utils libipathverbs libipathverbs-static libmlx4 libmlx4-static libmthca libmthca-static librdmacm librdmacm-devel librdmacm-static librdmacm-utils libsdp mstflint perftest qlvnictools qperf srptools tvflash libnes libnes-static

                  As mentioned, I ran into issues doing a modprobe rds_rdma, and I found /etc/modprobe.d/rds.conf to be the culprit.
                  • 6. Re: RDS
                    tmh
                    I have experienced the same thing and have an open service request with Oracle about this.
                    This has happened to me 3 times when upgrading an existing 5.8 install to 5.9.

                    After the update and reboot, modprobe runs thousands of times until the server runs out of memory and won't continue.


                    If something comes from my SR I'll see if I can update again.
                    • 7. Re: RDS
                      Billy~Verreynne
                      Have you tried the fix that I described, which worked in my case?

                      Try the following. Rename the file /etc/modprobe.d/rds.conf to /etc/modprobe.d/rds.conf-backup and reboot. If this still does not work, rename the file back to its original name.

                      BTW, are you trying to run RDS over IP or IB?
                      • 8. Re: RDS
                        1007059
                        The original error message with modprobe generating 19,000 errors and running the system out of memory is a race condition in the boot loader.
                        I've created an SR for this issue and have a patched RPM from Oracle Support. There is a fairly simple workaround. Let me dig up my SR.


                        "
                        There is a conflict between rds.conf and ofa-2.6.32-300.32.2.el5uek.conf.
                        remove any one of them, boot will works well.
                        It seems the "alias rds rds_rdma" make "install rds" run failure.
                        "


                        I use RDS for high performance Oracle RAC clusters.
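
To see whether a system has the combination described in the SR quote (an "alias rds ..." line in one file and an "install rds ..." line in another), the config directory can be scanned for both. A rough sketch, assuming both files live in /etc/modprobe.d; find_rds_conflict is a hypothetical name, not a tool shipped with rds-tools or OFED.

```shell
# Look for both an "alias rds ..." and an "install rds ..." line in the
# modprobe config directory.
# Usage: find_rds_conflict [config_dir]   (defaults to /etc/modprobe.d)
# Returns 0 if both kinds of line are present, 1 otherwise.
# find_rds_conflict is a hypothetical helper name used for illustration.
find_rds_conflict() {
    dir="${1:-/etc/modprobe.d}"
    # The extra /dev/null argument makes grep print file names even when
    # the directory holds a single file.
    aliases=$(grep -E '^[[:space:]]*alias[[:space:]]+rds[[:space:]]' "$dir"/* /dev/null 2>/dev/null || true)
    installs=$(grep -E '^[[:space:]]*install[[:space:]]+rds[[:space:]]' "$dir"/* /dev/null 2>/dev/null || true)
    if [ -n "$aliases" ] && [ -n "$installs" ]; then
        echo "possible conflict:"
        echo "$aliases"
        echo "$installs"
        return 0
    fi
    echo "no alias/install conflict found for rds"
    return 1
}
```

If it reports a possible conflict, the quoted advice applies: remove or disable one of the two files and reboot.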