0 Replies Latest reply on Jan 4, 2010 1:53 AM by 807557

    Sun DataCenter Infiniband 36 FIRMWARE?

      Hi -
      Can someone point me to instructions for checking the current firmware version on our Sun DataCenter Infiniband 36 QDR switch, and to any information on upgrading the firmware if necessary?

      I can SOMETIMES run MVAPICH or Open MPI jobs over IB successfully, but usually I get
      a "CQ polling error". So I went back to the basic RDMA tests, and those show problems too.

      We have installed OFED 1.4.1-4 and, because I was having problems, I upgraded the firmware on the HCAs:

      lspci | grep -i infin
      0b:00.0 InfiniBand: Mellanox Technologies MT25418 [ConnectX IB DDR, PCIe 2.0 2.5GT/s] (rev a0)

      mstflint -d 0b:00.0 q
      Image type: ConnectX
      FW Version: 2.6.0
      Device ID: 25418
      Chip Revision: A0
      Description: Node Port1 Port2 Sys
      GUIDs: 0003ba000100d770 0003ba000100d771 0003ba000100d772
      MACs: 0003ba00d771 0003ba00d772
      Board ID: (SUN0060000001)
      PSID: SUN0060000001
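
For comparing firmware levels across several blades, the query output above can be parsed mechanically. The following is only a minimal sketch: it assumes the output keeps the "Key: value" layout shown above, and the helper names parse_mstflint_query and version_tuple are hypothetical, not part of mstflint or OFED.

```python
def parse_mstflint_query(text):
    """Parse 'Key: value' lines from mstflint query output into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            info[key.strip()] = value.strip()
    return info


def version_tuple(version):
    """Turn a dotted version such as '2.6.100' into (2, 6, 100)."""
    return tuple(int(part) for part in version.split("."))


# Sample text copied from the query output in this post.
SAMPLE = """\
Image type: ConnectX
FW Version: 2.6.0
Device ID: 25418
Chip Revision: A0
Description: Node Port1 Port2 Sys
GUIDs: 0003ba000100d770 0003ba000100d771 0003ba000100d772
MACs: 0003ba00d771 0003ba00d772
Board ID: (SUN0060000001)
PSID: SUN0060000001
"""

info = parse_mstflint_query(SAMPLE)
print(info["FW Version"], info["PSID"])
# Numeric comparison, so that 2.6.100 sorts above 2.6.0 and below 2.7.0.
print(version_tuple(info["FW Version"]) < version_tuple("2.7.0"))
```

Comparing versions as integer tuples rather than as strings avoids ordering surprises between releases such as 2.6.0, 2.6.100 and 2.7.0. The PSID is extracted as well because the firmware image has to match the board.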

      An rping from the client to the server gives:
      created cm_id 0x10ca7c70
      cma_event type RDMA_CM_EVENT_ADDR_RESOLVED cma_id 0x10ca7c70 (parent)
      cma_event type RDMA_CM_EVENT_ROUTE_RESOLVED cma_id 0x10ca7c70 (parent)
      rdma_resolve_addr - rdma_resolve_route successful
      created pd 0x10caa3d0
      created channel 0x10caa3f0
      created cq 0x10caa410
      created qp 0x10caa550
      rping_setup_buffers called on cb 0x10ca5010
      allocated & registered buffers...
      cq_thread started.
      cma_event type RDMA_CM_EVENT_ESTABLISHED cma_id 0x10ca7c70 (parent)
      rmda_connect successful
      RDMA addr 10caaa90 rkey 2002800 len 100
      send completion
      cma_event type RDMA_CM_EVENT_DISCONNECTED cma_id 0x10ca7c70 (parent)
      client DISCONNECT EVENT...
      wait for RDMA_WRITE_ADV state 6
      cq completion failed status 5
      rping_free_buffers called on cb 0x10ca5010
      destroy cm_id 0x10ca7c70
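
The "status 5" in the output above can be decoded, assuming rping prints the raw status field of the work completion. Under that assumption the number maps onto enum ibv_wc_status from libibverbs' <infiniband/verbs.h>. The sketch below lists only the first values of that enum, and the function name decode_cq_failure is hypothetical.

```python
import re

# First values of enum ibv_wc_status from <infiniband/verbs.h>.
# Assumption: the status printed by rping is this enum's numeric value.
IBV_WC_STATUS = {
    0: "IBV_WC_SUCCESS",
    1: "IBV_WC_LOC_LEN_ERR",
    2: "IBV_WC_LOC_QP_OP_ERR",
    3: "IBV_WC_LOC_EEC_OP_ERR",
    4: "IBV_WC_LOC_PROT_ERR",
    5: "IBV_WC_WR_FLUSH_ERR",
    6: "IBV_WC_MW_BIND_ERR",
    7: "IBV_WC_BAD_RESP_ERR",
    8: "IBV_WC_LOC_ACCESS_ERR",
    9: "IBV_WC_REM_INV_REQ_ERR",
    10: "IBV_WC_REM_ACCESS_ERR",
    11: "IBV_WC_REM_OP_ERR",
    12: "IBV_WC_RETRY_EXC_ERR",
    13: "IBV_WC_RNR_RETRY_EXC_ERR",
}


def decode_cq_failure(line):
    """Return the symbolic status name for a 'cq completion failed' line."""
    match = re.search(r"cq completion failed status (\d+)", line)
    if match is None:
        return None  # not a completion-failure line
    code = int(match.group(1))
    return IBV_WC_STATUS.get(code, "unknown status %d" % code)


print(decode_cq_failure("cq completion failed status 5"))
```

If that mapping holds, status 5 is a flush error, which is normally reported for work requests still outstanding when the queue pair has already entered the error state. Here it follows the DISCONNECTED event, so it is probably a symptom of the unexpected disconnect rather than the root cause.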

      I found the HCA firmware at

      and 2.6.0 is the latest version available for the Sun OEM boards. It has been suggested that I upgrade to 2.6.100 or 2.7.0, but I'm not sure which image I should download from the Mellanox site.

      In terms of hardware, we have X6250 blades. On the software side, we are running Linux kernel 2.6.18-92.1.26.el5_lustre and OFED 1.4.1-4. These X6250 blades have been a real pain to get working with IB; we've been at it for a long time.