3 Replies Latest reply on Jan 21, 2013 4:48 PM by omarh-oracle - oracle

    Building a department GPU cluster

      Hi all.

      I'm writing to ask for advice or a pointer in the right direction.

      In our department, more and more researchers are asking us (the IT
      administrators) to assemble (or buy) GPGPU-powered workstations for
      parallel computing.

      Since I already manage a small CPU cluster (with resources managed by
      OGE), my boss and I have discussed building a new GPU cluster. The
      problem is that I have no experience at all with GPU clusters.

      Apart from the GPU workstations already in use, we have some new
      hardware that looks to me like a promising starting point for building
      and testing a temporary GPU cluster:

      - 1x Dell PowerEdge R720
      - 1x Dell PowerEdge C410x
      - 1x NVIDIA M2090 PCIe x16
      - 1x NVIDIA iPASS Cable Kit

      I'd be grateful for any advice or pointers in the right direction.

      In particular, I'm interested in your opinion on the following:
      1) Is the above hardware suitable for a small GPU cluster (2 to 4,
      possibly 6, GPUs)?
      2) Is OGE suitable as a queuing and resource management system, or what
      should we use instead? We would like the cluster to be usable by many
      users at once, so that no user has to worry about resources, just as
      on our CPU cluster with OGE.
      3) Which Linux distribution would be most appropriate?
      4) What software stack is necessary (CUDA, OGE, Torque, Hadoop?, other?)

      Any hint will be greatly appreciated.

      Thank you very much for all your valuable insight!

      Best regards.