I've seen this before, as I indicated, and I can give you a rough outline of what is most likely happening.
You start a node ... any node ... and it is happy.
You try to start a second node and it cannot join the cluster for one of two reasons: either (A) it cannot see the voting disk, or (B) communications across the cluster interconnect are failing.
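A quick way to start separating (A) from (B) is to run a few checks from the node that is already up. This is only a sketch: it assumes a Linux RAC node with the Oracle Clusterware tools on the PATH, and `eth1` / `node2-priv` are placeholders for your interconnect NIC and the second node's private address.

```shell
# Run these from the node that is already up.

# (A) Voting disk visibility: crsctl is Oracle Clusterware's own CLI;
# the guard skips it gracefully on machines where it is not installed.
command -v crsctl >/dev/null && crsctl query css votedisk

# (B) Interconnect health: oifcfg getif shows which network Clusterware
# registered as the private interconnect.
command -v oifcfg >/dev/null && oifcfg getif

# Then ping the peer across that interface. "eth1" and "node2-priv" are
# placeholders; substitute your interconnect NIC and the second node's
# private address.
getent hosts node2-priv >/dev/null && ping -c 3 -I eth1 node2-priv

echo "basic checks attempted"
```

If (A) passes cleanly on the surviving node but the joining node still cannot start, that by itself shifts suspicion toward (B).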
My experience leads me to favor (B), but only because that is how I have personally experienced it in the past. So how did we diagnose it? First we had to get the network admins to acknowledge that there is more to networks than blinking green lights. Then we had them look at the metrics for the number of packet resends ... in short, the number of times the existing instance rejected the packets sent by the new instance, or the switch rejected packets sent by either instance.
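While you wait on the network admins, you can watch the host side of those counters yourself. A minimal sketch, assuming a Linux host with the iproute2 tools (`ip`, `nstat`) installed; the switch-side counters still require the switch vendor's CLI:

```shell
# Per-NIC counters: climbing RX/TX "errors" or "dropped" numbers on the
# interconnect NIC are exactly the rejected packets described above.
ip -s link

# Host-wide TCP retransmit counters. nstat ships with iproute2;
# -a shows absolute values rather than deltas, -z includes counters
# that are currently zero.
nstat -az TcpRetransSegs TcpExtTCPLostRetransmit
```

Run both a few times while the second node tries to join: it is the rate of change, not the absolute number, that tells the story.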
If your network admins are unskilled at this level, which is most likely the case, they will need to show a little vulnerability and call the switch vendor's support line for help with this task: only a small percentage of admins know how to do it.
Can you get us the exact specifics on the cache fusion interconnect ... how many switches, how they are connected, 1Gb or 10Gb, jumbo frames (MTU 1500 or 9000) for every port on the switches and the servers, and when the last software and firmware patches were applied to the switches? We need all of it, not some of it, to take this further.
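One item on that list you can verify yourself is whether jumbo frames actually work end to end, not just on paper. A sketch, assuming the Linux iputils `ping`; `node2-priv` is a placeholder for the second node's private-interconnect address:

```shell
PEER=${PEER:-node2-priv}   # placeholder: set PEER to your peer's private address

if getent hosts "$PEER" >/dev/null; then
  # A 9000-byte MTU leaves 9000 - 20 (IP header) - 8 (ICMP header) = 8972
  # bytes of payload. -M do sets Don't Fragment, so if any port in the
  # path is still MTU 1500 this fails with "message too long" instead of
  # silently fragmenting.
  ping -M do -s 8972 -c 3 "$PEER"

  # The same probe sized for a standard 1500 MTU (1472-byte payload)
  # should always succeed if basic connectivity is fine.
  ping -M do -s 1472 -c 3 "$PEER"
else
  echo "set PEER to your peer's real private address first"
fi
```

A cluster where the 1472-byte probe succeeds but the 8972-byte probe fails is the classic mixed-MTU misconfiguration: one side configured for jumbo frames, some port in the path still at 1500.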
You can read my thoughts about most network admins here:
Look for the <RANT> tag.