We went from 3.1 to 3.7, and we got screwed by the fact that the partitions are done differently and take up more memory. On 3.1 we had hordes of small (512MB) heap storage nodes that were using maybe 80% of their heap capacity; after the upgrade they started to fill up and go pop-goes-the-weasel on us until we repacked them onto fewer nodes with larger heaps. We were going to do the fewer-nodes-with-larger-heaps bit as part of the original 3.7 upgrade, but a well-meaning support person, who had probably forgotten about memory footprint changes made in the distant past, told us we didn't need to bother.
In 3.7 they removed a bunch of methods that had been deprecated since 3.2 or so; it wasn't that hard to find their new equivalents.
There were some other undocumented changes. In 3.1, if you had a near cache and you did a cache.put(key, foo), the value went into both your near cache and the back cache. In a later version they changed that. We had some code in places that relied on those values being in the near cache, so some strange bugs cropped up in our testing. We literally had to follow each put with a subsequent get to actually populate the near cache, so those near cache puts became way more expensive. (They were only in a few spots, so it wasn't that big of a deal, but it was annoying that there was no way to configure the old behavior of having cache.put() into a near cache actually populate the near cache.)
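To make the behavior change concrete, here's a toy model of a near cache (our own illustration, not the Coherence API): put() writes only to the back map, and only a subsequent get() pulls the value into the near (front) map. The class and method names are made up for the sketch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy model (NOT the Coherence API) of the semantics we hit after upgrading:
// put() no longer populates the near cache; only get() does.
class ToyNearCache<K, V> {
    private final Map<K, V> front = new ConcurrentHashMap<>(); // near cache
    private final Map<K, V> back = new ConcurrentHashMap<>();  // back cache

    // Newer behavior: the put goes to the back cache only,
    // and any stale near-cache entry is invalidated.
    V put(K key, V value) {
        front.remove(key);
        return back.put(key, value);
    }

    // A get() is what pulls the value into the near cache.
    V get(K key) {
        V value = front.get(key);
        if (value == null) {
            value = back.get(key);
            if (value != null) {
                front.put(key, value);
            }
        }
        return value;
    }

    // Is the value currently held in the near cache?
    boolean isNear(K key) {
        return front.containsKey(key);
    }
}
```

With this model, a put() alone leaves isNear() false, which is exactly why our workaround was the put-then-get pair: the extra get() is what repopulates the near cache.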
The upgrade was good - if you are considering the upgrade because of stability I can say that for us 3.7 is way better than 3.1... or at least it was once we got the heaps sized better. 3.1 was really awful. 3.7 has been really solid. When we were stuck on 3.1 our lives were miserable.
In our installation we have seen a few lingering issues that haven't been fixed yet. There's one annoying issue where the storage nodes go into crazy storms and send NIC-saturating blasts of duplicate packets at our client nodes, at such a rate that our switches start to get overwhelmed and other services are impacted (we're talking 24 nodes each sending at 1Gb speed toward 2 or 3 client nodes; it's a bug in how multicast packets are resent). We're hoping that the next patch release will fix part of that when we put it on our live servers next week, but we're not holding our breath, since we were told that making storage nodes use flow control when resending multicast packets is "too big of a change for a point release". We did have some amazing help from support in getting to the bottom of the problem, including one of the engineers writing his own custom packet filter to analyze our traffic dumps and figure out what was actually being sent in these storms. Other than the traffic storms, 3.7 has performed well for us.
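For anyone curious what "flow control on resends" would even mean here, the usual shape is a token bucket: a sender may only retransmit when enough byte tokens have accrued, which caps resend bandwidth instead of blasting at full NIC speed. This is a generic sketch of that idea (our own illustration, not Coherence internals; all names are made up).

```java
// Generic token-bucket throttle for retransmissions (illustrative only,
// not Coherence code). A resend is permitted only when enough byte
// "tokens" have accumulated, capping the retransmit rate.
class ResendThrottle {
    private final long bytesPerSecond; // resend bandwidth cap
    private final long burstBytes;     // max burst size
    private double tokens;             // bytes currently available to send
    private long lastRefillNanos;

    ResendThrottle(long bytesPerSecond, long burstBytes) {
        this.bytesPerSecond = bytesPerSecond;
        this.burstBytes = burstBytes;
        this.tokens = burstBytes;
        this.lastRefillNanos = System.nanoTime();
    }

    // Returns true if the packet may be resent now; otherwise the caller
    // should back off and retry later instead of saturating the network.
    synchronized boolean tryResend(int packetBytes) {
        long now = System.nanoTime();
        tokens = Math.min(burstBytes,
                tokens + (now - lastRefillNanos) * bytesPerSecond / 1e9);
        lastRefillNanos = now;
        if (tokens >= packetBytes) {
            tokens -= packetBytes;
            return true;
        }
        return false;
    }
}
```

Under a cap like this, a storage node that needs to resend a flood of packets would spread them out over time rather than saturating its NIC, which is the behavior we were hoping a point release could add.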
So my advice is: lean on Oracle support, but realize that when it comes to upgrading ancient versions of Coherence they probably don't remember what has changed since then. (Although 3.3 isn't as ancient as 3.1 was; we were one of the first Coherence customers, so there can't be many other people who were on 3.1.) So be sure to test things on your own as well.