To summarize my reply that fell to Senor Quicksdraw's guns: ;-)
1) I wasn't commenting on performance one way or the other.
2) I didn't necessarily mean literal hardware cache and RAM (main memory). I was referring to the JLS/JVM abstractions. However, I would expect a typical implementation to map those abstractions fairly closely onto the hardware.
3) Related to #2: a CPU's instructions for forcing cache coherency (memory barriers, for example) let a JVM meet the requirements of the spec while the hardware still gives threads their own copies of variables in actual physical cache.
4) I should have said, "Your code +must behave as if+ .... yadda yadda cache, etc."
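
To make the "behave as if" point concrete, here is a minimal sketch. The class and field names (VisibilityDemo, ready, payload) are made up for illustration and aren't from anyone's code in this thread. It assumes the Java 5+ memory model: the write to the volatile flag happens-before the read that sees it, so the reader must also see the earlier plain write to payload, whatever the hardware caches are actually doing underneath.

```java
public class VisibilityDemo {
    // Hypothetical names, for illustration only.
    // volatile: a write to 'ready' happens-before any later read that sees it.
    private static volatile boolean ready = false;
    // Plain field: published safely only because it is written before 'ready'.
    private static int payload = 0;

    static int runOnce() {
        ready = false;
        payload = 0;
        final int[] seen = new int[1];

        Thread reader = new Thread(() -> {
            // Spin until the volatile write becomes visible.
            // Without volatile, this loop would be allowed to spin forever.
            while (!ready) {
                // busy-wait
            }
            // Guaranteed to observe 42, not a stale 0.
            seen[0] = payload;
        });
        reader.start();

        payload = 42;   // ordinary write
        ready = true;   // volatile write publishes the write above

        try {
            reader.join();  // join() also makes the reader's writes visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce());
    }
}
```

Take volatile off ready and the spec no longer promises that the loop ever exits or that payload reads as 42. On many JVMs it will probably still appear to work, which is exactly why the guarantee has to come from the spec and not from whatever the hardware happens to do.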