I've found -Xcomp. But I'm still not seeing my frequently-called methods mentioned by -XX:+PrintCompilation. Could it be because they're being inlined? I was using
but now I'll experiment further.
It would be helpful to know the relation between compiling and inlining.
causes the aforementioned frequently-called methods to be shown by -XX:+PrintCompilation.
My question about the relation between compilation and inlining: does inlining prevent compilation?
Last night, with large values for the above switches to encourage aggressive inlining, and without -Xcomp, a frequently-called method was not listed as compiled. The method is mostly a switch statement. When I broke out some of the case code to separate new methods -- replacing the code in the switch with a method call -- the new methods were listed as compiled. Maybe the frequently-called method was inlined? But why weren't the new methods inlined? Perhaps no nested inlining?
Well, to be honest, I would not worry too much about whether your code is inlined or not. If HotSpot decides that your code is too large, it has good reasons not to inline it.
There are tons of benchmarks showing that overly aggressive inlining can be really bad, because of L1/L2 instruction-cache misses.
However, HotSpot has some limitations on which sorts of code it can compile, so it could be that your methods are not compiled at all. I don't think that is the case here, though, because you already said that with some arguments they are compiled.
My guess would be that the small methods are first inlined and then compiled and you only see the base-methods.
I would do some micro-benchmarks (please read some articles about common microbenchmarking pitfalls in Java first) and compare the performance...
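As a rough illustration of the micro-benchmark suggestion, here is a minimal sketch. The class SwitchBench and its methods are hypothetical stand-ins for the switch-heavy method described earlier in the thread, not the poster's actual code. The sketch warms the code up before timing and consumes the result so that the loop cannot be discarded as dead code. A dedicated benchmarking harness would be more trustworthy than hand-rolled timing like this.

```java
// Hypothetical micro-benchmark sketch; all names here are illustrative assumptions.
final class SwitchBench {
    // Stand-in for a frequently called method that is mostly a switch statement.
    static int dispatch(int op, int x) {
        switch (op) {
            case 0:  return x + 1;
            case 1:  return x * 2;
            case 2:  return x - 3;
            default: return x;
        }
    }

    // Runs the workload and returns a value so the result is actually used.
    static long run(int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += dispatch(i & 3, i);
        }
        return sum;
    }

    public static void main(String[] args) {
        // Warm-up phase: lets the VM profile and compile before anything is timed.
        long sink = 0;
        for (int i = 0; i < 20; i++) {
            sink += run(1_000_000);
        }
        // Timed phase.
        long start = System.nanoTime();
        long result = run(10_000_000);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        // Printing the results keeps them observable.
        System.out.println("result=" + result + " sink=" + sink + " ms=" + elapsedMs);
    }
}
```

Running the same class with different inlining settings and comparing the timed phase gives a first impression, though single runs are noisy and should be repeated.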
I assume that you are working with the HotSpot server VM.
Up front, I should say that +PrintCompilation does not indicate which methods have been inlined, only the methods that are the root of the compilation. (There are ways in debug VMs to print inlining decisions.)
The inlining algorithm in the server VM is complicated, and the VM could profit from some simplification in this area.
There is an overriding size limit on methods that can get compiled. In 5.0, methods larger than 8000 bytes (of bytecode) are not considered for compilation.
Apart from that limit, method size is not a compilation criterion, but it is an inlining criterion. To avoid bloat in the CodeCache, the VM will shy away from inlining larger methods. You apparently have discovered this, as you experimented using the MaxInlineSize and FreqInlineSize knobs. Furthermore, if the called method is already compiled, it may still be inlined into the caller, but it is subject to an additional test against the size of the generated code. Additionally, there are rules for accessor methods to allow them to be inlined nearly always. You can look at bytecodeInfo.cpp in the HotSpot source to get a more detailed sense of the inline decision making.
In the other direction, inlining should not prevent compilation, but could delay it. Compilation is triggered by the number of times a method is invoked in the interpreter. If a method is inlined at a hot call site, the rate of interpreter invocations will obviously decrease, delaying standalone compilation of the method.
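To watch this interplay yourself, a small test program along the following lines may help. It is only a sketch: HotLoopDemo, square and sumOfSquares are hypothetical names. The idea is that square is small and called from a hot loop, so it is a likely inlining candidate inside sumOfSquares.

```java
// Hypothetical demo class; run with: java -XX:+PrintCompilation HotLoopDemo
final class HotLoopDemo {
    // Small callee: a likely candidate for inlining at the hot call site below.
    static int square(int x) {
        return x * x;
    }

    // Hot caller: if square() is inlined here, compiled code no longer
    // invokes square() as a separate method from this loop.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += square(i);
        }
        return total;
    }

    public static void main(String[] args) {
        long checksum = 0;
        // Repeated calls so the invocation and loop counters pass the compile thresholds.
        for (int round = 0; round < 1000; round++) {
            checksum += sumOfSquares(10_000);
        }
        System.out.println(checksum);
    }
}
```

Which methods show up in the -XX:+PrintCompilation output depends on the VM version and flags, so treat any particular listing as an observation about your setup rather than a rule.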
As observed with the reference to final/static methods, direct inlining can only occur when the called method is known. In some cases, inlining can occur at virtual call sites, but only when there is a single implementor or when a runtime class check is incorporated into the generated code.
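The "runtime class check" case can be pictured with a hand-written Java analogue. This is only an illustration of the idea, not what the compiler literally emits, and Shape, Square, Circle and GuardedCallDemo are hypothetical names.

```java
// Hypothetical types used only to illustrate guarded inlining.
interface Shape {
    double area();
}

final class Square implements Shape {
    final double side;
    Square(double side) { this.side = side; }
    public double area() { return side * side; }
}

final class Circle implements Shape {
    final double radius;
    Circle(double radius) { this.radius = radius; }
    public double area() { return Math.PI * radius * radius; }
}

final class GuardedCallDemo {
    // Hand-written analogue of a virtual call site where profiling
    // suggested that the receiver is almost always a Square.
    static double areaGuarded(Shape s) {
        if (s.getClass() == Square.class) {
            // Guard passed: the body of Square.area() is "inlined" by hand.
            Square sq = (Square) s;
            return sq.side * sq.side;
        }
        // Guard failed: fall back to an ordinary virtual call.
        return s.area();
    }
}
```

Both paths return the same value as a plain s.area() call; the guard only changes how the common case is reached.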
As an aside, -Xcomp is not a good choice as a rule of thumb. This option causes some methods to be compiled prematurely, that is, before referenced classes are loaded. For the server compiler, profiling data is non-existent under -Xcomp, and the VM may deoptimize/recompile a method several times before getting an "optimal" compile.
Hope this helps.
This may be an obvious answer, but try profiling your code first, if you care about performance.
HotSpot performance is rarely the issue. In my experience, most delays are caused either by a coding error (unnecessary calls, O(N^2) or worse) or by unneeded GC pauses (due to allocation).
In any case, premature optimization without hard data is usually a bad idea.
com.stevebrecher wrote:
Much more generally, I lack understanding of why comprehensive compilation of large portions of my code is not (as far as I know) an option. In the case of this application and subsequent ones in the endeavor (akin to scientific research) it would be perfectly acceptable to have the jvm spend "a lot" of time at the start of a run doing compilation.
What you actually want is to pass -server to the JVM. What you think you want is wrong, basically because Java code compiled that way would suck until it is recompiled again, after some profiling. In Java, instance methods are virtual by default, unlike in C++. But >90% of virtual calls always call the same method, because the destination object type is the same. After HotSpot runs the code a bit, these cases are recognized, so when the code is compiled those calls can be made non-virtual or the called method can be inlined (and when guesses are wrong, the code detects this and does a real virtual call - no misbehaviour, it just won't be that fast in that case).
Without this, inlining is impossible, which is a big problem for optimization. Even common subexpression elimination suffers, because in general the return value of the same method called with the same arguments may change; only after inlining can the optimizer see where this won't happen.
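A tiny example of the point about repeated calls, using hypothetical names (Counter, CseDemo). From the call site alone, two calls with the same receiver and arguments look identical, yet only one of the pairs below could safely be collapsed into a single call.

```java
// Hypothetical class: next() has a side effect, peek() does not.
final class Counter {
    private int n;

    int next() { return ++n; }  // changes state on every call
    int peek() { return n; }    // pure read of the current state
}

final class CseDemo {
    // The two next() calls return different values, so they must both run.
    static int twiceNext(Counter c) {
        return c.next() + c.next();
    }

    // The two peek() calls return the same value; once peek() is inlined,
    // an optimizer can see that and reuse the first result.
    static int twicePeek(Counter c) {
        return c.peek() + c.peek();
    }
}
```

Without seeing the bodies of next() and peek(), a compiler would have to treat both pairs conservatively and keep every call.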
You know about profile-guided optimizations? They can also be done in C/C++: you first run your application with profiling enabled on "typical" inputs, then the profile data is fed back into the compiler for a second compilation. Obviously, if the actual input is very different and the resulting profile is different, you lose. And it's annoying, so nobody does it. With advanced JVMs, these problems don't happen.