We've been doing a lot of work over the past couple years to accelerate many complex Java 2D operations on the GPU via OpenGL fragment shaders. (Fragment shaders are little programs that operate on each pixel being rendered on the screen; they're infinitely more flexible than the fixed functionality that's historically been available in OpenGL.) The sky's the limit when it comes to the kind of effects one can achieve by writing shader programs.

The first GPUs with fully programmable shader support were made available from Nvidia and ATI in 2002. It took a couple years for the new hardware to penetrate into the consumer market, but now shader-capable GPUs are extremely prevalent (thanks in part to the hefty graphical requirements of Mac OS X and Windows Vista), so much so that we can reliably take advantage of shaders to accelerate complex Java 2D operations. Even those first-generation boards are capable of providing huge performance gains, and each new generation of hardware seems to give an order of magnitude improvement over the last. This should be quite evident from the charts that follow. While CPU speeds seem to be nearing an asymptote, GPU performance continues to rocket, and now Java 2D is able to benefit from that power. Not only does this mean improved performance for your Swing or Java 2D application, but also reduced CPU consumption, thus freeing up your CPU to crunch on application logic rather than getting bogged down with rendering tasks.

[This is one of those blog entries that could be novel length, but no one would actually read the words, because there are too many pretty bar charts to distract the reader. Blah blah blah, words words words. See? No one's reading. So let's skip the prose and get on with it... Oh, but first, I have to tell you how to read these charts. I generated these numbers on a couple different machines, using J2DBench on Windows XP with the latest graphics drivers (ATI Catalyst 7.3 and Nvidia 93.71). Since the machines vary slightly in processor performance and bus speed (the GeForce 7800 is a PCI-E board, the rest are AGP), I decided to use our software pipeline as a baseline, and then compare the OGL pipeline numbers to that baseline. For example, if you see a result that lines up with the number 2000, it means that test is 2000% of baseline, or in other words, it is about 20 times faster on the GPU than on the CPU. Your mileage may vary, but the big takeaway is that most operations are many times faster when executed on the GPU...]

Text Rendering (Bug 6274813: Available in JDK 6)

LCD-Optimized Text
I already discussed this a bit in a blog entry from about a year ago. Since then, we've enabled this by default in JDK 7 (and soon in a JDK 6 update) when the OGL pipeline is enabled. It's cool to see how software performance improves little over time, but each new generation of GPUs brings big performance improvements.


BufferedImageOps (Bug 6514990: Available in JDK 7 b08)

ConvolveOp is commonly used for modern UI effects such as blurs and drop shadows. Due to limitations in first-generation shader-level hardware, we are currently only acceleratingConvolveOp for 3x3 and 5x5 sized Kernels. These are fairly common kernel sizes, but most drop shadow and glow effects require larger kernels, so we are working to loosen these restrictions and accelerate a wider range of kernel sizes.



LookupOp is often used to perform simple brightness and contrast adjustments on images. To simplify our code, we are currently only accelerating LookupOp forByteLookupTables and ShortLookupTableswith a maximum length of 256 elements, and for 1-, 3-, and 4-band sRGB images only.



RescaleOp is basically a degenerate case ofLookupOp that can be accelerated very efficiently in shaders; after all, it's just a multiply and an add. We are currently accelerating RescaleOp for 1-, 3-, and 4-band sRGB images only.


Multi-stop Gradient Paints (Bug 6521533: Available in JDK 7 b10)

For 2-stop linear gradients (NO_CYCLE orREFLECT), we can delegate to our existingGradientPaint codepath, which is already ridiculously fast via OpenGL's fixed functionality. For all other linear gradients (for all CycleMethods andColorSpaceTypes, up to a maximum of 12 color stops), we can accelerate the operation using shaders, for both antialiased and non-antialiased rendering.



The same restrictions for linear gradients also apply to radial gradients (maximum of 12 color stops, etc). For gradients with more than 12 stops, we simply fall back on our existing software routines.


Extended Composite Modes (Bugs 6531647,5094232: On the way)

Antialiased Painting (with Non-SrcOver AlphaComposite)
Historically the OGL pipeline has been able to accelerate the compositing step of an antialiased painting operation only whenAlphaComposite.SrcOver (orAlphaComposite.Src, if the paint is opaque) due to the math involved. (This is described in the quasi-official guide to the OpenGL-based Java 2D pipeline, but that document could use a refresh for JDK 6 and beyond.) But now with shaders we're able to accelerate antialiased rendering for any arbitrary AlphaComposite mode.

Coming Soon: PhotoComposite
It's not officially approved (or integrated) yet, but I've been working on adding more blending modes to Java 2D, in addition to those already provided by AlphaComposite. Many of these modes come from traditional photography techniques, thus the name "PhotoComposite". Some modes are simple (Add, Multiply) and can be accelerated easily using OpenGL's built-in blending rules, others are more involved (ColorBurn, SoftLight) and benefit greatly from the use of shaders for efficient rendering.

What's Next? 

There are plenty more optimizations that can be made to common Java 2D operations by leveraging shader technology; we'll keep working on this. Also, there have been some discussion on the interest list recently about the use of shaders in Java 2D. Some folks would like to be able to write arbitrary shaders and have them work on Java 2D content. I think it would be hard to come up with a general solution (in the public API) to make this work everywhere, and would shift an unreasonable burden to Java 2D (which is designed with WORA in mind).

However, I do recognize that it would be great if it were easy for developers to make use of shaders in their applications (as Romain has demonstrated) without getting bogged down in the OpenGL/GLSL learning curve. To that extent, I've been working on a few utility classes for JOGL in the vein of TextureIO and related classes. This is another way to make the transition easier for existing Swing developers to leverage JOGL in their applications. More on that in a future blog entry.

Finally, it's worth mentioning that all the shader-based optimizations I've described above are currently only available for the OpenGL-based Java 2D pipeline, but that will soon change. For JDK 7 (maybe sooner?), we have a newly redesigned Direct3D-9-based pipeline in the works that will share much of the architecture (and code) of the OpenGL-based pipeline. We fully expect that all of these shader-based optimizations will be available for most Windows users in the near future. Stay tuned.

In my ears: Panda Bear, "Person Pitch"
In my eyes: JPG, Issue 9