Explorations: Wildcards in the Generics Specification Blog

Version 2

    In my last column, I talked about some of the more subtle aspects of the current generics implementation for Java. In particular, I talked about erasure and bridging. Both erasure and bridging are implementation techniques that the designers of the generic specification adopted for backwards compatibility.

    In this column, I'm going to continue talking about advanced aspects of the generics specification. But I'm going to leave the implementation details behind and instead focus on wildcards. Wildcards are a recent addition to the generics specification based on the idea that sometimes you don't want to precisely specify the value for a type parameter. Instead, you want to leave the type parameter unbound, as a signal to the compiler that the exact type isn't important (the important thing is that there is a type parameter, not the particular binding). The name "wildcard" comes from the fact that what we're talking about is replacing either a named type parameter, such as T, or an actual type with the unbound (and therefore unreferenceable) type parameter ? (the ? is the "wildcard").

    Before we start, I'd like to add the usual caveat: JDK 1.5 isn't in its final form yet. It's entirely possible that wildcards won't make it into JDK 1.5, or will make it into JDK 1.5 in a slightly different form.

    Acknowledgements: Tom Hill and Martin O'Connor did yeomanlike work, slogging through the early drafts of this article and providing feedback. It's almost certainly still got a few mistakes, but they did a fine job at winnowing the number down.

    The Syntax of Wildcards

    The board game Othello used to have a slogan: "A minute to learn, a lifetime to master." Similarly, wildcards are a fairly small change to the syntax of generics (a minute to learn), yet are one of the more subtle ideas in the generics specification (a lifetime, or at least an entire article, to master). So rather than dive into the mentally challenging parts of the specification right away, we're going to start slowly and begin with the syntax for wildcards. Once you're familiar with the syntax, we'll move on to talking about the reasons wildcards were added to the specification.

    Because the syntax for wildcards is so simple, we're not going to bother with a formal specification. Instead, I'll give you a quick rule, and then proceed using examples. The general rule is that anywhere you could use a type parameter in a field, method, or variable declaration, you can instead use a ?, provided that you don't actually reference the (previously) named type parameter in your code.

    Thus, for example, suppose you had the following method:

        // Version 1.
        public <T extends Animal> void printAnimals(Collection<T> animals) {
            for (Animal nextAnimal : animals) {
                System.out.println(nextAnimal);
            }
        }

    Note: I'm using the new for loop that's coming in JDK 1.5. See Joshua Bloch's comments for more details.

    The body of the method doesn't reference T at all -- the type parameter is declared but not used. In this case, you can remove the declaration of T and replace the use of T with a "wildcard" (which is always denoted by ?). When we rewrite the method, we get the following code:

        // Version 2. Version 1, wildcarded.
        public void printAnimals(Collection<? extends Animal> animals) {
            for (Animal nextAnimal : animals) {
                System.out.println(nextAnimal);
            }
        }

    It's very important that the type parameter not be referenced in the method body. The ? is supposed to drive an intuition that a wide variety of types "match" the declaration (in much the same way that ? is used in regular expressions). If you start trying to use the type of the ? in the body of your code, you're constructing an implicit binding for it (and violating the idea of a wildcard). What's more, your code won't even compile. Consider, for example, the following method, which explicitly references T.

        // Version 3. We use the type parameter in the method body.
        public <T extends Animal> void printAnimals(Collection<T> animals) {
            for (T nextAnimal : animals) {
                System.out.println(nextAnimal);
            }
        }

    Replacing T by ? gives you:

        // Version 4. The ? confuses the compiler.
        public void printAnimals(Collection<? extends Animal> animals) {
            for (? nextAnimal : animals) {
                System.out.println(nextAnimal);
            }
        }

    This code doesn't even compile. The compiler reports an "illegal start of expression" error because it's trying to figure out the ternary operator inside the for loop.

    Note: if you try to compile Version 3 with the current download, you'll run into a bug in the generics compiler. But if you use an iterator instead of the new for loop, you'll see what I mean.
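
    As a rough sketch of the iterator variant mentioned in the note, Version 3 might be written as follows. The wrapper class IteratorVersionDemo, the minimal Animal class, and the demo method are hypothetical stand-ins added so the listing compiles on its own, and the method returns its output as a list rather than printing it, purely to make it easy to check.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;

// Hypothetical wrapper class; the article only shows the method itself.
class IteratorVersionDemo {
    // Hypothetical stand-in for the article's Animal class.
    static class Animal {
        private final String name;
        Animal(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }

    // Version 3 with an explicit iterator instead of the new for loop.
    // T is referenced in the body, so it has to stay a named type parameter.
    static <T extends Animal> List<String> describeAnimals(Collection<T> animals) {
        List<String> descriptions = new ArrayList<String>();
        Iterator<T> iterator = animals.iterator();
        while (iterator.hasNext()) {
            // Writing "? nextAnimal" here would not compile: ? is not a type name.
            T nextAnimal = iterator.next();
            descriptions.add(nextAnimal.toString());
        }
        return descriptions;
    }

    // Builds a small collection and runs the method on it.
    static List<String> demo() {
        List<Animal> animals = new ArrayList<Animal>();
        animals.add(new Animal("Fido"));
        animals.add(new Animal("Harvey"));
        return describeAnimals(animals);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```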

    Similarly, you can use wildcards in field declarations. For example, in the following code snippet, we have a private collection defined using the type parameter T.

        public class Trainer<T extends Animal> {
            private Collection<T> _myAnimals = new ArrayList<T>();

    If the fact that the collection contained instances of T wasn't important, we could have declared the collection as follows:

    private Collection<? extends Animal> _myAnimals = new ArrayList<T>();

    If you understood these examples, you understand the syntax of wildcards. The only other syntactic extension I want to mention in this article is the "super" keyword. Just as you can declare ? extends Animal, you can declare ? super Animal. For example, the following declaration defines a field whose value must be an instance of Collection whose type parameter is Animal or a superclass of Animal.

    private Collection<? super Animal> _myCollection;

    An instance of Collection<Object> would match this declaration; an instance of Collection<Dog> would not. While we won't cover super in this article (I might make a blog entry about it at some point), it does turn out to be useful in several scenarios and is worth mentioning briefly.
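
    To make the matching rule concrete, here is a minimal sketch of which collections a ? super Animal declaration accepts. The SuperWildcardDemo class, its empty Animal and Dog classes, and the sizeOf helper are all hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Collection;

// Hypothetical demo class; Animal and Dog are empty stand-ins.
class SuperWildcardDemo {
    static class Animal {}
    static class Dog extends Animal {}

    // Accepts any collection whose element type is Animal or a supertype of Animal.
    static int sizeOf(Collection<? super Animal> collection) {
        return collection.size();
    }

    static int demo() {
        Collection<Object> objects = new ArrayList<Object>();
        objects.add("not even an animal");

        Collection<Animal> animals = new ArrayList<Animal>();
        animals.add(new Dog());
        animals.add(new Animal());

        // Collection<Dog> dogs = new ArrayList<Dog>();
        // sizeOf(dogs);  // would not compile: Dog is not a supertype of Animal

        // Both calls are accepted by the compiler.
        return sizeOf(objects) + sizeOf(animals);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```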

    The First Benefit of Wildcards: Cleaner Code

    Now that you understand the syntax, you might be wondering, "What's the point? Why would you want to have an unbound type parameter? When would you use it? Does this really add anything of value to the language?"

    Well, the major benefit (below) requires a bit of a deep dive into the type system. But even before that, it's worth noting that wildcards do make code cleaner and easier to read. In a very real sense, the use of ? encapsulates intent, and makes the meaning and purpose of methods clearer. I much prefer

        public void printAnimals(Collection<? extends Animal> animals) {

    which tells me that this is a general-purpose method that doesn't really use any of the properties of Animal once the type has been verified, to

        public <T extends Animal> void printAnimals(Collection<T> animals) {

    which implies that maybe, somewhere in the method body, T is actually used (and I need to read the method to find out where).

    Generics and Inheritance

    The primary benefit of wildcards is that they extend Java's type system in a way that makes it much easier to assign instances of generic types to fields. Unfortunately, to explain this benefit, we're going to have to digress into the relationship between generics and inheritance.

    Let's start our discussion of inheritance with a quick quiz. What does the following code print out?

        public class InheritanceTester {
            public static void main(String[] args) {
                List<Dog> animals = new ArrayList<Dog>();
                animals.add(new Dog("Fido"));
                animals.add(new Dog("Harvey"));
                message(animals);
            }

            private static void message(Collection<Animal> animals) {
                System.out.println("You gave me a collection of animals.");
            }

            private static void message(Object object) {
                System.out.println("You gave me an object.");
            }
        }

    The answer is that it prints out "You gave me an object" because List<Dog> is not a subclass of Collection<Animal>. Depending on how object-oriented your brain is, this is either blindingly obvious or amazingly counterintuitive. I did an informal poll, and most programmers I asked seem to react with "Huh? A list is a collection and a dog is an animal. So a list of dogs is a collection of animals." Which is understandable, but wrong.

    To see why it's wrong, think about the contract between a supertype (if you're uncomfortable with "supertype," feel free to think "superclass") and a subtype (again, feel free to use "subclass"). Subtypes make the following guarantee: any method call that can be made on instances of the supertype can be made on instances of the subtype. List<Dog> clearly can't make that guarantee with respect to Collection<Animal>. For one thing, you can add instances of Cat to Collection<Animal> and you can't add instances of Cat to List<Dog>.
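
    The Cat problem can be made visible by forcing the conversion anyway. The sketch below is not from the original column: it uses an unchecked cast (which the compiler warns about) to treat a List<Dog> as a Collection<Animal>, and the WhyNotSubtypeDemo class with its empty Animal, Dog, and Cat classes is a hypothetical stand-in. Once a Cat has been slipped in, reading the element back as a Dog fails at run time.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical demo class; Animal, Dog and Cat are empty stand-ins.
class WhyNotSubtypeDemo {
    static class Animal {}
    static class Dog extends Animal {}
    static class Cat extends Animal {}

    // Returns true if reading the smuggled Cat back as a Dog fails.
    @SuppressWarnings("unchecked")
    static boolean demo() {
        List<Dog> dogs = new ArrayList<Dog>();

        // Pretend List<Dog> were a Collection<Animal> by casting through Collection<?>.
        Collection<Animal> animals = (Collection<Animal>) (Collection<?>) dogs;

        // Perfectly legal on a Collection<Animal>, but it corrupts the List<Dog>.
        animals.add(new Cat());

        try {
            Dog dog = dogs.get(0);  // the compiler-inserted cast to Dog fails here
            return false;
        } catch (ClassCastException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

    This is the kind of failure the compiler prevents by refusing to treat List<Dog> as a Collection<Animal> in the first place.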

    In fact, there isn't any inheritance relationship between List<Dog> and Collection<Animal>. Collection<Animal> isn't a subclass of List<Dog> because Collection isn't a subclass of List (and therefore, there are methods you can call on List<Dog> that you cannot call on Collection<Animal>).

    This can be really annoying. Suppose Dog implements the Serializable interface. You really want, hope, and expect that List<Dog> is a subclass of Collection<Serializable>. But it's just not the case. Moreover, you can't fix it by a cast. The following code doesn't even compile:

        public static void main(String[] args) {
            List<Dog> animals = new ArrayList<Dog>();
            animals.add(new Dog("Fido"));
            animals.add(new Dog("Harvey"));
            writeToFile((Collection<Serializable>) animals); // Cast to fix inheritance issue
        }

        private static void writeToFile(Collection<Serializable> serializable) {
            // ... some generic serialization code in here.
        }

    Wildcards were designed to solve this problem with inheritance. In essence, wildcards insert new types into the type hierarchy, and let you use them for field declarations and method definitions. To be precise, the rule for the subtype relationship when wildcards are involved is:

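    A generic type A is a subtype of a generic type B if and only if A's raw type is a subtype of B's raw type and each of A's type parameters is either identical to the corresponding type parameter of B or falls within the bounds of the wildcard that B declares in that position.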

    That is, ArrayList<Dog> is a subtype of ArrayList<? extends Dog>, which is itself a subtype of List<? extends Dog>. And Collection<?> is a supertype of any collection of any type of object, and can be used that way in code. But cases like these are the only time a subtype relationship holds.
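
    The chain of subtypes described above can be checked with a few assignments. This is only a sketch: the WildcardSubtypingDemo class and its empty Animal and Dog classes are hypothetical, and the variable names are chosen purely for illustration. Every uncommented assignment compiles; the commented-out ones do not.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical demo class; Animal and Dog are empty stand-ins.
class WildcardSubtypingDemo {
    static class Animal {}
    static class Dog extends Animal {}

    static int demo() {
        ArrayList<Dog> dogs = new ArrayList<Dog>();
        dogs.add(new Dog());

        // Same raw type, wildcard that Dog falls within.
        ArrayList<? extends Dog> sameRawType = dogs;
        // Wider raw type (ArrayList is a List), same wildcard.
        List<? extends Dog> widerRawType = sameRawType;
        // Wider raw type and a wider bound (every Dog is an Animal).
        Collection<? extends Animal> widerBound = widerRawType;
        // The unbounded wildcard accepts any collection at all.
        Collection<?> anything = widerBound;

        // List<Animal> broken = dogs;            // would not compile
        // Collection<Animal> alsoBroken = dogs;  // would not compile

        // All four variables refer to the same one-element list.
        return anything.size();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```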

    So if we change the program slightly, replacing Collection<Serializable> with Collection<? extends Serializable>, we get a much different result. The following code compiles and does what we'd expect (at runtime, writeToFile is called with our list of dogs).

        public static void main(String[] args) {
            List<Dog> animals = new ArrayList<Dog>();
            animals.add(new Dog("Fido"));
            animals.add(new Dog("Harvey"));
            writeToFile(animals); // No cast needed
        }

        private static void writeToFile(Collection<? extends Serializable> animals) {
            // ... some generic serialization code in here.
        }

    The Second Benefit of Wildcards: Variable Assignments

    With that discussion of inheritance out of the way, let's return to wildcards. The point of the example that ended the previous section was that because wildcards define new types, you can perform variable assignments to fields that otherwise wouldn't be allowed. For example,

        List<Dog> dogList = new ArrayList<Dog>();
        // ... add some dogs to the list
        Collection<? extends Serializable> finalCopy = dogList;

    compiles and does what you'd expect, whereas replacing Collection<? extends Serializable> with Collection<Serializable> results in code that won't even compile.
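
    Here is a self-contained sketch of the same assignment. It substitutes String and Integer for Dog, since both are JDK classes that implement Serializable; the WildcardAssignmentDemo class and its variable names are hypothetical.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class; String and Integer stand in for the article's Dog.
class WildcardAssignmentDemo {
    static int demo() {
        List<String> names = new ArrayList<String>();
        names.add("Fido");
        names.add("Harvey");

        Set<Integer> ids = new HashSet<Integer>();
        ids.add(42);

        int count = 0;

        // One variable can hold either collection, because both element
        // types implement Serializable.
        Collection<? extends Serializable> finalCopy = names;
        for (Serializable item : finalCopy) {
            count++;
        }

        finalCopy = ids;
        for (Serializable item : finalCopy) {
            count++;
        }

        // Collection<Serializable> broken = names;  // would not compile

        return count;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```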

    This example might not seem very impressive in isolation. And, quite honestly, saying "Well, wildcards let you assign variables in a typesafe way" sounds lame. That's the big problem with wildcards -- there's no five-line killer example that explains how to use them.

    So, to underline how useful this can be, I've written a slightly longer example. It contains a class called SerializationBatcher whose sole reason for existence is to allow collections of instances of Serializable to be serialized to files in a background thread. That is, the main thread of execution passes collections to SerializationBatcher and a background thread persists the objects. If this example doesn't seem compelling, feel free to think of any other task you perform in a background or event thread. If you don't perform many tasks in background or event queues, well, shame on you.

    The code is straightforward: client code passes in collections of serializable objects using the static public method SerializationBatcher.serializeInBackgroundThread. Each call is encapsulated in an instance of SerializationData, which is put into an instance of LinkedList for processing by a background thread. The background thread serializes the data to the appropriate file. The key point in this code example is the declaration of the field Collection<? extends Serializable> _objectsToSerialize; and the fact that we can assign any collection of serializable objects to it.

    Here's the code:

        import java.io.FileOutputStream;
        import java.io.ObjectOutputStream;
        import java.io.Serializable;
        import java.util.Collection;
        import java.util.LinkedList;

        public class SerializationBatcher {
            private static Thread batchThread = new Thread(new SerializationBatcherRunnable());
            private static LinkedList<SerializationData> dataContainer = new LinkedList<SerializationData>();

            static {
                // Start the background thread that drains the queue.
                batchThread.start();
            }

            public static void serializeInBackgroundThread(String fileName,
                    Collection<? extends Serializable> objects) {
                SerializationData data = new SerializationData(fileName, objects);
                addDataToQueue(data);
            }

            private static void addDataToQueue(SerializationData data) {
                synchronized (dataContainer) {
                    dataContainer.add(data);
                    dataContainer.notify();
                }
            }

            private static SerializationData removeDataFromQueue() {
                synchronized (dataContainer) {
                    try {
                        if (0 == dataContainer.size()) {
                            dataContainer.wait();
                        }
                    } catch (Throwable t) {
                        // Ignored: fall through to the size check below.
                    }
                    if (0 == dataContainer.size()) {
                        return null;
                    }
                    return dataContainer.removeFirst();
                }
            }

            private static class SerializationBatcherRunnable implements Runnable {
                public void run() {
                    while (true) {
                        SerializationData next = removeDataFromQueue();
                        try {
                            if (null != next) {
                                next.writeToFile();
                            }
                        } catch (Exception ignored) {
                        }
                    }
                }
            }

            private static class SerializationData {
                private String _fileName;
                private Collection<? extends Serializable> _objectsToSerialize;

                public SerializationData(String fileName,
                        Collection<? extends Serializable> objects) {
                    _fileName = fileName;
                    _objectsToSerialize = objects;
                }

                private void writeToFile() throws Exception {
                    FileOutputStream fileOutputStream = new FileOutputStream(_fileName);
                    ObjectOutputStream objectOutputStream = new ObjectOutputStream(fileOutputStream);
                    try {
                        for (Serializable object : _objectsToSerialize) {
                            objectOutputStream.writeObject(object);
                        }
                    } finally {
                        // Flush and release the file even if a write fails.
                        objectOutputStream.close();
                    }
                }
            }
        }

    The key point in all of this is the declaration and assignment in SerializationData.

        private Collection<? extends Serializable> _objectsToSerialize;
        // ...
        _objectsToSerialize = objects;

    We want to be able to pass any collection of serializable objects to SerializationBatcher.serializeInBackgroundThread; whether it is an instance of List<Dog> or an instance of TreeSet<Point> doesn't matter. And the natural way to do this with variable assignments (as opposed to explicitly copying the collection) is either to use wildcards (as in the above code) or use a raw collection type, as in the following declaration:

    private Collection _objectsToSerialize;

    Needless to say, if you like generics at all, you probably don't like the latter option.

    A Side Benefit of Wildcards: Collections That Are Partially Immutable

    I'm not sure whether this counts as a bug or a feature, but there is one other significant side effect to using wildcards. You can't add objects to a collection that was declared using wildcards. For example, the following function definition won't compile:

        public void addDogToList(Dog dog, List<? extends Dog> list) {
            list.add(dog);
        }

    The reason is simple: the list matches too many possible lists, and in most of them, the assignment won't work. Consider, for example, the following code:

        Dog dog = new Schnauzer("Fido");
        List<HuntingDog> list = new ArrayList<HuntingDog>();
        addDogToList(dog, list);

    Since Schnauzer isn't a subclass of HuntingDog, this code shouldn't compile. Since we can't, in general, guarantee that all possible operations across all possible types are typesafe, the compiler rejects the method addDogToList with a plaintive bleat of "cannot find symbol."

    Note that we can still remove items from collections. That's still legal. You're only prevented from adding items to collections. (And, to be perfectly honest, there's a loophole. You can always add null to a collection. I'll leave the type-theoretic explanations for that as an exercise for the reader.)
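
    A short sketch of what is and isn't allowed on a wildcarded list follows. The PartiallyImmutableDemo class and its empty Dog class are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; Dog is an empty stand-in.
class PartiallyImmutableDemo {
    static class Dog {}

    static int demo() {
        List<Dog> dogs = new ArrayList<Dog>();
        dogs.add(new Dog());
        dogs.add(new Dog());

        List<? extends Dog> view = dogs;

        Dog first = view.get(0);  // reading is fine: every element is at least a Dog
        view.remove(0);           // removing is fine too
        view.add(null);           // the loophole: null is accepted

        // view.add(new Dog());   // would not compile
        // view.add(first);       // would not compile either

        // Two dogs, minus one removed, plus one null.
        return view.size();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```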

    Final Thoughts

    Wildcards are an interesting feature in the new generics specification. When I first heard about them, I thought they were more trouble than they were worth. But the more I thought about them, the more I realized that they're actually quite nice. Without wildcards, the interaction between generics and inheritance is just too weak. In fact, without wildcards, it almost feels like generics break inheritance.

    This is going to be my last column on generics for a while (unless the specification changes). Even though we didn't really discuss the super functionality, three articles on generics is more than enough until JDK 1.5 ships. Next month, we're going to start talking about something completely different; we'll be dissecting a small application I wrote, and talking about how to take advantage of the web services that are already available on the Internet.