Nuances of the Java 5.0 for-each Loop

Version 2


    This article is an in-depth examination of one of the simplest but most pleasing features in Java 5.0: the for-each loop. I present eleven short items discussing various nuances of usage, pitfalls to be aware of, and possible optimizations surrounding the use of the for-each loop. In the first section, I discuss what kinds of iteration are possible with for-each. The next section illustrates common programming errors in using the for-each loop. The final section shows how to write new classes that can be used as targets of a for-each loop. I also talk about advanced implementations that allow multiple iterable views, lazily construct objects just in time for iteration, and enable possible generic algorithm and compiler optimizations of the for-each loop.

    The for-each Loop: What Is It? What Can It Do? What Can't It Do?

    Sun documents sometimes refer to for-each as the "enhanced for loop," calling the old version the "basic for loop." Designed specifically for iterating over arrays and collections, for-each eliminates much of the clutter surrounding the traditional for or while loop, namely the setup and management of an index variable or iterator. Let us look at an example:

    public Integer sum(List<Integer> liszt) {
        int total = 0;
        for (int num : liszt) {
            total += num;
        }
        return total;
    }

    The official guides suggest that the colon (:) be read as "in"--the above listing would read as "for [each] num in liszt do total += num." By not introducing an appropriate new keyword (such as foreach) into the language, the designers preserved strict backward compatibility with any pre-Java 5.0 code that may have used the keyword as an identifier.

    Item 1: The for-each loop allows only one iteration variable, which precludes simultaneously iterating over two or more arrays or collections. For example, in computing the dot product of two vectors, I need to multiply together the corresponding elements of both vectors, which requires two iterators. This cannot be coded using the "enhanced for" loop; the basic for loop is needed instead:

    public Integer dotProduct(List<Integer> u, List<Integer> v) {
        assert (u.size() == v.size());
        int product = 0;
        // advance both iterators in lockstep; the update clause does all the work
        for (Iterator<Integer> u_it = u.iterator(), v_it = v.iterator();
                u_it.hasNext() && v_it.hasNext();
                product += u_it.next() * v_it.next())
            ; // no body
        return product;
    }

    Item 2. Nested iteration is possible. Nested iteration is different from simultaneous iteration: the inner loop runs to completion within each iteration of the outer loop. The for-each loop easily accommodates this. For instance, in the product of two matrices, the element at location [i,j] is the dot product of the ith row of the first matrix and the jth column of the second:

    public static void matrixProduct(Matrix x, Matrix y, Matrix product) {
        assert (x.rows().size() == y.cols().size()
                && y.rows().size() == x.cols().size());
        int rNum = 0;
        for (List<Integer> xRow : x.rows()) {
            int cNum = 0;
            for (List<Integer> yCol : y.cols()) {
                product.setElement(rNum, cNum, dotProduct(xRow, yCol));
                cNum++;
            }
            rNum++;
        }
    }

    // where the matrix is defined as...
    public interface Matrix {
        public List<List<Integer>> rows();
        public List<List<Integer>> cols();
        public void setElement(int rowNum, int colNum, Integer value);
        // ... other methods
    }

    The Equivalent Basic for Loop and Its Implications on for-each Usage

    In effect, the for-each loop is nothing more than syntactic sugar. It makes the code much easier to read, but provides no new core functionality. In fact, the Java Language Specification describes the meaning of the new for-each loop in terms of an equivalent basic for loop. Essentially, an intermediate Iterator of the right kind is created, and its identifier is guaranteed to be different from all other user variables in scope. Below, I show the equivalent translation for the first example. Comparing it with the original shows how much clutter the new syntax removes: of the loop below, the programmer actually typed only the loop variable declaration (int num), the collection expression (liszt), and the loop body.

    public Integer sum(List<Integer> liszt) {
        int total = 0;
        for (Iterator<Integer> iter = liszt.iterator(); iter.hasNext();) {
            // the explicit intValue() call is the auto-unboxing step
            int num = ((Integer) iter.next()).intValue();
            total += num;
        }
        return new Integer(total);
    }

    I now examine the deeper implications of this equivalence.

    Item 3: Beware of auto-unboxing. In the previous example, the Integer item returned has to be put into an int variable. This involves auto-unboxing, which is shown through the call to Integer.intValue(). This can have an effect on the performance of large iterations. Unless you are forced to use primitive types because of their support for arithmetic operations, it is preferable to use subclasses of Number (Integer, Double, etc.) in loop variables rather than the equivalent primitive types (int, double, etc.). An exception to this rule is when the iteration is over an array of a primitive numeric type. For example, if liszt were an int[], there would be no auto-unboxing involved.
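
    To make the contrast in Item 3 concrete, here is a minimal sketch of the same loop written over a List<Integer> and over an int[]. The class and method names are illustrative only, and no timings are claimed; the point is simply where the unboxing happens.

```java
import java.util.Arrays;
import java.util.List;

class UnboxingDemo {
    // Each element is an Integer; assigning it to the int loop
    // variable unboxes it on every iteration.
    static int sumBoxed(List<Integer> values) {
        int total = 0;
        for (int num : values) {
            total += num;
        }
        return total;
    }

    // The elements are already ints, so the loop involves
    // no boxing or unboxing at all.
    static int sumPrimitive(int[] values) {
        int total = 0;
        for (int num : values) {
            total += num;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(Arrays.asList(1, 2, 3)));
        System.out.println(sumPrimitive(new int[] {1, 2, 3}));
    }
}
```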

    Item 4: Watch out for null pointers. The first example never explicitly dereferences liszt (by, for example, calling a method such as liszt.add()). Yet sum() can raise a NullPointerException if the liszt passed in is null. This particularly insidious error happens because of the hidden compiler-generated call to liszt.iterator(). To make the code robust, I need to preface the for-each loop with a check like: if (null != liszt) for (int num : liszt) // for loop body...
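
    The guard from Item 4 can be sketched as a complete method next to an unguarded one. The names guardedSum and unguardedSum are hypothetical, chosen only to tell the two variants apart, and treating a null list as an empty one is an assumption about what the caller wants.

```java
import java.util.List;

class NullSafeSum {
    // Treats a null list like an empty one (an assumption made for this sketch).
    static int guardedSum(List<Integer> liszt) {
        int total = 0;
        if (null != liszt) {
            for (int num : liszt) {
                total += num;
            }
        }
        return total;
    }

    // No guard: the hidden liszt.iterator() call fails when liszt is null.
    static int unguardedSum(List<Integer> liszt) {
        int total = 0;
        for (int num : liszt) {
            total += num;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(guardedSum(null));
        try {
            unguardedSum(null);
        } catch (NullPointerException e) {
            System.out.println("NullPointerException from the hidden iterator() call");
        }
    }
}
```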

    Item 5. Iterating over varargs. Java 5.0 now allows a variable number of arguments of a single type to be passed in as the last parameter to a method. The compiler collects these varargs parameters into an array of that type. Typically, the body of the method treats the varargs as an array and iterates over them using a for-each loop.

    public Integer sum(Integer... intArray) {
        int total = 0;
        for (int num : intArray) {
            total += num;
        }
        return total;
    }

    If no vararg argument is specified in calling the method (e.g., a call like sum()), the compiler passes a zero-length array, as if the call had been written sum(new Integer[0]). Thus, in the body of the method, I may safely drop the check for a null array argument in this case.

    Note that if a null is explicitly passed in without a cast (i.e., sum(null)), the compiler issues a warning, as it cannot disambiguate whether the intended argument is the null array (Integer[]) null, which would cause a NullPointerException as warned in Item 4 above, or a vararg argument (Integer) null, which is safely converted to the single-item array argument {(Integer) null}. (The sum() above would still fail on that array, but only when the loop unboxes the null element.)
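
    The three call shapes discussed in Item 5 can be told apart with a small probe method. This is only a sketch: describe() is a hypothetical helper that reports what the method body actually receives, and it is not part of any library.

```java
class VarargsDemo {
    // Reports what the varargs parameter looks like inside the method.
    static String describe(Integer... intArray) {
        if (intArray == null) {
            return "null array";
        }
        return "array of length " + intArray.length;
    }

    public static void main(String[] args) {
        System.out.println(describe());                 // no arguments at all
        System.out.println(describe(1, 2, 3));          // ordinary varargs call
        System.out.println(describe((Integer[]) null)); // the null array
        System.out.println(describe((Integer) null));   // one null element
    }
}
```

    The casts in the last two calls are what remove the ambiguity that the compiler warns about.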

    Item 6. Return zero-length arrays or empty lists rather than nulls. Client code will look cleaner because the check for whether the returned value was null can be safely omitted. If all API code followed this rule, Item 4 would become nearly obsolete! In Item 27 of Effective Java, Joshua Bloch points out that returning null for special cases involves writing the same amount of code as returning a zero-length array. One argument for returning null is that it does not involve needless object creation. While the effect of object creation on the performance of an average method is questionable, I could avoid the issue altogether by returning the same immutable instance each time the method is called. After all, immutability guarantees safety in shareability (Item 13 of Effective Java). Empty arrays are immutable, and immutable empty lists of any type can be obtained by calling the generic method java.util.Collections.emptyList(). The empty-array check at the top of rows() below shows how to do this:

    public class ArrayBackedMatrix implements Matrix {
        private Integer[][] array;

        public List<List<Integer>> rows() {
            if (array.length == 0) {
                return Collections.emptyList(); // rather than return null;
            }
            List<List<Integer>> retList = new ArrayList<List<Integer>>();
            for (Integer[] row : array) {
                retList.add(Arrays.asList(row));
            }
            return retList;
        }
        //... other methods
    }

    Item 7. Do not modify the list during iteration. While the for-each syntax does not provide direct access to the iterator used by the equivalent basic for loop, the list can still be modified by directly calling other methods on it. Doing so can lead to indeterminate program behavior. In particular, if the compiler-inserted call to iterator() returns a fail-fast iterator, a java.util.ConcurrentModificationException runtime exception may be thrown. But this is only done on a best-effort basis, and cannot be relied upon except as a means of detecting a bug when the exception does get thrown. On my JVM, the following behavior was seen: in a list of integers, removing the second-to-last element did not cause an exception, but removing an earlier element did:

    List<Integer> liszt = new ArrayList<Integer>();
    liszt.add(0); liszt.add(1); liszt.add(2); liszt.add(3); liszt.add(4);
    // liszt.add(5); Uncomment to see ConcurrentModificationException
    for (int item : liszt) {
        // the iterator returned by java.util.ArrayList is fail-fast
        if (item == 3)
            liszt.remove(new Integer(item)); // after this, loop behavior is indeterminate
        else
            System.out.println(item);
    }
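
    Item 7 does not mean that elements can never be removed while looping. One standard alternative, which the example above does not show, is to drop back to the basic for loop with an explicit Iterator and call its remove() method, which the iterator of java.util.ArrayList supports. A minimal sketch, with illustrative class and method names:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class SafeRemoval {
    // Returns a copy of source without the elements equal to unwanted.
    // Removal goes through the iterator, so the iterator stays consistent.
    static List<Integer> removeValue(List<Integer> source, int unwanted) {
        List<Integer> copy = new ArrayList<Integer>(source);
        for (Iterator<Integer> it = copy.iterator(); it.hasNext();) {
            if (it.next() == unwanted) {
                it.remove();
            }
        }
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(removeValue(Arrays.asList(0, 1, 2, 3, 4), 3));
    }
}
```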

    You Don't Need to Return java.util.List!

    The examples we have seen thus far would suggest that iteration using for-each requires using instances of java.util.List. But observing the JLS-specified equivalent basic for loop from the fourth listing closely, we can see that the only List method invoked is the iterator() method. In fact, the Java Language Specification only requires an array or a class that implements the java.lang.Iterable interface. This interface defines a single method, iterator(), which returns the java.util.Iterator that the for-each loop uses. Thus, any class can be made the target of for-each loops simply by implementing Iterable and supplying an iterator with next() and hasNext() methods. (Iterator.remove() is never called in the for-each, so a bare-bones iterator implementation of remove() can throw an UnsupportedOperationException.)
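
    As a minimal sketch of this idea, the class below is not a List at all, yet it works as the target of a for-each loop. Range and sumOf are hypothetical names invented for this illustration.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// The integers from start (inclusive) to end (exclusive).
class Range implements Iterable<Integer> {
    private final int start;
    private final int end;

    Range(int start, int end) {
        this.start = start;
        this.end = end;
    }

    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int cursor = start;

            public boolean hasNext() {
                return cursor < end;
            }

            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return cursor++;
            }

            // for-each never calls remove(), so a bare-bones version suffices
            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }

    // Works for any Iterable, not just Range.
    static int sumOf(Iterable<Integer> values) {
        int total = 0;
        for (int value : values) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        for (int i : new Range(1, 4)) {
            System.out.println(i);
        }
    }
}
```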

    Item 8. Consider returning an Iterable rather than a List. Even if you are not defining a custom iterator, if a method in your API is returning a List that is expected to be used solely in for-each loops, consider changing the return type to java.lang.Iterable instead. This effectively hides the details of the current implementation and allows for later optimizations such as lazy construction of each iterated item (Item 10).

    Item 9. Consider returning Iterable views rather than implementing Iterable. The Java collections classes can be used in for-each loops because they implement java.lang.Iterable. If a class implements java.lang.Iterable, then all its subclasses inherit its iterator() method. Overriding subclasses are still expected to follow the semantics of the parent class. This may be acceptable in most cases, but the issue can be avoided by creating a method to return an Iterable view of the class. Subclasses can then implement Iterable and return one of the iterable views in the body of the iterator() method. This has the added advantage that multiple iterable views can be defined, with different sort orders. Some iterable views can even return relevant subsets of the collection being iterated over.

    In the example below, if City itself implements Iterable, a for-each loop using the City class would look like for (Resident resident : city) {//...}. If City does not implement Iterable, clients need to make an explicit call to obtain one of the Iterable views of the City. The code still reads just as naturally as before; e.g., for (Resident resident : city.residents()) {//...}.

    public interface City /* extends Iterable<Resident> */ {
        // By extending Iterable, only one Iterable view can be supported
        // Iterator<Resident> iterator();

        // the default iterable view,
        // which could be returned by iterator() of most subclasses;
        Iterable<Resident> residents();

        // return same set of objects as residents(), but different sort order
        Iterable<Resident> residentsInAlphaOrder();

        // return a subset of residents() who satisfy an age barrier test
        Iterable<Resident> adultResidents();
    }
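
    One way such views might be implemented is sketched below. Everything beyond the three view methods is an assumption made for illustration: the Resident fields, the SimpleCity class, the names() helper, and the adult age threshold of 18 are not part of the interface above, which is repeated here only so that the sketch compiles on its own.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class SimpleCity implements City {
    private static final int ADULT_AGE = 18; // assumed threshold
    private final List<Resident> population = new ArrayList<Resident>();

    void add(String name, int age) {
        population.add(new Resident(name, age));
    }

    // Default view: insertion order, read-only.
    public Iterable<Resident> residents() {
        return Collections.unmodifiableList(population);
    }

    // Same residents, sorted by name in a private copy.
    public Iterable<Resident> residentsInAlphaOrder() {
        List<Resident> sorted = new ArrayList<Resident>(population);
        Collections.sort(sorted, new Comparator<Resident>() {
            public int compare(Resident a, Resident b) {
                return a.name.compareTo(b.name);
            }
        });
        return sorted;
    }

    // Subset view: only residents at or above the assumed adult age.
    public Iterable<Resident> adultResidents() {
        List<Resident> adults = new ArrayList<Resident>();
        for (Resident r : population) {
            if (r.age >= ADULT_AGE) {
                adults.add(r);
            }
        }
        return adults;
    }

    // Demo helper: joins the names produced by any view.
    static String names(Iterable<Resident> view) {
        StringBuilder sb = new StringBuilder();
        for (Resident r : view) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(r.name);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        SimpleCity city = new SimpleCity();
        city.add("Zoe", 30);
        city.add("Adam", 12);
        city.add("Mia", 45);
        System.out.println(names(city.residents()));
        System.out.println(names(city.residentsInAlphaOrder()));
        System.out.println(names(city.adultResidents()));
    }
}

// The City interface from the listing above.
interface City {
    Iterable<Resident> residents();
    Iterable<Resident> residentsInAlphaOrder();
    Iterable<Resident> adultResidents();
}

// Hypothetical Resident type; the article does not define its fields.
class Resident {
    final String name;
    final int age;

    Resident(String name, int age) {
        this.name = name;
        this.age = age;
    }
}
```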

    Item 10. Consider lazy construction of iterated items. Many times, you may need to iterate over large lists of items that cannot all be accommodated in memory; for instance, when iterating over large ResultSets returned from a JDBC query. At other times, it may be expensive to compute each object in the list. A very useful technique in such cases is to return an iterator and construct each item only when required.

    The example below shows a generic class that can wrap any ResultSet and make it the target of a for-each loop. It uses a callback factory method to create Java objects from each row of the ResultSet. Note that only one ResultSet row needs to be in memory at any point in the iteration. The example also shows how to use the wrapper with a Person class whose two fields are read from the columns of the ResultSet rows.

    public class IterableResultSetWrapper<T> implements Iterable<T> {

        public static interface ResultSetRowReader<T> {
            // construct a T from a ResultSet row
            T create(ResultSet resultRow) throws SQLException;
        }

        public IterableResultSetWrapper(ResultSet results,
                                        ResultSetRowReader<T> rowFactory) {
            this.results = results;
            this.rowFactory = rowFactory;
        }

        public Iterator<T> iterator() {
            return new Iterator<T>() {
                private boolean fetched = false;      // cursor already advanced?
                private boolean rowAvailable = false; // did that advance find a row?

                public boolean hasNext() {
                    if (!fetched) {
                        // advance once and remember the outcome, so that an empty
                        // ResultSet and the position after the last row both end the loop
                        try { rowAvailable = results.next(); }
                        catch (SQLException e) { rowAvailable = false; }
                        fetched = true;
                    }
                    return rowAvailable;
                }
                public T next() {
                    if (!hasNext()) throw new NoSuchElementException();
                    fetched = false; // the current row is consumed by this call
                    try { return rowFactory.create(results); }
                    catch (SQLException e) { e.printStackTrace(); return null; }
                }
                public void remove() {
                    throw new UnsupportedOperationException();
                }
            }; // end new Iterator
        }

        private ResultSet results;
        private ResultSetRowReader<T> rowFactory;
    }

    // Test code to read Persons from ResultSets
    public class Test {
        public static class Person {
            public Person(String name, int age) {
                //...
            }
            public static IterableResultSetWrapper.ResultSetRowReader<Person> resultSetReader =
                new IterableResultSetWrapper.ResultSetRowReader<Person>() {
                    public Person create(ResultSet resultRow) throws SQLException {
                        // create a Person by reading the
                        // first and second cols of ResultSet
                        return new Person(resultRow.getString(1), resultRow.getInt(2));
                    }
                }; // end new ResultSetRowReader
        }

        public static void personForEachExample(ResultSet results) {
            Iterable<Person> iterablePersonResultSet =
                new IterableResultSetWrapper<Person>(results, Person.resultSetReader);
            for (Person person : iterablePersonResultSet) {
                // process person object
            }
        }
    }
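
    The same just-in-time idea can be tried without a database. The sketch below is an in-memory analogue, not the article's JDBC wrapper: LazyView, its Factory callback, and the created counter are all hypothetical names introduced for this illustration. The counter makes it visible that items the loop never reaches are never constructed.

```java
import java.util.Arrays;
import java.util.Iterator;

class LazyDemo {
    // Iterates a lazy view with for-each, stops at stopAt, and returns
    // how many items had been constructed by that point.
    static int itemsBuiltUntil(String stopAt) {
        LazyView<Integer, String> labels = new LazyView<Integer, String>(
            Arrays.asList(1, 2, 3, 4, 5),
            new LazyView.Factory<Integer, String>() {
                public String create(Integer n) {
                    return "item-" + n;
                }
            });
        for (String label : labels) {
            if (label.equals(stopAt)) {
                break;
            }
        }
        return labels.created;
    }

    public static void main(String[] args) {
        System.out.println(itemsBuiltUntil("item-2"));
    }
}

// Wraps a source Iterable and builds each target item only inside next().
class LazyView<S, T> implements Iterable<T> {
    interface Factory<A, B> {
        B create(A source);
    }

    private final Iterable<S> sources;
    private final Factory<S, T> factory;
    int created = 0; // number of items constructed so far

    LazyView(Iterable<S> sources, Factory<S, T> factory) {
        this.sources = sources;
        this.factory = factory;
    }

    public Iterator<T> iterator() {
        final Iterator<S> it = sources.iterator();
        return new Iterator<T>() {
            public boolean hasNext() {
                return it.hasNext();
            }

            public T next() {
                S source = it.next();
                created++;
                return factory.create(source);
            }

            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }
}
```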

    Sometimes, legacy APIs might have a return type of List. The above lazy construction strategy can still be used in such cases, by returning a custom subtype of java.util.AbstractSequentialList, which provides implementations for all other List operations on top of a java.util.ListIterator. The subtype can then implement a ListIterator that constructs each item lazily as above.

    Item 11. When appropriate, implement java.util.RandomAccess to allow for compiler optimizations. This rule applies to custom implementations of the List interface that could be the targets of a for-each loop. The language specification only stipulates that for-each should behave like its equivalent basic for loop. The compiler could, for example, translate the for-each from the first example as below:

    public int sum(List<Integer> liszt) {
        int total = 0;
        for (int i = 0; i < liszt.size(); i++) {
            total += liszt.get(i).intValue();
        }
        return total;
    }

    If, for typical instances of a particular List implementation, the above loop would work faster than the JLS-specified equivalent basic for loop shown in the fourth code listing, then that List should implement java.util.RandomAccess. This is just a marker interface (it declares no methods) and is meant to provide a means for generic methods to optimize their for loops involving Lists if the List implementation's random access mechanisms are faster than iteration. For example, in the Java Collections framework, ArrayList, Stack, and Vector implement RandomAccess. A compiler could produce the following optimized translation of the first example:

    public int sum(List<Integer> liszt) {
        int total = 0;
        if (liszt instanceof RandomAccess)
            for (int i = 0; i < liszt.size(); i++) {
                total += liszt.get(i).intValue();
            }
        else
            for (Iterator<Integer> iter = liszt.iterator(); iter.hasNext();) {
                int num = ((Integer) iter.next()).intValue();
                total += num;
            }
        return total;
    }

    Thus the language designers' choice of hiding the Iterator in the for-each provides unexpected opportunities for compiler optimization. This is also an argument for using for-each instead of the explicit iterator: the compiler cannot replace an explicit method call to next(), as the programmer might have purposely used that method in order to cause a known side effect (say, a println() in the body of next()). If the compiler optimized the iterator loop and replaced it with its equivalent random access loop, the println() would never be seen. Thus the optimization can only be performed if the iterator is not explicitly called; i.e., if for-each is used instead, then the compiler is free to use any for-each implementation in keeping with the language specification.
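
    Whether or not a given compiler performs this optimization, a generic method can make the same decision by hand. The sketch below only reports which strategy the marker interface would select for a given list; strategyFor is a hypothetical helper name, and the strings it returns are labels chosen for this illustration.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

class RandomAccessCheck {
    // "indexed" when the list advertises fast get(i), "iterator" otherwise.
    static String strategyFor(List<?> list) {
        return (list instanceof RandomAccess) ? "indexed" : "iterator";
    }

    public static void main(String[] args) {
        System.out.println(strategyFor(new ArrayList<Integer>()));
        System.out.println(strategyFor(new LinkedList<Integer>()));
    }
}
```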


    In this article, I looked at various nuances of the enhanced for loop introduced in Java 5.0. I showed that nested iteration is possible, but that simultaneous iteration over multiple collections is not supported by this syntax (Items 1 and 2). I discussed possible dangers to be aware of, such as auto-unboxing, null pointers in the for-each loop, and concurrent modification of the iterated collection (Items 3-7). I then illustrated how to create custom classes that can be used in for-each loops, and presented strategies for creating multiple iterable views (Item 9) and optimizations for writing Iterable classes (lazy construction, Item 10, and RandomAccess for generic algorithms and compiler optimizations, Item 11).