Version 4

by José Paumard

New ways of solving problems with the Stream API in Java SE 8.

Published Java Magazine March/April 2015 Edition

The Collections Framework, first released with Java 2, was designed on the Iterator patterns (among others) and is at the heart of all our applications. With the inclusion of generics in Java 5, the Collections Framework was rewritten, but the patterns for using it weren’t modified.

Shakespeare and Scrabble

Let’s say we have two files. One contains (supposedly) all the words Shakespeare used in his plays and poems. The other contains all the words in The Official Scrabble Players Dictionary.

The question is this: How well would Shakespeare have played Scrabble? Or, what is the best word Shakespeare could have played, and how much would it have scored?

It turns out that this question is a map/filter/reduce question that can be solved with the Stream API.

 private static final int [] letterScores = {// a, b, c, d, e, f,  g, h, i,   j, k, l, m, n, o, p,  q, r, s, t, u, v, w, x,  y, z      1, 3, 3, 2, 1, 4, 2, 4, 1, 8, 5, 1,  3,  1, 1, 3, 10, 1, 1, 1, 1, 4, 4, 8, 4, 10}

Listing 1.

Compute the Score of a Single Word

Computing a score is about taking a given word and returning a score that is an integer. This can be done with `Function<String, Integer>`, which can be implemented with a lambda expression.

Let’s build an array, `letter Scores`, which gives the score of each letter in the English Scrabble game (Listing 1). From this array, we can see that the score of m is 3, and the score of yis 4.

Inside this scoring function, we also need a mapping function to get the score of a given letter. Let us write this mapping function first. It is very simple and uses the old trick of reading the correct cell of the array that holds the scores of the letters (`letterScores`).

`IntUnaryOperator letterScore =    letter ->     letterScores[letter – 'a'];`

This mapping function is an `IntUnaryOperator`, a specialized type of function that takes and returns an `int`.

We can then write the scoring function using the mapping function:

`Function<String,  Integer> score = word ->     word.chars() // IntStream        .map(letterScore)        .sum();`

Having the `chars()` method added to theStringclass proves handy. Many classes have new methods that return streams on their internal structure.

The choice has been made to represent letters by integers, and the `chars()` method returns an `IntStream` instead of a `Stream<Integer>`.

Three streams of primitive types (`IntStream`, `LongStream`, and `DoubleStream`) are created for performance reasons. Using them is much more efficient than using the equivalent streams of objects, because we don’t have to pay the cost of boxing and unboxing their elements at each step. But sometimes we need to convert this `IntStream` into a `Stream<Integer>`, which is the case when we have to build histograms later.

How Shakespeare Performs

Now that we can compute the score of a given word, building the histogram of Shakespeare’s words by their scores is easy. A histogram is a simple map, computed by grouping the words by their scores. Thus, the keys of that map are the scores of the words, and the values are the words themselves, regrouped into lists.

Are we interested in all the content of the histogram? No. What we want to see is only the best scores. So we need to select the best key-value pairs from that map—with the best being the ones that have the greatest values for their keys. A simple way of computing that is like this:

1. Build the map.
2. Sort the map according to its keys, in descending order.
3. Extract the best (for example, the first three) key-value pairs.

Build the map. Building a map by regrouping is done with a collector. `Collector` is a new interface in the Java SE 8 Stream API that provides a mutable reduction. Mutable here means that we reduce our stream into a mutable container—a map, in the example shown in Listing 2.

 Map> wordsByScore =   shakespeareWords.stream()           .filter(scrabbleWords::contains)           .collect(               Collectors.groupingBy(                   score,                   () -> new TreeMap(Comparator. reverseOrder()),                   Collectors.toList()               )           );

Listing 2.

Sort the map according to its keys. In Listing 2, `shakespeareWords` is a collection of all the words of Shakespeare, from which a stream can be obtained. We can regroup these words by the scores of each word. The `Collectors.groupingBy()` code does just that. It exists in several versions, taking different sets of parameters. Here, we pass a supplier to build the resulting map, which is implemented by a `TreeMap`. We pass a comparator as a parameter to its constructor to have the map sorted in the descending order of its keys. The last parameter is a downstream collector to specify that we want our words in a list.

We added an extra filtering step to make sure that we process only words that are allowed in Scrabble.

Extract the best key-value pairs. Selecting the best key-value pairs can easily be done with the Stream API, but we need to use a trick here, because the `Map` interface doesn’t define a `stream()` method. We can’t build streams directly on maps in Java SE 8. The trick is to build that stream on the set of the key-value pairs, which can be built using the `entrySet()` method call. This method returns a `Set<Map.Entry<K,V>>`, on which we can build a stream. Because we built a `TreeMap`, this set is in fact a `NavigableSet`, which keeps the order of the key-value pairs. Displaying the three best elements becomes an easy task:

`wordsByScore.entrySet()  .stream()  .limit(3) // or any value  .forEach(System.out::println);`

At this point, the best word is whizzing, which is worth 33 points.

Not All Words Can Be Written

Take a closer look at that word whizzing. Can we really write that? Notice that it contains two occurrences of the letter z, but only one is available in the Scrabble game. How can we discard such a word?

What we need to do to tell whether a given word can be written with the letters that are available in the game is to count them, letter by letter, and compare the number of needed letters to the number of available letters. If this test fails, then the word cannot be written.

This operation can be done with an `allMatch()` call. Counting all the letters of a given word consists of building a histogram in which the keys are the letters, and the values are the number of times each letter is used to write that word. Let us first build this histogram (see Listing 3).

 Function> histoOfLetters =    word -> word.chars().boxed()                .collect(                    Collectors.groupingBy(                        Function.identity(),                        Collectors.counting()                    )                );

Listing 3.

The call to the `boxed()` method converts the `IntStream` returned by the `chars()` call into a `Stream<Integer>`. This is necessary because the `collect(Collector)` method we want to use is defined on streams of objects, not on streams of primary types (`IntStream`, `LongStream`, and `DoubleStream`).

Then we need to build a stream on its key-value pairs and compare those entries with the available letters in the game, as shown in Listing 4. In each pair, the key is the letter and the value is the number of times that letter is used in the word.

 private static final int [] scrabbleAvailableLetters = {// a, b, c, d,   e, f,  g, h,   i, j, k,  l, m, n, o, p, q,  r,  s,  t, u,  v, w,  x,  y, z     9, 2, 2, 1, 12, 2, 3, 2, 9, 1, 1, 4,  2, 6, 8, 2,  1, 6, 4, 6, 4, 2,  2,   1, 2,  1} ;Predicate canWrite =    word -> histoOfLetters.apply(word)                 .entrySet().stream()                 .allMatch(                     entry ->                     entry.getValue() <=                         scrabbleAvailableLetters[entry.getKey() - 'a']);

Listing 4.

Read the rest of the article at oracle.com/javamagazine