Lambda functions and closures are coming to Java 8. The project has been in progress for a while, and I recently took the opportunity to take it for a test drive.
Some useful documentation for the project can be found at:
- Project Lambda Home page.
- State of the Lambda: An older paper (Dec 2011) describing the status of the lambda implementation at the time. While the paper remains quite relevant, it does not go into much detail on how to use the features.
- Lambda FAQ: This is a must-read FAQ. It covers many of the same topics as State of the Lambda, but in a Q&A format that makes the information easy to consume one small answer at a time. I refer to it frequently in this post.
- lambda-dev mailing list
Let us understand the motivation and goals of Java 8 lambdas. This is particularly important because, as readers, we can sometimes impose our own preferences when we look at new software. I focus on the motivation so that we can understand what the authors set out to do, which will help us better evaluate the implementation and how well it addresses its goals.
The page Why are lambda expressions being added to Java? describes the primary motivation:
“Lambda expressions (and closures) are a popular feature of many modern programming languages. Amongst the different reasons for this, the most pressing one for the Java platform is that they make it easier to distribute processing of collections over multiple threads. Currently, lists and sets are typically processed by client code obtaining an iterator from the collection, then using that to iterate over its elements and process each in turn. If the processing of different elements is to proceed in parallel, it is the responsibility of the client code, not the collection, to organise this.
“In Java 8, the intention is instead to provide collections with methods that will take functions and use them, each in different ways, to process their elements. The advantage that this change brings is that collections can now organise their own iteration internally, transferring responsibility for parallelisation from client code into library code.”
On the page Are lambda expressions objects? we come across the following quote:
“To understand the situation, it is useful to know that there are both short-term goals and a longer-term perspective for the implementation in Java 8. The short-term goals are to support internal iteration of collections, in the interests of efficiently utilising increasingly parallel hardware. The longer-term perspective is to steer Java in a direction that supports a more functional style of programming. Only the short-term goals are being pursued at present, but the designers are being careful to avoid compromising the future of functional programming in Java, which might in the future include fully-fledged function types such as are found in languages such as Haskell and Scala.”
A useful line appears in one of the posts on the lambda-dev mailing list referred to above, namely A peek past lambda:
“Lambda is the down-payment on this evolution, but it is far from the end of the story. The rest of the story isn't written yet, but preserving our options are a key aspect of how we evaluate the decisions we make here.”
Whither Java Collections Framework
Many of the newer features are not being added directly to the classes that currently form the Java Collections Framework. There's a new interface, Stream, and a helper class, Streams, that are at the center of these changes. The rationale for Stream is explained as follows in Where is the Java Collections Framework going?:
“Analysis of the usage of the Java collections shows one pattern to be very common, in which bulk operations draw from a data source (array or collection), then repeatedly apply transformation operations like filtering and mapping to the data, often finally summarising it in a single operation such as summing numeric elements. Current use of this pattern requires the creation of temporary collections to hold the intermediate results of these transformations. However, this style of processing can be recast to a pipeline using the well-known ‘Pipes and Filters’ pattern, with significant resulting advantages: elimination of intermediate variables, reduction of intermediate storage, lazy evaluation, and more flexible and composable operation definitions. Moreover, if each operation in the pipeline is defined appropriately, the pipeline as a whole can often be automatically parallelised (split up for parallel execution on multicore processors). The role of pipes, the connectors between pipeline operations, is taken in the Java 8 libraries by implementations of the Stream interface; examining this interface will clarify how pipelines work.”
Another page, Why are Stream operations not defined directly on Collection?, documents the rationale for not defining many of the stream methods directly on the Collection interface.
Implementation used in this post
I have used build b78 of the Java™ Platform, Standard Edition 8 Early Access with Lambda Support from here. Some of the API methods I refer to may continue to evolve. The JDK API docs for this particular version can be found here.
Typical pipeline structure
Let us explore the typical pipeline structure. For a given collection, convert it into a stream, perform a set of transformations on that stream, and finally accumulate the results using a Collector. We shall look at the details of collectors later, but for the moment we can think of the return value of a pipeline as the object returned by its collector.
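A minimal sketch of such a pipeline follows. The class name HelloPipeline and the method greet are illustrative names of my own, and I am assuming the stream API in the form that eventually shipped with Java 8.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class HelloPipeline {
    // Prepends "Hello " to every name and collects the results into a new list.
    static List<String> greet(List<String> names) {
        return names
                .stream()                       // collection -> stream
                .map(s -> "Hello " + s)         // transform each element
                .collect(Collectors.toList());  // stream -> list
    }

    public static void main(String[] args) {
        List<String> greetings = greet(Arrays.asList("Larry", "Moe", "Curly"));
        System.out.println(greetings); // prints [Hello Larry, Hello Moe, Hello Curly]
    }
}
```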
Let us evaluate the steps in the pipeline:
Arrays.asList("Larry", "Moe", "Curly"): instantiates a list with three strings viz. “Larry”, “Moe” and “Curly”.
.stream(): Converts the list into a stream
.map(s -> "Hello " + s): Performs a map operation which prepends “Hello ” to each item in the stream
.collect(Collectors.toList()): Defines a new collector which will collect the results of the pipeline into a list, and invokes collect to gather the results into it (eventually returning the collected list).
As you can observe, the type of the original collection is not preserved during the transformation. In fact, you can take any collection and invoke .stream() on it, whereby you effectively lose the type of the original collection. All the transformations are defined on streams, and eventually you can convert the resulting Stream into your desired collection type by supplying the appropriate Collector. This leads me to think of lambda-based transformations as stream transformers with converters at both ends: one to convert a collection into a stream, and one to eventually collect the stream back into a collection of the desired type. (One could of course choose to have the end result remain a Stream for subsequent use.) The stream allows one to account for considerations related to efficiency, parallelisation, and even collection agnosticity. The collections at the ends provide concrete implementations best suited to other tasks outside the pipeline.
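To illustrate choosing the target collection at the end of the pipeline, here is a small sketch. It assumes the Collectors.toCollection factory as found in the released Java 8 API; the class and method names are mine.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

class TargetCollections {
    // The source is a List, but the collector decides that the result is a sorted set.
    static Set<String> toSortedSet(List<String> names) {
        return names.stream()
                .map(s -> "Hello " + s)
                .collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        // Duplicates disappear and the elements come out in natural order.
        System.out.println(toSortedSet(Arrays.asList("Moe", "Larry", "Moe")));
        // prints [Hello Larry, Hello Moe]
    }
}
```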
Streams are amongst the most important types used in lambda-based transformations, so it makes sense to take a look at the two associated types: the interface java.util.stream.Stream and the helper class Streams.
Let us first take a look at Streams (javadoc). This is a helper class with a lot of supporting operations for creating streams. (Note: .boxed(), as used below, converts a stream of primitive ints into a stream of Integers.)
Some of the methods (for a full listing see the javadocs) are:
Streams.intRange(start, stop, step) will generate a stream of primitive ints (step, if not provided, is 1).
Streams.longRange(start, stop, step) will generate a stream of primitive longs (step, if not provided, is 1).
Streams.concat(stream1, stream2) will concatenate two streams.
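A minimal sketch of creating and concatenating ranges. Note the assumption: it is written against the released Java 8 API rather than b78, where (to my knowledge) the equivalents are IntStream.range and Stream.concat, and range takes no step argument. The class and method names are illustrative.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class StreamCreation {
    // Ints in [start, stop), boxed so they can be collected into a List<Integer>.
    static List<Integer> range(int start, int stop) {
        return IntStream.range(start, stop).boxed().collect(Collectors.toList());
    }

    // Concatenates [0, 3) and [10, 13) into a single stream.
    static List<Integer> concatenated() {
        Stream<Integer> first = IntStream.range(0, 3).boxed();
        Stream<Integer> second = IntStream.range(10, 13).boxed();
        return Stream.concat(first, second).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(range(1, 5));     // prints [1, 2, 3, 4]
        System.out.println(concatenated());  // prints [0, 1, 2, 10, 11, 12]
    }
}
```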
Streams.zip(stream1, stream2, function) will return a stream produced by applying the function to consecutive pairs of elements drawn from stream1 and stream2.
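As far as I know, a zip helper is not part of the released Java 8 API, so the sketch below only emulates the idea over two lists by walking their indices. The zip method here is a hypothetical helper of my own, not the b78 Streams.zip.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ZipSketch {
    // Hypothetical helper: applies f to the elements found at the same index
    // in both lists, stopping at the end of the shorter list.
    static <A, B, R> List<R> zip(List<A> left, List<B> right, BiFunction<A, B, R> f) {
        int size = Math.min(left.size(), right.size());
        return IntStream.range(0, size)
                .mapToObj(i -> f.apply(left.get(i), right.get(i)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> sums = zip(Arrays.asList(1, 2, 3), Arrays.asList(10, 20, 30), (a, b) -> a + b);
        System.out.println(sums); // prints [11, 22, 33]
    }
}
```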
Note: As mentioned earlier, any collection can be converted into a stream by invoking .stream() on it.
Now that we've seen how to create, concatenate or merge streams, let us explore some of the more useful operations that can be performed on a stream. I won't dwell much on these transformations in this post; I intend to come back to them in a future post.
| Operation | Description |
|---|---|
| allMatch | True if all elements of the stream match the predicate |
| anyMatch | True if any element of the stream matches the predicate |
| filter | Return a stream of the subset of elements matching the predicate |
| findAny | Return any one element of the stream, if present |
| findFirst | Return the first element of the stream, if present |
| flatMap | Return a stream where each input element is transformed into 0 or more values, or where each input element is transformed into a stream |
| forEach | Perform an operation on each element of the stream (usually for side effects) |
| limit | Return a stream with no more than the first maxSize elements of this stream |
| map | Transform the stream into another containing the results of applying the mapper function to each element of the stream |
| max | Return the maximum element of this stream based on the supplied comparator |
| min | Return the minimum element of this stream based on the supplied comparator |
| peek | Return a stream of the same elements, additionally passing each element to the supplied consumer |
| reduce | Reduce the stream to a single value, applying the reducer operation to each element along with the accumulated value (which starts at the identity) |
| sorted | Sort the stream based on natural order or a supplied comparator |
| subStream | Return a stream after discarding the first startingOffset elements (and, optionally, those after endingOffset) |
| toArray | Convert the stream to an array |
The result of performing the operations above can be zero or one element from the stream (e.g. findFirst), a boolean result (e.g. allMatch), no result whatsoever (e.g. forEach), a reduced value (e.g. reduce) or simply another stream (e.g. map, filter, flatMap, limit, subStream etc.). When a stream is returned, further transformations can be applied to it, leading to a pipeline of transformations.
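A short sketch of a few of these operations and the different kinds of result they produce. It assumes the released Java 8 signatures (for instance, findFirst takes no arguments and returns an Optional); the class name and sample data are mine.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class StreamOperations {
    static final List<Integer> NUMBERS = Arrays.asList(5, 3, 8, 1, 9, 2);

    // filter, sorted, limit and map each return another stream, so they chain.
    static List<Integer> chained() {
        return NUMBERS.stream()
                .filter(n -> n > 2)   // 5, 3, 8, 9
                .sorted()             // 3, 5, 8, 9
                .limit(3)             // 3, 5, 8
                .map(n -> n * 10)     // 30, 50, 80
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(chained());                                 // prints [30, 50, 80]
        System.out.println(NUMBERS.stream().allMatch(n -> n > 0));     // prints true
        System.out.println(NUMBERS.stream().anyMatch(n -> n > 9));     // prints false
        Optional<Integer> first = NUMBERS.stream().findFirst();        // zero or one element
        System.out.println(first.get());                               // prints 5
        System.out.println(NUMBERS.stream().reduce(0, Integer::sum));  // prints 28
    }
}
```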
If you looked at the javadocs for the methods above, you will have seen that many of them require some kind of function as a predicate or transformer. These are similar to the .map(s -> "Hello " + s) invocation we saw earlier. And if you are used to earlier versions of Java, they no doubt look very compact.
Let us for a moment imagine we are working with Java 7 or earlier, and consider how such a functional transformation could be passed to another method as an input. A completely hypothetical equivalent using classic Java would be:
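A sketch of what such a classic-Java equivalent might look like. The Mapper interface and the mapAll helper are hypothetical names invented for illustration; nothing like them ships with the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ClassicMapping {
    // Hypothetical single-method interface standing in for a "function".
    interface Mapper {
        String map(String str);
    }

    // Hypothetical helper that applies the mapper to every element in turn.
    static List<String> mapAll(List<String> input, Mapper mapper) {
        List<String> result = new ArrayList<String>();
        for (String item : input) {
            result.add(mapper.map(item));
        }
        return result;
    }

    static List<String> greet(List<String> names) {
        // The transformation is passed as an anonymous class instance.
        return mapAll(names, new Mapper() {
            @Override
            public String map(String str) {
                return "Hello " + str;
            }
        });
    }

    public static void main(String[] args) {
        System.out.println(greet(Arrays.asList("Larry", "Moe", "Curly")));
        // prints [Hello Larry, Hello Moe, Hello Curly]
    }
}
```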
Contrast this with the compact declaration that the Java 8 Stream.map method accepts. You can quickly note the following:
- There is no anonymous class instantiation required
- There is no function declaration (public String map above)
- There are no type declarations (the two String declarations in public String map(String str) above)
Also note that instead of passing a traditional class instance as a parameter, we passed what for all practical purposes looks like a function. But can Java accept functions as arguments? And if we do not provide the signature of the function, how does the compiler infer it?
For answers, we need to look at Functional Interfaces.
Functional interfaces are a very interesting (and, as far as I am aware, possibly unique) introduction to the Java 8 lexicon. For details we can refer to What is a functional interface?:
“a functional interface is defined as any interface that has exactly one abstract method. (The qualification is necessary because an interface has non-abstract methods inherited from Object and may also have non-abstract default methods). This is why functional interfaces used to be called Single Abstract Method (SAM) interfaces, a term that is still sometimes seen.”
So a functional interface can inherit non-abstract methods. It can also have non-abstract default methods. But it must have exactly one abstract method.
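To make the definition concrete, here is a small sketch of a hand-written functional interface. Greeter and its methods are hypothetical names of my own, and the @FunctionalInterface annotation is assumed from the released Java 8 API; it is optional and only asks the compiler to verify the single-abstract-method rule.

```java
class FunctionalInterfaceSketch {
    @FunctionalInterface
    interface Greeter {
        // The single abstract method.
        String greet(String name);

        // A default method: it has a body, so it does not count as abstract.
        default String greetLoudly(String name) {
            return greet(name).toUpperCase();
        }

        // Redeclares a public method of Object, so it does not count either.
        boolean equals(Object other);
    }

    // A lambda supplies the body of the one abstract method.
    static final Greeter HELLO = name -> "Hello " + name;

    public static void main(String[] args) {
        System.out.println(HELLO.greet("Moe"));        // prints Hello Moe
        System.out.println(HELLO.greetLoudly("Moe"));  // prints HELLO MOE
    }
}
```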
Aside : Java 8 introduces the ability to define default implementations in interfaces. The treatment of the same is beyond the scope of this post. But you could read up on What are default methods? and the subsequent four pages if interested.
So what was passed to map() earlier was not a function but, in Java 8 terms, an implementation of a functional interface. If you look at the parameters of the Stream methods discussed above, you will find that many of them are functional interfaces.
Java 8 also introduces a substantial degree of syntactic sugar for succinctly defining the implementation of a functional interface. In the case above it was s -> "Hello " + s.
Let's look at how the compiler can make sense of all this. First, the argument provided to the map method is a Function<T,R>. If you look at this interface you will find not one but two methods. Thankfully one of them, compose, has a default implementation. Thus there is only one abstract method, apply, which takes a T and returns an R.
Given that the requirement of a functional interface has been met (there is a single abstract method, apply), we can look at the signature and say that, since the Stream<T> is a Stream<String>, the input argument to the apply method will be of type String. The compiler can now treat the lambda supplied as an argument to the map method as the body of the apply method of an instance of the Function<T,R> interface. The input argument to the apply method, i.e. T, is a String. What about the output?
Here's where we meet another interesting aspect introduced by Java 8, i.e. type inference using target types. The ultimate result of the pipeline above was a List<String>, which was the result of a .collect(Collectors.toList()) operation. So the input to collect should have been a Stream<String>. Since the output of the map method call has to be a Stream<String>, the output of the apply method must be a String. That fits nicely with our expectation that, since the input s is a String, "Hello " + s must also be a String. So all is well and compilation can go ahead. You can read more about this at What is the type of a lambda expression?. One of the curious aspects of this is that the type of the lambda isn't fully determined until one assigns it to a variable of a given type or uses its return value where the expected type is fully known. Note: The exact sequence through which the type of the lambda is derived here might be a little different, but what is important is that the type of the lambda is evaluated by looking at the overall context it fits in.
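A small sketch of target typing: the same lambda text takes on whichever functional interface its context expects. It uses java.util.function types from the released Java 8 API; the class and constant names are mine.

```java
import java.util.function.Function;
import java.util.function.UnaryOperator;

class TargetTyping {
    // Identical lambda text, three different target types.
    static final Function<String, String> AS_FUNCTION = s -> "Hello " + s;
    static final UnaryOperator<String> AS_OPERATOR = s -> "Hello " + s;
    // Here s is inferred to be an Integer, purely from the declared type.
    static final Function<Integer, String> FROM_INT = s -> "Hello " + s;

    // Without a functional interface as the target, the lambda has no type:
    // Object o = s -> "Hello " + s;   // does not compile

    public static void main(String[] args) {
        System.out.println(AS_FUNCTION.apply("Moe"));  // prints Hello Moe
        System.out.println(AS_OPERATOR.apply("Moe"));  // prints Hello Moe
        System.out.println(FROM_INT.apply(42));        // prints Hello 42
    }
}
```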
Another interesting aspect of Java 8 is how it terminates processing pipelines, i.e. collectors. To briefly recap, a pipeline can start with a collection on which .stream() is invoked, converting it into a Stream. Many operations on the stream, e.g. flatMap, can be chained, each transforming the stream into another stream, and that into yet another, and so on. At some stage you could invoke a method such as reduce, or something similar, which returns just one value. E.g. the following will return the single value 109 because of the final reduce invocation.
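A stand-in sketch of a pipeline that ends in reduce. The input numbers are my own choice, picked only so that the squares add up to 109; treat them as an illustrative assumption rather than the data behind the figure quoted above.

```java
import java.util.Arrays;

class ReduceSketch {
    // Squares each number, then folds the squares into one sum.
    static int sumOfSquares() {
        return Arrays.asList(3, 10)
                .stream()
                .map(n -> n * n)            // 9, 100
                .reduce(0, Integer::sum);   // 0 + 9 + 100
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares()); // prints 109
    }
}
```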
However in many other cases you might prefer a collection as the result. This is where collectors are useful.
A collector has the capabilities necessary to
- instantiate the necessary object of the target class, e.g. a List (or some other collection you choose)
- take each element from the Stream and appropriately modify the target object (e.g. add it to the List)
- take two instances of the target object and combine them into a single instance (this relates to parallelisation: if multiple lists were constructed in parallel, this combines them into one).
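The three capabilities above can be sketched as three small functions handed to Collector.of, a factory I am assuming from the released Java 8 API. The helper name intoArrayList is hypothetical; in practice Collectors.toList() already does this job.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;

class ListCollectorSketch {
    // Hypothetical hand-built collector that gathers elements into an ArrayList.
    static <T> Collector<T, ?, List<T>> intoArrayList() {
        return Collector.of(
                ArrayList::new,              // 1. instantiate the target object
                List::add,                   // 2. fold one element into it
                (left, right) -> {           // 3. merge two partial results
                    left.addAll(right);
                    return left;
                });
    }

    public static void main(String[] args) {
        List<String> greetings = Arrays.asList("Larry", "Moe", "Curly")
                .stream()
                .map(s -> "Hello " + s)
                .collect(intoArrayList());
        System.out.println(greetings); // prints [Hello Larry, Hello Moe, Hello Curly]
    }
}
```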
To recap, recall the pipeline I showed earlier: Arrays.asList("Larry", "Moe", "Curly").stream().map(s -> "Hello " + s).collect(Collectors.toList()).
In this example, I used Collectors.toList(), which returned an appropriately useful collector. In most cases one can use similar factory methods to create an appropriate collector, or create a collector from instances of three functional interfaces for the three capabilities listed above. However, should you want to craft your collectors in some particular way, you could write your own collector and use that as well. It is the collector that eventually returns the result of the pipeline; in the case above, it is a List<String>.
Note: It is not a prerequisite that a collector must always eventually return a collection. The reduce operation in the earlier example could instead have been a collect with a collector reducing the stream into a single value.
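A sketch of a collector that produces a single value instead of a collection. It assumes Collectors.reducing from the released Java 8 API, and the input numbers are illustrative ones of my own, chosen so that the total is 109.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

class CollectAsReduce {
    // Same shape as a reduce, but expressed as a collector handed to collect.
    static int sumOfSquares() {
        Integer total = Arrays.asList(3, 10)
                .stream()
                .map(n -> n * n)                                  // 9, 100
                .collect(Collectors.reducing(0, Integer::sum));  // 0 + 9 + 100
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares()); // prints 109
    }
}
```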
Pending topics
Hopefully I have been able to give a reasonable feel for the capabilities of Java 8 lambdas and how they can be used. However, there are many topics that I haven't been able to cover in this post and intend to cover in later posts. Some of these are:
- Many more examples of how to use stream transformations
- Primitives vs. objects. There are many methods and classes that help work with primitives alone.
- New classes like Optional
- Aspects related to parallelisation
- Hand crafting some of your own classes