A coworker and I tested this out when the announcement was originally made, at least in the use case of iterating a massive array. (Insert 100M trues and one false at the end, and find me the false)
The result was that streams were roughly 4x slower.
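We didn't keep the code around, but the shape of the test was roughly this (class and method names are mine, not the original code):

```java
import java.util.stream.IntStream;

public class FindFalseBench {
    // The setup from the comment: 100M trues with a single false at the end.
    static boolean[] build(int n) {
        boolean[] a = new boolean[n];
        java.util.Arrays.fill(a, true);
        a[n - 1] = false;
        return a;
    }

    // Imperative version: plain indexed loop.
    static int findFalseLoop(boolean[] a) {
        for (int i = 0; i < a.length; i++) {
            if (!a[i]) return i;
        }
        return -1;
    }

    // Stream version: IntStream over the indices.
    static int findFalseStream(boolean[] a) {
        return IntStream.range(0, a.length)
                        .filter(i -> !a[i])
                        .findFirst()
                        .orElse(-1);
    }
}
```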
Still, I really like some of the new constructs that Java is getting - they make the language a bit more expressive, lack of which has always been my main gripe with the language.
Write code to be as clear and expressive as possible, then optimize for performance when you know performance is a problem. This is why I don't mind the performance cost of new language features like this. 90% of the time it won't matter, and you can optimize the other 10.
That's the mindset that eventually makes everything slow, and you can't easily see that 10% anymore because it gets spread out further the more abstractions you introduce.
When loops stop looking like loops, it makes it rather harder to find them.
Aka "flat profile". Using these things in full force requires really banking on JIT doing the right thing (if perf is a concern), which is optimistic. A simple loop is just as readable as these one liners, and carries less risk of not being compiled as tightly.
These constructs don't always make code any easier to understand or maintain. I have played a lot with NetBeans' feature that automatically translates loops into their "functional" equivalents. Sometimes the result was so clever I couldn't understand what was happening.
Sure, code is more compact, but compactness is not an end in itself.
Compactness is more of a side effect and can be a detrimental one. Personally I find that this kind of code is more declarative of intent than the more imperative for loop. By using particular functional tools for the job you're avoiding any possibility of a bug in your looping construct and making your purpose explicit.
I also like how it removes the boilerplate for handling different container types, making it easier to switch between them.
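To illustrate the point about intent and container-independence (a toy example of my own, not from this thread): the stream version states the filter directly, and works unchanged for any Collection.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.stream.Collectors;

public class DeclarativeExample {
    // Imperative: the looping machinery is spelled out by hand.
    static List<String> shortNamesLoop(Collection<String> names) {
        List<String> out = new ArrayList<>();
        for (String n : names) {
            if (n.length() <= 4) out.add(n);
        }
        return out;
    }

    // Declarative: same intent, and it works unchanged for any Collection --
    // List, Set, Deque -- because stream() abstracts over the container.
    static List<String> shortNamesStream(Collection<String> names) {
        return names.stream()
                    .filter(n -> n.length() <= 4)
                    .collect(Collectors.toList());
    }
}
```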
This reminds me a lot of Resharper's refactoring common things to LINQ in Visual Studio. A lot of times I just ran it thinking "oh wow that's a neat way to do it" and then reverted it back because it wasn't a common idiom or it made it much harder to understand.
The same thing is going to happen with Java 8 until the feature becomes less shiny.
Sometimes it's the loop itself that contributes most of the processing time (imagine a big array of ints and some simple transformations). In such a case the refactoring would be easy, but anyway...
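This is the kind of kernel I mean (my own sketch): the per-element work is a single multiply, so any per-iteration overhead from the iteration machinery itself is what you'd be measuring.

```java
import java.util.stream.IntStream;

public class TransformExample {
    // Trivial kernel: one multiply per element, so the cost of the
    // looping construct dominates whatever the JIT can't optimize away.
    static int[] doubleAllLoop(int[] a) {
        int[] out = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = a[i] * 2;
        }
        return out;
    }

    static int[] doubleAllStream(int[] a) {
        return IntStream.of(a).map(x -> x * 2).toArray();
    }
}
```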
If you want new language constructs, why bother writing 90% of your Java code in pure Java to begin with? Use something else that compiles to bytecode, like Groovy. Profile your app, and write whatever is performance critical in pure Java.
You chose a scripting language as your example. The profiler would probably tell you to rewrite the entire app in Java, which can be tricky because although the Groovy code uses the Java syntax, the semantics are often different. If you use a language for building systems, the profiler would probably tell you nothing needs rewriting.
Warmup
Warming up done
9262288 # for loop
6156414 # stream
Done
edit: tuning WARMUP_RUNS down to something lower (like 500 instead of 10k) makes the for loop win consistently; with warmup runs at 10k, the stream wins.
If I turn on -XX:+PrintCompilation I see some extra compilation output, but I don't really understand it. I assume some of this contributes, but there's way more output than I expected, tbh:
4929 263 3 java.lang.invoke.LambdaForm$DMH/1581781576::invokeStatic_L_L (14 bytes) made not entrant
4929 339 4 java.util.function.Predicate::isEqual (20 bytes)
Ya, I thought it might mark it as dead, but even if I append the result to something like a List and print the list at the end (so it can't just skip running the code), the stream version still wins. Anyway, I'm off to work for today; maybe I'll post again in the evening.
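The sink trick described above, sketched out (my own version, using a static field rather than a List; the idea is the same):

```java
public class SinkExample {
    // A field that's read after the loop keeps the computation observable,
    // so the JIT can't treat the whole loop body as dead code.
    static long sink;

    static void run(boolean[] data) {
        long found = 0;
        for (int i = 0; i < data.length; i++) {
            if (!data[i]) found += i;
        }
        sink = found;              // result escapes: the loop must execute
        System.out.println(sink);  // printing also forces the value to exist
    }
}
```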
Well, you're not really testing for loops, since there are other artifacts here:
1) forEach driver method is receiving multiple types, it's not monomorphic
2) you may be hitting OSR compilations
3) for loop may hit range checks on each get()
4) for loop version warms the cache for the stream version and this benchmark is mem ref heavy
So, please try to use JMH to get more accurate picture. And, as mentioned, this isn't really testing for loop vs streams.
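For reference, a JMH version of this experiment might look something like the following. The annotations are real JMH API (org.openjdk.jmh:jmh-core, run via the JMH annotation processor), but the benchmark bodies are my guess at the upthread test, not the original code. JMH runs each @Benchmark in its own warmed-up fork, which sidesteps the OSR, monomorphism, and warm-cache artifacts listed above.

```java
import org.openjdk.jmh.annotations.*;
import java.util.stream.IntStream;

@State(Scope.Benchmark)
public class LoopVsStream {
    boolean[] data;

    @Setup
    public void setup() {
        data = new boolean[100_000_000];
        java.util.Arrays.fill(data, true);
        data[data.length - 1] = false;
    }

    @Benchmark
    public int forLoop() {
        for (int i = 0; i < data.length; i++) {
            if (!data[i]) return i;
        }
        return -1;
    }

    @Benchmark
    public int stream() {
        return IntStream.range(0, data.length)
                        .filter(i -> !data[i])
                        .findFirst()
                        .orElse(-1);
    }
}
```

Returning the result from each benchmark method lets JMH consume it, so dead-code elimination doesn't quietly delete the work being measured.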
HotSpot (the OpenJDK JVM) does, and it should in this case, too, but this usage suffers from "the inlining problem"[1] and/or the profile pollution problem[2]. These are problems that are continuously addressed and improved with each release, but have not yet been satisfactorily resolved.
I don't think the 100M entries example suffers from inlining or profile pollution. It's just that the manual loop is going to be as tight as you can get it and it's likely the stream version leaves artifacts behind that are noticeable when the loop kernel is dead simple like this.
-XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining will tell you whether this was inlined, and you can dump the JIT-generated asm (e.g. with -XX:+PrintAssembly) to see what was actually generated.
Sumatra is dead, AFAIK. But why would Graal depend on Sumatra for integration? The bigger challenge is how to bootstrap Graal itself so that compilation time, and thus time to peak perf, isn't degraded substantially.
Sumatra was about GPU offload, not so much a metacircular JVM. If you look at the Sumatra dev mailing list archive, you'll see the last email there states it's not in active development. The project appeared to have been driven by AMD, but they may have re-prioritized things.
I also don't think it's currently possible to write the bulk of the JVM in Java, if you want comparable performance and memory footprint to HotSpot.
> I also don't think it's currently possible to write the bulk of the JVM in java, if you want comparable performance and memory footprint to Hotspot.
Better check Graal and JikesRVM research papers then.
One reason why reference JDK JIT doesn't get rewritten is the ROI.
Just check how long it has taken to rewrite the C# and VB.NET compilers while keeping the new compilers 1:1 compatible, or the new RyuJIT and the multiple AOT compiler iterations in .NET land.
As already mentioned, Graal is just the JIT compiler, it's not the entire VM. JikesRVM is a research VM, which has different needs/characteristics from production JVMs.
Graal is not a HotSpot replacement. It's a JIT for HotSpot or an AOT compiler for SubstrateVM which is a separate JVM altogether. If Graal matures and proves itself, it will become HotSpot's JIT. And Substrate may or may not become a product regardless.
And project Sumatra -- while cool -- was never a big influence over OpenJDK's plans. Being able to run streams on GPUs is absolutely awesome, but not the number one priority for the majority of Java users. My point is that Sumatra wouldn't have played a significant role in the decision of when to make Graal HotSpot's default JIT.
BTW, you don't even need Graal to be the default JIT in order to support Sumatra, anyway. Graal as a plugin (JEP 243) is good enough for that.
It's not the whole JVM -- just the JIT. What difference does it make what language the JIT is written in? It is my understanding that if Graal proves itself, it will replace C2 (if not C1 as well).
I don't think there's anything in 9 for JIT caching unless I missed it - do you have a reference? JIT caching is a nontrivial problem for HotSpot due to the nature of speculative optimization, so it may take quite some time to appear. That said, Azul has some form of it in its ReadyNow feature, but I don't know the details.
It's related to Project Jigsaw, but I don't know if it's actually scheduled for Java 9. I assumed the idea is to cache C1 output, not C2. I think I saw it in one of Paul Sandoz's videos. I'll look for it, and if I can't find it, I'll ask Paul.
I think there's some other secret sauce happening in there, especially since parallelizing the processing is as easy as changing .stream() to .parallelStream().
Code is here: https://gist.github.com/Karunamon/abc6483ac1d08f6cc137.
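That one-token switch in isolation (a toy example of my own, not taken from the gist above):

```java
import java.util.List;

public class ParallelExample {
    // Sequential pipeline.
    static long countEvensSequential(List<Integer> xs) {
        return xs.stream().filter(x -> x % 2 == 0).count();
    }

    // Identical pipeline; only the stream source changed, and the work
    // is now split across the common ForkJoinPool.
    static long countEvensParallel(List<Integer> xs) {
        return xs.parallelStream().filter(x -> x % 2 == 0).count();
    }
}
```

Whether the parallel version is actually faster depends on the workload size and how cheaply the source splits, which is presumably part of the "secret sauce" in question.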