I want to share some of the inside story behind one of the biggest changes in GridGain 3.0: the introduction of the most comprehensive Functional Programming (FP) capabilities available in a Java-based framework.
It all started over two years ago, when we released GridGain 2.0 in February 2008 and began thinking about what should go into the GridGain 3.0 product roadmap. We had a pretty clear idea of the major directions for GridGain 3.0, including native cloud integration and a Data Grid subsystem built from the ground up.
But there was one specific issue we wanted to address: we wanted to further simplify GridGain’s usage model. Now, this may seem like a strange idea, as GridGain’s marquee feature is its elegant and simple approach to MapReduce programming. In fact, no one could produce a shorter or simpler implementation of the ubiquitous grid-enabled “Hello World” example than GridGain 2.0:
    public class HelloWorld {
        public static void main(String[] args) throws GridException {
            GridFactory.start();
            try {
                sayIt("Hello Grid Enabled World!");
            }
            finally {
                GridFactory.stop(true);
            }
        }

        // Calls to this method are intercepted and executed on the grid via Task.
        @Gridify(taskClass = Task.class)
        public static void sayIt(String msg) {
            System.out.println(msg);
        }
    }
    public class Task extends GridifyTaskSplitAdapter<Object> {
        // Split the message into words and create one job per word.
        @Override protected Collection<? extends GridJob> split(int gridSize, GridifyArgument arg)
            throws GridException {
            Collection<GridJob> jobs = new LinkedList<GridJob>();
            for (final String word : ((String)arg.getMethodParameters()[0]).split(" ")) {
                jobs.add(new GridJobAdapterEx() {
                    @Override public Object execute() throws GridException {
                        HelloWorld.sayIt(word);
                        return null;
                    }
                });
            }
            return jobs;
        }

        // Nothing to aggregate: the jobs only print.
        @Override public Object reduce(List<GridJobResult> results) throws GridException {
            return null;
        }
    }
To this date, by the way, no other grid computing framework (except for GridGain 3.0) can do it better. Still, this example shows plenty of boilerplate code that just seems out of place. After some consideration we rather quickly decided that we had reached the limit of Java’s capabilities and needed to look at a DSL approach to simplify usage further.
So we looked at various JVM-based languages that could provide DSL capabilities, such as JRuby, Groovy, Clojure, and Scala. We quickly discounted JRuby and Clojure as they have no basis in enterprise software; we didn’t need Grails, so Groovy/Groovy++ was dropped. And the more we looked at Scala, the more it turned out to be the perfect fit.
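For readers who have not seen the 2.0 task model, here is a minimal local sketch of the same split, execute, reduce shape using only the JDK. It is not GridGain code: a thread pool stands in for the grid, the class and method names (`SplitReduceSketch`, `split`, `reduce`, `run`) are made up for illustration, and, unlike the task above, each job returns the word length so that the reduce step has something to aggregate.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SplitReduceSketch {
    // Split step: one job per word, mirroring the loop in Task.split().
    public static List<Callable<Integer>> split(String phrase) {
        List<Callable<Integer>> jobs = new ArrayList<>();
        for (final String word : phrase.split(" ")) {
            jobs.add(() -> {
                System.out.println(word);
                return word.length(); // illustrative result; the original jobs return null
            });
        }
        return jobs;
    }

    // Reduce step: combine the per-job results into a single value.
    public static int reduce(List<Future<Integer>> results) throws Exception {
        int total = 0;
        for (Future<Integer> result : results) {
            total += result.get();
        }
        return total;
    }

    // Local stand-in for the grid: run all jobs on a small thread pool.
    public static int run(String phrase) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            return reduce(pool.invokeAll(split(phrase)));
        }
        catch (Exception e) {
            throw new IllegalStateException(e);
        }
        finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("Total characters: " + run("Hello Grid Enabled World!"));
    }
}
```

Even in this stripped-down form, most of the lines are plumbing around a one-line job body, which is the boilerplate problem described above.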
Now, what the hell does it have to do with Java, right? Keep on reading…
Long story short: we picked Scala and a few months later produced the first version of what is known today as Scalar, a Scala-based DSL for cloud computing running on top of the GridGain runtime. And now the same code as above can be written in Scalar like this:
    object HelloWorld {
        def main(args: Array[String]) = scalar {
            grid !!~ (for (w <- "Hello Grid Enabled World!".split(" ")) yield () => println(w))
        }
    }
When I got it to work for the first time (not for the faint of heart in its first version), I literally stared at it for a few minutes, perplexed by how simple and elegant it looked. It does EXACTLY what the Java code above does, but it removes ALL of the boilerplate.
Now… this is where the Java story begins. We went back to the whiteboard and redesigned our Java APIs almost from the ground up while maintaining full backward compatibility. That set us back almost 9 months, but it WAS 100% WORTH IT.
After looking at FP frameworks for Java (bolts, FJ, lambdaJ) we decided to build our own state-of-the-art distributed FP framework, which is the cornerstone of all our new APIs in GridGain 3.0 for MapReduce and Data Grids.
And so, with GridGain 3.0 out, the Java code for the same grid-enabled “Hello World” now looks like this:
    public class HelloWorld {
        public static void main(String[] args) throws Exception {
            G.start();
            try {
                G.grid().run(SPREAD, F.yield("Hello Grid Enabled World!".split(" "), F.println()));
            }
            finally {
                G.stop(true);
            }
        }
    }
Pretty close to Scala, isn’t it?
But beyond the cosmetics, GridGain 3.0 enables a fundamental shift in how we build distributed applications. We’ll be writing more and more about it in the coming months.
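To make the idea behind that one-liner concrete, here is a rough plain-Java sketch of what a "yield" style helper does conceptually: it pairs each argument with the same closure and hands back one zero-argument job per argument. This is an assumption-laden illustration, not GridGain's implementation; `YieldSketch` and `yieldJobs` are hypothetical names, it uses the JDK's own `Consumer` and `Runnable` rather than GridGain's closure types, and the jobs run locally in a loop instead of being spread across grid nodes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class YieldSketch {
    // Binds each argument to the given closure, producing one deferred job per argument.
    // Nothing runs until the caller (or, on a grid, a remote node) invokes the job.
    public static <T> List<Runnable> yieldJobs(T[] args, Consumer<? super T> body) {
        List<Runnable> jobs = new ArrayList<>();
        for (T arg : args) {
            jobs.add(() -> body.accept(arg));
        }
        return jobs;
    }

    public static void main(String[] args) {
        List<Runnable> jobs =
            yieldJobs("Hello Grid Enabled World!".split(" "), System.out::println);

        // Local, sequential stand-in for distributing the jobs across nodes.
        for (Runnable job : jobs) {
            job.run();
        }
    }
}
```

The point of the design is that the caller only states the data and the closure; how the resulting jobs are created and scheduled is left to the framework.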
Enjoy!