TSSJS2010: Cloud Computing with Scala & GridGain

I’m here at the beautiful Caesar’s Palace waiting to hear more about Scala. James Gosling’s keynote speech this morning was a bit of a letdown. I was hoping to hear a visionary speech from the father of Java, perhaps not with Obama’s oratory skills, but nevertheless captivating. What I got instead was a 60-minute infomercial on Sun’s current product offerings. Yes, Java EE 6 and GlassFish 3.0 are totally slick, and I do like the concept of using annotations for event handling. That said, I was hoping he would talk more about the future of Java, especially with the Oracle merger going on. Anyway, no-go. So now I’m here waiting to hear what the hell Scala is and how I can use it. From what I’ve read, they’re using it at Twitter, so it’s gotta be somewhat scalable, right?  😉

The presentation is broken down with 20% talking and 80% live coding.   Nikita Ivanov, the presenter, has a very good stage presence and I enjoy his direct approach.

We start with the talking part by defining a few terms for us neophytes (myself included!). What is Grid/Cloud Computing? A Grid is defined as two or more computers working in parallel. Grid Computing comprises Compute Grids + Data Grids. The Cloud, meanwhile, comes down to Data Center Virtualization. Clouds are the new way to deploy and run grid applications.

Why Grid/Cloud Computing?

It solves problems that are often unsolvable otherwise. Google has ~1,000,000 nodes in its grid. Put another way, it’s about money. Amazon says that every 100 ms of latency costs 1% of sales. Google says that 500 ms of latency drops traffic by 20%. In the financial sector, one millisecond costs $4M in currency markets.

GridGain at a Glance

The project was started in 2005 as a Java-based Cloud Development Platform:
* Compute Grid (aka MapReduce)
* Data Grid (aka Distributed Cache)
* Auto-scaling on the cloud

Scala at a Glance

* Started in 2004 by Martin Odersky at EPFL (author of ‘javac’ and Java Generics)
* Scala is a post-functional language (combines the functional and OO approaches)
* Fully inter-compatible with Java (runs on the JVM, calls Java code and can be called from Java)
* Statically typed (Unique and powerful type inference)
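
To make those bullets a bit more concrete for myself, here is a small sketch of what they seem to mean in practice: an ordinary class (the OO side), a higher-order `map` (the functional side), a plain `java.util.ArrayList` used straight from Scala, and local types that are inferred rather than spelled out. The names `ScalaAtAGlance`, `Greeter`, `javaList`, and `greetAll` are my own inventions for illustration and were not shown in the talk.

```scala
import java.util.{ArrayList => JArrayList}

object ScalaAtAGlance {
  // OO side: an ordinary class with a method (hypothetical example class).
  class Greeter(greeting: String) {
    def greet(name: String): String = s"$greeting, $name!"
  }

  // Java interop: build a plain java.util.ArrayList directly from Scala.
  def javaList(): JArrayList[String] = {
    val names = new JArrayList[String]() // the type of `names` is inferred
    names.add("Las Vegas")
    names.add("Scala")
    names
  }

  // Functional side: copy the Java list into a Scala List, then map over it.
  def greetAll(greeter: Greeter): List[String] = {
    val names = javaList()
    val scalaNames = (0 until names.size).map(i => names.get(i)).toList
    scalaNames.map(greeter.greet)
  }

  def main(args: Array[String]): Unit =
    greetAll(new Greeter("Hello")).foreach(println)
}
```

Nothing here is specific to GridGain; it is only meant to show the language features the slide listed.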

Apparently, there’s more use of Scala in Europe than there is here in the US. A large national French bank already has a dozen Scala projects rolling out. In the audience here today, only one hand went up when we were asked who was using Scala.

Why Scala?
* Performance largely equal to Java
* Statically typed
* Inter-compatible with Java
* Scalable language

Scalar – Scala-based cloud computing DSL + GridGain 3.0
* Uses Scala
– Functional-imperative
– Runs on JVM
– Reuses 100% Java libraries
* Running on top of GridGain 3.0 runtime

DSL – Domain Specific Language
* Provides a simple cloud computing model
* Draws on functional features of Scala
* Dramatically simplifies cloud computing applications

The demo is to build a Scala grid application in 10 minutes!

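import org.gridgain.grid.gridify.Gridify
import org.gridgain.grid.GridFactory

object ScalaDemo {
    def main(args: Array[String]): Unit = {
        // Start a grid node before doing any work.
        GridFactory.start()

        try {
            say("hello Scala Las Vegas")
        }
        finally {
            // Always stop the grid, even if the call above fails.
            GridFactory.stop(true)
        }
    }

    // The annotation marks this method for execution on the grid.
    @Gridify
    def say(msg: String): Unit = { println(msg) }
}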

This code automatically deployed the object to the Grid and executed it on one of the nodes!   That’s mindblowing and so cool!   Clearly there’s tons of complexity behind the scenes.   The class definition must be serialized, the grid must be located, some scheduler must identify the node on which the object is to run.   I’m totally blown away by how painless it is to deploy on a grid!

Ivanov proceeded to write a task in about 10 lines of code that split the string into words and dispatched the job of printing those words onto the various nodes.
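
I didn’t manage to copy that task down, so what follows is not Ivanov’s code and does not use the GridGain API. It is a minimal sketch of the same split-and-dispatch idea in plain Scala, with `Future`s on a local thread pool standing in for grid nodes. The names `SplitAndPrintSketch`, `split`, `job`, and `run` are my own, and having each job return its word length (so there is something to collect at the end) is my addition.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object SplitAndPrintSketch {
  // "Split" step: turn the message into one unit of work per word.
  def split(msg: String): Seq[String] =
    msg.split("\\s+").filter(_.nonEmpty).toSeq

  // One "job": print a word on whichever pool thread picks it up.
  // A local thread stands in for a remote grid node (an assumption of this sketch).
  def job(word: String): Future[Int] = Future {
    println(s"[${Thread.currentThread.getName}] $word")
    word.length
  }

  // Dispatch every job, wait for all of them, and sum the results.
  def run(msg: String): Int = {
    val results = Future.sequence(split(msg).map(job))
    Await.result(results, 10.seconds).sum
  }

  def main(args: Array[String]): Unit = {
    val total = run("hello Scala Las Vegas")
    println(s"printed $total characters in total")
  }
}
```

The words can come out in any order, since each job runs on whatever thread is free; on a real grid that would presumably be whatever node is free.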

The same demo was then given using Scalar, GridGain’s Scala-based DSL. That got a little more cryptic, in my humble opinion. Still, it did require fewer lines of code to get the job done. The tradeoff, though, seems to be readability. You’re slowly creeping into the world of Python, which I’ve never enjoyed because only the original developer can ever make bug fixes.

Still, Scala clearly is pretty slick and it looks like the integration with GridGain in order to parallelize tasks is very easy.   I’ll definitely need to investigate how I can leverage this for risk analysis.