1 Billion ACID Updates per Second on a 10-Node Cluster Using GridGain

Welcome to In-Memory Computing…

And yes, that wasn’t a typo in the title: 1,000,000,000 distributed, fully transactional updates per second on a 10-node cluster costing less than $50K, using GridGain’s In-Memory Data Platform.

Most of the time we at GridGain are not at liberty to discuss customers’ benchmarks and POCs – but I want to share some numbers we recently demonstrated to one of the largest financial institutions in the world (under strict open tender rules). The task was simple and isolated, yet hitting the target performance numbers was a real challenge.

Use Case
Imagine you are building a hypothetical real-time risk analytics system. You have 500 events per second coming into your system, and for each event you need to update approximately 10,000,000 positions based on some predefined formula. For obvious performance reasons, all data must reside and be processed in memory, with possible overflow to disk when necessary. The system should scale linearly to 100+ nodes and work on any type of commodity hardware.
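To make the shape of this workload concrete, here is a minimal single-process sketch of what one event does to the book. It is an illustration under stated assumptions, not the POC code: the `Position` fields, the `apply_event` helper, and the multiplication used as the "predefined formula" are all hypothetical, and the book is shrunk to 1,000 positions.

```python
from dataclasses import dataclass


@dataclass
class Position:
    """One position in the book; the field names are illustrative only."""
    quantity: float
    value: float = 0.0


def apply_event(positions: dict, price_shift: float) -> int:
    """Revalue every position for one incoming event; return the update count."""
    for pos in positions.values():
        # Hypothetical stand-in for the "predefined formula" in the use case.
        pos.value = pos.quantity * price_shift
    return len(positions)


# Tiny in-memory book: 1,000 positions instead of roughly 10,000,000.
book = {i: Position(quantity=float(i)) for i in range(1_000)}
updated = apply_event(book, price_shift=1.5)
print(updated, book[10].value)
```

In the real system each such pass would also have to be transactional and spread across many nodes, which is where the difficulty lies.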

What I really like about these requirements is that almost any serious financial organization would have projects with similar requirements, if not exactly these. We are all moving towards same-day processing, and increasingly into the realm of real-time processing, regardless of how big the book of business is. And when it comes to risk analytics, fraud protection, or any type of trading – we are seeing these requirements almost on a weekly basis…

Results
Back to this POC. One of our top engineers spent 10 days building this pilot and, after a few configuration and algorithmic improvements, was able to achieve 1 billion ACID updates per second on the target dataset using GridGain 4.3 “Big Data” edition running on a 10-node cluster of commodity Dell PowerEdge R410 servers with 96GB RAM each.
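As a back-of-the-envelope check, the headline figure can be broken down per node. This is a rough sketch that assumes the load spreads evenly across the 10 nodes; the variable names are illustrative, and the per-update time budget is for a whole node, so with several cores working in parallel each core has proportionally more time per update.

```python
UPDATES_PER_SECOND = 1_000_000_000  # headline figure from the post
NODES = 10                          # cluster size from the post
RAM_PER_NODE_GB = 96                # per-server RAM from the post

# Updates each node must absorb per second, assuming an even spread.
per_node_updates = UPDATES_PER_SECOND // NODES

# Average time available per update on one node, in nanoseconds.
ns_per_update = 1_000_000_000 / per_node_updates

# Total memory available to hold the dataset across the cluster.
total_ram_gb = NODES * RAM_PER_NODE_GB

print(per_node_updates, ns_per_update, total_ram_gb)
```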

GridGain 4.3 provides several key features that were necessary in this POC to achieve the performance numbers:

  • World’s fastest marshaling algorithm, up to 5x faster than Google Kryo.
  • Highly optimized co-located cache mode
  • Pluggable & user customizable affinity distribution function
  • Affinity-aware group locking
  • Pluggable cache store with pre-loading
  • Compute and data loaders with back-pressure control
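The idea behind a custom affinity function and co-location can be sketched in a few lines. This is a conceptual illustration only and does not use GridGain’s API: `affinity_node` and `group_by_node` are hypothetical helpers, the instrument names are made up, and plain modulo hashing stands in for whatever partitioning scheme a real data grid uses.

```python
import zlib
from collections import defaultdict

NODES = 10  # cluster size from the post


def affinity_node(affinity_key: str, nodes: int = NODES) -> int:
    """Map an affinity key to a node index using a stable hash (CRC32)."""
    return zlib.crc32(affinity_key.encode("utf-8")) % nodes


def group_by_node(position_keys):
    """Group (instrument, position_id) keys by the node owning the instrument."""
    by_node = defaultdict(list)
    for instrument, position_id in position_keys:
        # Only the instrument decides placement, so all positions of one
        # instrument land on the same node and can be updated together there.
        by_node[affinity_node(instrument)].append((instrument, position_id))
    return by_node


keys = [("IBM", 1), ("IBM", 2), ("MSFT", 3), ("IBM", 4)]
placement = group_by_node(keys)
ibm_node = affinity_node("IBM")
print(ibm_node, placement[ibm_node])
```

Because every position for a given instrument resolves to one node, an update triggered by an event on that instrument can run locally on that node and lock the whole group at once, rather than coordinating a transaction across the cluster.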

Very few technologies on the market today, if any, can deliver 1,000,000,000 transactions per second on $50K of hardware. If you need that today, GridGain 4.3 delivers it.

Improving MapReduce: GridGain To The Rescue! JAX London 2012, Oct. 16, London, UK.

GridGain will be presenting at the JAX London 2012 conference in London on October 16th, 2012. We’ll be talking about “Improving MapReduce: GridGain & Scala To The Rescue!”. If you are interested in some pretty cool live Scala coding of a streaming MapReduce application – come to our talk and watch it live 🙂

Hope to see you there! All information about the conference is here.

GridGain 4.3 is Released!

GridGain 4.3 incorporates many optimizations, based directly on feedback from our largest deployments, that significantly improve GridGain’s performance and memory footprint. It is 100% backwards compatible with prior 4.x versions.

The new features and performance numbers speak for themselves:

  • Up to 20x better serialization performance compared to standard Java.
  • 3x better performance for collocating computations and data.
  • 3x better query indexing from an improved cache locking implementation.
  • 50% reduction in cache overhead and significantly reduced garbage collection overhead.
  • Dramatically improved BigMemory support provides a high-performance off-heap data store that eliminates lengthy garbage collection pauses when working with hundreds of gigabytes of memory.
  • New Data-Affinity-Aware Router supports client communication through the firewall.
  • Significant performance enhancements for Java, C++, and .NET clients.

Download now!