Cassandra Summit 2012 Highlights

I was lucky enough to speak at the Cassandra Summit 2012 on August 8 in Santa Clara. It was an amazing opportunity to share with the community the types of things that SimpleReach does with Cassandra. Not only that, I learned a lot about the roadmap and got to put a bunch of faces to the names behind the project.

The whole show opened with a group of Japanese drummers performing, followed by Jonathan Ellis, DataStax CTO, giving the keynote. He gave a few great notes about the roadmap, the three biggest of which are:

  1. Collection types
  2. Virtual Nodes
  3. Row caching

The cool thing about collection type support is that it makes Cassandra more versatile for storing different types of data (read more about collections on the Cassandra developers blog here). With the addition of sets, lists, and maps, many organizations will be able to move some of their caching and data organization off of Redis and into Cassandra.
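To make the Redis comparison a little more concrete, here is a rough sketch of what the three collection types look like in CQL 3, with each statement paired with the Redis command it could stand in for. The table name, column names, and values are made up for illustration, and the statements are only held as Python strings; running them would need a live cluster and a driver session, which this sketch assumes you already have.

```python
# Hypothetical schema: one row per user with a set, a list, and a map column.
CREATE_TABLE = """
CREATE TABLE user_profiles (
    user_id text PRIMARY KEY,
    emails set<text>,
    recent_pages list<text>,
    preferences map<text, text>
)
"""

# Each entry pairs a Redis command with a CQL 3 collection update that
# covers roughly the same use case. Names and values are illustrative only.
REDIS_TO_CQL = {
    "SADD": (
        "UPDATE user_profiles SET emails = emails + {'eric@example.com'} "
        "WHERE user_id = 'eric'"
    ),
    "RPUSH": (
        "UPDATE user_profiles SET recent_pages = recent_pages + ['/blog/summit'] "
        "WHERE user_id = 'eric'"
    ),
    "HSET": (
        "UPDATE user_profiles SET preferences['theme'] = 'dark' "
        "WHERE user_id = 'eric'"
    ),
}


def statements():
    """Return the schema statement followed by the three collection updates."""
    return [CREATE_TABLE.strip()] + list(REDIS_TO_CQL.values())


for cql in statements():
    print(cql)
    print()
```

The nice part is that the set, list, and map all live in the same row as the rest of the user's data, so there is one fewer system to keep in sync.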

Along the same lines are the new row-based caches. These work out incredibly well because they remove the need for (some) machines acting as a caching layer in front of Cassandra. It is not an uncommon paradigm to cache the output of an entire row of data in a caching layer. Having C* do this natively would allow the caching layer to be reserved for more complex objects. Additionally, having row caching inside Cassandra means that when a node coordinating a query hits the other nodes where it would expect to find the result, the node with the result already in its row cache will respond first and quickly. Meanwhile, the other node(s) will not need to respond to the coordinator but will still have the result added to their row cache for future use.
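If it helps to picture the idea, here is a toy read-through row cache on a single node. This is only a sketch of the general pattern (an LRU cache of whole rows in front of a slower lookup) and not Cassandra's actual implementation; the class name, the dictionary standing in for on-disk data, and the capacity setting are all assumptions for the example.

```python
from collections import OrderedDict


class NodeWithRowCache:
    """Toy replica: an LRU cache of whole rows in front of a slow 'disk' lookup."""

    def __init__(self, disk_rows, capacity):
        self.disk_rows = disk_rows      # stands in for the data files on disk
        self.capacity = capacity        # maximum number of rows kept in memory
        self.row_cache = OrderedDict()  # row key -> whole row, oldest first
        self.disk_reads = 0             # how many reads had to go to disk

    def read_row(self, row_key):
        if row_key in self.row_cache:
            # Cache hit: mark the row as most recently used and skip the disk.
            self.row_cache.move_to_end(row_key)
            return self.row_cache[row_key]
        # Cache miss: read from disk, then remember the whole row.
        self.disk_reads += 1
        row = self.disk_rows[row_key]
        self.row_cache[row_key] = row
        if len(self.row_cache) > self.capacity:
            # Evict the least recently used row to stay within capacity.
            self.row_cache.popitem(last=False)
        return row


node = NodeWithRowCache({"article:1": {"title": "Summit recap"}}, capacity=100)
node.read_row("article:1")  # first read goes to disk
node.read_row("article:1")  # second read is served from the row cache
print(node.disk_reads)
```

This is the same thing many of us do by hand with a separate caching tier; the appeal is that the database node does it for you on the read path.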

Last but not least is the idea of virtual nodes. For a lot more information than this, I recommend watching Sam Overton’s talk on them from the summit. In short, virtual nodes make manual token management basically unnecessary, improve bootstrapping and decommissioning speed, and make incremental cluster growth and shrinkage possible. And considering how often anyone with a cluster of more than 15 nodes does all of those things, this is a next-level feature to be on the lookout for.
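To see why many small token ranges help with growing a cluster, here is a small consistent-hashing sketch. It is a simplification under my own assumptions: tokens are derived from an MD5 hash of a made-up node label so the run is repeatable (a real cluster assigns vnode tokens differently), there is no replication, and the `Ring` class and node names are hypothetical.

```python
import bisect
import hashlib


def token(value):
    """Place a string on the ring using MD5 (an illustrative choice)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    def __init__(self, tokens_per_node):
        self.tokens_per_node = tokens_per_node
        self.tokens = []  # sorted ring positions
        self.owner = {}   # ring position -> node name

    def add_node(self, name):
        # Give the node tokens_per_node positions scattered around the ring.
        for i in range(self.tokens_per_node):
            t = token("%s#%d" % (name, i))
            bisect.insort(self.tokens, t)
            self.owner[t] = name

    def owner_of(self, key):
        # A key belongs to the first token at or after its own position,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect_left(self.tokens, token(key)) % len(self.tokens)
        return self.owner[self.tokens[idx]]


def donors_when_adding_node(tokens_per_node, keys):
    """Return the existing nodes that hand data to a newly added fifth node."""
    ring = Ring(tokens_per_node)
    for name in ("node1", "node2", "node3", "node4"):
        ring.add_node(name)
    before = {k: ring.owner_of(k) for k in keys}
    ring.add_node("node5")
    after = {k: ring.owner_of(k) for k in keys}
    moved = [k for k in keys if before[k] != after[k]]
    # Keys only ever move onto the new node, never between old nodes.
    assert all(after[k] == "node5" for k in moved)
    return {before[k] for k in moved}


keys = ["row-%d" % i for i in range(5000)]
print("1 token per node:   ", sorted(donors_when_adding_node(1, keys)))
print("256 tokens per node:", sorted(donors_when_adding_node(256, keys)))
```

With a single token per node, the new node can only split one existing range, so at most one neighbor streams data to it. With many tokens per node, the new node picks up small slices from around the whole ring, which is the intuition behind faster bootstrapping and not having to hand-pick tokens.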

For more information, check out the blog post I wrote (which talks a little less about the roadmap) on the SimpleReach blog entitled A Big Stage for Cassandra.

And if you don’t want to click through, here are the 2 videos that I am a part of. The first video is my talk at the summit. The topic of my talk is Polyglotteny. This is the notion that it is acceptable to use more than one language and more than one data store to support your application’s various needs. It is a little tour through how SimpleReach started out with Ruby and Mongo and ended up at Ruby/Node.js/Python and Mongo/Redis/Infobright/Cassandra.

For fun, here are the slides that go along with the talk (some of which you can see on the video).

This second video is of Russell Bradberry (Principal Architect at SimpleReach) and me talking to John Furrier of Silicon Angle. We talk a little about SimpleReach and how we use Cassandra to manage our data.

All in all, the summit was a great experience, both in getting to share what we do at SimpleReach with the community and in getting to meet the people who make it happen. DataStax was also kind enough to name me one of their MVPs for being an active part of the Cassandra community. Cassandra has really made a name for itself and it’s great to be a part of the community behind that name.