Archive for the ‘ Cassandra ’ Category

Batching Work For Efficiency and Tuning

Monday, June 23rd, 2014

I’ve been talking a lot about message systems in distributed architectures lately. And one of the slides I show in my talks is a slide about compressing messages before writes to the database. In other words, if you have 150k messages per second coming in which would translate 1:1 in writes and force your database(s) to incur a 150k write per second load, you pull in all those messages in to memory for a short period (say one minute) and group them and write the group in batch. Depending on how much you can group, you can easily cut your write load by an order of magnitude. (more…)

Range Repairs: Step-by-Step

Friday, May 9th, 2014

It’s been a long time since I was able to run a repair on my Cassandra cluster. Basically since I went to 1.2, it just hasn’t been possible. And since repairs in Cassandra are pretty much a requirement to normal operation, this is clearly a problem. So in order to deal with the disarray that is Cassandra repairs in 1.2, I found a script originally written by Matt Stump and edited to work with virtual nodes (vnodes) by Brian Gallew. The tl;dr is that the script breaks the repairs down into manageable chunks and allows the repairs to finish. It is available here. (more…)