ec2-consistent-snapshot With Mongo

I set up MongoDB on my Amazon EC2 instance knowing full well that it would have to be backed up at some point. I also knew that by using XFS, I could take advantage of filesystem freezing, much like LVM snapshots. I remembered reading about MySQL backups on XFS being done with ec2-consistent-snapshot. As with any piece of open source software, it just took a little tweaking to make it do what I wanted.

Getting a Random Record From a MongoDB Collection

One of my issues with MongoDB is that, as of this writing, there is no built-in way to retrieve a random record. In SQL, you can simply do something like “ORDER BY RAND()” (the exact syntax varies with your flavor) to retrieve random records, at a somewhat expensive query cost. There is not yet an equivalent in MongoDB because of its sequential access nature. There is a purely JavaScript method in the MongoDB cookbook. If you are really interested, I would also read the Jira ticket thread #533 on this issue.
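To make the idea concrete, here is a minimal sketch of one common workaround: count the documents, pick a random offset, and skip to it. This is not the cookbook's JavaScript method, just an illustration written in Python against a pymongo-style collection. The function name random_document and its rng parameter are my own, hypothetical names.

```python
import random


def random_document(collection, rng=random):
    """Return one document picked uniformly at random, or None if empty.

    `collection` is assumed to behave like a pymongo Collection:
    it needs count_documents() and find().skip().limit().
    """
    count = collection.count_documents({})
    if count == 0:
        return None
    # Pick an offset in [0, count) and jump to it.
    offset = rng.randrange(count)
    cursor = collection.find().skip(offset).limit(1)
    # The cursor yields at most one document. It can be empty if
    # documents were deleted between the count and the find.
    return next(iter(cursor), None)
```

Keep in mind that skip() still walks past every skipped document on the server, so this gets slower as the collection grows. It also takes two round trips, so the count can be stale by the time the find runs.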


New Massachusetts Security Law Passed For Databases

In case you haven’t heard about the new Massachusetts state law regarding consumer or client information in databases, you can read about it here, at Information Week, or just Google for “Massachusetts data security law”. And if you haven’t read about it, then I strongly suggest you do. This is one of those instances where I believe their heart is in the right place, even if the execution wasn’t perfect.

Speeding Up Your Selects and Sorts

When you are using a framework, it typically sets your VARCHAR size to 255 automatically. This is normally fine, since you are letting the framework abstract you away from most of the SQL. But if you interact with your SQL directly, there is a way to get a decent speed increase on your SELECTs and ORDER BYs when you are working with VARCHARs.

The VARCHAR data type is variable-length only for storage, not for sorting and buffering. In fact, since the MySQL optimizer doesn’t know how big the data in that column can be, it has to allocate the maximum size possible for that column. So sorting and buffering the name and email columns below would take up 310 bytes per row.