Getting a Random Record From a MongoDB Collection

One of my issues with MongoDB is that, as of this writing, there is no way to retrieve a random record. In SQL, you can simply do something similar to “ORDER BY RAND()” (this varies depending on your flavor) and you can retrieve random records (at a slightly expensive query cost). There is not yet an equivalent in MongoDB because of its sequential access nature. There is a purely Javascript method in the MongoDB cookbook here. If you are really interested, I would also read the Jira ticket thread #533 on this issue.
Read the rest of this entry »

Posted in MongoDB. Tags: , . View Comments

New Massachusetts Security Law Passed For Databases

In case you haven’t heard about the new Massachusetts state law regarding consumer or client information in databases, you can read about it here, at Information Week, or just Google for “Massachusetts data security law”. And if you haven’t read about, then I strongly suggest you do. This is one of those instances where I believe their heart is in the right place, even if the execution/implementation wasn’t perfect.
Read the rest of this entry »

Speeding Up Your Selects and Sorts

When you are using a framework, they typically set your VARCHAR size automatically to 255. This is normally fine since you are letting the framework abstract you away from most of the SQL. But if you interact with your SQL, there is a way to get a decent speed increase on your SELECTs and ORDER BYs when you are working with VARCHARs.

The VARCHAR data type is only variable character size for storage, not for sorting and buffering. In fact, since the MySQL optimizer doesn’t know how big the data in that column can be, it has to allocate the maximum size possible for that column. So for sorting and buffering of the name and email columns below would take up 310 bytes per row.
Read the rest of this entry »

Database Read/Write Splitting in Frameworks/ORMs

Although one of the primary ideas behind frameworks is to keep things as simple as possible, sometimes they create issues in the long run. What I am about to discuss is something of a luxury problem (as scaling usually is), but it is a problem nonetheless.

When initially starting a project, whether you are using Ruby on Rails (Ruby), Django (Python), CakePHP (PHP), Catalyst (Perl), or any of the other 100s of frameworks in any of the languages out there, the first and most important thing to do is to get it out the door. Once you have done that, it’s time to get users, fix bugs, and add features. After you have done all that and you have a great web app, its time to think scaling. (Yes I realize that I have trivialized this process immensely, but its for a point, I promise).
Read the rest of this entry »