There are a lot of things I really like about Cassandra. But one thing in particular I like in creating a schema is having access to composite columns (read about composite columns and their origins here on Datastax’s blog). Let’s start simple with explaining a composite columns and then we can dive right into why they are so much fun to work with. (more…)
I setup MongoDB on my Amazon EC2 instance knowing full well that it would have to be backed up at some point. I also knew that by using XFS, I could take advantage of filesystem freezing in a similar fashion to LVM snapshots. I had remembered reading about backups on XFS with MySQL being done with ec2-consistent-snapshot. As with any piece of open source software, it just took a little tweaking to make it do what I wanted it to do.
In case you haven’t heard about the new Massachusetts state law regarding consumer or client information in databases, you can read about it here, at Information Week, or just Google for “Massachusetts data security law”. And if you haven’t read about, then I strongly suggest you do. This is one of those instances where I believe their heart is in the right place, even if the execution/implementation wasn’t perfect.
When you are using a framework, they typically set your VARCHAR size automatically to 255. This is normally fine since you are letting the framework abstract you away from most of the SQL. But if you interact with your SQL, there is a way to get a decent speed increase on your SELECTs and ORDER BYs when you are working with VARCHARs.
The VARCHAR data type is only variable character size for storage, not for sorting and buffering. In fact, since the MySQL optimizer doesn’t know how big the data in that column can be, it has to allocate the maximum size possible for that column. So for sorting and buffering of the name and email columns below would take up 310 bytes per row.
Having recently implemented Thinking Sphinx on one of my web sites, I thought it would be cool to be able to search every indexed model. With Thinking Sphinx, it’s easy to have a bunch of different classes returned in the results. The tougher part is displaying them in a way that is organized (although admittedly not very DRY).
I have been writing a lot of code that has been interacting with MySQL lately. Sometimes I find it easier to work the result set in a dictionary form and other times it is easier with an array. But in order to not break all your code, it is necessary to set a default cursor class that keeps your code consistent. More often than not, I find using using a arrays is easier since I just want quick access to all the retrieved data. I also end up making my SELECT calls while specifying the columns and order of the columns I want returned.
The reason that using cursor classes is handy is because Python doesn’t come with a mysql_fetch_assoc like PHP or selectrow_hashref like Perl’s DBI interface. Python uses cursor dictionaries to bridge this gap. Ultimately your result is the same. But as with Perl and PHP, defaulting to cursor dictionaries isn’t a good idea for larger datasets because of the extra processing time and memory required to convert the data.
Normally when do a database migration in Rails, when adding ownership from a model to another model, you use the concept of belongs_to or references:
create_table :comments do |t|
Interestingly enough, these methods are only available during the initial table creation. If you want to add a reference to a model that is created later, you have to do it the old fashioned way, by just adding a column:
add_column :comments, :group_id, :integer
Doing it this way is clean, easy, and definitely meets the KISS principle. But I do find it interesting that one can’t add an association later in the game. Sometimes the Rails way is just KISS and adding the column by hand.