I am always for using the right tool for the right job. A lot of time, that tool is Python. I have always had trouble finding solid documentation on using MySQL with Python. There was generally enough to get by, but the more the merrier. Enter MySQL for Python by Albert Lukaszewski. (more…)
Although one of the primary ideas behind frameworks is to keep things as simple as possible, sometimes they create issues in the long run. What I am about to discuss is something of a luxury problem (as scaling usually is), but it is a problem nonetheless.
When initially starting a project, whether you are using Ruby on Rails (Ruby), Django (Python), CakePHP (PHP), Catalyst (Perl), or any of the other 100s of frameworks in any of the languages out there, the first and most important thing to do is to get it out the door. Once you have done that, it’s time to get users, fix bugs, and add features. After you have done all that and you have a great web app, its time to think scaling. (Yes I realize that I have trivialized this process immensely, but its for a point, I promise).
I have been writing a lot of code that has been interacting with MySQL lately. Sometimes I find it easier to work the result set in a dictionary form and other times it is easier with an array. But in order to not break all your code, it is necessary to set a default cursor class that keeps your code consistent. More often than not, I find using using a arrays is easier since I just want quick access to all the retrieved data. I also end up making my SELECT calls while specifying the columns and order of the columns I want returned.
The reason that using cursor classes is handy is because Python doesn’t come with a mysql_fetch_assoc like PHP or selectrow_hashref like Perl’s DBI interface. Python uses cursor dictionaries to bridge this gap. Ultimately your result is the same. But as with Perl and PHP, defaulting to cursor dictionaries isn’t a good idea for larger datasets because of the extra processing time and memory required to convert the data.
If you’ve ever been plagued by an error 1033 issue in MySQL (replication will show it as well), then I might be able to help you out. The error reads something like, “Incorrect information in file: ‘./mydb/table.frm’. I classify this as another one of MySQLs cryptic error messages. Here is how I determined that this was my problem.
Googling around got me an answer, but I had to read a bunch of different responses to piece together the answer. Essentially this issue (in my case) was a result of the InnoDB engine not loading up when MySQL was restarted. Therefore when MySQL tried to read the frm file (table description) which was written for an InnoDB table with the MyISAM reader, it didn’t like it. Since MyISAM is the fallback engine, it went to that and the table became unusable.
There really isn’t a solid Python module for multiprocessing and MySQL. Now this may be because MySQL on a single server is disk bound and therefore limited in speed or just because no one has written it. So here is a quick and dirty example using the Pool module in multiprocessing in Python 2.6 and MySQLdb.
I also tried using PySQLPool. This was designed for threading and not forking as I am doing with Pool method. Although I am sure it is possible to use PySQLPool with forking by passing the connection (pool) object down to the child process or possibly doing something with IPC, I decided to keep it simple (although slightly more expensive) and instantiate MySQLdb connections upon fork.
While writing a simple Python script to access and process data in a database, I came across an error that said:
Error 2014: Commands out of sync; you can't run this command now
After quite a bit of Googling and with very little findings, I had to dive in a little and try to figure out what was going on. The whole error looked like this:
For everyone who was not able to attend this year, here is the link to the Percona Perfmance Conference slides: http://conferences.percona.com/percona-performance-conference-2009/schedule.html.
I suggest you check them out if you work with MySQL. Enjoy.
Every so often I find some statistical need that although Perl program would be easy to write for it, its probably something the database should just handle. So I have a column in an email management table that has just the email addresses in the format firstname.lastname@example.org.
I want to know which domains make up the majority of the users. (The numbers have been changed to protect the innocent):
mysql> SELECT SUBSTRING_INDEX(email, '@', -1) as Domain, count(*) as Total FROM email_list GROUP BY Domain ORDER BY Total DESC LIMIT 15; +----------------+---------+ | Domain | Total | +----------------+---------+ | yahoo.com | 1304000 | | hotmail.com | 908400 | | aol.com | 800000 | | msn.com | 168000 | | gmail.com | 161000 | | comcast.net | 143000 | | sbcglobal.net | 110000 | | bellsouth.net | 62000 | | cox.net | 58000 | | verizon.net | 56000 | | earthlink.net | 52000 | | charter.net | 46000 | | juno.com | 30000 | | optonline.net | 22000 | | netzero.com | 17000 | +----------------+---------+
I recently ran into some character encoding issues and I wanted to share the fun. The default character encoding for MySQL on Gentoo is latin-1 or iso-8859-1. This wasn’t a problem until we recently started putting content straight from the DB through Java and onto the web. Java connects to the DB with a character encoding (typically UTF-8). Since UTF-8 is roughly a superset of iso-8859-1, it generally wasn’t a problem. Except when UTF-8 and UTF-16 characters were put into an iso-8859-1 database without translation.
What was essentially happening was that the data was being stored as iso-8859-1. The Java code was connecting to the DB in UTF-8 and pulling it into Java (which is usually UTF-16, but in this case was being handled as UTF-8). It was then being sent to the browser as URL encoded UTF-8 when in reality, it hadn’t even properly been put into UTF-8 character encoding. This then gave the web browser some funny yen symbols and question marks. This was not quite what we were aiming for.
The moral of this story is that it is necessary to realize the character encoding of the start point and end point of your data. It is crucial that the code points match up otherwise they could potentially make for an interesting screen given to the reader. All this could have been avoided with a simple: ALTER TABLE myTable MODIFY myColumn VARCHAR(255) CHARACTER SET utf8;.