Python Multiprocessing Pools and MySQL

21
Dec

There really isn’t a solid Python module for multiprocessing and MySQL. Now this may be because MySQL on a single server is disk bound and therefore limited in speed or just because no one has written it. So here is a quick and dirty example using the Pool module in multiprocessing in Python 2.6 and MySQLdb.

I also tried using PySQLPool. This was designed for threading and not forking as I am doing with Pool method. Although I am sure it is possible to use PySQLPool with forking by passing the connection (pool) object down to the child process or possibly doing something with IPC, I decided to keep it simple (although slightly more expensive) and instantiate MySQLdb connections upon fork.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
import MySQLdb
import multiprocessing

def main():
    tablename = 'address_demographics'
    inserts = []
    id = 0
    max_id = 1000000
    while id < max_id:
        sql = """
        INSERT IGNORE INTO `%s` (person_id)
        SELECT person_id
          FROM `people`
         WHERE id BETWEEN %d AND %d
        """
% (tablename,id,(id + 100000))
        inserts.append(sql)
        id += 100001
    pool = multiprocessing.Pool(multiprocessing.cpu_count())
    pool.map(sqlWorkerInsert, inserts, 1)
    pool.close()
    pool.join()

def sqlWorkerInsert(sql):
    conn = MySQLdb.connect(host = "127.0.0.1",
                           user = "USER",
                           passwd = "PASS",
                           db = "DB")
    cursor = conn.cursor()
    # Need to do this since AUTOCOMMIT = 0 by default (wtf?)
    cursor.execute("SET AUTOCOMMIT=1")
    cursor.execute(sql)
    cursor.close()
    conn.close()

It may not be the cleanest method to open a new connection to MySQL for every worker subprocess. Since MySQL opens a new thread for every connection, I don’t think this is that big of a deal. However I am always open to new ways of doing things.

2 Responses to “Python Multiprocessing Pools and MySQL”

  1. […] posted here: Python Multiprocessing Pools and MySQL | Eric's Tech Blog By admin | category: python | tags: dirty-example, multiprocessing, mysqldb, python | […]

  2. brian says:

    try:

    conn = None
    def sqlWorkerInsert(sql):
    global conn
    try:
    conn.ping()
    except MySQLdb.OperationalError:
    conn = None
    if not conn:
    conn = MySQLdb.connect(…)
    … the rest of sqlWorkerInsert except the final conn.close()
    cursor.close()

Leave a Reply