Python has basic support for threading built in: for example, here’s a program that runs two threads, each of which prints out messages after sleeping a particular amount of time:
from threading import Thread
import time

class MessageThread(Thread):
    def __init__(self, message, sleep):
        self.message = message
        self.sleep = sleep
        Thread.__init__(self) # remember to run Thread init!

    def run(self): # automatically run by 'start'
        i = 0
        while i < 50:
            i += 1
            print(i, self.message)
            time.sleep(self.sleep)

t1 = MessageThread("thread - 1", 1)
t2 = MessageThread("thread - 2", 2)
t1.start()
t2.start()
However, because of the Global Interpreter Lock (GIL) (http://docs.python.org/api/threads.html), CPU-intensive Python code split across threads will not run faster on dual-core CPUs than it will on single-core CPUs.
Briefly, the idea is that the Python interpreter holds a global lock, and no Python code can be executed without holding that lock. (Code execution will still be interleaved, but no two Python instructions can execute at the same time.) Therefore, any Python code that you write (or GIL-naive C/C++ extension code) will not take advantage of multiple CPUs.
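One rough way to see this is to time a CPU-bound function run twice in a row, and then run once in each of two threads. The sketch below is illustrative only: `count_down`, `timed`, `run_sequential` and `run_threaded` are made-up helper names, and the numbers depend on your machine and interpreter (a free-threaded build of CPython would behave differently). On a standard CPython build, the two timings usually come out about the same.

```python
import time
from threading import Thread

def count_down(n):
    # pure-Python busy loop; the running thread needs the GIL to execute it
    while n > 0:
        n -= 1

def timed(func):
    # return the wall-clock seconds taken by calling func()
    start = time.perf_counter()
    func()
    return time.perf_counter() - start

def run_sequential(n):
    # do the work twice, one call after the other
    count_down(n)
    count_down(n)

def run_threaded(n):
    # do the same work, one call per thread
    threads = [Thread(target=count_down, args=(n,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

if __name__ == "__main__":
    N = 5_000_000  # arbitrary workload size, chosen for illustration
    print("sequential:  %.2fs" % timed(lambda: run_sequential(N)))
    print("two threads: %.2fs" % timed(lambda: run_threaded(N)))
```

If the threaded version is not noticeably faster than the sequential one, that is the GIL at work: the two threads take turns rather than running side by side.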
This is intentional:
http://mail.python.org/pipermail/python-3000/2007-May/007414.html
There is a long history of wrangling about the GIL, and there are a couple of good arguments for it. Briefly,
- it dramatically simplifies writing C extension code, because by default, C extension code does not need to know anything about threads.
- putting in locks appropriately to handle places where contention might occur is not only error-prone but makes the code quite slow; locks really affect performance.
- threaded code is difficult to debug, and most people don’t need it, despite having been brainwashed to think that they do ;).
But we don’t care about that: we do want our code to run on multiple CPUs. So first, let’s dip back into C code: what do we have to do to make our C code release the GIL so that it can do a long computation?
Basically, just wrap I/O blocking code or CPU-intensive code in the following macros:
Py_BEGIN_ALLOW_THREADS
...Do some time-consuming operation...
Py_END_ALLOW_THREADS
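This is actually pretty easy to do to your C code, and it does result in that code being run in parallel on multi-core CPUs. The one catch is that the code between the two macros must not touch Python objects or call into the Python C API, because it no longer holds the GIL.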
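You can watch these macros at work from Python, without writing any C. CPython's own `time.sleep` releases the GIL while it blocks, so several threads can sleep at the same time. The sketch below (the helper name `sleep_in_threads` is made up for illustration) starts four threads that each sleep for 0.2 seconds. The total should come out close to 0.2 seconds rather than 0.8, although the exact figure will vary with your system.

```python
import time
from threading import Thread

def sleep_in_threads(num_threads, seconds):
    # start num_threads threads that each block in time.sleep,
    # then return the total wall-clock time until all have finished
    threads = [Thread(target=time.sleep, args=(seconds,))
               for _ in range(num_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

if __name__ == "__main__":
    elapsed = sleep_in_threads(4, 0.2)
    print("4 threads x 0.2s sleep took %.2fs" % elapsed)
```

C extension code that wraps a long computation in the same macros gets the same kind of overlap, only with real work instead of sleeping.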
The big problem with the GIL, however, is that you simply can’t write parallel code in pure Python without jumping through some kind of hoop. Below, we discuss a couple of these hoops ;).