Concurrency in Python can be tricky. Users of the threading module know how easy it is to get wrong. One thing that bit me quite a bit was how easy it is to introduce race conditions when working with threads.
The Race Condition
The following example is a fairly common race condition developers run into when using
threading.Thread - we start a program in a thread, then operate on it before it is fully ready, causing our program to explode:
    from threading import Thread


    class ChildProgram:
        def __init__(self):
            self.connection = None

        def start(self):
            # some expensive setup...
            self.connection = True

        def some_logic(self):
            # do something with self.connection
            pass


    program = ChildProgram()
    thread = Thread(target=program.start)
    thread.start()

    # the following explodes because
    # program has not finished setting up!
    program.some_logic()
- We define a ChildProgram to run in a separate thread.
- Then we call thread.start(), which calls program.start(), which has some expensive setup logic.
- Finally, we call program.some_logic(), expecting our ChildProgram to be completely ready, and our program explodes.
thread.start() is a non-blocking operation: when called, our program did not wait until program.start() finished. Instead, it continued on, causing us to call program.some_logic() before all of our setup logic completed. That's because program.start() was executing concurrently in another thread.
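To make that timing visible, here is a minimal sketch. It assumes the expensive setup can be simulated with time.sleep (that delay is my stand-in, not part of the original program) and simply records what the main thread sees right after thread.start() returns, then again after thread.join().

```python
import time
from threading import Thread


class ChildProgram:
    def __init__(self):
        self.connection = None

    def start(self):
        # time.sleep is a stand-in for the expensive setup (assumption for illustration)
        time.sleep(0.2)
        self.connection = True


program = ChildProgram()
thread = Thread(target=program.start)
thread.start()

# thread.start() has already returned, but the setup is still sleeping
connection_right_after_start = program.connection

# join() blocks until program.start() has returned
thread.join()
connection_after_join = program.connection

print(connection_right_after_start, connection_after_join)
```

On a typical run the first value is still None, because the main thread reads it while the child thread is sleeping. Note that join() only helps when the thread's target returns after setup. For a long-running program we need a different signal.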
So how can we be sure our
ChildProgram is fully ready before operating on it from our main thread?
Using the Event Synchronization Primitive
The correct way to solve this problem is by utilizing a
threading.Event, which allows:
...one thread to signal an event and other threads wait for it.
In our case, we want the main thread to wait until the child thread signals that it is ready.
Fortunately, this is trivial:
    from threading import Event, Thread


    class SomeConnection:
        """Placeholder for a real connection class (hypothetical stand-in)."""


    class ChildProgram:
        def __init__(self, ready):
            self.ready = ready
            self.connection = None

        def connect(self):
            # let's make the connection, which is expensive
            self.connection = SomeConnection()
            # then fire the ready event
            self.ready.set()

        def some_logic(self):
            # do something with self.connection
            pass


    ready = Event()
    program = ChildProgram(ready)

    # configure & start thread
    thread = Thread(target=program.connect)
    thread.start()

    # block until ready
    ready.wait()

    # now we can safely use program
    program.some_logic()
Now that we've set up our ready Event, we can be sure that our ChildProgram is fully initialized and that program.some_logic() is safe to call.
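One thing worth considering: if the setup fails and the event is never set, a bare wait() blocks forever. Event.wait() accepts an optional timeout and returns True only if the event was set, so a hedged variant might look like the sketch below. The 5 second timeout, the time.sleep stand-in for setup, and the RuntimeError are my assumptions for illustration, not part of the original program.

```python
import time
from threading import Event, Thread


class ChildProgram:
    def __init__(self, ready):
        self.ready = ready
        self.connection = None

    def connect(self):
        # time.sleep is a stand-in for the expensive connection setup
        time.sleep(0.1)
        self.connection = True
        # signal the waiting thread that setup is complete
        self.ready.set()

    def some_logic(self):
        # return the connection so the caller can see that it was set
        return self.connection


ready = Event()
program = ChildProgram(ready)
thread = Thread(target=program.connect, daemon=True)
thread.start()

# wait() returns True if the event was set, False if the timeout expired first
is_ready = ready.wait(timeout=5.0)
if not is_ready:
    raise RuntimeError("ChildProgram did not become ready within 5 seconds")

result = program.some_logic()
thread.join()
```

Marking the thread as a daemon here is a design choice so that a stuck setup cannot keep the interpreter alive after the main thread gives up.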
Don't rely on is_alive()
One might be tempted to call
Thread.is_alive() to determine if the program in the thread is ready to go. This, however, would be a mistake because:
...this method returns True just before the run() method starts until just after the run() method terminates.
This means that is_alive() will return True even if the program you've started in the thread is not fully ready to accept work. In other words, if the code you are running in the thread takes a while to set up, then relying on is_alive() to determine if the program in the thread is ready to interact with is not enough.
Check out the example updated to use is_alive():
    thread = Thread(target=program.start)
    thread.start()

    # block until thread is alive
    while not thread.is_alive():
        pass

    # the following explodes because
    # program has not finished setting up!
    program.some_logic()
The above is not reliable because
is_alive() will return
True before our
ChildProgram has finished setting up.
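A small self-contained sketch can confirm this. As before, time.sleep is my stand-in for the expensive setup. The snippet captures is_alive() and the connection at the same moment, while the child thread is still setting up.

```python
import time
from threading import Thread


class ChildProgram:
    def __init__(self):
        self.connection = None

    def start(self):
        # time.sleep is a stand-in for the expensive setup (assumption for illustration)
        time.sleep(0.2)
        self.connection = True


program = ChildProgram()
thread = Thread(target=program.start)
thread.start()

# the thread is already alive, yet the setup has not finished
alive_during_setup = thread.is_alive()
connection_during_setup = program.connection

# once the target returns, the thread is no longer alive
thread.join()
alive_after_join = thread.is_alive()

print(alive_during_setup, connection_during_setup, alive_after_join)
```

The thread reports itself alive while the connection is still unset, which is exactly the window in which the polling loop above would let us through too early.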