Context managers in Python

By John Lekberg on October 11, 2020.

This week's post is about context managers in Python. You will learn:

What context managers are and why you should care.
How Python uses context managers in the standard library.
How to create your own context managers.
How to handle multiple context managers.
How to register callback functions with contextlib.ExitStack.

What are context managers? Why should I care?

Context managers (introduced in PEP 343) are Python objects that handle setup and teardown for resources like open files and database connections.

You should care about context managers because doing this setup and teardown by hand requires extra code that you have to remember to write every time you work with that resource.

For example, I have a file of numbers:

data.txt

I want to use Python to find the sum of these numbers, so I write this code:

print("opening file")
file = open("data.txt")
total = 0
for line in file:
    total += int(line)
print("closing file")
file.close()
print("total is", total)

opening file
closing file
total is 567

This seems to work fine, but it fails to close the file if an exception is raised:

data.txt

61
25
51
77
not a number
11
53
51
96
76

print("opening file")
file = open("data.txt")
total = 0
for line in file:
    total += int(line)
print("closing file")
file.close()
print("total is", total)

opening file
ValueError: invalid literal for int() with base 10: 'not a number\n'

(Notice that closing file is not printed.)

I can fix this by using a try-statement with a "finally" clause:

print("opening file")
file = open("data.txt")
try:
    total = 0
    for line in file:
        total += int(line)
finally:
    print("closing file")
    file.close()
print("total is", total)

opening file
closing file
ValueError: invalid literal for int() with base 10: 'not a number\n'

(Notice that closing file is printed, even though the exception is raised.)

Because this "try-finally" pattern is so common, Python added context managers to simplify this sort of code.

For example, since files are context managers, I can take this code:

file = open("data.txt")
try:
    total = 0
    for line in file:
        total += int(line)
finally:
    file.close()

And rewrite it to use a with-statement that activates the file's context manager:

with open("data.txt") as file:
    total = 0
    for line in file:
        total += int(line)
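
To check that the with-statement really closes the file when an exception is raised, here is a minimal sketch. It assumes a throwaway data file written to a temporary directory; the file contents and the names path and closed_after_error are made up for illustration:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Write a small data file that contains one bad line (illustrative data).
    path = os.path.join(tmp, "data.txt")
    with open(path, "w") as f:
        f.write("61\nnot a number\n11\n")

    try:
        with open(path) as file:
            total = 0
            for line in file:
                total += int(line)  # raises ValueError on the second line
    except ValueError:
        pass

    # The with-statement closed the file even though the body raised.
    closed_after_error = file.closed

print(closed_after_error)  # True
print(total)               # 61 (only the first line was added)
```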

Context managers are just objects with __enter__ and __exit__ methods. Any object with these methods is considered to be a context manager. __enter__ handles the setup. __exit__ handles the teardown.

Context managers are activated by the with-statement (introduced in PEP 343). __enter__ is called at the beginning of the statement, and __exit__ is called at the end of the statement. Any control flow that exits the statement -- like return, raise, break, or continue -- will also cause __exit__ to be called.
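
Here is a small sketch of that behavior. Recorder and first_even are hypothetical names used only for this illustration; the point is that __exit__ still runs when return leaves the block early:

```python
class Recorder:
    """Hypothetical context manager that logs when it is entered and exited."""

    def __init__(self, log):
        self.log = log

    def __enter__(self):
        self.log.append("enter")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.log.append("exit")
        return False  # do not suppress exceptions


def first_even(numbers, log):
    for n in numbers:
        with Recorder(log):
            if n % 2 == 0:
                return n  # leaves the with-block early; __exit__ still runs
    return None


log = []
result = first_even([1, 3, 4, 5], log)
print(result)  # 4
print(log)     # three "enter"/"exit" pairs, one per number examined
```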

For the exact semantics of the with-statement, please refer to PEP 343 and the description of the with statement in the Python language reference.

How Python uses context managers in the standard library

Python's standard library uses context managers in many different places, but there are 5 main patterns to keep in mind:

Close and open files.
Commit or rollback database transactions.
Acquire and release concurrency locks.
Start and shutdown concurrency/process managers.
Handle specific scenarios with contextlib.

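Close and open files. File objects (anything that inherits from io.IOBase) and several other file-like objects in the standard library are context managers that clean up after themselves on exit, usually by calling .close(). This includes: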

The results of the built-in function open.
The results of urllib.request.urlopen.
Objects from module tempfile: TemporaryFile, NamedTemporaryFile, SpooledTemporaryFile, and TemporaryDirectory (which removes the directory on exit instead of calling .close()).
zipfile.ZipFile objects, and the results of zipfile.ZipFile.open.
tarfile.TarFile objects.

Here's an example:

with open("data.txt") as file:
    print(file.read())

61
25
51
77
not a number
11
53
51
96
76

Commit or rollback database transactions. sqlite3.Connection objects are context managers that will either commit or rollback a transaction, depending on how the context manager is exited. If the context manager exits normally, then the transaction is committed. If an exception is raised, then the transaction is rolled back.

Here's an example: I have a database of accounts with value transfers:

import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Account(name TEXT, change REAL);
INSERT INTO Account
VALUES ('Bob', 300), ('Henry', 200);
""")


def report():
    print(conn.execute("""
      SELECT name, sum(change)
        FROM Account
    GROUP BY name
    ORDER BY name ASC
    """).fetchall())


report()

[('Bob', 300.0), ('Henry', 200.0)]

Bob has $300, and Henry has $200. Then Henry transfers $100 to Bob:

with conn:
    conn.execute(
        "INSERT INTO Account VALUES ('Bob', 100)"
    )
    conn.execute(
        "INSERT INTO Account VALUES ('Henry', -100)"
    )
report()

[('Bob', 400.0), ('Henry', 100.0)]

(Notice that this transaction was committed because there were no issues.)

Then Henry transfers $100 to Bob again, but an error occurs in the middle of the transaction:

with conn:
    conn.execute(
        "INSERT INTO Account VALUES ('Bob', 100)"
    )
    raise Exception()
    conn.execute(
        "INSERT INTO Account VALUES ('Henry', -100)"
    )

Exception:

report()

[('Bob', 400.0), ('Henry', 100.0)]

(Notice that this transaction was rolled back because an exception was raised. Bob did not gain $100.)
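
One thing to keep in mind: using a sqlite3.Connection as a context manager only manages the transaction. It does not close the connection. Here is a sketch that wraps the connection in contextlib.closing so that it is also closed at the end. The table contents and the simulated RuntimeError are for illustration only:

```python
import sqlite3
from contextlib import closing

with closing(sqlite3.connect(":memory:")) as conn:
    conn.execute("CREATE TABLE Account(name TEXT, change REAL)")

    # Exits normally, so this transaction is committed.
    with conn:
        conn.execute("INSERT INTO Account VALUES ('Bob', 300)")

    # Exits with an exception, so this transaction is rolled back.
    try:
        with conn:
            conn.execute("INSERT INTO Account VALUES ('Bob', 100)")
            raise RuntimeError("simulated failure")
    except RuntimeError:
        pass

    rows = conn.execute(
        "SELECT name, sum(change) FROM Account GROUP BY name"
    ).fetchall()

# closing() has closed the connection, so using it now raises an error.
try:
    conn.execute("SELECT 1")
except sqlite3.ProgrammingError:
    connection_closed = True
else:
    connection_closed = False

print(rows)               # [('Bob', 300.0)]
print(connection_closed)  # True
```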

Acquire and release concurrency locks. Concurrency lock objects are context managers that call .acquire() on enter and call .release() on exit. This includes:

Locks from module threading: Lock, RLock, Condition, Semaphore, BoundedSemaphore.
Locks from module multiprocessing: Lock, RLock, Condition, Semaphore, BoundedSemaphore.

Here's an example: I have two tasks that I want to run concurrently that both print output:

import threading
import time


def taskA():
    for i in range(5):
        print("A", i)
        time.sleep(1)


def taskB():
    for i in range(5):
        print("B", i)
        time.sleep(1)


threadA = threading.Thread(target=taskA)
threadB = threading.Thread(target=taskB)
threadA.start()
threadB.start()

AB 0
 0
AB 1
 1
BA 2
 2
BA 3
 3
AB 4
 4

The problem with this is that the outputs from the two tasks overlap. I can fix this by using a lock as a context manager:

print_lock = threading.Lock()


def taskA():
    for i in range(5):
        with print_lock:
            print("A", i)
        time.sleep(1)


def taskB():
    for i in range(5):
        with print_lock:
            print("B", i)
        time.sleep(1)


threadA = threading.Thread(target=taskA)
threadB = threading.Thread(target=taskB)
threadA.start()
threadB.start()

A 0
B 0
A 1
B 1
A 2
B 2
A 3
B 3
A 4
B 4

Start and shutdown concurrency/process managers. Concurrency/process managers are context managers that are started on enter, and are terminated on exit:

multiprocessing.pool.Pool calls .terminate() on exit.
multiprocessing.managers.BaseManager and multiprocessing.managers.SyncManager call .start() on enter (if not already started), and call .shutdown() on exit.
concurrent.futures.Executor objects (ThreadPoolExecutor and ProcessPoolExecutor) call .shutdown() on exit.
subprocess.Popen closes standard file descriptors and waits for the process on exit.

Here's an example: I use a ThreadPoolExecutor object to run two tasks concurrently and then print a message after the executor is shut down:

from concurrent.futures import ThreadPoolExecutor
import threading
import time

print_lock = threading.Lock()


def taskA():
    for i in range(5):
        with print_lock:
            print("A", i)
        time.sleep(1)


def taskB():
    for i in range(5):
        with print_lock:
            print("B", i)
        time.sleep(1)


print("Get ready!")
with ThreadPoolExecutor() as executor:
    executor.submit(taskA)
    executor.submit(taskB)
print("All finished!")

Get ready!
A 0
B 0
A 1
B 1
A 2
B 2
B 3
A 3
A 4
B 4
All finished!

Handle specific scenarios with contextlib. The contextlib module provides several context managers for specific scenarios:

contextlib.closing calls an object's .close() method on exit. This is useful when working with objects that have a .close() method, but don't natively support the context manager protocol.

For example, a generator should have its .close() method called so that it can promptly clean up any resources that it uses. Here's some code that uses contextlib.closing to manage a generator:
```
from contextlib import closing


def genfunc():
    with open("data.txt") as file:
        for line in file:
            yield int(line)


with closing(genfunc()) as numbers:
    for x in numbers:
        if x < 30:
            print("Found x < 30:", x)
            break
```
```
Found x < 30: 25
```
(In this example, if genfunc is not closed, then the opened file in genfunc will not be closed.)

contextlib.suppress allows your code to ignore specific types of exceptions.

For example, I want to print the square of each number in a data file, but the data file contains some bad data:

with open("data.txt") as file:
    for line in file:
        num = int(line)
        print(num ** 2)

3721
625
2601
5929
ValueError: invalid literal for int() with base 10: 'not a number\n'

I can use contextlib.suppress to ignore the ValueError and continue on with the loop:

from contextlib import suppress

with open("data.txt") as file:
    for line in file:
        with suppress(ValueError):
            num = int(line)
            print(num ** 2)
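
To see the effect without a data file, here is a sketch that uses an in-memory list of lines instead (the lines and the squares list are made up for illustration). The bad line is skipped and the loop carries on:

```python
from contextlib import suppress

lines = ["61\n", "25\n", "not a number\n", "11\n"]
squares = []

for line in lines:
    # A ValueError raised inside this block is swallowed, and execution
    # continues after the block (here: the next loop iteration).
    with suppress(ValueError):
        squares.append(int(line) ** 2)

print(squares)  # [3721, 625, 121]
```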

contextlib.redirect_stdout and redirect_stderr can temporarily redirect standard output and standard error.

For example, I want to capture the output of the built-in function help into a string:

from contextlib import redirect_stdout
import io


def help_str(x):
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        help(x)
    return buffer.getvalue()


help_str("pow")

'Help on built-in function pow in module
builtins:\n\npow(base, exp, mod=None)\n    Equivalent to
base**exp with 2 arguments or base**exp % mod with 3
arguments\n    \n    Some types, such as ints, are able to
use a more efficient algorithm when\n    invoked using the
three argument form.\n\n'

contextlib.ExitStack can be used to manage multiple context managers. (Read more about this in the following sections.)

How to create context managers

You can create a context manager by implementing __enter__ and __exit__ methods on an object. E.g.

class Talky:
    def __enter__(self):
        print("Wow! Entering a context.")

    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is not None:
            print("Leaving with an error?!")
        else:
            print("Leaving normally. Boring...")


with Talky():
    pass

Wow! Entering a context.
Leaving normally. Boring...

with Talky():
    raise Exception()

Wow! Entering a context.
Leaving with an error?!
Exception:

What are the three parameters exc_type, exc_value, traceback?

If the context manager is exited without an exception, they are all None.
Otherwise, they contain exception information.

For more details, see "3.3.9. With Statement Context Managers".
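
Here is a sketch that records what __exit__ receives. Inspect is a hypothetical class for this illustration. It also shows two details of the protocol: the return value of __enter__ is bound to the name after "as", and returning a true value from __exit__ suppresses the exception:

```python
class Inspect:
    """Hypothetical context manager that stores the arguments of __exit__."""

    def __init__(self):
        self.seen = None

    def __enter__(self):
        return self  # this is what gets bound to the name after "as"

    def __exit__(self, exc_type, exc_value, traceback):
        self.seen = (exc_type, exc_value, traceback)
        # A true return value suppresses the exception. Here only
        # ZeroDivisionError is suppressed; anything else propagates.
        return exc_type is not None and issubclass(exc_type, ZeroDivisionError)


with Inspect() as clean:
    pass

with Inspect() as failed:
    1 / 0  # suppressed by __exit__, so the program keeps running

print(clean.seen)      # (None, None, None)
print(failed.seen[0])  # <class 'ZeroDivisionError'>
```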

Even if you only need to use one of the methods (__enter__ or __exit__), you need to implement both of them. E.g. just implementing __enter__ causes an error (shown here as an AttributeError; Python 3.11 and later report a TypeError instead):

class TalkEnter:
    def __enter__(self):
        print("Entering a context.")


with TalkEnter():
    pass

AttributeError: __exit__

Here are two useful shortcuts:

You can use the decorator contextlib.contextmanager to create a context manager from a generator function. E.g. the Talky example from above can be rewritten as

from contextlib import contextmanager


@contextmanager
def Talky():
    print("Wow! Entering a context.")
    try:
        yield
    except:
        print("Leaving with an error?!")
        raise
    else:
        print("Leaving normally. Boring...")


with Talky():
    pass

Wow! Entering a context.
Leaving normally. Boring...

with Talky():
    raise Exception()

Wow! Entering a context.
Leaving with an error?!
Exception:

If the resource that you want to manage has a .close() method, then you can use contextlib.closing to turn it into a context manager. E.g.

from contextlib import closing


class Thingy:
    def close(self):
        print("Closing this resource.")


with closing(Thingy()) as x:
    print(x)

<__main__.Thingy object at 0x7ffead1eb520>
Closing this resource.

How to handle multiple context managers

A simple way to handle multiple context managers is nested with-statements. E.g.

with open("in.txt") as fin:
    with open("out.txt", "w") as fout:
        for line in fin:
            fout.write(line.upper())

You can also have multiple items in one with-statement, separated by commas. E.g.

with open("in.txt") as fin, open("out.txt", "w") as fout:
    for line in fin:
        fout.write(line.upper())

(This is equivalent to nesting the context managers, from left to right.)
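
A quick sketch of that ordering, using a hypothetical step context manager that records events in a list: the managers are entered left to right and exited in reverse.

```python
from contextlib import contextmanager

events = []


@contextmanager
def step(name):
    # Hypothetical helper: records when the context is entered and exited.
    events.append(f"enter {name}")
    try:
        yield name
    finally:
        events.append(f"exit {name}")


with step("a"), step("b"):
    events.append("body")

print(events)
# ['enter a', 'enter b', 'body', 'exit b', 'exit a']
```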

But, if you have a large number (or a variable number) of context managers, then you should use contextlib.ExitStack. E.g.

from contextlib import ExitStack

with ExitStack() as stack:
    fin = stack.enter_context(open("in.txt"))
    fout = stack.enter_context(open("out.txt", "w"))
    for line in fin:
        fout.write(line.upper())

Here's a more complex example: I have 20 files, in-1.txt, in-2.txt, ..., in-20.txt of data. Each of these files is sorted, and I want to create out.txt that contains the sorted lines of all the input files. Here's how I would manage these files using ExitStack:

from contextlib import ExitStack
import heapq

filenames = [f"in-{n}.txt" for n in range(1, 21)]
with ExitStack() as stack:
    fins = [
        stack.enter_context(open(filename))
        for filename in filenames
    ]
    fout = stack.enter_context(open("out.txt", "w"))
    for line in heapq.merge(*fins):
        fout.write(line)
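
If you want to try this pattern without preparing 20 files, here is a self-contained sketch. It assumes three small sorted input files written to a temporary directory (the contents are made up), and it collects the merged lines in a list instead of writing out.txt:

```python
import heapq
import os
import tempfile
from contextlib import ExitStack

with tempfile.TemporaryDirectory() as tmp:
    # Create a few small, already-sorted input files (illustrative data).
    paths = []
    contents = (["a\n", "d\n"], ["b\n", "e\n"], ["c\n"])
    for n, lines in enumerate(contents, start=1):
        path = os.path.join(tmp, f"in-{n}.txt")
        with open(path, "w") as f:
            f.writelines(lines)
        paths.append(path)

    # Open a variable number of files; ExitStack closes all of them on exit.
    with ExitStack() as stack:
        fins = [stack.enter_context(open(p)) for p in paths]
        merged = list(heapq.merge(*fins))

    all_closed = all(f.closed for f in fins)

print(merged)      # ['a\n', 'b\n', 'c\n', 'd\n', 'e\n']
print(all_closed)  # True
```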

How to register callback functions with ExitStack

contextlib.ExitStack.callback can also be used to register callback functions that are executed when the context manager is exited. E.g.

from contextlib import ExitStack

with ExitStack() as stack:
    print(1)
    stack.callback(print, 2)
    print(3)
    stack.callback(print, 4)

I find this useful for joining threads in multithreaded code. E.g. Here's a task that I want to run in several threads:

import threading
import time

print_lock = threading.Lock()


def task(n):
    for i in range(n, n + 3):
        with print_lock:
            print(f"{n}-{i}")
        time.sleep(1)

I could manage the threads by creating a list of threads, then looping over that list to start the threads, and then looping over that list again to join the threads:

threads = [
    threading.Thread(target=task, args=[i])
    for i in range(5)
]

for thread in threads:
    thread.start()

for thread in threads:
    thread.join()

0-0
1-1
2-2
3-3
4-4
1-2
3-4
2-3
0-1
4-5
1-3
2-4
0-2
3-5
4-6

But, I can remove the need to explicitly keep track of threads for starting and joining by registering thread.join as a callback to an ExitStack:

with ExitStack() as stack:
    for i in range(5):
        thread = threading.Thread(target=task, args=[i])
        thread.start()
        stack.callback(thread.join)

0-0
1-1
2-2
3-3
4-4
1-2
4-5
0-1
2-3
3-4
1-3
4-6
0-2
2-4
3-5

In conclusion...

In this week's post you learned:

Context managers are useful for managing resources because they work correctly even in the presence of different control flow scenarios, like raising an exception or returning from a function.
Python's standard library uses context managers to open and close files, commit or rollback database transactions, and acquire and release concurrency locks.
Python's contextlib module provides context managers for specific scenarios (like redirecting stdout or suppressing exceptions), as well as utilities like contextlib.contextmanager and contextlib.ExitStack.
contextlib.ExitStack is useful for managing multiple context managers and for registering callback functions (e.g. ensuring that threads are joined at the end of the block).

My challenge to you:

Create a context manager called tag that prints opening and closing XML tags. E.g.

with tag("body"):
    with tag("h1"):
        print("My Document")
    with tag("p"):
        print("Lorem ipsum")
    with tag("ul"):
        for i in range(4):
            with tag("li"):
                print(i)

<body>
<h1>
My Document
</h1>
<p>
Lorem ipsum
</p>
<ul>
<li>
0
</li>
<li>
1
</li>
<li>
2
</li>
<li>
3
</li>
</ul>
</body>

If you enjoyed this week's post, share it with your friends and stay tuned for next week's post. See you then!
