
Comprehensions and generator expressions in Python

By John Lekberg on July 16, 2020.


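This week's post is about using comprehensions and generator expressions in Python. You will learn what comprehensions are, how to use multiple for-loops and if-statements in a comprehension, what generator expressions are, and what generator expressions are useful for.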

What are comprehensions?

Comprehensions are a way to create lists, sets, and dictionaries by transforming and filtering other iterables. E.g.

[ (x ** 2) for x in [1, 1, 2, 3, 5] ]
[1, 1, 4, 9, 25]
{
    word.casefold(): word
    for word in ["Hello", "THERE", "geneRAL"]
}
{'hello': 'Hello', 'there': 'THERE', 'general': 'geneRAL'}
real_commands = { "sit", "stay", "heel", "wait" }
issued_commands = {
    "SIT", "Sit", "Bark", "jump", "heel", "BARK"
}
{
    command
    for command in issued_commands
    if command.lower() not in real_commands
}
{'BARK', 'Bark', 'jump'}

Here's how I would write the above code without using comprehensions.
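A minimal sketch of the loop-based equivalents, assuming plain for-loops with append, item assignment, and add. The result names squares, by_casefold, and unrecognized are illustrative and do not appear in the original:

```python
# Equivalent of the list comprehension: square each number.
squares = []
for x in [1, 1, 2, 3, 5]:
    squares.append(x ** 2)

# Equivalent of the dictionary comprehension: map the casefolded
# word to the original word.
by_casefold = {}
for word in ["Hello", "THERE", "geneRAL"]:
    by_casefold[word.casefold()] = word

# Equivalent of the set comprehension: keep issued commands whose
# lowercase form is not a real command.
real_commands = {"sit", "stay", "heel", "wait"}
issued_commands = {"SIT", "Sit", "Bark", "jump", "heel", "BARK"}
unrecognized = set()
for command in issued_commands:
    if command.lower() not in real_commands:
        unrecognized.add(command)
```

Each loop needs an empty container up front and an explicit update step, which the comprehension folds into a single expression.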

There are three types of comprehensions: list comprehensions, set comprehensions, and dictionary comprehensions.
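A small side-by-side sketch of list, set, and dictionary comprehensions applied to the same input. The names data, as_list, as_set, and as_dict are illustrative, not from the original:

```python
data = [1, 1, 2, 3, 5]

# List comprehension: square brackets, keeps order and duplicates.
as_list = [x ** 2 for x in data]

# Set comprehension: curly braces, duplicates collapse.
as_set = {x ** 2 for x in data}

# Dictionary comprehension: curly braces with "key: value".
as_dict = {x: x ** 2 for x in data}
```

Here as_list is [1, 1, 4, 9, 25], as_set is {1, 4, 9, 25}, and as_dict is {1: 1, 2: 4, 3: 9, 5: 25}.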

Using multiple for-loops and if-statements in a comprehension

Comprehensions can have multiple for-loops and if-statements. For example, here is a list comprehension that generates some Pythagorean triples:

domain = range(1, 100)
[
    (a, b, c)
    for a in domain
    for b in domain
    for c in domain
    if a < b
    if b < c
    if a ** 2 + b ** 2 == c ** 2
]
[(3, 4, 5),
 (5, 12, 13),
 (6, 8, 10),
 ...
 (57, 76, 95),
 (60, 63, 87),
 (65, 72, 97)]

Without using a list comprehension, I would write this as:

domain = range(1, 100)
triples = []
for a in domain:
    for b in domain:
        for c in domain:
            if a < b:
                if b < c:
                    if a ** 2 + b ** 2 == c ** 2:
                        triples.append((a, b, c))
triples
[(3, 4, 5),
 (5, 12, 13),
 (6, 8, 10),
 ...
 (57, 76, 95),
 (60, 63, 87),
 (65, 72, 97)]

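The for-loops and if-statements can be mixed together: an if-statement may appear directly after the for-loop whose variable it tests. Doing this can lead to speedups, because a failed check skips the remaining inner loops entirely.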
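A sketch of the Pythagorean triples comprehension with the checks interleaved, assuming `if a < b` is moved directly after `for b in domain` so that the loop over c never starts for pairs that fail it. This is one plausible reordering, not necessarily the one originally shown:

```python
domain = range(1, 100)

triples = [
    (a, b, c)
    for a in domain
    for b in domain
    if a < b            # checked before the loop over c begins
    for c in domain
    if b < c
    if a ** 2 + b ** 2 == c ** 2
]
```

The result is the same list as before; the difference is that roughly half of the (a, b) pairs are rejected without iterating over c at all.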

What are generator expressions?

Generator expressions (PEP 289) are a way to create generators (a kind of iterator) by transforming and filtering other iterables. Think of generator expressions as "iterator comprehensions". For example, here's a generator expression that produces the first few square numbers:

( x ** 2 for x in range(10) )
<generator object <genexpr> at 0x10511cc80>

Generator expressions have a similar syntax to comprehensions. They look like

( exp for-in ... )

To access the data in the generator, I need to consume the iterator using, e.g., list:

list(( x ** 2 for x in range(10) ))
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
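Consuming a generator does not have to happen all at once. A short sketch, using the built-in next and list, of how values are produced on demand and how a generator is exhausted after one pass (the variable names are illustrative):

```python
squares = (x ** 2 for x in range(10))

# Each call to next() computes just one more value.
first = next(squares)
second = next(squares)

# list() consumes whatever is left...
rest = list(squares)

# ...after which the generator is exhausted and yields nothing more.
again = list(squares)
```

Here first is 0, second is 1, rest is [4, 9, 16, 25, 36, 49, 64, 81], and again is the empty list.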

NOTE: When a generator expression is the only argument to a function, the parentheses do not need to be written:

list(( x ** 2 for x in range(10) ))
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
list(x ** 2 for x in range(10))
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
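The shortcut applies only when the generator expression is the sole argument. A small sketch using sum and its optional start value to show the case where the parentheses are still required:

```python
# Sole argument: the extra parentheses may be dropped.
total = sum(x ** 2 for x in range(10))

# With a second argument (here, sum's start value) the generator
# expression must keep its own parentheses. Writing
#     sum(x ** 2 for x in range(10), 100)
# is rejected by Python as a SyntaxError.
total_from_100 = sum((x ** 2 for x in range(10)), 100)
```

Here total is 285 and total_from_100 is 385.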

What's the use of generator expressions?

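Because generator expressions produce items one at a time instead of building a whole container first, they can keep memory usage lower than comprehensions. I like to use them with functions that consume an iterable directly.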
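As a hedged illustration, here are a few built-ins that accept any iterable and so pair naturally with generator expressions. The choice of sum, any, all, and max is an assumption made for this sketch, not a list taken from the original, and the variable names are illustrative:

```python
import sys

words = ["Hello", "THERE", "geneRAL"]

# Reductions that consume the generator one item at a time.
total = sum(x ** 2 for x in range(10))
has_upper = any(w.isupper() for w in words)
all_alpha = all(w.isalpha() for w in words)
longest = max(len(w) for w in words)

# Rough memory comparison. sys.getsizeof reports only the size of
# the list or generator object itself, not of the integers inside.
n = 100_000
list_size = sys.getsizeof([x ** 2 for x in range(n)])
gen_size = sys.getsizeof(x ** 2 for x in range(n))
```

Here total is 285, has_upper and all_alpha are both True, and longest is 7. The generator object stays small no matter how large n is, while the list grows with n.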

In conclusion...

In this week's post, you learned about comprehensions and generator expressions, which are a concise way to create lists (and other containers) by transforming and subsetting iterables. The comprehension "notation" is similar to set-builder notation in mathematics.

My challenge to you:

I have a 3-byte passcode with this SHA-256 checksum:

6d6125cc4538aaec9dbef490ab1091a6cb4af5348f96a5cb0bfeeeda6edfebbe

Use a comprehension or generator expression to figure out which 3-byte passcode produces this checksum.

You can generate the checksum using hashlib.sha256:

from hashlib import sha256

sha256(b"Hello World").hexdigest()
'a591a6d40bf420404a011733cfb7b190d62c65bf0bcda32b57b277d9ad9f146e'
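One possible shape for such a search, sketched on a made-up 2-byte passcode rather than the challenge itself. The helper find_passcode, its parameters, and the demo passcode b"hi" are all hypothetical; itertools.product is used here to enumerate every byte combination:

```python
from hashlib import sha256
from itertools import product


def find_passcode(target_hex, length, alphabet=range(256)):
    """Return the first `length`-byte string whose SHA-256 hex digest
    equals `target_hex`, or None if no candidate matches."""
    # Lazily enumerate every byte string of the given length.
    candidates = (
        bytes(combo) for combo in product(alphabet, repeat=length)
    )
    # next() stops at the first match instead of hashing everything.
    return next(
        (c for c in candidates if sha256(c).hexdigest() == target_hex),
        None,
    )


# Demo on a known 2-byte passcode (65,536 candidates at most).
demo_target = sha256(b"hi").hexdigest()
found = find_passcode(demo_target, length=2)
```

Because both generator expressions are lazy, no list of candidates is ever built, and the search ends as soon as a match appears. A 3-byte search covers 256 ** 3 = 16,777,216 candidates, so expect it to take noticeably longer.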

If you enjoyed this week's post, share it with your friends and stay tuned for next week's post. See you then!


(If you spot any errors or typos on this post, contact me via my contact page.)