Using enumerate() to simplify Python for-loops

By John Lekberg on January 08, 2020.

You may find yourself writing for-loops like this:

line_number = 0
for line in file:
    line_number += 1
    ...

Use enumerate instead.

enumerate (proposed in PEP 279) is a built-in function that solves the loop-counter problem:

names = ["John", "Suzy", "Bill"]

for x in names:
    print(x)

John
Suzy
Bill

for i, x in enumerate(names):
    print(i, x)

0 John
1 Suzy
2 Bill

for i, x in enumerate(names, start=1):
    print(i, x)

1 John
2 Suzy
3 Bill
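
To make the behavior concrete, here is a rough pure-Python sketch of what enumerate does, modeled on the equivalent shown in the Python documentation. The name my_enumerate is hypothetical and only for illustration; it is not how the built-in is actually implemented.

```python
def my_enumerate(iterable, start=0):
    """Illustrative stand-in for the built-in enumerate (hypothetical name)."""
    n = start
    for item in iterable:
        # Yield the counter together with the item, then advance the counter.
        yield n, item
        n += 1


names = ["John", "Suzy", "Bill"]

# Both calls produce the same (index, item) pairs.
pairs = list(my_enumerate(names, start=1))
print(pairs)
# [(1, 'John'), (2, 'Suzy'), (3, 'Bill')]
print(pairs == list(enumerate(names, start=1)))
# True
```

Like the built-in, this sketch is lazy: it produces one pair at a time, so it also works on files and other iterables that are too large to hold in memory.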

I often use enumerate to solve two problems:

Keeping track of line numbers in a file.
Tracking progress on long-running tasks.

Here's how.

Keeping track of line numbers in a file

I'm writing an article, and I mark each unfinished task with a TODO:

article.txt

 1  There are many graph path-finding algorithms to
 2  choose from and their effectiveness depends on the
 3  structure of the graph. For this graph:
 4  
 5  TODO: compute upper bound on graph size
 6  
 7  TODO: justify algorithm choice based on graph size
 8  
 9  I will use Breadth First Search
10  (TODO: find link to algorithm). Here is a Python
11  implementation of BFS:
12  
13  TODO: add Python implementation of BFS

I don't want to accidentally publish an unfinished article, so I wrote a Python script that lists any TODOs and which lines they appear on:

with open("article.txt") as file:
    for line_number, line in enumerate(file, start=1):
        if "TODO" in line:
            print("line", line_number, ":", line)

line 5 : TODO: compute upper bound on graph size

line 7 : TODO: justify algorithm choice based on graph size

line 10 : (TODO: find link to algorithm). Here is a Python

line 13 : TODO: add Python implementation of BFS

Without enumerate, I would have written:

with open("article.txt") as file:
    line_number = 0
    for line in file:
        line_number += 1
        if "TODO" in line:
            print("line", line_number, ":", line)

But this is less readable for two reasons:

The line counting code is split across multiple lines. (line_number = 0 and line_number += 1.)
Even though I wrote line_number = 0, you won't know if I start counting from 0 or from 1 until you find where I wrote line_number += 1. If it appears at the end of the for-loop, I count 0, 1, 2, ...; if it appears at the beginning of the for-loop, I count 1, 2, 3, ....

I use enumerate to keep the line counting code in one place and clearly communicate the purpose of line_number: counting the lines in file, starting from 1.
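
The same loop can be wrapped in a small function so that the TODO check is reusable and easy to test. This is a minimal sketch: the function name find_todos and the sample lines are made up for illustration, and it strips the trailing newline from each line so the report prints without blank lines in between.

```python
def find_todos(lines):
    """Return (line_number, text) pairs for lines containing "TODO".

    Hypothetical helper; `lines` can be an open file or any iterable of strings.
    """
    return [
        (line_number, line.rstrip("\n"))
        for line_number, line in enumerate(lines, start=1)
        if "TODO" in line
    ]


# Sample input standing in for a file (made up for this example).
sample = [
    "Intro paragraph\n",
    "TODO: add a diagram\n",
    "Closing paragraph\n",
    "TODO: proofread\n",
]

for line_number, text in find_todos(sample):
    print(f"line {line_number}: {text}")
# line 2: TODO: add a diagram
# line 4: TODO: proofread
```

Because an open file is itself an iterable of lines, the same function could be called as find_todos(file) inside a with open(...) block.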

Tracking progress on long-running tasks

I have a script that downloads gigabytes of data, processes it, and streams the results to a CSV file:

for record in download_records():
    result = process_record(record)
    export_result(result)
print("done")

(I omit the definitions of download_records, process_record, and export_result to focus on the role of enumerate in this code.)

When I run this script, I wait 30 minutes to see:

done

But this gives me no feedback about the script's progress. The script could hang and I wouldn't notice. To fix this, I use enumerate to count how many records I've processed and output that information as the script executes.

for i, record in enumerate(download_records(), start=1):
    print("processing record", i)
    result = process_record(record)
    export_result(result)
print("done")

Running this immediately starts to give me feedback:

processing record 1
processing record 2
processing record 3
processing record 4
...
processing record 1443803
done

I also use two variations of this technique:

Printing out every X steps (e.g. 10,000 steps), because sometimes printing out information on every step substantially slows my code down.
Printing out a percentage if I know an upper bound on how many steps my code will take.

for i, record in enumerate(download_records(), start=1):
    if i % 10_000 == 0:
        print("processed", i, "records")
    result = process_record(record)
    export_result(result)
print("done")

processed 10000 records
processed 20000 records
processed 30000 records
...
processed 1440000 records
done
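
Note that with this variation the last partial batch is never reported: the output above stops at 1440000 even though more records were processed. One possible way to handle that is to print the final count after the loop. The sketch below uses a hypothetical process_all function and a small made-up batch size so that it can run on its own; the counter is initialized to 0 so that an empty input does not leave it undefined.

```python
def process_all(records, every=10_000):
    """Process records, reporting every `every` steps and once at the end.

    Hypothetical wrapper; the real processing work is elided.
    """
    count = 0  # Stays 0 if `records` is empty, so the final report still works.
    for count, record in enumerate(records, start=1):
        if count % every == 0:
            print("processed", count, "records")
        # ... process and export the record here ...
    print("done:", count, "records in total")
    return count


process_all(range(25), every=10)
# processed 10 records
# processed 20 records
# done: 25 records in total
```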

for i, record in enumerate(download_records(), start=1):
    print(format(i / 1443803, "%"))
    result = process_record(record)
    export_result(result)
print("done")

0.000069%
0.000139%
0.000208%
...
99.999861%
99.999931%
100.000000%
done
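
Hard-coding the total works when it is known in advance. When the records are held in a list (or anything else that supports len), the total can be taken from the data instead. A minimal sketch, assuming a hypothetical progress_lines helper and made-up record names:

```python
def progress_lines(items):
    """Yield (item, label) pairs, where label is the percentage reached.

    Hypothetical helper; requires `items` to support len().
    """
    total = len(items)
    for i, item in enumerate(items, start=1):
        # The "%" format type multiplies by 100 and appends a percent sign;
        # ".1" limits the output to one decimal place.
        yield item, format(i / total, ".1%")


for record, label in progress_lines(["a", "b", "c", "d"]):
    print(label, record)
# 25.0% a
# 50.0% b
# 75.0% c
# 100.0% d
```

This does not work for generators, which have no length, so a streaming source would still need a total supplied from elsewhere.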
