Using enumerate() to simplify Python for-loops
By John Lekberg on January 08, 2020.
When you write for-loops like this:
line_number = 0
for line in file:
line_number += 1
...
Use enumerate
instead.
enumerate (proposed in PEP 279) is a built-in function that solves the loop-counter problem:
names = ["John", "Suzy", "Bill"] for x in names: print(x)
John
Suzy
Bill
for i, x in enumerate(names): print(i, x)
0 John
1 Suzy
2 Billy
for i, x in enumerate(names, start=1): print(i, x)
1 John
2 Suzy
3 Billy
I often use enumerate to solve two problems:
- Keeping track of line numbers in a file.
- Tracking progress on long running tasks.
Here's how.
Keeping track of line numbers in a file
I'm writing an article and I write down TODOs for unfinished tasks needed to complete the article:
article.txt
1 There are many graph path-finding algorithms to
2 choose from and their effectiveness depends on the
3 structure of the graph. For this graph:
4
5 TODO: compute upper bound on graph size
6
7 TODO: justify algorithm choice based on graph size
8
9 I will use Breadth First Search
10 (TODO: find link to algorithm). Here is a Python
11 implementation of BFS:
12
13 TODO: add Python implementation of BFS
I don't want to accidentally publish an unfinished article, so I wrote a Python script that lists any TODOs and which lines they appear on:
with open("article.txt") as file: for line_number, line in enumerate(file, start=1): if "TODO" in line: print("line", line_number, ":", line)
line 5 : TODO: compute upper bound on graph size
line 7 : TODO: justify algorithm choice based on graph size
line 10 : (TODO: find link to algorithm). Here is a Python
line 13 : TODO: add Python implementation of BFS
Without enumerate
, I would have written
with open("article.txt") as file:
line_number = 0
for line in file:
line_number += 1
if "TODO" in line:
print("line", line_number, ":", line)
But this is less readable for two reasons:
- The line counting code is split across multiple lines. (
line_number = 0
andline_number += 1
.) - Even though I wrote
line_number = 0
, you won't know if I start counting from 0 or from 1 until you find where I wroteline_number += 1
. If it appears at the end of the for-loop, I count 0, 1, 2, ...; if it appears at the beginning of the for-loop, I count 1, 2, 3, ....
I use enumerate
to keep the line counting code in one place and clearly
communicate the purpose of line_number
: counting the lines in file
,
starting from 1.
Tracking progress on long running tasks
I have a script that downloads gigabytes of data, processes it, and streams the results to a CSV file:
for record in download_records():
result = process_record(record)
export_result(result)
print("done")
(I omit the definitions of download_records
, process_record
, and
export_result
to focus on the role of enumerate
in this code.)
When I run this script, I wait 30 minutes to see:
done
But this gives me no feedback about the script's progress. The script could
hang and I wouldn't notice. To fix this, I use enumerate
to count how many
records I've processed and output that information as the script executes.
for i, record in enumerate(download_records(), start=1):
print("processing record", i)
result = process_record(record)
export_result(result)
print("done")
Running this immediately starts to give me feedback:
processing record 1
processing record 2
processing record 3
processing record 4
...
processing record 1443803
done
I also use two variations of this technique:
- Printing out every X steps (e.g. 10,000 steps), because sometimes printing out information on every step substantially slows my code down.
- Printing out a percentage if I know an upper bound on how many steps my code will take.
for i, record in enumerate(download_records(), start=1): if i % 10_000 == 0: print("processed", i, "records") result = process_record(record) export_result(result) print("done")
processed 10000 records
processed 20000 records
processed 30000 records
...
processed 1440000 records
done
for i, record in enumerate(download_records(), start=1): print(format(i / 1443803, "%")) result = process_record(record) export_result(result) print("done")
0.000000%
0.000691%
0.001381%
...
99.998619%
99.999309%
100.000000%
done
(If you spot any errors or typos on this post, contact me via my contact page.)