Dictionary view objects in Python
By John Lekberg on September 19, 2020.
This week's post is about dictionary view objects in Python. You will learn:
- What dictionary view objects are and what you can do with them.
- What differentiates dictionary view objects from tuple objects and other collections.
- When to not use dictionary view objects.
- The special, set-like behavior of dict.keys.
What are dictionary view objects?
Dictionary view objects are read-only proxies to dictionary objects.
You can create dictionary view objects with the built-in methods dict.keys, dict.values, and dict.items, as well as with the class types.MappingProxyType. E.g.
d = {"a": 1, "b": 2}
d.keys()
dict_keys(['a', 'b'])
d.values()
dict_values([1, 2])
d.items()
dict_items([('a', 1), ('b', 2)])
from types import MappingProxyType
MappingProxyType(d)
mappingproxy({'a': 1, 'b': 2})
Notice that these dictionary view objects have their own types. They are not list objects or tuple objects.
type(d.keys())
dict_keys
type(d.values())
dict_values
type(d.items())
dict_items
type(MappingProxyType(d))
mappingproxy
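One practical consequence of these types not being list objects or tuple objects is that they don't support indexing. Here is a minimal sketch, assuming you want the first key of a dictionary (the helper name first_key is made up for this example):

```python
d = {"a": 1, "b": 2}


def first_key(mapping):
    """Return the first key of a non-empty mapping (hypothetical helper)."""
    # Views are iterable but not subscriptable, so take the first
    # element from an iterator instead of writing mapping.keys()[0].
    return next(iter(mapping.keys()))


try:
    d.keys()[0]
    indexing_works = True
except TypeError:
    # dict_keys objects do not implement __getitem__.
    indexing_works = False

print(indexing_works)      # False
print(first_key(d))        # a
print(list(d.keys())[0])   # a (copies all keys into a list first)
```

If you need indexing more than once, converting the view to a list object or tuple object is usually simpler.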
dict.keys, dict.values, and dict.items come from PEP 3106 "Revamping dict.keys(), .values(), and .items()". In Python 2, these methods returned list objects instead of dictionary view objects.
types.MappingProxyType was introduced in Python 3.3. For details, see "What's New In Python 3.3", and "Issue14386 - Expose dict_proxy internal type as types.MappingProxyType".
What can you do with dictionary view objects?
With dict.keys, dict.values, and dict.items, there are 3 things that you can do:
- You can measure their size with the built-in function len:
d = {"a": 1, "b": 2}
len(d.keys())
2
- You can do membership testing:
"a" in d.keys()
True
2 in d.values()
True
("a", 1) in d.items()
True
- You can iterate over the elements:
for k in d.keys():
    print(k)
a
b
tuple(d.keys())
('a', 'b')
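These three operations combine naturally. As a rough sketch (the function name describe and its parameters are made up for this example), here is a helper that uses len, membership testing, and iteration on the views of one dictionary:

```python
def describe(d, key, value):
    """Summarize a dictionary using only its view objects."""
    keys, values, items = d.keys(), d.values(), d.items()
    return {
        "size": len(keys),                  # len works on every view
        "has_key": key in keys,             # same answer as `key in d`
        "has_value": value in values,       # scans the values one by one
        "has_pair": (key, value) in items,  # checks key and value together
        "sorted_keys": sorted(keys),        # iteration feeds any consumer
    }


summary = describe({"a": 1, "b": 2}, "a", 2)
print(summary)
```

Here "has_pair" is False, because "a" maps to 1, not 2, even though both "a" and 2 appear in the dictionary.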
With types.MappingProxyType objects, you can do any dictionary operation that doesn't attempt to mutate the dictionary:
from types import MappingProxyType
d = MappingProxyType({"a": 1, "b": 2})
d["a"]
1
d.keys()
dict_keys(['a', 'b'])
d["a"] = 4
TypeError: 'mappingproxy' object does not support item assignment
del d["a"]
TypeError: 'mappingproxy' object does not support item deletion
Operations that attempt to mutate the dictionary will raise an exception.
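One place this can be handy is exposing internal state without letting callers change it. The sketch below assumes a hypothetical Menu class (not from any library) that keeps a private dictionary and hands out a proxy:

```python
from types import MappingProxyType


class Menu:
    """Hypothetical class that owns a dictionary of ratings."""

    def __init__(self):
        self._ratings = {"Sushi": 10}
        # Callers get a proxy, so they can read but not write.
        self.ratings = MappingProxyType(self._ratings)

    def rate(self, food, rating):
        # Only the class itself touches the real dictionary.
        self._ratings[food] = rating


menu = Menu()
menu.rate("Pizza", 9)

# The proxy reflects changes made to the underlying dictionary.
print(dict(menu.ratings))  # {'Sushi': 10, 'Pizza': 9}

try:
    menu.ratings["Soda"] = 1
    blocked = False
except TypeError:
    blocked = True

print(blocked)  # True
```

The proxy is created once and never needs to be refreshed, because it always reads through to the dictionary it wraps.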
How are dictionary view objects different from tuple objects and other collections?
Because dictionary view objects are proxies to dictionary objects:
- Dictionary view objects stay in sync with the underlying dictionary object: when the dictionary changes, the view changes too.
E.g. removing an element updates the view created by dict.keys:
d = {"a": 1, "b": 2}
d_view = d.keys()
d_view
dict_keys(['a', 'b'])
del d["a"]
d_view
dict_keys(['b'])
But this doesn't update a tuple object created from dict.keys:
d = {"a": 1, "b": 2}
d_tuple = tuple(d.keys())
d_tuple
('a', 'b')
del d["a"]
d_tuple
('a', 'b')
- Dictionary view objects are faster to create than, e.g., tuple objects. A view only stores a reference to its dictionary, while a tuple has to copy a reference to every key.
E.g. I have a dictionary with 100,000,000 entries. On my computer, creating a dictionary view object of the dictionary object's keys takes too little time for cProfile to measure (it reports 0.000 seconds):
CAUTION: dict.fromkeys(range(100_000_000)) uses a lot of memory. If you want to run this code on your computer, then I suggest that you start by creating something smaller (e.g. dict.fromkeys(range(10_000))).
import cProfile
d = dict.fromkeys(range(100_000_000))
cProfile.run("d.keys()")
4 function calls in 0.000 seconds
And creating a tuple object of the dictionary object's keys takes 19.495 seconds:
cProfile.run('tuple(d.keys())')
4 function calls in 19.495 seconds
(Measured using cProfile.run.)
- Dictionary view objects take up less memory than, e.g., tuple objects.
E.g. I have a dictionary with 100,000,000 entries. On my computer, a dictionary view object of the dictionary object's keys takes 56 B of memory:
import sys
d = dict.fromkeys(range(100_000_000))
sys.getsizeof(d.keys())
56
And creating a tuple object of the dictionary object's keys takes 800 MB of memory:
sys.getsizeof(tuple(d.keys()))
800000056
(Measured using sys.getsizeof, which counts only the container itself, not the keys it refers to.)
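If you want to check the memory claim without allocating 100,000,000 entries, here is a small sketch. It assumes CPython, since sys.getsizeof is implementation-specific, and it compares sizes rather than printing exact byte counts, which vary between Python versions:

```python
import sys

small = dict.fromkeys(range(10))
large = dict.fromkeys(range(10_000))

# A view stores a reference to its dictionary, so its size is constant.
view_sizes = (sys.getsizeof(small.keys()), sys.getsizeof(large.keys()))

# A tuple stores one reference per key, so its size grows with the dictionary.
tuple_sizes = (sys.getsizeof(tuple(small)), sys.getsizeof(tuple(large)))

print(view_sizes[0] == view_sizes[1])   # True
print(tuple_sizes[0] < tuple_sizes[1])  # True
```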
When should you not use dictionary view objects?
Iterating over a dictionary view object and changing the size of the underlying dictionary at the same time will raise an exception. E.g.
d = {"Sushi": 10, "Pizza": 9, "Spinach": 3, "Pickles": 2}
for food, rating in d.items():
    if rating < 8:
        del d[food]
RuntimeError: dictionary changed size during iteration
You can resolve this problem by converting the dictionary view object into a tuple object (or some other collection that is not a proxy). E.g.
d = {"Sushi": 10, "Pizza": 9, "Spinach": 3, "Pickles": 2}
for food, rating in tuple(d.items()):
    if rating < 8:
        del d[food]
d
{'Sushi': 10, 'Pizza': 9}
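Another option, if you don't need to change the dictionary in place, is to build a new dictionary from the entries you want to keep. This is only a sketch of that alternative (the name kept is made up for this example):

```python
d = {"Sushi": 10, "Pizza": 9, "Spinach": 3, "Pickles": 2}

# Build a new dictionary with only the entries to keep.
# Nothing is deleted while d.items() is being iterated.
kept = {food: rating for food, rating in d.items() if rating >= 8}

print(kept)    # {'Sushi': 10, 'Pizza': 9}
print(len(d))  # 4 (the original dictionary is untouched)
```

Keep in mind that this creates a second dictionary, so any other code holding a reference to d still sees all four entries.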
The special, set-like behavior of dict.keys
Besides the general behavior described above -- measuring size, membership testing, and iteration -- the dictionary view object created by dict.keys behaves like a set object:
from collections.abc import Set
isinstance(dict().keys(), Set)
True
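(dict.keys creates an object that implements the abstract base class collections.abc.Set. The object created by dict.items implements it too, but the one created by dict.values does not.)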
Why is this useful?
- It's easy to check that two dictionary objects have the same set of keys:
d1 = {"Sushi": 10, "Pizza": 9}
d2 = {"Pizza": 7, "Sushi": 10}
d1.keys() == d2.keys()
True
d3 = {"Pizza": 9}
d1.keys() == d3.keys()
False
- It's also easy to check that a dictionary has all "required" keys:
required_keys = {"Sushi", "Pizza"}
records = [
    {"Sushi": 10, "Pizza": 9, "Soda": 7},
    {"Sushi": 7, "Pizza": 10},
    {"Pizza": 9, "Soda": 3},
]
for record in records:
    if not (record.keys() >= required_keys):
        print(
            f"Bad record {record}.",
            f"Missing keys {required_keys - record.keys()}.",
        )
Bad record {'Pizza': 9, 'Soda': 3}. Missing keys {'Sushi'}.
(In this code, the operator >= tests for a superset and the operator - performs the set difference operation.)
- And it's just as easy to check that a dictionary doesn't contain any "illegal" keys:
illegal_keys = {"SSN", "Password"}
records = [
    {"Name": "Bob", "SSN": "000-00-0000"},
    {"Name": "John"},
    {"Name": "Chris", "Password": "12345"},
]
for record in records:
    record_illegal_keys = record.keys() & illegal_keys
    if len(record_illegal_keys) > 0:
        print(
            f"Bad record {record}.",
            f"Contains illegal keys {record_illegal_keys}.",
        )
Bad record {'Name': 'Bob', 'SSN': '000-00-0000'}. Contains illegal keys {'SSN'}.
Bad record {'Name': 'Chris', 'Password': '12345'}. Contains illegal keys {'Password'}.
(In this code, the operator & performs the set intersection operation.)
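The same operators also make it short work to compare two versions of a dictionary by key. Here is a minimal sketch, assuming a hypothetical helper named diff_keys (not part of the standard library):

```python
def diff_keys(old, new):
    """Compare two dictionaries by key (hypothetical helper)."""
    return {
        "added": new.keys() - old.keys(),    # keys only in new
        "removed": old.keys() - new.keys(),  # keys only in old
        "common": old.keys() & new.keys(),   # keys in both
    }


old = {"Sushi": 10, "Pizza": 9}
new = {"Pizza": 7, "Soda": 3}
result = diff_keys(old, new)
print(result["added"])    # {'Soda'}
print(result["removed"])  # {'Sushi'}
print(result["common"])   # {'Pizza'}
```

Each result is an ordinary set object, so it no longer changes when the dictionaries change.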
In conclusion...
In this week's post, you learned about dictionary view objects. You learned how they differ from other collections (like list objects and tuple objects). You also learned when they are useful, and when not to use them. And you learned about the special, set-like behavior of dict.keys.
My challenge to you:
Create a Python function, validate, that takes, as parameters,
- dicts -- A list object of dictionary objects.
- required_keys -- A set object of "required" keys.
- illegal_keys -- A set object of "illegal" keys.
And this function will report any dictionary object that
- Doesn't contain all of the required keys, OR
- Contains any of the illegal keys.
Then, try calling your function with these arguments:
validate(
    dicts=[
        {"Name": "Bob", "Age": 43, "SSN": "000-00-0000"},
        {"Name": "John", "Age": 32},
        {"Name": "Chris", "Age": 27, "Password": "12345"},
        {"Age": 33},
    ],
    required_keys={"Name"},
    illegal_keys={"SSN", "Password"},
)
If you enjoyed this week's post, share it with your friends and stay tuned for next week's post. See you then!
(If you spot any errors or typos on this post, contact me via my contact page.)