Python tips: reverse a dictionary
In Python, if you want to reverse a dictionary (
Method 1: values are unique, solution based on
For the Python 2 version, use
For Python 3.x
For Python 3.x
For Python 3.x
For Python 3.x
The original dictionaries with unique values are initialized using this template:
e.g.
The other dictionaries with non-unique values are initialized with the same kind of template but with the difference that each value in the original dictionary is associated with exactly two keys.
dict
), i.e. swap the keys and values, there are three methods that can help you achieve that goal depending on your needs:- Method 1: makes use of
dict
comprehension, and thedict
must contain unique values - Method 2: makes use of
dict.get()
, and thedict
doesn't contain unique values - Method 3: makes use of map(reversed, iter), and you want the reverse
dict
to preserve the type and order of the originaldict
(if for example it is anOrderedDict
)
I provided code implementations for each methods that work for Python 2.7.x and 3+, and the embedded code can be executed directly from the blog post through JDoodle, an online compiler and editor for many other languages (68 in total such as C/C++, Java, Go, SQL).
I also compiled the average running times of the different methods in a table at the end ( Spoiler alert: Click to reveal winning method)
The code used for computing the average run times of the three methods can be found on my github.
I also compiled the average running times of the different methods in a table at the end ( Spoiler alert: Click to reveal winning method)
The code used for computing the average run times of the three methods can be found on my github.
Contents
4.1 Major updates
4.2 Results
Method 1: values are unique, solution based on dict
comprehension
In the first method, the assumption we make is that your
dict
must contain unique values. For the first method to work, this assumption about unicity in the values must hold.
The method makes use of
dict
comprehension by swapping the keys and values. Here are the code implementations:Code (Python 2.7 & 3)
For Python 2.7.xmy_dict = { 'a': 1, 'b':2, 'c': 3, 'd':4, 'e':5} inv_dict = {v: k for k, v in my_dict.iteritems()}
For the Python 2 version, use
dict.iteritems()
which is more efficient than dict.items()
.Table 1 shows the average running times for the different methods in Python 2, and you can see from looking at the first two rows that by using iteritems()
for method 1 you can reduce your running time significantly (especially when lots of items are involved) compared with items()
. The same kind of speed up can be seen also for the other methods when using iteritems()
instead of items()
in Python 2.For Python 3.x
Drawbacks of method 1
- Method 1 breaks if the values in the original
dict
are not unique: all keys but one that are associated with the duplicated values will be ignored. See Method 2 on how to solve this case with duplicated values in the originaldict
. - Method 1 won't necessarily preserve the original insertion order which is to be expected since Python
dict
s don't have any ordering (unlikeOrderedDict
). However, from Python 3.7 onwards, insertion order will be a language feature unlike in Python 3.6 which was an implementation detail, i.e. as of Python 3.7, you can be guaranteed that insertion order is supported fordict
s.
References
- Code of Method 1 (for both Python 2 and 3)
Method 2: non-unique values
If in the original dictionary, the values are not unique, then you could still do a reverse of the dictionary by saving all the keys associated with the same value in a list. Thus, your reverse
Example: contents of the original and inverse
dict
would store as values a list of all the keys associated with the same value in the original dict
.
Example: contents of the original and inverse dict
s when the values in the original dict
are not unique
Code (Python 2.7 & 3)
For Python 2.7.x
my_dict = {1: 'a', 2:'b', 3: 'c', 4: 'a', 5: 'c'}
inv_dict = {} for k, v in my_dict.iteritems(): inv_dict[v] = inv_dict.get(v, []) inv_dict[v].append(k)
Method 2 of reversing a
dict
with non-unique values
code @ jdoodle.com (Python 2.7.15)
For Python 3.x
Method 2 of reversing a
dict
with non-unique values
code @jdoodle.com (Python 3.6.5)
Code using setdefault (Python 2.7 & 3)
If you prefer
For Python 2.7.x
setdefault
, you can set the values in the inverse dict
with one line, as can be seen in the following implementations:
For Python 2.7.x
my_dict = {1: 'a', 2:'b', 3: 'c', 4: 'a', 5: 'c'} inv_dict = {} for key, value in my_dict.iteritems(): inv_dict.setdefault(value, []).append(key)
Method 2 of reversing a
dict
with non-unique values using dict.setdefault
code @ jdoodle.com (Python 2.7.15)
For Python 3.x
my_dict = {1: 'a', 2:'b', 3: 'c', 4: 'a', 5: 'c'} inv_dict = {} for key, value in my_dict.items(): inv_dict.setdefault(value, []).append(key)
Method 2 of reversing a
dict
with non-unique values using dict.setdefault
code @ jdoodle.com (Python 3.6.5)
Drawbacks of method 2
- Like Method 1, Method 2 also won't necessarily preserve the original insertion in the
dict
. UseOrderedDict
if you need this property. dict.get()
anddict.setdefault()
both have similar very poor average run times when working in Python 3, as can be seen from Table 2 (Python 3). In Python 2,dict.setdefault()
provides better code speed up thandict.get()
, as can be seen from Table 1 (Python 2). Also,dict.setdefault()
allows you to use more compact code than withdict.get()
. Finally as stated before, if yourdict
s have non-unique values, then Method 2 might be a good solution.
Method 3: type and order preserved
Method 3 makes use of
map(reversed, iter)
to swap the keys and values of an arbitrary mapping type (e.g. OrderedDict
or defaultdict
). Thus, the reverse mapping will not be necessarily forced to have dict
type like it was the case in the other two methods. If you start with OrderedDict
as the original collection, then your reverse collection will be of OrderedDict
type.Code (Python 2.7 & 3)
For Python 2.7.xdef reverse_mapping(f): return f.__class__(map(reversed, f.iteritems())) my_dict = {1: 'a', 2:'b', 3: 'c', 4: 'd', 5: 'e'} inv_dict = reverse_mapping(my_dict)
Method 3 of reversing a
code @ jdoodle.com (Python 2.7.15)
dict
with unique values using map(reversed, iter)
code @ jdoodle.com (Python 2.7.15)
For Python 3.x
Method 3 of reversing a
code @ jdoodle.com (Python 3.6.5)
dict
with unique values using map(reversed, iter)
code @ jdoodle.com (Python 3.6.5)
Drawbacks of method 3
- Like Method 1, Method 3 also breaks if the original
dict
doesn't have unique values. If you need to work with dictionaries with no unicity in the values, check Method 2 which takes care of this situation. - Method 3 might have the advantage of preserving both the type and order of the original dictionary (e.g. if working with
OrderedDict
), but its average run time is the worst when used in Python 2.7.x, see Table 1. - The one-line code might not be easily readable compared to the other methods.
References
Average running times of the three methods
Major updates
2018-09-16:- I updated the code base and re-ran the
python commands to re-populate the following two tables with new results. The
changes consisted in implementing the important missing option
--use_setdefault
which I thought was already implemented when I generated the results the first time. Big mistake on my part! Thus the old results for the row Method 2:setdefault
were generated actually withdict.get()
instead ofdict.setdefault()
:( - Also, I factorized methods.py
where the different
dict
-reversing methods are defined by putting all the common code from Python 2 & 3 methods into base classes.
Boring stuff
The whole code for computing the average run times of each method can be found on my github where you will also find details on how to install the program and run the main script to compute the average run times of different methods.The original dictionaries with unique values are initialized using this template:
{'ki': 'vi'}
where i
is any number starting from 1
e.g.
{'k1': 'v1', 'k2':''v2'}
.
The other dictionaries with non-unique values are initialized with the same kind of template but with the difference that each value in the original dictionary is associated with exactly two keys.
Results
Table 1 presents the average run times (µsec) of the three methods for different number of items with Python 2.7. The average time when using 1000 items was based on running the methods 100k times. For the other cases (10k and 100k items), the average run time was computed by running the code 1k times for each method. Table 2 presents the same kind of results but for methods working in Python 3.
From Table 1 and Table 2, we conclude that:
From Table 1 and Table 2, we conclude that:
- Method 1 offers the best average run times for both Python 2.7 and Python 3.
- Method 2 offers very bad average run times for both Python 2.7 and Python 3.
- Method 3 presents the second best average run time with Python 3, but when working with Python 2, Method 3 doesn't present the same kind of performance as seen with Python 3.
- Python 2 recommendation: it is highly recommended to use
dict.iteritems()
instead ofdict.items()
, as it can be see from Table 1 where method 1 was tested withdict.items()
(the row with the red highlight).
Py2 Method | Avg time (µsec), 1k items, 100k times | Avg time (µsec), 10k items, 1k times | Avg time (µsec), 100k items, 1k times |
---|---|---|---|
Method 1: dict compre.,iteritems | 217.96 | 2665.56 | 45301.06 |
Method 1: dict compre.,items | 256.79 | 4125.60 | 70901.29 |
Method 2: dict.get ,iteritems | 838.27 | 10410.03 | 127703 |
Method 2: setdefault ,iteritems | 656.03 | 8539.27 | 108347.76 |
Method 3: map ,iteritems | 895.95 | 10788.73 | 146115.25 |
Py3 Method | Avg time (µsec), 1k items, 100k times | Avg time (µsec), 10k items, 1k times | Avg time (µsec), 100k items, 1k times |
---|---|---|---|
Method 1: dict compre. | 89.97 | 905.46 | 18487.63 |
Method 2: dict.get | 417.17 | 4949.22 | 66599.29 |
Method 2: setdefault | 333.02 | 4171.31 | 58628.69 |
Method 3: map | 311.59 | 3366.20 | 46960.80 |
Summary of results
Thus, the results can be summarized as follows:- if your dict has unique values, use dict comprehension (see Method 1)
- if your dict doesn't have unique values, use dict.setdefault() when populating the dict (see Method 2) as it provides a more compact form than using
dict.get()
- no matter what method you use, if you are working in Python 2.7, use dict.iteritems() instead of dict.items(). The reason is that since
dict.iteritems()
makes use of generators, it is more memory efficient.
Comments
Post a Comment