By: Hristo Hristov | Updated: 2022-02-16
Problem
Python is a powerful scripting language for the data professional. It offers great flexibility, readability, and general ease of use due to its concise syntax. A comprehension is a Python-specific construct especially useful for applying expressions, with or without conditions, to the elements of a sequence, mapping, or set.
Solution
Comprehensions allow the programmer to build sequences based on another sequence or mapping data type. Comprehensions are applicable to lists, tuples, sets and dictionaries, meaning each of these data types can be the source or the output.
Every comprehension can consist of:
- an input sequence: list, tuple, set or dictionary,
- a variable representing the members of the input sequence,
- an output expression producing elements of the output data type,
- an optional predicate expression.
Here is an example with an optional predicate expression. We raise every element of the sequence to the power of 2 if that element is even:
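A minimal sketch of such a comprehension, assuming a small sample list named numbers (the name and values are illustrative, not taken from the original article):

```python
numbers = [1, 2, 3, 4, 5, 6]

# output expression: n ** 2
# variable:          n
# input sequence:    numbers
# predicate:         n % 2 == 0 (keep even elements only)
even_squares = [n ** 2 for n in numbers if n % 2 == 0]

print(even_squares)  # [4, 16, 36]
```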
Now let's examine comprehensions for each of the data types we mentioned so far.
Python List Comprehension
By following the guide above, you can define a list comprehension by wrapping a suitable expression with square brackets. Let's convert the members of a list to upper-case if they contain the letter 'o' with the following example:
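One way this could look, assuming a hypothetical list of lower-case strings called topics:

```python
topics = ["python", "sql", "power bi", "azure"]

# Upper-case only the members that contain the letter 'o'.
upper_topics = [t.upper() for t in topics if "o" in t]

print(upper_topics)  # ['PYTHON', 'POWER BI']
```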
What is the benefit of this? In short, space and speed. A list comprehension could be faster than the equivalent for loop:
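For comparison, here is a sketch of the comprehension next to a loop counterpart. The topics list is an assumed sample, and the loop deliberately produces each string without collecting it, which is how the text describes the loop being timed:

```python
topics = ["python", "sql", "power bi", "azure"]

# Comprehension: builds and returns a new list in one expression.
upper_topics = [t.upper() for t in topics if "o" in t]

# Loop counterpart: each upper-cased string is produced one at a time
# and discarded, so no list is ever built.
for t in topics:
    if "o" in t:
        t.upper()
```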
However, note that in this simple example it is not. We measure this by using the timeit module:
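A rough sketch of such a measurement follows. The function names, the sample list, and the run count of 100,000 are assumptions made for illustration; the figures quoted in the article came from the author's own setup and will not match exactly.

```python
import timeit

topics = ["python", "sql", "power bi", "azure"]

def with_comprehension():
    # Builds and returns a list on every call.
    return [t.upper() for t in topics if "o" in t]

def with_loop():
    # Produces each string but never stores it.
    for t in topics:
        if "o" in t:
            t.upper()

runs = 100_000  # assumed run count, kept small so the sketch finishes quickly
comp_seconds = timeit.timeit(with_comprehension, number=runs)
loop_seconds = timeit.timeit(with_loop, number=runs)

print(f"comprehension: {comp_seconds:.3f} s, loop: {loop_seconds:.3f} s")
```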
The execution time of the list comprehension is roughly half a second, whereas for the loop it is a third of a second. The reason is that the list comprehension returns a list, and the code needs extra time to build and populate that list. The for loop, on the other hand, produces individual strings, so there is no overhead of making a list. While the difference is visible, let us not forget that the return values are different. If you want to modify a list and immediately get the results in a new list, comprehensions are a great way to do that. So let us introduce a new variable of type list, called result, and see what changes:
Comprehension:
For loop:
Now the result of the timing is almost identical. You can also see that the execution time of the comprehension barely changed. Note that these timings will vary from machine to machine. For a more reliable comparison, run the timing several times (e.g. 10 or more) and average the results.
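The two variants that collect into result, together with the averaging suggested above, could be sketched like this. The function names, sample data, and run counts are assumptions; timeit.repeat and statistics.mean are standard-library calls.

```python
import statistics
import timeit

topics = ["python", "sql", "power bi", "azure"]

def build_with_comprehension():
    result = [t.upper() for t in topics if "o" in t]
    return result

def build_with_loop():
    result = []
    for t in topics:
        if "o" in t:
            result.append(t.upper())
    return result

# Ten timings of 10,000 calls each, then the average of the ten.
comp_avg = statistics.mean(
    timeit.repeat(build_with_comprehension, number=10_000, repeat=10)
)
loop_avg = statistics.mean(
    timeit.repeat(build_with_loop, number=10_000, repeat=10)
)

print(f"comprehension avg: {comp_avg:.4f} s, loop avg: {loop_avg:.4f} s")
```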
Python Tuple Comprehension
The tuple is a sequence data type, so it can serve as the source for a comprehension. Tuple elements cannot change, but we can still read them to build our result:
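A short sketch, assuming the same illustrative values as before but stored in a tuple:

```python
topics = ("python", "sql", "power bi", "azure")  # a tuple as the source

# Square brackets mean the output is a list, whatever the source type is.
upper_topics = [t.upper() for t in topics if "o" in t]

print(type(upper_topics).__name__)  # list
print(upper_topics)                 # ['PYTHON', 'POWER BI']
```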
This is still a list comprehension though. What will happen if you wrap your comprehension in parentheses instead of square brackets? It will not return a tuple. Instead, it will return a generator object. I will reserve generators for another tip, but let us have a quick look for the sake of being exhaustive here.
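As a quick illustration (the numbers tuple is an assumed sample), wrapping the expression in parentheses yields a generator, and that generator can only be consumed once:

```python
numbers = (1, 2, 3, 4)

# Parentheses create a generator expression, not a tuple.
result = (n ** 2 for n in numbers)
print(type(result).__name__)  # generator

as_list = list(result)    # consumes every value the generator can produce
as_tuple = tuple(result)  # the generator is exhausted, so nothing is left

print(as_list)   # [1, 4, 9, 16]
print(as_tuple)  # ()
```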
The variable result is a generator object and not a tuple resulting from the comprehension. We can cast the generator to a list, or to a tuple, for instance. However, you can't do both. When I try to cast result to a tuple as well, I get an empty tuple. The reason is the very purpose of generators: to provide the next value on request, and only when needed. Since we already cast the generator to a list, that same generator is now exhausted. You must define your generator again and, if needed, get the next value:
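A sketch of redefining the generator and pulling values on request with the built-in next, using the same assumed sample tuple:

```python
numbers = (1, 2, 3, 4)

# A fresh generator starts from the beginning again.
result = (n ** 2 for n in numbers)

first = next(result)   # 1
second = next(result)  # 4

# Whatever has not been requested yet is still available.
remaining = tuple(result)
print(first, second, remaining)  # 1 4 (9, 16)
```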
This subtlety means there are two ways to make a tuple comprehension:
- Cast your expression to a tuple():
- Unpack your generator by using *:
Note the trailing comma: without it, the statement raises a syntax error instead of returning a tuple.
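Both approaches could be sketched as follows, with illustrative variable names:

```python
numbers = (1, 2, 3, 4)

# Option 1: pass the generator expression to tuple().
squares_cast = tuple(n ** 2 for n in numbers)

# Option 2: unpack the generator with * and keep the trailing comma,
# which is what makes the right-hand side a tuple.
squares_unpacked = *(n ** 2 for n in numbers),

print(squares_cast)      # (1, 4, 9, 16)
print(squares_unpacked)  # (1, 4, 9, 16)
```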
Python Dictionary Comprehension
You can take advantage of a dictionary comprehension by following the same expression guidelines as before but wrapping them in curly brackets. Let us define a dictionary and extract its keys in a list:
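For instance, assuming a small dictionary named tip whose keys and values are made up for this sketch:

```python
tip = {
    "title": "Python Comprehensions",
    "author": "Hristo Hristov",
    "category": "Python",
}

# Iterating over a dictionary yields its keys.
keys = [key for key in tip]

print(keys)  # ['title', 'author', 'category']
```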
Again, this expression is not a dictionary comprehension – it still outputs a list of the keys in the dictionary. Here is how to take advantage of dictionary comprehension. For example, let us say you have a list variable. You want to map the members of that list to their indices in the list and store the pairs of indices and related variable members in a dictionary. A prototypical one-liner for doing so would be:
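A reconstruction of what such a one-liner might look like, assuming a list named keywords with sample values (the original listing is not available here):

```python
keywords = ["list", "tuple", "set", "dictionary"]

# key: the index; value: the list member at that index
indexed = {index: keywords[index] for index in range(len(keywords))}

print(indexed)  # {0: 'list', 1: 'tuple', 2: 'set', 3: 'dictionary'}

# The built-in enumerate gives the same mapping without manual indexing.
same_mapping = dict(enumerate(keywords))
```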
Notice the colon separating the key-value mapping in the first part of the comprehension. We then use the range function to generate an iterable of the indices of the list. Each index becomes a dictionary key, and its value is the member of the keywords list at that index. Now, let us step it up a bit to see how helpful a dictionary comprehension would be if we had a pair of lists. Let us imagine we wanted to map the members of the first list to the members of the second list. Let us spice it up by making the second list a nested one (a matrix). The two lists could look like this:
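The titles and keywords below are placeholders invented for this sketch, not the values from the original article:

```python
# Flat list of tip titles.
tips = ["Python comprehensions", "Python generators", "Python decorators"]

# Nested list: one inner list of keywords per tip, in the same order.
keywords = [
    ["list", "set", "dictionary"],
    ["yield", "iterator", "lazy"],
    ["wrapper", "closure", "syntax"],
]
```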
The first one holds tip titles and the second one holds a list of keywords. How can we use a dictionary comprehension to map the titles to their keywords (assuming the first title corresponds to the first nested list and so on)? For example:
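One possible form of that comprehension, reusing placeholder data and an index variable named i (both assumptions):

```python
tips = ["Python comprehensions", "Python generators", "Python decorators"]
keywords = [
    ["list", "set", "dictionary"],
    ["yield", "iterator", "lazy"],
    ["wrapper", "closure", "syntax"],
]

tips_keywords = {
    # key: a title; value: an inner list comprehension over its keywords
    tips[i]: [keyword for keyword in keywords[i]]
    # outer iterable: the indices of the tips list
    for i in range(len(tips))
}

print(tips_keywords["Python generators"])  # ['yield', 'iterator', 'lazy']
```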
We are mapping the key (from the tips list) to a value taken from a nested list in the keywords list. This value comes from the inner list comprehension. There you also find the inner iterable, which ensures we pick each keyword from the nested list and not the nested list itself. Finally, the outer iterable goes over each of the indices of the tips list. I suggest you experiment with what happens if you have more tip titles than nested lists, and if the nested keyword lists have different lengths.
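As a starting point for that experiment, here is a sketch with deliberately mismatched placeholder data. The zip-based variant is an alternative technique that is not part of the article's index-based approach; it is shown only for contrast.

```python
tips = ["Tip A", "Tip B", "Tip C"]
keywords = [["a1", "a2"], ["b1"]]  # one nested list short, uneven lengths

# Index-based mapping: fails as soon as an index has no nested list.
index_error_raised = False
try:
    mapping = {tips[i]: [k for k in keywords[i]] for i in range(len(tips))}
except IndexError:
    index_error_raised = True

# zip-based alternative: silently stops at the shorter of the two lists.
zipped = {tip: list(kws) for tip, kws in zip(tips, keywords)}

print(index_error_raised)  # True
print(zipped)              # {'Tip A': ['a1', 'a2'], 'Tip B': ['b1']}
```

Nested lists of different lengths cause no error in either variant; only a missing nested list does.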
Python Set Comprehension
The set comprehension is handy because sets contain only unique members. Therefore, casting a list to a set is the easiest way to find the unique values in that list. For example:
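A small sketch with an assumed list of author names containing duplicates:

```python
authors = ["Hristo", "Maria", "Hristo", "Ivan", "Maria"]

# Curly brackets without a colon produce a set, so duplicates collapse.
unique_authors = {author for author in authors}

# Sets are unordered, so sort before printing for a stable display.
print(sorted(unique_authors))  # ['Hristo', 'Ivan', 'Maria']
```

When no expression or predicate is needed, set(authors) gives the same result.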
In case you wanted the members of your set to be a complex data type, like a tuple, that is surely possible:
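A sketch with two short placeholder lists, where every tip is paired with every author:

```python
tips = ["Comprehensions", "Generators"]
authors = ["Hristo", "Maria"]

# Two for clauses: every tip is combined with every author.
pairs = {(tip, author) for tip in tips for author in authors}

print(len(pairs))  # 4
```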
The (tip, author) tuple is our expression. We iterate over each tip and each author. In fact, this method is also helpful for generating a cross join: you are pairing each element from the first list with each element from the second list. A more complex example would be to generate a set whose elements are pairs of a tip and its corresponding author. Here is an example:
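The original listing is not available here, so the following is a reconstruction under assumptions: the data is made up, and the index names tip_idx and auth_idx are used in the roles the text gives them.

```python
tips = ["Comprehensions", "Generators", "Decorators"]
authors = ["Hristo", "Maria", "Ivan"]

tip_authors = {
    (
        tips[tip_idx],
        # Inner list comprehension: keeps only the author at the matching
        # index, giving a single-element list such as ['Hristo'].
        str(
            [
                authors[auth_idx]
                for auth_idx in range(len(authors))
                if auth_idx == tip_idx
            ]
        )
        .lstrip("['")   # drop the leading bracket and quote
        .rstrip("']"),  # drop the trailing quote and bracket
    )
    for tip_idx in range(len(tips))
}

print(sorted(tip_authors))
```

The stripping step relies on author names that neither start nor end with a quote or bracket character.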
We define the set comprehension with curly brackets and iterate over each tip title with tip_idx. We get the second tuple element by using an inner list comprehension. There we ensure that we are on the current author by predicating the expression on the index match auth_idx == tip_idx. We cast the inner comprehension to a string: its output is a single-element list, and a list can't be a member of a set. Because of that conversion, we also need to strip the leftover bracket and quote characters by using lstrip and rstrip.
Please keep in mind there are probably more efficient ways to do this, including with an outer for loop for the author index. The point here is to demonstrate set comprehensions and show a nested comprehension. Therefore, a word of warning: do not nest too many comprehensions, as doing so hinders their inherent readability.
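As one example of a simpler route, the same pairs can be built with the built-in zip. This is a different technique from the nested comprehension discussed above, shown with the same kind of placeholder data:

```python
tips = ["Comprehensions", "Generators", "Decorators"]
authors = ["Hristo", "Maria", "Ivan"]

# zip pairs elements by position, so no index matching or string
# clean-up is needed.
tip_authors = {(tip, author) for tip, author in zip(tips, authors)}

print(sorted(tip_authors))
```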
Conclusion
In this Python tutorial we examined different ways to use Python comprehensions. This handy construct is applicable to lists, tuples, dictionaries, and sets. Mastering comprehensions means you can fully take advantage of Python's efficient syntax. Stay tuned for the next article to continue to learn Python code with me.
About the author
This author pledges the content of this article is based on professional experience and not AI generated.