By: Hristo Hristov | Updated: 2022-04-22
Problem
Python is a powerful language used for classic programming, scripting, and data analytics. So far in our Python programming series we have examined multiple concepts and constructs such as data types, functions, and control flow tools. In all these tutorials we have mentioned at least once the core terms "iterable", "iterator" and "generator". What are these objects, how do they differ from each other, and how can you benefit from them?
Solution
Follow along with this tutorial to examine the similarities and differences between an iterable, an iterator and a generator. Behind these core Python terms you will find objects with their own specific behavior and methods. Knowing how to handle them will enable you to take full advantage of the Python language. Let us begin.
Python Iterable
An iterable is best described as a collection of items that can be returned one at a time. It can be a list, a tuple, a string, a set, a file or generally any data structure whose elements can be iterated over, i.e., extracted or accessed one by one. The following characteristics apply to iterables:
They can be looped over, i.e., used in a for loop. For example, a set:
my_set = {1, 2, 3}
for item in my_set:
    print(item)
Every iterable has a built-in __iter__ method. The quickest way to check whether an object is an iterable is to call the dir function on it and check whether __iter__ appears in the result. For example, '__iter__' in dir(my_set) evaluates to True.
In a previous tip, we examined the general iter() function, which calls the __iter__ method of the argument object, thus turning an iterable into an iterator.
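If the iterable is also a sequence, such as a list, a tuple or a string, its elements can additionally be accessed one by one with an index (this does not work for sets or generators):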
my_list = [1, 2, 3]
for i in range(len(my_list)):
    print(my_list[i])
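The characteristics above can be checked directly. The following is a minimal sketch: the helper name is_iterable is hypothetical (it is not part of the standard library) and relies on the fact that iter() raises TypeError for objects that cannot be iterated over.

```python
def is_iterable(obj):
    """Return True if iter() accepts obj (hypothetical helper for illustration)."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


my_set = {1, 2, 3}
my_list = [1, 2, 3]

print("__iter__" in dir(my_set))   # True: a set implements __iter__
print(is_iterable(my_list))        # True
print(is_iterable(42))             # False: an int cannot be looped over

# Index access works for sequences such as lists, but not for sets.
print(my_list[0])                  # 1
try:
    my_set[0]
except TypeError as error:
    print("Sets do not support indexing:", error)
```

Calling iter() is generally a more reliable check than inspecting dir(), because it asks Python to do exactly what a for loop would do.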
In general, every sequence is an iterable, but not every iterable is an iterator. Let us now examine what an iterator is.
Python Iterator
An iterator is the object that results from calling the __iter__ method of an iterable. The core functionality of an iterator is that it keeps track of the next element in the sequence. Thus, the iterator is a stateful object that can serve its elements one at a time. It "knows" which sequence element is coming up next.
An iterator has a built-in method __next__. In a previous tip, we also examined the general next() function, which calls the internal __next__ method of the iterator passed as its argument. If we turn a set object, which by default is an iterable, into an iterator, we will see __next__ among its attributes:
dir(iter(set()))
By default, an iterable object is not an iterator. Therefore, trying to use the general next() function on it will result in an error because the object does not implement __next__ internally:
next(my_set)  # my_set is defined in the previous example; raises TypeError: 'set' object is not an iterator
To turn an iterable into an iterator it is enough to use the iter() function with this syntax:
my_set = {1, 2, 3}
iter_my_set = iter(my_set)
type(iter_my_set)          # set_iterator
print(next(iter_my_set))
print(next(iter_my_set))
print(next(iter_my_set))
print(next(iter_my_set))   # raises StopIteration: the iterator is exhausted
After declaring the iterator iter_my_set based on the set iterable, we can access its elements one after the other with next(). The next() function, though, will raise the StopIteration exception once the iterator has been emptied. It is important to note that iterators can neither serve the elements they contain in reverse, nor skip elements, nor be restarted. Once all elements have been served from an iterator, you must create another one (or reinitialize the variable); otherwise you will get the StopIteration exception.
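To illustrate that an exhausted iterator cannot be restarted, here is a small sketch with illustrative variable names. It also uses the optional second argument of next(), which supplies a default value instead of raising StopIteration.

```python
my_list = [1, 2, 3]
iter_my_list = iter(my_list)

first_pass = list(iter_my_list)    # consumes every element
second_pass = list(iter_my_list)   # the iterator is now exhausted

print(first_pass)                  # [1, 2, 3]
print(second_pass)                 # []

# next() accepts a default that is returned instead of raising StopIteration.
print(next(iter_my_list, None))    # None

# The underlying iterable is untouched, so a fresh iterator starts over.
fresh_iter = iter(my_list)
print(next(fresh_iter))            # 1
```

The list itself is never emptied; only the iterator's position advances, which is why calling iter() again is enough to start from the beginning.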
Calling next() manually is, in a way, emulating the behavior of a for loop. The next example shows an approximate representation of what happens behind the scenes in every for loop you use.
my_set = {1, 2, 3}            # iterable
iter_my_set = iter(my_set)    # iterable converted to iterator
while True:                   # beware of infinite loop unless break used
    try:
        current_item = next(iter_my_set)
        print(current_item)
    except StopIteration:
        break
In fact, this is an added benefit of the iterators: if you cannot or do not want to use a for loop you can turn your iterable into an iterator and process it manually as shown in the example above.
In many contexts, the terms "iterable" and "iterator" are used interchangeably. For the purposes of explaining what they do, we distinguish them as follows: a plain iterable, such as a list or a set, implements __iter__ but not __next__, while an iterator implements both __iter__ and __next__.
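The distinction can be made concrete with a hand-written iterator. The sketch below is only an illustration: the class name Countdown and its attribute current are hypothetical and were chosen for this example.

```python
class Countdown:
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


countdown = Countdown(3)
print(iter(countdown) is countdown)        # True: the iterator is its own iterator
print(list(countdown))                     # [3, 2, 1]

my_list = [1, 2, 3]
print(hasattr(my_list, "__next__"))        # False: a list is only an iterable
print(hasattr(iter(my_list), "__next__"))  # True: its iterator implements __next__
```

Because Countdown implements both methods, it can be used directly in a for loop or with next(), but like any iterator it can be consumed only once.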
Python Generator
Before examining generators, let us first investigate the relationship between iterators and generators. To do so, I will import some modules which will allow me to compare the different types and the hierarchical relationship among them:
from collections.abc import Iterator, Iterable
from types import GeneratorType

print(issubclass(GeneratorType, Iterator))   # True
print(issubclass(Iterator, Iterable))        # True
We see that the generator type is a subclass of the Iterator type. The Iterator type, on the other hand, is a subclass of the Iterable type. In short, an iterator is an iterable and a generator is an iterator. However, not every iterator is a generator, and an iterable is not automatically an iterator, although it can be turned into one with iter(). To make this clearer let us examine a generator. We have the following function that accepts a list as a single argument and returns a list of the members of the input sequence multiplied by 1.5:
def index_numbers(numbers):
    result = []
    for number in numbers:
        result.append(number * 1.5)
    return result

my_list = [1, 2, 10, 20, 30]
index_numbers(my_list)
This regular function can be converted to a generator function by using the
yield
keyword:
def index_numbers(numbers):
    for number in numbers:
        yield number * 1.5

index_numbers(my_list)
When we call this new function the output is a generator object, not a list of indexed numbers anymore. The reason you see this and not any readable output is that the generator does not produce the whole result all at once. Instead, the generator "yields" the results one at a time.
result = index_numbers(my_list)
result
next(result)  # as mentioned, a generator is an iterator, so use next() or a for loop
One benefit of using a generator function is the added readability and conciseness. With the generator we have cut the function body in half: from 4 lines in the former example to 2 in the latter.
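The one-at-a-time behavior can be observed by adding a trace message to the generator function. This is a sketch only: the print call inside the function is an addition for illustration and is not part of the function shown earlier.

```python
def index_numbers(numbers):
    for number in numbers:
        # Trace message added for illustration only.
        print(f"computing value for {number}")
        yield number * 1.5


result = index_numbers([1, 2, 10])   # nothing is computed or printed yet

first = next(result)                 # runs the body up to the first yield
print(first)                         # 1.5

remaining = list(result)             # drains the rest of the generator
print(remaining)                     # [3.0, 15.0]
```

Each value is computed only when it is requested, so a caller that stops early never pays for the elements it does not consume.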
Summary
In the following table, we propose a summary of the comparison between the three objects:
 | iterable | iterator | generator |
---|---|---|---|
Keyword | none | none | yield |
Element access | with a loop | next() or a loop | next() or a loop |
Key difference | - | more verbose code | more concise code |
Memory storage | whole object | state of next value | execution state only, yielding one value at a time |
Usage | by default | must use iter() and next() | custom function or comprehension expression |
Finally, let us wrap it up by taking a closer look at memory storage. With the following code we instantiate an ordinary iterable. Based on it we also instantiate an iterator and a generator:
import sys

my_list = [1, 2, 3, 4, 5]
sys.getsizeof(my_list)

my_iter = iter(my_list)
sys.getsizeof(my_iter)

my_gen = (x for x in my_list)
sys.getsizeof(my_gen)
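Having the objects, we use the getsizeof function to check the amount of memory occupied by each one. In our run, the memory used to store the iterable is 120 bytes. If we then turn the iterable into an iterator, the size drops to less than half: 48 bytes. The generator, created just by yielding every element of the iterable without modifying it, comes out very close to the size of this small iterable. Keep in mind that getsizeof reports only the size of the object itself and that the exact numbers depend on the Python version. The size of a list grows with the number of elements, whereas iterator and generator objects store only their current state, so their size does not grow with the number of values they produce. This finding should give you a clue about the behavior of the three types with regard to memory usage.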
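To see how these sizes behave as the data grows, the measurement can be repeated with a small and a large list. This is a minimal sketch with illustrative variable names; the byte counts are deliberately not shown because they vary between Python versions and platforms.

```python
import sys

small_list = list(range(5))
large_list = list(range(100_000))

# getsizeof reports only the container itself, not the elements it refers to.
print(sys.getsizeof(small_list), sys.getsizeof(large_list))

small_iter = iter(small_list)
large_iter = iter(large_list)

# A list iterator stores a reference to the list and a position, nothing more.
print(sys.getsizeof(small_iter), sys.getsizeof(large_iter))

small_gen = (x for x in small_list)
large_gen = (x for x in large_list)

# A generator object keeps only its execution state, so its size
# does not depend on how many values it will eventually yield.
print(sys.getsizeof(small_gen), sys.getsizeof(large_gen))
```

The list grows with its length, while the iterator and the generator report the same size for both inputs.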
Next Steps
- Python generator, generator iterator and generator expression
- Learn Python with me:
About the author
This author pledges the content of this article is based on professional experience and not AI generated.