By: Hristo Hristov | Updated: 2022-02-03
Problem
In the previous tutorial of the Python programming series, we examined the built-in and commonly used Python data types, such as integer, float, string, Boolean, binary and datetime. There is more to the built-in data types – we have not discussed the "complex" ones, which are the basic data structures of Python.
Solution
In this Python tutorial we present the complex built-in data types in Python. They play a core role in any script, program, data science or data analysis scenario as they allow you to model and store different types of data in different ways. The data types are grouped by their inherent type: sequence, mapping and set. This tutorial will explain each data type by its category and mention applications and common operations. Let's go.
Sequence
There are three sequence data structures: list, tuple, and range.
List
The Python list is a ubiquitous data structure used to store an indexed, ordered, and non-unique sequence of elements. As such, it largely resembles arrays in other programming languages. To instantiate a list, enclose zero or more values in square brackets, e.g. my_list = [1, 2, 3].
Lists can vary in length, are mutable and can contain heterogeneous elements (i.e. of different data types).
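The properties described above can be illustrated with a short sketch. The variable names and values are hypothetical and chosen only for demonstration:

```python
# Hypothetical example values, chosen only for illustration.
empty = []                         # a list with zero elements
numbers = [1, 2, 3]                # a list of integers
mixed = [1, "two", 3.0, [4, 5]]    # heterogeneous elements, including a nested list

# Lists are mutable: elements can be replaced and the length can change.
numbers[0] = 10
numbers.append(4)

print(numbers)       # [10, 2, 3, 4]
print(len(mixed))    # 4
print(mixed[3][0])   # 4 (index of index into the nested list)
```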
Application
Use the list to store any type of data in a convenient, accessible, and trackable way. For particularly large collections of numeric data, the NumPy array is usually more memory-efficient and faster. The length of a list is bounded by sys.maxsize, which on a 64-bit build of Python is 9223372036854775807 (2**63 - 1); in practice, available memory runs out long before that bound is reached.
Common operations
Operation | Description |
---|---|
[] | New empty list |
my_list = [1,2,3] | New list with three elements |
my_list = [1,2,[3,4]] | New list with a nested list |
my_list[i] | Index. First element is always at index 0. |
my_list[i][j] | Index of index |
my_list[i:j] | Slice of list from index i (inclusive) to j (non-inclusive) |
len(my_list) | List length |
my_list1 + my_list2 | List concatenation |
my_list * 2 | List repeat |
for x in my_list: print(x) | List iteration |
x in my_list | Membership test |
my_list.append(5) | Add an element to the end of the list |
my_list.extend([6,7,8]) | Extend a list with elements from another iterable |
my_list.insert(index, element) | Insert a new element before the given index |
my_list.index(element) | Find the position (index) of the first occurrence of an element. Raises ValueError if the element is not found. |
my_list.count(element) | Counts the occurrences of an element |
my_list.sort() | Sorts the list in-place |
my_list.reverse() | Reverses the list in-place |
del my_list[k] | Deletes the element at index k |
del my_list[i:j] | Deletes the elements between indices i (inclusive) and j (non-inclusive) |
my_list.pop() | Removes and returns the last element of a list |
my_list.remove(element) | Removes the first occurrence of the element. Raises ValueError if the element is not found. |
my_list[i:j] = [] | Slice assignment. Assigning an empty list removes the slice. |
my_list[i] = 1 | Index assignment |
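To see several of the operations from the table working together, here is a minimal sketch. The starting values are arbitrary, and the comments show the state of the list after each step:

```python
my_list = [3, 1, 2]

my_list.append(5)           # [3, 1, 2, 5]
my_list.extend([6, 7])      # [3, 1, 2, 5, 6, 7]
my_list.insert(0, 9)        # [9, 3, 1, 2, 5, 6, 7]
first_two = my_list[0:2]    # [9, 3] (a new list; my_list is unchanged)

my_list.sort()              # [1, 2, 3, 5, 6, 7, 9]
last = my_list.pop()        # returns 9, list is now [1, 2, 3, 5, 6, 7]
my_list.remove(5)           # [1, 2, 3, 6, 7]
del my_list[0]              # [2, 3, 6, 7]
my_list[1:3] = []           # [2, 7]

# index() raises ValueError when the element is not in the list.
try:
    my_list.index(42)
    missing = False
except ValueError:
    missing = True

print(my_list, last, missing)
```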
Tuple
A tuple is another sequence data type used to store an indexed, ordered, and non-unique sequence of elements. It is instantiated by enclosing zero or more values in parentheses, e.g. my_tuple = (1, 2, 3). A one-item tuple needs a trailing comma: (1,).
The tuple is an immutable type of sequence, like the string type and unlike the list data type. This means that the elements of a tuple cannot be reassigned, added, or removed. However, if an element is itself a mutable object, the contents of that object can still change. So, a tuple can contain a list as an element, and that list can be modified in place.
Tuples can also contain heterogeneous elements.
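The difference between reassigning a tuple element and modifying a mutable object stored inside the tuple can be sketched as follows. The tuple contents are hypothetical:

```python
# Hypothetical heterogeneous tuple containing a list as its last element.
record = (1, "text", [10, 20])

# Reassigning an element of the tuple is not allowed.
try:
    record[0] = 99
    reassigned = True
except TypeError:
    reassigned = False

# The list inside the tuple is still a mutable object and can change in place.
record[2].append(30)

print(reassigned)   # False
print(record)       # (1, 'text', [10, 20, 30])
```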
Application
Use the tuple in scenarios when you want your sequence to stay the same. The members of the tuple can be neither reassigned nor removed. To give you a reference, Python libraries dealing with database integration usually return each row of a result set as a tuple.
Common operations
Tuples implement all the common sequence operations. Here's a quick overview:
Operation | Description |
---|---|
() | New empty tuple |
my_tuple = (1,) | One-item tuple. The trailing comma is required. |
my_tuple = (1,2,3) | Three-item tuple, can also be declared without parentheses |
my_tuple = tuple('MSSQL') | Creates a tuple out of an iterable |
my_tuple[i] | Index |
my_tuple[i][j] | Index of index |
my_tuple[i:j] | Slice of tuple. If last index is not supplied, everything up to the end of the tuple is returned. |
len(my_tuple) | Tuple length |
my_tuple1 + my_tuple2 | Concatenation |
my_tuple * 3 | Tuple repeat |
x in my_tuple | Membership check |
for x in my_tuple: print(x) | Iteration |
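A few of the tuple operations from the table, as a small sketch with arbitrary values. Note in particular the role of the trailing comma in a one-item tuple:

```python
single = (1,)              # one-item tuple: the trailing comma makes it a tuple
not_a_tuple = (1)          # without the comma this is just the integer 1
packed = 1, 2, 3           # parentheses are optional when declaring a tuple
letters = tuple("MSSQL")   # ('M', 'S', 'S', 'Q', 'L')

combined = packed + single   # (1, 2, 3, 1)
repeated = single * 3        # (1, 1, 1)
tail = letters[2:]           # ('S', 'Q', 'L'), slice up to the end

print(type(not_a_tuple).__name__)   # int
print("Q" in letters)               # True
```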
Range
The range() function returns a lazy sequence object which we use to generate integers on demand. Generally, there are three ways to make the range function work for you:
- One argument: generates the integers from 0 up to, but not including, the argument's value.
- Two arguments: the first argument becomes the lower bound (inclusive), while the second is the upper bound (non-inclusive).
- Three arguments: in addition to the lower and upper bounds, the third argument serves as a step.
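The three call forms can be sketched as follows. The arguments are arbitrary, and each range is wrapped in a list only to make the generated values visible:

```python
one_arg = list(range(5))             # [0, 1, 2, 3, 4]
two_args = list(range(2, 6))         # [2, 3, 4, 5]
three_args = list(range(0, 10, 3))   # [0, 3, 6, 9]

# A negative step counts down; the upper bound is still excluded.
countdown = list(range(5, 0, -1))    # [5, 4, 3, 2, 1]

# Printing the range itself shows its definition, not its values.
print(range(5))   # range(0, 5)
```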
Application
The range is an iterable object: we can go over each of its elements (e.g. in a for loop), but printing the range itself shows only its bounds, not the values it produces. To display the actual generated values, wrap the range in a list or a tuple, which allows us to peek inside it. So, use the range function to define a finite sequence which you can use in a for loop.
A typical use for the range function is when you want to manipulate a list by index. Calling the range function over the length of the given list produces every valid index of that list. We can then access every element by its index and perform the operation needed.
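A minimal sketch of this pattern, assuming a hypothetical list of prices that we want to double in place:

```python
# Hypothetical data: a list we want to modify element by element.
prices = [10, 20, 30]

# range(len(prices)) yields the indices 0, 1, 2.
for i in range(len(prices)):
    prices[i] = prices[i] * 2

print(prices)   # [20, 40, 60]

# A range also works as a plain counter in a for loop.
squares = []
for n in range(4):
    squares.append(n * n)

print(squares)  # [0, 1, 4, 9]
```

When only the values are needed and the list is not modified, iterating over the list directly is usually simpler than going through indices.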
Common methods
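The range object does not implement specific methods of its own beyond the common sequence operations, such as indexing, slicing, len(), membership tests, index() and count().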
Mapping
Dictionary
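A dictionary is a mapping data type - a mutable collection of key-value pairs. Since Python 3.7, a dictionary preserves the insertion order of its keys, but its values are accessed by key rather than by position. Dictionaries can contain heterogeneous data types too.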
Application
A dictionary is handy when you want to map a key (which must be hashable, in practice immutable) to a value (which can change). A simple example is a phone book that maps each name to a phone number.
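A phone book of this kind could look like the following sketch. The names and numbers are made up for illustration:

```python
# Hypothetical names and phone numbers.
phone_book = {"Alice": 5550101, "Bob": 5550102}

alice_number = phone_book["Alice"]   # look up a value by its key

phone_book["Bob"] = 5550199          # change the value of an existing key
phone_book["Carol"] = 5550103        # assigning to a new key adds a pair

print(alice_number)      # 5550101
print(len(phone_book))   # 3
```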
The key is a name, which is of type string, i.e., immutable. The value is an integer, accessed by the corresponding key. The example can be expanded by nesting a dictionary for each name, so that we can store more data for a single person.
If the nested dictionary contains the keys phone, address, and years of experience, the first key (the name) gives us access to the whole nested dictionary.
Putting keys one after the other (called chaining) gives us a single value from the nested dictionary.
There is no limit to how many nested dictionaries we can have but it should be kept manageable according to the program or script's purpose.
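The nested structure and the chained access described above can be sketched like this. All names, numbers, and addresses are hypothetical placeholders:

```python
# Hypothetical data: one nested dictionary per person.
people = {
    "Alice": {
        "phone": 5550101,
        "address": "1 Example Street",
        "years_of_experience": 7,
    },
    "Bob": {
        "phone": 5550102,
        "address": "2 Example Street",
        "years_of_experience": 3,
    },
}

# The first key returns the whole nested dictionary.
alice = people["Alice"]

# Chaining keys returns a single value from the nested dictionary.
alice_phone = people["Alice"]["phone"]
bob_years = people["Bob"]["years_of_experience"]

print(sorted(alice.keys()))     # ['address', 'phone', 'years_of_experience']
print(alice_phone, bob_years)   # 5550101 3
```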
Common methods
Operation | Description |
---|---|
{} | New empty dictionary |
my_dict = {'country':'USA', 'population':329500000} | Two-item dictionary |
my_nested_dict = {1:{'name':'USA', 'population':329500000}} | Nested dictionary |
my_dict = dict(country='USA', code=1) | Creates a dictionary by passing keyword arguments to the dict constructor |
my_dict = dict(zip(keys, values)) | Creates a dictionary from a sequence of keys zipped with a sequence of values |
my_dict = dict.fromkeys(['key1','key2']) | New dictionary from a list of keys. Every value is None unless a second argument is given. |
'my_key' in my_dict | Membership test on the keys |
my_dict.keys() | Keys |
my_dict.values() | Values |
my_dict.items() | Keys and values as pairs |
my_dict.copy() | Makes a shallow copy of the dictionary |
my_dict.get(key, default) | Returns the value for the key, or the default value if the key is not present |
my_dict.update(another_dict) | Updates the original dictionary with the key-value pairs of another dictionary, overwriting existing keys |
my_dict.pop(key) | Removes the key and returns the value. Raises KeyError if the key is not present and no default is given. |
len(my_dict) | Returns the length in number of keys |
my_dict[key] = value | Sets the value of a key, adding the key if it is not present |
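Some of the dictionary operations from the table in action. This is a sketch that reuses the country example values from the table; the variable names are assumptions:

```python
keys = ["country", "code"]
values = ["USA", 1]

my_dict = dict(zip(keys, values))   # {'country': 'USA', 'code': 1}
blank = dict.fromkeys(keys)         # {'country': None, 'code': None}

# get() returns the default instead of raising KeyError for a missing key.
population = my_dict.get("population", 0)   # 0

my_dict.update({"population": 329500000})   # adds the new key-value pair
code = my_dict.pop("code")                  # returns 1 and removes the key

for key, value in my_dict.items():
    print(key, value)

print("code" in my_dict)   # False
```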
Set
Under this category there is also just one data type, the set.
Set
This type is an unordered collection of unique elements. The set supports a variety of operations that relate to mathematical set theory which find an application in database-related work.
Application
You can use a set to create a collection of unique elements and to guarantee element uniqueness during the collection's lifetime in memory. Elements can be added to the set only if they are hashable, which in practice means immutable: lists and dictionaries cannot be members of a set, but strings and tuples (of hashable elements) can. The most practical way to take advantage of a set is to create a unique collection out of a non-unique one, for example by converting a list to a set and thereby keeping only its unique elements.
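Both points, removing duplicates and the restriction on element types, can be sketched with hypothetical values:

```python
# Hypothetical list with duplicate values.
readings = [3, 1, 3, 2, 1]

unique = set(readings)            # duplicates are dropped
unique_sorted = sorted(unique)    # a set has no order, so sort for display

print(unique_sorted)   # [1, 2, 3]

# Strings and tuples can be set members, lists cannot.
tags = {"sql", ("a", 1)}
try:
    tags.add(["not", "hashable"])
    list_rejected = False
except TypeError:
    list_rejected = True

print(list_rejected)   # True
```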
Common methods
Operation | Description |
---|---|
set() | New empty set |
set([1,2,3]) | New set with three elements, created from a list with the set function |
a.add(x) | Add element x to the set a |
a.clear() | Reset the set a to an empty state, discarding all its elements |
a.remove(x) | Remove element x from the set a. Raises KeyError if the element is not present. |
a.pop() | Removes and returns an arbitrary element from the set a, raising KeyError if the set is empty |
a.union(b) | Returns all the unique elements in a and b |
a.intersection(b) | Returns all the elements in both a and b |
a.update(b) | Sets the contents of a to be the union of the elements in a and b |
a.intersection_update(b) | Sets the contents of a to be the intersection of the elements in a and b |
a.difference(b) | Returns the elements in a that are not in b |
a.difference_update(b) | Sets a to the elements in a that are not in b |
a.symmetric_difference(b) | Returns all the elements in either a or b but not both |
a.symmetric_difference_update(b) | Sets a to contain the elements in either a or b but not both |
a.issubset(b) | Returns True if the elements of a are all contained in b |
a.issuperset(b) | Returns True if the elements of b are all contained in a |
a.isdisjoint(b) | Returns True if a and b have no elements in common |
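The set-theory operations from the table, as a short sketch using two arbitrary sets named a and b as in the table:

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

union = a.union(b)                      # {1, 2, 3, 4, 5}
common = a.intersection(b)              # {3, 4}
only_in_a = a.difference(b)             # {1, 2}
in_one_only = a.symmetric_difference(b) # {1, 2, 5}

# The *_update variants change the set in place instead of returning a new one.
c = a.copy()
c.difference_update(b)                  # c is now {1, 2}

print(common.issubset(a))     # True
print(a.issuperset(common))   # True
print(a.isdisjoint({9}))      # True
```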
We should also mention the frozenset subtype here. As we said, a set can contain only immutable data types. What if you wanted to nest a set inside another set? The set is mutable, so that is not possible.
Trying it raises an "unhashable type" error, which means the same: you are trying to add a mutable element to a collection that can contain only immutable ones. For that purpose, you have the frozenset.
Adding a frozenset to a set succeeds, as the frozenset is an immutable type.
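Both cases can be sketched as follows, with arbitrary values. The exact wording of the error message may differ between Python versions, so it is only captured here, not compared:

```python
outer = {1, 2}

# A regular set is mutable and therefore cannot be a member of another set.
try:
    outer.add({3, 4})
    error_message = ""
except TypeError as exc:
    error_message = str(exc)   # mentions "unhashable type"

# A frozenset is immutable and can be added without problems.
inner = frozenset({3, 4})
outer.add(inner)

print(error_message)
print(inner in outer)   # True
```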
Congrats if you made it to the end of this tutorial! You now have a complete overview of the complex built-in data types in Python. Stay tuned for the next tips that will take you deeper into the snake's pit.
Reference
This tip uses information from:
- "Learning Python" 4th edition, by Mark Lutz, published by O'Reilly 2009, chapters 5, 8, 9, 13.
- "Python for Data Analysis" 2nd edition by Wes McKinney, published by O'Reilly 2018, chapter 3.
About the author
This author pledges the content of this article is based on professional experience and not AI generated.