By: Hristo Hristov | Updated: 2022-01-26
Problem
If you have followed along with the last Python tutorial in this series, you may be eager to start working with Python as a data professional. Where do beginners get started, though? I believe the natural first step is to examine the available data types in Python. They will allow you to store data and model your Python variables according to the problem at hand.
Solution
As with every programming starter's guide, we should kick things off by examining the built-in data types and data structures. Mastering these core programming elements and concepts will ensure you have a firm grip on the Python programming language and can continue further with more complex concepts, such as creating your own data types (i.e. classes). Note that we will only briefly list the different data types without going into much detail, since every built-in data type deserves its own chapter in the tutorial.
Python Data Types Overview
Here is a brief overview of all built-in Python data types (meaning they are implemented for you in Python and you can use them via a class without importing additional modules) and some additional data types (e.g. date and time, which are not built-in and require importing a module). We will break down each one of these as we go along:
Category | Type | Example |
---|---|---|
Numeric | int | 11234 |
Numeric | float | 0.5772 |
Numeric | complex | complex(4, 2) |
Text | str | 'abcd' |
Sequence Types | list | [1, 2, 3] |
Sequence Types | tuple | (1, 2) |
Sequence Types | range | range(5) |
Mapping | dict | {'source': 'MSSQLTips', 'language': 'English'} |
Set | set | {'a', 'b', 'c'} |
Boolean | bool | True or False |
Binary | bytes | b'\x00\x01', b'binary string type' |
Binary | bytearray | Mutable bytes type |
Binary | memoryview | Bytes object reference |
Date and time | date | date(2021, 12, 20) |
Date and time | datetime | datetime(2021, 12, 20, 22, 33, 59) |
Date and time | time | time(22, 33, 59) |
Date and time | timedelta | timedelta(days=365) |
In this tutorial we will examine the numeric, text, Boolean, binary, and date and time types (again, date and time is not built-in but is included here for the sake of completeness). In a subsequent Python tutorial we will examine the sequence, mapping and set data types. The table does not include program-specific data types such as functions, modules or classes.
Numeric
To represent a number there are three options in Python:
For integer values, use int(). You can cast a string to an integer with int() if the string does represent an integer, or you can assign an integer literal directly, in which case the variable is instantiated as an int.
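The two ways of getting an int described above can be sketched as follows. The variable names and values are only illustrative and do not come from the original screenshots:

```python
# Casting a string that holds a whole number
count = int("42")

# Assigning an integer literal directly
year = 2022

print(type(count), count)  # <class 'int'> 42
print(type(year), year)    # <class 'int'> 2022

# A string that does not hold a whole number cannot be cast
try:
    int("4.2")
except ValueError as err:
    print("Cannot convert:", err)
```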
To represent a decimal number, you can use float(). Again, if you assign a decimal number to a variable right away, you will get the float type.
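A minimal sketch of both approaches, with made-up values:

```python
# Casting a string or an integer to float
ratio = float("0.5772")
whole = float(3)

# Assigning a decimal literal directly
price = 19.99

print(type(ratio), ratio)  # <class 'float'> 0.5772
print(type(price), price)  # <class 'float'> 19.99
print(whole)               # 3.0
```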
The problem you may run into is that float may fail to represent your decimal number accurately. To do so with maximum accuracy, you can use the decimal module.
This gives you the extra precision needed. Note that the number is passed as a string to the Decimal constructor.
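The classic illustration of the float accuracy problem is adding 0.1 and 0.2. The sketch below is an assumed example, not the one from the original article. It also shows why the value should be passed to Decimal as a string:

```python
from decimal import Decimal

# Binary floating point cannot store 0.1 or 0.2 exactly
float_sum = 0.1 + 0.2
print(float_sum)         # 0.30000000000000004
print(float_sum == 0.3)  # False

# Decimal built from strings keeps the exact decimal values
decimal_sum = Decimal("0.1") + Decimal("0.2")
print(decimal_sum)                    # 0.3
print(decimal_sum == Decimal("0.3"))  # True

# Building a Decimal from a float carries the rounding error along
print(Decimal(0.1) == Decimal("0.1"))  # False
```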
Complex numbers: you can use the class complex() and pass it either a single string (such as '4+2j') or two numbers (the real and the imaginary part) to produce a complex number.
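As a short sketch with illustrative values, the same complex number can be created in three ways:

```python
# Two numbers: real part and imaginary part
z1 = complex(4, 2)

# A single string; it must not contain spaces around the + sign
z2 = complex("4+2j")

# A literal with the j suffix
z3 = 4 + 2j

print(z1)                # (4+2j)
print(z1 == z2 == z3)    # True
print(z1.real, z1.imag)  # 4.0 2.0
```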
Common operations
Addition, subtraction, multiplication, division, etc. are all covered by the default operators. Here is a reference to the operations you can do with the numeric data types, also including some extra operations:
Operation | Expression | Example | Explanation |
---|---|---|---|
Addition | + | 5 + 3 gives 8 | The two terms are added. |
Subtraction | - | 3 - 5 gives -2 | The two terms are subtracted, amounting to either a positive or a negative result. |
Multiplication | * | 5 * 3 gives 15 | The two factors are multiplied. |
Division | / | 4 / 2 gives 2.0 | The result is always a float even if you divide integers. If needed, 2.0 can be cast to an integer with int(). |
Integer division | // | 5 // 3 gives 1 | Rounds down to the closest integer of the actual result. The actual result is 1.6667 so the outcome of the integer division is 1. |
Modulo division | % | 5 % 3 gives 2 | Returns the remainder of the division. In this case 3 is contained once in 5 and 2 is the remainder. Therefore, 2 is the result of the modulo division. |
Power | ** | 2 ** 3 gives 8 | Raises 2 (base) to the power of 3 (exponent). |
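The operations in the table can be tried out with a few lines like these. The operands are chosen for illustration, and the negative case is added to show what "rounds down" means for integer division:

```python
total = 5 + 3          # 8
difference = 3 - 5     # -2
product = 5 * 3        # 15
quotient = 4 / 2       # 2.0, always a float
as_int = int(4 / 2)    # 2
floored = 5 // 3       # 1
floored_neg = -5 // 3  # -2, rounds down toward negative infinity
remainder = 5 % 3      # 2
power = 2 ** 3         # 8

print(total, difference, product, quotient, as_int)
print(floored, floored_neg, remainder, power)
```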
Text
Python strings must be enclosed in single (') or double (") quotation marks. It is not important which one you prefer, but it is important to keep it consistent throughout your script or program.
What could you do with a string? For example, you can access its elements by the corresponding index.
The first letter sits at index 0, while the last one sits at index -1. This is very similar to how we access the elements of a tuple or a list. Strings, like the tuple type, are immutable and cannot be changed once assigned.
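A minimal sketch of quoting, indexing and immutability. The variable name title and its value are assumptions for illustration:

```python
# Single and double quotes produce the same string
title = "MSSQLTips"
same_title = 'MSSQLTips'
print(title == same_title)  # True

# Indexing: first and last character
print(title[0])   # M
print(title[-1])  # s

# Strings are immutable, so item assignment fails
try:
    title[0] = "m"
except TypeError as err:
    print("Cannot modify a string:", err)

# Build a new string instead of changing the old one
lowered = "m" + title[1:]
print(lowered)  # mSSQLTips
```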
Escape sequences are needed if you want to include special characters in your string literal, for example a single quote inside a single-quoted string.
Valid ways to handle it are adding a backslash before the quote or enclosing the string in triple quotes. Here is a list of common escape sequences:
Escape Sequence | Description |
---|---|
\newline | Backslash and newline ignored |
\\ | Backslash |
\' | Single quote |
\" | Double quote |
\a | ASCII Bell |
\b | ASCII Backspace |
\f | ASCII Formfeed |
\n | ASCII Linefeed |
\r | ASCII Carriage Return |
\t | ASCII Horizontal Tab |
\v | ASCII Vertical Tab |
\ooo | Character with octal value ooo |
\xHH | Character with hexadecimal value HH |
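A few of the escape sequences from the table in action. The strings are made up for illustration:

```python
# Three equivalent ways to include a single quote
escaped = 'It\'s a tip'
double_quoted = "It's a tip"
triple_quoted = '''It's a tip'''
print(escaped == double_quoted == triple_quoted)  # True

# Tab and linefeed
print("col1\tcol2\nval1\tval2")

# A literal backslash takes two characters in the source but one in the string
path = "C:\\temp"
print(path)       # C:\temp
print(len(path))  # 7

# Hexadecimal and octal escapes for the letter A (code point 65)
print("\x41")  # A
print("\101")  # A
```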
Common operations
Operation | Expression | Example | Explanation |
---|---|---|---|
Access a character | Use the corresponding index | title[2:5] | We can supply an index position or a range. In this case, it is from 2 to 5 (not inclusive of the last index). |
Concatenate | + | 'abc' + 'def' | Concatenates two or more strings. |
Enumerate | enumerate | list(enumerate(title)) | Lists all characters in a string with their position. |
Test for membership | in | 'SQL' in title | Using the previous variable title, we check if the string 'SQL' is contained in it with the in operator. |
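The string operations above can be sketched like this. The value assigned to title is an assumption, since the original example value is not available:

```python
title = "MSSQLTips Python tutorial"  # hypothetical value

# Slice from index 2 up to, but not including, index 5
print(title[2:5])  # SQL

# Concatenate two or more strings
greeting = "Hello" + ", " + "world"
print(greeting)  # Hello, world

# Enumerate the characters together with their positions
for position, char in enumerate("SQL"):
    print(position, char)

# Membership test is case-sensitive
print("SQL" in title)  # True
print("sql" in title)  # False
```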
Boolean Type
The two Boolean values in Python are True and False. In a numeric context, they behave like 1 and 0, respectively. Using the bool() function you can convert a value to one of the Boolean values. For example, an expression returning a positive or a negative integer will evaluate to True.
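A short sketch of bool() on integer expressions and of Booleans in a numeric context, using illustrative numbers:

```python
# Any non-zero integer, positive or negative, is truthy
positive = bool(5 - 3)
negative = bool(3 - 5)
zero = bool(3 - 3)
print(positive, negative, zero)  # True True False

# In arithmetic, True counts as 1 and False as 0
print(True + True)   # 2
print(False * 10)    # 0

# bool is in fact a subclass of int
print(isinstance(True, int))  # True
```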
Keep in mind the following built-in objects will always evaluate to False if you wrap them in bool():
- constants defined to be false: None and False
- zero of any numeric type: 0, 0.0, 0j, Decimal(0), Fraction(0, 1)
- empty strings, sequences or collections: '', (), [], {}, set(), range(0)
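The list above can be checked directly. This sketch also adds a few non-empty values, which are my own additions, to show the contrast:

```python
from decimal import Decimal
from fractions import Fraction

falsy_values = [
    None, False,                                # constants
    0, 0.0, 0j, Decimal(0), Fraction(0, 1),     # numeric zeros
    '', (), [], {}, set(), range(0),            # empty containers
]

# None of them is truthy
print(any(bool(value) for value in falsy_values))  # False

# Non-empty objects are truthy, even if their content looks "empty"
truthy_values = ["0", " ", [0], (None,)]
print(all(bool(value) for value in truthy_values))  # True
```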
Common operations
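Operation | Expression | Example | Explanation |
---|---|---|---|
And | and | True and False gives False | Basic Boolean operation as defined in Boolean algebra: true only if both operands are true. |
Or | or | True or False gives True | Basic Boolean operation as defined in Boolean algebra: true if at least one operand is true. |
Not | not | not True gives False | Basic Boolean operation as defined in Boolean algebra: negates the operand. |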
With Python's inherent readability, you can see how easy it would be to construct a logical expression testing membership of something in a collection.
The same can also be done for a numeric sequence.
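Such membership expressions could look like the following sketch. The list languages and the range numbers are hypothetical examples:

```python
languages = ["Python", "SQL", "R"]  # hypothetical list

has_both = "Python" in languages and "SQL" in languages
has_either = "Java" in languages or "R" in languages
is_missing = "Java" not in languages
print(has_both, has_either, is_missing)  # True True True

# The same idea with a numeric sequence (1 through 10)
numbers = range(1, 11)
in_bounds = 5 in numbers and 11 not in numbers
print(in_bounds)  # True
```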
Binary
In this data type category, we have three objects:
bytes
bytes: returns an immutable bytes object initialized in one of three ways:
- A zero-filled bytes object of a specified length, for example bytes(10).
- An iterable of integers, for example using range().
- Copying existing binary data via the buffer protocol.
We can do a similar thing with a string object. In this case the value of the variable str_var is written as a bytes literal by prepending b to the string literal. If not encoded, a string cannot be represented as a bytes object.
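The ways of creating a bytes object listed above can be sketched as follows. The name str_var comes from the text, while its value and the other names are assumed for illustration:

```python
# Zero-filled bytes object of length 10
zeros = bytes(10)
print(zeros)  # b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'

# From an iterable of integers (each must be in 0..255)
from_range = bytes(range(5))
print(from_range)  # b'\x00\x01\x02\x03\x04'

# Copy of existing binary data via the buffer protocol
copied = bytes(bytearray(b"abc"))
print(copied)  # b'abc'

# A bytes literal: the b prefix goes in front of the string literal
str_var = b"MSSQLTips"
print(type(str_var))  # <class 'bytes'>

# A regular str has to be encoded explicitly
encoded = "MSSQLTips".encode("utf-8")
print(encoded == str_var)  # True
```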
bytearray
bytearray: same as bytes but mutable. You must always call the constructor; it is not possible to use a literal syntax (e.g. by prepending something to the value of the variable).
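A brief sketch of the mutability difference, with illustrative values:

```python
# bytearray is created through its constructor
data = bytearray(b"abc")

# Elements are integers and can be reassigned in place
data[0] = ord("x")
data.append(ord("d"))
print(data)  # bytearray(b'xbcd')

# The same assignment on an immutable bytes object fails
frozen = b"abc"
try:
    frozen[0] = ord("x")
except TypeError as err:
    print("bytes is immutable:", err)
```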
memoryview
memoryview: creates a memoryview that references an object which must support the buffer protocol, such as bytes and bytearray.
The buffer protocol lets several objects work with the same underlying data instead of making a new copy, which would increase memory and computational requirements. This is particularly useful when working with large binary objects, such as images, video, and audio.
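The following sketch uses made-up data to contrast a memoryview slice, which shares the underlying buffer, with a regular slice, which copies it:

```python
buffer = bytearray(b"MSSQLTips")

# A memoryview slice references the same memory; nothing is copied
view = memoryview(buffer)
part = view[2:5]
print(bytes(part))  # b'SQL'

# Writing through the view changes the original buffer
part[0] = ord("s")
print(buffer)  # bytearray(b'MSsQLTips')

# A regular slice is an independent copy
copy = buffer[2:5]
copy[0] = ord("X")
print(buffer)  # bytearray(b'MSsQLTips'), unchanged by the copy
```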
Common operations
With the binary data types, you can perform bitwise operations, i.e. operations that work on the individual bits. In Python the bitwise operators (&, |, ^, ~, << and >>) apply to the integer values that make up a bytes or bytearray object rather than to the object as a whole. As a data professional, you may not need to execute bitwise operations frequently.
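A minimal sketch of the bitwise operators on integers, followed by one assumed way of applying them byte by byte to two bytes objects:

```python
a = 0b1100  # 12
b = 0b1010  # 10

print(a & b)   # 8   (AND)
print(a | b)   # 14  (OR)
print(a ^ b)   # 6   (XOR)
print(a << 1)  # 24  (shift left)
print(a >> 2)  # 3   (shift right)
print(~a)      # -13 (NOT)

# bytes objects do not support & directly, so combine them element-wise
left = bytes([0b1100, 0xFF])
right = bytes([0b1010, 0x0F])
anded = bytes(x & y for x, y in zip(left, right))
print(anded)  # b'\x08\x0f'
```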
Date and time
For the sake of making this tip exhaustive, I am including this data type too, although it is not built-in. It ships with the Python standard library, but to work with datetime object types you must import the datetime module.
Here are a couple of examples with the available types the datetime module implements:
Get the current date and time: the result is a datetime object showing year, month, day, hour, minute, second and microsecond.
Get the current date: similarly, you can use the date type to return today's date.
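Both calls can be sketched as below. The printed values depend on when the code runs, so no output is shown:

```python
from datetime import date, datetime

# Current local date and time as a datetime object
now = datetime.now()
print(now)
print(now.year, now.month, now.day, now.hour, now.minute, now.second, now.microsecond)

# Current local date only, as a date object
today = date.today()
print(today)
```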
Common operations
The core operation is adding or subtracting dates and/or times. For example, with two variables start_date and end_date instantiated by using the date constructor, subtracting one from the other gives the difference between the two dates. The result is a timedelta object.
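A sketch of the date subtraction described above. The names start_date and end_date come from the text, while the concrete dates are assumptions:

```python
from datetime import date, timedelta

start_date = date(2021, 1, 1)
end_date = date(2021, 12, 20)

# Subtracting two dates yields a timedelta
difference = end_date - start_date
print(type(difference))  # <class 'datetime.timedelta'>
print(difference.days)   # 353

# Adding a timedelta to a date yields a new date
one_year_later = start_date + timedelta(days=365)
print(one_year_later)  # 2022-01-01
```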
Similarly, we can use the datetime constructor, which requires year, month and day and optionally accepts hour, minute, second and microsecond.
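The same subtraction works with datetime objects. The timestamps below are illustrative only:

```python
from datetime import datetime

# Year, month and day are required; the time parts default to zero
midnight = datetime(2021, 12, 20)
print(midnight)  # 2021-12-20 00:00:00

start = datetime(2021, 12, 20, 8, 30, 0)
end = datetime(2021, 12, 20, 22, 33, 59)

# The difference is again a timedelta
elapsed = end - start
print(elapsed)                  # 14:03:59
print(elapsed.total_seconds())  # 50639.0
```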
Pandas data types
Finally, I want to mention the pandas data types. These are available separately, from the pandas package. As a data professional, chances are you will work extensively with the pandas package to munge and wrangle your data. There are some subtle differences between the simple built-in Python data types and the pandas data types. You can check them out in the following table:
Pandas dtype | Python type | Usage |
---|---|---|
object | str | Text or mixed numeric and non-numeric values |
int64 | int | Integer numbers |
float64 | float | Floating point numbers |
bool | bool | True/False values |
datetime64[ns] | NA | Date and time values. As we saw, Python implements these in the datetime module, which is not built in. |
timedelta64[ns] | NA | Differences between two datetimes. The corresponding Python type is datetime.timedelta. |
category | NA | Finite list of text values. In general, usage of object is advised. |
Congrats if you made it to the end! Now you know the basic built-in data types in Python. These core concepts will help you along your journey in the Python universe.
Reference
The article uses information from the official Python documentation, accessed December 2021 at https://docs.python.org/3/library/stdtypes.html.
About the author
This author pledges the content of this article is based on professional experience and not AI generated.