What’s the problem?
Python is dynamically typed (don’t worry if you’re not sure what this means, we’ll go into detail soon). This allows for more flexibility and is easier to learn, but can cause a slew of related bugs. The worst part is, these are bugs that might not show up until runtime, that is, when your code is actually in use!
How can it be solved?
By adding type hints and using a static type checker (like mypy) we can catch these errors before they’re encountered in the wild.
Let’s get into it!
Python’s Typing Types
You might have heard that Python is a dynamically typed and strongly typed language. What does this mean? These describe two different behaviours, and don’t necessarily always have to be tied together. There can be statically typed, strongly typed languages and dynamically typed, weakly typed languages.
This post will talk about the problems of dynamic typing in Python, and how they can be solved with static type checking and type hints.
First, what is dynamic typing? It means that variables can change to point to different types of objects. It also means functions can accept parameters of any type, as well as return any type.
You’ve probably seen code like this before:
val = "World" # val is a string
# some more code in here
val = 5 # val is now an integer
This code is perfectly valid in Python, and will execute fine.
Compare this to a statically typed language like Java.
String val = "World";
// some more code in here
val = 5; // not valid, as val must be a String
Since val was declared with the type String, it can’t later be set to an integer. This Java code would not compile.
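Back in Python, both halves of the dynamic typing definition can be seen in a few lines. This is only a small sketch; the double function is a made-up name used for illustration. It shows a variable being re-bound to a different type, and a function that accepts (and returns) whatever type it is given:

```python
val = "World"
first_type = type(val).__name__   # the name currently points at a str

val = 5
second_type = type(val).__name__  # the same name now points at an int


def double(thing):
    # No parameter or return type is declared, so anything that
    # supports `* 2` is accepted: numbers, strings, lists...
    return thing * 2


doubled_int = double(2)
doubled_str = double("ab")
doubled_list = double([1])
```

Nothing here is checked ahead of time. Passing an object that does not support `* 2` (a dict, say) would only fail once that call actually runs.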
We’ll get back to talking about which is better soon, but first we’ll take a quick detour to talk about strong vs weak typing.
Aside: What makes Python strongly typed?
The term strongly typed means that values themselves "know" what type they are, and the memory that stores them can’t be re-interpreted as a different type. An integer is always an integer. In Python, an integer can be converted to a float, but the original integer is still there in memory; it’s just a float copy that has been created.
Compare that to a language like C. A variable can be declared as an int, but its memory can later be re-interpreted as a float (through a pointer cast, for example). That is, there are the same underlying bits in memory, but during execution they will be interpreted differently. For example, the binary value 01000000000000000000000000000000 would be interpreted as 2 as an IEEE754 floating point number, but would be 1073741824 as an integer. This is the same underlying memory, just treated differently.
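If you’d like to check those numbers without writing any C, the standard library struct module can pack a bit pattern into raw bytes and then read the same bytes back as a different type. This is just a sketch to verify the arithmetic above; the variable names are my own:

```python
import struct

# The 32-bit pattern from the text, grouped into bytes for readability.
bits = 0b01000000_00000000_00000000_00000000

# Pack the pattern as a 4-byte unsigned integer, then read those same
# four bytes back as a 32-bit IEEE 754 float.
raw = struct.pack(">I", bits)
as_float = struct.unpack(">f", raw)[0]

# An ordinary Python conversion does not reinterpret anything:
# it builds a separate float object and leaves the int alone.
n = 5
f = float(n)
```

Note that struct works on a copy of the bytes. The Python int object itself is never reinterpreted, which is the point of the strong typing description above.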
That was a bit of a deep dive for just a small throwaway, but it’s good to understand weak vs. strong typing.
Now, let’s talk about the pros and cons of static vs. dynamic.
Which is better, static or dynamic?
Neither is inherently better, but each has its pros and cons.
Dynamic typing can be more succinct and easier. One of the advantages of Python is that it’s quick to learn as we don’t have to worry about specifying types. This can lead to errors though.
Take this Python code that builds the string Hello World.
h = "Hello"
w = " World"
o = h + w
o now contains the string Hello World.
Now, what if we change the type of one of the variables, to an integer?
h = "Hello"
w = 5
o = h + w
Executing this code raises an exception like this:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only concatenate str (not "int") to str
So Python obviously flags this as a problem, but unfortunately not until runtime when we hit this code. Which means if this piece of code is deep inside a program, it could be running successfully (in production perhaps!) before it suddenly dies.
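To illustrate how such a bug can hide "deep inside a program", here is a hedged sketch. The format_total function and its currency parameter are hypothetical, invented for this example. The faulty line sits on a branch that is rarely taken, so the code appears to work until a caller finally reaches it:

```python
def format_total(total, currency=None):
    if currency is None:
        # Common path: works every time it is exercised.
        return "Total: " + str(total)
    # Rare path: the str() call was forgotten, so this line raises
    # TypeError, but only when a caller actually passes a currency.
    return "Total: " + total + " " + currency


common = format_total(10)

try:
    format_total(10, "USD")
    failed = False
except TypeError:
    failed = True
```

The first call succeeds and the second raises, even though both go through the same function. A static checker can flag the second return line without either call ever running, provided total is annotated as an int.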
Static typing allows this whole class of errors to be picked up at compile time. Returning to Java, here’s some code that builds the string Hello World.
String h = "Hello";
String w = " World";
String o = h + w;
Now what happens if we try to store an integer in one of the variables?
String h = "Hello";
String w = 5;
String o = h + w;
This code won’t compile, because an int can’t be assigned to a String variable. That means it can’t be released into the wild (production) with this bug.
Fortunately, since version 3.5, Python can be enhanced with type annotations. The type annotations are completely optional when executing, and the Python interpreter itself won’t validate that they are correct. They need to be manually checked prior to execution, using a tool such as mypy or pyright. In this post we’ll be using mypy.
Type hints can be incrementally added to a codebase, so it’s easy to get started and gradually enhance your codebase while you build your knowledge. Let’s get started.
Your First Type Annotation and Checking
Let’s start with a simple example, the classic Greeter function. It takes a name and returns a greeting for the user.
If you want to follow along, you’ll need to install mypy, which can be installed with pip:
$ pip install mypy
Remember you’ll need Python 3.5 or newer for these basic examples (the variable annotations used later need 3.6). Newer annotations are added in newer Python versions.
Now, create a file called greeter.py. This will contain the make_greeting function, and at the end, a call to it to generate the greeting and print it out.
def make_greeting(name):
    return "Hello, " + name


if __name__ == "__main__":
    greeting = make_greeting("Ben")
    print(greeting)
Executing it gives the output we expect:
$ python3 greeter.py
Hello, Ben
Great, now let’s try to break it. We can update the call to make_greeting to pass in a type that’s not able to be concatenated with a string. For example, an integer. You can make this change on line 6:
    greeting = make_greeting(5)
Now try running the script again:
$ python3 greeter.py
Traceback (most recent call last):
  File "/Users/ben/greeter.py", line 6, in <module>
    greeting = make_greeting(5)
  File "/Users/ben/greeter.py", line 2, in make_greeting
    return "Hello, " + name
TypeError: can only concatenate str (not "int") to str
As you would expect, you get an error at runtime indicating that combining a string and integer together is not possible.
We can try to run mypy against the file and see if it detects the error. Simply run the command mypy <filename>.
$ mypy greeter.py
Success: no issues found in 1 source file
mypy has not detected any errors in the file. This is to be expected. We have not added any type hinting and so it does not know what types are correct.
We’ll start by annotating the type of the parameter of the function. This is done by adding : (type) (without the brackets) after the parameter(s). In our example, we expect name to be a str, so we’ll add : str:
def make_greeting(name: str):
    return "Hello, " + name
Now, we can recheck our (incorrect) code with mypy:
$ mypy greeter.py
greeter.py:6: error: Argument 1 to "make_greeting" has incompatible type "int"; expected "str"
Found 1 error in 1 file (checked 1 source file)
Awesome, we’ve found an error before runtime! However, we need to reiterate that even though mypy has detected the error, the annotations still don’t serve a purpose to the Python interpreter. Therefore, we would not be prevented from running the script again and receiving the same TypeError as we saw previously.
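You can see this for yourself with a short sketch. The interpreter does record the hint on the function object (in its __annotations__ attribute), but it never consults it when the function is called:

```python
def make_greeting(name: str):
    return "Hello, " + name


# The hint is stored on the function object...
hints = make_greeting.__annotations__

# ...but nothing checks it at call time. The failure still comes from
# the string concatenation inside the function, not from the hint.
try:
    make_greeting(5)
    raised = False
except TypeError:
    raised = True
```

The TypeError here is exactly the one from the unannotated version, which is why a separate checker such as mypy is needed to get any value out of the hints.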
Next, let’s look at annotating variable and return types.
Variable and return type annotation
Type hints can also be applied to variables, so mypy knows what types they are allowed to contain, and to the return values of functions and methods, so mypy knows what values they are allowed to return.
Let’s add another function to greeter.py called make_grading. It takes a string containing the user’s mood and then "grades" it on a scale of 1-10, returning the grading.
def make_grading(mood: str):
    return 10 if mood == "good" else 1
You can add this function to your greeter.py too.
Now, we can pretend we’ve made some mistakes in calling our functions. We accidentally generate a greeting using make_grading instead of make_greeting.
if __name__ == "__main__":
    greeting = make_grading("Ben")
    print(greeting)
This code doesn’t generate any errors, but it doesn’t quite work as expected:
$ python3 greeter.py
1
mypy doesn’t detect anything wrong with our code either. But we can add more annotations so it does flag this weirdness. First we’ll add return types to our functions. This is done by adding -> (type) (again, without brackets) to the function definition.
Our functions would therefore be updated like this:
def make_grading(mood: str) -> int:
    return 10 if mood == "good" else 1


def make_greeting(name: str) -> str:
    return "Hello, " + name
This tells us make_grading accepts a str and returns an int, while make_greeting takes a str and returns a str.
You can update your greeter.py with these changes too.
mypy will still not give us any errors as it’s perfectly fine to try to print() an int. But we can also annotate variables so mypy knows what we’re expecting. This is done similarly to parameters, by adding : (type) after the variable name. In our case, after the name greeting:
    greeting: str = make_grading("Ben")
Now, running mypy on the file will cause it to complain, as greeting should be a str but make_grading returns an int.
$ mypy greeter.py
greeter.py:10: error: Incompatible types in assignment (expression has type "int", variable has type "str")
Found 1 error in 1 file (checked 1 source file)
Once again, this code will still run fine, as what we are doing is perfectly valid Python. But mypy has helped us locate this logic "mistake" before runtime.
Also note that mypy will raise errors if you’ve annotated a return type and not returned the right type. For example, temporarily change make_greeting to this:
def make_greeting(name: str) -> str:
    return 0
Then run mypy:
$ mypy greeter.py
greeter.py:6: error: Incompatible return value type (got "int", expected "str")
Found 1 error in 1 file (checked 1 source file)
Change it back to returning a string so mypy is happy.
The final way of annotating that we’ll introduce now is to annotate attributes in a class. Here’s a short example of annotating a User class with a name and age attribute. It’s stored in user.py.
class User:
    name: str
    age: int = 20

    def __init__(self, name: str) -> None:
        self.name = name


if __name__ == "__main__":
    u: User = User("Ben")
    u.name = 10
This example shows a few things of interest:
- Class attributes are annotated like variables, and can have values set, or not.
- __init__() methods always have the return type None.
- self does not need to be annotated.
- Type annotation supports both built-in types and our own types. u is annotated as type User, which is a class we have written.
Of course, this code has a mistake in the last line, trying to set name to an int. If we run mypy on this file, it spots it for us:
$ mypy user.py
user.py:11: error: Incompatible types in assignment (expression has type "int", variable has type "str")
Found 1 error in 1 file (checked 1 source file)
That covers most of the syntax related to type hinting, but hinting can get a lot more advanced. Next we’ll take another small step up and look at how to annotate when types could be multiple values or None.
Multiple types and the typing module
What about when multiple types are acceptable? Returning to our User class above, a user’s age could be more accurately represented by a float than an int. What we want is referred to as a "union of types".
The Python typing module has a bunch of helpers to wrap one or more types and provide extra information. To use them, just import them from that module. In our case, to specify that either int or float is allowed for the user’s age, we need the Union type helper, which can be imported like a normal Python identifier:
from typing import Union
Then it can be applied to the User class like so:
class User:
    name: str
    age: Union[int, float] = 20

    def __init__(self, name: str) -> None:
        self.name = name
This is a good use of Union as float and int have compatible interfaces. For example, ints can be added to floats. If we defined it as something like Union[str, int], it would not be such a good choice as we’d still have to do type checking before we would know if we could add the age to another value (for example).
Union can contain any number of types in it. However, if you put in too many types, it can remove any benefits you get.
For example, if we annotate a variable with many different types, with incompatible interfaces, we’d need to check all the types manually before using them:
value: Union[int, str, dict, list] = get_value()
if isinstance(value, int):
    value = value - 1
elif isinstance(value, str):
    value = value.find("a")
elif isinstance(value, dict):
    value = value["a"]
else:
    value = value[0]
As you can see from this kind of strange example, since it could be one of many types you still need to do lots of runtime checking.
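For a runnable side-by-side, here is a small sketch contrasting the two situations. Both function names (add_year and shrink) are made up for this illustration. The first Union needs no checks because its members share an interface, while the second needs a branch per type:

```python
from typing import Union


def add_year(age: Union[int, float]) -> Union[int, float]:
    # int and float both support +, so no type check is needed.
    return age + 1


def shrink(value: Union[int, str]) -> Union[int, str]:
    # int and str have little in common, so each needs its own branch.
    if isinstance(value, int):
        return value - 1
    # After the isinstance check above, only str is left.
    return value[:-1]
```

The isinstance() check also helps the type checker: inside each branch it can narrow value to a single type, which is why the string slice on the last line is accepted.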
If we want to have a variable that could sometimes be a str and sometimes be None, you might think to annotate it with Union[None, str]. Returning to the typing module, a shortcut for this is provided with Optional. So, the equivalent is Optional[str]. Only one argument is allowed, however Unions can be nested inside Optional.
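The equivalence can be checked directly at runtime, since typing objects compare equal when they describe the same union. A quick sketch (the variable names are my own):

```python
from typing import Optional, Union

# Optional[X] is shorthand for a union of X and None.
same_as_union = Optional[str] == Union[str, None]

# Nesting a Union inside Optional flattens into a single union.
nested = Optional[Union[int, float]]
flat = Union[int, float, None]
```

Both comparisons hold, so which spelling you use is purely a readability choice.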
Let’s see an example of this in the User class. Since we might sometimes not know a user’s age, we’ll make it optional, but we’ll retain the annotation that says it can be an int or a float.
from typing import Union, Optional


class User:
    name: str
    age: Optional[Union[int, float]] = None

    def __init__(self, name: str) -> None:
        self.name = name
When using Optional, you’ll probably need to do a lot of checks for None. For example, let’s add a method called have_birthday() to User, which will increase the user’s age by 1.
class User:
    …

    def have_birthday(self) -> None:
        self.age += 1
No prizes here for guessing what mypy has to say about this:
$ mypy user.py
user.py:12: error: Unsupported operand types for + ("None" and "int")
user.py:12: note: Left operand is of type "Optional[float]"
Found 1 error in 1 file (checked 1 source file)
So we need to include a None check to cater for that case.
class User:
    …

    def have_birthday(self) -> None:
        if self.age is not None:
            self.age += 1
Adding the None check makes mypy happy.
$ mypy user.py
Success: no issues found in 1 source file
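Putting the pieces from this section together, the complete class might look like the sketch below. The two example users at the bottom are only there to exercise both the "known age" and "unknown age" paths:

```python
from typing import Optional, Union


class User:
    name: str
    age: Optional[Union[int, float]] = None

    def __init__(self, name: str) -> None:
        self.name = name

    def have_birthday(self) -> None:
        # Unknown ages stay unknown instead of raising TypeError.
        if self.age is not None:
            self.age += 1


known = User("Ben")
known.age = 20
known.have_birthday()

unknown = User("Bill")
unknown.have_birthday()
```

Without the None check, the second have_birthday() call would fail at runtime with the same kind of error mypy warned about.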
We’ll finish this typing introduction by looking at how to annotate containers like dict, list, set and tuple.
Annotating Containers
We can annotate container types the same as any other variable. Let’s start with lists. Here’s a function that returns a list of Users:
def generate_users(count: int) -> list:
    users = []
    for i in range(count):
        users.append(User(f"User {i}"))
    return users
And this is valid, and can be our "better-than-nothing" option if we don’t know what type of objects the list could contain. But in this case, we know it will contain User instances. We can use the List helper from typing, which allows us to specify the types inside the list, like this:
from typing import List
def generate_users(count: int) -> List[User]:
    users = []
    for i in range(count):
        users.append(User(f"User {i}"))
    return users
Now mypy will know that when you’re fetching items from the returned list that you should only be doing User-y stuff with them.
Python 3.9 has added shortcuts for these type annotations so they don’t have to be imported, and instead the normal classes can be used. For example, list[User] instead of List[User], which saves you an import. We’ll go through some more of these differences at the end.
Note that an empty list is always valid for the type hint.
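Here is a self-contained sketch of what "User-y stuff" means in practice. The cut-down User class and the first_names helper are my own additions for illustration:

```python
from typing import List


class User:
    def __init__(self, name: str) -> None:
        self.name = name


def generate_users(count: int) -> List[User]:
    return [User(f"User {i}") for i in range(count)]


def first_names(users: List[User]) -> List[str]:
    # mypy knows each item is a User, so .name is allowed here,
    # while an attribute User lacks (user.title, say) would be flagged.
    return [user.name for user in users]


names = first_names(generate_users(2))
no_names = first_names([])  # an empty list satisfies List[User] too
```

With the plain list return type from the first version, mypy would have no idea what the items are and could not check the attribute access at all.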
These type hints can be nested, but you need to think about the order. For example, if your function sometimes returns a list of Users, and sometimes returns None, then this should be Optional[List[User]]. However, if your function always returns a list, which will sometimes contain User and sometimes contain None, then this would be inverted: List[Optional[User]]. You can go deeper too! If your function sometimes returns None, and sometimes returns a list, but the list could contain User, str or None, you would do Optional[List[Optional[Union[User, str]]]]. Quite a mouthful! Don’t worry, we won’t be going that complicated in further posts.
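The difference in nesting order is easier to see with two small functions. This sketch uses plain strings instead of User objects to stay short, and both function names (find_all and find_each) are hypothetical:

```python
from typing import List, Optional


def find_all(prefix: str, names: List[str]) -> Optional[List[str]]:
    # The whole result may be missing: a list of matches, or None.
    matches = [n for n in names if n.startswith(prefix)]
    return matches or None


def find_each(wanted: List[str], names: List[str]) -> List[Optional[str]]:
    # Always a list, but individual slots may be None.
    return [w if w in names else None for w in wanted]
```

Callers of find_all have to check the result for None before looping over it, while callers of find_each can always loop but must check each item.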
We won’t talk much about sets, as they’re very similar to lists. Essentially, import Set from typing. Then annotate with the type inside the square brackets. For example, a set of strings would be annotated Set[str].
dict annotation is a little different. Use dict if you don’t know the keys/values of your dictionary. Otherwise, you can annotate in the form Dict[(key type), (value type)].
For example, if we have a dictionary that maps from the user’s name (a str) to their User object, it would have the type Dict[str, User].
from typing import Dict
u1 = User("Ben")
u2 = User("Bill")
users: Dict[str, User] = {"Ben": u1, "Bill": u2}
If you have different types of keys or values, then just nest in Union and Optional where appropriate.
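As a quick sketch of that nesting, along with the Set annotation mentioned earlier, here is a dictionary whose values may be missing. The data is invented for the example:

```python
from typing import Dict, Optional, Set

# Hypothetical data: ages keyed by user name; None means "unknown".
ages: Dict[str, Optional[int]] = {"Ben": 20, "Bill": None}

# A set of strings, holding the names whose age is actually known.
known: Set[str] = {name for name, age in ages.items() if age is not None}
```

Because the value type is Optional[int], mypy will insist on a None check before any arithmetic on a value pulled out of ages.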
Finally, tuples; they are immutable and therefore always have a fixed number of elements. The annotation should have the same number of elements.
To demonstrate, we could define a user with a tuple instead of its own class since we’re just storing a name and age.
from typing import Tuple
user: Tuple[str, int] = ("Ben", 20)
Or, to get more complicated, we could allow the age to be an optional integer or float, and also store the user’s birthdate:
from datetime import date
from typing import Tuple, Optional, Union
user: Tuple[str, Optional[Union[int, float]], date] = ("Ben", 20, date(1991, 12, 25))
The possibilities are endless.
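Returning to the simpler two-element tuple, the payoff of a per-position annotation shows up when unpacking. In this small sketch (the label variable is just for illustration), each unpacked name gets its own type:

```python
from typing import Tuple

user: Tuple[str, int] = ("Ben", 20)

# Each position has its own type, so unpacking gives a str and an int.
name, age = user
label = name.upper() + " is " + str(age + 1) + " next year"
```

Swapping the two names in the unpacking line would make mypy complain, since an int has no upper() method and a str cannot have 1 added to it.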
Shortcuts
As mentioned earlier, newer Python versions introduce shortcuts for typing. In 3.10, | can be used instead of Union (e.g. int | float instead of Union[int, float]).
Since 3.9, built-in classes can also be used instead of having to import special typing ones:
- dict[kt, vt] can be used instead of typing.Dict[kt, vt]
- list[t] instead of typing.List[t]
- set[t] instead of typing.Set[t]
- tuple[t1, t2, …] instead of typing.Tuple[t1, t2, …]
And more shortcuts are certain to be added to newer Python versions.
IDE Support
The last thing to mention is that type hinting also helps your IDE. PyCharm and VS Code can understand type hints and be more intelligent about what options they present when autocompleting. They can also show warnings as you code to let you know that you might be using the wrong types.
Conclusion
This is a short introduction to type hinting in Python, at least enough so you’re able to recognize type hints and what they mean. The optional-ness of type hinting means you can start annotating small parts of a project and start getting the benefits even if it’s not perfect.