Python Typing 1: Intro

Not a Python?

What’s the problem?

Python is dynamically typed (don’t worry if you’re not sure what this means, we’ll go into detail soon). This allows for more flexibility and is easier to learn, but can cause a slew of related bugs. The worst part is, these are bugs that might not show up until runtime, that is, when your code is actually in use!

How can it be solved?

By adding type hints and using a static type checker (like mypy) we can catch these errors before they’re encountered in the wild.

Let’s get into it!

Python’s Typing Types

You might have heard that Python is a dynamically typed and strongly typed language. What does this mean? These describe two different behaviours, and don’t necessarily always have to be tied together. There can be statically typed, strongly typed languages and dynamically typed, weakly typed languages.

This post will talk about the problems of dynamic typing in Python, and how they can be solved with static type checking and type hints.

First, what is dynamic typing? It means that variables can change to point to different types of objects. It also means functions can accept parameters of any type, as well as return any type.

You’ve probably seen code like this before:

val = "World"  # val is a string

# some more code in here

val = 5  # val is now an integer

This code is perfectly valid in Python, and will execute fine.
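
To see both halves of that definition at once, here is a minimal sketch. The helper name describe is made up for this illustration; it accepts an argument of any type and reports what it was given.

```python
def describe(value):
    """Return the name of whatever type the argument currently has."""
    return type(value).__name__


val = "World"
first = describe(val)   # val points to a str here

val = 5
second = describe(val)  # the same name now points to an int

print(first, second)
```

Nothing about val or describe had to be declared up front; the type travels with the object, not with the name.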

Compare this to a statically typed language like Java.

String val = "World";

// some more code in here

val = 5; // not valid, as val must be a String

Since val was declared with the type String, it can’t later be set to an integer. This Java code would not compile.

We’ll get back to talking about which is better soon, but first we’ll take a quick detour to talk about strong vs weak typing.

Aside: What makes Python strongly typed?

The term strongly typed means that values themselves "know" what type they are, and the memory that stores them can’t be re-interpreted as a different type. An integer is always an integer. In Python, an integer can be converted to a float, but the original integer is still there in memory; the float is a new copy that has been created.

Compare that to a language like C. A variable can be declared as an int, but its memory can later be reinterpreted as a float (for example, through a pointer cast or a union). That is, the same underlying bits are in memory, but during execution they will be interpreted differently. For example, the binary value 01000000000000000000000000000000 would be interpreted as 2 as an IEEE 754 floating point number, but would be 1073741824 as an integer. This is the same underlying memory, just treated differently.
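
If you’d like to check those numbers without writing any C, the following sketch uses Python’s standard struct module to pack the bit pattern into four raw bytes and then read those bytes back two different ways. The variable names are just for illustration, and 0x40000000 is assumed here as the hexadecimal spelling of the bit pattern quoted above.

```python
import struct

# 0x40000000 is a 1 in bit 30 and zeros everywhere else.
bits = 0x40000000

# Pack the pattern into four raw bytes, then read the same bytes two ways.
raw = struct.pack("<I", bits)
as_int = struct.unpack("<I", raw)[0]    # read as an unsigned 32-bit integer
as_float = struct.unpack("<f", raw)[0]  # read as an IEEE 754 32-bit float
print(as_int, as_float)

# In contrast, converting a Python int creates a new float object
# and leaves the original int untouched.
n = 5
f = float(n)
print(type(n).__name__, type(f).__name__)
```

Note that even here Python never reinterprets an existing object: struct.unpack builds brand new int and float objects from the bytes.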

That was a bit of a deep dive for just a small throwaway, but it’s good to understand weak vs. strong typing.

Now, let’s talk about the pros and cons of static vs. dynamic.

Which is better, static or dynamic?

Neither is inherently better, but each has its pros and cons.

Dynamic typing can be more succinct and easier. One of the advantages of Python is that it’s quick to learn as we don’t have to worry about specifying types. This can lead to errors though.

Take this Python code that builds the string Hello World.

h = "Hello"
w = " World"
o = h + w

o now contains the string Hello World.

Now, what if we change the type of one of the variables, to an integer?

h = "Hello"
w = 5
o = h + w

Executing this code raises an exception like this:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only concatenate str (not "int") to str

So Python flags this as a problem, but unfortunately not until runtime, when this code is actually executed. If this piece of code is deep inside a program, the program could be running successfully (in production, perhaps!) before it suddenly dies.
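
Here is a small sketch of how such a bug can hide. The function build_message is hypothetical; the point is that the broken line sits in a branch that is only reached for some inputs.

```python
def build_message(name, count):
    if count == 0:
        return "Hello " + name
    # Bug: count is an int, so this concatenation fails, but only when reached.
    return "Hello " + name + ", you have " + count + " new messages"


# The common path works, so a quick manual test looks fine.
ok = build_message("Ben", 0)

# The broken path only blows up once a user actually has messages.
try:
    build_message("Ben", 3)
    failed = False
except TypeError:
    failed = True

print(ok, failed)
```

The program can run happily for as long as nobody takes the second branch.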

Static typing allows this whole class of errors to be picked up at compile time. Returning to Java, here’s some code that builds the string Hello World.

String h = "Hello";
String w = " World";
String o = h + w;

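Now what happens if we try to assign an integer to one of the String variables?

String h = "Hello";
String w = 5; // not valid, as w must be a String
String o = h + w;

This code won’t compile, as an int can’t be assigned to a String variable, which means it can’t be released into the wild (production) with this bug.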

Fortunately, since version 3.5, Python can be enhanced with type annotations. The type annotations are completely optional when executing, and the Python interpreter itself won’t validate that they are correct. They need to be checked separately, prior to execution, using a tool such as mypy or pyright. In this post we’ll be using mypy.

Type hints can be incrementally added to a codebase, so it’s easy to get started and gradually enhance your codebase while you build your knowledge. Let’s get started.

Your First Type Annotation and Checking

Let’s start with a simple example, the classic Greeter function. It takes a name and returns a greeting for the user.

If you want to follow along, you’ll need to install mypy, which can be installed with pip:

$ pip install mypy

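Remember you’ll need at least Python 3.6 to run every example in this post: the typing module arrived in 3.5, but variable annotations and f-strings, both of which are used later, were added in 3.6. Newer annotations are added in newer Python versions.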

Now, create a file called greeter.py. This will contain the make_greeting function, and at the end, a call to it to generate the greeting and print it out.

def make_greeting(name):
    return "Hello, " + name


if __name__ == "__main__":
    greeting = make_greeting("Ben")
    print(greeting)

Executing it gives the output we expect:

$ python3 greeter.py
Hello, Ben

Great, now let’s try to break it. We can update the call to make_greeting to pass in a type that’s not able to be concatenated with a string. For example, an integer. You can make this change on line 6:

    greeting = make_greeting(5)

Now try running the script again:

$ python3 greeter.py
Traceback (most recent call last):
  File "/Users/ben/greeter.py", line 6, in <module>
    greeting = make_greeting(5)
  File "/Users/ben/greeter.py", line 2, in make_greeting
    return "Hello, " + name
TypeError: can only concatenate str (not "int") to str

As you would expect, you get an error at runtime indicating that combining a string and integer together is not possible.

We can try to run mypy against the file and see if it detects the error. Simply run the command mypy <filename>.

$ mypy greeter.py
Success: no issues found in 1 source file

mypy has not detected any errors in the file. This is to be expected. We have not added any type hinting and so it does not know what types are correct.

We’ll start by annotating the type of the parameter of the function. This is done by adding : (type) (without the brackets) after the parameter(s). In our example, we expect name to be a str, so we’ll add : str:

def make_greeting(name: str):
    return "Hello, " + name

Now, we can recheck our (incorrect) code with mypy:

$ mypy greeter.py
greeter.py:6: error: Argument 1 to "make_greeting" has incompatible type "int"; expected "str"
Found 1 error in 1 file (checked 1 source file)

Awesome, we’ve found an error before runtime! However, it’s worth reiterating that even though mypy has detected the error, the Python interpreter still doesn’t enforce the annotations. We are therefore not prevented from running the script again and receiving the same TypeError as we saw previously.
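
A quick sketch makes this concrete. The variable names hints and raised are only for illustration; the function is the annotated make_greeting from above.

```python
def make_greeting(name: str):
    return "Hello, " + name


# The hint is stored on the function object as plain metadata...
hints = make_greeting.__annotations__
print(hints)

# ...but nothing checks it when the function is called.
try:
    make_greeting(5)
    raised = False
except TypeError:
    raised = True

print(raised)
```

The interpreter records the annotation and otherwise ignores it, so the call with 5 still gets all the way to the failing concatenation.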

Next, let’s look at annotating variable and return types.

Variable and return type annotation

Type hints can also be applied to variables, so mypy knows what types they are allowed to contain, and to the return values of functions and methods, so mypy knows what values they are allowed to return.

Let’s add another function to greeter.py called make_grading. It takes a string containing the user’s mood and then "grades" it on a scale of 1-10, returning the grading.

def make_grading(mood: str):
    return 10 if mood == "good" else 1

You can add this function to your greeter.py too.

Now, we can pretend we’ve made some mistakes in calling our functions. We accidentally generate a greeting using make_grading instead of make_greeting.

if __name__ == "__main__":
    greeting = make_grading("Ben")
    print(greeting)

This code doesn’t generate any errors, but it doesn’t quite work as expected:

$ python3 greeter.py
1

mypy doesn’t detect anything wrong with our code either. But, we can add more annotations so it does flag this weirdness. First we’ll add return types to our functions. This is done by adding -> (type) (again, without brackets) to the function definition.

Our functions would therefore be updated like this:

def make_grading(mood: str) -> int:
    return 10 if mood == "good" else 1


def make_greeting(name: str) -> str:
    return "Hello, " + name

This tells us make_grading accepts a str and returns an int, while make_greeting takes a str and returns a str.

You can update your greeter.py with these changes too.

mypy will still not give us any errors, as it’s perfectly fine to print() an int. But we can also annotate variables so mypy knows what we’re expecting. This is done in a similar way to parameters, by adding : (type) after the variable name where it is defined. In our case, at the definition of greeting:

    greeting: str = make_grading("Ben")

Now, running mypy on the file will cause it to complain, as greeting should be a str but make_grading returns an int.

$ mypy greeter.py
greeter.py:10: error: Incompatible types in assignment (expression has type "int", variable has type "str")
Found 1 error in 1 file (checked 1 source file)

Once again, this code will still run fine, as what we are doing is perfectly valid Python. But, mypy has helped us locate this logic "mistake" before runtime.

Also note that mypy will raise errors if you’ve annotated a return type and not returned the right type. For example, temporarily change make_greeting to this:

def make_greeting(name: str) -> str:
    return 0

Then run mypy:

$ mypy greeter.py
greeter.py:6: error: Incompatible return value type (got "int", expected "str")
Found 1 error in 1 file (checked 1 source file)

Change it back to returning a string so mypy is happy.
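
For reference, here is a sketch of what a fully annotated greeter.py might look like once everything is corrected. The grading variable is an addition for this illustration and is not part of the steps above.

```python
def make_grading(mood: str) -> int:
    return 10 if mood == "good" else 1


def make_greeting(name: str) -> str:
    return "Hello, " + name


if __name__ == "__main__":
    # Each variable's annotation now matches its function's return type.
    greeting: str = make_greeting("Ben")
    grading: int = make_grading("good")
    print(greeting)
    print(grading)
```

With the annotations lined up like this, mypy should have nothing to complain about.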

The final way of annotating that we’ll introduce now is to annotate attributes in a class. Here’s a short example of annotating a User class with name and age attributes. It’s stored in user.py.

class User:
    name: str
    age: int = 20

    def __init__(self, name: str) -> None:
        self.name = name


if __name__ == "__main__":
    u: User = User("Ben")
    u.name = 10

This example shows a few things of interest:

  • Class attributes are annotated like variables, and can have values set, or not.
  • __init__() methods always have the return type None.
  • self does not need to be annotated.
  • Type annotation supports both built in types and our own types. u is annotated as type User which is a class we have written.

Of course, this code has a mistake in the last line, trying to set name to an int. If we run mypy on this file, it spots it for us:

$ mypy user.py
user.py:11: error: Incompatible types in assignment (expression has type "int", variable has type "str")
Found 1 error in 1 file (checked 1 source file)
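
If you’re curious what the interpreter does with those class-level annotations, this sketch pokes at the same User class at runtime. The variable names class_hints, has_class_age and has_class_name are only for illustration.

```python
class User:
    name: str
    age: int = 20

    def __init__(self, name: str) -> None:
        self.name = name


# Both annotations are recorded on the class...
class_hints = User.__annotations__

# ...but only age, which was given a value, exists as a class attribute.
has_class_age = hasattr(User, "age")
has_class_name = hasattr(User, "name")

u = User("Ben")
u.name = 10  # mypy rejects this line, but the interpreter accepts it

print(class_hints, has_class_age, has_class_name, u.name)
```

An annotation without a value only declares the attribute for the type checker; it’s __init__ that actually creates name on each instance.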

That covers most of the syntax related to type hinting, but hinting can get a lot more advanced. Next we’ll take another small step up and look at how to annotate when types could be multiple values or None.

Multiple types and the typing module

What about when multiple types are acceptable? Returning to our User class above, a user’s age could be more accurately represented by a float than an int. What we want is referred to as a "union of types".

The Python typing module has a bunch of helpers to wrap one or more types and provide extra information. To use them, just import them from that module. In our case, to specify that either int or float is allowed for the user’s age, we need the Union type helper, which can be imported like a normal Python identifier:

from typing import Union

Then it can be applied to the User class like so:

class User:
    name: str
    age: Union[int, float] = 20

    def __init__(self, name: str) -> None:
        self.name = name

This is a good use of Union as float and int have compatible interfaces. For example, ints can be added to floats. (Type checkers such as mypy already accept an int wherever a float is annotated, so the Union here mainly serves to illustrate the syntax.) If we defined it as something like Union[str, int], it would not be such a good choice, as we’d still have to check the type at runtime before we knew whether we could add the age to another value (for example).
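
As a rough sketch of that difference, compare the two hypothetical functions below. Neither years_until nor parse_age comes from the User class; they are made up to contrast a Union of compatible types with a Union of incompatible ones.

```python
from typing import Union


def years_until(age: Union[int, float], target: int) -> Union[int, float]:
    # int and float both support subtraction, so no type check is needed.
    return target - age


def parse_age(age: Union[str, int]) -> int:
    # str and int don't share an interface, so we have to check first.
    if isinstance(age, str):
        return int(age)
    return age


print(years_until(20, 65), years_until(20.5, 65))
print(parse_age("20"), parse_age(20))
```

The first function can use its argument straight away, while the second needs an isinstance check before it can do anything useful.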

Union can contain any number of types. However, if you put in too many types, you lose most of the benefit of annotating in the first place.

For example, if we annotate a variable with many different types, with incompatible interfaces, we’d need to check all the types manually before using them:

value: Union[int, str, dict, list] = get_value()

if isinstance(value, int):
    value = value - 1
elif isinstance(value, str):
    value = value.find("a")
elif isinstance(value, dict):
    value = value["a"]
else:
    value = value[0]

As you can see from this kind of strange example, since it could be one of many types you still need to do lots of runtime checking.

If we want to have a variable that could sometimes be a str and sometimes be None, you might think to annotate it with Union[None, str]. Returning to the typing module, a shortcut for this is provided with Optional. So, the equivalent is Optional[str]. Optional takes only one argument, but a Union can be nested inside it.

Let’s see an example of this in the User class. Since we might sometimes not know a user’s age, we’ll make it optional, but we’ll retain the annotation that says it can be an int or a float.

from typing import Union, Optional


class User:
    name: str
    age: Optional[Union[int, float]] = None

    def __init__(self, name: str) -> None:
        self.name = name

When using Optional, you’ll probably need to do a lot of checks for None. For example, let’s add a method called have_birthday() to User, which will increase the user’s age by 1.

class User:
    …

    def have_birthday(self) -> None:
        self.age += 1

No prizes here for guessing what mypy has to say about this:

$ mypy user.py
user.py:12: error: Unsupported operand types for + ("None" and "int")
user.py:12: note: Left operand is of type "Optional[float]"
Found 1 error in 1 file (checked 1 source file)

So we need to include a None check to cater for that case.

class User:
    …

    def have_birthday(self) -> None:
        if self.age is not None:
            self.age += 1

Adding the None check makes mypy happy.

$ mypy user.py
Success: no issues found in 1 source file
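
Putting the pieces together, here is a sketch of the whole class with the None check in place, followed by a short usage example. The variables age_before and age_after are just for illustration.

```python
from typing import Optional, Union


class User:
    name: str
    age: Optional[Union[int, float]] = None

    def __init__(self, name: str) -> None:
        self.name = name

    def have_birthday(self) -> None:
        # Only increment when we actually know the age.
        if self.age is not None:
            self.age += 1


u = User("Ben")
u.have_birthday()     # age is unknown, so nothing happens
age_before = u.age

u.age = 20
u.have_birthday()     # age is known, so it goes up by one
age_after = u.age

print(age_before, age_after)
```

Inside the if block mypy knows that self.age can’t be None, which is why the += is accepted there.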

We’ll finish this typing introduction by looking at how to annotate containers like dict, list, set and tuple.

Annotating Containers

We can annotate container types the same as any other variable. Let’s start with lists. Here’s a function that returns a list of Users:

def generate_users(count: int) -> list:
    users = []
    for i in range(count):
        users.append(User(f"User {i}"))

    return users

This is valid, and can be our "better-than-nothing" option if we don’t know what type of objects the list could contain. But in this case, we know it will contain User instances. We can use the List helper from typing, which allows us to specify the types inside the list, like this:

from typing import List


def generate_users(count: int) -> List[User]:
    users = []
    for i in range(count):
        users.append(User(f"User {i}"))

    return users

Now mypy will know that when you’re fetching items from the returned list that you should only be doing Usery stuff with them.

Python 3.9 added shortcuts for these type annotations so they don’t have to be imported, and instead the normal classes can be used. For example, list[User] instead of List[User], which saves you an import. We’ll go through some more of these differences at the end.

Note that an empty list is always valid for the type hint.

These type hints can be nested, but you need to think about the order. For example, if your function sometimes returns a list of Users, and sometimes returns None, then this should be Optional[List[User]]. However, if your function always returns a list, which will sometimes contain User and sometimes contain None, then this would be inverted: List[Optional[User]]. You can go deeper too! If your function sometimes returns None, and sometimes returns a list, but the list could contain User, str or None, you would do Optional[List[Optional[Union[User, str]]]]. Quite a mouthful! Don’t worry, we won’t be going that complicated in further posts.
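
The ordering is easier to see side by side. In this sketch, find_names and clean_names are hypothetical functions that work on plain strings rather than User objects, to keep the example short.

```python
from typing import List, Optional


def find_names(prefix: str, names: List[str]) -> Optional[List[str]]:
    # Either a whole list comes back, or None comes back instead of a list.
    matches = [name for name in names if name.startswith(prefix)]
    return matches if matches else None


def clean_names(names: List[str]) -> List[Optional[str]]:
    # A list always comes back, but blank entries inside it become None.
    return [name.strip() or None for name in names]


found = find_names("B", ["Ben", "Bill", "Ann"])
missing = find_names("Z", ["Ben", "Bill", "Ann"])
cleaned = clean_names(["Ben", "  ", "Bill"])

print(found, missing, cleaned)
```

With Optional on the outside the caller has to check the result itself for None; with Optional on the inside the caller has to check each element.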

We won’t talk much about sets, as they’re very similar to lists. Essentially, import Set from typing. Then annotate with the type inside the square brackets. For example, a set of strings would be annotated Set[str].

dict annotation is a little different. Use dict if you don’t know the keys/values of your dictionary. Otherwise, you can annotate in the form Dict[(key type), (value type)].

For example, if we have a dictionary that maps from the user’s name (a str) to their User object, it would have the type Dict[str, User].

from typing import Dict


u1 = User("Ben")
u2 = User("Bill")

users: Dict[str, User] = {"Ben": u1, "Bill": u2}

If you have different types of keys or values, then just nest in Union and Optional where appropriate.
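
As a slightly fuller sketch, here are two hypothetical helpers built around that dictionary type. The names index_by_name and find_user are made up for this example, and the User class is trimmed down to just a name.

```python
from typing import Dict, List, Optional


class User:
    def __init__(self, name: str) -> None:
        self.name = name


def index_by_name(users: List[User]) -> Dict[str, User]:
    # Keys are the names (str), values are the User objects.
    return {user.name: user for user in users}


def find_user(users: Dict[str, User], name: str) -> Optional[User]:
    # dict.get returns None when the key is missing, hence Optional[User].
    return users.get(name)


users = index_by_name([User("Ben"), User("Bill")])
ben = find_user(users, "Ben")
nobody = find_user(users, "Bob")

print(sorted(users), ben is not None, nobody)
```

Notice how the Dict annotation and Optional work together: a lookup that might miss naturally has an Optional return type.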

Finally, tuples. They are immutable, so a given tuple always has a fixed number of elements, and the annotation should list one type for each element.

To demonstrate, we could define a user with a tuple instead of its own class since we’re just storing a name and age.

from typing import Tuple


user: Tuple[str, int] = ("Ben", 20)

Or, to get more complicated, we could allow the age to be an optional integer or float, and also store the user’s birthdate:

from datetime import date
from typing import Tuple, Optional, Union


user: Tuple[str, Optional[Union[int, float]], date] = ("Ben", 20, date(1991, 12, 25))

The possibilities are endless.
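
Long tuple annotations get unwieldy when repeated, so one option is to give the annotation a name and reuse it. The sketch below assumes a made-up alias UserRecord and a made-up function describe. It also shows Tuple[int, ...], the typing module’s spelling for a tuple of any length whose elements are all ints.

```python
from datetime import date
from typing import Optional, Tuple, Union

# A name for the long annotation, so it can be reused.
UserRecord = Tuple[str, Optional[Union[int, float]], date]


def describe(user: UserRecord) -> str:
    # Unpacking gives each element the type from its position in the tuple.
    name, age, birthdate = user
    age_text = "unknown" if age is None else str(age)
    return f"{name} (age {age_text}, born {birthdate.isoformat()})"


# A tuple of any length where every element is an int.
scores: Tuple[int, ...] = (7, 9, 10)

print(describe(("Ben", 20, date(1991, 12, 25))))
print(describe(("Bill", None, date(2000, 1, 1))))
```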

Shortcuts

As mentioned earlier, newer Python versions introduce shortcuts for typing. In 3.10, | can be used instead of Union (e.g. int | float instead of Union[int, float]).

Since 3.9, built-in classes can also be used instead of having to import special typing ones.

  • dict[kt, vt] can be used instead of typing.Dict[kt, vt]
  • list[t] instead of typing.List[t]
  • set[t] instead of typing.Set[t]
  • tuple[t1, t2, …] instead of typing.Tuple[t1, t2, …]

And more shortcuts are certain to be added to newer Python versions.

IDE Support

The last thing to mention is that type hinting also helps your IDE. PyCharm and VS Code can understand type hints and be more intelligent about what options they present when autocompleting. They can also show warnings as you code to let you know that you might be using the wrong types.

Conclusion

This is a short introduction to type hinting in Python, at least enough so you’re able to recognize type hints and what they mean. The optional-ness of type hinting means you can start annotating small parts of a project and start getting the benefits even if it’s not perfect.

