Reduce Confusion with Enums

What’s the problem?

When you have several related and distinct values that you have to choose between, it can often be hard to remember what they are. For example, you need to provide a sorting option. It could be "ascending" or "descending". But then somewhere else in your code you (or someone else) accidentally use "DESCENDING" and things don’t work as you expect. If you use strings or other primitive values for these kinds of options, then there’s no way to catch this error before runtime. Your IDE also can’t help you by autocompleting or suggesting which values are valid.

How can it be solved?

By using Enums, special classes that store a list of unique values. They include logic for validation and for creation from primitive/raw values, and they can be validated by type checkers.

The code in this post will use type hints, so be sure to check my type hinting intro to understand the syntax.

Note also that this post will use the term Enum when referring to the Enum type in Python, and enum when referring to enumerations in general (of which Enum is an implementation).

Before we look at how Enums can solve problems, let’s start by taking a more in-depth look into the problems that can occur when not using them.

Problems caused by using primitive values

The first problem that can occur when not using enums is the one mentioned in the intro paragraph: using arbitrary values which could be mistyped. Here’s a class that stores a list of directory entries. It has a method get_sorted_entries() which will return the entries sorted in either ascending or descending order. It sorts ascending unless direction is "descending".

import typing


class DirectoryList:
    entries: typing.List[str]

    def __init__(self, entries: typing.List[str]) -> None:
        self.entries = entries

    def get_sorted_entries(self, direction: str) -> typing.List[str]:
        is_reversed = direction == "descending"
        return sorted(self.entries, reverse=is_reversed)

The sorted() function returns a sorted copy of the input list. By default it sorts in ascending order, but if reverse is specified as True it will sort descending.

Now let’s try to use it as someone who has not taken the time to look into the method implementation (probably most people won’t do this!). We’d have no idea what the argument direction should be:

dl = DirectoryList(["sorting.py", "post.md", "a.out"])

print(dl.get_sorted_entries("DESCENDING"))

Here’s the output:

['a.out', 'post.md', 'sorting.py']

It’s sorted ascending! Of course it is, because descending is not the same as DESCENDING.

One way to attempt to solve this problem might be to use constants instead. Perhaps ASCENDING = 0 and DESCENDING = 1. Let’s see what that code might look like:

import typing

ASCENDING = 1
DESCENDING = 1


class DirectoryList:
    entries: typing.List[str]

    def __init__(self, entries: typing.List[str]) -> None:
        self.entries = entries

    def get_sorted_entries(self, direction: int) -> typing.List[str]:
        is_reversed = direction == DESCENDING
        return sorted(self.entries, reverse=is_reversed)

This is better, but still has a few problems that enums would solve:

  • Did you notice the typo? Both ASCENDING and DESCENDING are defined as 1. Enum has a way of preventing duplicate values.
  • You could pass in any arbitrary integer (e.g. dl.get_sorted_entries(3)) and there would be no error. The type checker mypy won’t detect this problem either.
  • Your program’s namespace (the set of "names" that you’re able to refer to) gets polluted by having more imports for these variables. The constants also lose some context: if you see ASCENDING on its own, it can be confusing to go back and figure out what it actually means.

Of course checks could be added to validate the direction you received.

if direction not in (ASCENDING, DESCENDING):
    raise ValueError(f'Invalid direction "{direction}".')

But that’s extra code you need to write and maintain – and writing fewer lines of code is generally better.

We could solve the polluted namespace by grouping the constants in a normal class:

class SortOrder:
    ASCENDING = 0
    DESCENDING = 1

But the type annotations wouldn’t change. You’re still just working with ints (i.e. fetching the underlying value and comparing direction == SortOrder.ASCENDING). It also doesn’t solve the duplicate value issue.

Now let’s look at the Enum class and how it solves problems.

Basic Enum usage

The Enum class is in the enum module, and it’s actually pretty easy to get started with. Just import Enum and create a subclass of it, then add attributes with primitive values.

Here’s how the SortOrder class would look as an Enum.

import enum


class SortOrder(enum.Enum):
    ASCENDING = 0
    DESCENDING = 1

Simple! Now let’s update the get_sorted_entries() method to accept a SortOrder instead. We just update the type hint on the direction parameter and the comparison.

class DirectoryList:
    entries: typing.List[str]

    def __init__(self, entries: typing.List[str]) -> None:
        self.entries = entries

    def get_sorted_entries(self, direction: SortOrder) -> typing.List[str]:
        is_reversed = direction == SortOrder.DESCENDING
        return sorted(self.entries, reverse=is_reversed)

Now to call the method, just pass in the attribute of the SortOrder class that specifies the direction.

dl = DirectoryList(["sorting.py", "post.md", "a.out"])

print(dl.get_sorted_entries(SortOrder.DESCENDING))

And it now works as expected, sorting in descending order:

$ python3 directory_sorting.py
['sorting.py', 'post.md', 'a.out']

Default parameters with types and Enums

As a quick aside, type-hinted methods can accept parameters with default values; the default just needs to be provided after the type annotation. And to default to an Enum option, just provide whichever attribute you want to use.

For example, to default direction to SortOrder.ASCENDING:

def get_sorted_entries(self, direction: SortOrder = SortOrder.ASCENDING) -> typing.List[str]:
    …

Now let’s look at some of the problems we encountered previously and how they can be solved with Enums.

Preventing duplicate values with auto

First, how does using an Enum prevent the duplicate-value problem? Well… it doesn’t with how we’ve used it so far. For example, this is perfectly valid:

class SortOrder(enum.Enum):
    ASCENDING = 1
    DESCENDING = 1  # duplicate value! don't do this

Python treats the second name as an alias for the first, so both options compare the same now too:

>>> SortOrder.DESCENDING == SortOrder.ASCENDING
True
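To make it clearer what is going on, here’s a small sketch that pokes at the duplicated SortOrder from above. The variable names (same_member, member_names, alias_name) are only made up for this illustration.

```python
import enum


class SortOrder(enum.Enum):
    ASCENDING = 1
    DESCENDING = 1  # duplicate value, so this becomes an alias of ASCENDING


# Both names refer to the very same member object.
same_member = SortOrder.DESCENDING is SortOrder.ASCENDING

# Iterating over the Enum skips aliases, so only one member shows up.
member_names = [so.name for so in SortOrder]

# The alias reports the name of the member it points to.
alias_name = SortOrder.DESCENDING.name

print(same_member, member_names, alias_name)
```

In other words, the Enum quietly ends up with a single option, which is almost certainly not what was intended.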

The enum module provides a helper function called auto() which will automatically assign unique values to each attribute, which means you won’t have the duplication. You use it like this:

import enum


class SortOrder(enum.Enum):
    ASCENDING = enum.auto()
    DESCENDING = enum.auto()

ASCENDING and DESCENDING now have unique values that you never have to think about. And since comparisons are being done on the Enum options themselves rather than their underlying values, the get_sorted_entries() method doesn’t need updating at all.
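auto() only helps when you let it pick the values. If you need to assign values yourself, the enum module also provides a unique decorator, which refuses to create a class that contains duplicates. Here’s a minimal sketch; the class name BrokenSortOrder and the variable duplicate_error are invented for this example, and the duplicate is deliberate.

```python
import enum

try:
    @enum.unique
    class BrokenSortOrder(enum.Enum):
        ASCENDING = 1
        DESCENDING = 1  # deliberate duplicate, rejected by @enum.unique
except ValueError as error:
    # The class is never created; the error names the offending options.
    duplicate_error = str(error)
else:
    duplicate_error = ""

print(duplicate_error)
```

The mistake is reported as soon as the class is defined, rather than surfacing later as a confusing comparison.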

Next let’s look at how Enums (along with type hints) prevent us passing in invalid values.

Enum type checking and valid values support

Provided you have added type hints to the method(s) being called, mypy will detect if you’re not passing in an Enum when expected. For example, if we try to call get_sorted_entries() with an integer, like this:

dl.get_sorted_entries(0)

mypy will detect this and tell us that we should be using a SortOrder instead:

$ mypy directory_sorting.py
directory_sorting.py:23: error: Argument 1 to "get_sorted_entries" of "DirectoryList" has incompatible type "int"; expected "SortOrder"
Found 1 error in 1 file (checked 1 source file)

With proper type annotations, it’s impossible* to call the method with a value other than SortOrder.ASCENDING or SortOrder.DESCENDING. Why? Because:

  1. These are the only two values for SortOrder.
  2. mypy prevents the method being called with anything other than SortOrder.
  3. The method must receive one of these values.

* Standard disclaimer, nothing is impossible if you set your mind to it! And if you ignore mypy’s warning and run the code anyway then of course you can break it.
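To make that disclaimer concrete: type hints are not enforced when the program actually runs. The sketch below reuses the classes from earlier in the post and deliberately passes an int, with a type: ignore comment standing in for "ignoring mypy’s warning". The variable name result is just for illustration.

```python
import enum
import typing


class SortOrder(enum.Enum):
    ASCENDING = enum.auto()
    DESCENDING = enum.auto()


class DirectoryList:
    entries: typing.List[str]

    def __init__(self, entries: typing.List[str]) -> None:
        self.entries = entries

    def get_sorted_entries(self, direction: SortOrder) -> typing.List[str]:
        is_reversed = direction == SortOrder.DESCENDING
        return sorted(self.entries, reverse=is_reversed)


dl = DirectoryList(["sorting.py", "post.md", "a.out"])

# mypy rejects this call, but Python itself runs it without complaint.
# 2 is the value auto() gave DESCENDING, yet a plain int never compares
# equal to an Enum option, so the list silently comes back ascending.
result = dl.get_sorted_entries(2)  # type: ignore[arg-type]
print(result)
```

So the guarantee only holds as long as the type checker’s errors are actually acted on.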

The strict-typing property becomes more useful with larger enums rather than what is effectively a boolean choice here. It also leads us back to the example code we saw that could be used to check that the value passed in was valid (it raised ValueError if direction is not ASCENDING or DESCENDING). Of course we now know that direction must be one of those two values, so that check would be redundant.

What we haven’t covered yet is how Enums can be used to convert primitive values into Enums, as well as verify their validity. Let’s look at that now, as it’s a useful way of validating input.

Converting primitives to Enums, with validation

A primitive value can be converted to an Enum by simply calling the Enum subclass as if you were trying to instantiate it, and passing in the primitive that matches the value.

For example to construct a SortOrder from an int:

>>> SortOrder(1)
<SortOrder.ASCENDING: 1>
>>> SortOrder(2)
<SortOrder.DESCENDING: 2>

Something to note here – when using enum.auto(), values start at 1.

We can then see what happens when calling the Enum with an invalid value:

>>> SortOrder(0)
Traceback (most recent call last):
  …
ValueError: 0 is not a valid SortOrder

Python makes sure the value being passed in exists on the Enum, and raises a ValueError if it doesn’t. We can leverage this to validate user input, for example. We might want to make sure that a valid value is passed in to a script on the command line.

Quick intro to reading command line arguments

The command line arguments passed to a Python script can be accessed in the argv list, imported from the sys module. Element 0 is always the name of the script (technically, the path to the script as it was given on the command line). Elements argv[1] onward are the arguments. Here’s a short example Python script:

from sys import argv


print(argv)

And here’s the output:

$ python3 argv_example.py one two three four
['argv_example.py', 'one', 'two', 'three', 'four']

Now, back to Enums.

Parsing command line arguments to Enums

Arguments that come from the command line into argv are always strings. To convert them to Enum values, we need to go back to using Enums with string values. Returning to our SortOrder example, it can be converted to str type:

class SortOrder(enum.Enum):
    ASCENDING = "ascending"
    DESCENDING = "descending"

Now when we call SortOrder with the string ascending or descending, it will be converted to the matching SortOrder value. If any other value is passed, a ValueError will be raised.

If you want to try this out yourself, here’s the full script to change the sort order of a (hard-coded) directory list. The sort order is read from argv[1]. It defaults to ascending. There are a couple of other new things being snuck into this script too.

import enum
from sys import argv
import typing


class SortOrder(enum.Enum):
    ASCENDING = "ascending"
    DESCENDING = "descending"


class DirectoryList:
    entries: typing.List[str]

    def __init__(self, entries: typing.List[str]) -> None:
        self.entries = entries

    def get_sorted_entries(self, direction: SortOrder) -> typing.List[str]:
        is_reversed = direction == SortOrder.DESCENDING
        return sorted(self.entries, reverse=is_reversed)


def main():
    dl = DirectoryList(["sorting.py", "post.md", "a.out"])

    if len(argv) == 1:
        # called just with script name
        sort_order = SortOrder.ASCENDING
    else:
        try:
            sort_order = SortOrder(argv[1])
        except ValueError:
            sort_order_values = [f'"{so.value}"' for so in SortOrder]
            so_values_text = ", ".join(sort_order_values)
            print(f'Unknown order "{argv[1]}".')
            print(f'Valid options are: {so_values_text}.')
            return
    print(dl.get_sorted_entries(sort_order))


if __name__ == "__main__":
    main()

Here’s some output from running it with different options:

$ python3 directory_sorting.py descending
['sorting.py', 'post.md', 'a.out']
$ python3 directory_sorting.py
['a.out', 'post.md', 'sorting.py']
$ python3 directory_sorting.py reverse
Unknown order "reverse".
Valid options are: "ascending", "descending".

A better (but more complicated) way of parsing command line arguments is with the argparse module. You can even make it automatically parse values into Enums, too.
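As a rough idea of what that could look like, here’s a minimal sketch using argparse’s type parameter. The function name parse_sort_order, the argument name order and the help text are assumptions made for this example, not part of the script above.

```python
import argparse
import enum
import typing


class SortOrder(enum.Enum):
    ASCENDING = "ascending"
    DESCENDING = "descending"


def parse_sort_order(args: typing.List[str]) -> SortOrder:
    parser = argparse.ArgumentParser(description="Sort a directory list.")
    parser.add_argument(
        "order",
        nargs="?",  # the argument may be left out
        type=SortOrder,  # argparse calls SortOrder(text) to convert it
        default=SortOrder.ASCENDING,
        help="ascending or descending",
    )
    return parser.parse_args(args).order


print(parse_sort_order(["descending"]))
print(parse_sort_order([]))
```

In a real script you would pass argv[1:] (or nothing at all, which makes argparse read the command line itself). When the conversion raises ValueError, argparse prints a usage message and exits, so there’s no need for a hand-written try/except.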

As mentioned, there are two new things to note in this script. The first is that it’s possible to iterate over an Enum, which will yield each option. The second is how to get the value (ascending) of the Enum option: from its value property. The other part of the Enum option is its name (e.g. ASCENDING), which can be accessed with the name property. To demonstrate with a code example:

>>> for so in SortOrder:
...     print(so.name)
...     print(so.value)
...
ASCENDING
ascending
DESCENDING
descending

On that note, we know how to go from a value to an Enum option (call the Enum like a class and pass in the value). You can also go from the option’s name to an Enum option, by using square brackets like you’re accessing a dictionary. For example:

>>> SortOrder["ASCENDING"]
<SortOrder.ASCENDING: 'ascending'>

The names are always strings.
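Looking up a name that doesn’t exist raises a KeyError (not a ValueError, as with values). Here’s a small sketch of a lookup helper built on that; the function sort_order_from_name and its decision to upper-case the input are assumptions made for this example.

```python
import enum
import typing


class SortOrder(enum.Enum):
    ASCENDING = "ascending"
    DESCENDING = "descending"


def sort_order_from_name(name: str) -> typing.Optional[SortOrder]:
    """Return the option called `name`, or None if there is no such option."""
    try:
        # Upper-casing is a choice made for this sketch, so that
        # "descending" finds the option named DESCENDING.
        return SortOrder[name.upper()]
    except KeyError:
        return None


print(sort_order_from_name("descending"))
print(sort_order_from_name("sideways"))
```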

In the example script, if no sorting option was provided on the command line, it defaulted to SortOrder.ASCENDING. However this was defined in the body of the script. Let’s look at a way of defining the default value on the Enum itself.

Default values on Enums

Adding a method to get default values on an Enum is straightforward: just add a classmethod (a method that is called on the class itself rather than on an instance) to return the option you want to use. For example, this get_default() method:

class SortOrder(enum.Enum):
    ASCENDING = "ascending"
    DESCENDING = "descending"

    @classmethod
    def get_default(cls) -> "SortOrder":
        return cls.ASCENDING

cls is a reference to the class to which the method is attached, in this case SortOrder. By using the @classmethod decorator, it’s passed into the method instead of self.

Something new here is the type annotation "SortOrder" with quotes. The name SortOrder isn’t bound until the whole class body has finished executing, so at the point where get_default() is defined we can’t refer to it directly in an annotation. We can solve this by quoting it like a string, and mypy understands this annotation.

The last change to make to the script is to use get_default() to get the default value. The whole script won’t be repeated, but here are the relevant lines:

    if len(argv) == 1:
        # called just with script
        sort_order = SortOrder.get_default()

The advantage of implementing the default in this way is that if you decide you want to change your default sort order, it only has to be updated in a single place.

We’ll finish up by looking at a common mistake: comparison of Enums using values.

The value-comparison anti-pattern

It can be tempting to compare Enum values like this:

some_input_value = "ascending"  # pretend this has come from somewhere external

so = SortOrder.ASCENDING
if so.value == some_input_value:
    do_something()
else:
    do_something_else()

That is, we compare on the underlying primitive values. The drawbacks this has are:

  • It hardcodes the value. If you want to change ascending to another string (maybe you’re shortening to asc and desc) you need to find all these comparisons throughout your code.
  • It provides no validation of the input value, therefore you’re making an assumption that if the string is not ascending then it must be descending – which is not true. It could be any number of other strings. Whereas, if you validate it you know it can only be one of a finite number of Enum options and you can be sure you’re handling all possible cases.

Python 3.11 introduces StrEnum, which joins IntEnum (available since Python 3.4). Their options can be compared to str and int directly (respectively) without using value. Caution is advised when using these, as it’s even easier to fall into this trap.
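A way to avoid the trap is to convert the raw input to an Enum option once, as early as possible, and only compare options after that. Here’s a minimal sketch of the idea; the function describe() and its return strings are made up for this example.

```python
import enum


class SortOrder(enum.Enum):
    ASCENDING = "ascending"
    DESCENDING = "descending"


def describe(raw_value: str) -> str:
    # Convert (and validate) once, at the edge of the program.
    # An unknown string raises ValueError here instead of being misread.
    so = SortOrder(raw_value)

    # From here on, compare Enum options, never raw strings.
    if so is SortOrder.ASCENDING:
        return "smallest first"
    return "largest first"


print(describe("ascending"))
```

If the underlying strings are later shortened to asc and desc, only the SortOrder class has to change, and the else branch can safely assume DESCENDING because nothing else can get past the conversion.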

Conclusion

Enums are great tools to validate data, add extra type-safety, and reduce the amount of validation code you need to write. They can also cut back on repeated code, and help your IDE with autocomplete.

If you don’t care about the underlying values, then the enum.auto() function is helpful to automatically make sure you don’t have any duplicate values.

Once you start using them, you’ll wonder how you ever lived without them.

