What’s the problem?
When you have several related but distinct values to choose between, it can be hard to remember what they are. For example, say you need to provide a sorting option. It could be "ascending" or "descending". But then somewhere else in your code you (or someone else) accidentally use "DESCENDING" and things don’t work as you expect. If you use strings or other primitive values for these kinds of options, there’s no way to catch this error before runtime. Your IDE also can’t help you by autocompleting or suggesting which values are valid.
How can it be solved?
By using Enums, special classes that store a fixed set of named values. They include logic for validation and for creation from primitive/raw values, and can be validated by type checkers.
The code in this post will use type hints, so be sure to check my type hinting intro to understand the syntax.
Note also that this post will use the term Enum when referring to the Enum type in Python, and enum when referring to enumerations in general (of which Enum is an implementation).
Before we look at how Enums can solve problems, let’s start by taking a more in-depth look into the problems that can occur when not using them.
Problems caused by using primitive values
The first problem that can occur when not using enums is the one mentioned in the intro paragraph: using arbitrary values which could be mistyped. Here’s a class that stores a list of directory entries. It has a method get_sorted_entries() which will return the entries sorted in either ascending or descending order. Any direction other than "descending" results in ascending order.
import typing

class DirectoryList:
    entries: typing.List[str]

    def __init__(self, entries: typing.List[str]) -> None:
        self.entries = entries

    def get_sorted_entries(self, direction: str) -> typing.List[str]:
        is_reversed = direction == "descending"
        return sorted(self.entries, reverse=is_reversed)
The sorted() function returns a sorted copy of the input list. By default it sorts in ascending order, but if reverse is specified as True it will sort descending.
Now let’s try to use it as someone who has not taken the time to look into the method implementation (probably most people won’t do this!). We’d have no idea what the argument direction should be:
dl = DirectoryList(["sorting.py", "post.md", "a.out"])
print(dl.get_sorted_entries("DESCENDING"))
Here’s the output:
['a.out', 'post.md', 'sorting.py']
It’s sorted ascending! Of course it is, because descending is not the same as DESCENDING.
One way to attempt to solve this problem might be to use constants instead. Perhaps ASCENDING = 0 and DESCENDING = 1. Let’s see what that code might look like:
import typing

ASCENDING = 1
DESCENDING = 1

class DirectoryList:
    entries: typing.List[str]

    def __init__(self, entries: typing.List[str]) -> None:
        self.entries = entries

    def get_sorted_entries(self, direction: int) -> typing.List[str]:
        is_reversed = direction == DESCENDING
        return sorted(self.entries, reverse=is_reversed)
This is better, but still has a few problems that enums would solve:
- Did you notice the typo? Both ASCENDING and DESCENDING are defined as 1. Enum has a way of preventing duplicate values.
- You could pass in any arbitrary integer (e.g. dl.get_sorted_entries(3)) and there would be no error. The type checker mypy won’t detect this problem either.
- Your program’s namespace (the set of "names" that you’re able to refer to) gets polluted by having more imports for these variables. The constants also lose their context: if you see ASCENDING on its own, it can be confusing to go back and figure out what it actually means.
Of course, checks could be added to validate the direction you received.
if direction not in (ASCENDING, DESCENDING):
    raise ValueError(f'Invalid direction "{direction}".')
But that’s extra code you need to write and maintain – and writing fewer lines of code is generally better.
We could solve the polluted namespace by using a normal class:
class SortOrder:
    ASCENDING = 0
    DESCENDING = 1
But the type annotations wouldn’t change. You’re still just working with ints (i.e. fetching the underlying value and comparing direction == SortOrder.ASCENDING). It also doesn’t solve the duplicate value issue.
Now let’s look at the Enum class and how it solves problems.
Basic Enum usage
The Enum class is in the enum module, and it’s actually pretty easy to get started with. Just import Enum and create a subclass of it, then add attributes with primitive values.
Here’s how the SortOrder class would look as an Enum.
import enum

class SortOrder(enum.Enum):
    ASCENDING = 0
    DESCENDING = 1
Simple! Now let’s update the get_sorted_entries() method to accept a SortOrder instead. We just update the type hint on the direction parameter and the comparison.
class DirectoryList:
    entries: typing.List[str]

    def __init__(self, entries: typing.List[str]) -> None:
        self.entries = entries

    def get_sorted_entries(self, direction: SortOrder) -> typing.List[str]:
        is_reversed = direction == SortOrder.DESCENDING
        return sorted(self.entries, reverse=is_reversed)
Now to call the function, just pass in the attribute of the SortOrder class that specifies the direction.
dl = DirectoryList(["sorting.py", "post.md", "a.out"])
print(dl.get_sorted_entries(SortOrder.DESCENDING))
And it now works as expected, sorting in descending order:
$ python3 directory_sorting.py
['sorting.py', 'post.md', 'a.out']
Default parameters with types and Enums
As a quick aside, type-hinted methods can accept default parameters; they just need to be provided after the type annotation. And to default to an Enum option, just provide whichever attribute you want to use.
For example, to default direction to SortOrder.ASCENDING:
def get_sorted_entries(self, direction: SortOrder = SortOrder.ASCENDING) -> typing.List[str]:
    …
Now let’s look at some of the problems we encountered previously and how they can be solved with Enums.
Preventing duplicate values with auto
First, how does using an Enum prevent the duplicate-value problem? Well… it doesn’t with how we’ve used it so far. For example, this is perfectly valid:
class SortOrder(enum.Enum):
    ASCENDING = 1
    DESCENDING = 1  # duplicate value! don't do this
And both values compare the same now too:
>>> SortOrder.DESCENDING == SortOrder.ASCENDING
True
The enum module provides a helper function called auto() which will automatically assign unique values to each attribute, which means you won’t have the duplication. You use it like this:
import enum

class SortOrder(enum.Enum):
    ASCENDING = enum.auto()
    DESCENDING = enum.auto()
ASCENDING and DESCENDING now have unique values that you never have to think about. And since the comparison is made against the SortOrder.DESCENDING option rather than its underlying value, the get_sorted_entries() method doesn’t need updating at all.
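If you do need to keep explicit values, the standard library also has the enum.unique decorator. It is a different technique from auto(): rather than generating values, it rejects duplicates when the class is created. Here is a minimal sketch that reuses the deliberately broken definition from above. The helper name define_broken_sort_order is made up for this illustration, and the class is defined inside a function only so that the error can be caught and shown.

```python
import enum


def define_broken_sort_order() -> type:
    """Try to define an Enum whose two options share a value (hypothetical helper)."""

    @enum.unique  # raises ValueError while the class is being created
    class SortOrder(enum.Enum):
        ASCENDING = 1
        DESCENDING = 1  # duplicate on purpose

    return SortOrder


try:
    define_broken_sort_order()
except ValueError as error:
    # The duplicate is reported before any code can use the class.
    duplicate_error = str(error)
    print(f"Rejected: {duplicate_error}")
else:
    duplicate_error = None
```

In real code you would simply put @enum.unique above a module-level class, and the mistake would surface as soon as the module is imported.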
Next let’s look at how Enums (along with type hints) prevent us passing in invalid values.
Enum type checking and valid values support
Provided you have added type hints to the method(s) being called, mypy will detect if you’re not passing in an Enum when expected. For example, if we try to call get_sorted_entries() with an integer, like this:
dl.get_sorted_entries(0)
mypy will detect this and tell us that we should be using a SortOrder instead:
$ mypy directory_sorting.py
directory_sorting.py:23: error: Argument 1 to "get_sorted_entries" of "DirectoryList" has incompatible type "int"; expected "SortOrder"
Found 1 error in 1 file (checked 1 source file)
With proper type annotations, it’s impossible* to call the method with a value other than SortOrder.ASCENDING or SortOrder.DESCENDING. Why? Because:
- These are the only two values for SortOrder.
- mypy prevents the method being called with anything other than SortOrder.
- The method must therefore receive one of these values.
* Standard disclaimer: nothing is impossible if you set your mind to it! And if you ignore mypy’s warning and run the code anyway then of course you can break it.
The strict-typing property becomes more useful with larger enums rather than what is effectively a boolean choice here. It also leads us back to the example code we saw that could be used to check that the value passed in was valid (it raised ValueError if direction was not ASCENDING or DESCENDING). Of course we now know that direction must be one of those two values, so that check would be redundant.
What we haven’t covered yet is how Enums can be used to convert primitive values into Enums, as well as verify their validity. Let’s look at that now, as it’s a useful way of validating input.
Converting primitives to Enums, with validation
A primitive value can be converted to an Enum by simply calling the Enum subclass as if you were trying to instantiate it, and passing in the primitive that matches the value.
For example, to construct a SortOrder from an int:
>>> SortOrder(1)
<SortOrder.ASCENDING: 1>
>>> SortOrder(2)
<SortOrder.DESCENDING: 2>
Something to note here: when using enum.auto(), values start at 1.
We can then see what happens when calling the Enum with an invalid value:
>>> SortOrder(0)
Traceback (most recent call last):
…
ValueError: 0 is not a valid SortOrder
Python makes sure the value being passed in exists on the Enum, and raises a ValueError if it doesn’t. We can leverage this to validate user input, for example. We might want to make sure that a valid value is passed in to a script on the command line.
Quick intro to reading command line arguments
The command line arguments passed to a Python script can be accessed in the argv list, imported from the sys module. Element 0 is always the name of the script (technically, the path to the script as it was given on the command line). Elements argv[1] onward are the arguments. Here’s a short example Python script:
from sys import argv
print(argv)
And here’s the output:
$ python3 argv_example.py one two three four
['argv_example.py', 'one', 'two', 'three', 'four']
Now, back to Enums.
Parsing command line arguments to Enums
Arguments that come from the command line into argv are always strings. To convert them to Enum values, we need to go back to using Enums with string values. Returning to our SortOrder example, it can be converted to str type:
class SortOrder(enum.Enum):
    ASCENDING = "ascending"
    DESCENDING = "descending"
Now when we call SortOrder with the string ascending or descending, it will be converted to the matching SortOrder value. If any other value is passed, a ValueError will be raised.
If you want to try this out yourself, here’s the full script to change the sort order of a (hard-coded) directory list. The sort order is read from argv[1]. It defaults to ascending. There are a couple of other new things being snuck into this script too.
import enum
from sys import argv
import typing

class SortOrder(enum.Enum):
    ASCENDING = "ascending"
    DESCENDING = "descending"

class DirectoryList:
    entries: typing.List[str]

    def __init__(self, entries: typing.List[str]) -> None:
        self.entries = entries

    def get_sorted_entries(self, direction: SortOrder) -> typing.List[str]:
        is_reversed = direction == SortOrder.DESCENDING
        return sorted(self.entries, reverse=is_reversed)

def main():
    dl = DirectoryList(["sorting.py", "post.md", "a.out"])
    if len(argv) == 1:
        # called just with script name
        sort_order = SortOrder.ASCENDING
    else:
        try:
            sort_order = SortOrder(argv[1])
        except ValueError:
            sort_order_values = [f'"{so.value}"' for so in SortOrder]
            so_values_text = ", ".join(sort_order_values)
            print(f'Unknown order "{argv[1]}".')
            print(f'Valid options are: {so_values_text}.')
            return
    print(dl.get_sorted_entries(sort_order))

if __name__ == "__main__":
    main()
Here’s some output from running it with different options:
$ python3 directory_sorting.py descending
['sorting.py', 'post.md', 'a.out']
$ python3 directory_sorting.py
['a.out', 'post.md', 'sorting.py']
$ python3 directory_sorting.py reverse
Unknown order "reverse".
Valid options are: "ascending", "descending".
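The script is strict: only the exact strings ascending and descending are accepted, so the DESCENDING spelling from the introduction would be rejected too. If you would rather accept it, one option is the _missing_ hook that Enum calls when a value lookup fails. This is a sketch of that idea rather than something the script above does, and the normalisation rule (strip whitespace, lowercase) is an assumption made for the example.

```python
import enum
from typing import Optional


class SortOrder(enum.Enum):
    ASCENDING = "ascending"
    DESCENDING = "descending"

    @classmethod
    def _missing_(cls, value: object) -> Optional["SortOrder"]:
        # Enum only calls this hook after the normal value lookup has failed.
        if isinstance(value, str):
            normalized = value.strip().lower()  # assumed normalisation rule
            for member in cls:
                if member.value == normalized:
                    return member
        # Returning None tells Enum to raise its usual ValueError.
        return None


print(SortOrder("DESCENDING") is SortOrder.DESCENDING)
```

Values that still don’t match after normalising, such as reverse, raise ValueError exactly as before, so the error handling in main() keeps working unchanged.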
A better (but more complicated) way of parsing command line arguments is with the argparse module. You can even make it automatically parse values into Enums, too.
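As a rough sketch of what that could look like, argparse accepts any callable as the type of an argument, so the Enum class itself can be passed in. The function name build_parser, the argument name order and the help text are choices made for this illustration, not part of the original script.

```python
import argparse
import enum


class SortOrder(enum.Enum):
    ASCENDING = "ascending"
    DESCENDING = "descending"


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one optional positional argument (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Sort a directory list.")
    parser.add_argument(
        "order",
        nargs="?",  # the argument may be left out
        type=SortOrder,  # argparse calls SortOrder(text) to convert the string
        default=SortOrder.ASCENDING,  # used when the argument is left out
        help='sort order: "ascending" or "descending"',
    )
    return parser


# Passing a list here stands in for the real command line.
args = build_parser().parse_args(["descending"])
print(args.order)
```

When the conversion raises ValueError, argparse prints a usage message and exits, so the manual try/except from the script is no longer needed.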
As mentioned, there are two new things to note in this script. The first is that it’s possible to iterate over an Enum, which will yield each option. The second is how to get the value (e.g. ascending) of an Enum option: from its value property. The other part of an Enum option is its name (e.g. ASCENDING), which can be accessed with the name property. To demonstrate with a code example:
>>> for so in SortOrder:
...     print(so.name)
...     print(so.value)
...
ASCENDING
ascending
DESCENDING
descending
On that note, we know how to go from a value to an Enum option (call the Enum like a class and pass in the value). You can also go from the option’s name to an Enum option, by using square brackets like you’re accessing a dictionary. For example:
>>> SortOrder["ASCENDING"]
<SortOrder.ASCENDING: 'ascending'>
The names are always strings.
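Like a dictionary, the square-bracket lookup raises KeyError when the name doesn’t exist (note that this is different from the ValueError raised by the call syntax). A small sketch of a helper that turns that into None; the function name sort_order_from_name is invented for this example.

```python
import enum
from typing import Optional


class SortOrder(enum.Enum):
    ASCENDING = "ascending"
    DESCENDING = "descending"


def sort_order_from_name(name: str) -> Optional[SortOrder]:
    """Return the option called `name`, or None if there is no such option."""
    try:
        return SortOrder[name]  # lookup by name, not by value
    except KeyError:
        return None


print(sort_order_from_name("ASCENDING"))
print(sort_order_from_name("ascending"))  # a value, not a name
```

The second call returns None because ascending is the option’s value; its name is ASCENDING.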
In the example script, if no sorting option was provided on the command line, it defaulted to SortOrder.ASCENDING. However, this was defined in the body of the script. Let’s look at a way of defining the default value on the Enum itself.
Default values on Enums
Adding a method to get default values on an Enum is straightforward: just add a classmethod (a method that is called on the class itself rather than on an instance) to return the option you want to use. For example, this get_default() method:
class SortOrder(enum.Enum):
    ASCENDING = "ascending"
    DESCENDING = "descending"

    @classmethod
    def get_default(cls) -> "SortOrder":
        return cls.ASCENDING
cls is a reference to the class on which it’s attached, in this case SortOrder. By using the @classmethod decorator, it’s passed into the method instead of self.
Something new here is the type annotation "SortOrder" with quotes. Since SortOrder hasn’t been defined at this point when the file is executed (because Python is still in the middle of building the SortOrder class) we can’t use it as a type annotation on itself. We can solve this by quoting it like a string, and mypy understands this annotation.
The last change to make to the script is to use get_default() to get the default value. The whole script won’t be repeated, but here are the relevant lines:
if len(argv) == 1:
    # called just with script
    sort_order = SortOrder.get_default()
The advantage of implementing the default in this way is that if you decide you want to change your default sort order, it only has to be updated in a single place.
We’ll finish up by looking at a common mistake: comparison of Enums using values.
The value-comparison anti-pattern
It can be tempting to compare Enum values like this:
some_input_value = "ascending"  # pretend this has come from somewhere external
so = SortOrder.ASCENDING
if so.value == some_input_value:
    do_something()
else:
    do_something_else()
That is, we compare on the underlying primitive values. The drawbacks this has are:
- It hardcodes the value. If you want to change ascending to another string (maybe you’re shortening to asc and desc) you need to find all these comparisons throughout your code.
- It provides no validation of the input value, therefore you’re making an assumption that if the string is not ascending then it must be descending, which is not true. It could be any number of other strings. Whereas, if you validate it you know it can only be one of a finite number of Enum options and you can be sure you’re handling all possible cases.
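One way to avoid both drawbacks is to convert the raw input to an Enum option once, at the edge of the program, and compare options from then on. The following is a minimal sketch under that assumption; describe_order and its return strings are invented for the example.

```python
import enum


class SortOrder(enum.Enum):
    ASCENDING = "ascending"
    DESCENDING = "descending"


def describe_order(raw_value: str) -> str:
    """Validate external input first, then branch on Enum options only."""
    order = SortOrder(raw_value)  # raises ValueError for unknown strings
    if order is SortOrder.ASCENDING:
        return "smallest first"
    if order is SortOrder.DESCENDING:
        return "largest first"
    # Only reachable if a new option is added and not handled above.
    raise AssertionError(f"Unhandled sort order: {order!r}")


print(describe_order("descending"))
```

Renaming the underlying strings now only touches the Enum definition, and an unexpected input fails loudly at the conversion step instead of silently falling into the else branch.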
Python 3.11 introduces StrEnum, which joins the older IntEnum. Their options can be compared to str and int directly (respectively) without using value. Caution is advised when using these, as it’s even easier to fall into this trap.
Conclusion
Enums are great tools to validate data, add extra type-safety, and reduce the amount of validation code you need to write. They can also cut back on repeated code, and help your IDE with autocomplete.
If you don’t care about the underlying values, then the enum.auto() function is helpful to automatically make sure you don’t have any duplicate values.
Once you start using them, you’ll wonder how you ever lived without them.