Sum Types in Python

Mary-Kate Olsen
2024-10-19 06:18:02

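Python is a lovely language. However, when working in Python I frequently find myself missing built-in support for sum types. Languages like Haskell and Rust make this kind of thing easy: a type such as Expr, whose values come in one of several distinct variants, takes only a few lines to declare.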

While Python doesn't support this kind of construction out-of-the-box, we'll see that types like Expr are nonetheless possible (and easy) to express. Furthermore, we can create a decorator that handles all of the nasty boilerplate for us. The result isn't too different from the equivalent Haskell declaration.

Representing Sum Types

We'll represent sum types using a "tagged union". This is easy to grok by example:
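As a minimal sketch of the idea, here is a hand-written tagged union. The two variants (lit and add) and their field names (value, left, right) are assumptions chosen for illustration; only the "lit" tag is taken from the text.

```python
class Expr:
    """A tagged union: every variant is an Expr carrying a tag plus its own fields."""

    @staticmethod
    def lit(value):
        # Variant 1 (assumed): an integer literal.
        e = Expr()
        e.tag = "lit"
        e.value = value
        return e

    @staticmethod
    def add(left, right):
        # Variant 2 (assumed): the sum of two sub-expressions.
        e = Expr()
        e.tag = "add"
        e.left = left
        e.right = right
        return e


# The expression 1 + 2, built from the two variants.
expr = Expr.add(Expr.lit(1), Expr.lit(2))
```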

Each variant is an instance of the same class (in this case Expr). Each one contains a "tag" indicating which variant it is, along with the data specific to it.

The most basic way to use an Expr is with an if-else chain:
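A sketch of such a chain, written as an evaluator. The compact Expr class below is a stand-in so that the example runs on its own, and the add variant and field names are again assumptions.

```python
class Expr:
    # Compact stand-in for a tagged union: a tag plus variant-specific fields.
    def __init__(self, tag, **fields):
        self.tag = tag
        self.__dict__.update(fields)


def evaluate(expr):
    # Dispatch on the tag, then read the fields that variant is known to have.
    if expr.tag == "lit":
        return expr.value
    elif expr.tag == "add":
        return evaluate(expr.left) + evaluate(expr.right)
    else:
        raise ValueError(f"unknown tag: {expr.tag!r}")


one_plus_two = Expr("add", left=Expr("lit", value=1), right=Expr("lit", value=2))
result = evaluate(one_plus_two)
```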

However, this has a few downsides:

  • The same if-else chain is repeated everywhere an Expr is used.
  • Changing the tag's value—say from "lit" to "literal"—breaks existing code.
  • Consuming sum types requires knowing implementation details (i.e. the tag and the names of the fields used by each variant).

Implementing match

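We can avoid all of these issues by exposing a single, public match method used to consume sum types. The caller passes one handler per variant, and match calls the handler that corresponds to the value's tag.

But first we need to make the different variants a little more uniform. Instead of storing its data in various fields, each variant will now store it in a tuple named data, so that match can hand the data to a handler without knowing which variant it is dealing with.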

This allows us to implement match:
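One way this could look, as a sketch: the handlers are passed as keyword arguments named after the tags, which is an assumption about the interface, as are the lit and add variants.

```python
class Expr:
    def __init__(self, tag, *data):
        self.tag = tag
        self.data = data  # every variant keeps its payload in one tuple

    @staticmethod
    def lit(value):
        return Expr("lit", value)

    @staticmethod
    def add(left, right):
        return Expr("add", left, right)

    def match(self, **handlers):
        # Look up the handler named after this variant's tag and
        # pass it the variant's data as positional arguments.
        if self.tag not in handlers:
            raise ValueError(f"missing handler for {self.tag}")
        return handlers[self.tag](*self.data)


def evaluate(expr):
    # Callers no longer touch the tag or the field layout directly.
    return expr.match(
        lit=lambda value: value,
        add=lambda left, right: evaluate(left) + evaluate(right),
    )


total = evaluate(Expr.add(Expr.lit(1), Expr.add(Expr.lit(2), Expr.lit(3))))
```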

In one fell swoop we've solved all of the problems noted above! As another example, and for a change of scenery, here's Rust's Option type transcribed in this fashion:
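A sketch of that transcription follows. The variant names some and nothing are assumptions (None is reserved in Python), and safe_div is a hypothetical helper that exists only to produce both variants.

```python
class Option:
    def __init__(self, tag, *data):
        self.tag = tag
        self.data = data

    @staticmethod
    def some(x):
        return Option("some", x)

    @staticmethod
    def nothing():
        return Option("nothing")

    def match(self, **handlers):
        if self.tag not in handlers:
            raise ValueError(f"missing handler for {self.tag}")
        return handlers[self.tag](*self.data)


def safe_div(a, b):
    # Hypothetical helper: division that reports failure as a value.
    return Option.nothing() if b == 0 else Option.some(a / b)


def describe(opt):
    return opt.match(
        some=lambda x: f"got {x}",
        nothing=lambda: "got nothing",
    )
```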

As a small quality of life benefit, we can support a special wildcard or "catchall" handler in match, indicated by an underscore (_):

This allows us to use match like:
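Both halves can be sketched together. This version assumes the wildcard handler takes no arguments and is consulted only when no handler matches the tag; unwrap_or is a hypothetical helper used to show the call site.

```python
class Option:
    def __init__(self, tag, *data):
        self.tag = tag
        self.data = data

    @staticmethod
    def some(x):
        return Option("some", x)

    @staticmethod
    def nothing():
        return Option("nothing")

    def match(self, **handlers):
        if self.tag in handlers:
            return handlers[self.tag](*self.data)
        if "_" in handlers:
            # Catch-all: called with no arguments, whatever the variant.
            return handlers["_"]()
        raise ValueError(f"missing handler for {self.tag}")


def unwrap_or(opt, default):
    # Only `some` is handled explicitly; everything else falls through to `_`.
    return opt.match(some=lambda x: x, _=lambda: default)
```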

Implementing enum

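As the Option class illustrates, a lot of the code needed to create sum types follows the same pattern: one method per variant that builds an instance with the right tag and data, plus a match method that is identical from class to class.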

Instead of writing this ourselves, let's write a decorator to generate these methods based on some description of the variants.

What kind of a description? The simplest thing would be to supply a list of variant names, but we can do a little better by also providing the types of arguments that we expect. We'd use enum to automagically enhance our Option class by decorating it with a description that maps each variant's name to a tuple of the argument types it expects.

The basic structure of enum looks like this:
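A skeleton of that structure, as a sketch. Taking the variants as keyword arguments is an assumption about the interface; the inner function does nothing yet except return the class.

```python
def enum(**variants):
    # `variants` maps each variant name to a tuple of expected argument types.
    def enhance(cls):
        # The constructors and `match` would be attached to `cls` here.
        return cls

    return enhance


@enum(some=(object,), nothing=())
class Option:
    pass
```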

It's a function that returns another function, which will be called with the class we're enhancing as its only argument. Within enhance we'll attach methods for constructing each variant, along with match.

First, match, because it's just copy pasta: the method is the same one we wrote by hand, now attached to the class inside enhance.

Adding methods to construct each variant is only slightly more involved. We iterate over the variants dictionary and define a method for each entry, using a helper, make_constructor, that creates a constructor function for a variant with tag (and name) tag, and "type signature" sig.
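The loop and the helper might look like the sketch below. Here make_constructor takes the class as an explicit first argument so the example stands alone (inside enhance it could simply close over cls), and the arity and isinstance checks are one assumed way of using the signature.

```python
def make_constructor(cls, tag, sig):
    def constructor(*data):
        # Check arity and argument types against the variant's signature.
        if len(data) != len(sig):
            raise TypeError(f"{tag} expects {len(sig)} argument(s), got {len(data)}")
        for value, expected in zip(data, sig):
            if not isinstance(value, expected):
                raise TypeError(
                    f"{tag} expects {expected.__name__}, got {type(value).__name__}"
                )
        inst = cls()
        inst.tag = tag
        inst.data = data
        return inst

    return constructor


class Option:
    pass


variants = {"some": (object,), "nothing": ()}
for tag, sig in variants.items():
    # staticmethod keeps the constructor from being bound to an instance.
    setattr(Option, tag, staticmethod(make_constructor(Option, tag, sig)))
```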

Here's the full definition of enum for reference.
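Putting the pieces together, a complete version could read as follows. It is a reconstruction under the same assumptions as before (keyword-argument variants, isinstance checks, a no-argument wildcard handler), not necessarily the exact original listing.

```python
def enum(**variants):
    def enhance(cls):
        def make_constructor(tag, sig):
            def constructor(*data):
                if len(data) != len(sig):
                    raise TypeError(
                        f"{tag} expects {len(sig)} argument(s), got {len(data)}"
                    )
                for value, expected in zip(data, sig):
                    if not isinstance(value, expected):
                        raise TypeError(f"{tag}: bad argument {value!r}")
                inst = cls()
                inst.tag = tag
                inst.data = data
                return inst

            return constructor

        def match(self, **handlers):
            if self.tag in handlers:
                return handlers[self.tag](*self.data)
            if "_" in handlers:
                return handlers["_"]()
            raise ValueError(f"missing handler for {self.tag}")

        # One constructor per variant, plus the shared match method.
        for tag, sig in variants.items():
            setattr(cls, tag, staticmethod(make_constructor(tag, sig)))
        cls.match = match
        return cls

    return enhance


@enum(some=(object,), nothing=())
class Option:
    pass


label = Option.some(42).match(
    some=lambda x: f"some({x})",
    nothing=lambda: "nothing",
)
```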

Bonus Features

More Dunder Methods

We can easily enhance our sum classes with __repr__ and __eq__ methods:
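A sketch of the two methods, packaged as a hypothetical add_dunders helper so that it can be tried on its own; in the decorator the same assignments would live inside enhance. The repr format is an assumption.

```python
def add_dunders(cls):
    """Hypothetical helper: attach __repr__ and __eq__ to a tag-and-data class."""

    def __repr__(self):
        args = ", ".join(repr(x) for x in self.data)
        return f"{cls.__name__}.{self.tag}({args})"

    def __eq__(self, other):
        # Two values are equal when they are the same variant with equal data.
        if not isinstance(other, cls):
            return NotImplemented
        return self.tag == other.tag and self.data == other.data

    cls.__repr__ = __repr__
    cls.__eq__ = __eq__
    return cls


@add_dunders
class Option:
    def __init__(self, tag, *data):
        self.tag = tag
        self.data = data
```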

With enhance improved in this fashion, we can define Option with minimal cruft, since the decorator now supplies the constructors, match, __repr__, and __eq__.

Recursive Definitions

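Unfortunately, enum isn't (yet) up to the task of defining Expr. Some of its variants contain other Exprs, so their signatures have to mention Expr, but the decorator's arguments are evaluated before the class statement has run.

We're using the class Expr before it's been defined. An easy fix here is to simply call the decorator after defining the class, as an ordinary function call rather than with the @ syntax.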

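But there's a simple change we can make to support recursive definitions with the decorator syntax: allow a "signature" to be a function that returns a tuple, so that the reference to Expr isn't evaluated until a variant is actually constructed: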

All this requires is a small change in make_constructor:
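A sketch of the change, with match left out for brevity. It assumes the check is simply whether sig is callable, and it uses an assumed add variant for the recursive case.

```python
def enum(**variants):
    def enhance(cls):
        def make_constructor(tag, sig):
            def constructor(*data):
                # A callable signature is resolved only now, at construction
                # time, by which point the class being defined exists.
                types = sig() if callable(sig) else sig
                if len(data) != len(types):
                    raise TypeError(f"{tag} expects {len(types)} argument(s)")
                for value, expected in zip(data, types):
                    if not isinstance(value, expected):
                        raise TypeError(f"{tag}: bad argument {value!r}")
                inst = cls()
                inst.tag = tag
                inst.data = data
                return inst

            return constructor

        for tag, sig in variants.items():
            setattr(cls, tag, staticmethod(make_constructor(tag, sig)))
        return cls

    return enhance


# The lambda delays the lookup of Expr until a variant is built.
@enum(lit=(int,), add=lambda: (Expr, Expr))
class Expr:
    pass


tree = Expr.add(Expr.lit(1), Expr.lit(2))
```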

Conclusion

Useful as it may be, our fancy new enum decorator isn't without its shortcomings. The most apparent is the inability to perform any kind of "nested" pattern matching. In Rust, a single match arm can destructure several levels of nested data at once.

But we're forced to perform a double match to achieve the same result:
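For illustration, take an Option that wraps another Option (an assumed example). The hand-written Option class, the module-level some and nothing helpers, and inner_or_zero are all stand-ins so that the sketch runs on its own.

```python
class Option:
    def __init__(self, tag, *data):
        self.tag = tag
        self.data = data

    def match(self, **handlers):
        if self.tag in handlers:
            return handlers[self.tag](*self.data)
        if "_" in handlers:
            return handlers["_"]()
        raise ValueError(f"missing handler for {self.tag}")


def some(x):
    return Option("some", x)


def nothing():
    return Option("nothing")


def inner_or_zero(opt):
    # The outer match peels off the first Option; the inner match the second.
    return opt.match(
        some=lambda inner: inner.match(some=lambda n: n, _=lambda: 0),
        _=lambda: 0,
    )
```

Each extra level of nesting costs another match call and another fallback handler.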

That said, these kinds of cases seem relatively rare.

Another downside is that match requires constructing and calling lots of functions. This means it's likely much slower than the equivalent if-else chain. However, the usual rule of thumb applies here: use enum if you like its ergonomic benefits, and replace it with its "generated" code if it's too slow.
