Rust is cool – Enums

On 2021-02-15 by ThatGeoGuy

Since starting my new role at Tangram Vision, I’ve been doing a lot of programming in Rust. This is a bit of a change for me, having mostly worked in C / C++ / C# over the past 6 years, but my impressions overall are quite positive.

One of my favourite features of Rust is its enum types. They don’t stick out among Rust’s language features the way some of the more novel aspects do (the borrow checker, lifetimes, safe multi-threading); however, Rust’s enum types drive some of the coolest parts of the language, and make modeling data in terms of types a pleasure.

So I wanted to write something to point out some of the reasons I like Rust enums so much. There’s gonna be a lot of “compared to C / C++” in here, so do be warned.

The three forms of enum

First and foremost, an “enumeration” or enum in Rust is a kind of sum-type. Rust has an algebraic type system, so there are a few different ways to specify what the “type” of some data is.

Category	In Rust	Definition
Atoms	`i32`, `u64`, `f32`, etc.	Atoms are types whose values evaluate to themselves. These are more or less the smallest types you’ll encounter, but they are the building blocks for everything else.
Sum types	`enum`	Sum types are types where you can choose one of a set of possible states. The number of possible states a variable can have is the sum of all options.
Product types	`struct`	Product types hold a value for every one of their fields at the same time. The number of possible states a variable can have is the product of the number of states of each field in the struct.
Generic types	`SomeType<T>`	A type that is defined in terms of some other type (or types) `T`.

This may be a bit hand-wavy, and I certainly didn’t cover all possible types (I ignored traits and functions, for example), but it’s a good start to getting to know about enums.
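To make the “sum” and “product” counting concrete, here is a minimal sketch. The `Light` and `Signal` types and the `all_signals` helper are made up for this illustration; they aren’t from any library.

```rust
// A sum type: a value is exactly ONE of these, so there are 3 possible states.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Light {
    Red,
    Amber,
    Green,
}

// A product type: a value holds BOTH fields at once,
// so there are 3 (Light) * 2 (bool) = 6 possible states.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Signal {
    light: Light,
    flashing: bool,
}

// Enumerate every possible `Signal` to make the count concrete.
fn all_signals() -> Vec<Signal> {
    let mut out = Vec::new();
    for light in [Light::Red, Light::Amber, Light::Green] {
        for flashing in [false, true] {
            out.push(Signal { light, flashing });
        }
    }
    out
}
```

Calling `all_signals().len()` gives 6: three choices for the enum times two for the bool.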

Rust enums, or sum-types in general, make the most sense when you want to represent a “choice” as a type somehow. The easiest example to represent this in Rust is the Option<T> type, which is a generic enum roughly of the form:

pub enum Option<T> {
    Some(T),
    None,
}

This is an enum that represents whether or not you have Some of some T, or None of it. Other enums are a “choice” in the same way. A value of your enum type can only be one variant of that enum at a time.

let x: Option<f32> = Some(42.0);

let y: Option<f32> = None;

While x and y above are the same type, they have to be either something (Some(42.0)) or nothing (None). They can’t be both.

Based on what we saw above, there are three interesting ways we can write an enum type.

No data

This is closest to what C++-style enum classes are like: unlike plain C enums, the variants are never implicitly treated as integers.

enum Color {
    Red,
    Green,
    Blue,
}

If you’re coming from C you might think: well, why aren’t these just integers? Shouldn’t Red be zero, Green be one, Blue be two, etc.? Under the hood each variant does get an integer discriminant (counting up from zero by default), and an explicit cast such as `Color::Red as i32` will hand it to you. But unlike in C, that integer is not what the type means. Color is a type in your program where Color::Red, Color::Green, and Color::Blue are distinct values, in the same way that 0 is distinct from 1 or 2, and the compiler won’t accept an arbitrary integer where a Color is expected.

Going the other way, from an integer to an enum value, takes explicit code, and there are crates that will generate it for you. This can be useful if you’re serializing to some low-level protocol, but isn’t very interesting otherwise.
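For the curious, here is a rough sketch of what the integer round trip can look like without any crate. An explicit `as` cast takes a data-free enum to its integer discriminant; the reverse direction has to be written by hand. The `to_wire` and `from_wire` names are hypothetical helpers for this example.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Color {
    Red,   // discriminant 0 by default
    Green, // 1
    Blue,  // 2
}

// Enum -> integer: an explicit `as` cast, never an implicit conversion.
fn to_wire(color: Color) -> u8 {
    color as u8
}

// Integer -> enum: written out by hand, because most integers
// do not correspond to any variant.
fn from_wire(value: u8) -> Option<Color> {
    match value {
        0 => Some(Color::Red),
        1 => Some(Color::Green),
        2 => Some(Color::Blue),
        _ => None,
    }
}
```

Returning an `Option<Color>` here is a design choice for the sketch: the caller is forced to decide what to do with a byte that isn’t a valid colour.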

Tuple-like

One thing I’ve avoided until now is talking about union types in C. C has enums and unions, which are two separate kinds of abstractions. C-enums give names to a list of integer values, while C-unions provide a weak kind of sum-type.

The weakness of C-unions is that, while a union can hold one option among a set of types, nothing records which of those types was actually stored. Example:

union my_union_t {
    int a;
    float b;
    const char* c;
};

In the above example, you can’t know when you receive a my_union_t value whether it was created by setting a, b, or c. You know it’s one of the options, but not which one. A typical way to work around this is to type-tag your union with an enum, and pass that around too.

enum my_union_type_t {
    A,
    B,
    C
};

This means you’re passing around two values just to track what your compiler should already know for you! This “idiom” in C is also error-prone: what if you send the wrong enum value alongside your union? You end up reading the bytes of one type as if they were another, which is garbage at best and undefined behaviour at worst. Rust’s enums are much more convenient, and allow for combining data alongside the enum options themselves:

enum MyType {
    A(i32),
    B(f32),
    C(String)
}

Because of this, you never need to worry about mismatching your enum / union types. Rust does have a `union` keyword, but it exists mostly for interoperating with C, and reading from one requires unsafe code; in everyday Rust the enum above does the job of both the C union and its tag. I think this was the first thing that really struck me about how cool enums in Rust are, because manually tagging and passing around unions and enums in C / C++ is a huge pain and source of bugs.

Disclaimer: It's worth mentioning these examples aren't exactly 1:1. Rust does not yet have a stable ABI, so whatever the compiler spews out for these isn't going to be exactly what the C ABI demands, and obviously "String" in Rust is a vastly more comprehensive type than the C-string equivalent.
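Here is a small sketch of what using the Rust version looks like. The `describe` function is a hypothetical helper; the point is that the only way to reach the payload is through the variant that actually holds it.

```rust
enum MyType {
    A(i32),
    B(f32),
    C(String),
}

// The tag and the payload travel together in one value, so there is no
// separate enum to keep in sync and no way to read the wrong member.
fn describe(value: &MyType) -> String {
    match value {
        MyType::A(n) => format!("int: {}", n),
        MyType::B(x) => format!("float: {}", x),
        MyType::C(s) => format!("string: {}", s),
    }
}
```

For example, `describe(&MyType::A(7))` produces `"int: 7"`.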

Named data

Similar to the above, we can give the data we pass around with our enum types names:

enum MyTypeWithNames {
    A { value: i32 },
    B { quantity: f32 },
    C { message: String },
}

Giving names to things can be a good idea, especially if you’re passing around a lot of data inside your enum, or passing around multiple values of the same type. However, sometimes the best name is no name, so don’t overdo it!
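As a quick sketch of the “multiple values of the same type” case, compare a tuple-like variant with a named one. `Shape` and `area` are invented for this example.

```rust
enum Shape {
    // Positional: which f32 is the width and which is the height?
    RectTuple(f32, f32),
    // Named: no ambiguity when constructing or when matching.
    Rect { width: f32, height: f32 },
}

fn area(shape: &Shape) -> f32 {
    match shape {
        Shape::RectTuple(w, h) => w * h,
        Shape::Rect { width, height } => width * height,
    }
}
```

Both variants carry the same data, but `Shape::Rect { width: 2.0, height: 3.0 }` documents itself at the call site in a way that `Shape::RectTuple(2.0, 3.0)` doesn’t.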

One last fun fact: all three forms can be mixed within a single enum. Depending on what you’re building, you might even end up with an enum like this:

enum PixelFormat {
    Yuv {
        y: u8,
        u: u8,
        v: u8,
    },
    Rgb {
        r: u8,
        g: u8,
        b: u8,
    },
    Greyscale(u16),
    Unknown,
}

It might seem like this complicates your code significantly, but we have to choose to live with complexity at some level. A bit of extra complexity in the definition drops a significant amount of complexity in how we pass around and use these types.

Why is this exciting?

Without trying to (entirely) pitch you on ~~Haskell~~ strong type systems and pure functional programming, enums in Rust are exciting because of what they make easy. Rust uses enum types throughout the core language, and many of the common idioms utilize enums in some form.

And if I’m being honest, equivalents can be made in C or C++ but the ergonomics and language integration just aren’t there. Passing C-enums or C++-enum classes alongside unions is vastly more error-prone.

`Option<T>`

Rust doesn’t have “optional” or “defaulted” arguments in functions like C++ does. But we don’t need them! Instead, we can use the generic Option<T> to represent whether or not an argument was supplied. In function parameters:

fn add_three_numbers(a: i32, b: i32, c: Option<i32>) -> i32 {
    if let Some(value) = c {
        a + b + value
    } else {
        a + b
    }
}

While a bit contrived, when calling this you might do:

let x = add_three_numbers(13, 12, None);    // => 25
let y = add_three_numbers(13, 12, Some(4)); // => 29
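The same function can be written more compactly with `Option::unwrap_or`, which is closer to how a “default value” usually reads in Rust. This is just an alternative sketch of the function above, using 0 as the fallback:

```rust
fn add_three_numbers(a: i32, b: i32, c: Option<i32>) -> i32 {
    // `unwrap_or` yields the inner value, or the fallback when `c` is `None`.
    a + b + c.unwrap_or(0)
}
```

The calls shown earlier behave the same way with this version.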

Similarly, if you avoid using unsafe Rust and raw pointers, the default Rust types for pointers (Box, Arc, etc.) cannot ever exist as a “null” pointer. Instead, if you want to represent a pointer that can be null, you use Option<Box<T>>, and null pointers are represented as None! Unlike in C or C++, this makes the type of pointers very explicit. You know that you have to handle the enum differently than you would the pointer itself, and once you get a Box<T> you know for certain that it points to something. This is one way safe Rust avoids null dereferences!
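A minimal sketch of a nullable pointer in practice is a singly linked list, where the “next” link is either a `Box` or nothing. `Node`, `sum`, and `option_is_free` are hypothetical names for this example. The last function checks a nice property the standard library documents: `Option<Box<T>>` is the same size as `Box<T>`, so the explicitness costs no extra memory.

```rust
use std::mem::size_of;

// A singly linked list node: `next` is either a pointer to another node or `None`.
struct Node {
    value: i32,
    next: Option<Box<Node>>,
}

// Walk the list and add up the values. There is no null check to forget:
// the loop simply ends when the pattern `Some(node)` stops matching.
fn sum(list: &Option<Box<Node>>) -> i32 {
    let mut total = 0;
    let mut current = list;
    while let Some(node) = current {
        total += node.value;
        current = &node.next;
    }
    total
}

// `None` takes the place of the null address, so wrapping the pointer
// in an `Option` does not make it any bigger.
fn option_is_free() -> bool {
    size_of::<Option<Box<Node>>>() == size_of::<Box<Node>>()
}
```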

`Result<T, E>`

Like Option, Result is a generic enum; it is Rust’s standard type for error handling. It is a fairly straightforward enum that roughly takes the form:

enum Result<T, E> {
    Ok(T),
    Err(E),
}

In this case, Result is usually used as the return value from a function: if the call succeeded you get an Ok(T) back, and if it failed you get an Err(E) back. In practice, it’s pretty cool how flexible this is for error handling. Because the error is part of the function’s signature, it also avoids many of the problems with C++-style exceptions, such as control flow you can’t see at the call site.
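As a small sketch of Result in use, here is a function that parses two strings and adds them. `parse_and_add` is a made-up name; `str::parse` and `ParseIntError` come from the standard library. The `?` operator returns early with the `Err` if a step fails, and unwraps the `Ok` value otherwise.

```rust
use std::num::ParseIntError;

// Parse two decimal strings and add them.
// Either parse can fail, and the caller sees that in the return type.
fn parse_and_add(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let a: i32 = a.parse()?;
    let b: i32 = b.parse()?;
    Ok(a + b)
}
```

Calling `parse_and_add("13", "12")` gives `Ok(25)`, while `parse_and_add("13", "twelve")` gives an `Err` that the caller has to deal with explicitly.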

Side note: Check out the anyhow crate if you haven't already.

Matching enums

Rust has built-in support for destructuring / pattern-matching enum types (even ones you define!). From our earlier example we can do:

// Stand-in value; in practice this would come from somewhere else
let x = PixelFormat::Rgb { r: 255, g: 128, b: 0 };

match x {
    PixelFormat::Yuv{ y, u, v } => {
        println!("Pixel is y: {}, u: {}, v: {}", y, u, v);
    }
    PixelFormat::Rgb{ r, g, b } => {
        println!("Pixel is r: {}, g: {}, b: {}", r, g, b);
    }
    PixelFormat::Greyscale(g) => {
        println!("Greyscale value is: {}", g);
    }
    PixelFormat::Unknown => {
        println!("Oh no, we don't understand this pixel format!");
    }
}

Rust doesn’t have a switch statement like C does; match fills that role (it works on integers too) and semantically allows for some much more powerful abstractions. A match must also be exhaustive: leave out a variant and the code won’t compile.

As I said originally, enum types represent a “choice” of some kind when programming. The match expression above demonstrates how we dispatch a “choice” in our program according to our type.
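One more sketch along the same lines: `match` is an expression, so a dispatch can compute a value directly instead of only running side effects. The `channel_count` function is a hypothetical helper; `PixelFormat` is repeated from earlier so the block stands alone.

```rust
enum PixelFormat {
    Yuv { y: u8, u: u8, v: u8 },
    Rgb { r: u8, g: u8, b: u8 },
    Greyscale(u16),
    Unknown,
}

// Each arm produces the function's result. `{ .. }` ignores fields we don't
// need, and `|` lets two variants share an arm. Every variant is covered;
// removing an arm would be a compile error rather than a silent fall-through.
fn channel_count(pixel: &PixelFormat) -> Option<usize> {
    match pixel {
        PixelFormat::Yuv { .. } | PixelFormat::Rgb { .. } => Some(3),
        PixelFormat::Greyscale(_) => Some(1),
        PixelFormat::Unknown => None,
    }
}
```

If a new variant is later added to `PixelFormat`, the compiler points at every `match` like this one that needs updating.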

Some(Conclusion)?

Enums in Rust unify the concept of enums and unions in C-family languages. They are more expressive and reduce a lot of complexity in managing state or tracking a union through a program. They also power some of the core types that are used in almost all Rust code.

Enums are one of my favourite parts of moving to Rust from C++. They are used in error management, optional arguments, and enable a whole host of different representations for data that aren’t ergonomic enough to see common use in C++.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
