Rust is cool – Enums
Since starting my new role at Tangram Vision, I’ve been doing a lot of programming in Rust. This is a bit of a change for me having mostly worked in C / C++ / C# over the past 6 years, but my impressions overall are quite positive.
One of my favourite features of Rust is its enum types. They don’t stick out among Rust’s set of language features compared to some of the more novel aspects of the language (borrow checker, lifetimes, safe multi-threading); however, Rust’s enum types drive some of the coolest parts of the language, and make modeling data in terms of types a pleasure.
So I wanted to write something to point out some of the reasons I like Rust enums so much. There’s gonna be a lot of “compared to C / C++” in here, so do be warned.
The three forms of enum
First and foremost, an “enumeration” or enum in Rust is a kind of sum-type. Rust uses an algebraic type system, so there are a few different ways to specify what the “type” of some data is.
Category | In Rust | Definition |
---|---|---|
Atoms | i32, u64, f32, etc. | Atoms are types whose values evaluate to themselves. These are more or less the smallest types you’ll encounter, and they are the building blocks for everything else. |
Sum types | enum | Sum types are types where you can choose one of a set of possible states. The number of possible states a variable can have is the sum of all options. |
Product types | struct | Product types are types that hold a value for every one of their fields at the same time. The number of possible states a variable can have is the product of the number of states of each field in the struct. |
Generic types | SomeType<T> | A type that is defined in terms of some other type (or types) T. |
This may be a bit hand-wavy, and I certainly didn’t cover all possible types (I ignored traits and functions, for example), but it’s a good start to getting to know about enums.
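To make the “sum” and “product” wording a bit more concrete, here is a small sketch that simply counts values. All of the types and helper functions in it (Switch, Speed, Either, Both, all_either, all_both) are made up for illustration.

```rust
// Hypothetical types, used only to count how many values each one can hold.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Switch {
    Off,
    On,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Speed {
    Low,
    Medium,
    High,
}

// Sum type: a value is EITHER a Switch OR a Speed, so 2 + 3 = 5 states.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Either {
    Switch(Switch),
    Speed(Speed),
}

// Product type: a value holds BOTH a Switch AND a Speed, so 2 * 3 = 6 states.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Both {
    switch: Switch,
    speed: Speed,
}

const SWITCHES: [Switch; 2] = [Switch::Off, Switch::On];
const SPEEDS: [Speed; 3] = [Speed::Low, Speed::Medium, Speed::High];

// Enumerate every possible value of the sum type.
fn all_either() -> Vec<Either> {
    let mut out = Vec::new();
    out.extend(SWITCHES.iter().map(|s| Either::Switch(*s)));
    out.extend(SPEEDS.iter().map(|s| Either::Speed(*s)));
    out
}

// Enumerate every possible value of the product type.
fn all_both() -> Vec<Both> {
    let mut out = Vec::new();
    for &switch in SWITCHES.iter() {
        for &speed in SPEEDS.iter() {
            out.push(Both { switch, speed });
        }
    }
    out
}

fn main() {
    println!("Either has {} possible values", all_either().len());
    println!("Both has {} possible values", all_both().len());
    println!("One of them: {:?}", all_both()[0]);
}
```

The enum adds up the options of its variants, while the struct multiplies the options of its fields.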
Rust enums, or sum-types in general, make the most sense when you want to represent a “choice” as a type somehow. The easiest example to represent this in Rust is the Option<T> type, which is a generic enum roughly of the form:
pub enum Option<T> {
    Some(T),
    None,
}
This is an enum that represents whether you have Some of some T, or None of it. Other enums are a “choice” in the same way: every value of your enum type can only be one variant of the enum at a time.
let x: Option<f32> = Some(42.0);
let y: Option<f32> = None;
While x and y above are the same type, each has to be either something (Some(42.0)) or nothing (None). They can’t be both.
Based on what we saw above, there are a few interesting ways we can write an enum type.
No data
This is closest to what C++-style enum classes are like. Under the hood each variant still has an integer discriminant (starting at zero by default), but unlike in C there is no implicit conversion between the enum and integers.
enum Color {
    Red,
    Green,
    Blue,
}
If you’re coming from C you might think: well, why aren’t these just integers? Shouldn’t Red be zero, Green be one, Blue be two, etc? Unlike C, Rust’s type system isn’t entirely based on integers. Types carry significantly more meaning, and it’s difficult to explain this succinctly. Instead, Color is a type in your program where Color::Red, Color::Green, and Color::Blue are distinct values, in the same way that 0 is distinct from 1 or 2.
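If you want to, a fieldless enum can be cast to an integer explicitly with as (for example, Color::Red as i32), and there are crates for mapping integers back to enum values. This can be useful if you’re serializing to some low-level protocol, but isn’t very interesting otherwise.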
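If you don’t want to pull in a crate, the mapping can also be written by hand. The following is a minimal sketch, assuming a u8 representation and explicit discriminant values that I picked for illustration; the TryFrom implementation and its choice of error type are my own, not something the standard library generates for you.

```rust
use std::convert::TryFrom;

// Assumption: we want Color to be stored as a u8 with these exact values.
#[derive(Debug, Clone, Copy, PartialEq)]
#[repr(u8)]
enum Color {
    Red = 0,
    Green = 1,
    Blue = 2,
}

// Integer -> enum has to be a checked conversion, because most integers
// don't correspond to any variant. The rejected value is returned as the error.
impl TryFrom<u8> for Color {
    type Error = u8;

    fn try_from(value: u8) -> Result<Self, Self::Error> {
        match value {
            0 => Ok(Color::Red),
            1 => Ok(Color::Green),
            2 => Ok(Color::Blue),
            other => Err(other),
        }
    }
}

fn main() {
    // Enum -> integer is an explicit cast.
    let as_int = Color::Green as u8;
    println!("Green as u8 = {}", as_int);

    // Integer -> enum may fail, and the type forces us to notice.
    println!("{:?}", Color::try_from(2u8));
    println!("{:?}", Color::try_from(7u8));
    println!("{:?}", Color::try_from(0u8));
}
```

The asymmetry is the point: going from the enum to an integer always works, while going back has to say what happens for integers that aren’t a valid Color.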
Tuple-like
One thing I’ve avoided until now is talking about union types in C. C has enums and unions, which are two separate kinds of abstractions. C-enums give names to a list of integer values, while C-unions provide a weak kind of sum-type.
C-unions are weak because, while a union can hold one option among a set of types, you don’t know which type it actually holds. Example:
union my_union_t {
    int a;
    float b;
    const char* c;
};
In the above example, when you receive a my_union_t you can’t know whether it was created by setting a, b, or c. You know it’s one of the options, but not which one. A typical way to work around this is to type-tag your union with an enum, and pass that around too.
enum my_union_type_t {
    A,
    B,
    C
};
This means you’re passing around two values just to track what your compiler should already know for you! This “idiom” in C is also error-prone: what if you send the wrong enum value alongside your union? You end up reading the wrong member, which gives you garbage at best and undefined behaviour at worst. Rust’s enums are much more convenient, and allow for combining data alongside the enum options themselves:
enum MyType {
    A(i32),
    B(f32),
    C(String),
}
Because of this, you never need to worry about mismatching your enum / union types. Likewise, you rarely need a separate union type in Rust (the language does have a union keyword, but reading its fields is unsafe and it mostly exists for C interop), since the above example covers the same use case as the tagged union in C. I think this was the first thing that really struck me about how cool enums in Rust are, because manually tagging and passing around unions and enums in C / C++ is a huge pain and source of bugs.
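As a small sketch of what that buys you, here is the same MyType being consumed. The describe function is a hypothetical helper I made up for this example; the thing to notice is that the tag and the data always travel together, so there is no way to read the f32 out of a value that holds a String.

```rust
enum MyType {
    A(i32),
    B(f32),
    C(String),
}

// Hypothetical helper: the match both checks the tag and hands us
// the payload of the right type in a single step.
fn describe(value: &MyType) -> String {
    match value {
        MyType::A(n) => format!("integer {}", n),
        MyType::B(x) => format!("float {}", x),
        MyType::C(s) => format!("string {}", s),
    }
}

fn main() {
    let values = vec![
        MyType::A(7),
        MyType::B(1.5),
        MyType::C(String::from("hello")),
    ];

    for value in &values {
        println!("{}", describe(value));
    }
}
```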
Named data
Similar to the above, we can give the data we pass around with our enum types names:
enum MyTypeWithNames {
    A { value: i32 },
    B { quantity: f32 },
    C { message: String },
}
Giving names to things can be a good idea, especially if you’re passing around a lot of data inside your enum, or passing around multiple values of the same type. However, sometimes the best name is no name, so don’t overdo it!
One last fun fact: all of these forms can be mixed within a single enum. Sometimes, depending on what you’re building, you might even be able to build an enum like this:
enum PixelFormat {
    Yuv {
        y: u8,
        u: u8,
        v: u8,
    },
    Rgb {
        r: u8,
        g: u8,
        b: u8,
    },
    Greyscale(u16),
    Unknown,
}
It might seem like this complicates your code significantly, but we have to choose to live with complexity at some level. A bit of extra complexity in the definition removes a significant amount of complexity from how we pass around and use these types.
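Here is a rough sketch of what using such a mixed enum can look like. The channel_count helper is hypothetical (it is not part of any library), and the channel numbers are just my reading of the variants above.

```rust
enum PixelFormat {
    Yuv { y: u8, u: u8, v: u8 },
    Rgb { r: u8, g: u8, b: u8 },
    Greyscale(u16),
    Unknown,
}

// Hypothetical helper: how many channels does a pixel of this format carry?
// `{ .. }` ignores the named fields, `_` ignores the tuple field.
fn channel_count(format: &PixelFormat) -> usize {
    match format {
        PixelFormat::Yuv { .. } | PixelFormat::Rgb { .. } => 3,
        PixelFormat::Greyscale(_) => 1,
        PixelFormat::Unknown => 0,
    }
}

fn main() {
    // Each form of variant has its own construction syntax.
    let pixels = vec![
        PixelFormat::Yuv { y: 16, u: 128, v: 128 },
        PixelFormat::Rgb { r: 255, g: 128, b: 0 },
        PixelFormat::Greyscale(1000),
        PixelFormat::Unknown,
    ];

    for pixel in &pixels {
        println!("channels: {}", channel_count(pixel));
    }
}
```

Callers only ever pass around one PixelFormat value, regardless of which variant it is.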
Why is this exciting?
Without trying to (entirely) pitch you on Haskell, strong type systems, and pure functional programming, enums in Rust are exciting because of what they make easy. Rust uses enum types throughout the core language, and many of the common idioms utilize enums in some form.
And if I’m being honest, equivalents can be made in C or C++ but the ergonomics and language integration just aren’t there. Passing C-enums or C++-enum classes alongside unions is vastly more error-prone.
Option<T>
Rust doesn’t have “optional” or “defaulted” arguments in functions like C++ does. But we don’t need them! Instead, we can use the generic Option<T> to represent whether or not an argument is provided. In function parameters:
fn add_three_numbers(a: i32, b: i32, c: Option<i32>) -> i32 {
    if let Some(value) = c {
        a + b + value
    } else {
        a + b
    }
}
The example is a bit contrived, but calling it might look like:
let x = add_three_numbers(13, 12, None); // => 25
let y = add_three_numbers(13, 12, Some(4)); // => 29
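For what it’s worth, the “use a default when nothing was passed” pattern is common enough that Option has a method for it. A minimal variation on the function above, assuming that 0 is the default we want for c:

```rust
// Same behaviour as the if-let version: a missing `c` counts as 0.
fn add_three_numbers(a: i32, b: i32, c: Option<i32>) -> i32 {
    a + b + c.unwrap_or(0)
}

fn main() {
    let x = add_three_numbers(13, 12, None);
    let y = add_three_numbers(13, 12, Some(4));
    println!("x = {}, y = {}", x, y);
}
```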
Similarly, if you avoid using unsafe Rust and raw pointers, the default Rust types for pointers (Box, Arc, etc) cannot ever exist as a “null” pointer. Instead, if you want to represent a pointer that can be null, you use Option<Box<T>>, and null pointers are represented as None! Unlike in C or C++, this makes the type of pointers very explicit. You know that you have to handle the enum differently than you would the pointer itself, and once you get a Box<T> you know for certain that it points to something. This is one way safe Rust avoids null dereferences!
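A sketch of how this looks in practice, using a hypothetical singly linked list (the Node type and sum function are made up for this example). The end of the list is None rather than a null pointer, and the size check at the end shows that the Option wrapper around a Box costs no extra space.

```rust
use std::mem::size_of;

// Hypothetical list node: `next` is either a pointer to another node or None.
struct Node {
    value: i32,
    next: Option<Box<Node>>,
}

// Walk the list. We can only reach `node.value` after matching on Some,
// so there is no way to dereference a "null" by accident.
fn sum(list: &Option<Box<Node>>) -> i32 {
    let mut total = 0;
    let mut current = list;
    while let Some(node) = current {
        total += node.value;
        current = &node.next;
    }
    total
}

fn main() {
    let list = Some(Box::new(Node {
        value: 1,
        next: Some(Box::new(Node { value: 2, next: None })),
    }));

    println!("sum = {}", sum(&list));
    println!("sum of empty list = {}", sum(&None));

    // None is stored in the bit pattern a null pointer would have used.
    println!(
        "Box: {} bytes, Option<Box>: {} bytes",
        size_of::<Box<Node>>(),
        size_of::<Option<Box<Node>>>()
    );
}
```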
Result<T, E>
Like Option, Result is an enum from the standard library; it is Rust’s standard type for error handling. It is a fairly straightforward enum that roughly takes the form:
enum Result<T, E> {
    Ok(T),
    Err(E),
}
In this case, Result is usually used as the return value from a function: if the function succeeded you get an Ok(T) back, and if it failed you get some error Err(E) back. In practice, it’s pretty cool how flexible this is for error handling. Because errors are ordinary return values, it also avoids many of the problems with C++-style exceptions, such as hidden control flow.
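To give a flavour of this, here is a minimal sketch in which the error type E is itself an enum. The halve function and the HalveError type are hypothetical names invented for this example; the ? operator is the standard way to return early with an Err.

```rust
// Hypothetical error type: each variant is one way the function can fail.
#[derive(Debug, PartialEq)]
enum HalveError {
    NotANumber,
    Odd(i32),
}

// Parse a string as an integer and halve it, refusing odd numbers.
fn halve(input: &str) -> Result<i32, HalveError> {
    // `?` returns early with the Err if parsing failed.
    let n: i32 = input
        .trim()
        .parse()
        .map_err(|_| HalveError::NotANumber)?;

    if n % 2 != 0 {
        return Err(HalveError::Odd(n));
    }
    Ok(n / 2)
}

fn main() {
    for input in ["42", "7", "abc"] {
        match halve(input) {
            Ok(value) => println!("{} halves to {}", input, value),
            Err(HalveError::Odd(n)) => println!("{} is odd", n),
            Err(HalveError::NotANumber) => println!("{:?} is not a number", input),
        }
    }
}
```

The caller can’t get at the value without also deciding what to do about each failure case.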
Matching enums
Rust has built-in support for destructuring / pattern-matching enum types (even ones you define!). From our earlier example we can do:
// Any PixelFormat value will do here; this one is just for illustration.
let x = PixelFormat::Rgb { r: 255, g: 128, b: 0 };
match x {
    PixelFormat::Yuv { y, u, v } => {
        println!("Pixel is y: {}, u: {}, v: {}", y, u, v);
    }
    PixelFormat::Rgb { r, g, b } => {
        println!("Pixel is r: {}, g: {}, b: {}", r, g, b);
    }
    PixelFormat::Greyscale(g) => {
        println!("Greyscale value is: {}", g);
    }
    PixelFormat::Unknown => {
        println!("Oh no, we don't understand this pixel format!");
    }
}
Rust doesn’t have a C-style switch statement, but match is a very close equivalent. It works over integers too, the compiler checks that every case is handled, and semantically it allows for some much more powerful abstractions.
As I said originally, enum types represent a “choice” of some kind when programming. The match expression above demonstrates how we dispatch a “choice” in our program according to our type.
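The same match expression also covers what you would reach for a switch for in C. A small sketch over plain integers (the classify function and its categories are made up for illustration):

```rust
// Hypothetical example: match on an integer using literals, ranges,
// alternatives, and a catch-all arm.
fn classify(n: u8) -> &'static str {
    match n {
        0 => "zero",
        1..=9 => "single digit",
        10 | 20 | 30 => "round number",
        // Without this arm the code would not compile, because
        // the other arms don't cover every possible u8.
        _ => "something else",
    }
}

fn main() {
    for n in [0u8, 7, 20, 11] {
        println!("{} is {}", n, classify(n));
    }
}
```

There is no fall-through between arms, and the match has to be exhaustive, which is exactly what makes it safe to add a new variant to an enum later: the compiler points at every match that forgot about it.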
Some(Conclusion)?
Enums in Rust unify the concept of enums and unions in C-family languages. They are more expressive and reduce a lot of complexity in managing state or tracking a union through a program. They also power some of the core types that are used in almost all Rust code.
Enums are one of my favourite parts of moving to Rust from C++. They are used in error management, optional arguments, and enable a whole host of different representations for data that aren’t ergonomic enough to see common use in C++.