Kai Kaufman's tech blog

Bringing runtime checks to compile time in Rust

4106 words; 28 minute read

Introduction

For the past couple of months, I’ve been participating in the MITRE Embedded Capture the Flag, or eCTF for short, with a team of my university peers. As the name suggests, the eCTF involves writing secure firmware for a Tiva C microcontroller, which presents some interesting challenges. Many teams chose to use C, likely because that’s what’s taught in most (if not all) university-level embedded systems courses. We wanted to live less dangerously, however, and we opted to use Rust instead.

Using Rust was more or less a no-brainer for us, because of the language’s strong memory-safety guarantees, excellent library ecosystem and developer experience, and… all the other good things about Rust. This isn’t an article about why Rust is the best thing ever - instead, we’re going to look at some real examples (from my team’s eCTF work) to see how Rust can be (and was) leveraged to enhance confidence in the correctness of code!

Quick primer

My implementations of compile-time checks relied on multiple Rust features, some more obscure than others:

  • Associated constants allow us to add constant members to types, either directly or indirectly through a trait. Think of these as the equivalent to (for example) static final class members in Java… but better, because they exist at compile time!
  • Const generics allow us to pass values, rather than types, as generic parameters. (Before const generics were implemented and stabilized, you couldn’t represent things such as “an array of any size” natively - only by using a library like generic_array.)
  • Const panics allow us to abort compilation if necessary, by panicking during constant evaluation. We even get to choose the error message!

With these three tools, a lot can be done.

Other terms to know:

  • A reference is basically a pointer without the footguns. The Rust book goes into more detail. A reference to a type T is represented as &T.
  • An array is a fixed-length collection of items of a single type, usually represented as [T; N] (where T is the item type, and N is the length.) Arrays cannot be resized.
  • An array reference is simply a reference to an array, represented as &[T; N].
  • A slice (usually represented as [T], with T still being the item type) is a subsection of a larger collection, such as an array. Slices cannot be passed or stored directly - you can only work with references to them.
  • Something being infallible means that it cannot fail. If it does, that’s a bug.

Example 1: Slicing and indexing arrays/array references

Shameless plug

I’ve polished and implemented the system described here as part of a library that I released recently. If you like what you see here, maybe consider checking out the more refined version. I cannot overstate how much time it saved me!

Background

My team’s firmware made extensive use of slicing and indexing in order to (de)serialize data, among many, many other things. Ironically, these two simple tasks are where things start to get complicated.

The first problem: slicing

Consider the following Rust program (which is valid and will compile):

fn main() {
    // yes, I know I don't need to specify the types explicitly.
    // there's a point to this :)
    let x: [i32; 6] = [1, 2, 3, 4, 5, 6];
    let y: i32 = x[0];
    let z /*: ??? */ = &x[4..6];
}

We haven’t specified the type for z, so it’s up to the compiler to figure it out. What I (and, I suspect, many new users) would expect is for a process like this to occur:

  1. x (which we are slicing) is a [i32; 6].
  2. The slice index is 4..6; in interval notation, that’s [4, 6), and in “list of numbers” that’s 4 and 5. So, we want 2 items starting from index 4.
  3. The result of selecting 2 items should be, well, 2 items, or [i32; 2] in this case.
  4. Since we’re only taking a reference instead of “moving” the actual values, z should be &[i32; 2].

Unfortunately, all we get is a &[i32] - a reference to a slice. If we try to explicitly specify the type as &[i32; 2] anyway, we get this error:

error[E0308]: mismatched types
 --> src/bin/blogtest.rs:6:24
  |
6 |     let z: &[i32; 2] = &x[4..6];
  |            ---------   ^^^^^^^^ expected `&[i32; 2]`, found `&[i32]`
  |            |
  |            expected due to this
  |
  = note: expected reference `&[i32; 2]`
             found reference `&[i32]`

The logical next step, then, is to try to convert this &[i32] into a &[i32; 2]. For infallible conversions, we can use the into method of the Into trait - since we know that the slice is 2 elements long, maybe into will work?

// let z: &[i32; 2] = &x[4..6];
let z: &[i32; 2] = &x[4..6].into();

Nope:

error[E0277]: the trait bound `[i32; 2]: From<&[i32]>` is not satisfied
 --> src/bin/blogtest.rs:6:33
  |
6 |     let z: &[i32; 2] = &x[4..6].into();
  |                                 ^^^^ the trait `From<&[i32]>` is not implemented for `[i32; 2]`
  |
  = help: the following other types implement trait `From<T>`:
            <[T; LANES] as From<Simd<T, LANES>>>
            <[bool; LANES] as From<Mask<T, LANES>>>
  = note: required for `&[i32]` to implement `Into<[i32; 2]>`

I suppose this makes sense - although we know that the conversion would be infallible in this case, we can’t say the same for the general case, and Into has a strict general-case infallibility requriement:

Note: This trait must not fail. If the conversion can fail, use TryInto. (source: Into documentation)

If we take the hint and try the rather cumbersome .try_into().unwrap(), it works:

// let z: &[i32; 2] = &x[4..6];
let z: &[i32; 2] = &x[4..6].try_into().unwrap();
     Running `target/debug/blogtest`
y = 1
z = [5, 6]

This works, but there are two issues:

  • It looks weird. If I know something can’t fail, why am I “trying” to do it?
  • It’s really easy to make a typo and end up crashing at runtime. (If you get either the length or the index range wrong, you get something called a TryFromSliceError with no extra information.)

The second problem: indexing

Consider the following Rust program:

fn main() {
    let x: [i32; 6] = [1, 2, 3, 4, 5, 6];
    let y = x[8];
}

Interestingly, this doesn’t compile - we get an error about an “unconditional panic.”

error: this operation will panic at runtime
 --> src/bin/blogtest.rs:3:13
  |
3 |     let y = x[8];
  |             ^^^^ index out of bounds: the length is 6 but the index is 8
  |
  = note: `#[deny(unconditional_panic)]` on by default

I’m not entirely sure where this error comes from. A slightly different program will compile, but panic at runtime with the same exact “index out of bounds” message:

use std::ops::Index;

fn main() {
    let x: [i32; 6] = [1, 2, 3, 4, 5, 6];
    let y = x.index(8);
}

So far so good, as long as we stick to the normal [indexing] syntax. Now, what happens if we change x from an array to an array reference?

fn main() {
    let x: &[i32; 6] = &[1, 2, 3, 4, 5, 6];
    let y = x[8];
}

This compiles just fine, and panics at runtime (same message as before.) How unfortunate - it’s not like we’re missing any information here, since we know the length of the array behind x, and that tells us that index 8 doesn’t exist!

Sketching out a solution

Thankfully, there’s a solution to both issues: use our own index types! Rust allows us to do this by implementing the Index<Idx> trait, with Idx being our custom index type. If we define our own type, called (for example) CustomIndex, we can do something like this:

use std::ops::Index;

struct CustomIndex;
impl Index<CustomIndex> for [i32; 4] {
    type Output = i32;

    fn index(&self, _: CustomIndex) -> &Self::Output {
        todo!()
    }
}

Obviously this isn’t very useful - we’ve only implemented it for 4-element arrays of signed 32-bit integers, and indexing will always fail - but it’s a start.

There are a few constraints we should keep in mind:

  1. There should not be any extra runtime cost when using our custom indexes.
  2. Any checks that can be done at compile time should be done at compile time.
  3. This all needs to be safe and sound.

Thankfully, all of these can be satisfied!

Design constraint #1: Aim for zero cost

As it turns out, an easy way to avoid storage costs is to make sure no extra data is being carried around. If we make use of zero-sized types (ZSTs for short), we can leave no trace of our custom indexing system!

For example, consider following program:

struct MyZST<const A: usize, const B: usize>;

impl<const A: usize, const B: usize> MyZST<A, B> {
  pub fn sum(self) -> usize {
    A + B
  }
}

fn main() {
    let x = MyZST::<67, 96>.sum();
    println!("x = {}", x);
}

If we examine the optimized assembly using cargo-show-asm, we can see that the let x = ... line was translated into a single mov instruction:

$ cargo asm --bin blogtest 0 --rust
    Finished release [optimized] target(s) in 0.01s

.section .text.blogtest::main,"ax",@progbits
        .p2align        4, 0x90
        .type   blogtest::main,@function
blogtest::main:

                ...
                // blogtest.rs : 5
                A + B
        mov qword ptr [rsp], 163 <----- this is 67+96!
        ...

There is no evidence of MyZST’s existence, which is encouraging! Returning to our CustomIndex example, we might modify it like so:

use std::ops::Index;

struct CustomIndex<const I: usize>;
impl<const I: usize> Index<CustomIndex<I>> for [i32; 4] {
    type Output = i32;

    fn index(&self, _: CustomIndex<I>) -> &Self::Output {
        // Delegating to Index<usize>
        self.index(I)
    }
}

Of course, this still isn’t ideal - we’ve only implemented indexing for arrays of 4 32-bit signed integers. We can fix that, too:

use std::ops::Index;

struct CustomIndex<const I: usize>;
impl<T, const N: usize, const I: usize> Index<CustomIndex<I>> for [T; N] {
    type Output = T;

    fn index(&self, _: CustomIndex<I>) -> &Self::Output {
        // Delegating to Index<usize>
        self.index(I)
    }
}

Now we can use it on arrays of any size that contain items of any type:

fn main() {
    let x = [1u8, 2, 4, 8, 16, 32];
    let y = x[CustomIndex::<5>];
    println!("y = {}", y);
}

And if we check the generated assembly…

...
                // blogtest.rs : 22
                let y = x[CustomIndex::<5>];
        mov byte ptr [rsp + 7], 32

Perfect! We’re not getting in the way of any optimizations, so we can move on to the next constraint.

Design constraint #2: Prefer compile-time checks

In order for this to be useful, we should do bounds checking at compile-time rather than runtime. Delegating to a lower-level implementation of Index - namely, Index<usize> - doesn’t help us one bit - we’ll still panic at runtime if we try to perform an out-of-bounds access. A potential compile-time-checked version could look like this:

use std::ops::Index;

struct CustomIndex<const I: usize>;
impl<T, const N: usize, const I: usize> Index<CustomIndex<I>> for [T; N] {
    type Output = T;

    fn index(&self, _: CustomIndex<I>) -> &Self::Output {
        const RESULT: () = assert!(N > I, "Index is out of bounds!");
        unsafe { &*(self.as_ptr().add(I) as *const T) }
    }
}

Unfortunately, this doesn’t compile. The compiler doesn’t like that we’re using N and I, which are both considered generic parameters from an “outer function”, inside the definition of the RESULT constant. Oh well.

This is where type system hacking comes into play. Instead of doing the check within the index method, we can delegate it to something else… something that can make use of the generic parameters. For this, we introduce the concept of checker traits.

Since we know that we can panic in a const context, and we know that traits can have associated const members, and we know that traits support both normal and const generics, we can create a IsValidIndex trait that CustomIndex will implement, like so:

// Target is the collection type, i.e., [u32; 17].
trait IsValidIndex<Target> {
    const RESULT: ();
}

struct CustomIndex<const I: usize>;

impl<T, const I: usize, const N: usize> IsValidIndex<[T; N]> for CustomIndex<I> {
    const RESULT: () = assert!(N > I, "Index is out of bounds!");
}

This compiles without any trouble, and now all we have to do is integrate it into the index method…

impl<T, const N: usize, const I: usize> Index<CustomIndex<I>> for [T; N] {
    type Output = T;

    fn index(&self, _: CustomIndex<I>) -> &Self::Output {
        let _ = <CustomIndex<I> as IsValidIndex<Self>>::RESULT;
        unsafe { &*(self.as_ptr().add(I) as *const T) }
    }
}

Now we’ll try to compile this program:

fn main() {
    // notice that `x` is an array reference now
    let x = &[1u8, 2, 4, 8, 16, 32];
    let y = x[CustomIndex::<7>];
    println!("y = {}", y);
}

We get a compilation error! Our check worked.

error[E0080]: evaluation of `<CustomIndex<7> as IsValidIndex<[u8; 6]>>::RESULT` failed
  --> src/bin/blogtest.rs:18:24
   |
18 |     const RESULT: () = assert!(N > I, "Index is out of bounds!");
   |                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the evaluated program panicked at 'Index is out of bounds!', src/bin/blogtest.rs:18:24

If we were to change CustomIndex::<7> to 7, the program would compile but crash at runtime.

We can go through this whole process again to create a better system for slicing. I won’t reiterate all the concepts involved, but here’s a quick and easy implementation:

trait IsValidRangeIndex<Target> {
    const RESULT: ();
}

pub struct CustomRangeIndex<const START: usize, const LENGTH: usize>;

impl<T, const START: usize, const LENGTH: usize, const N: usize> IsValidRangeIndex<[T; N]> for CustomRangeIndex<START, LENGTH> {
    const RESULT: () = assert!(N >= START + LENGTH, "Ending index is out of bounds!");
}

impl<T, const START: usize, const LENGTH: usize, const N: usize> Index<CustomRangeIndex<START, LENGTH>> for [T; N] {
    type Output = [T; LENGTH];

    fn index(&self, _: CustomRangeIndex<START, LENGTH>) -> &Self::Output {
        let _ = <CustomRangeIndex<START, LENGTH> as IsValidRangeIndex<Self>>::RESULT;
        unsafe { &*(self.as_ptr().add(START) as *const [T; LENGTH])}
    }
}

Okay, maybe I lied about it being “quick and easy”, but it’s probably about as simple as you can get. By using this, you can write the following program, which will compile and run successfully:

fn main() {
    let x = &[0x01, 0x03, 0x00, 0x00, 0x88, 0x77, 0x66, 0x55];
    let y = u16::from_le_bytes(x[CustomRangeIndex::<4, 2>]);
    assert_eq!(y, 0x7788);
}

You won’t win any code golf contests with this, but I think using something like CustomRangeIndex is a lot nicer than sprinkling .try_into().unwrap() everywhere.

Design constraint #3: Safe and sound

The nice thing about compile-time-checks is that as long as we do them correctly, we don’t really have to worry about creating unsound code. By design, our custom indexing system is 100% safe and perfectly sound. Both of these guarantees come from the compile-time checking - we won’t compile code that could access data out of bounds, and there’s no way to bypass the checks and create undefined behavior.

Example 2: Enforcing data alignment requirements

My team’s eCTF firmware made extensive use of the microcontroller’s embedded EEPROM as a persistent data store. Like many pieces of hardware, though, this EEPROM had some particular requirements that we needed to respect:

  • All reads and writes had to be at 4-byte aligned addresses (0x0, 0x4, 0x8, etc, all the way up to 0x800.)
  • All read and write sizes had to be of multiples of 4 bytes. Reading or writing 5 bytes, for example, was not allowed - 8 was the next valid size after 4.

Both of these problems were addressed using the same checker trait technique. For example, to prevent ourselves from accidentally reading or writing an unaligned data type, I created an IsEEPROMCompatible trait that would produce a compilation error if the implementing type was not properly aligned:

trait IsEEPROMCompatible {
    const RESULT: ();
}

impl<T: Sized> IsEEPROMCompatible for T {
    const RESULT: () = {
        if core::mem::size_of::<T>() % 4 != 0 {
            panic!("the size of this type is not a multiple of the EEPROM word size (4 bytes)");
        }
    };
}

We could then use it in our EEPROM interaction code and be secure in the knowledge that any incorrect interaction would not compile!

impl<Inner> EEPROMVar<Inner>
where
    Inner: Sized + PartialEq,
{
    pub fn new<const ADDRESS: u32>() -> Self {
        // Safety check #1: ensure the data type is compatible with
        // being read from/written to EEPROM, i.e., its size is
        // a multiple of 4 bytes.
        let _ = <Inner as IsEEPROMCompatible>::RESULT;

        // ...
    }
}

Coincidentally, implementing this check highlighted a previously unnoticed bug in our code: we were, in fact, trying to do an unaligned read from EEPROM, which would have otherwise failed at runtime.

Conclusions

While Rust is certainly a significant improvement over older languages (such as C) in terms of safety and developer experience, the language lends itself to far more than what is commonly advertised. This is also where the language’s relative immaturity begins to become more obvious, as awkward workarounds are necessary to deal with language limitations. (Try to implement Index with a custom index type for both [T] and [T; N]. You can’t, because an implementation for [T; N] is automatically generated if an implementation for [T] exists, and it’s seemingly impossible to override it.)

Despite these difficulties, when all the pieces fall into place it’s quite remarkable what can be achieved with the language’s powerful type system and constant evaluation mechanism. As the language matures and features such as const generics are (hopefully) improved upon, I look forward to seeing what new tricks can be implemented to help make writing good, safe code even easier.