59

In Learning Rust With Entirely Too Many Linked Lists, the author mentions:

However, if we have a special kind of enum:

enum Foo {
    A,
    B(ContainsANonNullPtr),
}

the null pointer optimization kicks in, which eliminates the space needed for the tag. If the variant is A, the whole enum is set to all 0's. Otherwise, the variant is B. This works because B can never be all 0's, since it contains a non-zero pointer.

My guess is that the author means the following (assuming A is 4 bits and B is 4 bits). For

let test = Foo::A

the memory layout is

0000 0000

but

let test = Foo::B

the memory layout is

some 8 bit non 0 value

What exactly is optimized here? Aren't both representations always 8 bits? What does it mean when the author claims:

It means &, &mut, Box, Rc, Arc, Vec, and several other important types in Rust have no overhead when put in an Option


4 Answers

85

The null pointer optimization basically means that if you have an enum with two variants, where one variant has no associated data, and the other variant has associated data where the bit pattern of all zeros isn't a valid value, then the enum itself will take exactly the same amount of space as that associated value, using the all zeroes bit pattern to indicate that it's the other variant.

In other words, this means that Option<&T> is exactly the same size as &T instead of requiring an extra word.
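
The size claim can be checked directly. This is a minimal sketch; the helper `ref_sizes` is a hypothetical name introduced only for illustration.

```rust
use std::mem::size_of;

/// Hypothetical helper: returns (size of `&T`, size of `Option<&T>`).
fn ref_sizes<T>() -> (usize, usize) {
    (size_of::<&T>(), size_of::<Option<&T>>())
}

fn main() {
    let (plain, wrapped) = ref_sizes::<u64>();
    // A reference can never be null, so `None` reuses the all-zero pattern.
    assert_eq!(plain, wrapped);

    // Every bit pattern of `usize` is a valid value, so `Option<usize>`
    // has to store its tag somewhere else and ends up larger.
    assert!(size_of::<Option<usize>>() > size_of::<usize>());

    println!("&u64: {plain} bytes, Option<&u64>: {wrapped} bytes");
}
```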

  • I understand that. But how does the compiler know if all-zeros is an invalid value? I assume the optimization only kicks in for specific built-in types. If so, which ones? Commented Dec 14, 2019 at 13:22
  • The compiler has built-in knowledge about the memory layout of various types. For example, it knows that &-references can never be null. It also knows that String and Vec can never be all zeroes; following this down into the implementation, String is backed by Vec, which is backed by RawVec, which is backed by Unique, which contains a *const T but has a compiler attribute that declares that it can't be null. Similarly there's a stdlib NonNull<T> type that acts like a *mut T that can never be null. Commented Dec 15, 2019 at 21:42
  • To elaborate, empty strings and vectors don't point to null, they point to a fixed nonnull address with zero capacity. Same thing with other containers like HashMaps that can be cheaply created. The Rust stdlib tries very hard to avoid nulls in pointers specifically so the all-zeroes value can be reserved for things like the null pointer optimization. Commented Dec 15, 2019 at 21:43
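
The points made in these comments can be observed from safe code. The sketch below assumes a current compiler; the helper name `empty_vec_ptr_is_non_null` is hypothetical. The size equalities for `Option<Vec<_>>` and `Option<String>` are what compilers produce in practice, while the one for `Option<NonNull<_>>` is a documented guarantee.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

/// Hypothetical helper: an empty, unallocated `Vec` still reports a
/// non-null (dangling but well-aligned) data pointer.
fn empty_vec_ptr_is_non_null() -> bool {
    let v: Vec<u32> = Vec::new();
    !v.as_ptr().is_null()
}

fn main() {
    assert!(empty_vec_ptr_is_non_null());

    // `NonNull<T>` behaves like `*mut T` minus the null value, so wrapping
    // it in `Option` costs nothing.
    assert_eq!(size_of::<Option<NonNull<u32>>>(), size_of::<*mut u32>());

    // `Vec` and `String` inherit the same niche through their internal pointer.
    assert_eq!(size_of::<Option<Vec<u32>>>(), size_of::<Vec<u32>>());
    assert_eq!(size_of::<Option<String>>(), size_of::<String>());
}
```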
24

An enum is a tagged union. Without the optimization, it looks like this:

Foo::A;    // tag 0x00 data 0xXX
Foo::B(2); // tag 0x01 data 0x02

The null pointer optimization removes the separate tag field, so the tag and the data share the same bytes:

Foo::A;    // tag+data 0x00
Foo::B(2); // tag+data 0x02
  • The second example seems to be a little bit inaccurate, taking into account that 0x00 is a valid bit pattern for an integer, thus it's ambiguous meaning both Foo::A and Foo::B(0). – mvlabat Commented Oct 31, 2019 at 11:06
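
The ambiguity raised in this comment disappears when the payload type forbids zero. The following is a sketch that uses `std::num::NonZeroU8` in place of a plain integer; the function `decode` is a hypothetical helper, and it relies only on the transmute directions that the `std::option` documentation guarantees for `NonZero*` types.

```rust
use std::mem::{size_of, transmute};
use std::num::NonZeroU8;

/// Hypothetical helper: reinterprets one byte as `Option<NonZeroU8>`.
fn decode(byte: u8) -> Option<NonZeroU8> {
    // SAFETY: for `NonZero*` types the standard library documents that an
    // all-zero byte pattern is `None`, and that any valid `NonZeroU8`
    // (a non-zero byte) may be transmuted to `Option<NonZeroU8>`.
    unsafe { transmute::<u8, Option<NonZeroU8>>(byte) }
}

fn main() {
    // A plain `u8` payload has no spare bit pattern, so a tag is needed.
    assert!(size_of::<Option<u8>>() > size_of::<u8>());
    // `NonZeroU8` leaves 0 unused, so the tag folds into the payload.
    assert_eq!(size_of::<Option<NonZeroU8>>(), 1);

    assert_eq!(decode(0x00), None);
    assert_eq!(decode(0x02), NonZeroU8::new(2));
}
```

In safe code, `NonZeroU8::new` performs the same mapping without any `unsafe`.
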
13

I'm also working through Too Many Linked Lists; perhaps this code snippet can deepen your understanding:

pub enum WithNullPtrOptimization{
    A,
    B(String),
}

pub enum WithoutNullPtrOptimization{
    A,
    B(u32),
}

fn main()  {
    println!("{} {}", std::mem::size_of::<WithNullPtrOptimization>(), std::mem::size_of::<String>()); // 24 24
    println!("{} {}", std::mem::size_of::<WithoutNullPtrOptimization>(), std::mem::size_of::<u32>()); // 8 4
}

On a 64-bit target, it outputs something along the lines of:

24 24
8 4
0

You can find an explanation and a list of types for which this optimization is applied in the module level documentation of std::option under Representation:

Rust guarantees to optimize the following types T such that Option<T> has the same size, alignment, and function call ABI as T. In some of these cases, Rust further guarantees that transmute::<_, Option<T>>([0u8; size_of::<T>()]) is sound and produces Option::<T>::None. These cases are identified by the second column:

| T | transmute::<_, Option<T>>([0u8; size_of::<T>()]) sound? |
| --- | --- |
| Box<U> (specifically, only Box<U, Global>) | when U: Sized |
| &U | when U: Sized |
| &mut U | when U: Sized |
| fn, extern "C" fn [1] | always |
| num::NonZero* | always |
| ptr::NonNull<U> | when U: Sized |
| #[repr(transparent)] struct around one of the types in this list. | when it holds for the inner type |

[1]: this remains true for any argument/return types and any other ABI: extern "abi" fn (e.g., extern "system" fn)

Under some conditions the above types T are also null pointer optimized when wrapped in a Result.

This is called the "null pointer optimization" or NPO.

It is further guaranteed that, for the cases above, one can mem::transmute from all valid values of T to Option<T> and from Some::<T>(_) to T (but transmuting None::<T> to T is undefined behavior).
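
The transmute guarantee quoted above can be sketched as follows. The functions `wrap` and `unwrap_some` are hypothetical names used for illustration only, and the final `Result` assertion reflects what current compilers produce for a zero-sized error type such as `()`.

```rust
use std::mem::{size_of, transmute};
use std::num::NonZeroI32;

/// Hypothetical helper: reinterprets `&u32` as `Option<&u32>`.
fn wrap<'a>(r: &'a u32) -> Option<&'a u32> {
    // SAFETY: any valid `&U` (with `U: Sized`) may be transmuted to `Option<&U>`.
    unsafe { transmute::<&'a u32, Option<&'a u32>>(r) }
}

/// Hypothetical helper: reinterprets `Some(&u32)` as `&u32`.
fn unwrap_some<'a>(o: Option<&'a u32>) -> &'a u32 {
    // Transmuting `None` to `&u32` would be undefined behavior, so rule it out.
    assert!(o.is_some(), "refusing to transmute None into a reference");
    // SAFETY: `o` is `Some(_)`, which may be transmuted back to `&u32`.
    unsafe { transmute::<Option<&'a u32>, &'a u32>(o) }
}

fn main() {
    let x = 7u32;
    let wrapped = wrap(&x);
    assert_eq!(wrapped, Some(&7));
    assert_eq!(*unwrap_some(wrapped), 7);

    // With a zero-sized error type, `Result` gets the same treatment.
    assert_eq!(size_of::<Result<NonZeroI32, ()>>(), size_of::<NonZeroI32>());
}
```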
