Rust: iter() vs into_iter()

TLDR

  • The iterator returned by into_iter() can yield T, &T, or &mut T, depending on the type it is called on: normally T for an owned collection, &T for a shared reference to it, and &mut T for a mutable reference to it.
  • The iterator returned by iter() will yield a &T by convention.
  • The iterator returned by iter_mut() will yield &mut T by convention.

WTF is into_iter()?

into_iter() comes from the IntoIterator trait, which you implement when you want to specify how a particular type gets converted into an iterator. Notably, if you want a type to be usable in a for loop, the type must implement IntoIterator (every Iterator does so automatically).
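To make that concrete, here is a minimal sketch of a custom type that works in a for loop. Playlist, total_title_len, and into_shouted are hypothetical names made up for this example; the only real API used is the IntoIterator trait plus the standard Vec and slice iterator types.

```rust
// Hypothetical wrapper type, used only for illustration.
struct Playlist {
    songs: Vec<String>,
}

// Consuming iteration: `for song in playlist` yields owned Strings.
impl IntoIterator for Playlist {
    type Item = String;
    type IntoIter = std::vec::IntoIter<String>;

    fn into_iter(self) -> Self::IntoIter {
        self.songs.into_iter()
    }
}

// Borrowing iteration: `for song in &playlist` yields &String.
impl<'a> IntoIterator for &'a Playlist {
    type Item = &'a String;
    type IntoIter = std::slice::Iter<'a, String>;

    fn into_iter(self) -> Self::IntoIter {
        self.songs.iter()
    }
}

// `playlist` is a &Playlist here, so the for loop uses the second impl
// and the playlist is still usable by the caller afterwards.
fn total_title_len(playlist: &Playlist) -> usize {
    let mut total = 0;
    for song in playlist {
        total += song.len();
    }
    total
}

// `playlist` is owned here, so the for loop uses the first impl
// and each `song` is an owned String.
fn into_shouted(playlist: Playlist) -> Vec<String> {
    let mut out = Vec::new();
    for song in playlist {
        out.push(song.to_uppercase());
    }
    out
}
```

Without the second impl, `for song in &playlist` would not compile, which is why collections usually implement the trait more than once.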

As an example, Vec<T> implements IntoIterator three times:

impl<T> IntoIterator for Vec<T>
impl<'a, T> IntoIterator for &'a Vec<T>
impl<'a, T> IntoIterator for &'a mut Vec<T>

Each of these is slightly different. The first one consumes the Vec and yields its T values directly.

The other two take the Vec by reference and yield immutable and mutable references to its elements (&T and &mut T, respectively).
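A quick sketch of how those three impls show up in everyday for loops. The function name loop_demo is made up for this example, and the type comments reflect what the compiler infers for a Vec<i32>.

```rust
// Walks the same Vec three ways; each loop picks a different IntoIterator impl.
fn loop_demo() -> (i32, Vec<i32>, i32) {
    let mut v = vec![1, 2, 3];

    // &Vec<i32>: x is &i32, and v is only borrowed.
    let mut sum_refs = 0;
    for x in &v {
        sum_refs += *x;
    }

    // &mut Vec<i32>: x is &mut i32, so elements can be changed in place.
    for x in &mut v {
        *x *= 10;
    }

    // Keep a copy so the modified contents can be returned.
    let snapshot = v.clone();

    // Vec<i32>: x is i32, and v is moved into the loop (unusable afterwards).
    let mut sum_owned = 0;
    for x in v {
        sum_owned += x;
    }

    (sum_refs, snapshot, sum_owned)
}
```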

Yeah okay cool, so what's the difference though?

into_iter() is a generic method to obtain an iterator, and what this iterator yields (values, immutable references, or mutable references) depends on the type it is called on (Vec<T>, &Vec<T>, or &mut Vec<T>), which can sometimes be something you aren't expecting.
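One place this tends to surprise people is calling into_iter() on something that is already a reference. A small sketch, with name_lengths and exclaim_all as hypothetical helper names:

```rust
// `names` is a &Vec<String>, so into_iter() resolves to the impl for
// &Vec<T> and yields &String, not String. The Vec is not consumed.
fn name_lengths(names: &Vec<String>) -> Vec<usize> {
    names.into_iter().map(|name: &String| name.len()).collect()
}

// `names` is an owned Vec<String>, so the same method call now yields
// owned Strings and consumes the Vec.
fn exclaim_all(names: Vec<String>) -> Vec<String> {
    names.into_iter().map(|name: String| name + "!").collect()
}
```

The method call looks identical in both functions; only the type of names decides what comes out.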

iter() and iter_mut() have return types independent of the context, and conventionally return iterators over immutable and mutable references respectively.

This is best shown with examples, so code blocks incoming:

#[test]
fn iter_demo() {
    let v1 = vec![1, 2, 3];
    let mut v1_iter = v1.iter();

    // iter() returns an iterator over references to the values
    assert_eq!(v1_iter.next(), Some(&1));
    assert_eq!(v1_iter.next(), Some(&2));
    assert_eq!(v1_iter.next(), Some(&3));
    assert_eq!(v1_iter.next(), None);
}

#[test]
fn into_iter_demo() {
    let v1 = vec![1, 2, 3];
    let mut v1_iter = v1.into_iter();

    // into_iter() returns an iterator over owned values in this particular case
    assert_eq!(v1_iter.next(), Some(1));
    assert_eq!(v1_iter.next(), Some(2));
    assert_eq!(v1_iter.next(), Some(3));
    assert_eq!(v1_iter.next(), None);
}

#[test]
fn iter_mut_demo() {
    let mut v1 = vec![1, 2, 3];
    let mut v1_iter = v1.iter_mut();

    // iter_mut() returns an iterator over mutable references to the values
    assert_eq!(v1_iter.next(), Some(&mut 1));
    assert_eq!(v1_iter.next(), Some(&mut 2));
    assert_eq!(v1_iter.next(), Some(&mut 3));
    assert_eq!(v1_iter.next(), None);
}
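The tests above all call these methods on an owned Vec. As a complement, here is a rough sketch showing that iter() keeps yielding &T even when it is called through a mutable reference, while iter_mut() hands out &mut T. The function name iter_ignores_context is made up for this example.

```rust
fn iter_ignores_context() -> (i32, i32, Vec<i32>) {
    let mut v = vec![1, 2, 3];

    // Called on the owned Vec: items are &i32.
    let sum_owned: i32 = v.iter().map(|x: &i32| *x).sum();

    // Called through a mutable reference: items are still &i32.
    let by_mut_ref: &mut Vec<i32> = &mut v;
    let sum_via_mut_ref: i32 = by_mut_ref.iter().map(|x: &i32| *x).sum();

    // iter_mut() yields &mut i32, so the elements can be edited in place.
    by_mut_ref.iter_mut().for_each(|x: &mut i32| *x += 1);

    (sum_owned, sum_via_mut_ref, v)
}
```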

Pointers in C

Preface

Pointers were one of the most confusing topics for me while learning C and C++, so here's a little reminder for myself on how they work. Some of the syntax in this post may not compile under a C++ compiler, as C++ isn't a strict superset of C, but the general idea should still apply.

What is a pointer?

A pointer is a number that represents an address in memory where some sort of object is stored. A pointer can also have a value of zero, sometimes called a null pointer, which indicates the pointer points to nothing.

Where does the asterisk go?

// Neat trick to avoid refactoring in the future:
// when calculating the size of the type in malloc, use
// the dereferenced variable name. If the type is changed
// later, the allocation size still matches.
int* option1 = malloc(sizeof(*option1));

int * option2 = malloc(sizeof(*option2));

int *option3 = malloc(sizeof(*option3));

All of these are legal syntax. I've seen the first and last styles before, but I don't know anybody who puts the "*" where it's shown in option two (you're kind of a monster if you do). I believe option three is what K&R suggests, the logic being that the declaration mirrors how you would use the dereference operator to get what's pointed to. While I understand this idea, I still prefer option one, as it makes it much clearer that the variable's type is a pointer.

Using const with pointers

The const modifier is "left associative", which means that it applies to whatever is immediately to its left. The exception is when there's nothing to its left, in which case it applies to whatever is to its right. I choose to always write const so that it binds to the left, as it makes reading const pointers to const objects easier (pointer three in the following code). In general, you can read the type of a variable from right to left to understand what it is.

// the pointer can point anywhere in memory,
// but you can't modify the data at the address
// "pointer to constant object"
int const* one = malloc(sizeof(*one));

// the pointer can only ever point to this one memory address,
// but I can change the data there
// "constant pointer to object"
int* const two = malloc(sizeof(*two));

// the pointer can only point at this one memory address,
// and I can't modify the data there
// "constant pointer to constant object"
int const* const three = malloc(sizeof(*three));