I was working on the Advent of Code (day 7) today, which I’m using to learn a bit more about Rust, and while I was able to quickly solve the problem with python (thanks to the fact that python allows references to everything everywhere), I had a hard time solving the problem with Rust, thanks to the inability to keep around references to a “parent” in the tree. This inconvenience is by design: Rust tries to ensure memory safety by forbidding you from doing things that might potentially be unsafe.

Stonewalled by a lot of error[E0106]: missing lifetime specifier errors, I decided to dig a bit deeper into this problem – could we adjust the code in such a way that we could keep track of all of the things that we needed, but still be “safe” from a Rust perspective.

So, what we’d like to have is the following:

First, a filesystem, which contains other file-system objects, a list of files, and their sizes in the local filesystem, and a reference to their parent, so we can implement the cd .. command. We need to be able to add files while we’re traversing the tree (hence why everything here is mutable).

struct FileSystem {
    name: String,
    directories: Vec<&mut FileSystem>,
    files: Vec<(String, usize)>,
    parent: Option<&mut FileSystem>,
}

First, the constructor. It seemed initially like this was pretty easy. We just initialized the objects.

impl FileSystem {

    // Constructor
    pub fn new(name: String, parent: Option<&mut FileSystem>) -> FileSystem {
        return FileSystem {
            name: name,
            directories: Vec::new(),
            files: Vec::new(),
            parent: parent,
        };
    }

Next, several key commands:

impl FileSystem {

    // Compute the size of the current directory
    pub fn size(&self) -> usize {
        let mut size = 0;
        for (_name, s) in self.files.iter() {
            size += s;
        }
        for &mut directory in self.directories {
            size += directory.size();
        }
        return size;
    }

    // Add a file
    pub fn add_file(&mut self, name: &str, size: usize) {
        self.files.push((file_name.to_string(), size));
    }

    // Add a directory
    pub fn add_directory(&mut self, name: &str) {
        self.directories.push(FileSystem::new(file_name.to_string()), self);
    }

    // Get a directory
    pub fn get_directory(&mut self, name: &str) -> Option<&mut FileSystem> {
        for &mut directory in self.directories {
            if directory.name == name {
                return Some(&mut directory);
            }
        }
        return None;
    }

}

If we stitch this together into a simple program:

fn main() {
    // Create a root file system
    let mut root = FileSystem::new("/".to_string(), None);

    // Add a file
    root.add_file("README.md", 1024);

    // Add a directory
    root.add_directory("src");

    // Add a file to the src directory
    let mut src = root.get_directory("src").unwrap();
    src.add_file("main.rs", 2048);

    // Compute the size of the root directory
    println!("Size of root directory: {}", root.size());
}

We can compile this, and get our first set of errors!

error[E0106]: missing lifetime specifier
 --> ./part-1-tree.rs:8:22
  |
8 |     directories: Vec<&mut FileSystem>,
  |                      ^ expected named lifetime parameter
  |
help: consider introducing a named lifetime parameter
  |
6 ~ struct FileSystem<'a> {
7 |     name: String,
8 ~     directories: Vec<&'a mut FileSystem>,
  |

error[E0106]: missing lifetime specifier
  --> ./part-1-tree.rs:10:20
   |
10 |     parent: Option<&mut FileSystem>,
   |                    ^ expected named lifetime parameter
   |
help: consider introducing a named lifetime parameter
   |
6  ~ struct FileSystem<'a> {
7  |     name: String,
8  |     directories: Vec<&mut FileSystem>,
9  |     files: Vec<(String, usize)>,
10 ~     parent: Option<&'a mut FileSystem>,
   |

error: aborting due to 2 previous errors

This is rust telling us that it’s possible that the references we’re keeping to the directories within a FileSystem could go out of scope without us knowing about it. Similarly, the references to the parents could go out of scope as well.

The directories are the easy part of this. Since we don’t necessarily care about symlinks (i.e. linking to other directories in the filesystem), we can update the code so that each FileSystem owns it’s own directories. To do this, we can update the code to go from directories: Vec<&mut FileSystem> to directories: Vec<FileSystem> and update the code accordingly:

struct FileSystem {
    name: String,
    directories: Vec<FileSystem>,
    files: Vec<(String, usize)>,
}

impl FileSystem {
    pub fn new(name: String) -> FileSystem {
        return FileSystem {
            name: name,
            directories: Vec::new(),
            files: Vec::new(),
        };
    }

    // Add a directory
    pub fn add_directory(&mut self, name: &str) {
        self.directories.push(FileSystem::new(name.to_string()));
    }

    // Get a directory
    pub fn get_directory(&mut self, name: &str) -> Option<&mut FileSystem> {
        for directory in self.directories.iter_mut() {
            if directory.name == name {
                return Some(directory);
            }
        }
        return None;
    }
}

This gives us a working filesystem object, but we’ve lost two key components. The parent pointer, and the ability to link two directories together. Unfortunately, the best option in rust is to keep track of the parent pointer in any traversal we’re doing. For example, we could keep track of a stack of directories, and each time we do a traversal, we could push and pop elements onto the stack. A simple implementation of this might look like the following:

fn main() {
    // Read the input file from the command line
    let filename = env::args().nth(1).expect("No filename given");
    let contents = fs::read_to_string(filename).expect("Could not read file");

    // Create a root file system
    let mut root = FileSystem::new("/".to_string());
    let mut traversal_stack: Vec<&mut FileSystem> = Vec::new();

    traversal_stack.push(&mut root);

    // Parse the input file
    for line in contents.lines() {
        print_path(&traversal_stack);

        // If the line is a command, execute it (Commands are of the form $ <command> args)
        if line.starts_with("$") {
            let command = line.split_whitespace().nth(1).unwrap();
            // Print the command
            println!("Command: {}", command);
            match command {
                "mkdir" => {
                    let name = line.split_whitespace().nth(2).unwrap();
                    traversal_stack.last_mut().unwrap().add_directory(name);
                }
                "cd" => {
                    let name = line.split_whitespace().nth(2).unwrap();
                    if name == ".." {
                        if traversal_stack.len() > 1 {
                            traversal_stack.pop();
                        }
                    } else if name == "/" {
                        // Unwind the traversal stack to the root
                        while traversal_stack.len() > 1 {
                            traversal_stack.pop();
                        }
                    } else {
                        // Oh No!
                        let new_dir = traversal_stack.last_mut().unwrap().get_directory(name);
                        if let Some(new_dir) = new_dir {
                            traversal_stack.push(new_dir);
                        }
                    }
                }
                "touch" => {
                    let name = line.split_whitespace().nth(2).unwrap();
                    let size = line
                        .split_whitespace()
                        .nth(3)
                        .unwrap()
                        .parse::<usize>()
                        .unwrap();
                    traversal_stack.last_mut().unwrap().add_file(name, size);
                }
                "ls" => {
                    let directory = traversal_stack.last_mut().unwrap();
                    for directory in directory.directories.iter() {
                        println!("{} {}", directory.name, directory.size());
                    }
                    for (name, size) in directory.files.iter() {
                        println!("{} {}", name, size);
                    }
                }
                _ => {
                    println!("Unknown command: {}", command);
                }
            }
        }
    }

    // Print the size of the root directory
    println!("Size of root directory: {}", root.size());
}

But wait! We get a new error!!!

error[E0502]: cannot borrow `traversal_stack` as mutable because it is also borrowed as immutable
   --> ./part-1-tree.rs:127:37
    |
105 |                         let new_dir = traversal_stack
    |                                       --------------- immutable borrow occurs here
...
127 |                     let directory = traversal_stack.last_mut().unwrap();
    |                                     ^^^^^^^^^^^^^^^
    |                                     |
    |                                     mutable borrow occurs here
    |                                     immutable borrow later used here

Without the changing directory (i.e. everything the traversal stack is useful for…), this code works, but the error illustrates a key problem with how Rust handles graph structures. In order to update the structure, you have to borrow the whole traversal stack (as either mutable or immutable), so we borrow it as mutable, but we can’t release it, since the mutable pointer to the new directory has the same lifecycle as the traversal stack! Even though these two objects aren’t related, Rust doesn’t know that the traversal stack and the directory are completely unrelated. Thus, we need to think of a new way to solve this problem.

How I originally solved this problem for the day of the advent of code was by using integer indices into arrays to keep track of the objects. This solves the problem, since we’ve basically implemented pointers on our own, instead of using Rust’s borrowing mechanisms, but it goes completely against the ethos of Rust, since we’ve just created a situation where unsafe dereferences of integers (our makeshift pointers), could introduce unknown effects. Yes, for advent of code this doesn’t really matter… but what if we were working on something more complicated?