Basics of References and Borrowing

Let's say we have a vector of integers and we want to find its mode (the number that shows up the most times). The main difficulty is that we have to keep track of how many times each number shows up, so that at the end we can pick the most common one. What we want is called a map. A map lets us go from a key to a value. In this case, the key is the number and the value is the count of how many times it has shown up. Here, we will use an array as a rudimentary map. An array can act as a map because its indices serve as the keys and the elements stored at those indices serve as the values. Here's the function:

 fn compute_mode(input: Vec<i32>) -> i32 {
     // count how many times each number appears;
     // the number itself is used as the index into the array
     let mut number_count = [0; 100];
     for index in 0..input.len() {
         let number = input[index];
         number_count[number as usize] += 1;
     }
 
     // work out which number has the highest count
     let mut highest_count = 0;
     let mut highest_index = 0;
     for index in 0..number_count.len() {
         let count = number_count[index];
         if count > highest_count {
             highest_count = count;
             highest_index = index;
         }
     }
     // the index with the highest count is the mode
     return highest_index as i32;
 }

There are quite a few interesting things to go through here. First, the [0; 100] syntax creates an array of 100 elements, each initialised to 0. It is a convenient shorthand. Next, what is usize? Different computers use a different number of bits for memory addresses, and therefore for array indices: typically 32 bits on a 32-bit machine and 64 bits on a 64-bit machine. usize is an unsigned integer type whose size matches the machine the program is compiled for. It is unsigned since we can't have negative indices. All indexing of arrays and vectors uses usize, which is why we convert the i32 number with 'number as usize' before using it as an index. The index that comes from 'for index in 0..input.len()' is also a usize, since input.len() returns a usize. There is also a corresponding signed type, isize, but it is used quite a bit less; it is most useful for offsets between indices, where the offset might be negative.
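To make the usize and isize discussion concrete, here is a small sketch. The helper shifted_index is a hypothetical name made up for this illustration; it is not part of the mode example.

```rust
use std::mem::size_of;

// Hypothetical helper: moves an index by a signed offset.
// It assumes the result does not end up below zero.
fn shifted_index(index: usize, offset: isize) -> usize {
    return (index as isize + offset) as usize;
}

fn main() {
    // usize is as wide as a memory address on the machine being compiled for:
    // 8 bytes on a 64-bit machine, 4 bytes on a 32-bit one
    println!("usize is {} bytes wide", size_of::<usize>());

    let values = vec![10, 20, 30, 40];

    // len() returns a usize, so index is a usize as well
    for index in 0..values.len() {
        println!("values[{}] = {}", index, values[index]);
    }

    // an i32 has to be converted before it can be used as an index
    let position: i32 = 2;
    println!("{}", values[position as usize]);

    // isize can hold a negative offset, here "one element earlier"
    let earlier = shifted_index(2, -1);
    println!("{}", values[earlier]);
}
```

The last two lines print 30 and 20, the elements at index 2 and index 1.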

There are some significant disadvantages to using an array as a map:

  1. The numbers that go in must not be negative, since the usize used to index the array is unsigned. A negative number converted with 'as usize' turns into a very large index, and the program panics with an out-of-bounds error.
  2. The numbers that go in must be smaller than the length of the array. Here we set the length to 100, so only 0 to 99 are allowed. You could set it higher, but that would use up more memory.
  3. There could be quite a bit of wasted memory: for the vectors we are using, most of the counts will stay at 0 but still take up space.

There are solutions to these problems, but for now it's enough to be aware of the shortcomings; the array is sufficient for what we are doing right now.
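As a preview of one possible solution, here is a minimal sketch that swaps the array for the standard library's HashMap, which accepts any i32 as a key and only stores the numbers that actually appear. The function name compute_mode_with_map is an assumption made for this illustration, and the code uses some syntax (entry, or_insert and the leading *) that has not been covered yet.

```rust
use std::collections::HashMap;

// Hypothetical variant of compute_mode that uses a HashMap instead of an array.
fn compute_mode_with_map(input: Vec<i32>) -> i32 {
    // key: the number, value: how many times it has shown up
    let mut number_count: HashMap<i32, u32> = HashMap::new();
    for index in 0..input.len() {
        let number = input[index];
        // entry() looks up the count for this number, or_insert(0) starts it
        // at 0 if the number has not been seen before, then we add 1 to it
        *number_count.entry(number).or_insert(0) += 1;
    }

    // work out which number has the highest count
    // (if two numbers tie, either one may be returned;
    // an empty input returns 0)
    let mut highest_count = 0;
    let mut highest_number = 0;
    for (number, count) in number_count {
        if count > highest_count {
            highest_count = count;
            highest_number = number;
        }
    }
    return highest_number;
}

fn main() {
    // negative numbers and numbers above 99 are fine here
    let numbers = vec![-7, 250, -7, 3];
    println!("{}", compute_mode_with_map(numbers));
}
```

This prints -7. The array version would have panicked on this input, since -7 and 250 are not valid indices into an array of length 100.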

Now we can call this function from main:

 fn main() {
     let numbers = vec![3, 5, 6, 1, 3, 6, 3, 5];
     let mode = compute_mode(numbers);
     
     println!("{}", mode);
 }

This prints 3, which is what we expect. Let's say we also want to print the 4th value (index 3) of the vector:

 fn main() {
     let numbers = vec![3, 5, 6, 1, 3, 6, 3, 5];
     let mode = compute_mode(numbers);
     
     println!("{}", mode);
     println!("{}", numbers[3]);
 }

However, if we try to compile this, we now get an error: 'borrow of moved value: `numbers`'. What does this mean?

This is one of Rust's fundamental concepts: the idea of ownership and borrowing. Essentially, a value can only have one owner at a time. In this example, calling compute_mode(numbers) moves ownership of the vector into that function, so main can no longer use numbers afterwards. This is an extremely important idea to understand.
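The move described above can be seen in a smaller program. This is only a sketch: sum_owned is a hypothetical helper invented for this illustration, and clone() is one possible workaround, used here just to show that a copy is a separate value with its own owner.

```rust
// Hypothetical helper that takes ownership of the vector it is given.
fn sum_owned(input: Vec<i32>) -> i32 {
    let mut total = 0;
    for index in 0..input.len() {
        total += input[index];
    }
    return total;
}

fn main() {
    let numbers = vec![3, 5, 6];

    // clone() makes a separate copy of the vector;
    // the function takes ownership of the copy, not of numbers
    let total_of_copy = sum_owned(numbers.clone());
    println!("{}", total_of_copy);

    // numbers is still owned by main here, so this works
    println!("{}", numbers[0]);

    // this call moves numbers itself into sum_owned
    let total = sum_owned(numbers);
    println!("{}", total);

    // uncommenting the next line gives the error
    // "borrow of moved value: `numbers`"
    // println!("{}", numbers[0]);
}
```

Copying the whole vector just to read it is wasteful, which is why a cheaper approach is worth looking for.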

So what's the best solution to our problem? Does every function need to own a value to be able to use it? The answer is no, and the solution is something called borrowing. It works the way you might expect: the function borrows the value and gives it back once it has finished using it. Borrowing is done with what are called references. A reference type is written by putting an ampersand, &, in front of a normal type. For example, a &i32 is a reference to an i32 integer. If you have a reference to a variable, you cannot change/mutate the variable unless it is a mutable reference. A mutable reference is written with &mut, so &mut i32 is a mutable reference to an i32 integer.
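Here is a minimal sketch of both kinds of reference on a plain i32. The function names double and add_one are hypothetical and exist only for this illustration. The * in front of value means "the integer that the reference points to".

```rust
// Hypothetical helper: borrows an integer and only reads it.
fn double(value: &i32) -> i32 {
    return *value * 2;
}

// Hypothetical helper: borrows an integer mutably and changes it.
fn add_one(value: &mut i32) {
    *value += 1;
}

fn main() {
    // the variable itself must be declared mut to be borrowed mutably
    let mut count = 5;

    // &count lends count to the function without giving it away
    let doubled = double(&count);

    // &mut count lends it in a way that allows changes
    add_one(&mut count);

    // count is still usable here because it was only borrowed
    println!("{} {}", doubled, count);
}
```

This prints 10 6: double read the original value 5, and add_one then changed count to 6.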

For our compute_mode function, we don't need to modify the vector, so we only need to change the signature to:

 fn compute_mode(input: &Vec<i32>) -> i32 {
     //the inside of the function remains the same
 }

Then in our main function, we do:

 fn main() {
     let numbers = vec![3, 5, 6, 1, 3, 6, 3, 5];
     let mode = compute_mode(&numbers);
     
     println!("{}", mode);
     println!("{}", numbers[3]);
 }

Note that we have to specify that we are taking a reference to numbers when calling the function. Now this runs without any errors!
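For reference, here is the whole program in one piece, with the unchanged function body filled back in. It is assembled from the pieces above rather than being anything new.

```rust
fn compute_mode(input: &Vec<i32>) -> i32 {
    // count how many times each number appears
    let mut number_count = [0; 100];
    for index in 0..input.len() {
        let number = input[index];
        number_count[number as usize] += 1;
    }

    // work out which number has the highest count
    let mut highest_count = 0;
    let mut highest_index = 0;
    for index in 0..number_count.len() {
        let count = number_count[index];
        if count > highest_count {
            highest_count = count;
            highest_index = index;
        }
    }
    return highest_index as i32;
}

fn main() {
    let numbers = vec![3, 5, 6, 1, 3, 6, 3, 5];

    // &numbers lends the vector to compute_mode; main keeps ownership
    let mode = compute_mode(&numbers);

    println!("{}", mode);
    println!("{}", numbers[3]);
}
```

This prints 3 and then 1. As a side note, functions that only read a vector are often written to take a slice, &[i32], instead of &Vec<i32>; the call compute_mode(&numbers) would stay the same.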

Next: Strings