
Learning to Fly: Let's simulate evolution in Rust! (pt 3)

source link: https://pwy.io/en/posts/learning-to-fly-pt3/

This is the third part of the Learning to Fly series, in which we’re coding a simulation of evolution using neural networks and genetic algorithms:

In the previous post, we implemented a rudimentary feed-forward neural network that could propagate numbers through its randomized layers - it was the first milestone in our efforts to create a functioning brain.

Randomness can get us only so far, though - for the most part, evolution is about making small, incremental changes to the system, so that it gets better over time; so that our brains, from entirely haywire~ish, begin to gather knowledge and function as we expect from them (i.e. just catch the food, birdies!).

But how can we train a bunch of Vec<f32>?

Plan

Plan for today is straightforward: we’ll learn how a genetic algorithm works by implementing one in Rust. We’ll examine in-depth how selection, crossover, and mutation all play together, allowing for a computer to find complex answers seemingly out of thin air.

We’ll try to remain generic, meaning that instead of hard-coding a particular selection or crossover algorithm, we’ll use traits to create a versatile library that could be even published to crates.io!

As in the previous part, today we’ll also investigate various intricacies of Rust’s syntax, focusing a lot on the terminology.

Hopefully, by the end of this post you’ll be able to say: I could have implemented it on my own!

Introduction

First, let’s recap how a genetic algorithm works and what the point of this venture is.

Our problem is that we’ve got an object - a neural network - that’s defined by a whole lot of parameters. There are so many of them that, even for the smallest of networks, we couldn’t possibly brute-force all of their combinations in our lifetime.

As you might remember, the set of all the possible parameters is called a search space; so an erudite would say that our problem’s search space is huuuuge and then run away.

What we can do is kinda mimic nature: if we started with a bunch of random suboptimal solutions, we could try improving them to get - over time - gradually better answers.

One of the methods that allows us to simulate all this evolutionary machinery is the eponymous genetic algorithm: it starts with a bunch of random solution-candidates (a population), which are then improved using mutation and crossover, utilizing a fitness function to evaluate solutions (individuals) found so far:

intro 1
Figure 2. Overview of the genetic algorithm; starting from the top and going clockwise: (1) estimating current solutions with a fitness function, (2) performing crossover and mutation, (3) starting the entire process over on the new, improved population

Because a genetic algorithm involves working on random numbers, it’s an example of a probabilistic method.

Probabilistic algorithms trade accuracy for performance - they don’t always return the best answers, but usually get Pretty Close Pretty Cheaply ®.

Vegetable manufacturers don’t want you to know this, but there exists a straightforward procedure to becoming a carrot tycoon that’s actually based on a genetic algorithm:

10  go to your garden
20  sow a few random carrots
30  wait for those carrots to sprout
40  choose the best carrot-children and sow them
50  goto 30

in this world:
- population = carrots
- individual = carrot
- mutation & crossover = happen automatically (free labor!)
- fitness function = your eyes & brain

By now, most of those words ought to sound familiar to you - we’ve already gone through the basics of evolutionary computation in the first article; by the end of this one, you’ll know the answers to questions such as:

  • but how do you choose the individuals? there must be like a thousand ways to do it!
    (psst: oh my yes, there are)

  • but how do you represent their genomes? there must be like a thousand ways to do it!
    (psst: oh my yes, there are)

  • but how do you implement it in Rust? you promised it’ll work inside a web browser!
    (psst: oh my yes, it will)

Coding: Outline

We’ll start by creating a second crate inside our workspace:

$ cd shorelark/libs
$ cargo new genetic-algorithm --lib

Oh, that’s some nice genetic-algorithm/src/lib.rs Cargo created for us in there - let’s replace it with an equally good entry point:

pub struct GeneticAlgorithm;

impl GeneticAlgorithm {
    pub fn new() -> Self {
        Self
    }
}

Our genetic algorithm, as all good objects do, will provide only one functionality - sometimes it’s called iterate, sometimes it’s called step or process; I’ve tossed a coin and decided on:

impl GeneticAlgorithm {
    /* ... */

    pub fn evolve(&self) {
        todo!()
    }
}

What are we evolving? A population, of course!

impl GeneticAlgorithm {
    pub fn evolve(&self, population: &[???]) -> Vec<???> {
        todo!()
    }
}

Our actual problem will depend on neural networks, but since we want this library to be generic, we can’t force it to accept a hard-coded NeuralNetwork - instead, we can introduce a type parameter:

impl GeneticAlgorithm {
    pub fn evolve<I>(&self, population: &[I]) -> Vec<I> {
        todo!()
    }
}

I stands for individual.

As for Rust’s terminology:

// visibility  generics   _ function parameters
// |          _|     ____|  (or just "parameters")
// |         |      |
// v-v       v-----vv----------v
   pub fn foo<'a, T>(bar: &'a T) { /* ... */ }
//            ^^  ^  ^--------^
//            |   |  |
//            |   |  function parameter
//            |   |  (or just "parameter")
//            |   type parameter
//            lifetime parameter

If you wanted to read this signature aloud, you’d say:

public function foo is generic over lifetime a and type T, and it accepts a single parameter named bar that is a reference to T.

That was the function’s definition - on the other hand, the place where you invoke a function is called the call site, and the values you specify there are called arguments:

// v-----------------------v call site
   foo::<'static, f32>(&1.0);
//       ^-----^  ^-^  ^--^
//       |        |    |
//       |        |    function argument
//       |        |    (or just "argument")
//       |        type argument
//       lifetime argument

Most of this vernacular (e.g. the difference between argument and parameter) is universal across all the programming languages, so it’s worth remembering.

Learning from past mistakes, let’s not forget about preconditions:

pub fn evolve<I>(&self, population: &[I]) -> Vec<I> {
    assert!(!population.is_empty());

    /* ... */
}

As for the algorithm itself - the outline is:

pub fn evolve<I>(&self, population: &[I]) -> Vec<I> {
    /* ... */

    (0..population.len())
        .map(|_| {
            // TODO selection
            // TODO crossover
            // TODO mutation
        })
        .collect()
}

Coding: Selection

At this point, inside the loop, we have to pick two individuals - they will become parents and "create" us a digital offspring.

(imagining that a chip made of sand can create "offspring" hits my uncanny valley right in the feels.)

Choosing individuals is called the selection stage of a genetic algorithm, and it should satisfy the following two properties:

  • each individual should have a non-zero chance of being picked,

  • an individual with a higher fitness score should get picked, on average, more often than an individual with a lower fitness score.

Because we’ll have to know fitness scores, let’s start off by thinking about how we want users to specify their fitness function; as trivial as it sounds, there are at least two distinct approaches we can apply:

  1. Fitness function as a parameter:

    pub fn evolve<I>(
        &self,
        population: &[I],
        evaluate_fitness: &dyn Fn(&I) -> f32,
    ) -> Vec<I> {
        /* ... */
    }
  2. Fitness score as a property of an individual:

    pub trait Individual {
        fn fitness(&self) -> f32;
    }
    
    pub fn evolve<I>(&self, population: &[I]) -> Vec<I>
    where
        I: Individual,
    {
        /* ... */
    }

First approach:

  • ✅ allows providing many different fitness functions for one kind of individual, which might prove useful for somebody (not us, though),

  • ❌ requires specifying fitness function for each invocation of .evolve(), which feels a bit shoehorned.

Second approach:

  • ✅ allows encapsulating all the individual-oriented attributes into a single trait, making it easy for users to discover what they need to provide,

  • ❌ specifying different fitness functions is possible, but a bit more tricky (can you figure out how and why?).

My guts vote 133:7 for introducing a trait (a trait that, as you’ll see later, we would need anyway), so a trait it is.

As for the selection method, we’ll use an algorithm called fitness proportionate selection (also known as roulette wheel selection), as it’s easy to reason about; to understand how it works, let’s imagine we have the following three individuals:

Individual   Fitness score   Fitness score %
A            3               3 / (1 + 2 + 3) = 3 / 6 = 50%
B            2               2 / (1 + 2 + 3) = 2 / 6 = 33%
C            1               1 / (1 + 2 + 3) = 1 / 6 = 16%

If we placed them all on a roulette wheel - or a pie chart, for all it matters - with each individual getting a slice of the wheel proportional to their fitness score:

coding selection 1
Figure 3. A pie chart - or a roulette wheel, if you squint hard enough - illustrating the individuals from the table above

... randomizing an individual would boil down to "spinning" the wheel with random force and seeing what came up:

coding selection 2
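If you prefer code to pie charts, here’s a minimal, self-contained sketch of what that "spin" boils down to (it’s not part of our library - we’ll arrive at a nicer implementation in a moment); the spin_force parameter is a stand-in for a random number we’d normally draw from a PRNG:

// "Spinning the wheel" = picking a random point somewhere within the total
// fitness and walking the slices until the point falls into one of them
fn spin(spin_force: f32, fitness_scores: &[f32]) -> usize {
    let mut point = spin_force;

    for (index, fitness) in fitness_scores.iter().enumerate() {
        point -= fitness;

        if point <= 0.0 {
            return index;
        }
    }

    // Floating-point rounding might leave a tiny remainder - in that case,
    // return the last slice
    fitness_scores.len() - 1
}

fn main() {
    let fitness_scores = [3.0, 2.0, 1.0];

    // With the scores from the table above, spin forces within [0, 3) land
    // on the first individual, [3, 5) on the second one, [5, 6) on the third:
    assert_eq!(spin(1.5, &fitness_scores), 0);
    assert_eq!(spin(3.5, &fitness_scores), 1);
    assert_eq!(spin(5.5, &fitness_scores), 2);
}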

In practice, fitness proportionate selection is rather frowned upon - that’s because it allows the best individuals to dominate the simulation.

Say, your genetic algorithm finds a solution that’s exponentially better than the rest:

coding selection 3

... when it happens, fitness proportionate selection will happily choose this green solution 99% of the time, making the rest of the individuals an army of copy-pasted, green clones.

You might think:

isn’t finding the best solution, like, the whole point?

... it is; but it’s important to remember that solutions found by a genetic algorithm are always the best to date - if you discard a seemingly unpromising candidate too early, you will never know whether tuning some parameter wouldn’t make it an even better solution in the long run.

To put it another way:

the more diverse humans you have, the greater chance one of them is a trombone prodigy

For simplicity, we’ll continue with the roulette wheel selection - but should you feel frisky, I’ll just say that rank selection is an example of an algorithm that doesn’t exhibit this dominating behavior, and it will work with our birdies, too!
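In case you’re curious what that could look like, here’s a rough, self-contained sketch of the idea (we won’t use it in our library): rank selection spins the same kind of wheel, but over the individuals’ ranks instead of their raw fitness scores, so even an overwhelmingly good solution gets only a modest advantage:

// Maps raw fitness scores into ranks: the worst individual gets 1.0, the
// best one gets `n as f32` - those ranks then serve as the wheel's slices
fn ranks(fitness_scores: &[f32]) -> Vec<f32> {
    // Indices of the individuals, sorted from the worst to the best
    let mut indices: Vec<usize> = (0..fitness_scores.len()).collect();

    indices.sort_by(|&a, &b| {
        fitness_scores[a]
            .partial_cmp(&fitness_scores[b])
            .unwrap()
    });

    let mut ranks = vec![0.0; fitness_scores.len()];

    for (rank, &index) in indices.iter().enumerate() {
        ranks[index] = (rank + 1) as f32;
    }

    ranks
}

fn main() {
    // Even though the last individual dominates fitness-wise, its slice of
    // the wheel is merely 3.0 (vs 2.0 and 1.0) - not 1000.0
    assert_eq!(ranks(&[1.0, 2.0, 1000.0]), vec![1.0, 2.0, 3.0]);
}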

Living up to the generic-ness promise, instead of hard-coding our library to always use roulette wheel selection, let’s create a trait - this way users will be able to use any algorithm they fancy:

pub trait SelectionMethod {
    fn select(&self);
}

A selection method has to have access to the entire population:

pub trait SelectionMethod {
    fn select<I>(&self, population: &[I]) -> &I
    where
        I: Individual;
}

For clarity, let’s annotate the output’s lifetime:

pub trait SelectionMethod {
    fn select<'a, I>(&self, population: &'a [I]) -> &'a I
    where
        I: Individual;
}

We’re going to need random numbers any minute now, so let’s add rand to libs/genetic-algorithm/Cargo.toml:

# ...

[dependencies]
rand = "0.8"

[dev-dependencies]
rand_chacha = "0.3"

Each crate in a workspace has its own set of dependencies - the rand we’ve previously added to libs/neural-network/Cargo.toml is not automatically visible in any other crate in the workspace.

Learning from our past troubles with thread_rng(), let’s already pass the pseudo-random number generator via a parameter:

use rand::RngCore;

pub trait SelectionMethod {
    fn select<'a, I>(
       &self,
       rng: &mut dyn RngCore,
       population: &'a [I],
    ) -> &'a I
    where
        I: Individual;
}

Ain’t that a beautiful signature?

You might be wondering why we don’t take a step further and make select() generic over the PRNG, too:

pub trait SelectionMethod {
    fn select<'a, R, I>(
       &self,
       rng: &mut R,
       population: &'a [I],
    ) -> &'a I
    where
        R: RngCore,
        I: Individual;
}

First, let’s catch up on the vernacular - in general:

  1. dyn Trait, &dyn Trait and &mut dyn Trait imply dynamic dispatch,

  2. T, &T and &mut T imply static dispatch.

Dispatching is the way the compiler answers the question "which method should get called here, exactly?" for generic types:

fn foo() {
   bar();

   // ^ compiling this call is easy, because it always transfers
   // control into `bar`
}

fn bar() {
   println!("yas queen");
}

fn method(obj: &dyn SomeTrait) {
    obj.method();

    // ^ compiling this call is harder, because there's no single
    // function this could refer to - each implementation of the
    // trait might provide its own `fn method()`
    //
    // in general, this is called *polymorphism*
}

For the sake of an example, let’s consider this trait with its two implementations:

trait Animal {
    fn kind(&self) -> &'static str;
}

// --

struct Chinchilla;

impl Animal for Chinchilla {
    fn kind(&self) -> &'static str {
        "chinchilla"
    }
}

// --

struct Viscacha;

impl Animal for Viscacha {
    fn kind(&self) -> &'static str {
        "viscacha"
    }
}

If you wanted to create a function that prints the kind of any animal, you could do it in one of two ways:

// Uses static dispatch
// (aka static polymorphism)
fn print_kind_static<A>(animal: &A)
    where A: Animal
{
    println!("{}", animal.kind());
}

// Uses dynamic dispatch
// (aka dynamic polymorphism, aka runtime polymorphism)
fn print_kind_dynamic(animal: &dyn Animal) {
    println!("{}", animal.kind());
}

fn main() {
    print_kind_static(&Chinchilla);
    print_kind_static(&Viscacha);

    print_kind_dynamic(&Chinchilla);
    print_kind_dynamic(&Viscacha);
}

From a distance, both functions look alike - what’s the difference, then?

print_kind_static() uses a technique called monomorphization - meaning that for each Animal this function is invoked with, the compiler transparently generates a dedicated, "copy-pasted" version of that function:

fn print_kind_static__chinchilla(animal: &Chinchilla) {
    println!("{}", Chinchilla::kind(animal));
}

fn print_kind_static__viscacha(animal: &Viscacha) {
    println!("{}", Viscacha::kind(animal));
}

fn main() {
    print_kind_static__chinchilla(&Chinchilla);
    print_kind_static__viscacha(&Viscacha);
}

At this point you can see why it’s called static dispatch - underneath, the compiler resolves the generic parameters into concrete, statically-known types.

Monomorphization has the drawback of being a bit slower to compile (instead of just one function, the compiler has to process many of them), but it usually results in faster, more optimized code at runtime; it can make a noticeable difference for applications that call such generic functions, say, a million times per second.

print_kind_dynamic(), on the other hand, uses a mechanism called a vtable ("virtual table"), where each implementation gets a dedicated "proxy table" that maps to its concrete functions:

// This is pseudo-Rust, just to show the concept

struct AnimalVtable {
    // Pointer to a concrete implementation of the `kind()` method
    kind: fn(*const ()) -> &'static str,
}

const CHINCHILLA_VTABLE: AnimalVtable = AnimalVtable {
    kind: Chinchilla::kind,
};

const VISCACHA_VTABLE: AnimalVtable = AnimalVtable {
    kind: Viscacha::kind,
};

fn print_kind_dynamic(
    animal_obj: *const(),
    animal_vtable: &AnimalVtable,
) {
    println!("{}", animal_vtable.kind(animal_obj));
}

fn main() {
    print_kind_dynamic(&Chinchilla, &CHINCHILLA_VTABLE);
    print_kind_dynamic(&Viscacha, &VISCACHA_VTABLE);
}

Since all implementations can be described via AnimalVtable, print_kind_dynamic() doesn’t have to be monomorphized - depending on the underlying type, the compiler will simply pass a different vtable.

In this case, the drawback is that each time you call print_kind_dynamic(), it has to go through this additional "proxy table", which makes it theoretically slower than print_kind_static(); more often than not the difference is not meaningful, though.

Circling back to the original question: so why not where R: RngCore?

Because we won’t be invoking this PRNG a million times per second, and so the additional maintenance burden is - to me - a game not worth the candle.

As for the implementation, we could do it by hand:

use rand::Rng;
use rand::seq::SliceRandom;

pub struct RouletteWheelSelection;

impl RouletteWheelSelection {
    pub fn new() -> Self {
        Self
    }
}

impl SelectionMethod for RouletteWheelSelection {
    fn select<'a, I>(
       &self,
       rng: &mut dyn RngCore,
       population: &'a [I],
    ) -> &'a I
    where
        I: Individual,
    {
        assert!(!population.is_empty());

        let total_fitness: f32 = population
            .iter()
            .map(|individual| individual.fitness())
            .sum();

        // This is a naïve approach for demonstration purposes; a more
        // efficient implementation could invoke `rng` just once
        loop {
            let indiv = population
                .choose(rng)
                .expect("got an empty population");

            let indiv_share = indiv.fitness() / total_fitness;

            if rng.gen_bool(indiv_share as f64) {
                return indiv;
            }
        }
    }
}

... but code par excellence would be <drums/> no code at all!

If you go through rand’s documentation, you might just spot a trait called SliceRandom; if you take a look inside it, you might just spot a method called choose_weighted() that happens to be doing exactly the thing we need:

impl SelectionMethod for RouletteWheelSelection {
    fn select<'a, I>(
       &self,
       rng: &mut dyn RngCore,
       population: &'a [I],
    ) -> &'a I
    where
        I: Individual,
    {
        population
            .choose_weighted(rng, |individual| individual.fitness())
            .expect("got an empty population")
    }
}

Apart from trusting rand’s developers, how can we be sure choose_weighted() does the thing we need? By testing it!

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test() {
        todo!();
    }
}

The path to TDD nirvana is a path strewn with roses, and we’re about to get bit by one of their thorns:

use rand::SeedableRng;
use rand_chacha::ChaCha8Rng;

#[test]
fn test() {
    let mut rng = ChaCha8Rng::from_seed(Default::default());

    let population = vec![ /* what here? */ ];

    let actual = RouletteWheelSelection::new()
        .select(&mut rng, &population);

    assert!(/* what here? */);
}

At this point we’ve got two problems:

  1. Since Individual is a trait, how can we fake it for testing purposes?

  2. Since .select() returns just a single individual, how can we be sure it’s random?
    (obligatory xkcd.)

Starting from the top: creating fake objects for testing purposes is called mocking - and while there are mocking solutions for Rust, I gotta admit I’ve never been a fan of the concept of a mock, as best presented in a song:

Friend, either you're closing your eyes
To a situation you do not wish to acknowledge
Or you are not aware of the caliber of disaster indicated
By the presence of mocks in your repository

[...]

Oh yes we got trouble, trouble, trouble!
With a "T"! Gotta rhyme it with "M"!
And that stands for Mock

(but for real, my contempt for mocks deserves its own post.)

My suggestion - one that doesn’t require any external crates - is to create a dedicated testing struct:

#[cfg(test)]
#[derive(Clone, Debug)]
pub struct TestIndividual {
    fitness: f32,
}

#[cfg(test)]
impl TestIndividual {
    pub fn new(fitness: f32) -> Self {
        Self { fitness }
    }
}

#[cfg(test)]
impl Individual for TestIndividual {
    fn fitness(&self) -> f32 {
        self.fitness
    }
}

... which can be then used as:

#[test]
fn test() {
    let population = vec![
        TestIndividual::new(2.0),
        TestIndividual::new(1.0),
        TestIndividual::new(4.0),
        TestIndividual::new(3.0),
    ];

    /* ... */
}

What about the assertion, though? A test such as this one:

#[test]
fn test() {
    /* ... */

    let actual = RouletteWheelSelection::new()
        .select(&mut rng, &population);

    assert_eq!(actual, &population[2]);
}

... doesn’t inspire confidence, as it doesn’t prove that fitness scores are being considered; a hypothetical, invalid implementation:

impl SelectionMethod for RouletteWheelSelection {
    fn select<'a, I>(/* ... */) -> &'a I
    where
        I: Individual,
    {
        &population[2]
    }
}

... would pass such test with flying colors!

Fortunately, we are not doomed - since we want to assess probability, instead of invoking .select() once, we can do it many times and look at the histogram:

coding selection 4
Figure 4. A histogram - X axis represents items, Y axis represents frequency; also, to make hay while the sun shines, I hereby dub this type of histogram the Johnny Bravo Chart

use std::collections::BTreeMap;
use std::iter::FromIterator;

#[test]
fn test() {
    let method = RouletteWheelSelection::new();
    let mut rng = ChaCha8Rng::from_seed(Default::default());

    let population = vec![
        TestIndividual::new(2.0),
        TestIndividual::new(1.0),
        TestIndividual::new(4.0),
        TestIndividual::new(3.0),
    ];

    let mut actual_histogram = BTreeMap::new();

    //               there is nothing special about this thousand;
    //          v--v a number as low as fifty might do the trick, too
    for _ in 0..1000 {
        let fitness = method
            .select(&mut rng, &population)
            .fitness() as i32;

        *actual_histogram
            .entry(fitness)
            .or_insert(0) += 1;
    }

    let expected_histogram = BTreeMap::from_iter(vec![
        // (fitness, how many times this fitness has been chosen)
        (1, 0),
        (2, 0),
        (3, 0),
        (4, 0),
    ]);

    assert_eq!(actual_histogram, expected_histogram);
}

Notice that while building the histogram, we’re casting fitness scores from f32 to i32:

let fitness = method
    .select(&mut rng, &population)
    .fitness() as i32;

We have to do that, because floating-point numbers in Rust don’t implement the Ord trait, making it impossible to use them as a BTreeMap’s key:

use std::collections::BTreeMap;

fn main() {
    let mut map = BTreeMap::new();
    map.insert(1.0, "one point zero");
}

error[E0277]: the trait bound `{float}: Ord` is not satisfied
  |
  |     map.insert(1.0, "one point zero");
  |         ^^^^^^ the trait `Ord` is not implemented for `{float}`

The reason is that floating-point numbers, as defined by the IEEE 754 standard, are not a totally ordered set - namely, comparing NaN-s is problematic, because:

NaN != NaN

Practically, it means that had you ever inserted a NaN into a map, not only would you be unable to retrieve that particular entry using .get(), but you could break BTreeMap’s internal structure, making it impossible to retrieve any item.

(by the way, that’s also true for custom implementations of Ord and PartialOrd - if they don’t satisfy asymmetry and transitivity, you’re gonna have a bad time.)
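If you’d like to see the problem with your own eyes, here’s a tiny, standalone program (a side note, not part of our library) demonstrating why floats only get PartialOrd:

fn main() {
    let nan = f32::NAN;

    // NaN is not equal to itself...
    assert!(nan != nan);

    // ... and it cannot be ordered against itself either, which is exactly
    // why f32 implements PartialOrd, but not Ord
    assert_eq!(nan.partial_cmp(&nan), None);
}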

If you feel like exploring this topic more:

cargo test (or cargo test --workspace, if you’re in the virtual manifest’s directory) returns:

thread '...' panicked at 'assertion failed: `(left == right)`
  left: `{1: 98, 2: 202, 3: 278, 4: 422}`,
 right: `{1: 0, 2: 0, 3: 0, 4: 0}`'

... proving that choose_weighted() does work as advertised (higher fitness scores were chosen more frequently), so let’s fix the test:

#[test]
fn test() {
    /* ... */

    let expected_histogram = BTreeMap::from_iter(vec![
        // (fitness, how many times this fitness has been chosen)
        (1, 98),
        (2, 202),
        (3, 278),
        (4, 422),
    ]);

    /* ... */
}

Voilà - we’ve tested the untestable!

Before moving on, let’s take a moment to ponder the nature of existence - and maybe rustify our code a little bit, as there are some things we could improve.

First, constructing maps via ::from_iter() is kinda messy - not only do you have to create a vector along the way, but you’re also limited to (key, value) tuples that look foreign to an untrained eye.

As always when it comes to Rust’s syntax being overly verbose: just macro it ™; in this case we’ll make use of a crate named maplit:

# ...

[dev-dependencies]
# ...
maplit = "1.0"

... that provides a handy macro called btreemap!:

#[test]
fn test() {
    /* ... */

    let expected_histogram = maplit::btreemap! {
        // fitness => how many times this fitness has been chosen
        1 => 98,
        2 => 202,
        3 => 278,
        4 => 422,
    };

    /* ... */
}

Moreover, we can use Iterator::fold() to simplify the loop:

#[test]
fn test() {
    /* ... */

    let actual_histogram: BTreeMap<i32, _> = (0..1000)
        .map(|_| method.select(&mut rng, &population))
        .fold(Default::default(), |mut histogram, individual| {
            *histogram
                .entry(individual.fitness() as _)
                .or_default() += 1;

            histogram
        });

    /* ... */
}

as _ means "compiler, pretty please infer what type is required and cast this value into it".

While this operation feels pointless, it performs the same function it did a few code blocks above (i.e. converting f32 to i32) - the only difference is that now that we’ve explicitly stated our map’s key type using : BTreeMap<i32, _>, we don’t have to repeat it inside the cast.

(it wouldn’t be wrong to repeat the type and write as i32 - it’s just my preference to avoid duplicating type annotations.)

Another way we could’ve written this code is:

let actual_histogram = (0..1000)
    .map(|_| method.select(&mut rng, &population))
    .fold(BTreeMap::default(), |mut histogram, individual| {
        *histogram
            .entry(individual.fitness() as i32)
            .or_default() += 1;

        histogram
    });

All those three approaches (including the initial one with an explicit loop) are equally valid Rust code - use whichever one you find the most readable.

Having the selection algorithm ready, let’s recap where we stopped:

pub fn evolve<I>(&self, population: &[I]) -> Vec<I>
where
    I: Individual,
{
    /* ... */

    (0..population.len())
        .map(|_| {
            // TODO selection
            // TODO crossover
            // TODO mutation
        })
        .collect()
}

What we have to figure out now is how to pass SelectionMethod there - I see two approaches:

  1. Using parameter:

    pub fn evolve<I, S>(
        &self,
        population: &[I],
        selection_method: &S,
    ) -> Vec<I>
    where
        I: Individual,
        S: SelectionMethod,
    {
        /* ... */
    }
  2. Using constructor:

    pub struct GeneticAlgorithm<S> {
        selection_method: S,
    }
    
    impl<S> GeneticAlgorithm<S>
    where
        S: SelectionMethod,
    {
        pub fn new(selection_method: S) -> Self {
            Self { selection_method }
        }
    
        pub fn evolve<I>(&self, population: &[I]) -> Vec<I>
        where
            I: Individual,
        {
            /* ... */
        }
    }

When I’m facing this kind of decision, I think about how often users will need to affect that particular value: a population is usually different each time someone calls .evolve(), so it’s convenient to accept it via a parameter; on the other hand, the selection algorithm generally remains identical for the whole simulation, so it’s wiser to go with the constructor.
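To make that difference concrete, here’s a hypothetical sketch of the caller’s side as things stand right now (MyIndividual is a made-up stand-in for a user’s type - also, as you’ll see in a second, .evolve()’s signature will still grow a parameter):

// A made-up individual - in our simulation, this role will eventually be
// played by a struct wrapping a neural network
struct MyIndividual {
    fitness: f32,
}

impl Individual for MyIndividual {
    fn fitness(&self) -> f32 {
        self.fitness
    }
}

fn main() {
    // The selection method is chosen once, when the algorithm is created...
    let ga = GeneticAlgorithm::new(RouletteWheelSelection::new());

    let mut population = vec![
        MyIndividual { fitness: 1.0 },
        MyIndividual { fitness: 2.0 },
        MyIndividual { fitness: 3.0 },
    ];

    // ... while the population is passed anew on every call
    for _ in 0..10 {
        population = ga.evolve(&population);
    }
}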

Now we’re almost ready to invoke the selection method:

pub fn evolve<I>(&self, population: &[I]) -> Vec<I>
where
    I: Individual,
{
    /* ... */

    (0..population.len())
        .map(|_| {
            let parent_a = self
                .selection_method
                .select(population);

            let parent_b = self
                .selection_method
                .select(population);

            // TODO crossover
            // TODO mutation
        })
        .collect()
}

... the only thing we’re missing is the PRNG:

pub fn evolve<I>(
    &self,
    rng: &mut dyn RngCore,
    population: &[I],
) -> Vec<I>
where
    I: Individual,
{
    /* ... */

    (0..population.len())
        .map(|_| {
            let parent_a = self
                .selection_method
                .select(rng, population);

            let parent_b = self
                .selection_method
                .select(rng, population);

            // TODO crossover
            // TODO mutation
        })
        .collect()
}

You might be wondering why rng is passed via .evolve() instead of going through the constructor - surely the random number generator doesn’t change that frequently!

Well, this decision was more subtle - to understand why, let’s explore other ways we could have written that code:

  1. By accepting an owned PRNG via the constructor:

    pub struct GeneticAlgorithm<R> {
        rng: R,
    }
    
    impl<R> GeneticAlgorithm<R>
    where
        R: RngCore,
    {
        pub fn new(rng: R) -> Self {
            Self { rng }
        }
    }
  2. By accepting a borrowed PRNG via the constructor:

    pub struct GeneticAlgorithm<'r> {
        rng: &'r mut dyn RngCore,
    }
    
    impl<'r> GeneticAlgorithm<'r> {
        pub fn new(rng: &'r mut dyn RngCore) -> Self {
            Self { rng }
        }
    }

The first approach is something I would suggest had we been implementing our simulation in C# or Java - it’s a different story in Rust, because if we move rng inside the constructor, we won’t be able to use it in other places of the application:

fn main() {
    let rng = /* ... */;
    let ga = GeneticAlgorithm::new(rng);

    // oh no, we can't use this `rng` anymore!
    if rng.gen_bool(0.5) {
        /* ... */
    } else {
        /* ... */
    }
}

You could argue the same already happens for SelectionMethod:

fn main() {
    let sp = RouletteWheelSelection::new();
    let ga = GeneticAlgorithm::new(sp);

    // oh no, we can't use this `sp` anymore!
    if sp.something() {
        /* ... */
    }
}

... but in my opinion there’s a difference in that Rng is a more versatile trait - it makes sense to use it outside the GeneticAlgorithm, which cannot be said about SelectionMethod.

All said, you’d be right to call this a far-fetched explanation - picking a "more versatile" trait is nothing but a gut feeling; if you ask me, the owned-value approach is correct, just slightly inferior in this particular case.

As for the &mut dyn RngCore variant, I consider it the worst - it requires a unique borrow (&mut) of rng, so not only does it "lock" the PRNG for the lifetime of the genetic algorithm:

fn main() {
    let mut rng = /* ... */;
    let ga = GeneticAlgorithm::new(&mut rng);

    // oh no, we still can't use this `rng`!
    let population = if rng.gen_bool(0.5) {
        /* ... */
    } else {
        /* ... */
    };

    ga.evolve(population);
}

... but it also prevents otherwise valid use cases such as this one:

struct Simulation {
    rng: ChaCha8Rng,
    ga: GeneticAlgorithm<'whats_this_lifetime??>,
}

By the way, this is called a self-referential struct - without referring to eldritch unsafe magicks, it cannot be declared in Rust.

More on this topic, if you feel curious:

Coding: Crossover

Now that we have chosen two parent-individuals, it’s time for the crossover phase.

Crossover (also known as recombination) takes two individuals and "mixes" them, yielding a new~ish solution:

coding crossover 1

Compared to randomizing brand new individuals, crossover is neat in the sense that it tries to preserve knowledge - the idea is that combining two solutions will usually yield a solution that’s both new and at least as good as the two solutions we already have; this allows us to explore the search space without the risk of losing the best solutions discovered so far.

As in the real world, crossover doesn’t actually happen on an individual, but rather on their chromosomes - which is a fancy word for "an encoding of a solution":

coding crossover 2

A chromosome (also called genotype, though I’m convinced a biologist dies each time someone juxtaposes both terms) is usually built from genes, and its structure depends on the underlying problem - sometimes it’s convenient to model a chromosome as a bitset:

coding crossover 3

... sometimes it’s more convenient to have a string:

coding crossover 4

... and we’ll re-use what we already have - a bunch of floating-point numbers (i.e. weights of the neural network):

coding crossover 5
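(if it helps, here are a couple of made-up encodings written down as Rust types - neither is something we’ll actually use, they just show how a chromosome’s shape follows the problem it encodes:)

// e.g. "which items do we pack into the knapsack?" - one bit per item:
struct KnapsackChromosome {
    genes: Vec<bool>,
}

// e.g. "guess the secret text" - one character per position:
struct TextChromosome {
    genes: String,
}

// ... while our chromosome, defined right below, will simply keep the
// network's weights as a Vec<f32>.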

As for the code:

#[derive(Clone, Debug)]
pub struct Chromosome {
    genes: Vec<f32>,
}

Instead of exposing genes directly (via pub genes: …), we’ll provide a handful of functions that allow peeking inside the chromosome:

impl Chromosome {
    pub fn len(&self) -> usize {
        self.genes.len()
    }

    pub fn iter(&self) -> impl Iterator<Item = &f32> {
        self.genes.iter()
    }

    pub fn iter_mut(&mut self) -> impl Iterator<Item = &mut f32> {
        self.genes.iter_mut()
    }
}

If you’re assiduous, you might even write a handful of tests for them; some people would consider testing such tiny functions overzealous, but I say we go for it:

#[cfg(test)]
mod tests {
    use super::*;

    fn chromosome() -> Chromosome {
        Chromosome {
            genes: vec![3.0, 1.0, 2.0],
        }
    }

    mod len {
        use super::*;

        #[test]
        fn test() {
            assert_eq!(chromosome().len(), 3);
        }
    }

    mod iter {
        use super::*;

        #[test]
        fn test() {
            let chromosome = chromosome();
            let genes: Vec<_> = chromosome.iter().collect();

            assert_eq!(genes.len(), 3);
            assert_eq!(genes[0], &3.0);
            assert_eq!(genes[1], &1.0);
            assert_eq!(genes[2], &2.0);
        }
    }

    mod iter_mut {
        use super::*;

        #[test]
        fn test() {
            let mut chromosome = chromosome();

            chromosome.iter_mut().for_each(|gene| {
                *gene *= 10.0;
            });

            let genes: Vec<_> = chromosome.iter().collect();

            assert_eq!(genes.len(), 3);
            assert_eq!(genes[0], &30.0);
            assert_eq!(genes[1], &10.0);
            assert_eq!(genes[2], &20.0);
        }
    }
}

Seizing the day, let’s catch up on some cool traits from the standard library:

  1. There’s Index, which allows you to use the indexing operator - [] - on your type:

    /* ... */
    
    // ---
    // | this is the type of expression you expect inside the square
    // | brackets
    // |
    // | e.g. if you implemented `Index<&str>`, you could write:
    // |   chromosome["yass"]
    // ------- v---v
    impl Index<usize> for Chromosome {
        type Output = f32;
    
        fn index(&self, index: usize) -> &Self::Output {
            &self.genes[index]
        }
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        mod index {
            use super::*;
    
            #[test]
            fn test() {
                let chromosome = Chromosome {
                    genes: vec![3.0, 1.0, 2.0],
                };
    
                assert_eq!(chromosome[0], 3.0);
                assert_eq!(chromosome[1], 1.0);
                assert_eq!(chromosome[2], 2.0);
            }
        }
    }
  2. There’s FromIterator, which allows you to .collect() into your type:

    /* ... */
    
    // ---
    // | this is the "type of iterator" for which you want to provide
    // | `from_iter()` and `collect()`
    // |
    // | sometimes it's called the type the iterator *yields*
    // -------------- v-v
    impl FromIterator<f32> for Chromosome {
        fn from_iter<T: IntoIterator<Item = f32>>(iter: T) -> Self {
            Self {
                genes: iter.into_iter().collect(),
            }
        }
    }
    
    #[cfg(test)]
    mod tests {
        /* ... */
    
        mod from_iterator {
            use super::*;
    
            #[test]
            fn test() {
                let chromosome: Chromosome =
                    vec![3.0, 1.0, 2.0]
                        .into_iter()
                        .collect();
    
                assert_eq!(chromosome[0], 3.0);
                assert_eq!(chromosome[1], 1.0);
                assert_eq!(chromosome[2], 2.0);
            }
        }
    }
  3. Finally, there’s also a reverse trait - IntoIterator:

    #![feature(type_alias_impl_trait)]
    
    /* ... */
    
    impl IntoIterator for Chromosome {
        type Item = f32;
        type IntoIter = impl Iterator<Item = f32>;
    
        fn into_iter(self) -> Self::IntoIter {
            self.genes.into_iter()
        }
    }
    
    #[cfg(test)]
    mod tests {
        /* ... */
    
        mod into_iterator {
            use super::*;
    
            #[test]
            fn test() {
                let chromosome = Chromosome {
                    genes: vec![3.0, 1.0, 2.0],
                };
    
                let genes: Vec<_> = chromosome.into_iter().collect();
    
                assert_eq!(genes.len(), 3);
                assert_eq!(genes[0], 3.0);
                assert_eq!(genes[1], 1.0);
                assert_eq!(genes[2], 2.0);
            }
        }
    }

Our last implementation uses a nightly feature called type_alias_impl_trait - it allows using impl Trait in places such as associated types:

impl IntoIterator for Chromosome {
    /* ... */

    type IntoIter = impl Iterator<Item = f32>;

    /* ... */
}

(by the way, impl Trait in this position is called an existential type.)

If not for this feature, we’d have to figure out the type on our own:

impl IntoIterator for Chromosome {
    /* ... */

    type IntoIter = std::vec::IntoIter<f32>;

    /* ... */
}

... which isn’t always as smooth as that (e.g. combinators such as .map() can be tricky).
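For instance, had our iterator involved a closure, its concrete type couldn’t be written down at all - here’s a small, standalone illustration (nothing to do with Chromosome per se); in a function’s return position impl Trait already works on stable Rust, while in an associated type such as IntoIter, that’s exactly what type_alias_impl_trait unlocks:

// The closure inside `.map()` has a unique, compiler-generated type that we
// cannot name - so without `impl Iterator<...>` (or boxing), there would be
// no way to spell out this function's return type:
fn doubled(numbers: Vec<f32>) -> impl Iterator<Item = f32> {
    numbers.into_iter().map(|number| number * 2.0)
}

fn main() {
    let numbers: Vec<f32> = doubled(vec![1.0, 2.0, 3.0]).collect();

    assert_eq!(numbers, vec![2.0, 4.0, 6.0]);
}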

So yeah, I guess you could call me a #![feature(type_alias_impl_trait)] fan #1.

As I said a few minutes before:

[...] crossover doesn’t actually happen on an individual, but rather on their chromosomes

... which brings us to:

pub trait Individual {
    fn chromosome(&self) -> &Chromosome;

    /* ... */
}

/* ... */

#[cfg(test)]
impl Individual for TestIndividual {
    fn chromosome(&self) -> &Chromosome {
        panic!("not supported for TestIndividual")
    }

    /* ... */
}

/* ... */

(0..population.len())
    .map(|_| {
        let parent_a = self
            .selection_method
            .select(rng, population)
            .chromosome();

        let parent_b = self
            .selection_method
            .select(rng, population)
            .chromosome();

        // TODO crossover
        // TODO mutation
        // TODO convert `Chromosome` back into `Individual`
    })
    .collect()

As for the crossover itself, there are many algorithms we could implement - usually it’s best to try a few of them and see which plays best with a given problem; for simplicity, we’ll go with uniform crossover, which can be described with a single drawing:

coding crossover 6

Same as before, let’s start with a trait:

pub trait CrossoverMethod {
    fn crossover(
        &self,
        rng: &mut dyn RngCore,
        parent_a: &Chromosome,
        parent_b: &Chromosome,
    ) -> Chromosome;
}

... and now a rudimentary implementation:

#[derive(Clone, Debug)]
pub struct UniformCrossover;

impl UniformCrossover {
    pub fn new() -> Self {
        Self
    }
}

impl CrossoverMethod for UniformCrossover {
    fn crossover(
        &self,
        rng: &mut dyn RngCore,
        parent_a: &Chromosome,
        parent_b: &Chromosome,
    ) -> Chromosome {
        let mut child = Vec::new();
        let gene_count = parent_a.len();

        for gene_idx in 0..gene_count {
            let gene = if rng.gen_bool(0.5) {
                parent_a[gene_idx]
            } else {
                parent_b[gene_idx]
            };

            child.push(gene);
        }

        child.into_iter().collect()
    }
}

Your internal code reviewer might notice a few things off about this code - rightfully so!

First, let’s add an assertion:

fn crossover(/* ... */) -> Chromosome {
    assert_eq!(parent_a.len(), parent_b.len());

    /* ... */
}

Second, let’s use a combinator; we already know this one from the previous post - it’s .zip():

fn crossover(/* ... */) -> Chromosome {
    /* ... */

    let parent_a = parent_a.iter();
    let parent_b = parent_b.iter();

    parent_a
        .zip(parent_b)
        .map(|(&a, &b)| if rng.gen_bool(0.5) { a } else { b })
        .collect()
}

How neat!

Notice how by implementing .iter() and FromIterator a moment ago, we were able to reduce code in here to a bare minimum that conveys the essence of the uniform crossover. Apart from improved readability, what I love about combinators the most is that they don’t sacrifice performance - if anything, the variant above will be faster (preallocation!).
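If you’re wondering where that preallocation comes from: iterator adapters such as .zip() and .map() carry over a size_hint(), which .collect() uses to reserve the right capacity up front - a quick, standalone way to see it:

fn main() {
    let parent_a = [1.0f32; 100];
    let parent_b = [2.0f32; 100];

    let child = parent_a
        .iter()
        .zip(parent_b.iter())
        .map(|(a, b)| (a + b) / 2.0);

    // Both the lower and the upper bound are known before we collect
    assert_eq!(child.size_hint(), (100, Some(100)));
}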

Your internal sceptic might still be alerted that something’s missing…​ hmm…​ ah, tests!

#[cfg(test)]
mod tests {
    use super::*;
    use rand::SeedableRng;
    use rand_chacha::ChaCha8Rng;

    #[test]
    fn test() {
        let mut rng = ChaCha8Rng::from_seed(Default::default());
        let parent_a = todo!();
        let parent_b = todo!();

        let child = UniformCrossover::new()
            .crossover(&mut rng, &parent_a, &parent_b);

        assert!(/* ... */);
    }
}

What we want to verify, in plain words, is that child is 50% parent_a + 50% parent_b.

My suggestion is to generate two distinct chromosomes (they don’t have to be random, just built out of different genes):

let parent_a: Chromosome =
    (1..=100)
        .map(|n| n as f32)
        .collect();

let parent_b: Chromosome =
    (1..=100)
        .map(|n| -n as f32)
        .collect();

// First parent will be:
//   [1, 2, /* ... */, 100]
//
// Second parent will look similar, but with reversed signs:
//   [-1, -2, /* ... */, -100]
//
// Just like in the histogram, the concrete number of genes doesn't
// matter - 100 will nicely round up to 100%, that's all

... and compare how much child differs from each of them:

// Number of genes different between `child` and `parent_a`
let diff_a = child
    .iter()
    .zip(parent_a)
    .filter(|(c, p)| *c != p)
    .count();

// Number of genes different between `child` and `parent_b`
let diff_b = child
    .iter()
    .zip(parent_b)
    .filter(|(c, p)| *c != p)
    .count();

assert_eq!(diff_a, 0);
assert_eq!(diff_b, 0);

Given this code, cargo test returns:

thread '...' panicked at 'assertion failed: `(left == right)`
  left: `49`,
 right: `0`'

... so let’s adjust the test:

assert_eq!(diff_a, 49);

Another cargo test will fail on the second assertion:

thread '...' panicked at 'assertion failed: `(left == right)`
  left: `51`,
 right: `0`'

... so:

assert_eq!(diff_b, 51);

To recall, what we’ve got is:

assert_eq!(diff_a, 49);
assert_eq!(diff_b, 51);

... meaning that our child got 49% of its genome from parent_a, and 51% from parent_b; this ultimately proves that our uniform crossover picks genes from both parents with equal probability.

(we didn’t get an exact 50% - 50% match, but that’s just probability being probability.)

Now we can pass CrossoverMethod next to SelectionMethod:

pub struct GeneticAlgorithm<S> {
    selection_method: S,
    crossover_method: Box<dyn CrossoverMethod>,
}

impl<S> GeneticAlgorithm<S>
where
    S: SelectionMethod,
{
    pub fn new(
        selection_method: S,
        crossover_method: impl CrossoverMethod + 'static,
    ) -> Self {
        Self {
            selection_method,
            crossover_method: Box::new(crossover_method),
        }
    }

    /* ... */
}

Contrary to SelectionMethod::select(), CrossoverMethod::crossover() doesn’t contain any generic parameters - that’s why we can Box it; another solution would be:

pub struct GeneticAlgorithm<S, C> {
    selection_method: S,
    crossover_method: C,
}

impl<S, C> GeneticAlgorithm<S, C>
where
    S: SelectionMethod,
    C: CrossoverMethod,
{
    pub fn new(
        selection_method: S,
        crossover_method: C,
    ) -> Self {
        Self {
            selection_method,
            crossover_method,
        }
    }

    /* ... */
}

The trade-off here is exactly the same as when we were talking about T vs dyn Trait, with the code directly above us using static dispatch (monomorphization) and Box<dyn Trait> being dynamic dispatch.

Because Rust makes it so easy to create generic parameter upon generic parameter, some people argue that using Box - which has a theoretical performance penalty - makes for less idiomatic, slower Rust code.

My personal stance is that Box is a convenient mechanism whose potential runtime cost is repaid by code that’s easier to maintain; I’d say: use Box unless benchmarks prove it’s an issue / unless it makes the code awkward to read.

... and then:

(0..population.len())
    .map(|_| {
        /* ... */

        let mut child = self
            .crossover_method
            .crossover(rng, parent_a, parent_b);

        // TODO mutation
        // TODO convert `Chromosome` back into `Individual`
    })
    .collect()

Coding: Mutation

Now that we have a semi-new chromosome at hand, it’s time to introduce some diversity!

Mutation, next to crossover and selection, is the third genetic operator - it takes a chromosome and introduces one or many random changes to it:

coding mutation 1

As in The Actual Evolution®, mutation’s role is to allow exploring solutions that were not present in the initial population; it also helps to avoid getting stuck in a local optimum, as it keeps the population ceaselessly changing from one generation to another.

Since we already have Chromosome implemented, mutation’s going to be smooth as butter:

pub trait MutationMethod {
    fn mutate(&self, rng: &mut dyn RngCore, child: &mut Chromosome);
}

We’ll use Gaussian mutation, which is a fancy way of saying "we’ll add or subtract random numbers from the genome". Contrary to our parameter-less selection method, Gaussian mutation requires specifying two arguments:

#[derive(Clone, Debug)]
pub struct GaussianMutation {
    /// Probability of changing a gene:
    /// - 0.0 = no genes will be touched
    /// - 1.0 = all genes will be touched
    chance: f32,

    /// Magnitude of that change:
    /// - 0.0 = touched genes will not be modified
    /// - 3.0 = touched genes will be += or -= by at most 3.0
    coeff: f32,
}

impl GaussianMutation {
    pub fn new(chance: f32, coeff: f32) -> Self {
        assert!(chance >= 0.0 && chance <= 1.0);

        Self { chance, coeff }
    }
}

impl MutationMethod for GaussianMutation {
    fn mutate(&self, rng: &mut dyn RngCore, child: &mut Chromosome) {
        for gene in child.iter_mut() {
            let sign = if rng.gen_bool(0.5) { -1.0 } else { 1.0 };

            if rng.gen_bool(self.chance as _) {
                *gene += sign * self.coeff * rng.gen::<f32>();
            }
        }
    }
}

As for the tests, instead of doing fn test() like before, this time I’d like to show you a slightly different approach - let’s talk edge cases:

#[cfg(test)]
mod tests {
    use super::*;

    mod given_zero_chance {
        mod and_zero_coefficient {
            #[test]
            fn does_not_change_the_original_chromosome() {
                todo!();
            }
        }

        mod and_nonzero_coefficient {
            #[test]
            fn does_not_change_the_original_chromosome() {
                todo!();
            }
        }
    }

    mod given_fifty_fifty_chance {
        mod and_zero_coefficient {
            #[test]
            fn does_not_change_the_original_chromosome() {
                todo!();
            }
        }

        mod and_nonzero_coefficient {
            #[test]
            fn slightly_changes_the_original_chromosome() {
                todo!();
            }
        }
    }

    mod given_max_chance {
        mod and_zero_coefficient {
            #[test]
            fn does_not_change_the_original_chromosome() {
                todo!();
            }
        }

        mod and_nonzero_coefficient {
            #[test]
            fn entirely_changes_the_original_chromosome() {
                todo!();
            }
        }
    }
}

Naming tests this way took me a while to get used to, but eventually I’ve found this convention so helpful that I cannot not mention it; tests structured like that are worth more than a million comments, because tests - contrary to comments - don’t become obsolete over time.

When implementing, to avoid copy-pasting, first let’s create a helper function:

#[cfg(test)]
mod tests {
    use super::*;
    use rand::SeedableRng;
    use rand_chacha::ChaCha8Rng;

    fn actual(chance: f32, coeff: f32) -> Vec<f32> {
        let mut child = vec![1.0, 2.0, 3.0, 4.0, 5.0]
            .into_iter()
            .collect();

        let mut rng = ChaCha8Rng::from_seed(Default::default());

        GaussianMutation::new(chance, coeff)
            .mutate(&mut rng, &mut child);

        child.into_iter().collect()
    }

    /* ... */
}

... with this function, the rest is as easy as:

# ...

[dev-dependencies]
# ...
approx = "0.4"

/* ... */

mod given_zero_chance {
    fn actual(coeff: f32) -> Vec<f32> {
        super::actual(0.0, coeff)
    }

    mod and_zero_coefficient {
        use super::*;

        #[test]
        fn does_not_change_the_original_chromosome() {
            let actual = actual(0.0);
            let expected = vec![1.0, 2.0, 3.0, 4.0, 5.0];

            approx::assert_relative_eq!(
                actual.as_slice(),
                expected.as_slice(),
            );
        }
    }

    mod and_nonzero_coefficient {
        use super::*;

        #[test]
        fn does_not_change_the_original_chromosome() {
            let actual = actual(0.5);
            let expected = vec![1.0, 2.0, 3.0, 4.0, 5.0];

            approx::assert_relative_eq!(
                actual.as_slice(),
                expected.as_slice(),
            );
        }
    }
}

/* ... */

Just for completeness, here’s the rest of the tests:

/* ... */

mod given_fifty_fifty_chance {
    fn actual(coeff: f32) -> Vec<f32> {
        super::actual(0.5, coeff)
    }

    mod and_zero_coefficient {
        use super::*;

        #[test]
        fn does_not_change_the_original_chromosome() {
            let actual = actual(0.0);
            let expected = vec![1.0, 2.0, 3.0, 4.0, 5.0];

            approx::assert_relative_eq!(
                actual.as_slice(),
                expected.as_slice(),
            );
        }
    }

    mod and_nonzero_coefficient {
        use super::*;

        #[test]
        fn slightly_changes_the_original_chromosome() {
            let actual = actual(0.5);
            let expected = vec![1.0, 1.7756249, 3.0, 4.1596804, 5.0];

            approx::assert_relative_eq!(
                actual.as_slice(),
                expected.as_slice(),
            );
        }
    }
}

mod given_max_chance {
    fn actual(coeff: f32) -> Vec<f32> {
        super::actual(1.0, coeff)
    }

    mod and_zero_coefficient {
        use super::*;

        #[test]
        fn does_not_change_the_original_chromosome() {
            let actual = actual(0.0);
            let expected = vec![1.0, 2.0, 3.0, 4.0, 5.0];

            approx::assert_relative_eq!(
                actual.as_slice(),
                expected.as_slice(),
            );
        }
    }

    mod and_nonzero_coefficient {
        use super::*;

        #[test]
        fn entirely_changes_the_original_chromosome() {
            let actual = actual(0.5);

            let expected = vec![
                1.4545316,
                2.1162078,
                2.7756248,
                3.9505124,
                4.638691,
            ];

            approx::assert_relative_eq!(
                actual.as_slice(),
                expected.as_slice(),
            );
        }
    }
}

/* ... */

Now we can pass MutationMethod next to SelectionMethod; there are no generics inside MutationMethod::mutate(), so let’s not hesitate on a Box:

pub struct GeneticAlgorithm<S> {
    selection_method: S,
    crossover_method: Box<dyn CrossoverMethod>,
    mutation_method: Box<dyn MutationMethod>,
}

impl<S> GeneticAlgorithm<S>
where
    S: SelectionMethod,
{
    pub fn new(
        selection_method: S,
        crossover_method: impl CrossoverMethod + 'static,
        mutation_method: impl MutationMethod + 'static,
    ) -> Self {
        Self {
            selection_method,
            crossover_method: Box::new(crossover_method),
            mutation_method: Box::new(mutation_method),
        }
    }

    /* ... */
}

... and:

(0..population.len())
    .map(|_| {
        /* ... */

        self.mutation_method.mutate(rng, &mut child);

        /* ... */
    })
    .collect()

Coding: Creating individuals

Behold, our entire function so far:

pub fn evolve<I>(
    &self,
    rng: &mut dyn RngCore,
    population: &[I],
) -> Vec<I>
where
    I: Individual,
{
    assert!(!population.is_empty());

    (0..population.len())
        .map(|_| {
            let parent_a = self
                .selection_method
                .select(rng, population)
                .chromosome();

            let parent_b = self
                .selection_method
                .select(rng, population)
                .chromosome();

            let mut child = self
                .crossover_method
                .crossover(rng, parent_a, parent_b);

            self.mutation_method.mutate(rng, &mut child);

            // TODO convert `Chromosome` back into `Individual`
        })
        .collect()
}

We’re missing the very last part:

// TODO convert `Chromosome` back into `Individual`

child is of type Chromosome, while the vector we return is Vec<I> - so the thing we need is a method converting genotype back into an individual:

pub trait Individual {
    fn create(chromosome: Chromosome) -> Self;

    /* ... */
}

Naming’s important - sometimes instead of abstract "create", it makes more sense to talk in terms of "where from" and "where to".

For instance, compare which one of these feels more natural to you:

pub trait Individual {
    fn create(chromosome: Chromosome) -> Self;
    fn chromosome(&self) -> &Chromosome;
    fn fitness(&self) -> f32;
}
pub trait Individual {
    fn from_chromosome(chromosome: Chromosome) -> Self;
    fn as_chromosome(&self) -> &Chromosome;
    fn fitness(&self) -> f32;
}

As usual when it comes to naming, there’s no single best answer; we’ll continue with ::create(), because we have to decide on something, but keep this example in mind next time you’re scratching your head, thinking "what should I call this function?".

Just for fun, it’s also possible to delegate 2/3 of those methods into supertraits:

pub trait Individual: From<Chromosome> + AsRef<Chromosome> {
    fn fitness(&self) -> f32;
}

... although I wouldn’t recommend it, because supertraits make impls look less compact:

#[cfg(test)]
impl Individual for TestIndividual {
    fn fitness(&self) -> f32 {
        self.fitness
    }
}

/// By looking at this impl you cannot really say that it's a part
/// of implementation for `Individual`
#[cfg(test)]
impl From<Chromosome> for TestIndividual {
    fn from(_: Chromosome) -> Self {
        panic!("not supported for TestIndividual")
    }
}

/// ditto
#[cfg(test)]
impl AsRef<Chromosome> for TestIndividual {
    fn as_ref(&self) -> &Chromosome {
        panic!("not supported for TestIndividual")
    }
}

Back to the loop:

(0..population.len())
    .map(|_| {
        /* ... */

        I::create(child)
    })

Voilà:

pub fn evolve<I>(
    &self,
    rng: &mut dyn RngCore,
    population: &[I],
) -> Vec<I>
where
    I: Individual,
{
    assert!(!population.is_empty());

    (0..population.len())
        .map(|_| {
            let parent_a = self
                .selection_method
                .select(rng, population)
                .chromosome();

            let parent_b = self
                .selection_method
                .select(rng, population)
                .chromosome();

            let mut child = self
                .crossover_method
                .crossover(rng, parent_a, parent_b);

            self.mutation_method.mutate(rng, &mut child);

            I::create(child)
        })
        .collect()
}

The Test

Having .evolve() ready, the time has come for perhaps the most exciting slice of this post: the telos, final, cherry-on-top test - one that will prove to everyone that our .evolve() works and that we know what we’re doing.

We’ll start by adjusting our TestIndividual so that instead of panic!()-ing, it actually implements ::create() and .chromosome().

Some tests - e.g. the ones for RouletteWheelSelection - don’t care about genes at all, so we’ll get bonus points for inventing a solution that doesn’t require modifying those already-working tests.

My proposition is:

#[cfg(test)]
#[derive(Clone, Debug, PartialEq)]
pub enum TestIndividual {
    WithChromosome { chromosome: Chromosome },
    WithFitness { fitness: f32 },
}

#[cfg(test)]
impl TestIndividual {
    pub fn new(fitness: f32) -> Self {
        Self::WithFitness { fitness }
    }
}

#[cfg(test)]
impl Individual for TestIndividual {
    fn create(chromosome: Chromosome) -> Self {
        Self::WithChromosome { chromosome }
    }

    fn chromosome(&self) -> &Chromosome {
        match self {
            Self::WithChromosome { chromosome } => chromosome,

            Self::WithFitness { .. } => {
                panic!("not supported for TestIndividual::WithFitness")
            }
        }
    }

    fn fitness(&self) -> f32 {
        match self {
            Self::WithChromosome { chromosome } => {
                chromosome.iter().sum()

                // ^ the simplest fitness function ever - we're just
                // summing all the genes together
            }

            Self::WithFitness { fitness } => *fitness,
        }
    }
}

(instead of enum, you could also create two separate types like TestIndividualWithChromosome and TestIndividualWithFitness - but that feels too enterprise-y for me, so I’ll take a rain check)
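Not from the original code, but here’s a quick, optional sanity check showing how the two variants behave - it assumes Chromosome implements FromIterator<f32>, which we took care of earlier in this series:

#[test]
fn test_individual_variants() {
    // `WithFitness` just returns the stored value...
    assert_eq!(TestIndividual::new(42.0).fitness(), 42.0);

    // ... while `WithChromosome` sums its genes:
    let chromosome: Chromosome = vec![1.0, 2.0, 3.0].into_iter().collect();
    assert_eq!(TestIndividual::create(chromosome).fitness(), 6.0);
}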

Since we’re already here, let’s derive PartialEq:

#[cfg(test)]
#[derive(Clone, Debug, PartialEq)]
pub enum TestIndividual {
    /* ... */
}

We’ll need PartialEq for Chromosome too; to avoid funky floating-point surprises, let’s write it by hand using approx:

#[cfg(test)]
impl PartialEq for Chromosome {
    fn eq(&self, other: &Self) -> bool {
        approx::relative_eq!(
            self.genes.as_slice(),
            other.genes.as_slice(),
        )
    }
}
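Just to illustrate what this buys us, here’s a throwaway test - not present in the original code - that compares chromosomes gene by gene; it assumes Chromosome derives Debug and implements FromIterator<f32>, as we set up earlier in this series:

#[test]
fn chromosomes_compare_gene_by_gene() {
    let a: Chromosome = vec![1.0, 2.0, 3.0].into_iter().collect();
    let b: Chromosome = vec![1.0, 2.0, 3.0].into_iter().collect();
    let c: Chromosome = vec![1.0, 2.0, 4.0].into_iter().collect();

    assert_eq!(a, b); // identical genes compare as equal
    assert_ne!(a, c); // a difference in any gene breaks equality
}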

As for the test, nothing scary - we’ll start with a few individuals, evolve them over a few generations, and compare the output population against a hard-coded expected one:

#[cfg(test)]
mod tests {
    use super::*;
    use rand::SeedableRng;
    use rand_chacha::ChaCha8Rng;

    #[test]
    fn test() {
        let mut rng = ChaCha8Rng::from_seed(Default::default());

        let ga = GeneticAlgorithm::new(
            RouletteWheelSelection::new(),
            UniformCrossover::new(),
            GaussianMutation::new(0.5, 0.5),
        );

        let mut population = vec![
            /* todo */
        ];

        // We're running `.evolve()` a few times, so that the
        // differences between initial and output population are
        // easier to spot.
        //
        // No particular reason for the number 10 - this test would
        // work just as well with 5, 20 or even 1000 generations; the
        // only thing that'd change is the *magnitude* of difference
        // between the initial and output populations.
        for _ in 0..10 {
            population = ga.evolve(&mut rng, &population);
        }

        let expected_population = vec![
            /* todo */
        ];

        assert_eq!(population, expected_population);
    }
}

We’re going to create a few individuals at once, so a helper function is a must:

fn individual(genes: &[f32]) -> TestIndividual {
    let chromosome = genes.iter().cloned().collect();

    TestIndividual::create(chromosome)
}

... and now:

#[test]
fn test() {
    /* ... */

    let mut population = vec![
        individual(&[0.0, 0.0, 0.0]), // fitness = 0.0
        individual(&[1.0, 1.0, 1.0]), // fitness = 3.0
        individual(&[1.0, 2.0, 1.0]), // fitness = 4.0
        individual(&[1.0, 2.0, 4.0]), // fitness = 7.0
    ];

    /* ... */

    let expected_population = vec![
        individual(&[0.0, 0.0, 0.0]),
        individual(&[0.0, 0.0, 0.0]),
        individual(&[0.0, 0.0, 0.0]),
        individual(&[0.0, 0.0, 0.0]),
    ];

    /* ... */
}

(as before, there’s no particular reason for creating exactly 4 individuals with chromosomes of length 3 - it just feels right and reasonable to maintain; two individuals would be too few, a hundred - probably too many.)

Do you hear the people sing, singing the songs of angry men? It’s cargo test, failing - as intended - due to our zeroed-out expected_population:

thread '...' panicked at 'assertion failed: `(left == right)`
  left: `[WithChromosome { ... }, WithChromosome { ... }, ... ]`,
 right: `[WithChromosome { ... }, WithChromosome { ... }, ... ]`,

Using this output, we can copy-paste actual genes from left into expected_population:

let expected_population = vec![
    individual(&[0.44769490, 2.0648358, 4.3058133]),
    individual(&[1.21268670, 1.5538777, 2.8869110]),
    individual(&[1.06176780, 2.2657390, 4.4287640]),
    individual(&[0.95909685, 2.4618788, 4.0247330]),
];

Yes… ha ha ha… yes? ta-da? hey presto - we got… numbers?

What’s certain is that we’ve got some output - but how do we know those four individuals are actually better than the four we started with?

Well, what do you say we look at their fitness scores:

// In this case, `fitness score` means `sum of the genes`, as per our
// implementation inside `TestIndividual::fitness()`

let expected_population = vec![
    individual(&[/* ... */]), // fitness ~= 6.8
    individual(&[/* ... */]), // fitness ~= 5.7
    individual(&[/* ... */]), // fitness ~= 7.8
    individual(&[/* ... */]), // fitness ~= 7.4
];
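(Quick arithmetic check for the first individual: 0.4477 + 2.0648 + 4.3058 ≈ 6.8 - so the numbers above do add up.)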

As the clandestine sums begin to caress my neurons, a juggernaut shout leaves my throat: it works! it works! our genetic algorithm is the apex predator of computer science!

Figure 5. stonks

For real tho - we’ve got higher fitness scores, which means that our individuals did get better and everything works as intended:

  1. Thanks to the roulette wheel selection, the worst solution - [0.0, 0.0, 0.0] - was discarded.

  2. Thanks to the uniform crossover, the average fitness score grew from 3.5 to ~6.9 (!)

  3. Thanks to the Gaussian mutation, we see genes - numbers - that were not present in the initial population.

You don’t have to believe me, though - try commenting out various parts of the algorithm:

// self.mutation_method.mutate(rng, &mut child);

... and see how they affect the output population :-)

Closing thoughts

So far, learning whoa-so-much along the way, we’ve implemented two separate components: a neural network and a genetic algorithm.

In the upcoming, last post of this series we’ll put both pieces together and implement a snazzy user interface that’ll allow us to see beyond lists of floating-point numbers. As promised, that’s where JavaScript and WebAssembly will come into play.

As always, thank you for reading and until the next time!

