Things you can’t do in Rust (and what to do instead)

 3 years ago
source link: https://blog.logrocket.com/what-you-cant-do-in-rust-and-what-to-do-instead/

As a moderator of the Rust subreddit, I regularly happen upon posts about developers’ attempts to transpose their respective language paradigms to Rust, with mixed results and varying degrees of success.

In this guide, I’ll describe some of the issues developers encounter when transposing other language paradigms to Rust and propose some alternative solutions to help you work around Rust’s limitations.

Inheritance in Rust

Arguably the most-asked-about missing feature coming from object-oriented languages is inheritance. Why wouldn’t Rust let a struct inherit from another?

You could surely argue that even in the OO world, inheritance has a bad reputation and practitioners usually favor composition if they can. But you could also argue that allowing a type to perform a method differently might improve performance and is thus desirable for those specific instances.

Here’s a classical example taken from Java:

interface Animal {
    void tell();
    void pet();
    void feed(Food food);
}

class Cat implements Animal {
    public void tell() { System.out.println("Meow"); }
    public void pet() { System.out.println("purr"); }
    public void feed(Food food) { System.out.println("lick"); }
}

// this implementation is probably too optimistic...
class Lion extends Cat {
    public void tell() { System.out.println("Roar"); }
}

The first part can be implemented with traits:

trait Animal {
    fn tell(&self);
    fn pet(&mut self);
    fn feed(&mut self, food: Food);
}

struct Cat;

impl Animal for Cat {
    fn tell(&self) { println!("Meow"); }
    fn pet(&mut self) { println!("purr"); }
    fn feed(&mut self, food: Food) { println!("lick"); }
}

But the second part is not so easy:

struct Lion;

impl Animal for Lion {
    fn tell(&self) { println!("Roar"); }
    // Error: Missing methods pet and feed
}

The simplest way is, obviously, to duplicate the methods. Yes, duplication is bad. So is complexity. If you need to deduplicate the code, create a free function and call it from both the Cat and Lion impls.
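A minimal sketch of that deduplication might look as follows. The Food placeholder and the helper names feline_pet and feline_feed are assumptions made for this illustration, and the methods return strings instead of printing so the result is easy to check:

```rust
// Placeholder for the article's Food type (an assumption for this sketch).
struct Food;

trait Animal {
    fn tell(&self) -> String;
    fn pet(&mut self) -> String;
    fn feed(&mut self, food: Food) -> String;
}

// Free functions holding the behavior shared by all felines.
fn feline_pet() -> String {
    "purr".to_string()
}

fn feline_feed(_food: Food) -> String {
    "lick".to_string()
}

struct Cat;
struct Lion;

impl Animal for Cat {
    fn tell(&self) -> String { "Meow".to_string() }
    fn pet(&mut self) -> String { feline_pet() }
    fn feed(&mut self, food: Food) -> String { feline_feed(food) }
}

impl Animal for Lion {
    // Only tell differs; the rest delegates to the shared functions.
    fn tell(&self) -> String { "Roar".to_string() }
    fn pet(&mut self) -> String { feline_pet() }
    fn feed(&mut self, food: Food) -> String { feline_feed(food) }
}

fn main() {
    let mut cat = Cat;
    let mut lion = Lion;
    println!("{} {} {}", cat.tell(), cat.pet(), cat.feed(Food));
    println!("{} {} {}", lion.tell(), lion.pet(), lion.feed(Food));
}
```

The one-line forwarding methods are still repeated, but the actual behavior lives in a single place.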

But wait, you might say, what about the polymorphism part of the equation? That’s where it gets complicated. Where OO languages usually give you dynamic dispatch, Rust makes you choose between static and dynamic dispatch, and both have their costs and benefits.

// static dispatch
let cat = Cat;
cat.tell();

let lion = Lion;
lion.tell();

// dynamic dispatch via enum
enum AnyAnimal {
   Cat(Cat),
   Lion(Lion),
}

// `impl Animal for AnyAnimal` left as an exercise for the reader

let animals = [AnyAnimal::Cat(cat), AnyAnimal::Lion(lion)];
for animal in animals.iter() {
   animal.tell();
}

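// dynamic dispatch via "fat" pointer including vtable
// (fresh values are needed here: `cat` and `lion` were moved into the enum array above)
let (cat, lion) = (Cat, Lion);
let animals = [&cat as &dyn Animal, &lion as &dyn Animal];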
for animal in animals.iter() {
   animal.tell();
}

Note that, unlike in garbage collected languages, each variable has to have a single concrete type at compile time. Also, for the enum case, delegating the implementation of the trait is tedious, but crates such as ambassador can help.
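For reference, here is one way the delegation left as an exercise could be written by hand. This is a sketch under simplifying assumptions: the trait is cut down to tell, which returns a string instead of printing, and the helper functions chorus_enum and chorus_dyn are hypothetical names introduced only to compare the two dynamic approaches:

```rust
trait Animal {
    fn tell(&self) -> &'static str;
}

struct Cat;
struct Lion;

impl Animal for Cat {
    fn tell(&self) -> &'static str { "Meow" }
}

impl Animal for Lion {
    fn tell(&self) -> &'static str { "Roar" }
}

enum AnyAnimal {
    Cat(Cat),
    Lion(Lion),
}

// Manual delegation: every trait method needs one match arm per variant.
impl Animal for AnyAnimal {
    fn tell(&self) -> &'static str {
        match self {
            AnyAnimal::Cat(cat) => cat.tell(),
            AnyAnimal::Lion(lion) => lion.tell(),
        }
    }
}

// Dispatch through the enum: the match picks the implementation at run time.
fn chorus_enum(animals: &[AnyAnimal]) -> Vec<&'static str> {
    animals.iter().map(|animal| animal.tell()).collect()
}

// Dispatch through trait objects: the vtable picks the implementation.
fn chorus_dyn(animals: &[&dyn Animal]) -> Vec<&'static str> {
    animals.iter().map(|animal| animal.tell()).collect()
}

fn main() {
    let zoo = [AnyAnimal::Cat(Cat), AnyAnimal::Lion(Lion)];
    println!("{:?}", chorus_enum(&zoo));

    let (cat, lion) = (Cat, Lion);
    println!("{:?}", chorus_dyn(&[&cat as &dyn Animal, &lion as &dyn Animal]));
}
```

The enum keeps the set of animals closed and stores them inline, while the trait object version accepts any type that implements Animal at the cost of an indirection.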

A rather hacky way to delegate functions to a member is using the Deref trait for polymorphism so functions defined on the deref target can be called on the derefee directly. Note, however, that this is often considered an antipattern.
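To make the Deref trick concrete, here is a small sketch. The cat field name and the string-returning methods are assumptions for this illustration, and the caveat above still applies, so treat it as a demonstration rather than a recommendation:

```rust
use std::ops::Deref;

struct Cat;

impl Cat {
    fn tell(&self) -> &'static str { "Meow" }
    fn pet(&self) -> &'static str { "purr" }
}

// Lion wraps a Cat (the field name is an assumption for this sketch).
struct Lion {
    cat: Cat,
}

impl Lion {
    // Defined on Lion itself, so it takes precedence over Cat::tell.
    fn tell(&self) -> &'static str { "Roar" }
}

impl Deref for Lion {
    type Target = Cat;

    fn deref(&self) -> &Cat {
        &self.cat
    }
}

fn main() {
    let lion = Lion { cat: Cat };
    println!("{}", lion.tell());    // resolved on Lion
    println!("{}", lion.pet());     // found on Cat through auto-deref
    println!("{}", (*lion).tell()); // explicit deref reaches Cat::tell
}
```

Method lookup tries Lion first and only then follows the deref target, which is what makes this resemble overriding.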

Finally, it’s possible to implement a trait for all types that implement one of a number of other traits, but this requires specialization, which is a nightly-only feature for now (though there is a workaround available, even packaged in a macro crate if you don’t want to write out all the boilerplate required). Traits can very well inherit from each other, though they only prescribe behavior, not data.
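Trait inheritance can be sketched as follows. The trait and method names (Pet, name, greet) are made up for this example; the point is that the subtrait can rely on the supertrait's methods but neither can declare fields:

```rust
trait Animal {
    fn name(&self) -> String;
}

// Supertrait bound: every Pet must also implement Animal.
trait Pet: Animal {
    // Default method that relies on behavior promised by the supertrait.
    fn greet(&self) -> String {
        format!("{} wants attention", self.name())
    }
}

struct Cat;

impl Animal for Cat {
    fn name(&self) -> String {
        "Cat".to_string()
    }
}

// Empty impl: Cat takes the default greet.
impl Pet for Cat {}

fn main() {
    println!("{}", Cat.greet());
}
```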

Linked lists and other pointer-based data structures

Many folks coming from C++ to Rust will at first want to implement a “simple” doubly linked list but quickly learn that it’s actually far from simple. That’s because Rust wants to be clear about ownership, and thus doubly linked lists require quite complex handling of pointers vs. references.

A newcomer might try to write the following structure:

struct MyLinkedList<T> {
    value: T,
    previous_node: Option<Box<MyLinkedList<T>>>,
    next_node: Option<Box<MyLinkedList<T>>>,
}

Well, they’ll add the Option and Box when they note that this otherwise fails. But once they try to implement insertion, they’re in for an unpleasant surprise:

impl<T> MyLinkedList<T> {
    fn insert(&mut self, value: T) {
        let next_node = self.next_node.take();
        self.next_node = Some(Box::new(MyLinkedList {
            value,
            previous_node: Some(Box::new(*self)), // Ouch
            next_node,
        }));
    }
} 

Of course, the borrow checker won’t allow this. The ownership of values is completely muddled. Box owns the data it contains, and thus each node in the list would be owned by both the previous and the next node in the list. Rust only ever allows one owner per value, so this will at least require an Rc or Arc to work. But even this becomes cumbersome quickly, not to mention the overhead from reference counts.
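To give an impression of how cumbersome the reference-counted version gets, here is a rough sketch of two linked nodes. Beyond Rc, it assumes RefCell for mutation through shared handles and Weak for the back-pointer so the nodes do not keep each other alive forever; the names Link, new_node, link and previous_of are hypothetical:

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

type Link<T> = Rc<RefCell<Node<T>>>;

struct Node<T> {
    value: T,
    // Weak back-pointer: it does not keep the previous node alive,
    // which avoids a reference cycle.
    previous: Option<Weak<RefCell<Node<T>>>>,
    next: Option<Link<T>>,
}

fn new_node<T>(value: T) -> Link<T> {
    Rc::new(RefCell::new(Node { value, previous: None, next: None }))
}

// Links second directly after first.
fn link<T>(first: &Link<T>, second: &Link<T>) {
    first.borrow_mut().next = Some(Rc::clone(second));
    second.borrow_mut().previous = Some(Rc::downgrade(first));
}

// Returns the previous node if it is still alive.
fn previous_of<T>(node: &Link<T>) -> Option<Link<T>> {
    node.borrow().previous.as_ref().and_then(|weak| weak.upgrade())
}

fn main() {
    let a = new_node(1);
    let b = new_node(2);
    link(&a, &b);

    let back = previous_of(&b).expect("a is still alive");
    println!("previous of b holds {}", back.borrow().value);
    println!("a has a successor: {}", a.borrow().next.is_some());
}
```

Every access now goes through borrow or borrow_mut, and a mistake shows up as a panic at run time rather than as a compile error.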

Luckily, you don’t have to write a doubly linked list because the standard library already contains one (std::collections::LinkedList). Also, it is quite rare that this will give you good performance compared to simple Vecs, so you may want to measure accordingly.
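As a quick illustration, the same three-element sequence can be built with the standard LinkedList and with a Vec. The function names are placeholders, and the comments describe asymptotic cost only; actual performance is something to measure for your workload:

```rust
use std::collections::LinkedList;

fn build_list() -> LinkedList<u32> {
    let mut list = LinkedList::new();
    list.push_back(2);
    list.push_back(3);
    list.push_front(1); // O(1) insertion at the front
    list
}

fn build_vec() -> Vec<u32> {
    let mut items = vec![2, 3];
    items.insert(0, 1); // O(n) shift, but the elements stay contiguous in memory
    items
}

fn main() {
    let list = build_list();
    let items = build_vec();
    println!("{:?} {:?}", list, items);
}
```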

If you really want to write a doubly linked list, you can refer to “Learning Rust With Entirely Too Many Linked Lists,” which may help you both write linked lists and learn a lot about unsafe Rust in the process.

(Aside: Singly-linked lists are absolutely fine to build out of a chain of boxes. In fact, the Rust compiler contains an implementation.)
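A singly linked list built from boxes can be sketched as a small stack. This is an illustrative example with made-up names (Node, Stack), not the implementation used inside the compiler:

```rust
struct Node<T> {
    value: T,
    next: Option<Box<Node<T>>>,
}

struct Stack<T> {
    head: Option<Box<Node<T>>>,
}

impl<T> Stack<T> {
    fn new() -> Self {
        Stack { head: None }
    }

    fn push(&mut self, value: T) {
        // take() moves the old head out, leaving None behind for a moment.
        let next = self.head.take();
        self.head = Some(Box::new(Node { value, next }));
    }

    fn pop(&mut self) -> Option<T> {
        self.head.take().map(|node| {
            // The rest of the list becomes the new head.
            self.head = node.next;
            node.value
        })
    }
}

fn main() {
    let mut stack = Stack::new();
    stack.push(1);
    stack.push(2);
    println!("{:?}", stack.pop());
    println!("{:?}", stack.pop());
    println!("{:?}", stack.pop());
}
```

Each node has exactly one owner, the node before it, so the borrow checker has nothing to object to.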

The same mostly applies to graph structures, although you’ll likely need a dependency for handling graph data structures. petgraph is the most popular at the moment, providing both the data structure and a number of graph algorithms.

Self-referencing types

When faced with the concept of self-referencing types, it’s fair to ask, “Who owns this?” Again, this is a wrinkle in the ownership story that the borrow checker isn’t usually happy with.

You’ll encounter this problem when you have an ownership relation and want to store both the owning and owned object within one struct. Try this naïvely and you’ll have a bad time trying to get the lifetimes to work.

We can only guess that many Rustaceans have turned to unsafe code, which is subtle and really easy to get wrong. Of course, using a plain pointer instead of a reference will remove your lifetime worries, as pointers carry no lifetime. However, this means taking on the responsibility of managing the lifetime manually.

Luckily, there are some crates that wrap this solution in a safe interface, such as the ouroboros, self_cell and one_self_cell crates.
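When the borrowed part is a slice of the owned data, a different technique from the ones named above sometimes avoids the problem altogether: store offsets instead of a reference and recreate the borrow on demand. The sketch below is only an illustration of that idea, with a hypothetical Parsed type, and it does not use any of the crates mentioned:

```rust
use std::ops::Range;

// Owns the text and remembers where the first word is,
// instead of holding a &str that points into its own field.
struct Parsed {
    text: String,
    first_word: Range<usize>,
}

impl Parsed {
    fn new(text: String) -> Self {
        // Byte offset of the first space, or the whole string if there is none.
        let end = text.find(' ').unwrap_or(text.len());
        Parsed { text, first_word: 0..end }
    }

    // The borrow is created here, tied to &self, so no lifetime is stored.
    fn first_word(&self) -> &str {
        &self.text[self.first_word.clone()]
    }
}

fn main() {
    let parsed = Parsed::new("hello borrow checker".to_string());
    println!("{}", parsed.first_word());
}
```

This only works when the relationship can be expressed as an index or range; for anything richer, the crates above are the safer route.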

Global mutable state

People coming from C and/or C++ — or, less often, from dynamic languages — are sometimes accustomed to creating and modifying global state throughout their code. For example, one Redditor ranted that “It’s completely safe and yet Rust doesn’t let you do it.”

Here is a slightly simplified example:

#include <iostream>
int i = 1;

int main() {
    std::cout << i;
    i = 2;
    std::cout << i;
}

In Rust, that would translate roughly to:

static I: u32 = 1;

fn main() {
    print!("{}", I);
    I = 2; // <- Error: Cannot mutate global state
    print!("{}", I);
}

Many Rustaceans will tell you that you just don’t need that state to be global. Of course, in such a simple example, this is true. But for a good number of use cases, you really do need global mutable state — e.g., in some embedded applications.

There is, of course, a way to do it using unsafe. But before you reach for that, depending on your use case, you may just want to use a Mutex instead to be really sure. Or, if the mutation is only needed once for initialization, a OnceCell or lazy_static will solve the problem neatly. Or if you really only need integers, the std::sync::atomic::Atomic* types have you covered.
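Here is a sketch of the safe options using only the standard library. It assumes a reasonably recent toolchain: Mutex::new is usable in statics, and std::sync::OnceLock stands in for the OnceCell and lazy_static approaches mentioned above. The static and function names are invented for this example:

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::{Mutex, OnceLock};

// Integer state: an atomic needs neither unsafe nor a lock.
static COUNTER: AtomicU32 = AtomicU32::new(1);

// Arbitrary state: a Mutex guards every access.
static LOG: Mutex<Vec<String>> = Mutex::new(Vec::new());

// Write-once state: initialized on first use, read-only afterwards.
static CONFIG: OnceLock<String> = OnceLock::new();

// Increments the counter and returns the new value.
fn bump() -> u32 {
    COUNTER.fetch_add(1, Ordering::SeqCst) + 1
}

// Appends a message and returns the new length of the log.
fn record(message: &str) -> usize {
    let mut log = LOG.lock().unwrap();
    log.push(message.to_string());
    log.len()
}

// Returns the configuration, initializing it on the first call.
fn config() -> &'static str {
    CONFIG.get_or_init(|| "default".to_string())
}

fn main() {
    println!("{}", bump());
    println!("{}", record("started"));
    println!("{}", config());
}
```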

With that said, especially in the embedded world where every byte counts and resources are often mapped into memory, having a mutable static is often the preferred solution. So if you really must do it, it would look like this:

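static mut DATA_RACE_COUNTER: u32 = 1;

fn main() {
    // I solemnly swear that I'm up to no good, and also single threaded.
    unsafe {
        // Reading a `static mut` requires `unsafe` too, so copy the value out first.
        let before = DATA_RACE_COUNTER;
        print!("{}", before);
        DATA_RACE_COUNTER = 2;
        let after = DATA_RACE_COUNTER;
        print!("{}", after);
    }
}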

Again, you shouldn’t do this unless you really need to. And if you need to ask whether it’s a good idea, the answer is no.

‘Just’ initializing an array

A neophyte may be tempted to declare an array as follows:

let array: [usize; 512];

for i in 0..512 {
    array[i] = i;
}

This fails because the array was never initialized. We then try to assign values into it, but without telling the compiler, it won’t even reserve a place for us to write on the stack. Rust is picky like that; it distinguishes the array from its contents. Furthermore, it requires both to be initialized before we can read them.

By initializing the array with let mut array = [0usize; 512];, we solve this problem at the cost of a double initialization, which may or may not get optimized out (or, depending on the type, may even be impossible). See “Unsafe Rust: How and when (not) to use it” for a solution.
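For cases like the one above, where each element can be computed from its index, the standard library's std::array::from_fn offers a safe way to build the array in one step. This is an alternative to the unsafe approach referenced above, shown next to the zero-then-fill version for comparison; the function names are placeholders:

```rust
// Builds the array directly from a closure over the index.
fn identity_array() -> [usize; 512] {
    std::array::from_fn(|i| i)
}

// The double-initialization version: zero first, then overwrite.
fn zeroed_then_filled() -> [usize; 512] {
    let mut array = [0usize; 512];
    for (i, slot) in array.iter_mut().enumerate() {
        *slot = i;
    }
    array
}

fn main() {
    let array = identity_array();
    println!("{} {}", array[0], array[511]);
    println!("{}", array == zeroed_then_filled());
}
```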

And this concludes our short tour of things that you cannot (easily) do in Rust. While there are surely other things that Rust makes hard, listing them all would take up too much time for both me and you, dear reader.
