
api: add Cow guarantee to replace API by BurntSushi · Pull Request #1178 · rust-...

 4 weeks ago
source link: https://github.com/rust-lang/regex/pull/1178


This adds a guarantee to the API of the replace, replace_all and
replacen routines: when Cow::Borrowed is returned, the borrowed string
is guaranteed to be equivalent to the haystack given.

The implementation has always matched this behavior, but this elevates
the implementation behavior to an API guarantee.

There do exist implementations where this guarantee might not be upheld
in every case. For example, if the final result were the empty string,
we could return a Cow::Borrowed. Similarly, if the final result were a
substring of haystack, then Cow::Borrowed could be returned in that
case too. These sorts of optimizations are tricky to implement, though,
and seem like niche corner cases that aren't important to optimize.
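
To make the substring case concrete, here is a minimal std-only sketch. Both functions are hypothetical and are not part of the regex crate; they only illustrate the difference between an implementation that borrows whenever it can and one that upholds the guarantee described here.

```rust
use std::borrow::Cow;

/// Hypothetical routine (not a regex crate API): deletes every leading
/// occurrence of `ch`. The result is always a suffix of `haystack`, so it
/// can be returned as `Cow::Borrowed` without allocating, even when it
/// differs from the input. This does NOT uphold the guarantee.
fn trim_leading_borrowing(haystack: &str, ch: char) -> Cow<'_, str> {
    Cow::Borrowed(haystack.trim_start_matches(ch))
}

/// Hypothetical variant that upholds the guarantee: `Cow::Borrowed` is
/// returned only when the result is the haystack itself.
fn trim_leading_guaranteed(haystack: &str, ch: char) -> Cow<'_, str> {
    let trimmed = haystack.trim_start_matches(ch);
    if trimmed.len() == haystack.len() {
        // Nothing was removed, so hand back the original haystack.
        Cow::Borrowed(haystack)
    } else {
        // Something changed, so pay for an allocation and signal it.
        Cow::Owned(trimmed.to_owned())
    }
}
```

With the first variant, a caller that sees Cow::Borrowed cannot conclude that the input is unchanged. With the second, it can.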

Nevertheless, having this guarantee is useful because it can be used as
a signal that the original input remains unchanged. This came up in
discussions with @quicknir on Discord. Namely, in cases where one is
doing a sequence of replacements and in most cases nothing is replaced,
using a Cow is nice to be able to avoid copying the haystack over and
over again. But to get this to work right, you have to know whether a
Cow::Borrowed matches the input or not. If it doesn't, then you'd need
to transform it into an owned string. For example, this code tries to do
replacements on each of a sequence of Cow<str> values, where the
common case is no replacement:

use std::borrow::Cow;

use regex::Regex;

fn trim_strs(strs: &mut Vec<Cow<str>>) {
    strs.iter_mut().for_each(|s| moo(s, &regex_replace));
}

fn moo<F: FnOnce(&str) -> Cow<str>>(c: &mut Cow<str>, f: F) {
    let result = f(&c);
    match result {
        Cow::Owned(s) => *c = Cow::Owned(s),
        Cow::Borrowed(s) => {
            *c = Cow::Borrowed(s);
        }
    }
}

fn regex_replace(s: &str) -> Cow<str> {
    Regex::new(r"does-not-matter").unwrap().replace_all(s, "whatever")
}

But this doesn't pass borrowck. Instead, you could write moo like
this:

fn moo<F: FnOnce(&str) -> Cow<str>>(c: &mut Cow<str>, f: F) {
    let result = f(&c);
    match result {
        Cow::Owned(s) => *c = Cow::Owned(s),
        Cow::Borrowed(s) => {
            if !std::ptr::eq(s, &**c) {
                *c = Cow::Owned(s.to_owned())
            }
        }
    }
}

But the std::ptr::eq call here is a bit strange. Instead, after this PR
and the new guarantee, one can write it like this:

fn moo<F: FnOnce(&str) -> Cow<str>>(c: &mut Cow<str>, f: F) {
    if let Cow::Owned(s) = f(&c) {
        *c = Cow::Owned(s);
    }
}
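
As a rough illustration of the "sequence of replacements" use case, the sketch below runs several passes over one Cow and allocates only when a pass actually changes something. It is std-only so that it stands alone: replace_literal is a hypothetical stand-in for Regex::replace_all that is assumed to uphold the same guarantee, and update_in_place and run_passes are illustrative names, not regex crate APIs.

```rust
use std::borrow::Cow;

/// Hypothetical stand-in for `Regex::replace_all`. Assumption: it upholds
/// the guarantee that `Cow::Borrowed` means "identical to the haystack".
fn replace_literal<'h>(haystack: &'h str, needle: &str, with: &str) -> Cow<'h, str> {
    if needle.is_empty() || !haystack.contains(needle) {
        Cow::Borrowed(haystack)
    } else {
        Cow::Owned(haystack.replace(needle, with))
    }
}

/// Same shape as the final `moo`: a borrowed result is ignored because,
/// under the guarantee, it carries no change.
fn update_in_place<F>(c: &mut Cow<'_, str>, f: F)
where
    F: FnOnce(&str) -> Cow<'_, str>,
{
    if let Cow::Owned(s) = f(&**c) {
        *c = Cow::Owned(s);
    }
}

/// Applies each (needle, replacement) pass in order.
fn run_passes(c: &mut Cow<'_, str>, passes: &[(&str, &str)]) {
    for &(needle, with) in passes {
        update_in_place(c, |s| replace_literal(s, needle, with));
    }
}
```

If no pass matches, the Cow is expected to keep pointing at the original string, so the common no-replacement case copies nothing.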