2

Github Implement indexing slices with pairs of core::ops::Bound<usize> by...

 3 years ago
source link: https://github.com/rust-lang/rust/pull/77704
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

Copy link

Contributor

AnthonyMikh commented on Oct 8, 2020

Closes #49976.

I am not sure about code duplication between check_range and into_maybe_range. Should be former implemented in terms of the latter? Also this PR doesn't address code duplication between impl SliceIndex for Range*.

Copy link

Collaborator

rust-highfive commented on Oct 8, 2020

r? @KodrAus

(rust_highfive has picked a reviewer for you, use r? to override)

Copy link

Contributor

KodrAus left a comment

Thanks for your patience on this one @AnthonyMikh!

It looks like I had a review I started but never actually finished slightly_smiling_face

This seems like a reasonable addition to me, I've just left a few comments.

Copy link

Contributor

crlf0710 commented on Dec 4, 2020

@AnthonyMikh Ping from triage, would you mind addressing the comments above? Thanks!

Copy link

Contributor

Author

AnthonyMikh commented on Dec 10, 2020

Sorry about the delay, I was busy with my work.

@KodrAus do you mind taking a look again?

Copy link

Contributor

Author

AnthonyMikh commented on Dec 10, 2020

Ah, I forgot that I can do it as well

@rustbot modify labels: +S-waiting-on-review -S-waiting-on-author

Copy link

Contributor

KodrAus left a comment

Thanks @AnthonyMikh!

This looks good to me. Would you like to squash these commits down? r=me after that.

AnthonyMikh

force-pushed the

AnthonyMikh:slice_index_with_ops_bound_pair

branch from 1303b97 to da7b5d4

on Dec 17, 2020

Copy link

Contributor

Author

AnthonyMikh commented on Dec 17, 2020

@KodrAus rebased and ready to merge

Copy link

Contributor

KodrAus commented on Jan 14

Thanks for your patience here @AnthonyMikh!

@bors r+

Copy link

Contributor

bors commented on Jan 14

pushpin Commit da7b5d4 has been approved by KodrAus

Copy link

Contributor

bors commented on Jan 14

hourglass Testing commit da7b5d4 with merge 135d6df...

Copy link

Contributor

bors commented on Jan 14

broken_heart Test failed - checks-actions

Copy link

Collaborator

rust-log-analyzer commented on Jan 14

A job failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)

Copy link

Contributor

KodrAus commented on Jan 14

@bors r-

Copy link

Contributor

clarfonthey commented on Jan 24

I personally would rather have this impl because it doesn't rely on extra checks being removed for efficiency. Right now, (Bound, Bound) is a very good way to represent a RangeBounds instance in a generic way, and my current use case is that a single method on a trait which accepts a pair of bounds is object-safe, whereas a method that accepts a generic RangeBounds is not.

I think that relying on a conversion to Range or some other form of assertion is still not necessarily the best way to index slices with a generic RangeBounds, and that simply adding a SliceIndex is relatively simple and robust. The intent is much clearer than converting to something else before indexing.

Copy link

Contributor

KodrAus commented on Jan 29

I personally would rather have this impl because it doesn't rely on extra checks being removed for efficiency

@clarfonthey the contract of check_len is that the bounds you're given are safe to pass to get_unchecked though, so you could avoid those potentially not-optimized-away bounds checks if you're worried about them.

Whether or not this exists in practice becomes the difference between:

let subslice = &bytes[bounds];

and (using the latest assert_len proposal):

let subslice = &bytes[slice::range(bounds, ..slice.len())];

because if you want a (Bound, Bound) it's typically because you're either stashing it somewhere or want it as a concrete alternative to impl RangeBounds. I'd never really expect to see &bytes[(Bound::Unbounded, Bound::Included(11))].

For checked variants, there's a little more work:

let subslice = &bytes.get(bounds);
let subslice = slice::checked_range(bounds, ..slice.len()).and_then(|range| &bytes::get(range));

Adding a SliceIndex impl is certainly more terse.

Copy link

Contributor

dylni commented on Jan 29

@KodrAus If bounds is RangeBounds, I don't believe those examples would work even with this impl.

let subslice = &bytes[bounds];

would need to be changed to

let subslice = &bytes[(bounds.start_bound(), bounds.end_bound())];

The compiler would still have no way to convert RangeBounds to SliceIndex.

Copy link

Contributor

KodrAus commented on Jan 29

@dylni Ah sorry, I wasn’t clear. In those cases I was treating bounds as a concrete (Bound, Bound).

Copy link

Contributor

dylni commented on Jan 29

@KodrAus That makes more sense. Sorry for the misunderstanding.

Copy link

Contributor

clarfonthey commented on Jan 29

@clarfonthey the contract of check_len is that the bounds you're given are safe to pass to get_unchecked though, so you could avoid those potentially not-optimized-away bounds checks if you're worried about them.

That requires unsafe code, and it's an extra step required to use correctly.

If the primary purpose of such a method is to immediately use it to index a slice, why require the use of unsafe code in these cases?

Copy link

Contributor

dylni commented on Jan 30

edited

If the primary purpose of such a method is to immediately use it to index a slice, why require the use of unsafe code in these cases?

That's because it's not the primary purpose. The primary purpose is to get start and end as integers. However, you might want to use those integers to index a slice later, so that needs to be supported.

It's also one reason why the feature is unstable. It would be best if the method could cover all of these use cases, but that might not be realistic.

Copy link

Contributor

clarfonthey commented on Jan 30

edited

That's because it's not the primary purpose. The primary purpose is to get start and end as integers. However, you might want to use those integers to index a slice later, so that needs to be supported.

It's also one reason why the feature is unstable. It would be best if the method could cover all of these use cases, but that might not be realistic.

That's fair, although I feel like a majority of the use cases for normalising discrete ranges of usize are to then easily use the range to index into something. If not a slice, some other container.

For that reason, I'd argue that the purpose of the method isn't to normalise the range, but to allow it to be easily used to index other containers outside libstd. And so, it would not make much sense to offer that functionality to crates outside libstd while not using it within libstd itself.

IMHO, if the goal is not ultimately to use the range for indexing something, the method should operate on all integer ranges and not just those of usize.

Copy link

Contributor

dylni commented on Jan 30

edited

And so, it would not make much sense to offer that functionality to crates outside libstd while not using it within libstd itself.

It's been used in libstd since it was implemented:
https://github.com/rust-lang/rust/pull/75207/files

The design changed, but it's still used in the same places.

IMHO, if the goal is not ultimately to use the range for indexing something, the method should operate on all integer ranges and not just those of usize.

Unfortunately, it needs to add 1 sometimes. There's no generic way to create 1.

Copy link

Contributor

KodrAus commented on Jan 30

edited

That requires unsafe code, and it's an extra step required to use correctly.

I guess what I was trying to get at was that motivating this equivalent API through speculative local optimizations doesn't seem strong to me. For what it's worth, it does look like that additional bounds check optimizes away just fine. My point was if you're concerned about unnecessary bounds checks not optimizing then that's what get_unchecked is for. It's not incorrect not to use get_unchecked.

You did make me realise though that the way I originally motivated this in the FCP isn't great. I don't think we'd ever expect anyone to write something like:

let subslice = slice.get((Bound::Excluded(2), Bound::Excluded(13));
let subslice = slice.get((bounds.start_bound().cloned(), bounds.end_bound().cloned()));

directly. Instead, you'd be using (Bound, Bound) as an alternative to impl RangeBounds either on a struct or in a method to avoid generics. So I think then the motivation is whether we should make something like:

// Bounds in a struct with no generic arg
struct Reader {
    bounds: (Bound<usize>, Bound<usize>),
}

// Bounds in an object-safe method
pub fn read(&mut self, bounds: (Bound<usize>, Bound<usize>) { .. }

as consistently capable as:

struct Reader {
    range: Range<usize>,
}

pub fn read(&mut self, range: Range<usize>) { .. }

with respect to slicing.

Copy link

Contributor

bors commented on Feb 23

umbrella The latest upstream changes (presumably #82430) made this pull request unmergeable. Please resolve the merge conflicts.

Copy link

Member

m-ou-se commented 25 days ago

@BurntSushi Is your concern resolved by the comments that have been posted since then?

Copy link

Member

BurntSushi commented 24 days ago

Hmmm, I actually still don't really understand the use case here even after reading the comments. I think there might be some shared context that folks have that I'm missing. Could someone succinctly describe what this impl enables?

Copy link

Contributor

Amanieu commented 24 days ago

I would feel more comfortable if this was wrapped in a SliceBounds type rather than just as a bare tuple.

Copy link

Contributor

dylni commented 21 days ago

edited

@BurntSushi The best use case I've seen is described in this comment. In most cases, assert_len slice::range will be better. You should only need this impl if you don't know the length that will be used to cap the range.

Copy link

Member

BurntSushi commented 16 days ago

@dylni I'll rephrase. Is there a way to show what this enables that is otherwise not possible (or less ergonomic) with a short code snippet?

Copy link

Contributor

ExpHP commented 15 days ago

edited

The sort of example that first comes to mind for me are &mut Builder-pattern applications, where a range is supplied to a builder long before it is used, and must squeeze into a monomorphic type:

use std::ops::{Bound, RangeBounds};

pub fn main() {
    let data: Vec<Vec<i32>> = unimplemented!();
    
    let view = TableViewBuilder::new()
        .row_range(..=3)
        .column_range(2..5)
        .make_view(&data);
}

Without this impl, it may be unclear what type TableViewBuilder should actually store (I think you might need to allocate a Box). With this impl, (Bound<usize>, Bound<usize>) can be stored:

#![feature(bound_cloned)]

pub struct TableViewBuilder {
    row_range: (Bound<usize>, Bound<usize>),
    column_range: (Bound<usize>, Bound<usize>),
}

impl TableViewBuilder {
    pub fn new() -> Self {
        TableViewBuilder {
            row_range: (Bound::Unbounded, Bound::Unbounded),
            column_range: (Bound::Unbounded, Bound::Unbounded),
        }
    }
    
    pub fn row_range(&mut self, r: impl RangeBounds<usize>) -> &mut Self {
        self.row_range = (r.start_bound().cloned(), r.end_bound().cloned());
        self
    }
    
    pub fn column_range(&mut self, r: impl RangeBounds<usize>) -> &mut Self {
        self.column_range = (r.start_bound().cloned(), r.end_bound().cloned());
        self
    }
    
    pub fn make_view(&self, table: &[Vec<i32>]) -> impl Iterator<Item=&[i32]> {
        table[self.row_range].iter()
            .map(|row| &row[self.column_range])
    }
}

Of course, here I will emphasize that this particular example only works because make_view is very specifically using these ranges to index slices. If it tried to do something else with the range, this impl would be insufficient, which is why I'm ultimately more interested in #76393.

Copy link

Contributor

dylni commented 14 days ago

@ExpHP SliceIndex allows creating that API using a polymorphic struct with some effort:

https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=efed75ae7c56f5ef617b70407b0ddf03

Your example is more ergonomic, but this looks reasonable to me.

@BurntSushi I see this impl being used mostly to simplify some uncommon code and make the SliceIndex impls more consistent with RangeBounds, but I'm ambivalent. I've never needed this impl myself.

Copy link

Contributor

clarfonthey commented 14 days ago

edited

@dylni I'll rephrase. Is there a way to show what this enables that is otherwise not possible (or less ergonomic) with a short code snippet?

IMHO, this is essentially a replacement for assert_len that uses the existing RangeBounds impls as a guide. The most common use case being monomorphising indexing across range types, as mentioned by @ExpHP and myself.

I agree with the concerns that perhaps a separate type than a plain tuple might be better, but RangeBounds already went the tuple route and it would be weird to deviate here.

Without this impl or slice::range, making code that accepts a generic RangeBounds and indexes into a slice is a lot more complicated than it's worth, since you have to essentially implement the conversion yourself. In those cases, your best bet is to just not allow anything other than Range and force users of the API to convert things into ranges themselves, which is less ergonomic.

Another alternative might be to just force use of SliceIndex directly but that currently cannot be implemented by custom types.

The crux of it is honestly that RangeBounds doesn't imply SliceIndex, and we have to deal with that somehow. I feel like forcing users to just pidgeonhole themselves into one of the types that's shared between the two is a suboptimal solution.

Copy link

Member

BurntSushi commented 11 days ago

@rfcbot resolve unclear-use-case

Copy link

rfcbot commented 11 days ago

bellThis is now entering its final comment period, as per the review above. bell

Copy link

Member

BurntSushi commented 11 days ago

@ExpHP Thank you! The key thing I was missing was "there's no way to store a monomorphic range that permits accepting any kind of range because our language syntax generates different range types depending on what syntax is used." Maybe seems obvious in retrospect, but it just wasn't clicking for me. Thank you for elaborating.

Copy link

rfcbot commented yesterday

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

The RFC will be merged soon.


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK