Github Implement indexing slices with pairs of core::ops::Bound<usize> by...
source link: https://github.com/rust-lang/rust/pull/77704
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
Closes #49976.
I am not sure about code duplication between check_range
and into_maybe_range
. Should be former implemented in terms of the latter? Also this PR doesn't address code duplication between impl SliceIndex for Range*
.
r? @KodrAus
(rust_highfive has picked a reviewer for you, use r? to override)
Thanks for your patience on this one @AnthonyMikh!
It looks like I had a review I started but never actually finished
This seems like a reasonable addition to me, I've just left a few comments.
@AnthonyMikh Ping from triage, would you mind addressing the comments above? Thanks!
Sorry about the delay, I was busy with my work.
@KodrAus do you mind taking a look again?
Ah, I forgot that I can do it as well
@rustbot modify labels: +S-waiting-on-review -S-waiting-on-author
Thanks @AnthonyMikh!
This looks good to me. Would you like to squash these commits down? r=me after that.
@KodrAus rebased and ready to merge
Thanks for your patience here @AnthonyMikh!
@bors r+
Test failed - checks-actions
I personally would rather have this impl because it doesn't rely on extra checks being removed for efficiency. Right now, (Bound, Bound)
is a very good way to represent a RangeBounds
instance in a generic way, and my current use case is that a single method on a trait which accepts a pair of bounds is object-safe, whereas a method that accepts a generic RangeBounds
is not.
I think that relying on a conversion to Range
or some other form of assertion is still not necessarily the best way to index slices with a generic RangeBounds
, and that simply adding a SliceIndex
is relatively simple and robust. The intent is much clearer than converting to something else before indexing.
I personally would rather have this impl because it doesn't rely on extra checks being removed for efficiency
@clarfonthey the contract of check_len
is that the bounds you're given are safe to pass to get_unchecked
though, so you could avoid those potentially not-optimized-away bounds checks if you're worried about them.
Whether or not this exists in practice becomes the difference between:
let subslice = &bytes[bounds];
and (using the latest assert_len
proposal):
let subslice = &bytes[slice::range(bounds, ..slice.len())];
because if you want a (Bound, Bound)
it's typically because you're either stashing it somewhere or want it as a concrete alternative to impl RangeBounds
. I'd never really expect to see &bytes[(Bound::Unbounded, Bound::Included(11))]
.
For checked variants, there's a little more work:
let subslice = &bytes.get(bounds);
let subslice = slice::checked_range(bounds, ..slice.len()).and_then(|range| &bytes::get(range));
Adding a SliceIndex
impl is certainly more terse.
@KodrAus If bounds
is RangeBounds
, I don't believe those examples would work even with this impl.
let subslice = &bytes[bounds];
would need to be changed to
let subslice = &bytes[(bounds.start_bound(), bounds.end_bound())];
The compiler would still have no way to convert RangeBounds
to SliceIndex
.
@clarfonthey the contract of
check_len
is that the bounds you're given are safe to pass toget_unchecked
though, so you could avoid those potentially not-optimized-away bounds checks if you're worried about them.
That requires unsafe code, and it's an extra step required to use correctly.
If the primary purpose of such a method is to immediately use it to index a slice, why require the use of unsafe code in these cases?
If the primary purpose of such a method is to immediately use it to index a slice, why require the use of unsafe code in these cases?
That's because it's not the primary purpose. The primary purpose is to get start
and end
as integers. However, you might want to use those integers to index a slice later, so that needs to be supported.
It's also one reason why the feature is unstable. It would be best if the method could cover all of these use cases, but that might not be realistic.
That's because it's not the primary purpose. The primary purpose is to get
start
andend
as integers. However, you might want to use those integers to index a slice later, so that needs to be supported.It's also one reason why the feature is unstable. It would be best if the method could cover all of these use cases, but that might not be realistic.
That's fair, although I feel like a majority of the use cases for normalising discrete ranges of usize
are to then easily use the range to index into something. If not a slice, some other container.
For that reason, I'd argue that the purpose of the method isn't to normalise the range, but to allow it to be easily used to index other containers outside libstd. And so, it would not make much sense to offer that functionality to crates outside libstd while not using it within libstd itself.
IMHO, if the goal is not ultimately to use the range for indexing something, the method should operate on all integer ranges and not just those of usize
.
And so, it would not make much sense to offer that functionality to crates outside libstd while not using it within libstd itself.
It's been used in libstd since it was implemented:
https://github.com/rust-lang/rust/pull/75207/files
The design changed, but it's still used in the same places.
IMHO, if the goal is not ultimately to use the range for indexing something, the method should operate on all integer ranges and not just those of
usize
.
Unfortunately, it needs to add 1
sometimes. There's no generic way to create 1
.
That requires unsafe code, and it's an extra step required to use correctly.
I guess what I was trying to get at was that motivating this equivalent API through speculative local optimizations doesn't seem strong to me. For what it's worth, it does look like that additional bounds check optimizes away just fine. My point was if you're concerned about unnecessary bounds checks not optimizing then that's what get_unchecked
is for. It's not incorrect not to use get_unchecked
.
You did make me realise though that the way I originally motivated this in the FCP isn't great. I don't think we'd ever expect anyone to write something like:
let subslice = slice.get((Bound::Excluded(2), Bound::Excluded(13));
let subslice = slice.get((bounds.start_bound().cloned(), bounds.end_bound().cloned()));
directly. Instead, you'd be using (Bound, Bound)
as an alternative to impl RangeBounds
either on a struct or in a method to avoid generics. So I think then the motivation is whether we should make something like:
// Bounds in a struct with no generic arg struct Reader { bounds: (Bound<usize>, Bound<usize>), } // Bounds in an object-safe method pub fn read(&mut self, bounds: (Bound<usize>, Bound<usize>) { .. }
as consistently capable as:
struct Reader { range: Range<usize>, } pub fn read(&mut self, range: Range<usize>) { .. }
with respect to slicing.
The latest upstream changes (presumably #82430) made this pull request unmergeable. Please resolve the merge conflicts.
@BurntSushi Is your concern resolved by the comments that have been posted since then?
Hmmm, I actually still don't really understand the use case here even after reading the comments. I think there might be some shared context that folks have that I'm missing. Could someone succinctly describe what this impl enables?
I would feel more comfortable if this was wrapped in a SliceBounds
type rather than just as a bare tuple.
@BurntSushi The best use case I've seen is described in this comment. In most cases, assert_len
slice::range
will be better. You should only need this impl if you don't know the length that will be used to cap the range.
@dylni I'll rephrase. Is there a way to show what this enables that is otherwise not possible (or less ergonomic) with a short code snippet?
The sort of example that first comes to mind for me are &mut Builder
-pattern applications, where a range is supplied to a builder long before it is used, and must squeeze into a monomorphic type:
use std::ops::{Bound, RangeBounds}; pub fn main() { let data: Vec<Vec<i32>> = unimplemented!(); let view = TableViewBuilder::new() .row_range(..=3) .column_range(2..5) .make_view(&data); }
Without this impl, it may be unclear what type TableViewBuilder
should actually store (I think you might need to allocate a Box). With this impl, (Bound<usize>, Bound<usize>)
can be stored:
#![feature(bound_cloned)] pub struct TableViewBuilder { row_range: (Bound<usize>, Bound<usize>), column_range: (Bound<usize>, Bound<usize>), } impl TableViewBuilder { pub fn new() -> Self { TableViewBuilder { row_range: (Bound::Unbounded, Bound::Unbounded), column_range: (Bound::Unbounded, Bound::Unbounded), } } pub fn row_range(&mut self, r: impl RangeBounds<usize>) -> &mut Self { self.row_range = (r.start_bound().cloned(), r.end_bound().cloned()); self } pub fn column_range(&mut self, r: impl RangeBounds<usize>) -> &mut Self { self.column_range = (r.start_bound().cloned(), r.end_bound().cloned()); self } pub fn make_view(&self, table: &[Vec<i32>]) -> impl Iterator<Item=&[i32]> { table[self.row_range].iter() .map(|row| &row[self.column_range]) } }
Of course, here I will emphasize that this particular example only works because make_view
is very specifically using these ranges to index slices. If it tried to do something else with the range, this impl would be insufficient, which is why I'm ultimately more interested in #76393.
@ExpHP SliceIndex
allows creating that API using a polymorphic struct with some effort:
Your example is more ergonomic, but this looks reasonable to me.
@BurntSushi I see this impl being used mostly to simplify some uncommon code and make the SliceIndex
impls more consistent with RangeBounds
, but I'm ambivalent. I've never needed this impl myself.
@dylni I'll rephrase. Is there a way to show what this enables that is otherwise not possible (or less ergonomic) with a short code snippet?
IMHO, this is essentially a replacement for assert_len
that uses the existing RangeBounds
impls as a guide. The most common use case being monomorphising indexing across range types, as mentioned by @ExpHP and myself.
I agree with the concerns that perhaps a separate type than a plain tuple might be better, but RangeBounds
already went the tuple route and it would be weird to deviate here.
Without this impl or slice::range
, making code that accepts a generic RangeBounds
and indexes into a slice is a lot more complicated than it's worth, since you have to essentially implement the conversion yourself. In those cases, your best bet is to just not allow anything other than Range
and force users of the API to convert things into ranges themselves, which is less ergonomic.
Another alternative might be to just force use of SliceIndex
directly but that currently cannot be implemented by custom types.
The crux of it is honestly that RangeBounds
doesn't imply SliceIndex
, and we have to deal with that somehow. I feel like forcing users to just pidgeonhole themselves into one of the types that's shared between the two is a suboptimal solution.
@rfcbot resolve unclear-use-case
Copy link
rfcbot commented 11 days ago
This is now entering its final comment period, as per the review above.
@ExpHP Thank you! The key thing I was missing was "there's no way to store a monomorphic range that permits accepting any kind of range because our language syntax generates different range types depending on what syntax is used." Maybe seems obvious in retrospect, but it just wasn't clicking for me. Thank you for elaborating.
The final comment period, with a disposition to merge, as per the review above, is now complete.
As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.
The RFC will be merged soon.
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK