Avoid allocations and copying in Vec::leak by mbrubeck · Pull Request #89337 · r...
source link: https://github.com/rust-lang/rust/pull/89337
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
New issue
Avoid allocations and copying in Vec::leak #89337
Conversation
The Vec::leak
method (#62195) is currently implemented by calling Vec::into_boxed_slice
and Box::leak
. This shrinks the vector before leaking it, which potentially causes a reallocation and copies the vector's contents.
By avoiding the conversion to Box
, we can instead leak the vector without any expensive operations, just by returning a slice reference and forgetting the Vec
. Users who want to shrink the vector first can still do so by calling shrink_to_fit
explicitly.
Note: This could break code that uses Box::from_raw
to “un-leak” the slice returned by Vec::leak
. However, the Vec::leak
docs explicitly forbid this, so such code is already incorrect.
r? @dtolnay
(rust-highfive has picked a reviewer for you, use r? to override)
This was discussed previously at: rust-lang/rfcs#2969 (review)
Note: This could break code that uses
Box::from_raw
to “un-leak” the slice returned byVec::leak
. However, theVec::leak
docs explicitly forbid this, so such code is already incorrect.
That clause was added in #77709, but saying "there is no way" is a pretty casual way to forbid this IMO.
I think we should make an explicit guarantee about the capacity here, so there would be a way to use Vec::from_raw_parts
. Either we do shrink and you can use len == capacity
, or we don't shrink and you can call capacity()
beforehand. If we don't shrink, the point about data movement is also a useful performance guarantee to document.
I pushed a new commit updating the docs to say:
This method does not reallocate or shrink the
Vec
, so the leaked allocation may include unused capacity that is not part of the returned slice. Unsafe code that later reconstructs or deallocates theVec
(for example, by callingVec::from_raw_parts
) must keep track of the original capacity.
(This replaces the paragraph that said “there is no way to recover the leaked memory.“)
I don't know if we need to call it out in the docs, but of course anyone who wants the len == capacity
guarantee can get it by calling shrink_to_fit
before leak
.
I'm happy with that, but @rust-lang/libs-api should discuss that as a new API guarantee.
I was not involved in the discussion over at rust-lang/rfcs#2969, but I think this is pretty clearly the right behavior, for the reasons described by @m-ou-se in rust-lang/rfcs#2969 (review).
-
Using "leak" to mean "leak A COPY OF, but destroy the argument" rather than "leak the argument" is uncharacteristic.
-
If leak does not shrink, getting shrink behavior is as simple as
vec.shrink_to_fit(); vec.leak()
, orBox::leak(vec.into_boxed_slice())
. If leak does shrink, getting the non-shrinking behavior requires the user to write unsafe code.
I'll add one more:
- If the user has built a data structure containing pointers into the vector, it's reasonable to want to be able to leak without all those pointers getting invalidated.
@rfcbot fcp merge
Team member @dtolnay has proposed to merge this. The next step is review by the rest of the tagged team members:
No concerns currently listed.
Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!
See this document for info about what commands tagged team members can give me.
Copy link
rfcbot commented 13 days ago
This is now entering its final comment period, as per the review above.
I pushed a new commit updating the docs to say:
This method does not reallocate or shrink the
Vec
, so the leaked allocation may include unused capacity that is not part of the returned slice. Unsafe code that later reconstructs or deallocates theVec
(for example, by callingVec::from_raw_parts
) must keep track of the original capacity.(This replaces the paragraph that said “there is no way to recover the leaked memory.“)
Maybe that should be prefaced by "As of "? IME it's not that uncommon to look at the "regular" stable docs but use an older compiler / std.
IME it's not that uncommon to look at the "regular" stable docs but use an older compiler / std.
Hmm, it's even worse than that. Even if the author of the code is using the most recent toolchain, there's no easy way to prevent users of the code from compiling it with an older toolchain. For example, if a published library contains unsafe code that relies on this guarantee, it would have undefined behavior when any project using the library is built with an older toolchain, where the guarantee does not hold.
This seems bad enough that we might want to avoid making any guarantee at all that doesn't apply across toolchain versions. But if we do document this guarantee, I agree that we need to document when it changed.
Well, once 1.56 lands, people will be able to prevent compilation using an older rustc, at least in Cargo projects, through rust-version in Cargo.toml
:)
Copy link
rfcbot commented 3 days ago
The final comment period, with a disposition to merge, as per the review above, is now complete.
As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.
This will be merged soon.
Note: if the len != cap
then the leaked slice doesn't have provenance over the unused capacity. So reconstructing via from_raw_parts
is unsound. This should be explicitly documented.
cc @RalfJung
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK