8

Avoid allocations and copying in Vec::leak by mbrubeck · Pull Request #89337 · r...

 2 years ago
source link: https://github.com/rust-lang/rust/pull/89337
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

New issue

Avoid allocations and copying in Vec::leak #89337

Conversation

Copy link

Contributor

mbrubeck commented 14 days ago

edited

The Vec::leak method (#62195) is currently implemented by calling Vec::into_boxed_slice and Box::leak. This shrinks the vector before leaking it, which potentially causes a reallocation and copies the vector's contents.

By avoiding the conversion to Box, we can instead leak the vector without any expensive operations, just by returning a slice reference and forgetting the Vec. Users who want to shrink the vector first can still do so by calling shrink_to_fit explicitly.

Note: This could break code that uses Box::from_raw to “un-leak” the slice returned by Vec::leak. However, the Vec::leak docs explicitly forbid this, so such code is already incorrect.

Copy link

Collaborator

rust-highfive commented 14 days ago

r? @dtolnay

(rust-highfive has picked a reviewer for you, use r? to override)

Copy link

Contributor

Author

mbrubeck commented 14 days ago

This was discussed previously at: rust-lang/rfcs#2969 (review)

Copy link

Member

cuviper commented 13 days ago

Note: This could break code that uses Box::from_raw to “un-leak” the slice returned by Vec::leak. However, the Vec::leak docs explicitly forbid this, so such code is already incorrect.

That clause was added in #77709, but saying "there is no way" is a pretty casual way to forbid this IMO.

I think we should make an explicit guarantee about the capacity here, so there would be a way to use Vec::from_raw_parts. Either we do shrink and you can use len == capacity, or we don't shrink and you can call capacity() beforehand. If we don't shrink, the point about data movement is also a useful performance guarantee to document.

Copy link

Contributor

Author

mbrubeck commented 13 days ago

edited

I pushed a new commit updating the docs to say:

This method does not reallocate or shrink the Vec, so the leaked allocation may include unused capacity that is not part of the returned slice. Unsafe code that later reconstructs or deallocates the Vec (for example, by calling Vec::from_raw_parts) must keep track of the original capacity.

(This replaces the paragraph that said “there is no way to recover the leaked memory.“)

I don't know if we need to call it out in the docs, but of course anyone who wants the len == capacity guarantee can get it by calling shrink_to_fit before leak.

Copy link

Member

cuviper commented 13 days ago

I'm happy with that, but @rust-lang/libs-api should discuss that as a new API guarantee.

Copy link

Member

dtolnay commented 13 days ago

I was not involved in the discussion over at rust-lang/rfcs#2969, but I think this is pretty clearly the right behavior, for the reasons described by @m-ou-se in rust-lang/rfcs#2969 (review).

  • Using "leak" to mean "leak A COPY OF, but destroy the argument" rather than "leak the argument" is uncharacteristic.

  • If leak does not shrink, getting shrink behavior is as simple as vec.shrink_to_fit(); vec.leak(), or Box::leak(vec.into_boxed_slice()). If leak does shrink, getting the non-shrinking behavior requires the user to write unsafe code.

I'll add one more:

  • If the user has built a data structure containing pointers into the vector, it's reasonable to want to be able to leak without all those pointers getting invalidated.

@rfcbot fcp merge

Copy link

rfcbot commented 13 days ago

edited by joshtriplett

Team member @dtolnay has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

Copy link

rfcbot commented 13 days ago

bellThis is now entering its final comment period, as per the review above. bell

Copy link

Contributor

jplatte commented 12 days ago

edited

I pushed a new commit updating the docs to say:

This method does not reallocate or shrink the Vec, so the leaked allocation may include unused capacity that is not part of the returned slice. Unsafe code that later reconstructs or deallocates the Vec (for example, by calling Vec::from_raw_parts) must keep track of the original capacity.

(This replaces the paragraph that said “there is no way to recover the leaked memory.“)

Maybe that should be prefaced by "As of "? IME it's not that uncommon to look at the "regular" stable docs but use an older compiler / std.

Copy link

Contributor

Author

mbrubeck commented 12 days ago

IME it's not that uncommon to look at the "regular" stable docs but use an older compiler / std.

Hmm, it's even worse than that. Even if the author of the code is using the most recent toolchain, there's no easy way to prevent users of the code from compiling it with an older toolchain. For example, if a published library contains unsafe code that relies on this guarantee, it would have undefined behavior when any project using the library is built with an older toolchain, where the guarantee does not hold.

This seems bad enough that we might want to avoid making any guarantee at all that doesn't apply across toolchain versions. But if we do document this guarantee, I agree that we need to document when it changed.

Copy link

Contributor

jplatte commented 12 days ago

Well, once 1.56 lands, people will be able to prevent compilation using an older rustc, at least in Cargo projects, through rust-version in Cargo.toml :)

Copy link

rfcbot commented 3 days ago

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

This will be merged soon.

Copy link

Contributor

RustyYato commented 2 days ago

Note: if the len != cap then the leaked slice doesn't have provenance over the unused capacity. So reconstructing via from_raw_parts is unsound. This should be explicitly documented.

cc @RalfJung

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Reviewers

dtolnay

Assignees

dtolnay

Projects

None yet

Milestone

No milestone

Linked issues

Successfully merging this pull request may close these issues.

None yet

7 participants

About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK