2

New RFC: Collection Transmute by llogiq · Pull Request #2756 · rust-lang/rfcs ·...

 2 years ago
source link: https://github.com/rust-lang/rfcs/pull/2756
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

Copy link

Contributor

llogiq commented on Sep 3, 2019

edited

Copy link

Contributor

Centril commented on Sep 3, 2019

Copy link

Member

BurntSushi commented on Sep 3, 2019

edited

Nice! Thanks for writing this up.

Though the code is not too large, there are a good number of things to get wrong – this solution was iteratively created by soundness-knowledgeable Rustaceans, with multiple wrong attempts.

Could you include text in the RFC that documents this progression? In particular, why is the naive approach unsound, and why is the final result correct?

In addition to that, I don't see alignment mentioned in this RFC, and I'd kind of expect that to be an issue somewhere. Or am I misunderstanding?

Additionally, are there any problems with the fact that you might create an allocation for Vec<T>, but wind up freeing that allocation as a Vec<U>? It seems to me like that is not universally okay. I think size_of::<T>() == size_of::<U>() and align_of::<T>() == align_of::<U>() both need to hold, where I think the latter is an additional restriction over transmute, which AFAIK does not require the alignments to be equivalent. Are there more conditions that need to hold?

I'd rather just declare mem::transmute to be OK. A new method with the same name and purpose as mem::transmute but subtly different meaning makes it difficult to remember which one to use, and it doesn't help people who (in the past or in the future) write it "incorrectly".

As far as I know, transmuting a Vec (if the item types are sufficiently compatible) is only UB today because we're not committed to Vec<T> and Vec<U> having the same memory layout for all T and U, but we can easily guarantee that that if we so choose:

  • The current implementation (e.g., Vec's source code, rustc's layout algorithm) poses no obstacles to such transmutes, it's just a matter of guaranteeing it will remain so
  • There's no reason to expect that we'll ever want to not make it true (e.g. it might make sense to randomize layout of user-defined structs or determine it by PGO, but for Vec neither point seems likely to be useful)
  • If needed (e.g. because we add struct layout randomization), we can always add something extra to Vec to make it true, e.g., a repr(C) attribute or the hypothetical repr(in_order) (if the FFI implications of repr(C) are undesirable)

One remaining benefit of a specialized transmute method would that it could do more sanity checks, such as the item types having the same size and alignment. This is significant but I'm not so sure it outweighs the aforementioned drawbacks of having a separate but similar method.

I agree with @BurntSushi about enforcing size_of::<T>() == size_of::<U>(). We must enforce this because mem::transmute specifies that it checks type sizes, which morally unsafe { mem::transmute::<_,Vec<u64>>(vec![0u8]) } violates.

I think however an inequality suffices for the alignment, probably align_of::<U>() < align_of::<T>() no? If we dislike this, then Vec::transmute could reallocate whenever it must fix alignment, but maybe transmute users would prefer the inequality.

Copy link

Member

BurntSushi commented on Sep 3, 2019

I got the equality (as opposed to inequality) from the docs on GlobalAlloc::dealloc. Specifically:

layout must be the same layout that was used to allocate that block of memory

Where layout specifies the alignment. A strict reading of that suggests that the alignment you allocate with must be the same as the alignment that you deallocate with. Whether this is actually required for the underlying memory allocator(s) used, I don't know.

An inequality would prove useful, so maybe @RalfJung can say if it suffices.

Copy link

Member

RalfJung commented on Sep 3, 2019

edited

You'll have to ask whoever wrote the allocator docs. I have no idea why alignment is even required to match.

(I'll only be able to read the full thing and respond properly to it when I am back from vacation.)

Copy link

Member

sfackler commented on Sep 3, 2019

The Windows system allocator implementation does rely on getting the same alignment when deallocating as allocating I believe.

Copy link

Contributor

Diggsey commented on Sep 3, 2019

This is not just a windows thing: any time that you need a greater alignment than the underlying allocator can provide, you have to allocate a slightly larger chunk of memory and then offset the returned pointer to the required alignment. When it comes to freeing that memory, you need some way to undo that offset.

Copy link

Member

RalfJung commented on Sep 3, 2019

Just the alignment is not enough information though to compute what the offset was.
When requesting a 8-byte-aligned allocation where we really need an alignment of 16, we need to know whether the offset-by-8 was necessary or not. Just knowing "oh, 16-aligned was requested" does not help.

Copy link

Contributor

Author

llogiq commented on Sep 3, 2019

Thanks to all of you for this illuminating discussion! This convinces me that a) the current design isn't courageous enough, I should go with an unsafe Transmute trait instead, b) implementations should check what they can, preferrably at compile-time, and c) we should discourage mem::transmute usage, leaving it only as a last resort.

Regarding the alignment question, we'd need the information from the allocator if reducing alignment is acceptable.

Copy link

Contributor

Diggsey commented on Sep 3, 2019

Just the alignment is not enough information though to compute what the offset was.
When requesting a 8-byte-aligned allocation where we really need an alignment of 16, we need to know whether the offset-by-8 was necessary or not. Just knowing "oh, 16-aligned was requested" does not help.

You need the alignment to determine whether an offset was applied or not though, and that's exactly what libstd does: https://github.com/rust-lang/rust/blob/master/src/libstd/sys/windows/alloc.rs#L47-L56

Without that, you end up having to add overhead to every single allocation, not just those with unusually large alignment requirements.

Copy link

Contributor

clarfonthey commented on Sep 3, 2019

edited

Kind of unsure how this could be used properly in the case of BinaryHeap.

Copy link

burdges commented on Sep 3, 2019

edited

How much do we know about when transmute gets used? I assume the most common usages are firstly to convert to/from a byte array, and secondly to violate another crate's visibility rules, like by adding or removing a private wrapper type. I think size_of::<T>() == size_of::<U>() suffices for the second, so the most restrictive form sounds useful.

Copy link

Contributor

Author

llogiq commented on Sep 3, 2019

@clarfon it's an unsafe operation. One could pair it with a heapify operation to restore the heap property if so desired, in cases where the types' orderings differ.

Copy link

Contributor

clarfonthey commented on Sep 3, 2019

@llogiq Right, but BinaryHeap doesn't offer that in any public APIs, and at that rate, you're mostly just reimplementing what BinaryHeap has to offer anyway. Either the internals of the heap should be exposed (seems out of scope for libstd) or anyone wanting to transmute their heap should just implement one on top of Vec.

Copy link

comex commented on Sep 4, 2019

How exactly is the size/alignment check to be implemented? An extension of the special-case intrinsicck routine that currently checks std::mem::transmute? Is there any way to use const generics to integrate it properly with the trait system? (I tried, but it seems like the necessary functionality isn't implemented yet.)

Copy link

Contributor

djc commented on Sep 4, 2019

This may be obvious to most people involved here so far, but it might be useful to mention one or two use cases for having a transmute operation on these collections in the first place.

Copy link

ExpHP commented on Sep 4, 2019

edited

A Transmute trait sounds useless. Let me explain:


There are some cases where it would be highly desirable for straight-up transmutation to be well-defined. In particular, when switching a generic type argument between #[repr(transparent)] types; I'm not sure that #[repr(transparent)] is specified in a way such that ArbitraryStruct<usize> and ArbitraryStruct<Node> are guaranteed to have the same representation; but it ought to be!

// Newtype index wrapper for a graph node.
#[repr(transparent)]
#[derive(Debug, Copy, Clone, Hash, PartialEq, Eq)]
struct Node(usize);

// It is highly desirable for all of these to be valid operations
transmute::<Vec<Vec<usize>>, Vec<Vec<Node>>>(vecs);
transmute::<Vec<HashSet<usize>>, Vec<HashSet<Node>>>(sets);
transmute::<&HashSet<usize>, &HashSet<Node>>(set);

If we tried to solve the problem of UB Vec transmutes by introducing a Transmute trait, then

  1. either: all of these transmutes have the same tricky UB that we'd be discouraging against mem::transmute for, and so we merely brushed a tiny bit of dirt under the rug.
  2. or: the trait impls do not actually transmute. The first two transmutes would have to cost O(n), and the transmute behind references cannot be implemented.

I'm not sure that #[repr(transparent)] is specified in a way such that ArbitraryStruct and ArbitraryStruct are guaranteed to have the same representation; but it ought to be!

No it should not!

trait Foo {
    type Bar;
}

struct Baz<T: Foo>(T, T::Bar);

Baz can have different representations even for transparent types.

Copy link

hanna-kruppe commented on Sep 4, 2019

edited

@KrishnaSannasi

Baz can have different representations even for transparent types.

That is not a transparent newtype, though. repr(transparent) does guarantee equal layout to the sole non-ZST field (which may not be the same as the type parameter, if any, but that is a minor point).

However, @ExpHP, transmuting transparent newtypes may still not be safe because the newtype may have additional invariants over the wrapped type. For example, NonZeroU32 is a repr(transparent) struct containing u32.

I have done it again! Now, I have added try_zip_with and zip_with. These take an additional vec and try and use the capacity of either one! I have also make the code more robust, now it will clean up it's mess even if you panic in the drop.

https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=8655604d47b5a128ab306b1a5c87589a

In the interest of not polluting this thread even more with these updates, I have published this as a minimal crate so that you can track it's progress there and contribute if you would like to!

https://crates.io/crates/vec-utils

Copy link

Contributor

SimonSapin commented on Oct 15, 2019

edited

I’m sympathetic to the idea that there is a need here, but this thread has several comment that should lead to edits in the RFC. This includes at least:

  • #2756 (comment) More explanations of the wrong ways to do this, and why each part of the correct way is necessary
  • #2756 (comment) Same alignment is an additional requirement compared to mem::transmute of each items
  • #2756 (comment) As a result, the Vec<u32> -> Vec<[u8; 4]> example is incorrect

I’ll add: I’d prefer the method to assert! the size and alignment requirements. The optimizer should have no trouble constant-folding that branch away.

Copy link

RustyYato commented on Oct 15, 2019

edited

I’ll add: I’d prefer the method to assert! the size and alignment requirements. The optimizer should have no trouble constant-folding that branch away

An if checking the layout optimizes away just fine because everything in the check is a compile time constant and it makes the code easier to change, so that should be fine.

I could add more docs to vec-utils, would that work to clear up why we need each step?

Copy link

Contributor

SimonSapin commented on Oct 25, 2019

I feel that Vec::into_raw_parts removes much of the need for Vec::transmute.

Copy link

Member

shepmaster commented on Oct 25, 2019

I can go either way. From my comment on the implementation PR:


I believe that there's some amount of benefit to both. Vec::transmute would address my original case here, but into_raw_parts would be nice to use for FFI concerns.

If I understand it correctly, Vec::transmute could be implemented in terms of into_raw_parts:

unsafe fn transmute<U>(self) -> Vec<U> {
    let (ptr, len, cap) = self.into_raw_parts();
    let ptr = mem::transmute::<*mut T, *mut U>(ptr);
    Vec::from_raw_parts(ptr, len, cap)
}

Then the original example would be

let mut v: Vec<Foobar<T>> = unimplemented!();
let values: Vec<T> = unsafe { v.transmute() };

Copy link

Contributor

SimonSapin commented on Oct 25, 2019

If I understand it correctly, Vec::transmute could be implemented in terms of into_raw_parts

Yes, this is what I was getting at. If the implementation of Vec::transmute becomes less error-prone, it becomes that much less pressing for it to be in the standard library. "End" users could do something like:

let mut v: Vec<Foobar<T>> = unimplemented!();
let (ptr, len, cap) = v.into_raw_parts();
let values = unsafe { Vec::from_raw_parts(ptr as *mut T, len, cap) };

(No need for transmute to cast raw pointers.)

Note that Vec::transmute had two things that made it error-prone:

  • #[repr(Rust)] transmute, "fixed" by Vec::{into,from}_raw_parts heavy_check_mark

  • Layout<T> == Layout<V> (including equal alignment) warning which is a check people may miss if they implement Vec::transmute as @shepmaster suggested.

    • This point maybe can be addressed with good enough documentation woman_shrugging

Layout<T> == Layout<V> is the especially tricky part. Outside owned collections it is totally fine to cast to a lower alignment, i.e. it's OK to view a u32 as [u8; 4], but that's not the case for collections that own their allocation.

Linting in rustc or Clippy for the specific pattern of obtaining a pointer from an owned collection, casting it to an incompatible alignment and creating another collection from it would solve the problem, but I'm not sure how feasible that is.

Clippy already has a lint for casts between pointers with incompatible layout, but it would not be triggered for casting a pointer to u32 as pointer to [u8; 4], and yet that triggers undefined behavior as soon as an owning collection constructed from the pointer with altered alignment is deallocated.

Copy link

Contributor

SimonSapin commented on Oct 25, 2019

That’s a good point, an explicit method would also be an opportunity to assert! the the layouts match. (And since the types are monomorphized, layouts should be constant and that branch should optimize away.)

Copy link

Contributor

Author

llogiq commented on Oct 25, 2019

We could at least lint pointers that are used in Vec::from_raw_parts. Or other functions we identify.

We discussed this in the @rust-lang/lang meeting today.

We don't think this is a lang concern; this is entirely up to @rust-lang/libs.

We do think there'd be value in project-safe-transmute taking a look at this, and ensuring there are appropriate mechanisms to do safe transmutes.

Apart from that, it seems reasonable to have mechanisms to do unsafe transmute here. Such a mechanism could use either from_raw_parts or a top-level transmute, depending on type layouts. Either way, this doesn't seem like a lang issue.

Note that simple iterators already do almost the right thing for vec. https://rust.godbolt.org/z/537KEs
It does not call any external functions such as alloc/free. I'm not sure why the assembly still contains those loops, they seem to do nothing.

Dropping T-lang per #2756 (comment)

Copy link

Member

dtolnay commented 8 days ago

We discussed this RFC in the most recent @rust-lang/libs-api meeting.

It was concerning that the code examples of "doing it correctly" in the RFC are all UB due to the alignment issue raised in the thread. You can't transmute by value to a vector with a different element alignment (like Vec<i32> to Vec<[u8; 4]>) since when dropping the resulting vector's heap allocation the Layout provided to the allocator is required to match the Layout it was allocated with.

We are interested in collection transmute APIs in general (both safe and unsafe) but would prefer to consider them only once there are trait bounds from #2981 to incorporate into the design.

For now, as noted in #2756 (comment), into_raw_parts + from_raw_parts is the recommended idiom for performing such transmute.

@rfcbot fcp close

Copy link

rfcbot commented 8 days ago

edited by yaahc

Team member @dtolnay has proposed to close this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

Copy link

rfcbot commented 8 days ago

bellThis is now entering its final comment period, as per the review above. bell

Copy link

Contributor

Lokathor commented 8 days ago

A comment: If you are willing to use a crate then many vec transmutations are already available: https://docs.rs/bytemuck/1.7.0/bytemuck/allocation/fn.try_cast_vec.html


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK