
Opt-in Stable Trait VTables

 3 years ago
source link: https://github.com/rust-lang/rfcs/pull/2955

Summary

To allow for traits to have a stable fat-pointer layout and a stable vtable layout to allow FFI Compatibility,
a new #[stable_vtable] attribute is provided to opt-in to the layout as described within this rfc.

Informational Links

Pre-RFC Discussion on rust-lang internals: https://internals.rust-lang.org/t/pre-rfc-repr-c-for-traits/12598.

A link to the similar Proposed Technical Specification for the Laser Language as mentioned in the RFC will be provided when it is made available by the Language Design Working Group. I am specifically interested in maintaining cohesion between this RFC and that Proposed Technical Specification.

comex commented on Jul 11

+1
Bikeshed: I would prefer a syntax that could naturally be extended to support multiple stable vtable formats in the future. Perhaps something like #[vtable_repr(rust_v1)]?
Why would we ever need multiple stable formats? Well,

  • We might want to add a new version someday, if backwards-incompatible changes are necessary to support new functionality (e.g. adding more built-in functions in addition to drop_in_place)
  • We might want to add a COM-compatible option, assuming the initial version isn't. The problem here is that COM, like C++, stores the vtable pointer in the object itself rather than using fat pointers. But there have been proposals in the past for thin Rust trait objects; if the language gains support for those someday, going further and adding COM compatibility would be a more natural extension.

Author

chorman0773 commented on Jul 11

edited

Presently I left the attribute to be without an argument, as the point is to come up with a single multi-language vtable layout.
As mentioned in the rfc, I'm open to extensions that follow from the work to add custom abis (I do not presently know where that is, beyond the discussion on rust-lang internals). However, I'm hesitant to add an argument as it stands.
Certainly rust_v1 as the argument would seem to go against the purpose of the RFC, as its name implies it's specifically for rust, not generally for systems that subscribe to the proposed abi.
If others think that the argument should be there by default I'd be willing to add it for sure, but I'm not sure it's necessary as it stands.

In reference to the COM-compatible option, that was one of the options I was exploring as an alternative.
That may be a use for providing the argument early. As for the problem of storing the object inside the pointer, that may cause some problems indeed. You could specify that dyn Trait for such traits has the same layout as

#[repr(C)]
struct COMObject {
    vtable: *const VTable,
    // Stand-in: really an arbitrary unsized type implementing Trait.
    obj: (),
}

However, creating references to such objects wouldn't be as easy as simply materializing a pointer to the object and a pointer to the vtable.
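A minimal sketch of the COM-style arrangement discussed above, where the vtable pointer lives inside the object and a reference is a single thin pointer. All names here (`ComHeader`, `ComVTable`, `Counter`, `counter_get`) are hypothetical and are not part of the RFC or of real COM, which also fixes a calling convention and an `IUnknown` prefix.

```rust
/// Hypothetical vtable with a single method slot.
#[repr(C)]
pub struct ComVTable {
    pub get: unsafe extern "C" fn(*const ComHeader) -> i32,
}

/// Header stored at offset 0 of every object: just the vtable pointer.
#[repr(C)]
pub struct ComHeader {
    pub vtable: *const ComVTable,
}

/// A concrete object. `header` is the first field of a repr(C) struct,
/// so a pointer to the object is also a valid pointer to its header.
#[repr(C)]
pub struct Counter {
    pub header: ComHeader,
    pub value: i32,
}

unsafe extern "C" fn counter_get(this: *const ComHeader) -> i32 {
    // Caller promises `this` points at the header of a live `Counter`.
    unsafe { (*this.cast::<Counter>()).value }
}

/// One vtable shared by every `Counter`.
pub static COUNTER_VTABLE: ComVTable = ComVTable { get: counter_get };
```

A caller holding only a `*const ComHeader` reads the vtable through the object itself, which is why no fat pointer is needed, and also why a Rust `&dyn Trait` cannot be produced from it by pairing two pointers.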

comex commented on Jul 11

Certainly rust_v1 as the argument would seem to go against the purpose of the RFC, as its name implies it's specifically for rust, not generally for systems that subscribe to the proposed abi.

Fair enough, but I still think there should be some sort of argument.

burdges commented on Jul 11

edited

If I understand, you want first to organize where extern "C" fns live inside trait object vtables? Is there really any need for an attribute here? Could extern "C" fns just always come first? If you do need attributes, then maybe attributes should give explicit positions:

pub trait Baz {
    // Rust methods have no stable vtable position.
    fn foo(&self) -> &Self;
    fn bar(&self) -> &Self;

    /// But extern "C" methods can be assigned unique vtable positions
    #[vtable_position=1]
    extern "C" fn c_foo(self: *mut Self) -> *mut Self;
    #[vtable_position=2]
    extern "C" fn c_bar(self: *mut Self) -> *mut Self;
}

You've this separate issue about Rust stabilizing some fat pointers, which sounds harder:

We cannot do struct Foo([A],[B]) currently but maybe one day we'd allow this, meaning &Foo requires three usizes. We cannot cast &[T] to &dyn Borrow<[T]> either, again because this extra fat pointer needs three usizes, which sounds harder to do than Foo, but maybe not impossible. You must express constraints upon the pointer types that reference this trait object somehow.

Are attributes sufficient here? You might require an actual type dyn(C) Trait perhaps? We might later support stuff like &dyn(unsized=2) Borrow<([A],[B])> where the 2 records the number of dynamic sizes, so one for each slice, and this type occupies 4 usizes.


As an aside related to fat pointers, rust should add small box optimizations eventually, probably via some SmallBox<dyn Trait> type that encodes Sized types no larger than usize into the pointer's space. We need this because SmallBox<dyn Trait> could work without even alloc: Any call SmallBox::new::<T>(..) succeeds if size_of_val::<&T>() < size_of::<usize>() and align_of_val::<&T>() < align_of::<usize>(), even without alloc, but without alloc then such calls panic if the size and alignment checks fail. After this, SmallBox: Deref/DerefMut determine how they handle the pointer usizes by reinvoking these size and alignment checks, which simply read from the vtable without processing the pointer. We need this so that some SmallBox<dyn Error> works without alloc, and hence that io-like traits work.

Author

chorman0773 commented on Jul 11

On Sat, 11 Jul 2020 at 09:30, Jeff Burdges ***@***.***> wrote:

There are a couple issues here:

1. Could we organize where extern "C" fns live inside trait object vtables? I suppose yes sure..

We do not necessarily require an attribute here either, maybe extern "C" fns could always come first? Or even give them negative offsets relative to the vtable start? If we do need attributes, then maybe those attributes should give explicit positions:

pub trait Baz {
    // Rust methods have no stable vtable position.
    fn foo(&self) -> Self;
    fn bar(&self) -> Self;

    /// But extern "C" methods can be assigned unique vtable positions
    #[vtable_position=1]
    extern "C" fn c_foo(self: *mut Self) -> *mut Self;
    #[vtable_position=2]
    extern "C" fn c_bar(self: *mut Self) -> *mut Self;
}

Under these rules the above trait isn't stable_vtable (it also can't be, as it uses `Self`, so it's not *object-safe*). Because everything about the *stable-vtable* pointer needs to be specified, I wanted an explicit attribute for api authors to opt-in to the stable layout (going from stable layout->unstable layout is necessarily a breaking change). The attribute also provides a degree of sanity checking. As mentioned in the rfc, if a trait has supertraits which do not themselves have a stable vtable (and aren't `auto` traits), or is not object-safe, it's a compile-time error to use the attribute on that trait. The idea is that you would specify the traits for which you intend to allow foreign modules, and potentially even foreign languages, to interact with the pointers you provide. It also

1. There is a separate question about Rust stabilizing anything about fat pointers:

We cannot do struct Foo([A],[B]) currently but maybe one day we'd allow this, meaning &Foo requires three usizes. We cannot cast &[T] to &dyn Borrow<[T]> either, again because this extra fat pointer needs three usizes, which sounds harder to do than Foo, but maybe not impossible. You must express constraints upon the pointer types that reference this trait object somehow.

Both of these cases would not be affected by this rfc; structs containing trait objects are currently not considered by the rfc. I don't see why you couldn't cast a theoretical &[T] to a &dyn Trait under these rules: in this stabilized vtable layout, use the size encoding to include the length of the array. `struct Foo([A],[B])` would be much harder. Possibly it's ill-formed to convert &Foo to &dyn StableVTable. If keeping the "Adding `#[stable_vtable]` to a trait is a minor change", it would likely have to be promoted to ill-formed to convert &Foo to any &dyn Trait.

As an aside, rust should add small box optimizations eventually, probably via some SmallBox<dyn Trait> type that encodes Sized types no larger than usize into the pointer's space. We need this because SmallBox<dyn Trait> could work without even alloc: Any call SmallBox::new::<T>(..) succeeds if size_of_val::<&T>() < size_of::<usize>() and align_of_val::<&T>() < align_of::<usize>(), even without alloc, but without alloc then such calls panic if the size and alignment checks fail. After this, SmallBox: Deref/DerefMut determine how they handle the pointer usizes by reinvoking these size and alignment checks, which simply read from the vtable without processing the pointer. We need this so that some SmallBox<dyn Error> works without alloc, and hence that io-like traits work.

Such a SmallBox would necessarily not be a *stable-layout* pointer, and would be beyond the purview of this rfc. Is it not stable that `Box<T>` is heap-allocated (particularly with the `Global` allocator), and therefore not possible to change to allow small buffer optimizations? That was noted in the Pre-RFC discussion, which led to the `dealloc` field being replaced with simply a `reserved` field.

burdges commented on Jul 12

I neglected to commit an edit I made right after that comment, sorry. Anyways you're right this always requires annotation because going from stable layout to unstable layout is a breaking change, so then my main question is if you want anything more explicit or just want to say the order in the trait is the order in the vtable, which is I guess how C does everything so hey..

And my final question was if you want some type like dyn(C) Trait that captures layout of the pointer itself.

Author

chorman0773 commented on Jul 12

The RFC defines the layout order to be the declaration order of the trait.
As for dyn(C) Trait, imo it seems like this would be better as a property of the trait, rather than a property of the object type.

Author

chorman0773 commented on Jul 14

edited

Re: a mandatory argument in the attribute
I'm not sure what such an argument would be, particularly. Originally, this was overloading the meaning of repr(C), but I don't really want to use C as the argument as this isn't necessarily a "C" thing to do. I am certainly open to suggestions.

Contributor

bjorn3 commented on Jul 16

How should where Self: Sized be handled? Currently it leaves a gap filled with 0, instead of completely omitting the method. This makes it very easy to go from the nth method to the position in the vtable for it: (3 + n) * size_of::<usize>()
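The offset formula in this comment can be written out as a small helper. This is only a sketch: `HEADER_ENTRIES` and `method_slot_offset` are hypothetical names, and the three-entry header is an assumption read off the `3 + n` in the comment, not a documented guarantee of any compiler.

```rust
use std::mem::size_of;

/// Assumed number of pointer-sized entries ahead of the first method slot,
/// taken from the `3 + n` in the comment above.
pub const HEADER_ENTRIES: usize = 3;

/// Byte offset of the n-th (zero-based) declared method, assuming every
/// declared method keeps a slot, including `where Self: Sized` ones
/// (which are left as a zero-filled gap).
pub fn method_slot_offset(n: usize) -> usize {
    (HEADER_ENTRIES + n) * size_of::<usize>()
}
```

If `Self: Sized` methods were omitted instead of leaving a gap, `n` would have to count only the object-safe methods declared before the one being looked up, so the declaration index alone would no longer be enough.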

Author

chorman0773 commented on Jul 16

That's an omission, I'll correct it. My intention was that Self: Sized methods be omitted from the vtable. The idea is that a plugin written in a foreign language could declare the equivalent of a trait using the equivalent variation of this rfc, and pass it to an application running rust, or vice-versa. For some things, a Self: Sized -equivalent bound wouldn't make sense or even exist.

Author

chorman0773 commented on Jul 23

Note the following rules:
The behaviour is undefined if any of the following is violated for any stable-layout-pointer. The implementation shall not cause any of these constraints to be violated:

  • size shall be a multiple of align.
  • align shall be a power of two.

These are imposing a limitation on the implementation as well as programs. Specifically, this means that any vtable the implementation creates must not violate those constraints. This is for compatibility with other specifications providing similar constructs. If an implementation violates them, it would be unsound on the merits that if handed to a different implementation, it violates those UB causing constraints.
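As an illustration, the two constraints could be checked on a hand-built vtable before it is handed to another implementation. `layout_is_valid` is a hypothetical helper, not something the RFC defines.

```rust
/// Returns true when `size` and `align` satisfy both rules quoted above:
/// `align` is a power of two and `size` is a multiple of `align`.
pub fn layout_is_valid(size: usize, align: usize) -> bool {
    // `is_power_of_two` is false for 0, so the modulo never divides by zero.
    align.is_power_of_two() && size % align == 0
}
```

Values taken from `size_of::<T>()` and `align_of::<T>()` for a Rust type always pass this check, so the helper only matters for vtables built by hand or received from a foreign implementation.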

@@ -108,7 +108,7 @@ struct VTable{
size: usize,
align: usize,
drop_in_place: Option<unsafe extern"C" fn(*mut ())->()>,
reserved: *mut ()
dealloc:Option<unsafe extern"C" fn(*mut ())->()>,

bjorn3 on Aug 23

Contributor

Suggested change
dealloc:Option<unsafe extern"C" fn(*mut ())->()>,
dealloc: Option<unsafe extern "C" fn(*mut ())>,

chorman0773 on Aug 23

Author

For consistency with the other pointers, I'd want to leave the explicit return type. In particular I'd like to have the return type on the virtual fns to show that there is a return type, it's just been erased.

bjorn3 on Aug 23

Contributor

dealloc and drop_in_place don't need an erased return type. For the rest using something like struct ErasedVtableFunction(()); *const ErasedVtableFunction instead would make more sense to prevent accidentally calling it with the wrong signature.

programmerjake on Aug 23

edited

I think #[repr(transparent)] struct ErasedVTableFn(unsafe extern "C" fn()); ... virtual_fns: [ErasedVTableFn] would be better, since on some targets (MS-DOS is an example, though rust doesn't currently support that target) function pointers and data pointers aren't the same size.
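A sketch of how the `ErasedVTableFn` wrapper suggested here might be used. The `erase`/`restore` helpers, the fixed `fn(*const ()) -> u32` signature, and the `answer` function are assumptions added for illustration only.

```rust
/// Wrapper from the comment above: a function pointer whose real
/// signature has been erased. Being a function pointer (not a data
/// pointer), it has the right size even where the two differ.
#[repr(transparent)]
#[derive(Clone, Copy)]
pub struct ErasedVTableFn(unsafe extern "C" fn());

/// Hypothetical concrete signature used for this example.
pub type Method = unsafe extern "C" fn(*const ()) -> u32;

impl ErasedVTableFn {
    /// Forget the signature. Only the pointer's type changes.
    pub fn erase(f: Method) -> Self {
        // SAFETY: both types are function pointers of the same size.
        ErasedVTableFn(unsafe { std::mem::transmute::<Method, unsafe extern "C" fn()>(f) })
    }

    /// Recover the signature. The caller must know it is the right one.
    pub unsafe fn restore(self) -> Method {
        unsafe { std::mem::transmute::<unsafe extern "C" fn(), Method>(self.0) }
    }
}

/// Example method to store in a slot.
pub unsafe extern "C" fn answer(_this: *const ()) -> u32 {
    42
}
```

The tuple field is private, so outside the defining module a slot can only be invoked after an explicit `restore`, which makes the "I know the real signature" step visible at the call site.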

chorman0773 on Aug 24

Author

The type is for exposition-only. The type does not actually exist in the RFC, as part of the core language or standard library. If I actually defined the type, I'd definitely put something like enum Empty{} in the parameter list to prevent it being called. The intent is simply to express the layout of the type.

* The `drop_in_place` entry shall be initialized to a function which performs the drop operation of the implementing type. If the drop operation is a no-op, the entry may be initialized to a null pointer (`None`) _Note - It is unspecified if types with trivial (no-op) destruction have the entry initialized to None, or to a function that performs no operation - End Note_

* The `dealloc` entry shall be initialized to a function which is suitable for deallocating the pointer if it was produced by the in-use global-allocator, (including potentially the intrinsic global-allocator provided by the `std` library). If no global-allocator is available, the entry shall be initialized to a null pointer, or a pointer to a function which performs no operation.

bjorn3 on Aug 23

Contributor

Vtables are associated with the combination of a trait and the type implementing the trait. The pointer type is completely ignored. This means that you can't have a different dealloc entry for every pointer type. Also through what mechanism would the dealloc entry be determined? The pointer type would need some way to tell the compiler what to fill in when creating the vtable.

chorman0773 on Aug 23

Author

dealloc is filled by the compiler with what is effectively alloc::alloc::dealloc (w/o the passed in Layout). When combined with a smart pointer, the type would be constructed (in rust) using the global allocator (either manually, through alloc::alloc::alloc, or through a Box), but deallocated using this entry, so it's suitable for vtables constructed in different languages, or vtables constructed by hand.
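One way to picture "dealloc without the passed-in Layout" is a per-type thunk that rebuilds the `Layout` from the type it was instantiated for. This is a hand-written sketch, not what the RFC says the compiler emits; `dealloc_thunk` is a hypothetical name, and zero-sized types are deliberately out of scope.

```rust
use std::alloc::{dealloc, Layout};

/// Frees memory that was obtained from the global allocator for a `T`.
/// The signature matches the exposition-only `dealloc` slot: only the
/// erased data pointer is passed, and the layout is recovered from `T`.
///
/// Safety: `ptr` must come from the global allocator with
/// `Layout::new::<T>()`, `T` must not be zero-sized, and the value must
/// already have been dropped (for example via the `drop_in_place` entry).
pub unsafe extern "C" fn dealloc_thunk<T>(ptr: *mut ()) {
    unsafe { dealloc(ptr.cast::<u8>(), Layout::new::<T>()) }
}
```

The thunk always calls the global allocator, which is the point raised in the reply below: nothing in the slot lets a smart pointer with a different allocator say so.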

bjorn3 on Aug 23

Contributor

alloc::alloc::dealloc may be the wrong function to call. How does the smart pointer tell this to rustc? I think it would make more sense to allocate the memory using malloc in ffi cases and then use free when needing to dealloc on the other side.

chorman0773 on Aug 24

Author

If the pointer is produced from FFI, the appropriate specification or API defines what function is called to deallocate the memory. If it's passed into FFI, the pointer is passed as is, allocated using rust's Global allocator, and is deallocated using the dealloc vtable entry.
In the cases where it's necessary to use a different allocation/deallocation, it is possible and valid to manufacture the vtable manually, and it remains compatible with the stable layout pointers defined here.

I realise this isn't perfect, and does not deal with, for example, the use of the allocator traits that wg-allocator is working on. However, it handles enough that it is possible to manipulate smart pointers with this RFC, as well as normal (raw) pointers.
This has prior art in C++, where virtual destructors are stored with a pointer to the applicable operator delete function, in the Itanium ABI Virtual Table, but this is unsuitable for objects created using a non-standard allocator, or even a non-standard allocation function.

Member

cramertj commented on Oct 23

We discussed this in the last language team meeting and the consensus was that now isn't the right time for this feature. For one, our typical vtable representation isn't yet settled, so it's conceivable we'd wind up with several iterations of in-language vtable representations as a result of making promises here. Additionally, it seems like the functionality proposed here could be emulated at relatively minor ergonomic cost by using a procedural macro to generate a vtable struct and constructor function. This would also allow more flexibility, quick iteration, and evolution than an in-language tool, as well as potentially access to custom features like associated constants and static methods in vtables. With that in mind,
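To make the procedural-macro alternative concrete, here is a hand-written sketch of the kind of vtable struct and constructor function such a macro might generate. Everything in it is hypothetical (`Shape`, `ShapeVTable`, `ShapeRef`, `shape_vtable`, the thunks), and the field order is illustrative rather than the RFC's layout.

```rust
use std::mem::{align_of, size_of};

/// Example trait to expose across an FFI boundary.
pub trait Shape {
    fn area(&self) -> f64;
}

/// Explicit, repr(C) vtable for `Shape`.
#[repr(C)]
pub struct ShapeVTable {
    pub size: usize,
    pub align: usize,
    pub drop_in_place: Option<unsafe extern "C" fn(*mut ())>,
    pub area: unsafe extern "C" fn(*const ()) -> f64,
}

/// Explicit fat pointer: erased data pointer plus vtable pointer.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct ShapeRef {
    pub data: *const (),
    pub vtable: *const ShapeVTable,
}

unsafe extern "C" fn drop_thunk<T>(p: *mut ()) {
    unsafe { std::ptr::drop_in_place(p.cast::<T>()) }
}

unsafe extern "C" fn area_thunk<T: Shape>(p: *const ()) -> f64 {
    unsafe { (*p.cast::<T>()).area() }
}

/// Constructor function: builds the vtable for one implementing type.
pub fn shape_vtable<T: Shape>() -> ShapeVTable {
    ShapeVTable {
        size: size_of::<T>(),
        align: align_of::<T>(),
        drop_in_place: Some(drop_thunk::<T>),
        area: area_thunk::<T>,
    }
}

/// A sample implementor.
pub struct Square(pub f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}
```

Because the struct is ordinary library code, a crate can change or version its layout without waiting on the language, which is the flexibility argument made in the comment above.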

@rfcbot fcp postpone

rfcbot commented on Oct 23

edited by joshtriplett

Team member @cramertj has proposed to postpone this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

Contributor

nikomatsakis commented on Nov 2

@rfcbot reviewed

rfcbot commented 12 days ago

This is now entering its final comment period, as per the review above.

rfcbot commented 2 days ago

The final comment period, with a disposition to postpone, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

The RFC is now postponed.

