Github RFC: `#[derive_no_bound(..)]` and `#[derive_field_bound(..)]` by Centril...
source link: https://github.com/rust-lang/rfcs/pull/2353
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
Conversation
Summary
This RFC gives users a way to control trait bounds on derived implementations by allowing them to omit default bounds on type parameters or add bounds for field types. This is achieved with the two attributes #[derive_no_bound(Trait)]
and #[derive_field_bound(Trait)]
.
The semantics of #[derive_no_bound(Trait)]
for a type parameter P
are:
The type parameter P
does not need to satisfy Trait
for any field referencing it to be Trait
The semantics of #[derive_field_bound(Trait)]
on a field are that the type of the field is added to the where
-clause of the referenced Trait
as: FieldType: Trait
.
Thanks
Thanks to @nox for collaborating on this RFC with me as well as @kennytm for providing some great input.
+ Replace bounds on impl of `Clone` and `PartialEq` with `T: Sync`
```rust
#[derive_bound(Clone, PartialEq, T: Sync)]
Centril on Feb 27, 2018
Author
Contributor
Ah yes, this is what was meant by:
Changing this would require a language change.
I'll clarify =)
kennytm on Feb 27, 2018 •
Member
This will be quite a large language change And I wonder if this will leads to things like
#[derive_bound(T: PartialEq<#[thing] fn(i32)>)]
Centril on Feb 27, 2018 •
Author
Contributor
On the other hand I think steps to at least provide a sensible subset of things such as bounds, expressions, types in the attribute grammar is good as people already do this kinds of things in serde, derivative, (my derive crate for proptest also does this, but it is still WIP)... It is just done in string quotes which is somewhat hacky :(
That is, afaik, people can already write:
#[serde(bound = "T: PartialEq<#[thing] fn(i32)>")]
I think we should provide custom derive macro authors and users ways to do these things in a less hacky way.
clarfonthey on Mar 1, 2018
Contributor
Don't attributes allow arbitrary token trees now? This should make this acceptable.
Centril on Mar 2, 2018
Author
Contributor
@clarcharr I don't think so, but I'm not sure - I'd love to be proven wrong =)
CAD97 on Jun 8, 2019
Attributes do allow arbitrary token trees between their brackets now.
Did anyone think about automatically detecting whether the bounds on the type parameters are required?
I think this should at least be listed as a theoretical alternative. The downside would be that it's less visible (especially if the fields are private) and easier to accidentally break APIs.
The proposed syntax seems to be the "safe" option; but it's also quite heavy to use imho.
Another alternative would be allowing things like:
#[derive] impl<T> Clone for MyArc<T>;
Or in more crazy scenarios:
#[derive] impl<T: UnrelatedTrait> Clone for MyArc<T>;
This probably requires more lines of code, but is imho better readable, and allows more powerful constructs.
Did anyone think about automatically detecting whether the bounds on the type parameters are required?
That cannot be done, that's why this is not listed as an alternative, deriving is a syntactical thing and syntactical things don't have any type knowledge.
Another alternative would be allowing things like:
What is impl<T> Clone for MyArc<T>;
supposed to mean? Why is the derive
attribute empty?
This probably requires more lines of code, but is imho better readable, and allows more powerful constructs.
Not being able to know the bounds just from reading the code isn't more readable IMO.
Since all those derives (clone, copy, eq, cmp, ...) rely entirely on forwarding the implementation in some way to the struct's fields, wouldn't it always work if we never add bounds and instead always bind on field types? So basically the equivalent of derive_no_bound
+derive_field_bound
, always, everywhere.
Since all those derives (clone, copy, eq, cmp, ...) rely entirely on forwarding the implementation in some way to the struct's fields, wouldn't it always work if we never add bounds and instead always bind on field types?
No, because of privacy. You can't have a where clause involving a private type in a public one. And changing that now would be a breaking change anyway.
What is
impl<T> Clone for MyArc<T>;
supposed to mean? Why is thederive
attribute empty?
And there I thought that part would be obvious...
#[derive] impl<T: Clone> Clone for MyArc<T>;
would be the equivalent to what
#[derive(Clone)] struct MyArc<T>(...);
is today: it should automatically derive the trait implementation. The derive attribute is empty, because the trait to be derived is already given.
[...] deriving is a syntactical thing [...]
Hm right; this means my idea only works if the compiler manages to find the type definition in that stage. Maybe restricting this to only be allowed in the same file after the type definition would make this work?
This probably requires more lines of code, but is imho better readable, and allows more powerful constructs.
Not being able to know the bounds just from reading the code isn't more readable IMO.
This second idea is unrelated to my first about automatically finding the bounds; this one requires writing the bounds just like in a normal implementation, but allows deriving the implementation even if the bounds are more complex.
You're trying to solve the more complex cases with derive_field_bound
, but this only works if the field type is public, as you said yourself.
Sorry about the confusion.
And there I thought that part would be obvious...
Things need to be exhaustively described in a RFC, we don't have room for confusion.
Deriving the trait implementation from an impl
block without a body looks like a cute idea, but this seems unrelated to the RFC at hand, and would result in more verbose code than the attributes mentioned here, given it requires writing the type name again, its where
clause if any (that's alleviated by implied bounds but that's not stable yet), and for use cases needing #[derive_field_bound]
, you would end up having to rewrite the field types too. Finally, that idea does not let you control the bounds for multiple derived traits at once.
I think this idea should be the way to go if someone needs to go beyond what's provided by the attributes in this RFC, but it doesn't ultimately supersede it.
Hm right; this means my idea only works if the compiler manages to find the type definition in that stage. Maybe restricting this to only be allowed in the same file after the type definition would make this work?
Even in the same file, a derive macro doesn't have access to the definitions of types used in the item it's deriving things for.
No, because of privacy. You can't have a where clause involving a private type in a public one.
The privacy issue is fixable, in theory it's already fixed by #2145, it just needs implementation.
It turns out there's a harder problem - coinductive reasoning.
Deriving the trait implementation from an
impl
block without a body looks like a cute idea, but this seems unrelated to the RFC at hand,
I disagree. I think the proposed syntax is real ugly; I don't look forward to using it, although the feature itself is quite important imho.
So if there are better ways doing it, those are imho related to this RFC.
and would result in more verbose code than the attributes mentioned here, given it requires writing the type name again,
Yes. I'm fine with that compared to the syntax currently proposed.
its where clause if any (that's alleviated by implied bounds but that's not stable yet),
Doesn't look like an important argument.
and for use cases needing #[derive_field_bound], you would end up having to rewrite the field types too.
But either derive_field_bound
won't work for private types, or we can just always use it by default.
Finally, that idea does not let you control the bounds for multiple derived traits at once.
Yes, and that's why I like it. Because it's way easier to see what bounds are required for a certain trait compared to reading a big mess of attributes.
I think this idea should be the way to go if someone needs to go beyond what's provided by the attributes in this RFC, but it doesn't ultimately supersede it.
I disagree again; I'd rather not see the attributes getting implemented in the first place if a better solution can be found.
Even in the same file, a derive macro doesn't have access to the definitions of types used in the item it's deriving things for.
I'm not familiar with the implementation here, and not sure where this is in the range of "theoretically possible, but not gonna happen", "would be a lot of work" or "trivial to change". But it's certainly not impossible to implement.
I admit I have no idea how to combine my idea with proc_macros (although I'm pretty sure I wouldn't want to parse the proposed attributes either).
I disagree. I think the proposed syntax is real ugly; I don't look forward to using it, although the feature itself is quite important imho.
I look forward to using it because I actually need it or my code doesn't compile. As for the syntax being ugly, the syntax in that RFC lets me have less code duplication than your suggestion, so I don't really see what you want me to tell you.
Yes. I'm fine with that compared to the syntax currently proposed.
And I'm not because I will have around 60 types needing this.
Doesn't look like an important argument.
Well, that's your opinion. I will have around 60 types needing this.
But either
derive_field_bound
won't work for private types, or we can just always use it by default.
It won't work, but my types are public, and I will not have to write them once in the type and once in each derived impl, which is better than your proposal.
Yes, and that's why I like it. Because it's way easier to see what bounds are required for a certain trait compared to reading a big mess of attributes.
I have types that derive more than 10 traits. I have around 60 types like that.
I disagree again; I'd rather not see the attributes getting implemented in the first place if a better solution can be found.
Then show me a solution that doesn't lead to inordinate amounts of code duplication.
Edit: I forgot a verb.
My feeling is that in Rust it is preferred to have more code over having compact and unreadable code.
Maybe you could use macros to organize your types in a more compact way.
It seems you're also motivated by some time constraints, but it is not obvious why those are relevant for the Rust community.
All in all I'd still like to see the feature itself; so if the requirement is that customizing the bounds for derive needs to be done in attributes in the type definition, you're proposal seems overall fine to me.
I'm not really motivated by some time constraint, given my project is blocked on at least 2 other unstable features. I'm motivated by not duplicating code. Your idea leads to duplicating code immensely. There are even cases where #[derive(Clone)] struct Foo<...>(...);
wouldn't be translatable to #[derive] impl<...> Clone for Foo<...>
with current code, because the latter wouldn't know about where clauses in the definition of Foo<...>
.
I'm already using macros extensively, that's why I can derive more than 10 traits on them. There is not much you can do when your work is to encode more than 300 different CSS properties with a huge variety of types and intricacies.
Finally, I still don't see what's supposed to be unreadable about the attributes suggested in this RFC.
#[derive] impl<T: UnrelatedTrait> MyTrait for MyType<T>;
Of the top of my head that is an interesting extension, but I agree with @nox that it is more verbose, less local and should be used when deriving still can't figure it out, but for specific problem this RFC attempts to solve, I think it is not the better solution.
The problem with #[derive]
on impl
s is that you have even less information about the type you are trying to derive the trait for as you have zero of the type's definition. No fields, no variant, no information of other traits being derived, no other attributes, no nothing. The benefit of deriving on the type definition site is that you can infer from the fields present the behavior that the impl will have.
I disagree. I think the proposed syntax is real ugly;
By all means, if you have some other attribute names in mind or a method that is equally terse as the one proposed in the RFC, I'm all ears. Tho, I don't understand why you think it is ugly... It builds upon what we really have, attributes, so I think it is consistent.
The privacy issue is fixable, in theory it's already fixed by #2145, it just needs implementation.
I'm not sure it is morally right to fix it... seems somewhat strange to me that you should be able to refer to private types in public types's impl where clauses..?
I like this a lot but I wish there were a way to make the names less... unwieldy.
They are just a bit lengthy, but does that really matter?
@clarcharr
I like this a lot but I wish there were a way to make the names less... unwieldy.
We originally went without derive_
prefixes but added them to make things more legible... If you have some suggestions on alternative naming we're all ears =)
Oh this, with phantom typing. DO WANT!
Have you considered to talk about type roles?
I think that always_derive
is better of a name than derive_no_bound
. To me, always_derive(Default)
very clearly states "derive default so that I can always use it, regardless of T
"
The derive_
part of the names are only there because derive
is a reserved prefix for attributes (AFAIK), so it would be derive_always
.
@KrishnaSannasi I copied that almost verbatim from a supposedly valid example in the RFC. I got the impression that #[derive_field_bound(Trait)]
gives additional instructions to a macro invoked by #[derive(Trait)]
on the same type definition item, isn't it so? For this to work in the current proc macro system without more macro tricks, the first attribute has to come below the second to be visible to the derive macro.
The way this could be made to work without compiler hacks is that derive_field_bound(...)
is backed by an attribute macro that looks for a yet-unpeeled derive
attribute in the item AST passed to it. If it finds one, the macro re-injects its attribute below the derive
attribute so that the derive macro will pick it up as a helper attribute and eat it, implementing the directives. If there is no derive
attribute below, it probably means a derive
macro above is unaware of derive_field_bound
, so it has been regurgitated as an inert attribute, and so a semi-informative error can be raised saying that the feature is not supported by the combination of attributes present on the item.
Given all this, and the list of error rules every macro author is supposed to uphold when dealing with the new attributes, I think the proposed solution is unwieldy. If the only purpose of it is to determine what generics does the derived impl receive, I humbly suggest considering #2811 as a more accessible alternative for both macro authors and the macro users.
@mzabaluev whoops, I seem to have misremembered how this RFC works because it's been so long since I read it
I copied that almost verbatim from a supposedly valid example in the RFC. I got the impression that
#[derive_field_bound(Trait)]
gives additional instructions to a macro invoked by#[derive(Clone)]
on the same declaration item, isn't it so?
Yes, that is correct
For this to work in the current proc macro system without more macro tricks, the first attribute has to come below the second to be visible to the derive macro.
derive_no_bounds
could be a helper attribute for all derives, so all derives see all of the derive_no_bounds
attributes no matter what they contain or where they are.
derive_no_bounds
could be a helper attribute
Ah thanks, I did not think this works on the item level attributes alongside the macro's own attribute, too.
(Besides, could this lead to strangeness when an attribute is both a helper and also has its own macro in scope to process it?)
The way this could be made to work without compiler hacks is that
derive_field_bound(...)
is backed by an attribute macro that looks for a yet-unpeeledderive
attribute in the item AST passed to it. If it finds one, the macro re-injects its attribute below thederive
attribute so that the derive macro will pick it up as a helper attribute and eat it, implementing the directives. If there is noderive
attribute below, it probably means aderive
macro above is unaware ofderive_field_bound
, so it has been regurgitated as an inert attribute, and so a semi-informative error can be raised saying that the feature is not supported by the combination of attributes present on the item.
See the RFC:
#[derive_field_bound(Trait)]
is specified on a type definition andTrait
is registered for deriving by a custom macro which specifies#[proc_macro_derive(Trait, attributes(<attr_list>))]
where<attr_list>
does not mentionderive_field_bound
. If#[derive_field_bound]
is specified instead, then this applies to all traits derived. This also applies to#[derive_no_bound]
.
Like mentioned in #2811 (comment) I think this is mostly a shortcoming of how Clone
and other derives are implemented. E.g. for
#[derive(Clone)] struct X<T> { x: PhantomData<T> }
the generated bound is T: Clone
where it should actually be PhantomData<T>: Clone
because that's where the actual .clone()
method is called.
The problem with "just making derive
smarter" (which has been discussed before but I didn't find the comments with a super quick overview) is privacy/encapsulation.
#[derive(Clone)] pub struct PubType<T> { inner: PrivType<T>, } #[derive(Clone)] pub(self) struct PrivType<T>(T);
If the generated Clone
impl for PubType
uses where PrivType<T>: Clone
, then it both leaks the fact that PubType
is implemented by holding a PrivType
and violates the private-in-public rules.
The "most correct" "smart" version would recursively expand these bounds until reaching public types to bound, but this starts to be very magic and hard to control what implementation details are publicized (and thus what changes are breaking). Not to mention it then adds ordering constraints on derive expansion and requires much more type information than current macros which are pure over their input.
the generated bound is
T: Clone
where it should actually bePhantomData<T>: Clone
T
should be unbounded. The impl should not leak the fact that the struct has a PhantomData
inside.
But as @CAD97 says above, figuring out the loosest possible bound of T
that only refers to publicly visible types and parameters is not something that proc macros can currently do, and even if they could, this may be more magic than the macro user can reason about.
Yes that makes sense. I didn't think about private types. I still think for some simple cases the derive could be made smarter (e.g. Arc
and PhantomData
implement Clone
for every T
). But that maybe has to be done in the compiler.
I also noticed that in the Prior arts section of Haskell it says:
it is not possible to configure deriving mechanisms in Haskell
while Haskell doesn't have attributes like Rust it does have the StandaloneDeriving
extension. Which (like this RFC) allows to specify the bounds of a derived instance:
data Foo a = Bar a | Baz String deriving instance Eq a => Eq (Foo a)
Unlike a deriving declaration attached to a data declaration, GHC does not restrict the form of the data type. Instead, GHC simply generates the appropriate boilerplate code for the specified class, and typechecks it. If there is a type error, it is your problem. (GHC will show you the offending code if it has a type error.)
Source: https://downloads.haskell.org/~ghc/7.8.4/docs/html/users_guide/deriving.html#stand-alone-deriving
This is basically what this RFC is trying to accomplish, right? Although probably a bit more verbose.
Syntax like that would also allow to derive a trait for a more specific instance. E.g.
deriving instance Eq a => Eq (Foo [a])
This would maybe translate to rust syntax that looks like this:
#[derive] impl<T> Clone for Struct1<T>; #[derive] impl<T: Clone, S> Clone for Struct2<T, S>; #[derive] impl<T, S> Clone for Struct2<T, S> where T: Clone; #[derive] impl Clone for Struct1<String> where T: Clone;
which is probably a lot more repetitive than what this RFC suggests but IMHO it is also a lot more readable and basically mirrors what you would find in the generated documentation.
Edit: Also like the Haskell docs mention this would allow the proc macros responsible for deriving the trait work like they do now (generate the boilerplate) and the compiler would replace the generated bounds with the bounds given by the user.
This would maybe translate to rust syntax that looks like this:
#[derive] impl<T> Clone for Struct1<T>; #[derive] impl<T: Clone, S> Clone for Struct2<T, S>; #[derive] impl<T, S> Clone for Struct2<T, S> where T: Clone; #[derive] impl Clone for Struct1<String> where T: Clone;
This has the same expressive strengths and weaknesses as #2811 in its current shape (as of eaaa257), but will require a change in Rust syntax. I like the idea of there being a greppable impl
keyword for each generated impl, though.
#[derive_field_bound(Trait)]
is specified on a type definition andTrait
is registered for deriving by a custom macro which specifies#[proc_macro_derive(Trait, attributes(<attr_list>))]
where<attr_list>
does not mentionderive_field_bound
. If#[derive_field_bound]
is specified instead, then this applies to all traits derived. This also applies to#[derive_no_bound]
.
Should the compiler treat specially the situation when derive_field_bound
or derive_no_bound
is present, but no derive
macro applicable on the item has listed it as a helper attribute? Currently that would result in an "unknown attribute" error, which is not very helpful.
Edit: Re-reading the rule above, it does seem to call for special treatment that does not automatically fall out of the current behavior over helper attributes.
This would maybe translate to rust syntax that looks like this:
#[derive] impl<T> Clone for Struct1<T>; #[derive] impl<T: Clone, S> Clone for Struct2<T, S>; #[derive] impl<T, S> Clone for Struct2<T, S> where T: Clone; #[derive] impl Clone for Struct1<String> where T: Clone;which is probably a lot more repetitive than what this RFC suggests but IMHO it is also a lot more readable and basically mirrors what you would find in the generated documentation.
This has been suggested before and requires repeating the field types if that's what you want to put trait bounds on.
This comment has been hidden.
@rfcbot fcp close
We discussed this RFC in our "Backlog Bonanza". The consensus was that we will close it, even though we are interested in solving this general problem. The solution here doesn't seem to be 100% on target. The right next step to carry this work forward would be to create a lang team project proposal, because we would want to charter an effort to explore and write-up the design space. Thanks all.
Team member @nikomatsakis has proposed to close this. The next step is review by the rest of the tagged team members:
No concerns currently listed.
Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!
See this document for info about what commands tagged team members can give me.
@rfcbot reviewed
(I checked the names of folks who were present in the meeting.)
Copy link
rfcbot commented 12 days ago
This is now entering its final comment period, as per the review above.
Copy link
rfcbot commented 2 days ago
The final comment period, with a disposition to close, as per the review above, is now complete.
As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.
The RFC is now closed.
None yet
No milestone
Successfully merging this pull request may close these issues.
None yet
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK