Report retags as distinct from real memory accesses for data races by Zoxc · Pul...
source link: https://github.com/rust-lang/miri/pull/3385
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
Report retags as distinct from real memory accesses for data races #3385
Merged
Conversation
Contributor
This changes the error reporting for data races such that reference invariants are no longer reported as real read and writes.
Before:
Data race detected between (1) non-atomic write on thread `unnamed-6` and (2) non-atomic read on thread `unnamed-5` at alloc1034971+0x10c. (2) just happened here
After:
Data race detected between (1) non-atomic write on thread `unnamed-8` and (2) shared reference invariant on thread `unnamed-6` at alloc1018329+0x190. (2) just happened here
Non-atomic read accesses from the other thread don't have this information tracked so those are called some potential non-atomic read access
here.
Contributor
Author
Should I change the race tests to look for |
Member
This is great, I was just thinking that we should finally tackle this issue. :) In terms of interface, I would suggest to simply extend the existing In terms of wording, "shared reference invariant on thread" does not make a lot of sense to me. I would say something like "implicit non-atomic read due to retag". (Cc @saethlin in case you have better ideas)
The point of these tests is to ensure that we call out the right thread names and access types. So just saying "race detected" would risk regressions of those aspects. I am surprised that there are so many test failures, even for tests that involve actual accesses. Take for instance the classic
That is a completely different error. It now started mentioning the access size as well, and what is "some potential access"? There's a very clear non-atomic read access in the source. So something is definitely wrong with the diagnostics here. |
Contributor
Author
I do find calling it a read quite confusing. I did initially go looking for a real read operation when I found a race, which is what promted the PR.
The usage in rustc is more to have the
I fixed that now.
Maybe calling it "read or retag" is better? Tracking the information on reads would be nicer, but I don't know where to do that. |
Member
This is the common terminology. These are reads, they are just implicit side-effects of retags rather than explicit in the source. Calling it an "invariant" doesn't even make semantic sense. An invariant is a property that is maintained throughout execution. We're not talking about a property here but about an operation.
I don't agree, in rustc we also use the annotations to ensure that the right things show up in the message. Otherwise we'd just use
Why does "unknown" even happen? "read or retag" will be even more confusing for the common case where it is a regular read. |
Contributor
Author
Maybe calling them speculative reads would be better?
We only know the read type on the current thread, not on the other conflicting operation, so I wanted a hint to the user that it could be either. |
Member
Hm... that sounds a bit like what CPUs are doing, which is unrelated. OTOH these reads are indeed related to the compiler's ability to insert read (or write) operations. So maybe the term makes sense. But then we should also use it for the aliasing errors (eventually, not in this PR). @rust-lang/miri any opinions on terminology here?
That sounds like it could cause a lot of confusion. Is there any way we can store that one bit of information (implicit/speculative read vs explicit read) somehow so that it is available later when computing the error? We already store an entire span, storing one more bit can't be so bad... |
Contributor
Author
Speculative retag reads or retag reads could also be options that relates them back more to the borrow model.
Very likely, but I wasn't able to locate where it would go. The span logic seemed quite weird to me with it's handling of dummy spans too. |
Member
"retag read" also works for me, though using "retag" as an adjective seems a bit odd.
|
Member
I think our diagnostics should be informative for people who were completely unaware of this property of references until seeing the error. If we call them something like "a speculative retag read" we should provide a link or some other kind of reference to documentation that explains what that terminology refers to, or a separate
|
Member
Tweaking the wording of that message:
Or maybe "restrictions" instead of "models"? I don't like the use of "invariants" there, that's a specific technical term and this is too far outside its regular meaning. |
Contributor
Author
Could we put an adjective in front of "retag" instead? "shared retag" and "unique retag"? That makes it more clear that there's no correspondence to reads in the source code. |
Member
Note that whether unique retags are reads or writes depends on the aliasing model (in SB they are writes, in TB reads). Is the idea that you would the entirely relegate the explanation that "shared retag" means a form of read into the Further editing on the help text, since it is specifically for data races so I think we can remove the "aliasing model" part:
|
Contributor
Author
I stuck with |
Member
I like the wording you just pushed. |
changed the title
Report reference invariants as distinct from real memory accesses for data races
Report retags as distinct from real memory accesses for data races
This looks great, thanks! We're getting into the fine-tuning phase.
| |
||
help: and (1) occurred earlier here |
||
--> $DIR/retag_data_race_write.rs:LL:CC |
||
| |
||
LL | let _r = &mut *p; |
||
| ^^^^^^^ |
||
= help: retags permit optimizations that insert speculative (or loop-hoisted) reads/writes. Therefore from the perspective of data races, a retag has the same implications as a read/write. |
One possible issue here is that the user probably won't know what a "retag" is. For aliasing model messages at least there is context implying it's probably an aliasing thing, but here this is a data race message and people won't expect aliasing model terms.
Not sure what to do about that. We could either use a less precise but more common term, such as "reborrow", or we have to say at least a few words about what a retag is.
This is still open. Any opinion on what to do? I have a slight tendency towards just saying "reborrow" instead of "retag".
Contributor
Author
I'd probably add a short explanation of what a retag is. When would they not correspond to a reborrow?
Retags happen (almost) every time a value of reference is assigned. So e.g. if x
has type (&i32, &i32)
then passing that to a function retags both references. If y
has type &(&i32, &i32)
then let x = *y;
also retags both references in the pair.
Contributor
Author
Maybe something like retags happens due to explicit or implicit borrowing
?
Does the initial borrow also result in a retag? Presumably those could be involved in data races too.
Yes, every execution of &
/&mut
expressions results in a retag.
So... maybe you can just add another line which says something like
help: retags occur on all (re)borrows and as well as when references are copied/moved
src/concurrency/vector_clock.rs
Outdated Show resolved
src/concurrency/vector_clock.rs
Show resolved
src/concurrency/vector_clock.rs
Outdated Show resolved
src/concurrency/vector_clock.rs
Outdated Show resolved
Show resolved
Show resolved
Show resolved
Outdated Show resolved
tests/fail/stacked_borrows/retag_data_race_protected_read.stderr
Outdated Show resolved
Member
@rustbot author |
added the S-waiting-on-author Status: Waiting for the PR author to address review comments label
Outdated
Comment on lines
274 to 276
helps.push((None, "retags occur on all (re)borrows and as well as when references are copied or moved. \ |
||
Retags permit optimizations that insert speculative reads or writes. \ |
||
Therefore from the perspective of data races, a retag has the same implications as a read or write.".to_owned())); |
helps.push((None, "retags occur on all (re)borrows and as well as when references are copied or moved. \ | |
Retags permit optimizations that insert speculative reads or writes. \ | |
Therefore from the perspective of data races, a retag has the same implications as a read or write.".to_owned())); | |
helps.push((None, "retags permit optimizations that insert speculative reads or writes. \ | |
Therefore, a retag has the same implications as a read or write from the perspective of data races.".to_owned())); | |
helps.push((None, "retags occur on all (re)borrows and as well as when references are copied or moved".to_owned())); |
Contributor
Author
Should the lines be split to one help message each? Why would why is a retag?
be answered before what is a retag?
?
I was thinking we would start with "why is this a race" -- because of speculative reads/writes. Then we can explain where those come from. But you are right, the other order would also make sense.
Splitting into 3 messages also works for me, I just felt it was too long for 1.^^
Member
Just one final tweak of the message. Then please squash all the commits. |
Member
Looks good, thanks for working through all these revisions with me. :) |
Member
@bors r+ |
Collaborator
☀️ Test successful - checks-actions |
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
No one assigned
None yet
No milestone
Successfully merging this pull request may close these issues.
None yet
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK