Report retags as distinct from real memory accesses for data races #3385

Merged

Conversation

Contributor

This changes the error reporting for data races such that reference invariants are no longer reported as real read and writes.

Before:

Data race detected between (1) non-atomic write on thread `unnamed-6` and (2) non-atomic read on thread `unnamed-5` at alloc1034971+0x10c. (2) just happened here

After:

Data race detected between (1) non-atomic write on thread `unnamed-8` and (2) shared reference invariant on thread `unnamed-6` at alloc1018329+0x190. (2) just happened here

Non-atomic read accesses from the other thread don't have this information tracked so those are called some potential non-atomic read access here.

RalfJung reacted with heart emoji

Contributor

Author

Should I change the race tests to look for race detected? Matching the entire string as they currently do seems quite redundant.

Member

This is great, I was just thinking that we should finally tackle this issue. :)

In terms of interface, I would suggest to simply extend the existing read with an NaReadType parameter. I don't quite see why we'd want functions like assert_immutable_memory, that just makes the API surface one has to understand so much bigger. (I see for write we have this internal unique_access function; I think it'd be fine to remove this and just rename unique_access to write and expose that. In fact na_read and na_write or non_atomic_read/non_atomic_write would probably be better.)

In terms of wording, "shared reference invariant on thread" does not make a lot of sense to me. I would say something like "implicit non-atomic read due to retag". (Cc @saethlin in case you have better ideas)

Should I change the race tests to look for race detected? Matching the entire string as they currently do seems quite redundant.

The point of these tests is to ensure that we call out the right thread names and access types. So just saying "race detected" would risk regressions of those aspects.

I am surprised that there are so many test failures, even for tests that involve actual accesses. Take for instance the classic read_write_race.rs:

  -error: Undefined Behavior: Data race detected between (1) non-atomic read on thread `unnamed-ID` and (2) non-atomic write on thread `unnamed-ID` at ALLOC. (2) just happened here
  +error: Undefined Behavior: Data race detected between (1) some potential non-atomic read access on thread `unnamed-ID` and (2) 4-byte non-atomic write on thread `unnamed-ID` at ALLOC. (2) just happened here

That is a completely different error. It now started mentioning the access size as well, and what is "some potential access"? There's a very clear non-atomic read access in the source. So something is definitely wrong with the diagnostics here.

Contributor

Author

implicit non-atomic read due to retag

I do find calling it a read quite confusing. I did initially go looking for a real read operation when I found a race, which is what promted the PR.

The point of these tests is to ensure that we call out the right thread names and access types. So just saying "race detected" would risk regressions of those aspects.

The usage in rustc is more to have the .stderr catch such regressions and the test source code just ensures the right kind and location.

It now started mentioning the access size as well

I fixed that now.

and what is "some potential access"

Maybe calling it "read or retag" is better? Tracking the information on reads would be nicer, but I don't know where to do that.

Member

I do find calling it a read quite confusing. I did initially go looking for a real read operation when I found a race, which is what promted the PR.

This is the common terminology. These are reads, they are just implicit side-effects of retags rather than explicit in the source.

Calling it an "invariant" doesn't even make semantic sense. An invariant is a property that is maintained throughout execution. We're not talking about a property here but about an operation.

The usage in rustc is more to have the .stderr catch such regressions and the test source code just ensures the right kind and location.

I don't agree, in rustc we also use the annotations to ensure that the right things show up in the message. Otherwise we'd just use //~ERROR and never //~ERROR: message.

Maybe calling it "read or retag" is better? Tracking the information on reads would be nicer, but I don't know where to do that.

Why does "unknown" even happen? "read or retag" will be even more confusing for the common case where it is a regular read.

Contributor

Author

they are just implicit side-effects of retags rather than explicit in the source.

Maybe calling them speculative reads would be better?

Why does "unknown" even happen?

We only know the read type on the current thread, not on the other conflicting operation, so I wanted a hint to the user that it could be either.

Member

Maybe calling them speculative reads would be better?

Hm... that sounds a bit like what CPUs are doing, which is unrelated. OTOH these reads are indeed related to the compiler's ability to insert read (or write) operations. So maybe the term makes sense. But then we should also use it for the aliasing errors (eventually, not in this PR).

@rust-lang/miri any opinions on terminology here?

We only know the read type on the current thread, not on the other conflicting operation, so I wanted a hint to the user that it could be either.

That sounds like it could cause a lot of confusion. Is there any way we can store that one bit of information (implicit/speculative read vs explicit read) somehow so that it is available later when computing the error? We already store an entire span, storing one more bit can't be so bad...

Contributor

Author

Hm... that sounds a bit like what CPUs are doing, which is unrelated. OTOH these reads are indeed related to the compiler's ability to insert read (or write) operations. So maybe the term makes sense

Speculative retag reads or retag reads could also be options that relates them back more to the borrow model.

Is there any way we can store that one bit of information (implicit/speculative read vs explicit read) somehow so that it is available later when computing the error?

Very likely, but I wasn't able to locate where it would go. The span logic seemed quite weird to me with it's handling of dummy spans too.

Member

"retag read" also works for me, though using "retag" as an adjective seems a bit odd.

Very likely, but I wasn't able to locate where it would go. The span logic seemed quite weird to me with it's handling of dummy spans too.

VTimestamp probably needs a bool that indicates whether the read is "real" or not. That should likely be packed into time to avoid making it 4 bytes bigger.

Member

I think our diagnostics should be informative for people who were completely unaware of this property of references until seeing the error. If we call them something like "a speculative retag read" we should provide a link or some other kind of reference to documentation that explains what that terminology refers to, or a separate help: message that says something like:

help: Retags permit optimizations that insert speculative (or hoisted) reads/writes. Therefore from the perspective of aliasing and data race invariants, a retag has the same implications as a read/write.

Member

Tweaking the wording of that message:

help: Retags permit optimizations that insert speculative (or loop-hoisted) reads/writes. Therefore from the perspective of aliasing and data race models, a retag has the same implications as a read/write.

Or maybe "restrictions" instead of "models"? I don't like the use of "invariants" there, that's a specific technical term and this is too far outside its regular meaning.

Contributor

Author

"retag read" also works for me, though using "retag" as an adjective seems a bit odd.

Could we put an adjective in front of "retag" instead? "shared retag" and "unique retag"? That makes it more clear that there's no correspondence to reads in the source code.

Member

Note that whether unique retags are reads or writes depends on the aliasing model (in SB they are writes, in TB reads).

Is the idea that you would the entirely relegate the explanation that "shared retag" means a form of read into the help? That would also work for me.

Further editing on the help text, since it is specifically for data races so I think we can remove the "aliasing model" part:

help: Retags permit optimizations that insert speculative (or loop-hoisted) reads/writes. Therefore from the perspective of data races, a retag has the same implications as a read/write.

Contributor

Author

Note that whether unique retags are reads or writes depends on the aliasing model (in SB they are writes, in TB reads).

I stuck with retag read and retag write then. It's useful from the data race perspective to know if they're reads or writes.

saethlin reacted with thumbs up emoji

Member

I like the wording you just pushed.

RalfJung

changed the title ~~Report reference invariants as distinct from real memory accesses for data races~~

Report retags as distinct from real memory accesses for data races

Mar 18, 2024

Member

RalfJung

left a comment

This looks great, thanks! We're getting into the fine-tuning phase.

tests/fail/both_borrows/retag_data_race_write.stack.stderr

Outdated

		\|
		help: and (1) occurred earlier here
		--> $DIR/retag_data_race_write.rs:LL:CC
		\|
		LL \| let _r = &mut *p;
		\| ^^^^^^^
		= help: retags permit optimizations that insert speculative (or loop-hoisted) reads/writes. Therefore from the perspective of data races, a retag has the same implications as a read/write.

One possible issue here is that the user probably won't know what a "retag" is. For aliasing model messages at least there is context implying it's probably an aliasing thing, but here this is a data race message and people won't expect aliasing model terms.

Not sure what to do about that. We could either use a less precise but more common term, such as "reborrow", or we have to say at least a few words about what a retag is.

This is still open. Any opinion on what to do? I have a slight tendency towards just saying "reborrow" instead of "retag".

Contributor

Author

I'd probably add a short explanation of what a retag is. When would they not correspond to a reborrow?

Retags happen (almost) every time a value of reference is assigned. So e.g. if x has type (&i32, &i32) then passing that to a function retags both references. If y has type &(&i32, &i32) then let x = *y; also retags both references in the pair.

Contributor

Author

Maybe something like retags happens due to explicit or implicit borrowing?

Does the initial borrow also result in a retag? Presumably those could be involved in data races too.

Yes, every execution of &/&mut expressions results in a retag.

So... maybe you can just add another line which says something like

help: retags occur on all (re)borrows and as well as when references are copied/moved

src/concurrency/vector_clock.rs

Outdated Show resolved

src/concurrency/vector_clock.rs

Show resolved

src/concurrency/vector_clock.rs

Outdated Show resolved

src/concurrency/vector_clock.rs

Outdated Show resolved

src/concurrency/data_race.rs

Show resolved

src/concurrency/data_race.rs

Show resolved

src/concurrency/data_race.rs

Show resolved

src/concurrency/data_race.rs

Outdated Show resolved

tests/fail/stacked_borrows/retag_data_race_protected_read.stderr

Outdated Show resolved

Member

@rustbot author

rustbot

added the S-waiting-on-author Status: Waiting for the PR author to address review comments label

Mar 19, 2024

src/diagnostics.rs

Outdated

Comment on lines

274 to 276

		helps.push((None, "retags occur on all (re)borrows and as well as when references are copied or moved. \
		Retags permit optimizations that insert speculative reads or writes. \
		Therefore from the perspective of data races, a retag has the same implications as a read or write.".to_owned()));

Suggested change

	helps.push((None, "retags occur on all (re)borrows and as well as when references are copied or moved. \
	Retags permit optimizations that insert speculative reads or writes. \
	Therefore from the perspective of data races, a retag has the same implications as a read or write.".to_owned()));
	helps.push((None, "retags permit optimizations that insert speculative reads or writes. \
	Therefore, a retag has the same implications as a read or write from the perspective of data races.".to_owned()));
	helps.push((None, "retags occur on all (re)borrows and as well as when references are copied or moved".to_owned()));

Contributor

Author

Should the lines be split to one help message each? Why would why is a retag? be answered before what is a retag??

I was thinking we would start with "why is this a race" -- because of speculative reads/writes. Then we can explain where those come from. But you are right, the other order would also make sense.

Splitting into 3 messages also works for me, I just felt it was too long for 1.^^

Member

Just one final tweak of the message. Then please squash all the commits.

Member

Looks good, thanks for working through all these revisions with me. :)
Please squash, then we can land this.

Member

@bors r+

Collaborator

📌 Commit 3cf20a2 has been approved by RalfJung

It is now in the queue for this repository.

Collaborator

⌛ Testing commit 3cf20a2 with merge 3670823...

Collaborator

☀️ Test successful - checks-actions
Approved by: RalfJung
Pushing 3670823 to master...

Reviewers

RalfJung

RalfJung left review comments

Assignees

No one assigned

Labels

S-waiting-on-author Status: Waiting for the PR author to address review comments

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

None yet

5 participants

Report retags as distinct from real memory accesses for data races by Zoxc · Pul...

Report retags as distinct from real memory accesses for data races #3385

Conversation

Recommend

2024年投资机构高管或更偏向食品饮料行业，中泰证券和中金建议关注

Building a Notification Service in Rust with AWS SNS

Replace closures with `_` when suggesting fully qualified path for method call b...

可为未拆封 iPhone 无线更新 iOS 系统，苹果零售店“Presto”手机 OTA 设备外观曝光

Handle str literals written with `'` lexed as lifetime by estebank · Pull Reques...

Ensure nested statics have a HIR node to prevent various queries from ICEing by...

BAE demos platform that gives Army AMPVs turret system options - Top Tech News

How I reduced (incremental) Rust compile times by up to 40%

Kylie Jenner全新个人服饰品牌Khy在伦敦Selfridges开设门店，这是品牌在线下的首个销...

Don't ICE when encountering bound regions in generator interior type by compiler...

About Joyk