3

[perf] cache type info for ParamEnv by lukas-code · Pull Request #123058 · rust-...

 1 week ago
source link: https://github.com/rust-lang/rust/pull/123058
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

[perf] cache type info for ParamEnv #123058

Conversation

Contributor

This is an attempt to mitigate some of the perf regressions in #122553 (comment), but seems worth to test and land separately, since it is mostly unrelated to that PR.

rustbot

added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

labels

Mar 25, 2024

Comment on lines

191 to 214

let data_ptr = ptr::addr_of!(self.skel.data).cast::<T>();

// SAFETY: `data_ptr` has the same provenance as `self` and can therefore

// access the `self.len` elements stored at `self.data`.

// Note that we specifically don't reborrow `&self.skel.data`, because that

// would give us a pointer with provenance over 0 bytes.

unsafe { slice::from_raw_parts(data_ptr, self.skel.len) }

Contributor

Author

This was UB before.

Member

lukas-code reacted with heart emoji

This comment has been minimized.

rustbot

added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label

Mar 25, 2024

Contributor

⌛ Trying commit 4a99cdc with merge be2a772...

Contributor

💥 Test timed out

lukas-code reacted with thumbs down emoji

Contributor

Author

uuh idk what bors is on about, the artifact looks like it got built? Let's try this:

☀️ Try build successful - checks-actions
Build commit: be2a772 (be2a772ea7aa3f1114b45c8fb9dd2a2466a5c152)

This comment has been minimized.

Contributor

Author

well that worked i guess?

Collaborator

Finished benchmarking commit (be2a772): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.4% [0.3%, 0.6%] 4
Regressions ❌
(secondary)
0.5% [0.5%, 0.5%] 3
Improvements ✅
(primary)
-1.7% [-5.0%, -0.2%] 20
Improvements ✅
(secondary)
-0.3% [-0.4%, -0.2%] 4
All ❌✅ (primary) -1.3% [-5.0%, 0.6%] 24

Max RSS (memory usage)

Results

Cycles

Results

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 672.58s -> 671.55s (-0.15%)
Artifact size: 315.08 MiB -> 315.13 MiB (0.01%)

Contributor

Author

The regression in token-stream-stress looks like it's probably noise:

image

The regressions in bitmaps might be legit, but to me it seems like the improvements outweigh the regressions here.

r? @lcnr or compiler

Contributor

☔ The latest upstream changes (presumably #123098) made this pull request unmergeable. Please resolve the merge conflicts.

Contributor

☔ The latest upstream changes (presumably #123327) made this pull request unmergeable. Please resolve the merge conflicts.

#[repr(C)]

pub struct ListWithCachedTypeInfo<T> {

skel: ListWithCachedTypeInfoSkeleton<T>,

opaque: OpaqueListContents,

Contributor

why is that moved out of ListSkeleton?

Contributor

Author

Because accessing a field with an extern type tail will always panic, because we don't know the align of the extern type and therefore the padding before the field. So basically if we just put List as the tail of ListWithCachedTypeInfo, then any attempt to access the list will panic.

Contributor

that does not seem right to me 🤔 it should be totally fine to access an unsized field which contains an extern type as long as you never try to do a place-to-value conversion. I don't think we should ever copy ListSkeleton to the stack. Which use of ListSkeleton causes issues if you keep the current setup of having OpaqueListContents as a field of it?

https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=a47ebf1e7a6ce3908b3ab3fcdd6ecc8d

Contributor

Author

Hmmm apparently there is a special case if the field with an extern type tail is the only field. Which i guess makes sense since there can't be any padding in that case. If you change the outer type to have some extra data, your example will panic: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=92bafecde22b9fd72a3ddb9d9018bc68

I don't think we should ever copy ListSkeleton to the stack

I agree. If my code does that it's not intended.

Contributor

@lcnr lcnr

left a comment

r=me after explaining the changes in ty/list.rs to me 🤔

Contributor

@lcnr lcnr

left a comment

🤔 the panic with extern types is annoying, but I understand the issue now.

Looking at this again, we now duplicate the unsafe code for List which is somewhat intricate and I am a bit worried that it ends up getting out of sync. Could we instead have a single type:

struct RawList<H, T> {
    header: H,
    length: usize,
    data: [T; 0],
    opaque: OpaqueListConten,
}

This should avoid both the "extern type alignment" issue and aloows us to remove the code duplication.

Sorry for not thinking of that during the first review round 😅

lukas-code reacted with thumbs up emoji

Contributor

@bors r+ rollup=never

Contributor

📌 Commit 94d61d8 has been approved by lcnr

It is now in the queue for this repository.

bors

added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.

and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.

labels

Apr 6, 2024

Contributor

⌛ Testing commit 94d61d8 with merge 087ae97...

Contributor

☀️ Test successful - checks-actions
Approved by: lcnr
Pushing 087ae97 to master...

Collaborator

Finished benchmarking commit (087ae97): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.9% [-4.9%, -0.2%] 47
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -0.9% [-4.9%, -0.2%] 47

Max RSS (memory usage)

Results

Cycles

Results

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 669.732s -> 668.846s (-0.13%)
Artifact size: 318.22 MiB -> 318.48 MiB (0.08%)

lukas-code reacted with hooray emoji

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Reviewers

lcnr

lcnr left review comments
Assignees

lcnr

Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects

None yet

Milestone

1.79.0

Development

Successfully merging this pull request may close these issues.

None yet

6 participants

About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK