add a deep fast_reject routine #97345

Conversation

Contributor

lcnr commented 9 days ago

continues the work on #97136.

r? @nnethercote

Actually agree with you on the match structure let's see how that impacted perf

rustbot

added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label

9 days ago

rust-highfive

added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label

9 days ago

lcnr

marked this pull request as ready for review

9 days ago

Contributor

Author

rust-timer commented 9 days ago

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

rustbot

added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label

9 days ago

Contributor

bors commented 9 days ago

Trying commit 9fe410f with merge 5dd29bd...

Contributor

Author

rust-timer commented 9 days ago

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

Contributor

bors commented 9 days ago

Trying commit c2a5d73 with merge b1f8b07...

Contributor

Author

This comment has been minimized.

Contributor

bors commented 9 days ago

Try build successful - checks-actions
Build commit: b1f8b07 (b1f8b07304a489ac1719a510512c8f14ac90de0f)

Collaborator

rust-timer commented 9 days ago

Collaborator

rust-timer commented 9 days ago

Finished benchmarking commit (b1f8b07): comparison url.

Instruction count

Primary benchmarks: relevant improvements found
Secondary benchmarks: mixed results

Regressions (primary)	Regressions (secondary)	Improvements (primary)	Improvements (secondary)	All (primary)
count1	0	8	45	8	45
mean2	N/A	0.5%	-7.5%	-0.6%	-7.5%
max	N/A	0.7%	-42.4%	-0.8%	-42.4%

Cycles

Results

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf -perf-regression

Footnotes

number of relevant changes
the arithmetic mean of the percent change

est31 reacted with rocket emojijackh726 reacted with eyes emoji

rustbot

removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label

9 days ago

Contributor

Author

lcnr commented 9 days ago

ideally we extend that deep fast reject to also use it for coherence. That should probably give us some further speedups

For that we need a deep fast reject where both sides are inference vars, implementing that now

Contributor

Author

lcnr commented 9 days ago

Collaborator

rust-timer commented 9 days ago

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

rustbot

added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label

9 days ago

Contributor

bors commented 9 days ago

Trying commit df22932 with merge 3cd8dbe...

Contributor

Author

lcnr commented 9 days ago

Collaborator

rust-timer commented 9 days ago

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

Contributor

Author

lcnr commented 9 days ago

Contributor

jackh726 commented 8 days ago

So, why does bitmaps get crazy wins here where other benchmarks are just "modest"?

Contributor

nnethercote commented 8 days ago

So, why does bitmaps get crazy wins here where other benchmarks are just "modest"?

It has a type struct BitsImpl<const N: usize>, a trait Bits, and then (via macros) 1024(!) impls of Bits for BitsImpl<1> through BitsImpl<1024>. The compiler ends up comparing all those impls for overlap.

This Zulip thread has more details.

compiler/rustc_middle/src/ty/fast_reject.rs

Outdated

		ty::Adt(obl_def, obl_substs) => match k {
		&ty::Adt(impl_def, impl_substs) => {
		obl_def == impl_def
		&& iter::zip(obl_substs, impl_substs)

I guess if obl_def == impl_def then the substs are guaranteed to have the same length?

Contributor

Author

lcnr 8 days ago

compiler/rustc_middle/src/ty/fast_reject.rs

Outdated

		}
		_ => false,
		},
		ty::Slice(obl_ty) => matches!(k, &ty::Slice(impl_ty) if types_may_unify(obl_ty, impl_ty)),

You could use matches!-with-an-if in more of these cases... probably good to have consistency one way or the other. Since this is the only one at the moment, maybe a match would be better here?

Contributor

Author

lcnr 8 days ago

my issue with matches is that it somewhat breaks formatting which is annoying.

Making this consistent is a bit annoying going to keep it this way for now '^^

compiler/rustc_middle/src/ty/fast_reject.rs

Outdated

		// Ideally we would walk the existential predicates here or at least
		// compare their length. But considering that the relevant `Relate` impl
		// actually sorts and deduplicates these, that doesn't work.
		matches!(k, ty::Dynamic(impl_preds, ..) if

Oh, this one is a matches! too.

Contributor

nnethercote commented 8 days ago

Great additional improvements on bitmaps.

Even better, I measured with a trunk version of nalgebra. My changes in #97136 got compile time reductions of up to 10%, but with your additional coherence changes it's now more like 50%! (cc @lqd)

The changes mostly look good. I have a few nits above, I'll let you decide how to deal with them. But please squash together commits 1, 3, and 4 before merging.

lqd reacted with hooray emoji

Contributor

nnethercote commented 8 days ago

@bors delegate=lcnr

oli-obk reacted with laugh emoji

Contributor

bors commented 8 days ago

@lcnr can now approve this pull request

Contributor

nnethercote commented 8 days ago

I measured some benchmarks that we know from this analysis have some similarities to bitmaps:

aes-gcm-0.9.4
bigdecimal-0.3.0
bitmaps-3.1.0
bytemuck-1.7.3
bytestring-1.0.0
hex-0.4.3
lzw-0.10.0
nalgebra-0.30.1
num-complex-0.4.0
ordered-float-2.10.0
pbkdf2-0.10.0
quickcheck-1.0.3
rustc-rayon-0.3.2
scroll-0.11.0
secrecy-0.8.0
sha3-0.10.0
strsim-0.10.0

Instruction counts measured locally for check builds:

Benchmark	Profile	Scenario	% Change	Significance Factor?
bitmaps-3.1.0	check	full	-68.61%	343.05x
bitmaps-3.1.0	check	incr-full	-62.70%	313.49x
nalgebra-0.30.1	check	incr-full	-59.27%	296.36x
nalgebra-0.30.1	check	full	-57.36%	286.78x
secrecy-0.8.0	check	full	-38.76%	193.81x
secrecy-0.8.0	check	incr-full	-31.52%	157.60x
hex-0.4.3	check	full	-22.94%	114.71x
hex-0.4.3	check	incr-full	-20.01%	100.05x
bytemuck-1.7.3	check	full	-15.27%	76.34x
ordered-float-2.10.0	check	full	-13.14%	65.68x
bytemuck-1.7.3	check	incr-full	-12.62%	63.12x
num-complex-0.4.0	check	full	-12.47%	62.35x
ordered-float-2.10.0	check	incr-full	-10.83%	54.13x
num-complex-0.4.0	check	incr-full	-10.59%	52.97x
bitmaps-3.1.0	check	incr-patched: println	-10.05%	50.25x
bitmaps-3.1.0	check	incr-unchanged	-9.75%	48.75x
bytestring-1.0.0	check	full	-8.42%	42.11x
rustc-rayon-0.3.2	check	full	-8.02%	40.12x
rustc-rayon-0.3.2	check	incr-full	-6.86%	34.31x
bytestring-1.0.0	check	incr-full	-6.54%	32.69x
pbkdf2-0.10.0	check	full	-5.76%	28.81x
bigdecimal-0.3.0	check	full	-5.63%	28.17x
bigdecimal-0.3.0	check	incr-full	-4.92%	24.59x
strsim-0.10.0	check	full	-4.92%	24.59x
pbkdf2-0.10.0	check	incr-full	-4.56%	22.80x
strsim-0.10.0	check	incr-full	-3.80%	19.02x
rustc-rayon-0.3.2	check	incr-unchanged	-1.83%	9.17x
aes-gcm-0.9.4	check	full	-1.24%	6.18x
bigdecimal-0.3.0	check	incr-unchanged	-1.22%	6.10x
aes-gcm-0.9.4	check	incr-full	-0.95%	4.74x
hex-0.4.3	check	incr-unchanged	-0.94%	4.68x
ordered-float-2.10.0	check	incr-unchanged	-0.86%	4.30x
num-complex-0.4.0	check	incr-unchanged	-0.82%	4.09x
secrecy-0.8.0	check	incr-unchanged	-0.77%	3.87x
sha3-0.10.0	check	full	-0.77%	3.86x
scroll-0.11.0	check	full	-0.70%	3.50x
sha3-0.10.0	check	incr-full	-0.58%	2.90x
lzw-0.10.0	check	full	-0.52%	2.59x
scroll-0.11.0	check	incr-full	-0.51%	2.54x
nalgebra-0.30.1	check	incr-unchanged	-0.44%	2.20x
lzw-0.10.0	check	incr-full	-0.31%	1.54x

rylev and memoryruins reacted with heart emojiluqmana, Kobzol, est31, and memoryruins reacted with rocket emoji

This comment has been minimized.

Contributor

Author

lcnr commented 8 days ago

@bors r=nnethercote

Contributor

bors commented 8 days ago

Commit bff7b51 has been approved by nnethercote

bors

added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.

and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.

labels

8 days ago

Contributor

bors commented 8 days ago

Testing commit bff7b51 with merge 4a99c5f...

Contributor

bors commented 8 days ago

Test successful - checks-actions
Approved by: nnethercote
Pushing 4a99c5f to master...

bors

added the merged-by-bors This PR was explicitly merged by bors label

8 days ago

bors

merged commit 4a99c5f into

rust-lang:master 8 days ago

11 checks passed

rustbot

added this to the 1.63.0 milestone

8 days ago

Collaborator

rust-timer commented 8 days ago

Finished benchmarking commit (4a99c5f): comparison url.

Instruction count

Primary benchmarks: relevant improvements found
Secondary benchmarks: mixed results

mean1	max	count2
Regressions (primary)	N/A	N/A	0
Regressions (secondary)	0.9%	1.6%	10
Improvements (primary)	-11.0%	-65.8%	45
Improvements (secondary)	-0.5%	-0.9%	12
All (primary)	-11.0%	-65.8%	45

Max RSS (memory usage)

Results

Cycles

Results

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

@rustbot label: -perf-regression

Footnotes

the arithmetic mean of the percent change
number of relevant changes

luqmana, memoryruins, and BlackHoleFox reacted with hooray emoji

Labels

merged-by-bors This PR was explicitly merged by bors S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Projects

None yet

Milestone

1.63.0

Development

Successfully merging this pull request may close these issues.

None yet

8 participants

add a deep fast_reject routine by lcnr · Pull Request #97345 · rust-lang/rust ·...

add a deep fast_reject routine #97345

Conversation

lcnr commented 9 days ago

lcnr commented 9 days ago

rust-timer commented 9 days ago

bors commented 9 days ago

lcnr commented 9 days ago

rust-timer commented 9 days ago

bors commented 9 days ago

lcnr commented 9 days ago

This comment has been minimized.

bors commented 9 days ago

rust-timer commented 9 days ago

rust-timer commented 9 days ago

Footnotes

lcnr commented 9 days ago

lcnr commented 9 days ago

rust-timer commented 9 days ago

bors commented 9 days ago

lcnr commented 9 days ago

rust-timer commented 9 days ago

lcnr commented 9 days ago

jackh726 commented 8 days ago

nnethercote commented 8 days ago

nnethercote commented 8 days ago

nnethercote commented 8 days ago

bors commented 8 days ago

nnethercote commented 8 days ago

This comment has been minimized.

lcnr commented 8 days ago

bors commented 8 days ago

bors commented 8 days ago

bors commented 8 days ago

rust-timer commented 8 days ago

Footnotes

Recommend

About Joyk