Add `Ord::cmp` for primitives as a `BinOp` in MIR by scottmcm · Pull Request #11... - JOYK Joy of Geek, Geek News, Link all geek

Conversation

Member

Update: most of this OP was written months ago. See #118310 (comment) below for where we got to recently that made it ready for review.

There are dozens of reasonable ways to implement Ord::cmp for integers using comparison, bit-ops, and branches. Those differences are irrelevant at the rust level, however, so we can make things better by adding BinOp::Cmp at the MIR level:

Exactly how to implement it is left up to the backends, so LLVM can use whatever pattern its optimizer best recognizes and cranelift can use whichever pattern codegens the fastest.
By not inlining those details for every use of cmp, we drastically reduce the amount of MIR generated for derived PartialOrd, while also making it more amenable to MIR-level optimizations.

Having extremely careful if ordering to μoptimize resource usage on broadwell (#63767) is great, but it really feels to me like libcore is the wrong place to put that logic. Similarly, using subtraction tricks (#105840) is arguably even nicer, but depends on the optimizer understanding it (llvm/llvm-project#73417) to be practical. Or maybe bitor is better than add? But maybe only on a future version that has or disjoint support? And just because one of those forms happens to be good for LLVM, there's no guarantee that it'd be the same form that GCC or Cranelift would rather see -- especially given their very different optimizers. Not to mention that if LLVM gets a spaceship intrinsic -- which it should -- we'll need at least a rustc intrinsic to be able to call it.

As for simplifying it in Rust, we now regularly inline {integer}::partial_cmp, but it's quite a large amount of IR. The best way to see that is with 8811efa#diff-d134c32d028fbe2bf835fef2df9aca9d13332dd82284ff21ee7ebf717bfa4765R113 -- I added a new pre-codegen MIR test for a simple 3-tuple struct, and this PR change it from 36 locals and 26 basic blocks down to 24 locals and 8 basic blocks. Even better, as soon as the construct-Some-then-match-it-in-same-BB noise is cleaned up, this'll expose the Cmp == 0 branches clearly in MIR, so that an InstCombine (#105808) can simplify that to just a BinOp::Eq and thus fix some of our generated code perf issues. (Tracking that through today's if a < b { Less } else if a == b { Equal } else { Greater } would be much harder.)

r? @ghost
But first I should check that perf is ok with this
~~...and my true nemesis, tidy.~~

tgross35 reacted with thumbs up emojielichai and mati865 reacted with heart emoji

Add `Ord::cmp` for primitives as a `BinOp` in MIR by scottmcm · Pull Request #11...

Conversation

Recommend

上市后业绩“变脸”，已连续两年亏损云创数据收北交所问询函

What if LLMs were actually interesting to talk to?

Meta Llama 3 开源，性能超越 GPT-3.5

Case Study: 84—24

Improving authoritative DNS with the official release of Foundation DNS

Introducing MSTest SDK - Improved Configuration & Flexibility - .NET Blog

Data pipelines for the rest of us

Over-the-air TV might soon receive interactive functionality similar to streamin...

铜价等原材料价格高位运行，空调是否闻声涨价？

How IKEA adapted its design to China

About Joyk