Tracking Issue for total_cmp (on f32/f64) · Issue #72599 · rust-lang/rust · GitH... - JOYK Joy of Geek, Geek News, Link all geek

Copy link

Contributor

golddranks commented on May 26, 2020 •

edited by m-ou-se

This is a tracking issue for the APIs f32::total_cmp, f64::total_cmp ( #72568 )
The feature gate for the issue is #![feature(total_cmp)].

Overview of the API

Implements method total_cmp on f32 and f64. This method implements a float comparison that, unlike the standard partial_cmp, is total (defined on all values) in accordance to the IEEE 754 (rev 2008) §5.10 totalOrder predicate.
The method has an API similar to cmp: pub fn total_cmp(&self, other: &Self) -> crate::cmp::Ordering { ... }.

Steps

Implement the RFC
Stabilization PR
- Update docs to not promise more than we can.

Unresolved Questions

The exact ordering required by IEEE 754, revisions 2008 and 2019, isn't followed on some Tier 2 platforms, namely, older MIPS systems. The problem is that on those platforms, the meaning of the "signaling NaN" and "quiet NaN" bit is reversed. This means the order of those NaNs is reverse on those platfroms. Newer MIPS chips have a flag for the user to choose the behaviour. Should we add a special mention about this in the documentation, or just ignore the problem, as it's 1) not very common (being Tier 2 and all) 2) minor (reversal of order of sNaN and qNaN is unlikely to matter in real-life situations, and in those kinds of situations, the user is usually aware of the problem) 3) going away. (The newer systems addressing this problem.)

#73328

Implementation history

#72568

Copy link

Contributor

hanna-kruppe commented on May 26, 2020

Another reason to ignore the "signaling bit is flipped on older MIPS" issue: as far as I know, there is essentially no way to actually tell signaling and quiet NaNs apart in current or near-future Rust. You can of course inspect the bit in question with to_bits, but nothing tells you how that bit is interpreted by the hardware. Debug printing, f32::classify, etc. do not distinguish sNaN from qNaN, and Rust neither exposes the distinction between quiet/signaling comparisons not any way to check for floating point exceptions those predicates may raise depending on sNaN/qNaN operands.

Copy link

Member

sfackler commented on May 27, 2020

Yeah, I think that the best we can do is just stick something in the docs noting the issue.

Copy link

Contributor

est31 commented on Jun 14, 2020

The exact ordering required by IEEE 754, revisions 2008 and 2019, isn't followed on some Tier 2 platforms, namely, older MIPS systems. The problem is that on those platforms, the meaning of the "signaling NaN" and "quiet NaN" bit is reversed

There are multiple bit pattern definitions of "signaling NaN" and "quiet NaN":

The "flipped" definition of older MIPS standards
The definitions of IEEE 754-2008 and IEEE 754-2019

The two definitions disagree, and no implementation of totalOrder can be conformant with both. But the current implementation of totalOrder is conformant with the IEEE definition. Why should you use the MIPS standard definition to determine conformance to an IEEE standard?

I'd filed a PR in 2017 to add some docs on NaN to the {from,to}_bits functions but closed it because I thought it wouldn't matter due to those devices being phased out. #46246

Copy link

Contributor

ecstatic-morse commented on Jun 15, 2020 •

edited

At present, the behavior of this comparator is unspecified, since the bits inside of a NaN are not guaranteed to be preserved across operations (including loads and stores) by LLVM (#73328). This includes the sign bit as well (#55131). We will need to fix these issues this before stabilizing total_cmp, and we should call this out explicitly in the documentation.

Copy link

Member

RalfJung commented on Sep 9, 2020

The exact ordering required by IEEE 754, revisions 2008 and 2019, isn't followed on some Tier 2 platforms, namely, older MIPS systems. The problem is that on those platforms, the meaning of the "signaling NaN" and "quiet NaN" bit is reversed. This means the order of those NaNs is reverse on those platfroms.

What does this mean? Does the same bit pattern compare differently? Or is the ordering, when implemented uniformly across platforms, specified in terms of whether a NaN is signalling or not, and thus that cross-platform implementation is inconsistent with the signaling vs quiet distinction on that platform?

Basically, I am wondering which spec is being violated when we just ignore the fact that these MIPS systems have a "weird" definition of signaling NaNs.

Cc rust-lang/unsafe-code-guidelines#237 as the place where I am trying to collect all open questions around floating-point semantics.

Copy link

Member

RalfJung commented on Sep 9, 2020 •

edited

Oh I see... the order is specified in terms of whether NaNs are signaling or not, but implemented disregarding whatever "signalling NaN" means on the current platform (assuming the standardized meaning instead).

So I guess this technically means that the implementation is not quite conforming (but this seems rather minor compared to how badly behaved i686 floats are^^ see #72327, #73288). Not sure what would be more surprising, documenting this as "not fully spec-compliant on that platform" or trying to fix the implementation. Does any other language try to work around this by changing the implementation on that platform?

Copy link

Contributor

est31 commented on Sep 9, 2020 •

edited

Oh I see... the order is specified in terms of whether NaNs are signaling or not, but implemented disregarding whatever "signalling NaN" means on the current platform (assuming the standardized meaning instead).

So I guess this technically means that the implementation is not quite conforming

Not conforming to which spec? All IEEE 754 spec versions that define totalOrder (versions 2008 and onwards) also define what signaling NaN means. If you see a definition foo in a document A reference another definition bar, did they mean the definition bar from document B, or did they mean the definition bar that's also contained in A? I'd say the latter. Note that to my knowledge, there is no totalOrder definition in the document B (older versions of the MIPS spec in this instance). If that were the case, one could arguably say that Rust's implementation conflicts the platform.

So I believe that Rust is always spec compliant on older mips platforms, but yes there is possible confusions for readers of the spec of totalOrder.

Copy link

Contributor

clarfonthey commented on Nov 22, 2020

Has there been any discussion of adding a Total wrapper like Wrapping for integers?

Has there been any discussion of adding a Total wrapper like Wrapping for integers?

An alternative is to add f64/f32 methods like:

fn total_fp_ord(self) -> i64 {
    let bi = self.to_bits() as i64;
    bi^ (((bi >> 63) as u64) >> 1) as i64
}

With this method you can use max_by_key instead of max_by.
But the wrapper idea could be better.

Copy link

Member

scottmcm commented on Mar 29, 2021

Has there been any discussion of adding a Total wrapper like Wrapping for integers?

That was brought up in my original PR: #53938 (comment)

Has there been any discussion of adding a Total wrapper like Wrapping for integers?

I implemented something similar a while ago: https://github.com/l0calh05t/totally-ordered-rs

Copy link

l0calh05t commented on Jun 10, 2021 •

edited

Has there been any discussion of adding a Total wrapper like Wrapping for integers?

An alternative is to add f64/f32 methods like:
fn total_fp_ord(self) -> i64 {
    let bi = self.to_bits() as i64;
    bi^ (((bi >> 63) as u64) >> 1) as i64
}
With this method you can use max_by_key instead of max_by.
But the wrapper idea could be better.

I just benchmarked this approach (and a few others) and transforming the keys themselves instead of providing a comparison is much more efficient when using sort_by_cached_key (even when compared to sort_unstable_by, as there is no equivalent sort_unstable_by_cached_key):

Sort f32/Partial/65536  time:   [3.2772 ms 3.2940 ms 3.3112 ms]
Sort f32/Total/65536    time:   [4.6182 ms 4.6388 ms 4.6584 ms]
Sort f32/Cached key/65536
                        time:   [2.7210 ms 2.7433 ms 2.7684 ms]
Sort f32/In place key/65536
                        time:   [1.7014 ms 1.7027 ms 1.7042 ms]
Sort f64/Partial/65536  time:   [2.8154 ms 2.8275 ms 2.8414 ms]
Sort f64/Total/65536    time:   [4.8437 ms 4.8712 ms 4.8938 ms]
Sort f64/Cached key/65536
                        time:   [3.0629 ms 3.0803 ms 3.0989 ms]
Sort f64/In place key/65536
                        time:   [1.8246 ms 1.8481 ms 1.8689 ms]

Partial is the standard values.sort_unstable_by(|a, b| a.partial_cmp(b).unwrap_or(std::cmp::Ordering::Equal));.
Total uses a wrapper / total_cmp.
Cached key uses total_fp_ord with sort_by_cached_key (all other sorts are unstable).
In place key transforms the keys in place, reinterprets the slice as a slice of i32/i64, runs a standard unstable integer sort, then transforms them back (this approach is the fastest, but providing a safe/clean interface is complicated, especially in a sort_by_key context).

Copy link

Contributor

Author

golddranks commented on Jun 10, 2021

@l0calh05t Do you have the benchmark code uploaded somewhere?

RalfJung

changed the title Tracking Issue for total_cmp

Tracking Issue for total_cmp (on f32/f64)

on Jun 10, 2021

Copy link

Member

scottmcm commented on Jun 10, 2021

(even when compared to sort_unstable_by, as there is no equivalent sort_unstable_by_cached_key):

That's because sort_by_cached_key actually calls sort_unstable_by internally -- it basically builds a Vec<(Key, usize)>, and having that index means that it's stable because there are no different-but-equivalent things being compared.

(even when compared to sort_unstable_by, as there is no equivalent sort_unstable_by_cached_key):

That's because sort_by_cached_key actually calls sort_unstable_by internally -- it basically builds a Vec<(Key, usize)>, and having that index means that it's stable because there are no different-but-equivalent things being compared.

Yeah, I wondered if that was the reason why and checked the code after posting. TBH, I would have expected a sort with elements in two separate Vecs/slices, but that would best be implemented on writable random-access zip iterators and I don't think we have anything like that yet, but that is a discussion for somewhere else.

Copy link

Member

joshtriplett commented 22 days ago

We discussed this in today's @rust-lang/libs-api meeting.

We agreed that the old MIPS bug is not an issue; we came to the same conclusion regarding to_bits and from_bits.

(However, we'd ask that the documentation be updated to not guarantee the ordering of different kinds of NaNs.)

We also agreed that #73328 shouldn't be a blocker.

@rfcbot merge

Copy link

rfcbot commented 22 days ago •

edited by yaahc

Team member @joshtriplett has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

Copy link

Member

scottmcm commented 22 days ago

(However, we'd ask that the documentation be updated to not guarantee the ordering of different kinds of NaNs.)

Maybe give it a platform-specific caveat, or something? That ordering is part of the IEEE behaviour of the operation, so it'd be nice to keep it for the usual case.

Copy link

rfcbot commented 22 days ago

This is now entering its final comment period, as per the review above.

Copy link

Member

joshtriplett commented 22 days ago

@scottmcm If it's only an issue on specific targets, we could document that it's an issue on those targets, sure.

Copy link

rfcbot commented 12 days ago

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

This will be merged soon.

Tracking Issue for total_cmp (on f32/f64) · Issue #72599 · rust-lang/rust · GitH...

golddranks commented on May 26, 2020 •

Overview of the API

Steps

Unresolved Questions

Implementation history

hanna-kruppe commented on May 26, 2020

sfackler commented on May 27, 2020

est31 commented on Jun 14, 2020

ecstatic-morse commented on Jun 15, 2020 •

RalfJung commented on Sep 9, 2020

RalfJung commented on Sep 9, 2020 •

est31 commented on Sep 9, 2020 •

clarfonthey commented on Nov 22, 2020

scottmcm commented on Mar 29, 2021

l0calh05t commented on Jun 10, 2021 •

golddranks commented on Jun 10, 2021

scottmcm commented on Jun 10, 2021

joshtriplett commented 22 days ago

rfcbot commented 22 days ago •

scottmcm commented 22 days ago

rfcbot commented 22 days ago

joshtriplett commented 22 days ago

rfcbot commented 12 days ago

Recommend

Accommodate yield points in the format_args expansion by dtolnay · Pull Request...

Improve selection errors for `~const` trait bounds by fee1-dead · Pull Request #...

腾讯新成立XR业务团队，含XR硬件及游戏工作室，发力全真互联网！

资深分析师：全球主机游戏市场发展的核心问题是什么？

Meilisearch raised a $5M Seed to change the world of search

Ferrous Systems and AdaCore to join forces on Ferrocene

Ignore unwinding edges when checking for unconditional recursion by tmiasko · Pu...

Воркшоп: пишемо гру на Rust, 5 — 6 лютого | DOU

Async Rust vs RTOS showdown!

Arduino/ESP 實驗：電磁繼電器與 MOS 管電子開關耗電實測

About Joyk