[compiler-rt][SelectionDAG] Add extendbfsf2 libcall and use it for bf16 extends...

Details

Summary

Previously this resulted in an assert (reproducible on RISC-V with soft FP). The existing code path assumes a libcall is present, and adding the libcall seems like the easiest fix. This libcall _is_ provided by libgcc, which perhaps provides its own motivation for adding it here.

The legalisation code in LegalizeDAG lowers to an anyext and shift, which might be an alternative. However, this would be more invasive to support than just adding an extra case to the existing libcall lowering logic, and we likely don't care strongly about BF16 support on these soft-float targets beyond wanting some basic support for completeness.

I'm not able to convince myself that the anyext+shift lowering is always identical to the more elaborate extension performed by the libcall (and if so, why do the trunc and extend libcalls even exist?). I know @craig.topper was involved in a previous discussion on this so I'd appreciate your view.

Diff Detail

Unit Tests: Failed

Time	Test
	50 ms	x64 debian > LLVM.CodeGen/RISCV::bfloat.ll Script: -- : 'RUN: at line 2'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/llc -mtriple=riscv32 -verify-machineinstrs < /var/lib/buildkite-agent/builds/llvm-project/llvm/test/CodeGen/RISCV/bfloat.ll | /var/lib/buildkite-agent/builds/llvm-project/build/bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/llvm/test/CodeGen/RISCV/bfloat.ll -check-prefix=RV32I-ILP32
	60,050 ms	x64 debian > MLIR.Examples/standalone::test.toy Script: -- : 'RUN: at line 1'; "/etc/cmake/bin/cmake" "/var/lib/buildkite-agent/builds/llvm-project/mlir/examples/standalone" -G "Ninja" -DCMAKE_CXX_COMPILER=/usr/bin/clang++ -DCMAKE_C_COMPILER=/usr/bin/clang -DLLVM_ENABLE_LIBCXX=OFF -DMLIR_DIR=/var/lib/buildkite-agent/builds/llvm-project/build/lib/cmake/mlir -DLLVM_USE_LINKER=lld -DPython3_EXECUTABLE="/usr/bin/python3.9"

Event Timeline

asb created this revision. Thu, May 25, 6:04 AM

asb requested review of this revision. Thu, May 25, 6:04 AM

Comment Actions

I'm not able to convince myself that the anyext+shift lowering is always identical to the more elaborate extension performed by the libcall (and if so, why do the trunc and extend libcalls even exist?). I know @craig.topper was involved in a previous discussion on this so I'd appreciate your view.

fp32 has more bits of mantissa than bfloat16 but they have the same number of exponent bits.

The trunc libcall exists because the extra bits of mantissa that exist in fp32 need to be rounded to convert to bfloat16. Also some f32 subnormal values can't be represented in bfloat16. So it can't be done as an integer truncate.
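To make the rounding point concrete, here is a minimal Python sketch working directly on bit patterns. It is not the compiler-rt or libgcc implementation; the function names are hypothetical, and it assumes round-to-nearest-even plus the common convention that the top mantissa bit marks a quiet NaN.

```python
def f32_bits_to_bf16_chop(bits32: int) -> int:
    """Plain integer truncation: drop the low 16 bits of the f32 pattern."""
    return (bits32 >> 16) & 0xFFFF


def f32_bits_to_bf16_rne(bits32: int) -> int:
    """Sketch of an f32 -> bf16 conversion with round-to-nearest-even.

    Hypothetical helper for illustration only.
    """
    bits32 &= 0xFFFFFFFF
    if (bits32 & 0x7FFFFFFF) > 0x7F800000:
        # NaN: keep the sign and upper payload bits, and force the quiet bit
        # so the result cannot collapse into an infinity.
        return ((bits32 >> 16) | 0x0040) & 0xFFFF
    # Add 0x7FFF plus the lowest kept bit, so exact ties round to even.
    lsb = (bits32 >> 16) & 1
    return ((bits32 + 0x7FFF + lsb) >> 16) & 0xFFFF


if __name__ == "__main__":
    for pattern in (0x3F808001, 0x3F818000, 0x00000001, 0x7F7FFFFF):
        print(f"{pattern:#010x}: chop={f32_bits_to_bf16_chop(pattern):#06x} "
              f"rne={f32_bits_to_bf16_rne(pattern):#06x}")
```

The two functions disagree whenever the discarded low 16 bits are more than half a unit in the last kept place (or exactly half with an odd kept bit), which is why the truncation cannot be a plain integer shift.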

For extend, we should just need to add 0s to the end of the mantissa. +0.0 and -0.0 are encoded as all 0s in the mantissa and exponent in both encodings. Infinity is encoded with a special exponent and an all-0 mantissa in both formats. NaN uses the same exponent as infinity but a non-zero mantissa. If the mantissa is already non-zero, adding more zeros doesn't change that. Adding zeros to the end of the mantissa for normals and denormals shouldn't change their value.
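The case analysis above can be checked with a small Python sketch that extends a bfloat16 bit pattern by appending 16 zero bits and then reinterprets the result as an f32. The helper names are hypothetical and this only models the shift-based lowering, not what any libcall does.

```python
import math
import struct


def bf16_bits_to_f32_bits(bits16: int) -> int:
    """Extend bf16 to f32 by appending 16 zero bits to the mantissa."""
    return (bits16 & 0xFFFF) << 16


def f32_bits_to_float(bits32: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE binary32 value."""
    return struct.unpack("<f", struct.pack("<I", bits32))[0]


if __name__ == "__main__":
    samples = {
        "one": 0x3F80,
        "+0.0": 0x0000,
        "-0.0": 0x8000,
        "+inf": 0x7F80,
        "quiet NaN": 0x7FC0,
        "smallest subnormal": 0x0001,
    }
    for name, bits16 in samples.items():
        value = f32_bits_to_float(bf16_bits_to_f32_bits(bits16))
        print(f"{name}: {bits16:#06x} -> {value!r} (nan={math.isnan(value)})")
```

Because both formats share the sign bit and the 8-bit exponent, the shift leaves zeros, infinities, NaNs, normals, and subnormals in the same class with the same value.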

Comment Actions

For extend, we should just need to add 0s to the end of the mantissa. +0.0 and -0.0 are encoded as all 0s in the mantissa and exponent in both encodings. Infinity is encoded with a special exponent and an all-0 mantissa in both formats. NaN uses the same exponent as infinity but a non-zero mantissa. If the mantissa is already non-zero, adding more zeros doesn't change that. Adding zeros to the end of the mantissa for normals and denormals shouldn't change their value.

And then we'd just lose out on FE_INVALID being set if the input is a signalling NaN - it seems libgcc does have some support for setting these exception bits (on some platforms at least, with the right support hooks implemented) while compiler-rt has none. So I think that justifies the libcall for them. Thanks for helping clear that up.
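As a rough illustration of the signalling NaN difference, the Python sketch below contrasts the plain shift with an extension that quiets the NaN and reports an invalid-operation condition. Everything here is an assumption for illustration: the names are hypothetical, the returned boolean merely stands in for FE_INVALID, and the quiet bit is taken to be the top mantissa bit (the convention used on RISC-V, x86, and Arm). It is not libgcc's code.

```python
BF16_EXP_MASK = 0x7F80   # 8 exponent bits
BF16_MANT_MASK = 0x007F  # 7 mantissa bits
BF16_QUIET_BIT = 0x0040  # top mantissa bit (assumed quiet-NaN marker)
F32_QUIET_BIT = 0x00400000


def is_bf16_snan(bits16: int) -> bool:
    """True if the pattern is a NaN with the quiet bit clear."""
    bits16 &= 0xFFFF
    is_nan = ((bits16 & BF16_EXP_MASK) == BF16_EXP_MASK
              and (bits16 & BF16_MANT_MASK) != 0)
    return is_nan and (bits16 & BF16_QUIET_BIT) == 0


def extend_by_shift(bits16: int) -> int:
    """Shift-based lowering: a signalling NaN stays signalling, no flag."""
    return (bits16 & 0xFFFF) << 16


def extend_quieting(bits16: int) -> tuple[int, bool]:
    """Model of an exception-aware extend: returns (f32 bits, invalid flag)."""
    bits32 = (bits16 & 0xFFFF) << 16
    if is_bf16_snan(bits16):
        return bits32 | F32_QUIET_BIT, True
    return bits32, False


if __name__ == "__main__":
    snan = 0x7F81
    print(f"shift:    {extend_by_shift(snan):#010x}")
    bits32, invalid = extend_quieting(snan)
    print(f"quieting: {bits32:#010x}, invalid={invalid}")
```

For every input that is not a signalling NaN the two functions return the same bits, so the difference only matters where floating-point exception state is observable.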

Comment Actions

You would only need to worry about sNaNs with the constrained fptrunc.
