Footguns in SIMD land
source link: https://llogiq.github.io/2018/09/19/simd-footguns.html
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
Footguns in SIMD land
19 September 2018
If you use the newly stabilized std::arch
module, you may find a few, well,
surprises:
If you fail to guard your SIMD code with #[cfg(and(target_arch = "..",
target_feature = ".."))]
, you’ll miss out on inlining. There is not even a
warning for this yet (clippy issue #????)
Both 128 and 256 bit types have three variations: One for {4, 8}×f32
, one
for {2, 4}×f64
and one bag of bits that can be sliced to various integer
lengths and signedness. For historical reasons, the __m128
/__m256
types
denote multiples of f32
, whereas the f64
-based types get a d
suffix, and
the integer types get an i
suffix. C/C++ programmers may feel at home here.
The type constructors aren’t const yet. This means that you cannot have static
__m128i
s, for example.
The _mm_set
function and the __mm_loadu
/__mm_storeu
functions take their
arguments in inverse order. For example:
let mut y = [0xf64; 2];
let onetwo = _mm_set_pd(1.0, 2.0);
__mm_storeu_pd((&mut y).as_mut_ptr(), onetwo);
assert_eq!([2.0, 1.0], y); // who'd have thunk?
Speaking of which, the _mm_store_*
functions don’t seem to be faster than the
_mm_storeu_*
functions (at least on my skylake), so it’s unclear if defining
a 16-byte-aligned type for the former would be worth the hassle. The same goes
for _mm256_store_*
and 32-byte-aligned types.
What problems have you found with Rust + stdsimd? Discuss on r/rust or rust-users!
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK