An NVIDIA CUDA core is 1 FP32 ALU, not 1/32 of an ALU. An ALU processes one FP32...
source link: https://twitter.com/RyanSmithAT/status/1450216681326796800
Apple's current GPU architecture offers 128 FP32 ALUs per "core", which is similar to an Ampere SM. So M1 Max is powerful at 4096 ALUs, but that's still well under a high-end NV GPU.
But again, ALUs are not 1:1 comparable across architectures in real-world performance.
An NVIDIA CUDA core is 1 FP32 ALU, not 1/32 of an ALU. An ALU processes one FP32 operation per cycle.
And you can ignore the thread count. That's for developers, who need to know how many threads they can switch between.
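
The ALU arithmetic in the thread can be checked with a rough sketch like the one below. It assumes the thread's figure of 128 FP32 ALUs per Apple GPU "core" or Ampere SM and one FP32 operation per ALU per cycle. The core and SM counts (32 for a full M1 Max, 82 for an RTX 3090 as an example of a high-end NVIDIA GPU) are not stated in the thread and are filled in here as assumptions, and the function names are made up for illustration. As the thread notes, these totals are not a measure of real-world performance.

```python
def total_fp32_alus(cores: int, alus_per_core: int) -> int:
    """Total FP32 ALUs = number of cores (or SMs) x ALUs per core."""
    return cores * alus_per_core


def peak_fp32_ops_per_second(alus: int, clock_hz: float) -> float:
    """Upper bound on FP32 operations per second, assuming each ALU
    retires one FP32 operation per cycle (the thread's definition).
    clock_hz is a caller-supplied assumption, not a figure from the thread."""
    return alus * clock_hz


# 128 ALUs per core/SM comes from the thread; the unit counts are assumptions.
m1_max_alus = total_fp32_alus(cores=32, alus_per_core=128)
rtx_3090_alus = total_fp32_alus(cores=82, alus_per_core=128)

print(m1_max_alus)    # 4096, matching the figure in the thread
print(rtx_3090_alus)  # 10496
```

Counting this way treats one CUDA core as one FP32 ALU, so an SM contributes 128 to the total rather than 128/32 = 4.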