

Yet another reason to not use printf (or write C code in general) | Belay the C+...
source link: https://belaycpp.com/2021/08/31/yet-another-reason-to-not-use-printf-or-write-c-code-in-general/

Recently, Joe Groff @jckarter tweeted a very interesting behavior inherited from C:
C++ pro tip: hardcoding constants in your code can create maintenance burdens. Instead of writing 2*x, try the handy double(x) function pic.twitter.com/vZigQW4O0h
— Joe Groff (@jckarter) August 27, 2021
Obviously, it’s a joke, but we’re gonna talk more about what’s happening in the code itself.
So, what’s happening?
Just to be 100% clear, double(2101253) does not actually double the value of 2101253. It’s a cast from int to double.
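As a quick sanity check, here is a minimal sketch showing that double(x) is a functional-style cast and nothing more. The helper names as_double and check_cast are made up for this illustration.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper: wraps the functional-style cast discussed above.
constexpr double as_double(int x) {
    return double(x);  // equivalent to static_cast<double>(x)
}

// The expression has type double, and the value is left unchanged.
static_assert(std::is_same<decltype(double(2101253)), double>::value,
              "double(x) yields a double");
static_assert(as_double(2101253) == 2101253.0, "the value is not doubled");
static_assert(as_double(2101253) != 2 * 2101253, "no multiplication happens");

// Runtime version of the same check.
void check_cast() {
    assert(as_double(2101253) == 2101253.0);
}
```

Calling check_cast() from any main() passes silently, and the static_assert lines are verified at compile time.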
If we write this differently, we can obtain this:
#include <cstdio>

int main() {
    printf("%d\n", 666);
    printf("%d\n", double(42));
}
With the x86_64 gcc 11.2 compiler, the output is as follows:
666
4202506
So we can see that the value 4202506 has nothing to do with either 666 or 42.
In fact, if we compile and run the same code with the x86_64 clang 12.0.1 compiler, things are a little bit different:
666
4202514
You can see the live results here: https://godbolt.org/z/c6Me7a5ee
You may have guessed it already, but this comes from line 5, where we print a double as an int. But this is not some kind of conversion error (of course your computer knows how to convert from double to int, and it would do it just fine if that were what was happening). The issue comes from somewhere else.
The truth
If we want to understand why it works that way, we’ll have to take a look at the assembly code (https://godbolt.org/z/5YKEdj73r):
.LC0:
        .string "%d\n"
main:
        push rbp
        mov rbp, rsp
        mov esi, 666
        mov edi, OFFSET FLAT:.LC0
        mov eax, 0
        call printf
        mov rax, QWORD PTR .LC1[rip]
        movq xmm0, rax
        mov edi, OFFSET FLAT:.LC0
        mov eax, 1
        call printf
        mov eax, 0
        pop rbp
        ret
.LC1:
        .long 0
        .long 1078263808
(use this Godbolt link to see a clearer mapping between the C++ code and the assembly instructions: https://godbolt.org/z/5YKEdj73r)
In the yellow zone of the assembly code (lines 6 to 9, the equivalent of printf("%d\n", 666);) we can see that everything’s fine: the 666 value is put in the esi register and then the printf function is called. So it’s an educated guess to say that when the printf function reads a %d in the string it is given, it’ll look in the esi register for what to print.
However, we can see in the blue part of the code (lines 10 to 14, the equivalent of printf("%d\n", double(42));) that the value is put in another register: the xmm0 register. Since printf is given the same string as before, it’s a fair guess that it will look into the esi register again, whatever happens to be in there.
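As a side note, the two .long values stored at .LC1 in the listing above are simply the bit pattern of 42.0, split into two 32-bit halves. The following sketch checks that, assuming a 64-bit IEEE 754 double and a little-endian target such as x86_64. The helper names are made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the raw 64-bit representation of a double.
// Assumes double is a 64-bit IEEE 754 type, which is the case on x86_64.
std::uint64_t bits_of(double value) {
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "this sketch expects a 64-bit double");
    std::uint64_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);  // well-defined way to read the bytes
    return bits;
}

// On a little-endian target, the first .long is the low half
// and the second .long is the high half.
std::uint32_t low_half(double value) {
    return static_cast<std::uint32_t>(bits_of(value) & 0xFFFFFFFFu);
}

std::uint32_t high_half(double value) {
    return static_cast<std::uint32_t>(bits_of(value) >> 32);
}

void check_constant() {
    assert(low_half(42.0) == 0u);            // .long 0
    assert(high_half(42.0) == 1078263808u);  // .long 1078263808, i.e. 0x40450000
}
```

So the compiler really does pass 42.0 to printf. The value just travels in a register that the %d specifier never looks at.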
We can check pretty easily that printf really reads from the esi register. Take the following code:
#include <cstdio>

int main() {
    printf("%d\n", 666);
    printf("%d %d\n", double(42), 24);
}
It’s the same code, with an additional integer that is printed by the second printf call.
If we look at the assembly (https://godbolt.org/z/jjeca8qd7):
.LC0:
        .string "%d %d\n"
main:
        push rbp
        mov rbp, rsp
        mov esi, 666
        mov edi, OFFSET FLAT:.LC0
        mov eax, 0
        call printf
        mov rax, QWORD PTR .LC1[rip]
        mov esi, 24
        movq xmm0, rax
        mov edi, OFFSET FLAT:.LC0
        mov eax, 1
        call printf
        mov eax, 0
        pop rbp
        ret
.LC1:
        .long 0
        .long 1078263808
The double(42) value still goes into the xmm0 register, and the 24 integer, logically, ends up in the esi register. Thus, this happens in the output:
666
24 0
Why? Well, since we asked for two integers, the printf call will look into the first integer register available after the format string (esi) and print its content (24, as we stated above), then look in the following integer register (edx) and print whatever is in it (incidentally 0).
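To see why printf cannot notice the mismatch, here is a minimal sketch of a variadic formatter. It is a toy with a made-up name (mini_format), not the real printf implementation, and it only understands %d and %f. The point is that the callee picks the type to fetch from the format letter alone. It has no way to know what the caller actually passed.

```cpp
#include <cassert>
#include <cstdarg>
#include <string>

// Hypothetical toy formatter, not the real printf.
std::string mini_format(const char* fmt, ...) {
    std::va_list args;
    va_start(args, fmt);
    std::string out;
    for (const char* p = fmt; *p != '\0'; ++p) {
        if (*p != '%' || p[1] == '\0') {
            out += *p;  // ordinary character, copied as-is
            continue;
        }
        ++p;  // move to the conversion letter
        if (*p == 'd') {
            // The format says "int", so an int is fetched, no questions asked.
            out += std::to_string(va_arg(args, int));
        } else if (*p == 'f') {
            // The format says "double", so a double is fetched instead.
            out += std::to_string(va_arg(args, double));
        } else {
            out += *p;  // unknown letter: copied through unchanged
        }
    }
    va_end(args);
    return out;
}

void check_mini_format() {
    // Matching specifiers and arguments: well-defined.
    assert(mini_format("%d %d\n", 24, 7) == "24 7\n");
}
```

Calling mini_format("%d", double(42)) would be undefined behavior for the same reason as the printf call in the article: va_arg(args, int) fetches an int that was never passed.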
In the end, the behavior we see comes from how arguments are passed on the x86_64 architecture: integer and floating-point arguments travel in separate registers. If you want to learn more about that, follow these links:
What does the doc say?
The truth is that according to the reference (printf, fprintf, sprintf, snprintf, printf_s, fprintf_s, sprintf_s, snprintf_s – cppreference.com):
If a conversion specification is invalid, the behavior is undefined.
And this same reference is unambiguous about the %d conversion specifier:
converts a signed integer into decimal representation [-]dddd.
Precision specifies the minimum number of digits to appear. The default precision is 1.
If both the converted value and the precision are 0 the conversion results in no characters.
So, giving a double to printf as an argument where you are supposed to give a signed integer is UB. It was our mistake to write this in the first place.
This actually generates a warning with clang. But with gcc, you’ll have to activate -Wall to see any warning about that.
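For completeness, here is a minimal sketch of the two well-defined ways to write the call: either convert the value explicitly so it matches %d, or use a specifier that matches double. The helper names are made up for this illustration, and snprintf is used only so the result can be checked as a string.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: the argument really is an int, so %d is valid.
// Note: the conversion itself is only defined if the value fits in an int.
std::string as_int_text(double value) {
    char buffer[32];
    std::snprintf(buffer, sizeof buffer, "%d", static_cast<int>(value));
    return buffer;
}

// Hypothetical helper: keeps the double and uses a matching specifier.
std::string as_double_text(double value) {
    char buffer[64];
    std::snprintf(buffer, sizeof buffer, "%.1f", value);
    return buffer;
}

void check_fixed_calls() {
    assert(as_int_text(double(42)) == "42");
    assert(as_double_text(double(42)) == "42.0");
}
```

Both versions keep the format string and the argument types in agreement, so neither compiler has anything to warn about.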
Wrapping up
The C language is a very, very old language. It’s older than C++ (obviously), which is itself very old. As a reminder, the first edition of the K&R was printed in 1978. That was thirteen years before my own birth. And unlike us humans, programming languages don’t age well.
I could have summarized this article with a classic “don’t perform UB”, but I think that would miss the point this time. So I’ll go and say it: don’t use printf at all.
The problem is not with printf itself, it’s with using a feature from another language [1] that was originally published forty-three years ago. In short: don’t write C code.
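The article does not name a replacement, so take the following as one possible sketch rather than a recommendation from the author: with C++ streams, the formatting is chosen from the static type of the value, so there is no format string to get wrong. The helper name to_text is made up for this illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: streams any value and returns the resulting text.
template <typename T>
std::string to_text(const T& value) {
    std::ostringstream out;
    out << value;  // the operator<< overload is picked from the type of value
    return out.str();
}

void check_to_text() {
    assert(to_text(666) == "666");
    assert(to_text(double(42)) == "42");  // a double is formatted as a double
}
```

Here double(42) is printed as 42 because the stream knows it received a double. There is no specifier that can disagree with the argument.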
Thanks for reading and see you next week!
1. Yeah, like it or not, C and C++ are different languages. Different purpose, different intentions, different meta. That is exactly why I always decline job offers that have the tag “C/C++”, because they obviously can’t pick a side.