![](/style/images/good.png)
![](/style/images/bad.png)
Intel Processor Trace Part2. Better debugging experience.
source link: https://www.tuicool.com/articles/7fAJ7n6
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
Contents:
- Debugging stack corruption issues
- Appendix: How to build gdb with Intel PT support
Subscribe to mymailing list to get more updates from me!
In thefirst part of my series about Intel Processor Traces (PT) I showed the underlying mechanics of this HW feature and talked a bit about its main use cases. In this article I will go into one of areas where PT can provide additional value, which is debugging.
Postmortem debugging
Traditionally, in embedded world, issues that happen in production environment are being debugged by logging. But sometimes it’s not enough. Suppose we have a function like this with a big switch inside:
int foo(unsigned int arg) { int x = someComplexFunc(arg); log("Enter calculations"); switch (arg) { case 7498536: { int y = 7498536 / x; // potential div by zero x = y - 42; break; } // <lots of other cases> default: x = 42; break; } log("result: x = %d\n", x); return x; }
Customer reports that once in a while they see some of their machines crash. In the logs they provided the last lines are:
Time message 19:25:13:0124: bar: calling foo 19:25:13:0134: foo: Enter calculations
Well, that’s better than nothing, still doesn’t give any clue what exactly the problem is. If the switch is big enough it will be hard to find where exactly is the issue.
Intel PT can provide a little bit more insights. I used simple-pt
to collect the traces. Please refer to simple-pt
documentationfor how to build and use it.
Source code for this example is available on my github . I built the program like this:
$ gcc a.cpp -g -o app
The command below collects the traces. By default simple-pt
saves the traces into 2MB circular buffer. This means new traces overwrite the old ones. So, even for long running applications we can have a trace of what was happening just before the crash:
$ sudo sptcmd -K --cyc 1 taskset -c 0 ./app
After we collected the traces we can decode them with:
$ sudo sptdecode -s ptout.sideband --pt ptout.0 -i -t | xed -F insn: -A -64 > dump.txt
If we now look into the dump.txt
, right at the bottom we will see something like:
<...> 560a8ef2870d 0 call callq 0x560a8ef2866allq 560a8ef2866b 0 other mov %rsp, %rbp 560a8ef2866e 0 other movl %edi, -0x14(%rbp) 560a8ef28671 0 other movl $0x0, -0x8(%rbp) 560a8ef28678 0 other movl -0x14(%rbp), %eax 560a8ef2867b 0 other cmp $0x1d4a, %eax 560a8ef28680 0 cjump jnz 0x560a8ef28699 // arg == 7498536 560a8ef28682 0 other mov $0x1d4a, %eax // entering the block 560a8ef28687 0 other cdq // with div by zero
I was having some issues with emitting source code lines in the dumps. In sptdecode
there is -d
option that is supposed to print source code intermixed with the associated assembly code. After discussing the issue with Andi Kleen it looks like the problem is somewhere on my side.
In this example you can see how Intel PT can tell us the last instructions that were executed. Intel PT can be used as an almost freeaddition to the logging capabilities of your applications and often can provide a big chunk of useful information for postmortem debugging .
Logs still are very useful because you can print some values in them. Until PTWRITE
instruction came out there was no way of dumping data in processor traces. Traces were only useful for determining control flow. But in recent CPUs we have PTWRITE
instruction that allows writing values into the PT packets. According to Intel SD Manual
:
This instruction reads data in the source operand and sends it to the Intel Processor Trace hardware to be encoded in a PTW packet.
I haven’t used PTWRITE
in practice, but I assume that every time you want to print something new you need to recompile the binary. Though you need to do the same when using logs.
My Intel Core i5-8259U doesn’t have PTWRITE
support. You can check whether your CPU has PTWRITE
support with ptfeature
tool which is a part of simple-pt
.
Debugging stack corruption issues
Now let me show another case where Intel PT can be useful.
Let me jump right into the example with the program where the call stack is being corrupted:
// a.c void bar(); // implemented in assembly below void foo() { bar(); } int main() { foo(); }
// b.asm GLOBAL bar bar: pop rdx ; remove return address from the stack xor rax, rax ret ud2
Source code for this example is available on my github . Let’s build the program and make sure it’s crashing:
$ gcc a.c -c -g $ nasm -f elf64 b.asm -g $ gcc a.o b.o $ ./a.out Segmentation fault (core dumped)
When I run usual (pre-installed) version of gdb:
$ /usr/bin/gdb ./a.out GNU gdb (Ubuntu 8.1-0ubuntu3) 8.1.0.20180409-git (gdb) r Starting program: /path/to/a.out Program received signal SIGSEGV, Segmentation fault. 0x00007fffffffe336 in ?? () (gdb) bt #0 0x00007fffffffe336 in ?? () #1 0x00007fffffffe320 in ?? () #2 0x00007fffffffe320 in ?? () #3 0x00007fffffffe320 in ?? () #4 0x0000555555554619 in main () at a.c:10 Backtrace stopped: frame did not save the PC
Stack is corrupted, so gdb is not able to unwind it. According to my experiments, rr tool cannot provide any additional value either.
Now let’s try to run the same example using build-from-sources gdb and record traces. You can find particular instructions in theappendix of this article.
$ /usr/local/bin/gdb ./a.out GNU gdb (GDB) 8.3.50.20190822-git (gdb) start Starting program: /path/to/a.out Temporary breakpoint 1, main () at a.c:10 10 foo(); (gdb) record btrace pt (gdb) c Continuing. Program received signal SIGSEGV, Segmentation fault. 0x00007fffffffe336 in ?? ()
We have the crash, let’s now see the call history:
(gdb) record function-call-history 1 main 2 foo 3 bar 4 ??
We can even pull the previous executed instructions.
(gdb) record instruction-history /m - 4 0x00005555555545fb <foo+1>: mov %rsp,%rbp a.c:5 bar(); 5 0x00005555555545fe <foo+4>: mov $0x0,%eax 6 0x0000555555554603 <foo+9>: callq 0x555555554620 <bar> 7 0x0000555555554620 <bar+0>: pop %rdx 8 0x0000555555554621 <bar+1>: xor %rax,%rax 9 0x0000555555554624 <bar+4>: retq 10 0x00007fffffffe330: xor %al,0x55(%rsi) 11 0x00007fffffffe333: push %rbp 12 0x00007fffffffe334: push %rbp 13 0x00007fffffffe335: push %rbp
Here /m
switch is used for intermixing source code with assembly instructions. Because bar
is implemented in assembly there is obviously no source line for it. But for foo
function we see the corresponding source line ( a.c:5
).
After we know the exact place where the issue happened we can put normal breakpoint and restart debugging session as usual.
As you see, Intel PT helps when debugging programs with corrupted stack.
Appendix: How to build gdb with Intel PT support
This page would probably be a good starting point. GDB uses libipt for collecting PT, so we need to build it first:
git clone https://github.com/intel/libipt.git mkdir build && cd build cmake ../libipt make make install
Then we build gdb from sources as shown here :
sudo apt install texinfo bison flex git clone git://sourceware.org/git/binutils-gdb.git mkdir build && cd build ../binutils-gdb/configure --disable-binutils --disable-ld --disable-gold --disable-gas --disable-sim --disable-gprof make make install
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK