Notes about an odd, esoteric, yet incredibly useful library: libthread_db
source link: http://timetobleed.com/notes-about-an-odd-esoteric-yet-incredibly-useful-library-libthread_db/
tl;dr
This blog post will examine one of the weirder libraries I’ve come across: libthread_db.
libthread_db is typically used by debuggers, tracers, and other low level debugging/profiling applications to gather information about the threads in a running target process. Unfortunately, the documentation about how to use this library is a bit lacking and using it is not straightforward at all.
This library is pretty strange and there are several gotchas when trying to write a debugger or tracing program that makes use of the various features libthread_db provides.
Loading the library (and probably failing)
As strange as it may seem to those who haven’t used this library before, loading and linking to libthread_db is not as straightforward as simply adding -lthread_db to your linker flags.
The key thing to understand is that different target programs may use different threading libraries. Individual threading libraries may or may not have a corresponding libthread_db that works with a particular threading library, or even with a particular version of a particular threading library.
So until you attach to a target process, you have no idea which of the possibly several libthread_db libraries on the system you will need to use to gather threading information from a target process.
You don’t even know where the corresponding libthread_db library may live.
So, to load libthread_db in your debugger/tracer, you must:
- Attach to your target process, usually via ptrace.
- Traverse the target process’ link map to determine which libraries are currently loaded. Your program should search for the threading library of the process (often libpthread, but maybe your target program uses something else instead).
- Once found, your program can search in nearby directories for the location of libthread_db. In the most common case, a program will use libpthread as its threading library and the corresponding libthread_db will be located in the same directory. Of course, you could also allow the user to specify the exact location.
- Once found, simply use libdl to dlopen the library.
- If your target process is a Linux process which uses libpthread (a common case), libthread_db fails to load with libdl. Other libthread_db libraries may or may not load fine.
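To make the "look in the same directory" step concrete, here is a minimal sketch that turns the path of the threading library into a candidate path for libthread_db. The function name thread_db_candidate is made up for this example, and the libpthread.so.0 file name in the usage note is an assumption about a typical Linux system. How you obtain the threading library's path (link map, user input, and so on) is up to your program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Build a candidate path for libthread_db in the same directory as the
 * threading library. Returns 0 on success, -1 if threading_lib has no
 * directory component or the result does not fit in out.
 * thread_db_name (for example "libthread_db.so.1") is supplied by the caller.
 * The name thread_db_candidate is hypothetical, chosen for illustration.
 */
int thread_db_candidate(const char *threading_lib, const char *thread_db_name,
                        char *out, size_t out_size)
{
    assert(threading_lib != NULL && thread_db_name != NULL && out != NULL);

    const char *slash = strrchr(threading_lib, '/');
    if (slash == NULL)
        return -1;

    /* Keep everything up to and including the last '/'. */
    int dir_len = (int)(slash - threading_lib) + 1;
    int n = snprintf(out, out_size, "%.*s%s", dir_len, threading_lib,
                     thread_db_name);
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    return 0;
}
```

Given a threading library at /lib/x86_64-linux-gnu/libpthread.so.0, this produces /lib/x86_64-linux-gnu/libthread_db.so.1, which you can then hand to dlopen. A real tool would probably try a few candidates and fall back to a user-supplied path.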
libthread_db’s numerous undefined symbols
If you’ve followed the above steps to attempt to locate libthread_db and are targeting a Linux process that uses libpthread, you have now most likely failed to load it due to a number of undefined symbols.
Let’s use ldd to figure out what is going on:
joe@ubuntu:~$ ldd -r /lib/x86_64-linux-gnu/libthread_db.so.1 | grep undefined
undefined symbol: ps_pdwrite (/lib/x86_64-linux-gnu/libthread_db.so.1)
undefined symbol: ps_pglobal_lookup (/lib/x86_64-linux-gnu/libthread_db.so.1)
undefined symbol: ps_lsetregs (/lib/x86_64-linux-gnu/libthread_db.so.1)
undefined symbol: ps_getpid (/lib/x86_64-linux-gnu/libthread_db.so.1)
undefined symbol: ps_lgetfpregs (/lib/x86_64-linux-gnu/libthread_db.so.1)
undefined symbol: ps_lsetfpregs (/lib/x86_64-linux-gnu/libthread_db.so.1)
undefined symbol: ps_lgetregs (/lib/x86_64-linux-gnu/libthread_db.so.1)
undefined symbol: ps_pdread (/lib/x86_64-linux-gnu/libthread_db.so.1)
Bring your own symbols to this party
libthread_db will fail to load due to undefined symbols because the library expects your program to provide the implementations of these symbols. Unfortunately, the only way to determine which functions must be implemented is to examine the source code of the libthread_db implementation(s) you are targeting.
The libthread_db implementations that come with glibc include a header file named proc_service.h which lists all the functions and prototypes that your program must provide. I’ve noticed that other libthread_db implementations also provide a similar header file.
These functions are all very platform specific, so to maximize the portability of the various libthread_db implementations, they are left to the program using libthread_db.
In general, your program must provide implementations of:
- Functions to read from and write to the address space of a targeted process. Typically implemented with ptrace.
- Functions to read and write the general purpose registers and floating point registers (if there are any). Typically implemented with ptrace.
- A function to locate a specified shared object and search that object for a particular symbol. This function is significantly more complex than the other functions. Your program could use something like libbfd or libelf to make locating a library and searching its symbol tables easier. If you are implementing a debugger or tracer, you likely already have the pieces you need to implement this function.
- A structure, struct ps_prochandle, that libthread_db will pass through to the functions described above. You fill it with whatever data your functions need. Typically this is something like a pid that you can pass through to ptrace.
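As a rough illustration of the first and last items, here is a sketch of a word-by-word memory read built on PTRACE_PEEKDATA, with a struct ps_prochandle that only carries a pid. The names read_target, peek_with_ptrace, peek_self and the peek_word_fn type are all made up for this example. A real ps_pdread must use the prototype and return codes from proc_service.h rather than the plain int used here. The peek function is passed in as a parameter purely so the copy loop can be exercised without a tracee.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>

/* Hypothetical handle contents: libthread_db only passes this through. */
struct ps_prochandle {
    pid_t pid;
};

/* Reads one word from the target; returns 0 on success, -1 on failure. */
typedef int (*peek_word_fn)(pid_t pid, uintptr_t addr, long *out);

/* Read a word from an attached, stopped tracee. PTRACE_PEEKDATA returns
 * the word itself, so errno is the only way to tell -1 from an error. */
int peek_with_ptrace(pid_t pid, uintptr_t addr, long *out)
{
    errno = 0;
    long word = ptrace(PTRACE_PEEKDATA, pid, (void *)addr, NULL);
    if (word == -1 && errno != 0)
        return -1;
    *out = word;
    return 0;
}

/* Stand-in that reads the current process; only for trying out the loop. */
int peek_self(pid_t pid, uintptr_t addr, long *out)
{
    (void)pid;
    memcpy(out, (const void *)addr, sizeof(*out));
    return 0;
}

/* Copy size bytes starting at addr in the target into buf, one word at a
 * time. Returns 0 on success, -1 if any word could not be read. */
int read_target(const struct ps_prochandle *ph, peek_word_fn peek,
                uintptr_t addr, void *buf, size_t size)
{
    assert(ph != NULL && peek != NULL && (buf != NULL || size == 0));

    unsigned char *dst = buf;
    while (size > 0) {
        long word;
        if (peek(ph->pid, addr, &word) != 0)
            return -1;
        /* The last chunk may be shorter than a full word. */
        size_t n = size < sizeof(word) ? size : sizeof(word);
        memcpy(dst, &word, n);
        dst += n;
        addr += n;
        size -= n;
    }
    return 0;
}
```

Note that the final peek always fetches a whole word even when fewer bytes are needed, so a read that ends right at the edge of a mapping can fail. The write direction would follow the same shape with PTRACE_POKEDATA.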
libthread_db still fails to load
So, you’ve implemented the symbols you were required to implement, but you are still unable to load libthread_db with libdl because you are getting undefined symbol: ... errors.
Even stranger, you are getting these errors even though you are providing the symbols listed in the error messages!
The problem that you are running into is that the symbols are not being placed into the correct ELF symbol table. When you build an executable with gcc, the exported symbols of the executable are placed in the ELF section named .symtab. When libthread_db gets loaded with libdl, only the symbols in the .dynsym symbol table are examined to resolve dependencies. Thus, your symbols will not be found and libthread_db will fail to load.
Why this happens is beyond the scope of this blog post, but I’ve written about dynamic linking and symbol tables before here and here, if you are curious to learn a bit more.
Use this one weird trick for getting your symbols in the dynamic symbol table
There are actually two ways to make sure your symbols end up in the dynamic symbol table.
The first way to do it is to use the large hammer approach and pass the flag --export-dynamic to ld. This will add all exported symbols to the dynamic symbol table and you will be able to load libthread_db.
The second way to do it is much cleaner and strongly recommended over the previous method.
- Create a file which specifies the symbol names you want added to the dynamic symbol table.
- Use the linker flag --dynamic-list=FILENAME to let ld know which symbols you want placed in the dynamic symbol table.
Your file might look something like this:
{
ps_pdread;
ps_pdwrite;
ps_pglobal_lookup;
/* more symbol names would go here... */
};
If you are using gcc, you can then simply pass the flag -Wl,--dynamic-list=FILENAME and your executable will have the symbols listed in the file placed in the dynamic symbol table.
Regardless of which method you use, be sure to verify the results by using readelf to determine if the symbols actually made it to the correct symbol table.
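If you would also like a sanity check from inside your program, one option is to ask the dynamic linker whether it can see a symbol at all. This is a sketch, not a replacement for readelf, and is_dynamically_visible is a name invented for this example. It relies on the fact that dlopen with a NULL filename returns a handle for the main program, and that dlsym on that handle only resolves names present in dynamic symbol tables.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/*
 * Returns 1 if name can be resolved through the dynamic symbol tables of
 * the running program and the libraries loaded with it, 0 otherwise.
 * The function name is hypothetical, chosen for illustration.
 */
int is_dynamically_visible(const char *name)
{
    assert(name != NULL);

    /* A NULL filename yields a handle for the main program itself. */
    void *self = dlopen(NULL, RTLD_NOW);
    if (self == NULL)
        return 0;

    void *addr = dlsym(self, name);
    int found = (addr != NULL);
    dlclose(self);
    return found;
}
```

Calling is_dynamically_visible("ps_pdread") early in startup should return 1 only if your executable really exports that symbol, since no ordinary library defines it. On older glibc versions you may need to link with -ldl.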
Calling the initialize function and allocating a libthread_db handle
So, after all that work you will finally be able to load the library.
Since the library was loaded with libdl, you will need to use dlsym to grab function pointers to all the functions you intend to use. This is kind of tedious, but you can make clever use of C macros to help you, as long as you also make use of documentation to explain how they work.
So, to find and call the initialize function (without any macros for sanity and clarity):
/* td_init and td_ta_new are function pointer variables; handle comes from dlopen */

/* find the init function */
td_init = dlsym(handle, "td_init");
if (td_init == NULL) {
    fprintf(stderr, "Unable to find td_init: %s\n", dlerror());
    return -1;
}

/* call the init function */
err = td_init();
if (err != TD_OK) {
    fprintf(stderr, "td_init: %d\n", err);
    return -1;
}

/* find the libthread_db handle allocator function */
td_ta_new = dlsym(handle, "td_ta_new");
if (td_ta_new == NULL) {
    fprintf(stderr, "Unable to find td_ta_new: %s\n", dlerror());
    return -1;
}

/* call td_ta_new: first argument is your struct ps_prochandle, second receives the new handle */
err = td_ta_new(&somestructure->ph, &somestructure->ta);
if (err != TD_OK) {
    fprintf(stderr, "td_ta_new failed: %d\n", err);
    return -1;
}

/* XXX don't forget about td_ta_delete */
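For the macro route mentioned earlier, here is one possible shape. LOAD_SYM is a name made up for this example. It assumes you name each function pointer variable exactly like the symbol it should hold, so the preprocessor can turn the variable name into the string passed to dlsym.

```c
#include <dlfcn.h>
#include <stdio.h>

/*
 * LOAD_SYM(handle, fp): look up the symbol whose name matches the variable
 * name fp and store its address in fp. Evaluates to 0 on success and -1 on
 * failure (after printing the dlerror message).
 *
 * #fp turns the variable name into the string passed to dlsym. The
 * assignment through void ** is the usual idiom for storing a dlsym result
 * in a function pointer without a cast warning.
 *
 * Hypothetical usage, with td_init declared as a function pointer:
 *     if (LOAD_SYM(handle, td_init) != 0) return -1;
 */
#define LOAD_SYM(handle, fp) \
    ((*(void **)(&(fp)) = dlsym((handle), #fp)) != NULL ? 0 : \
     (fprintf(stderr, "Unable to find %s: %s\n", #fp, dlerror()), -1))
```

The trade-off is the one noted above: the call sites get shorter, but anyone reading them needs the comment to know that the variable name doubles as the symbol name.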
A cool version check
td_ta_new performs a rather interesting version check when called before allocating a handle:
- First, it uses the ps_pglobal_lookup symbol you implemented to search for the symbol nptl_version in the libpthread library linked into the remote process. Your function should find this symbol and return the address.
- Next, td_ta_new reads several bytes from the target process at the address your ps_pglobal_lookup returned using your ps_pdread function.
- Lastly, the bytes read from the target process are checked against libthread_db’s internal version to determine if the versions match.
So, the library you load calls functions you implemented to search the symbol tables of a process you are attached to in order to read a series of bytes out of that process’ address space to determine if that process’ threading library matches the version of libthread_db you loaded into your debugger.
Fucking rad.
By the way, if you were wondering why libpthread is one of the few libraries that is not stripped on Linux, now you know. If it were stripped, this check would fail, unless of course your ps_pglobal_lookup function searched debug information.
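The flow of that check can be modeled in a few lines. What follows is a sketch of the idea only, not glibc's code. The callback types, result codes, version strings in the usage note and the fake_* helpers are all invented for this example. The fakes keep the "target" in local memory so the three steps can be run without attaching to anything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for ps_pglobal_lookup and ps_pdread; the real
 * prototypes are in proc_service.h. Both return 0 on success. */
typedef int (*lookup_fn)(void *target, const char *lib, const char *sym,
                         uintptr_t *addr);
typedef int (*read_fn)(void *target, uintptr_t addr, void *buf, size_t size);

enum version_result {
    VERSION_OK, VERSION_NO_SYMBOL, VERSION_READ_FAILED, VERSION_MISMATCH
};

/* Fake target: a version string plus a flag modeling a stripped library. */
struct fake_target {
    char version[32];
    int stripped;
};

int fake_lookup(void *target, const char *lib, const char *sym,
                uintptr_t *addr)
{
    struct fake_target *t = target;
    (void)lib;
    if (t->stripped || strcmp(sym, "nptl_version") != 0)
        return -1;
    *addr = (uintptr_t)t->version;
    return 0;
}

int fake_read(void *target, uintptr_t addr, void *buf, size_t size)
{
    (void)target;
    memcpy(buf, (const void *)addr, size);
    return 0;
}

/* Step 1: find nptl_version. Step 2: read it. Step 3: compare. */
enum version_result check_version(void *target, lookup_fn lookup,
                                  read_fn read_mem, const char *expected)
{
    char buf[32];
    size_t len = strlen(expected) + 1; /* compare the terminating NUL too */
    uintptr_t addr;

    if (len > sizeof(buf))
        return VERSION_MISMATCH;
    if (lookup(target, "libpthread", "nptl_version", &addr) != 0)
        return VERSION_NO_SYMBOL;
    if (read_mem(target, addr, buf, len) != 0)
        return VERSION_READ_FAILED;
    return memcmp(buf, expected, len) == 0 ? VERSION_OK : VERSION_MISMATCH;
}
```

With a fake target holding "2.31", checking against "2.31" succeeds and checking against any other string reports a mismatch. Setting stripped to 1 makes the lookup fail, which is the situation described above for a stripped libpthread.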
Now you can use the library
At this point, you’ve done enough setup to be able to dlsym search for and call various functions to iterate over the threads in a remote process, to be notified asynchronously when threads are created or destroyed, and to access thread local data if you want to.
Conclusion
Here’s a summary of the steps you need to go through to load, link, and use libthread_db:
- Implement a series of functions and structures specified in the libthread_db implementation(s) you are targeting. You can find these in the header file called proc_service.h.
- Attach to the remote process, determine the path of the threading library it is using and look nearby to find libthread_db. Alternatively, allow the user to specify the location of libthread_db.
- Use libdl to load the library by calling dlopen.
- Use dlsym to find td_init and td_ta_new. Call these functions to initialize the library.
- Ensure you are using either --export-dynamic or --dynamic-list=FILENAME to place the symbols in the correct symbol table so that the runtime dynamic linker will find them when you load libthread_db.
- Make sure to use lots of error checking and debug output to ensure that your implemented functions are being hit and that they are returning the proper return values as specified in proc_service.h.
- Sit back and consider that this entire process actually works and allows you to debug or trace processes with multiple threads.