

Why glibc 2.34 removed libpthread
source link: https://developers.redhat.com/articles/2021/12/17/why-glibc-234-removed-libpthread

The recent 2.34 release of the GNU C library, glibc, removes libpthread as a separate library. This article explains the motivation behind this change and some consequences for developers and system administrators.
For a long time, glibc was split into multiple, separate, shared objects. For example, the threading library libpthread was contained in a shared object libpthread.so.0, and the application interface for the dynamic linker, libdl, in the file libdl.so.2. There was even a time, some twenty years ago, when there were two separate implementations of libpthread, the LinuxThreads implementation for Linux 2.4 and earlier and the Native POSIX Threads Library (NPTL) implementation for Linux 2.6 and later.
In the glibc 2.34 release, we have integrated most components that used to be in separate shared objects into the main libc object, libc.so.6. These changes have been implemented in a backward-compatible fashion, so even though libpthread is gone as a separate object, all the public functions it used to provide (such as pthread_create) are still available. In this consolidation effort, glibc follows the pioneering work of the musl C library, which provides absolutely everything (including the dynamic linker) in a single shared object.
The developer view
A textbook "Hello, world!" example using C++ threads looks like this:
#include <iostream>
#include <thread>

int
main()
{
  std::thread thr{[]() {
    std::cout << "Hello, world!\n";
  }};
  thr.join();
}
Building the program with g++ on a system that uses glibc 2.33 or earlier results in an unexpected error:
$ g++ -o hello hello.cpp
/usr/bin/ld: /tmp/ccJckARF.o: in function `std::thread::thread<main::{lambda()#1}, , void>(main::{lambda()#1}&&)':
hello.cpp:(.text+0x9b): undefined reference to `pthread_create'
collect2: error: ld returned 1 exit status
For a beginner, this error message is very confusing. The programmer did not write pthread_create, so it is not clear why the linker would complain about its absence. The fix is to link with libpthread, the separate thread library implementation:
$ g++ -o hello hello.cpp -lpthread
$ ./hello
Hello, world!
But with glibc 2.34, the command works without -lpthread:
$ g++ -o hello hello.cpp
$ ./hello
Hello, world!
The -lpthread option still works because glibc provides an empty libpthread.a file. This file replaces the libpthread.so symbolic link to the shared object file libpthread.so.0. The shared object still exists so that existing applications that link dynamically against it can still launch. dlopen also continues to work, but the file is empty apart from a few placeholder symbols. The presence of this file helps distribution dependency generators provide the correct set of dependencies.
The reorganization of files can seem like a trivial change, but integrating glibc components into the main library also helps more advanced use cases. For example, a programmer might want to link statically against the Gio library but dynamically against glibc. A typical way to mix linking strategies is to use pkg-config with the --push-state/--pop-state linker bracket:
$ gcc -o application main.o -Wl,--push-state,-Bstatic \
    $(pkg-config --static --libs gio-2.0) -Wl,--pop-state
The command pkg-config --static --libs gio-2.0 prints the static libraries required by gio-2.0, and only those are marked for static linking with -Bstatic. But the command does not work as expected in glibc versions before 2.34, because the pkg-config output includes -ldl, a glibc component, and the static libdl.a library provided is incompatible with dynamic linking. Thus, the command leads to the following error:
/usr/bin/ld: /usr/lib64/libdl.a(dlopen.o): in function `dlopen':
(.text+0x9): undefined reference to `__dlopen'
This error is not very hard to fix: Just filter out -ldl from the pkg-config output and include it after the -Wl,--pop-state option instead. But in glibc 2.34, libdl has also been integrated, so libdl.a is now empty, and linking against it works in both static and dynamic linker invocations.
A downside of these changes is that we had to add many new GLIBC_2.34 symbol versions for existing functions. However, we had to add a new __libc_start_main@@GLIBC_2.34 symbol version anyway to implement a long-requested feature, startup code hardening. __libc_start_main is called by all applications during startup. This new symbol version prevents applications that have been built against glibc 2.34 from launching on systems that have installed glibc 2.33 and earlier. Because of that, the further GLIBC_2.34 symbols added for integrating libpthread and the other components did not seem like much of an additional burden.
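To make the build-time versus run-time distinction concrete, the following minimal sketch compares the glibc version a program was compiled against with the one it is running on. It assumes a glibc-based system; LibcVersion, parse_version, at_least, built_against, and running_on are hypothetical helper names chosen for illustration, while gnu_get_libc_version and the __GLIBC__/__GLIBC_MINOR__ macros are provided by glibc. A check like this cannot rescue a binary that already fails to launch because of a missing symbol version; it is only meant for diagnostics.

```cpp
#include <cstdio>   // on glibc this also defines __GLIBC__ via <features.h>
#ifdef __GLIBC__
#  include <gnu/libc-version.h>   // gnu_get_libc_version()
#endif

// Hypothetical helper type: a "MAJOR.MINOR" glibc version.
struct LibcVersion {
    int major_part = 0;
    int minor_part = 0;
};

// Parse strings such as "2.34"; returns false if the text is malformed
// and leaves `out` untouched in that case.
inline bool parse_version(const char *text, LibcVersion &out)
{
    LibcVersion tmp;
    if (std::sscanf(text, "%d.%d", &tmp.major_part, &tmp.minor_part) != 2)
        return false;
    out = tmp;
    return true;
}

// True if v is the given version or newer.
inline bool at_least(const LibcVersion &v, int want_major, int want_minor)
{
    return v.major_part > want_major ||
           (v.major_part == want_major && v.minor_part >= want_minor);
}

// Version of the glibc headers this code was compiled against
// (0.0 when not building against glibc).
inline LibcVersion built_against()
{
    LibcVersion v;
#ifdef __GLIBC__
    v.major_part = __GLIBC__;
    v.minor_part = __GLIBC_MINOR__;
#endif
    return v;
}

// Version of the glibc loaded into the running process
// (0.0 when not running on glibc).
inline LibcVersion running_on()
{
    LibcVersion v;
#ifdef __GLIBC__
    parse_version(gnu_get_libc_version(), v);
#endif
    return v;
}
```

Calling at_least(running_on(), 2, 34) then tells a program whether the libpthread functions live in libc.so.6 on the current system.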
The system administrator view
Splitting glibc into multiple components means that some components are loaded on process start (certainly the dynamic linker and the main libc.so.6 library), whereas other components might be loaded later. Such components can be loaded indirectly, for example, if the Name Service Switch (NSS) is used to look up user information using the getpwnam function. In this case, NSS modules such as nss_files or nss_systemd are loaded behind the scenes.
Some of these modules are part of glibc itself (nss_files). However, others are part of other software (e.g., systemd), and those could depend on glibc components that are not initially loaded. In both cases, it is not always possible during glibc upgrades to preserve the internal application binary interface (ABI) between the initially-loaded glibc components and the components loaded later.
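The late loading described above can be observed from inside a process. The following is a minimal sketch, assuming a Linux system with /proc mounted; mapped_shared_objects and objects_loaded_by_getpwnam are hypothetical helper names, and which objects (if any) show up depends on the glibc version and the local NSS configuration.

```cpp
#include <fstream>
#include <set>
#include <string>
#include <pwd.h>   // getpwnam()

// Hypothetical helper: collect the distinct shared objects currently mapped
// into this process by scanning /proc/self/maps (Linux-specific). Returns an
// empty set if the file cannot be read.
inline std::set<std::string> mapped_shared_objects()
{
    std::set<std::string> result;
    std::ifstream maps("/proc/self/maps");
    std::string line;
    while (std::getline(maps, line)) {
        auto slash = line.find('/');          // start of the path column
        if (slash == std::string::npos)
            continue;                         // anonymous mapping, no path
        std::string path = line.substr(slash);
        if (path.find(".so") != std::string::npos)
            result.insert(path);
    }
    return result;
}

// Hypothetical helper: return the shared objects that appeared as a side
// effect of a single getpwnam() call, for example NSS modules.
inline std::set<std::string> objects_loaded_by_getpwnam(const char *user)
{
    const auto before = mapped_shared_objects();
    (void) getpwnam(user);   // the lookup result itself is not needed here
    const auto after = mapped_shared_objects();

    std::set<std::string> added;
    for (const auto &path : after)
        if (before.count(path) == 0)
            added.insert(path);
    return added;
}
```

Printing the result of objects_loaded_by_getpwnam("root") lists whatever was pulled in after process start; each such object is a candidate for the version mismatch described above if glibc is upgraded while the process is running.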
If the system administrator performs a glibc upgrade and neglects to restart all services (typically with a reboot of the system), late loading of glibc components might pull updated versions of the components described in the previous paragraph into a process that uses parts of the old glibc installation. The resulting ABI mismatches can result in hard-to-diagnose failures, including crashes. One common example is that systemd can no longer launch services that use a User= directive (although systemctl daemon-reexec can usually be used to work around this).
The use of incompatible dependencies is particularly problematic for libpthread, due to its tight integration with the rest of glibc. But also, when backporting changes to nss_files, distributions had to attempt to preserve the internal ABI with custom downstream patches, which is somewhat cumbersome.
Loading as much as possible of glibc at process startup makes these issues go away. Long-running processes keep using the old glibc version.
Performance considerations
Most processes on a typical GNU/Linux system were already dynamically linking to libpthread, even before its integration. Loading these processes is now marginally faster, because the dynamic linker has to process fewer symbol lookups and relocations. Processes that did not load libpthread before now invoke one additional system call (set_robust_list) that was previously avoided. This system call is required to make process-shared robust mutexes work even if pthread_create is never called.
All the integrated components had few relative relocations, which means that they do not contribute significantly to the overall glibc relocation overhead.
Historically, some applications have interpreted the absence of the pthread_create symbol as an indicator to switch from thread-safe algorithms to single-threaded algorithms, as an optimization. This optimization no longer works because the pthread_create symbol is now always present. Instead, applications should enable such optimizations based on the __libc_single_threaded variable, which was introduced in glibc 2.32, partly in preparation for the libpthread integration changes.
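As an illustration of the recommended approach, here is a minimal sketch of a reference counter that consults __libc_single_threaded instead of probing for pthread_create. The variable and its header <sys/single_threaded.h> come from glibc 2.32 and later; RefCount and process_is_single_threaded are hypothetical names, and the fallback branch (always take the thread-safe path) is an assumption for systems where the header is missing.

```cpp
#include <atomic>

// <sys/single_threaded.h> is glibc-specific (2.32 and later). Where it is
// unavailable, this sketch conservatively assumes threads may exist.
#if defined(__has_include)
#  if __has_include(<sys/single_threaded.h>)
#    include <sys/single_threaded.h>
#    define HAVE_LIBC_SINGLE_THREADED 1
#  endif
#endif

// Hypothetical helper: true only if glibc reports that the process is
// currently single-threaded.
inline bool process_is_single_threaded()
{
#ifdef HAVE_LIBC_SINGLE_THREADED
    return __libc_single_threaded != 0;
#else
    return false;
#endif
}

// Hypothetical reference counter that skips the atomic read-modify-write
// while no other thread can exist. The flag is checked on every operation
// because it can change once a thread has been created.
class RefCount {
public:
    void acquire()
    {
        if (process_is_single_threaded())
            count_.store(count_.load(std::memory_order_relaxed) + 1,
                         std::memory_order_relaxed);
        else
            count_.fetch_add(1, std::memory_order_acq_rel);
    }

    // Returns true when the last reference has been dropped.
    bool release()
    {
        if (process_is_single_threaded()) {
            const long remaining = count_.load(std::memory_order_relaxed) - 1;
            count_.store(remaining, std::memory_order_relaxed);
            return remaining == 0;
        }
        return count_.fetch_sub(1, std::memory_order_acq_rel) == 1;
    }

    long value() const { return count_.load(std::memory_order_relaxed); }

private:
    std::atomic<long> count_{1};   // starts with one owner
};
```

Both branches produce the same counts; the single-threaded branch merely avoids the cost of a locked instruction, which is the kind of optimization that used to be keyed on the presence of pthread_create.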
Remaining issues
Currently, the dynamic linker still lives in a separate shared object (/lib64/ld-linux-x86-64.so.2 on x86-64, for example). As musl shows, it is theoretically possible to provide the entire C library through the dynamic linker. For glibc, this would require additional (non-mechanical) changes. Without further work, it would no longer be possible to load an optimized libc.so.6 implementation file based on CPU characteristics (e.g., a libc.so.6 version that uses PCREL instructions on POWER10, something that cannot be achieved through IFUNC-based optimizations).
We have also been unable to complete the transition of the libm, libmvec, and libresolv components in time for the glibc 2.34 release. This means that some linking and upgrade hazards remain. We hope to complete these transitions in a future release.
These updates should make it easier for distributions to backport bug fixes and other changes. Some distributions might even want to experiment with seamless upgrades across major glibc releases without requiring reboots.