Using hardened_malloc in Alpine Linux

 1 year ago
source link: https://dustri.org/b/using-hardened_malloc-in-alpine-linux.html

Now that GrapheneOS's hardened_malloc is available in Alpine Linux (yay!), time to make use of it. While throwing LD_PRELOAD into openrc units works, it's far easier to use it system wide.

Note that musl doesn't use /etc/ld.so.conf but /etc/ld-musl-$(ARCH).path instead. You can check that it's actually being used after a reboot by running lsof /usr/lib/libhardened_malloc.so.
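For the per-process LD_PRELOAD route, here is a minimal sketch. The helper names run_hardened and has_lib_mapped are made up for illustration, and the library path is the one quoted above; it assumes a Linux /proc filesystem and is not meant as the one true way of doing this.

```shell
#!/bin/sh
# Library path as given in the article; override HM_LIB if yours differs.
HM_LIB="${HM_LIB:-/usr/lib/libhardened_malloc.so}"

# Hypothetical helper: run a single command with hardened_malloc preloaded.
# Falls back to the default allocator (with a warning) if the library is missing.
run_hardened() {
    if [ -r "$HM_LIB" ]; then
        env LD_PRELOAD="$HM_LIB" "$@"
    else
        echo "warning: $HM_LIB not found, running without it" >&2
        "$@"
    fi
}

# Hypothetical helper: succeed if a library appears in a process's memory map.
# Usage: has_lib_mapped PID [LIBNAME] [MAPS_FILE]
# MAPS_FILE defaults to /proc/PID/maps; it is only a parameter so the check
# can also be run against a saved copy of a maps file.
has_lib_mapped() {
    pid="$1"
    lib="${2:-libhardened_malloc.so}"
    maps="${3:-/proc/$pid/maps}"
    grep -qF "$lib" "$maps" 2>/dev/null
}
```

Something like `run_hardened some-daemon --foreground` starts one process with the allocator swapped, and `has_lib_mapped 1234 && echo "hardened"` (with 1234 standing in for a real PID) is a per-process counterpart to the lsof check.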

But why use hardened_malloc instead of musl's malloc-ng? Because they're making different trade-offs: the latter is optimized for using a minimal amount of memory while still being reasonably secure, while the former is about making the life of an attacker as difficult as possible, at the cost of slightly higher memory consumption. Fortunately, my hypervisor has a ton of RAM, and my services are pretty thrifty, so I don't really have to worry about staying within my memory budget.

Both allocators are pretty slow, so swapping one for the other doesn't really change anything with regard to speed or latency. I played a bit with the lightweight version of hardened_malloc, but didn't notice a difference in any of my services, since:

  • None of them have low-latency requirements.
  • The memory allocator isn't a significant limiting factor, even for my Tor relays or QEMU/KVM machines.
  • Most of them are running with their own memory allocator (Python, PHP, Go, …), and while replacing them is often doable, their custom allocators are usually so tailored that the performance impact of swapping them out is abysmal. Moreover, I don't think that anyone will waste a memory-corruption-based remote code execution on web services, when they're likely full of lower-hanging fruit.

Moreover, having a slow-ish allocator is a nice motivation for making it faster.
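As an illustration of the "often doable" part for runtimes with their own allocator: CPython honours the PYTHONMALLOC environment variable, and PYTHONMALLOC=malloc makes it call the C library's malloc() instead of its internal pymalloc. The wrapper below is a rough sketch; run_python_hardened is a made-up name, and the library path is the one used earlier in this post.

```shell
#!/bin/sh
# Library path as given in the article; override HM_LIB if yours differs.
HM_LIB="${HM_LIB:-/usr/lib/libhardened_malloc.so}"

# Hypothetical wrapper: run a command with CPython's own allocator disabled,
# so that its allocations go through malloc(), and preload hardened_malloc
# when the library is present.
run_python_hardened() {
    if [ -r "$HM_LIB" ]; then
        env PYTHONMALLOC=malloc LD_PRELOAD="$HM_LIB" "$@"
    else
        env PYTHONMALLOC=malloc "$@"
    fi
}
```

For example, `run_python_hardened python3 app.py`, where app.py is a placeholder. If the allocator is already loaded system wide, setting PYTHONMALLOC=malloc alone should be enough. Expect the slowdown described above.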

I did all my benchmarks with mimalloc-bench, but since it produces a lot of data, you should run it yourself instead of trusting me:

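# Fetch the benchmark suite
git clone https://github.com/daanx/mimalloc-bench
cd mimalloc-bench
# Build hardened_malloc (hm), musl's malloc-ng (mng) and the benchmarks
./build-bench-env.sh hm mng bench
cd out/bench/
# bench.sh lives in the repository root, two levels up from out/bench
../../bench.sh alla allt > /tmp/out.txt
python3 ./graph.py /tmp/out.txt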
