
Ask HN: Why don't PCs have better entropy sources?

 2 years ago
source link: https://news.ycombinator.com/item?id=30877296

62 points by bloopernova 5 hours ago | 73 comments

After reading the thread "Problems emerge for a unified /dev/*random" (1), I was wondering why PCs don't have a bunch of sensors available to draw entropy from.

Is this assumption correct, that adding a magnetometer, accelerometer, simple GPS, etc to a motherboard would improve its entropy gathering? Or is there a mathematical/cryptographical rule that makes the addition of such sensors useless?

Do smartphones have better entropy gathering abilities? It seems like phones would be able to seed an RNG based on input from a variety of sensors that would all be very different even between phones in the same room. Looking at a GPS Android app like Satstat (2), it feels like there's a huge amount of variability to draw from.

If such sensors would add better entropy, would it really cost that much to add them to PC motherboards?

(1) https://news.ycombinator.com/item?id=30848973

(2) https://mvglasow.gitlab.io/satstat/ & https://f-droid.org/en/packages/com.vonglasow.michael.satsta...

PCs can and do include hardware support for entropy gathering; see RDSEED [1].

Linux is aware of RDSEED and uses it to provide additional randomness when available. You do need to trust the implementation to be free from backdoors and bugs - some CPUs are known to be buggy. [2]
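For anyone curious, pulling bits from that source directly from userspace is nearly a one-liner with the compiler intrinsic. A minimal sketch (assumes a CPU with RDSEED and gcc/clang built with -mrdseed; this is an illustration, not how the kernel consumes it):

    #include <immintrin.h>   /* _rdseed64_step */
    #include <stdio.h>

    int main(void) {
        unsigned long long seed;
        /* RDSEED can run dry; it signals that by returning 0, so retry a
           bounded number of times instead of spinning forever. */
        for (int tries = 0; tries < 10; tries++) {
            if (_rdseed64_step(&seed)) {
                printf("rdseed: %016llx\n", seed);
                return 0;
            }
        }
        fprintf(stderr, "RDSEED gave no data; fall back to another source\n");
        return 1;
    }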

Randomness seeding issues largely do not concern desktop PCs or smartphones (although you can easily catch early-boot programs like systemd reading randomness before it has been fully seeded) [3].

It is a much bigger issue on either small embedded devices or VMs, both of which may have very few peripherals to gather entropy from. They can be provided randomness through dedicated hardware support, or from the host, and they probably should be, but that still leaves many real-world systems currently running Linux out in the cold. This is not just a theoretical problem, as has been shown by looking at indicators like RSA keys with colliding primes, which should never happen when generated with good RNG. [4]

[1] https://en.wikipedia.org/wiki/RDRAND

[2] https://github.com/systemd/systemd/issues/18184

[3] https://github.com/systemd/systemd/issues/4167

[4] https://freedom-to-tinker.com/2012/02/15/new-research-theres...

I can't agree on embedded devices.

They have plenty of peripherals that can act as sensors, from which you can draw entropy seeds.

You can even leave an empty pin with a PCB trace to act as a crude antenna and pick up garbage RF.

You can use the built in RC oscillator as a crude temperature sensor.

My point is, if you're creative, embedded systems offer you so many options without any added cost.

You don't even need sensors. STMicroelectronics has a TRNG block [1] that "passes" NIST SP 800-22 statistical analysis [2]. I used quotes because SP 800-22 is an analysis test suite and the notion of passing depends on your hypothesis parameters, but the suite itself is considered the de facto analysis mechanism.

I don't follow where the confusion is coming from?

[1] https://www.st.com/resource/en/application_note/dm00073853-s...

[2] https://csrc.nist.gov/Projects/Random-Bit-Generation/Documen...
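For a sense of scale, reading that block is only a few register accesses. A hedged sketch for an STM32F4-class part, using the register and bit names from the vendor CMSIS headers (a real driver should also check the seed/clock error flags, per the application note):

    #include "stm32f4xx.h"   /* CMSIS device header; assumes an F4-class MCU */

    /* Minimal sketch: enable the on-chip TRNG and read one 32-bit sample.
       Production code should also watch RNG_SR_SECS/RNG_SR_CECS (seed and
       clock error flags) and discard data while they are set. */
    static uint32_t trng_read32(void)
    {
        RCC->AHB2ENR |= RCC_AHB2ENR_RNGEN;   /* clock the RNG peripheral */
        RNG->CR |= RNG_CR_RNGEN;             /* enable the generator */
        while (!(RNG->SR & RNG_SR_DRDY))     /* wait for a word to be ready */
            ;
        return RNG->DR;                      /* 32 bits of hardware randomness */
    }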

You're probably right; the issue might be that there's no standard way for an OS to hook into that.
Hmm, good points, but you also ignore the power costs here - using oscillators, open pins as thermometers, and RF antennas requires additional power draw, not to mention that modifying these devices may be nearly impossible due to their embedded nature.

Even presuming you can modify the hardware/firmware, the additional cycles to handle and process the sensor data mean additional power draw compared to normal operation (embedded devices frequently power down and wake from an interrupt to save power; additionally, not all instructions switch on the same number of transistors, so floating-point ops require more power than simple branch instructions), which again makes doing this far from easy.

So "without any added cost" is simply untrue.

The reality is that randomness is relatively expensive, whether via hardware or software. Phones have more sensors - they also have massively complex SoCs and large batteries, which are still often drained over the course of a single day. They also tend to cost $1k+, at least for flagship models (prices go as low as $50 these days, but this is more from economies of scale and resale-value economics than because phone hardware/software has suddenly become cheap to manufacture).

Could timing be used to seed a CPU? I.e., not all CPUs are built the same, and similarly for power supplies; could the two be used together to bootstrap seeding, e.g. by measuring voltage or voltage differences?
That's effectively how a lot of jitter-sourced entropy ends up working, sometimes in a more roundabout way though.
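A toy sketch of the idea (real collectors such as the kernel's jitterentropy code add conditioning and health tests; the loop count and bit-folding below are arbitrary illustration, not a vetted design):

    #include <stdint.h>
    #include <time.h>

    /* Back-to-back high-resolution clock reads: the low bits of the deltas
       wobble with cache, interrupt and frequency-scaling noise. The raw
       output is biased and correlated, so it must still be fed through a
       proper conditioner before being credited as entropy. */
    static uint64_t jitter_sample(void)
    {
        struct timespec a, b;
        uint64_t acc = 0;
        for (int i = 0; i < 64; i++) {
            clock_gettime(CLOCK_MONOTONIC_RAW, &a);
            clock_gettime(CLOCK_MONOTONIC_RAW, &b);
            uint64_t delta = (uint64_t)(b.tv_sec - a.tv_sec) * 1000000000ull
                           + (uint64_t)(b.tv_nsec - a.tv_nsec);
            acc = (acc << 1) ^ (delta & 1);   /* keep only the noisiest bit */
        }
        return acc;   /* raw, unconditioned bits */
    }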
Yes! I went to a talk given by a security researcher who proposed using the difference in clocks between north bridge and south bridge to generate randomness.
I think it's worth repeating this:

While there are platforms with better and worse hardware sources of unpredictable bits, the problem with Linux /dev/random isn't so much a hardware issue, but rather a software one. The fundamental problem Linux tries to solve isn't that hard, as you can see from the fact that so many other popular platforms running on similar hardware have neatly solved it.

The problem with the LRNG is that it's been architecturally incoherent for a very long time (entropy estimation, urandom vs random, lack of clarity about seeding and initialization status, behavior during bootup). As a result, an ecosystem of software has grown roots around the design (and bugs) of the current LRNG. Major changes to the behavior of the LRNG break "bug compatibility", and, because the LRNG is one of the core cryptographic facilities in the kernel, this is an instance where you really really don't want to break userland.

The basic fact of kernel random number generation is this: once you've properly seeded an RNG, your acute "entropy gathering" problem is over. Continuous access to high volumes of high-entropy bits is nice to have, but the kernel gains its ability to satisfy gigabytes of requests for random numbers from the same source that modern cryptography gains its ability to satisfy gigabytes of requests for ciphertext with a 128-bit key.

People looking to platform hardware (or who fixate on the intricacies of threading the LRNG into guest VMs) are mostly looking in the wrong place for problems to solve. The big issue today is that the LRNG is still pretty incoherent, but nobody really knows what would break if it were designed more carefully.
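To make the "seed once, then stretch" point concrete, here is a hedged sketch of the principle using OpenSSL's EVP API: one 128-bit seed expanded into arbitrarily many output bytes with AES-CTR. (The kernel's actual construction is different, ChaCha-based; this is only an illustration of why a properly seeded generator never runs out, and real code should simply use getrandom().)

    #include <openssl/evp.h>
    #include <string.h>

    /* Expand one 16-byte seed into 'outlen' pseudorandom bytes by running
       AES-128-CTR over zeros. Conceptual demo only; link with -lcrypto. */
    static int stretch_seed(const unsigned char seed[16],
                            unsigned char *out, size_t outlen)
    {
        unsigned char iv[16] = {0};    /* fixed nonce is fine for a one-shot demo */
        unsigned char zeros[64] = {0};
        EVP_CIPHER_CTX *ctx = EVP_CIPHER_CTX_new();
        if (!ctx)
            return -1;
        if (EVP_EncryptInit_ex(ctx, EVP_aes_128_ctr(), NULL, seed, iv) != 1)
            goto err;
        for (size_t done = 0; done < outlen; ) {
            size_t want = outlen - done;
            int chunk = (int)(want < sizeof zeros ? want : sizeof zeros), len = 0;
            if (EVP_EncryptUpdate(ctx, out + done, &len, zeros, chunk) != 1)
                goto err;
            done += (size_t)len;
        }
        EVP_CIPHER_CTX_free(ctx);
        return 0;
    err:
        EVP_CIPHER_CTX_free(ctx);
        return -1;
    }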

The piece I've been missing in this whole debate: why isn't the existing RNG simply frozen in its current bug-exact-behavior state and a new /dev/sane_random created?

Stuff that depends on the existing bugs in order to function can keep functioning. Everything else can move to something sane.

Obviously I'm missing something here.

Because /dev/sane_random or sane_random(2) has better security properties than what we have now, and you want the whole gamut of Linux software to benefit from that; just as importantly, you don't want /dev/urandom and getrandom(2) to fall into disrepair as attention shifts to the new interface, for the same reason that you care very much about UAF vulnerabilities in crappy old kernel facilities most people don't build new stuff on anymore.

Also, just, it seems unlikely that the kernel project is going to agree to run two entire unrelated CSPRNG subsystems at the same time! The current LRNG is kind of an incoherent mess root and branch; it's not just a matter of slapping a better character device and system call on top of it.

> Obviously I'm missing something here

For a start, there's a long tail of migrating all useful software to /dev/sane_random. Moreover, there's a risk that new software accidentally uses the old broken /dev/random.

Besides, /dev/sane_random essentially exists; it's just a system call, getrandom(2).
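For reference, using it is about as small as an API can get (glibc 2.25+ exposes the wrapper in <sys/random.h>):

    #include <stdio.h>
    #include <sys/random.h>

    int main(void) {
        unsigned char key[32];
        /* flags = 0: block only until the kernel pool has been initialized
           once, then behave like reading /dev/urandom. */
        ssize_t n = getrandom(key, sizeof key, 0);
        if (n != (ssize_t)sizeof key) {
            perror("getrandom");
            return 1;
        }
        printf("got %zd random bytes\n", n);
        return 0;
    }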

It's not that simple; Donenfeld wants to replace the whole LRNG with a new engine that uses simpler, more modern, and more secure/easier-to-analyze cryptography, and one of the roadblocks there is that swapping out the engine risks breaking bugs that userland relies on.
All x86 CPUs sold after 2015 or so have built-in hardware random number generators [1]. They get their entropy by sampling thermal noise on the input of a metastable logic gate [2].

[1] https://en.wikipedia.org/wiki/RDRAND

[2] https://www.electronicdesign.com/resources/article/21796238/...

Read the LWN article. The problem isn’t PCs at all - it’s virtual machines, and embedded machines, that hang at startup, waiting for any entropy at all. Of course you could say the bug is in the hypervisor, or the embedded board designer, but the Linux kernel developers can’t afford to break these use cases just for purity’s sake.
So the problem is very practical, not theoretical. Someone just needs to do the plumbing, and get that entropy from host to guest OS.
> get that entropy from host to guest OS.

There are some ways, although that doesn't mean they're always used.

I read this week that QEMU provides a virtio RNG device to the guest that reads from the host. That's good. What I'm less clear about is other hypervisors, or whether x86 hypervisors tend to provide RDRAND support.

> magnetometer, accelerometer, simple GPS,

At boot time, on a server sitting in a rack beside thousands of others ... how are these going to help any? They ain't moving, and the RF/energy environment around them should be steady state or well within characterizable bounds of noise.

"Random enough" is a metaphysical question when you get into it. If an RTLSDR stick and a site-customized munger script can't provide enough entropy for the entire data center, you've fallen into a Purity spiral and will never be happy anyway.

There are true hardware random number generators. IIRC, one example is based on a reverse-biased diode. Due to random quantum effects, electrons occasionally flow backwards, and measuring that gives you a source of real randomness.
The dedicated RNG scares the paranoid the most because it is an obvious target for corruption.
The best RNG solution for the paranoid would have been a standardized internal header/connector with an analog-to-digital converter input and a power supply, like the connector that exists on most motherboards for front-panel audio (but preferably with a higher-frequency and lower-resolution ADC than for audio, even if an audio ADC is also acceptable).

If such a connector had been standardized, very small analog noise generator boards that could be plugged into it would cost only a few dollars at most, and they would not contain any device more complex than an operational amplifier.

This solution cannot be back-doored, because it is trivial to test the ADC without a noise generator attached, to verify that it really is an ADC, and the small PCB with the analog noise generator can also be easily inspected to verify that it contains only the specified (analog) devices.

All this could have been very simple and cheap if it had been standardized, and no more difficult to use than the unverifiable CPU instructions.

As it is, the paranoid must have electronics experience to design and make their own analog noise generator, to be used either with the microphone input of the PC audio connectors (which includes a weak power supply) or, better, with the ADC of a small microcontroller board connected via USB (preferably to an internal USB connector of the PC motherboard).
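For what it is worth, the software half of this already exists: a privileged process can credit entropy from any external source through the RNDADDENTROPY ioctl on /dev/random. A hedged sketch, where "/dev/adc-noise" is a hypothetical device standing in for whatever actually exposes the analog board's samples:

    #include <fcntl.h>
    #include <linux/random.h>   /* struct rand_pool_info, RNDADDENTROPY */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <unistd.h>

    int main(void) {
        /* "/dev/adc-noise" is a placeholder for the real sample source. */
        int src = open("/dev/adc-noise", O_RDONLY);
        int rnd = open("/dev/random", O_WRONLY);   /* needs CAP_SYS_ADMIN */
        if (src < 0 || rnd < 0) { perror("open"); return 1; }

        unsigned char sample[64];
        ssize_t n = read(src, sample, sizeof sample);
        if (n <= 0) { perror("read"); return 1; }

        struct rand_pool_info *info = malloc(sizeof *info + (size_t)n);
        if (!info) return 1;
        info->buf_size = (int)n;
        /* Be conservative about how much entropy the analog source really
           delivers; 4 bits per byte here is only a placeholder estimate. */
        info->entropy_count = (int)n * 4;
        memcpy(info->buf, sample, (size_t)n);
        if (ioctl(rnd, RNDADDENTROPY, info) < 0) { perror("RNDADDENTROPY"); return 1; }
        puts("mixed ADC noise into the kernel pool");
        return 0;
    }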

> standardized, very small analog noise generator boards

The following design[1] uses _two_ pluggable analog noise generator boards (since you don't trust one). The writeup will be of interest to the paranoid in this thread.

[1] http://nosuchlabs.com/

Thanks for the link.

This is a good example of how you can make an RNG using a microcontroller board connected to an internal USB connector of the motherboard.

However, what they have is not perfect, because the RNG boards include the ADC and some simple digital post-processing, providing an RS-232 serial output. For better auditability, the RNG boards should have been simpler, with only the analog part of the RNG, and they should have used an ADC input of the microcontroller instead of an RS-232 input. If you compile from source and write the flash of the microcontroller yourself, then it is secure enough.

Because such boards are only seldom available to buy, many people have done something like this only for themselves.

However, the problem is that this is a non-standard solution. A connector like the 3-pin header shown at this link should have existed on every motherboard (but with an analog input, not an RS-232 input). All software should have expected a standard RNG input on the motherboard, just as it expects HD Audio input/output or temperature/RPM sensors. If the ADC were provided by the motherboard chipset, which already provides many other ADCs, there would be no need for a microcontroller and no need for microcontroller firmware.

Had they wanted, Intel could easily have standardized an RNG input for the chipset, like they standardized HD Audio, SMBus and countless other chipset features. Everyone else would have followed.

It is very likely that standardizing such a solution would actually have been much cheaper for Intel and AMD than implementing RNG instructions inside the CPU, which will always remain hard to recommend for serious applications, so they waste die area and testing time during manufacturing, and they may also slightly reduce the yield of good dies.

Here's another iteration: a user-supplied board with a high-gain op-amp, a comparator, and a latch (accepting a clock line) could produce a definite noise-informed bit sequence. This bit sequence could be observed both at that level and at the software level, to confirm that no alteration had taken place in between, in the motherboard/chipset etc.
That would just give an attacker an easy way to control the entropy source.
Paranoid implies some aspect of unjustified fear. In this case the fear is quite justified.
What’s the point of not trusting the hardware entropy source while still trusting the rest of the chip / hardware?
For the rest of the chip / hardware there are at least some chances to test what they do and discover any suspicious behavior.

Any well-designed back-door in a black-box RNG cannot be discovered by testing.

Except for the RNG, the only other thing that cannot be trusted at all is the possibility that your computer/smartphone allows remote connections to a management component of its hardware, regardless of how you configure it.

Wired connections are not very dangerous, because you can pass them through an external firewall and block anything suspicious.

The problem is with WiFi connections, e.g. the Intel WiFi chipsets in laptops, and obviously the least under your control are mobile phones.

However even a smartphone can be put in a metal box to ensure that nobody can connect to it (even if that also defeats the main use of a mobile phone, it can make sense if you are worried about remote control only sometimes, not permanently).

On the other hand, an RNG that cannot be verified to be what it claims to be is completely useless.

> For the rest of the chip / hardware there are at least some chances to test what they do and discover any suspicious behavior.

That there are some chances to test them does not provide any measure of trust... you actually have to perform the audit to achieve that.

>... On the other hand, a RNG that cannot be verified to be what it claims to be, is completely useless.

If we are going to take such an absolute line over RNGs, then, to be consistent, we should take the same attitude to the rest of the hardware and software we use - but, per my previous point, that means actually evaluating it, not just having the possibility of doing so.

One might, instead, argue that we should use only verifiable RNGs because that is actually feasible (at least for non-mobile equipment), but that does nothing to bring the rest of the system up to the standard of your last paragraph.

Like I have said, besides the RNG, the only other problem is with the possibility of remote connections to the computer chipset.

Any other malicious behavior is implausible, as it is either easy to detect or requires too much information about the future (to determine what to do and when to do it without a remote-control connection). Self-destruct is something that could happen, e.g. after a certain number of active hours, but this would make sense only if you are a target for someone able to load special firmware into your own computer. If it were a general feature of a computer, it would be easily detected once it happened at random.

So if you do not trust the HW, you must prevent remote connections. This is easy for desktop/server computers, if you do not have WiFi/Bluetooth/LTE and you do not use Intel Ethernet interfaces (or other chipset Ethernet interfaces with remote management features) connected to the Internet or to any other untrusted network. Towards untrusted networks, you must use Ethernet interfaces without sideband management links; e.g. you may securely use USB Ethernet interfaces.

Unfortunately, currently there is no way to completely trust laptops when WiFi connections are possible, even if they claim that e.g. Intel vPro is disabled. In any case, it is still better if the manufacturer claims that this is true (like in my Dell laptop), even if you cannot verify the claim with certainty.

Even if someone were able to connect remotely to your computer and spy on you, they would have access only to your unencrypted or active documents.

If you use a bad RNG for encryption purposes, then the spies could also access any encrypted and non-active documents, which is a much greater danger.

In conclusion, the RNG is still in the top position of the hardware that cannot be tested and cannot be trusted. Nothing else comes close.

> Like I have said...

Yes, you are repeating yourself without addressing the issue that the ability to verify does not, by itself, confer any trust. Even if we accept your conclusion, it does not mean the other risks are inconsequential.

Deniable backdoors are a much bigger risk than reproducible backdoors.

I trust my hardware manufacturers to be afraid of putting a backdoor into their chips if a binary captured via network surveillance could be used to show that a backdoor existed. This would be devastating to their business. Therefore, I trust them to not do anything that risks this occurring.

This is why people were so uneasy when internally-accessible unique serial numbers were added to the microcode engines of Intel processors.

You can trust a chip to correctly do cryptographic computations by comparing with another, more trusted system (an FPGA, if you want to go to absurd lengths).

You can protect yourself against faulty key generation by generating the key offsite or on a HSM.

However, a flaw in a RNG that allows a third party (hello NSA) to break cryptography - you cannot defend from that, you can't even detect it.

> However, a flaw in a RNG that allows a third party (hello NSA) to break cryptography - you cannot defend from that, you can't even detect it.

You always put bad randomness through enough calls of one-way functions that reversing them is computationally infeasible for your adversary for the lifetime of the secret.

A datacenter scenario seems like a good fit for a centralized source of entropy, like a server with a dedicated high-quality entropy source (maybe some kind of Geiger counter / nuclear-decay-based source?). Very early in the boot process, query the entropy server for a truly random seed and go from there to initialize your random algorithm, kind of like NTP and network time sources. Security would be something to pay attention to, as you wouldn't want an attacker to ever get control of providing entropy.
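A rough sketch of the client side of that idea; the host name, port, and "64 bytes per connection" protocol are made up for illustration, and a real deployment would need to authenticate the server. Writing the bytes to /dev/urandom mixes them into the pool without crediting entropy, so even a compromised seed server cannot make the pool worse:

    #include <fcntl.h>
    #include <netdb.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void) {
        /* Hypothetical in-rack seed service: name, port and protocol are
           assumptions for the sake of the sketch. */
        struct addrinfo hints = { .ai_socktype = SOCK_STREAM }, *res;
        if (getaddrinfo("entropy.rack.internal", "7777", &hints, &res) != 0)
            return 1;
        int s = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
        if (s < 0 || connect(s, res->ai_addr, res->ai_addrlen) < 0)
            return 1;
        freeaddrinfo(res);

        unsigned char seed[64];
        size_t got = 0;
        while (got < sizeof seed) {
            ssize_t n = read(s, seed + got, sizeof seed - got);
            if (n <= 0) return 1;
            got += (size_t)n;
        }

        /* Mix into the kernel pool without crediting entropy. */
        int fd = open("/dev/urandom", O_WRONLY);
        if (fd < 0 || write(fd, seed, sizeof seed) != (ssize_t)sizeof seed)
            return 1;
        return 0;
    }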
> Purity spiral

True that. I hadn't considered servers sitting in datacentres. Makes me wonder how AWS/Azure/Google manage randomness.
They don’t use them (read the details); they could fall back to using them. And it’s a stupid publicity stunt. And even then, they would use them via an ipcam - something probably way less secure than any rdrand or lava lamp.
It is and it isn't; in a sufficiently catastrophic scenario they might help for real.

Note also most people have never seen a real lava lamp, only digital reproductions, like the one in Day of the Tentacle. Not the same thing.

I am pretty sure radioactive decay is random, and it's not metaphysical.
Ah, but is your sample still live enough to be "cryptographic grade" random? Is the hardware that measures the source and the software that reports it subject to any periodicity that you don't know about but your attackers might?

(Some) People who study this often get lost down the rabbit hole and come out thinking the universe is deterministic.

Any distribution with a sufficient amount of entropy can be turned into a "cryptographic-grade" randomness source using randomness extractors [1]. These work independently of any outside factors that might be trying to sneak signal (e.g. periodicity) into the noise -- as long as you can prove there's sufficient entropy to start with, you're good to go.

[1] https://en.wikipedia.org/wiki/Randomness_extractor
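As a concrete (if very wasteful) example, the classic von Neumann extractor removes bias from a stream of independent coin flips by looking at pairs; a minimal sketch:

    #include <stddef.h>
    #include <stdint.h>

    /* Von Neumann extractor: for each pair of input bits, emit 1 for (0,1),
       0 for (1,0), and discard (0,0)/(1,1). The output is unbiased as long
       as the input bits are independent; modern extractors (see the link
       above) also handle correlated sources. */
    static size_t von_neumann(const uint8_t *in_bits, size_t n_in,
                              uint8_t *out_bits, size_t out_cap)
    {
        size_t out = 0;
        for (size_t i = 0; i + 1 < n_in && out < out_cap; i += 2) {
            if (in_bits[i] != in_bits[i + 1])
                out_bits[out++] = in_bits[i + 1];   /* 01 -> 1, 10 -> 0 */
        }
        return out;   /* number of unbiased bits produced */
    }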

Low-intensity radiation is random enough, but it's slow: your device is necessarily twiddling thumbs between one detected event and the next, and entropy is mostly proportional to the number of events (for example, almost n bits from which of 2^n identical detector units is hit by the next particle).
Or, it's what one of my ex-NSA buddies told me: we almost never break the encryption, we break the implementation, because that's where the errors are.

The same can assuredly apply to capturing entropy.

100% this. WEP WiFi was an infamous old example. The encryption was solid but the implementation was poor and could be easily broken.
There are such sensors, listening to thermal noise, right inside most CPUs: https://en.wikipedia.org/wiki/RDRAND It's a question of trust. Do you trust that Intel (or whoever builds the physical RNG) didn't build in a backdoor, a secret predictability? Maybe it's safer to build your own, or to combine several sources (like the Linux kernel does).
If you zoom in enough there is noise everywhere. Thermal noise across a resistor is particularly easy to measure [0]. Purists might want a nuclear decay event [1], or even a cosmic ray detection [2], but the more complex the apparatus, the more shenanigans can occur.

For hyper-important entropy, humans must invest in a macroscopic and very slow spectacle: a publicly prepared experiment broadcast live to a large audience. [3]

0 - https://analog.intgckts.com/noise/thermal-noise-of-a-resisto...

1 - https://en.wikipedia.org/wiki/Schr%C3%B6dinger%27s_cat

2 - https://www.youtube.com/watch?v=gwIGnATzBTg&t=479s

3 - https://www.youtube.com/watch?v=Bup0TcbQeVs

The problems you mentioned are not related to PCs or systems with motherboards in the way you are thinking. Every not-too-ancient PC CPU has been able to gather sufficient entropy just from cycle-timing jitter. The problem is with old, simple embedded systems (and actually just emulations of said systems).
For those that want an (additional) source of entropy, the FSF offers [1] a USB HWRNG [2] that uses ADC quantization noise. It seems to be simple and clever enough to get the job done. True to FSF form, the source is available if you want to adapt it to your own MCU. The data provided by mine passed all the randomness quality tests I could throw at it.

I don't know if it's still maintained or not, but the developer proposed a public entropy pool [3], which looks interesting. Full disclosure: I haven't looked at it closely enough to understand how the trust model works.

[1]: https://shop.fsf.org/storage-devices/neug-usb-true-random-nu...

[2]: https://www.gniibe.org/memo/development/gnuk/rng/neug.html

[3]: https://www.gniibe.org/memo/development/kun9be4/idea.html

I guess this is why "gpg --generate-key" asks you to bang randomly on your keyboard and wiggle your mouse, lol.
Interesting photonics opportunity here: could you produce a single photon slit-lamp on a chip for entropy production? Not sure you can do better than that.
For entropy sources that measure the environment, are there attacks where the attacker manipulates the environment? For example, if the source is measuring temperature, could an attacker alter the temperature near the sensor to create more predictable random numbers?

Related: does software detect if a sensor is broken or a poor source of entropy? Like if it broke and locked itself onto the same constant temperature reading?

That's why you only use the least significant digits of any number that comes from a sensor.

So if the temperature changes from 65.78614329 degrees to 66.24667182 degrees, you don't take 65 and 66, you take 329 and 182. Those digits are most likely to be random noise and not something an attacker can manipulate at will. Even if the analog part of the sensor is stuck at the same temperature as in your example, the digital reading will probably fluctuate around that value with plenty of random digits.

What if the connection to the sensor died and it was stuck at 0? What if an attacker heated up the sensor well above the maximum reading it could read?

I get these are probably unlikely, but curious if there's some built-in detection for similar cases.

> you don't take 65 and 66, you take 329 and 182.

You take all of them and run them through an appropriate extractor. It is very easy for ADC error to have structure and bias.

PCs are not where problems with entropy are generally a large issue. For starters, many CPUs have built-in hardware entropy sources (although not everybody trusts them).
> simple GPS

Excuse me, but when did pinging multiple satellites thousands of kilometres away (in orbit) become simple?

1. GPS doesn't "ping" multiple satellites. It's the other way around. Multiple satellites are constantly transmitting their times, and any receiver can listen in and figure out its location based on that.

2. Regardless of how much technology/research went into developing GPS, it's still fairly simple to access from a software point of view. A CPU may have millions/billions of transistors that are intricately placed, but that doesn't mean using your CPU to add two numbers can't be described as "simple".

I guess I'm just amazed that GPS is now considered simple.
I think VIA [1] did this on some of their mini-ITX boards with embedded CPUs. I seem to recall a contemporary review saying that they used electrical noise to implement an RNG, a fact which is corroborated [2] here:

> Nehemiah stepping 3 and higher offers an electrical noise-based random number generator (RNG) that produces good random numbers for different purposes.

[1] https://en.m.wikipedia.org/wiki/VIA_PadLock

[2] http://www.logix.cz/michal/doc/article.xp/padlock-en

Because PCs weren't, from the beginning of their history, designed for security scenarios.
Isn’t that what a Secure Enclave is for? I believe it has a TRNG, a true random number generator.
That true random number generator needs entropy sources: it's the client, not the server.
Will quantum computing solve this issue?
Quantum computers cost millions of dollars, and are worse random sources than a component that costs a penny. Simply put, no, quantum computers will not solve this.
I had an idea many years ago about using those cheap TV tuner cards that were briefly somewhat popular in PCs as an entropy source. If it could reliably tune to a dead channel, I imagine it would be a good source of quality entropy.
Many years ago I did indeed use this method, which has the advantage of a very high rate of random bits.

There is no need to tune to a dead channel. Instead of connecting a TV cable or antenna, it is enough to connect a coaxial resistive terminator to the TV input connector. Then all the channels are dead.

Unfortunately this is a dead technology, so I abandoned it after I retired my last motherboard with PCI connectors.

You can do something similar with the audio microphone input, but the TV tuners had a much higher bandwidth and a much higher sensitivity. For good results with the microphone input you need to make an analog noise generator instead of using just a resistor, which was enough for a TV tuner.

Using a radio receiver as a source of entropy opens you to a rather obvious attack channel.
If you replace the radio signal source at the radio receiver input with a shielded resistive terminator, you no longer have any attack channel (because now the radio receiver has only an output, but no input).

For this purpose, radio receivers are just very sensitive amplifiers. Any very sensitive amplifier will output noise when its input is terminated on a resistor, both amplified resistor noise and additional noise from sources internal to the amplifier.

When you use much less sensitive amplifiers than radio receivers, e.g. audio amplifiers, a resistor at the input may not be enough and you must put a source of greater noise there, e.g. an avalanche-breakdown diode (a.k.a. a Zener diode).

We should use the white noise or turbulence generated by our PC fans as entropy sources. Brownian motion is one of the best entropy sources we have.
They do have lots of sensors that can be used... We have bad defaults and a not-very-skillful usage of said hardware. On workstations there are lots of options. In datacenters, there could be some dedicated hardware for it somewhere, and a skillful use of that entropy imported into each rack unit.