AMD has started issuing some patches for its processors affected by a serious silicon-level bug dubbed Zenbleed that can be exploited by rogue users and malware to steal passwords, cryptographic keys, and other secrets from software running on a vulnerable system.
Zenbleed affects Ryzen and Epyc Zen 2 chips, and can be abused to swipe information at a rate of at least 30Kb per core per second. That’s practical enough for someone on a shared server, such as a cloud-hosted box, to spy on other tenants. Exploiting Zenbleed involves abusing speculative execution, though unlike the related Spectre family of design flaws, the bug is pretty easy to exploit. It is more on a par with Meltdown.
The vulnerability was highlighted today by Google infosec guru Tavis Ormandy, who discovered the data-leaking vulnerability while fuzzing hardware for flaws, and reported it to AMD in May. Ormandy, who acknowledged some of his colleagues for their help in investigating the security hole, said AMD intends to address the flaw with microcode upgrades, and urged users to “please update” their vulnerable machines as soon as they are able to.
Proof-of-concept exploit code, produced by Ormandy, is available here, and we’ve confirmed it works on a Zen 2 Epyc server system when running on the bare metal. While the exploit runs, it shows off the sensitive data being processed by the box, which can appear in fragments or in whole depending on the code running at the time.
If you stick any emulation layer in between, such as Qemu, then the exploit understandably fails.
The bug affects all AMD Zen 2 processors including the following series: Ryzen 3000; Ryzen Pro 3000; Ryzen Threadripper 3000; Ryzen 4000 Pro; Ryzen 4000, 5000, and 7020 with Radeon Graphics; and Epyc Rome datacenter processors.
AMD today issued a security advisory here, using the identifiers AMD-SB-7008 and CVE-2023-20593 to track the vulnerability. The chip giant scored the flaw as a medium severity one, describing it as a “cross-process information leak.”
A microcode patch for Epyc 7002 processors is available now. As for the rest of its affected silicon: AMD is targeting December 2023 for updates for desktop systems (eg, Ryzen 3000 and Ryzen 4000 with Radeon); October for high-end desktops (eg, Threadripper 3000); November and December for workstations (eg, Threadripper Pro 3000); and November to December for mobile (laptop-grade) Ryzens. Shared systems are the priority, it would seem, which makes sense given the nature of the design blunder.
Ormandy noted at least some microcode updates from AMD are making their way into the Linux kernel. OpenBSD has some details here. Our advice is to keep an eye out for AMD’s Zenbleed microcode updates, and for any security updates for your operating system, and apply them as necessary when available. There’s no word yet on whether there will be a performance hit from installing these but we can imagine it’ll mostly depend on your workloads.
There is a workaround in the meantime, which Ormandy set out in his write-up of the bug (archived copy as his site was being pummeled with traffic earlier). This involves setting a control bit that disables some functionality in the processor, which in turn prevents exploitation. We imagine this dials back some of the speculative execution required to exploit Zenbleed, and this may cause some kind of performance hit.
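For illustration, here is roughly what "setting a control bit" amounts to. This is a sketch in Python under assumptions not stated above: as we read Ormandy's write-up, the bit in question is bit 9 of the DE_CFG model-specific register at address 0xC0011029, and the helper names are ours. The read function assumes a Linux box with the msr kernel module loaded and root privileges. Check AMD's advisory before changing anything on a real machine.

```python
import os
import struct

# Assumed from Ormandy's write-up; verify against AMD's advisory before use.
DE_CFG_MSR = 0xC0011029  # DE_CFG model-specific register
CHICKEN_BIT = 9          # control bit that enables the workaround


def with_workaround(value: int) -> int:
    """Return the register value with the workaround bit set."""
    return value | (1 << CHICKEN_BIT)


def workaround_enabled(value: int) -> bool:
    """Report whether the workaround bit is set in a register value."""
    return bool((value >> CHICKEN_BIT) & 1)


def read_de_cfg(cpu: int = 0) -> int:
    """Read DE_CFG on one logical CPU via Linux's /dev/cpu/N/msr (needs root)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        # The MSR device returns the 64-bit register at offset == MSR address.
        raw = os.pread(fd, 8, DE_CFG_MSR)
    finally:
        os.close(fd)
    return struct.unpack("<Q", raw)[0]
```

The sketch only reads and computes values. Writing the new value back is a privileged operation that has to be repeated on every core and does not survive a reboot, which is one reason a microcode update is the preferred fix.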
How does the bug work?
For the full technical details, see the above write-up. We’ll summarize it here, though some understanding of how CPU cores work at the machine-code level will help.
As a modern x86 processor family, AMD’s Zen 2 chips offer vector registers, a bunch of wide registers for performing operations on several pieces of data at once. These vector registers are used by applications and operating systems to do all kinds of things, such as doing math operations and processing strings. As such these registers have all sorts of data flying through them, including passwords and keys.
There is an instruction called vzeroupper [AMD PDF, page 860] that zeroes some of these vector registers, and it’s used in OS and application library routines that are invoked hundreds or thousands of times a second by all processor cores in a box. For example, the strlen() function uses vzeroupper, and that’s called quite a lot.
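To make "zeroes some of these vector registers" concrete: architecturally, vzeroupper clears the upper half (bits 128 and above) of each 256-bit YMM register and leaves the lower 128 bits alone. Here is a minimal Python model that treats each register as a plain integer. The function name is ours, and the model describes only the visible result, not how the chip gets there.

```python
# Mask that keeps only bits 0..127 of a 256-bit register value.
LOW_128_MASK = (1 << 128) - 1


def vzeroupper(ymm_regs):
    """Return new register values with bits 128 and above cleared."""
    return [reg & LOW_128_MASK for reg in ymm_regs]


if __name__ == "__main__":
    # Upper half holds 0xAAAA, lower half holds 0x1234.
    before = [(0xAAAA << 128) | 0x1234]
    after = vzeroupper(before)
    print(hex(before[0]), "->", hex(after[0]))
```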
When AMD’s chips execute vzeroupper, they simply mark the affected registers as zero by setting a special bit, and then allow those registers to be used for other operations. If vzeroupper is speculatively executed – the processor anticipates it will need to run that instruction – it sets this zero bit and frees the registers in the register file for reuse. This can happen if the vzeroupper instruction lies right after a branch instruction; if the processor thinks the branch is unlikely to be taken, it will start the vzeroupper speculatively. As we saw with Spectre and Meltdown, CPUs do this kind of thing to gain big performance boosts.
If the processor core realizes soon after, actually, it shouldn’t have speculatively executed the vzeroupper instruction, it tries to rewind that decision and undo the zeroing by clearing the bit that indicates the registers are zero. Unfortunately, by that point, the registers are probably in use by some other code, and are no longer marked as zero, so their contents from the previous operation are now accessible to that other code.
This is why the flaw is being compared to a use-after-free()-style vulnerability.
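The sequence described above can be sketched as a toy model in Python. This is our own simplification for illustration, not a description of how the silicon is wired: the class names, the four-slot register file, and the single "zero" flag per thread are all assumptions. It shows only the ordering problem: a slot is freed on speculation, handed to someone else, and then becomes readable again when the speculation is rewound.

```python
SECRET = 0xC0FFEE  # stand-in for a key or password held by the victim


class RegisterFile:
    """A shared pool of physical register slots."""

    def __init__(self, size):
        self.slots = [0] * size
        self.free = list(range(size))

    def allocate(self, value):
        idx = self.free.pop(0)       # hand out the first free slot
        self.slots[idx] = value
        return idx

    def release(self, idx):
        self.free.insert(0, idx)     # slot is reusable; contents are not wiped


class Thread:
    """One user of the register file, with a single vector register."""

    def __init__(self, regfile):
        self.regfile = regfile
        self.slot = None
        self.zero = False

    def write(self, value):
        self.slot = self.regfile.allocate(value)
        self.zero = False

    def read(self):
        return 0 if self.zero else self.regfile.slots[self.slot]

    def speculative_vzeroupper(self):
        self.zero = True                   # mark the register as zero...
        self.regfile.release(self.slot)    # ...and free its slot for reuse

    def rollback(self):
        self.zero = False                  # the flaw: the slot is no longer ours


def demo():
    regfile = RegisterFile(4)
    attacker, victim = Thread(regfile), Thread(regfile)
    attacker.write(0x1111)
    attacker.speculative_vzeroupper()  # mispredicted path runs ahead
    victim.write(SECRET)               # victim is handed the freed slot
    attacker.rollback()                # misprediction detected, zero bit cleared
    return attacker.read()             # attacker now reads the victim's data


if __name__ == "__main__":
    print(hex(demo()))
```

In this model the leak disappears if rollback() also has to win the slot back before clearing the flag, which is the kind of bookkeeping a use-after-free bug skips.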
With threads being scheduled all over the processor core complex, and with some clever exploit code, it is possible to cause vzeroupper to be incorrectly speculatively executed and then rewound, and to leak data by observing the content of those vector registers. It relies on the speculative execution of vzeroupper and the fact that registers are stored in a large register file and reassigned to operations as needed.
As Ormandy noted, “bits and bytes are flowing into these vector registers from all over your system constantly.” He continued:
His takeaway: “It turns out that memory management is hard, even in silicon.”
We’ve asked AMD for further comment. ®