Intel has a huge problem with nearly every CPU it has made since the Pentium Pro in 1995. The story is changing and evolving rapidly, but one thing isn't changing: This is huge.
Curt Franklin covered the basics of the story yesterday here at Security Now. The problem stems from speculative execution, a technique Intel uses to speed up its CPUs: the processor guesses which instructions will be needed next and begins executing them ahead of time. The security checks on that speculative work come too late, so a program in userspace can learn the contents of the operating system's kernel memory, which should never happen. Things like passwords, application keys, and file caches are stored in the kernel. A good geeky recap of the problem (named "Meltdown") can be found here.
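To make that flow concrete, here is a toy simulation in Python. It is only a sketch: no real CPU, cache, or kernel memory is involved, and every name in it (ToyCPU, KERNEL_SECRET, KERNEL_BASE, leak) is hypothetical. The cache-probing step is taken from the public descriptions of Meltdown rather than from anything stated above, and real attacks measure memory access timing instead of inspecting a set.

```python
# Toy model only: no real CPU, cache, or kernel memory is involved.

KERNEL_SECRET = b"hunter2"   # hypothetical secret living in "kernel" memory
KERNEL_BASE = 0x1000         # hypothetical start of the protected range


class PermissionFault(Exception):
    """Raised when user code touches a protected address."""


class ToyCPU:
    def __init__(self):
        self.cache = set()   # which probe-array slots are currently "cached"

    def _read(self, addr):
        return KERNEL_SECRET[addr - KERNEL_BASE]

    def speculative_load(self, addr):
        # Step 1: the load runs ahead of the permission check, and its
        # result is used to touch one slot of the attacker's probe array.
        value = self._read(addr)
        self.cache.add(value)
        # Step 2: the check finally fires and the result is discarded,
        # but the cache footprint left by step 1 remains.
        if addr >= KERNEL_BASE:
            raise PermissionFault(hex(addr))
        return value

    def probe(self):
        # Stand-in for timing 256 array slots: a cached slot counts as "fast".
        hits = [slot for slot in range(256) if slot in self.cache]
        self.cache.clear()
        return hits


def leak(cpu, length):
    """Recover `length` bytes of the secret, one byte per faulting load."""
    recovered = bytearray()
    for offset in range(length):
        try:
            cpu.speculative_load(KERNEL_BASE + offset)
        except PermissionFault:
            pass             # the attacker expects and swallows the fault
        recovered.append(cpu.probe()[0])
    return bytes(recovered)
```

Calling leak(ToyCPU(), len(KERNEL_SECRET)) rebuilds the secret even though every individual load ends in a fault. The point of the model is that the fault arrives only after the side effect has already happened.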
Another fault similar to this called "Spectre" breaks the isolation between applications.
Intel has known about this situation since Jann Horn at Google's Project Zero discovered it, with confirmation from other academic researchers. A major effort to resolve it has been underway in secret since July, with Intel trying to embargo any mention of the problem. That embargo came apart on Wednesday, January 3, when the first proof-of-concept by @brainsmoke went public on Twitter.
As far as responses go, Linux was the first out of the gate in late December with a partial Meltdown fix called page table isolation. On Intel processors this separates kernel addresses from program addresses by keeping them in separate page tables. But Linux may need a major redesign to deal with the entire situation, Spectre included.
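The idea behind page table isolation can be sketched with a toy model in Python. The page numbers, labels, and function names below are all hypothetical and bear no relation to the actual Linux implementation. The sketch assumes, as the real mitigation does, that a small entry stub stays mapped so user code can still make system calls.

```python
# Toy model of page table isolation; names and addresses are hypothetical.

USER_PAGES = {0x0000: "app code", 0x0001: "app heap"}
KERNEL_PAGES = {0xF000: "kernel text", 0xF001: "kernel secrets"}
TRAMPOLINE = {0xFFFF: "syscall entry stub"}  # small kernel piece user mode still needs


def build_page_tables(isolated):
    """Return (user_mode_table, kernel_mode_table) for one process."""
    kernel_table = {**USER_PAGES, **KERNEL_PAGES, **TRAMPOLINE}
    if isolated:
        # With isolation, user mode runs on a table that has no entries
        # for kernel pages beyond the entry stub.
        user_table = {**USER_PAGES, **TRAMPOLINE}
    else:
        # Without it, one table serves both modes; kernel pages are
        # present and guarded only by a permission bit.
        user_table = kernel_table
    return user_table, kernel_table


def mapped_in_user_mode(page, isolated):
    """True if `page` has an entry in the table used while user code runs."""
    user_table, _ = build_page_tables(isolated)
    return page in user_table
```

Here mapped_in_user_mode(0xF001, isolated=False) is True, while the same call with isolated=True is False. A speculative load cannot leak a page that has no mapping at all, which is why the split closes off Meltdown.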
Apple has already deployed part of a fix ("Double Map") in macOS 10.13.2, released in December, a solution that is enough to stop the Meltdown vulnerability.
Microsoft has been testing a Meltdown fix (Windows 17035 Kernel ASLR/VA Isolation In Practice) in some recent Windows builds since November, and has issued an emergency patch to address the issue as best it can.
Intel has responded to all this uproar by saying, "Intel is committed to product and customer security and is working closely with many other technology companies, including AMD, ARM Holdings and several operating system vendors, to develop an industry-wide approach to resolve this issue promptly and constructively. Intel and other vendors had planned to disclose this issue next week when more software and firmware updates will be available."
It remains unclear whether ARM and AMD processors are also affected by Meltdown, but AMD says that "Due to differences in AMD's architecture [from Intel's], we believe there is a near zero risk to AMD processors at this time."
It does seem that Spectre affects ARM and AMD processors, however. ARM has told developers that a "small subset" of its CPUs may be vulnerable to some of the attacks, most likely Spectre. There is no patch for Spectre, though specific mitigations of particular exploits may be possible.
Mitigation for Meltdown on any brand of processor is going to cause a performance hit. CPU buffers such as the Translation Lookaside Buffer (TLB), whose contents used to persist across transitions between user and kernel code, now need to be flushed every time the kernel begins executing and every time user code resumes executing. Getting information from a cache like the TLB can be 40 times faster than fetching it from physical memory. For some workloads (depending on exactly what is being done), the effective loss of the TLB alone can cause a slowdown of between 5% and 30%.
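A back-of-the-envelope calculation shows why extra flushes hurt. The Python sketch below uses the 40-times figure quoted above as the relative cost of a TLB miss. The hit rates in the usage note are made-up illustrations, not measurements, and the function names are hypothetical.

```python
MISS_COST = 40.0  # relative cost of a miss, taken from the "40 times" figure above


def average_translation_cost(hit_rate, miss_cost=MISS_COST):
    """Average cost per address lookup, in units of one TLB hit."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * 1.0 + (1.0 - hit_rate) * miss_cost


def relative_increase(hit_rate_before, hit_rate_after):
    """How many times more expensive lookups become when the hit rate drops."""
    before = average_translation_cost(hit_rate_before)
    after = average_translation_cost(hit_rate_after)
    return after / before
```

With an assumed hit rate of 99%, the average lookup costs about 1.39 hit-times. If flushing pushed the hit rate down to an assumed 95%, it would cost about 2.95, roughly double. That figure covers address translation only. The overall slowdown of a program is smaller and depends on how often it crosses into the kernel, which is why the estimates vary so much by workload.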
Cloud service hypervisors are at risk from this. They are just programs themselves. Cloud services like Amazon EC2 and Google Compute Engine could be greatly impacted by something like a privilege escalation attack that uses this method, although Amazon and Microsoft's Azure have already announced they will soon conduct system maintenance involving rolling server reboots to allow patches to be applied to their hundreds of thousands of physical servers.
None of the host patching will do any good unless the VM that resides on the host is patched as well.
To bring this into perspective, Google has said that tests on virtual machines used in cloud computing environments extracted data from other customers using the same server. Their Project Zero blog gives more details.
This will be a difficult and drawn-out situation to respond to. It's not going to be just a simple patch-n-go. For some, new hardware is going to be required. And the full range of the problem may not yet have been revealed. Other hardware vulnerabilities like the Rowhammer attack on DRAM might yet be combined with what has now been revealed about Intel's processors to cause even more problems. This is going to require some very serious attention and rethinking of assumptions from the security community in the days ahead.
— Larry Loeb has written for many of the last century's major "dead tree" computer magazines, having been, among other things, a consulting editor for BYTE magazine and senior editor for the launch of WebWeek.