The cloud runs the world. Given that it’s such a huge source of profit and the fundamental infrastructure of the future, you might assume that Amazon, Google, IBM, and all the cloud companies raking in billions intimately know every piece of hardware, software, and code in their data centers. You’d be wrong.
For hackers, the data center is the target’s brain—one of the most important points of control and one of the highest-value targets. American and Chinese intelligence agencies, two of the most advanced cyber powers in the world, have targeted and breached data centers as part of their most ambitious espionage operations.
Google today is announcing OpenTitan, a new open-source chip design based on the lessons the company has learned from its first layer of defense in its 19 data centers on five continents. It is the open-source version of the two-year-old Titan chip used in those data centers.
Titan aims to cryptographically prove that the machines operating in Google’s $8 billion cloud business can be trusted, haven’t had vulnerabilities added, and aren’t surreptitiously under an adversary’s control.
The security guarantee the chip confers is “super critical when you’re running the planet,” says Royal Hansen, the vice president of engineering at Google Cloud.
The open-sourcing of the Titan chip is an effort by Google and its partners to expand transparency and trust at the lowest levels of the machines running in data centers. It is meant to allow anyone to inspect, understand, and fully trust the company’s machines from the first moment the power comes on. The nonprofit engineering company lowRISC, based in Cambridge, UK, will manage the project.
As one more line of defense against espionage in locations like chip fabrication plants, OpenTitan also boasts a self-test to check for tampering in the memory every time the chip boots.
Imagine a cloud computing data center as a pyramid or, if you’re the paranoid type, a house of cards the size of Mount Everest. The lowest level is the silicon hardware that runs the first code when the power turns on. The machine boots, and each component runs a shockingly large amount of code in those very first moments. This code has historically been difficult to know much about, never mind fully understand or effectively secure. Before security software is ever operational, code like the firmware is already active and controlling the boot process. That makes firmware and other low-level targets extraordinarily tempting for hackers.
“Google wants to ensure from the moment you press the power button that they can verify exactly the sequence of everything that happens before the first instruction gets executed,” says Kaveh Razavi, a security researcher at Vrije Universiteit Amsterdam.
OpenTitan will kill the entire boot process if the code generated by the firmware doesn’t match the code expected by the chip.
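To make the idea concrete, here is a minimal sketch of that kind of check, not OpenTitan's actual implementation. It assumes the chip holds a known-good SHA-256 digest of the firmware and refuses to continue if the measured digest differs. The names `measure`, `boot`, and `BootHalted` are hypothetical, chosen only for illustration.

```python
import hashlib
import hmac


class BootHalted(Exception):
    """Raised when the firmware measurement does not match (hypothetical name)."""


def measure(image: bytes) -> bytes:
    # Assumption: the "measurement" is a plain SHA-256 digest of the image.
    return hashlib.sha256(image).digest()


def boot(image: bytes, expected_digest: bytes) -> str:
    # Compare in constant time so the check does not leak how many
    # leading bytes of the digest matched.
    if not hmac.compare_digest(measure(image), expected_digest):
        raise BootHalted("firmware measurement mismatch, boot stopped")
    return "boot continues"


if __name__ == "__main__":
    trusted = b"trusted firmware image"
    expected = measure(trusted)  # stored ahead of time in this sketch

    print(boot(trusted, expected))
    try:
        boot(b"tampered firmware image", expected)
    except BootHalted as err:
        print(err)
```

A real root of trust would typically rely on signed images and keys protected in hardware rather than a bare digest comparison; the sketch only shows the match-or-halt decision described above.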
“If you can’t trust the thing the machine boots on, it’s game over,” says Gavin Ferris, a board member at lowRISC. “It doesn’t matter what the operating system does—if by the time the operating system boots you’re already compromised, then it’s all academic. You’re already done.”
The risk of supply chain attacks to spy on hardware at a low level is very real, and such an attack is shockingly affordable.
“Think about the millions of servers in our data centers: we have baseboard management controllers, network interface controllers, all kinds of chips on these motherboards,” says Hansen. He says the security needs to begin with the silicon hardware: “It can’t be in the software, because you’re already past that by the time it’s begun to boot and load.”
Take the case of MINIX, an operating system quietly embedded by Intel on the CPUs of over a billion machines before anyone realized what was going on. Faced with a previously entirely unknown and complex operating system, Google began work to remove the proprietary, “exploit friendly,” and highly privileged code from its platforms.
For Google’s engineers, MINIX was an entire attack surface they didn’t know about, understand, or defend. MINIX is one of the catalysts that pushed Google to fabricate its own hardware and led to OpenTitan.
The most advanced hacking groups will aim to achieve “persistence” so they can not only gain access to but remain present on a breached machine even across reboots. OpenTitan is not a silver bullet, but it does make it more difficult for an attacker to gain persistence without setting off alarms.
“Ideally for an attacker, once they’ve compromised a node, they would like to stay on that machine even though software gets updated or other components get updated. They would still like to observe what’s going on,” Razavi says.
“These are adversaries, not just interested in compromising infrastructure,” he says. “If at any given point they would like to compromise something specifically, they will already have a foothold in the infrastructure. Most of these things are, as you can imagine, quite covert.”
But it’s not just those advanced threats cloud providers need to worry about. The nightmare scenario is that once a sophisticated actor finds a particularly bad vulnerability, anyone can use it. Earlier this year, IBM had to deal with just that when researchers found they could exploit a low-level firmware vulnerability and then persist and spy on that machine across customers and reboots.
“Every level has a potential for the injection of bad code,” Hansen says. “The only way to detect that is to check from the very beginning that the code is the code you thought you’d get.”
Correction: The article has been updated to clarify that OpenTitan is an open source chip design managed by multiple organizations, an effort based on the lessons learned from the original Titan chip so that anyone can use the technology. The article also now clarifies that MINIX was not built by Intel.