Machine Learning Transforms Cloud Providers into Custom Chip Developers
I remember my first exposure to cloud computing quite clearly. I attended an enterprise architecture meetup where someone from Amazon was talking about a new service from the company called Amazon Web Services (AWS) that offered what he referred to as “Infrastructure as a Service” (this was so early in AWS’s life that the term cloud computing had not even been invented).
AWS streamlined access to industry-standard X86 computing resources. Instead of the interminable provisioning timelines typical of on-premises environments, AWS delivered virtual machines in less than ten minutes. To someone used to waiting weeks or months for computing resources, this seemed like sorcery.
It was immediately clear to me that Infrastructure as a Service would revolutionize the technology industry. And so it has proved. Users flocked to AWS and its peers because they could design and run applications faster and cheaper than ever before.
Demand for cloud computing is so large today that the providers are building data centers at a furious pace. According to a Data Center Knowledge article, the big three of cloud (AWS, Microsoft, and Google, aka AMG) are spending about $30 billion per year on new infrastructure.
Each of them has moved far beyond industry-standard X86 servers. They employ in-house staff to create new hardware designs tuned to their environments. They work with Intel to develop new chips better suited for their use cases. One can think of them as implementing custom X86 computing environments designed to enable massive scale while operating at the lowest possible cost point.
The rise of machine learning (ML) changes this formula. While X86-based computing is great for common application workloads, it’s not nearly as well-suited for ML workload execution. This is because parallel processing, which is a hallmark of ML execution, is a poor match for the X86 single thread processing approach.
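To make the parallelism point concrete, here is a minimal Python sketch of the arithmetic at the core of many ML models, a layer of weighted sums. It is an illustration only, not code from any provider: the function names and sample numbers are hypothetical, and the thread pool merely stands in for the many arithmetic units on a GPU or custom chip. The point is that each output value depends only on its own row of weights, so all of them can be computed at the same time.

```python
from concurrent.futures import ThreadPoolExecutor


def dot(row, vec):
    # One output value: a weighted sum of the inputs.
    return sum(w * x for w, x in zip(row, vec))


def layer_sequential(weights, inputs):
    # Single-thread style: compute the outputs one after another.
    return [dot(row, inputs) for row in weights]


def layer_parallel(weights, inputs, workers=4):
    # Each row is independent of the others, so the rows can be handed
    # to separate workers. The pool here only demonstrates that
    # independence; it is not meant to show a real speedup.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda row: dot(row, inputs), weights))


# Hypothetical sample data: four outputs computed from three inputs.
weights = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [1, 0, 1]]
inputs = [1, 0, 2]

print(layer_sequential(weights, inputs))
print(layer_parallel(weights, inputs))
```

Both calls produce the same list. Hardware built for this pattern runs thousands of such weighted sums side by side instead of one after another.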
GPU boards are often used for ML, as the parallel processing of graphics chips is a better match for these workloads. All of the big three currently offer, or will soon offer, virtual machines with GPU boards attached to allow customers to execute ML workloads.
However, while GPU parallel processing is better than X86 chips for machine learning, these graphics-oriented chips are not ideally suited to it. Stated another way, GPUs perform ML more efficiently and less expensively than X86 environments, but ML-focused chip designs offer additional capabilities. This has caused Google and Microsoft to move beyond system design and into the realm of chip design. Simply put, the demands of ML require specialized processors, and there isn’t an Intel equivalent for ML chips, so the cloud providers have stepped forward to implement them.
Interestingly, Google and Microsoft have diverged in their approach to ML processors, which reflects their overall approach to ML.
Google created an open source ML framework called TensorFlow and is putting all of its ML efforts behind it. Because it focuses on a single framework, it can implement the TensorFlow code directly in silicon, meaning that it has implemented the framework in an ASIC it refers to as a tensor processing unit (TPU). TPUs offer impressive performance gains compared to the CPU and GPU alternatives. To learn more on this topic, read the Google blog that describes their TPU initiative. Google publicly announced its TPU in 2016 and followed up with an updated version a few months ago.
In contrast to Google, Microsoft is not wedded to a single ML framework. Its employees use several open source frameworks (including TensorFlow), depending on the task at hand. This militates against an ASIC approach, so the company recently announced an FPGA-based service called Brainwave.
Microsoft’s approach offers flexibility to its researchers and engineers. This gives them performance well beyond GPUs, but without restricting them to a single ML framework. Brainwave is currently only available to Microsoft employees, but the company plans on making it available to external customers in the future.
The quiet one in this flurry of hardware announcements? AWS. As noted, it does offer GPU capability, but to date has made no announcement regarding custom hardware.
However, after a bit of a slow start, AWS is going headlong after AI, and the company is unlikely to be willing to take a back seat to the other members of AMG. I expect the upcoming re:Invent conference to bring one or more ML hardware announcements from the company.
One might question the ML hardware investments these companies are making. Sure, the new hardware will run ML workloads better than X86 CPUs or even GPUs, but why go to the work and expense of designing and building entire new chip architectures?
In a phrase: customer demand. The use of machine learning is exploding, with the technology applied across an enormous range of use cases, from clothing recommendation to railway maintenance to medical diagnosis.
Machine learning is now poised where cloud computing was a decade ago and will be accelerated by the same phenomenon: easy availability at a low price. Its growth is likely to be more rapid, though.
With cloud computing, we’ve spent a decade debating whether on-premises or public clouds are better, with adoption lagging due to the ongoing controversy.
Machine learning will not undergo the same kind of bickering. ML’s natural home is a public offering, because it improves with scale and data, which are both more available from a public provider than in a single user location.
It will be interesting to see whether, in a few years’ time, AMG double down with even more investment as they add ML facilities to their data center arms race. For sure, though, we’re only at the beginning of their custom hardware efforts.