The NVIDIA/Microsoft AI Supercomputer Cloud Is A No-Brainer
This week, NVIDIA announced a multiyear collaboration with Microsoft to build a cloud-based artificial intelligence supercomputer. With this partnership, Microsoft Azure will be the first public cloud to leverage NVIDIA’s full AI stack — chips, networking, and software. More specifically, the supercomputer will be powered by a combination of Microsoft Azure’s scalable ND- and NC-series virtual machines and NVIDIA technologies: A100 and H100 GPUs, Quantum-2 InfiniBand networking, and the AI Enterprise software suite. The collaboration will also incorporate Microsoft’s DeepSpeed library, which will use the H100 to double the rate of AI calculations by dropping from 16-bit to eight-bit floating-point precision. Once completed, the companies claim, it will be the most scalable supercomputer available, letting customers deploy thousands of GPUs in a single cluster to train massive language models, build complex recommender systems, and enable generative AI at scale.
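To see why the precision drop matters, consider a first-order, back-of-the-envelope sketch in Python: halving operand width roughly doubles how many operands the same memory bandwidth and tensor-core datapath can move per cycle. The peak-TFLOPS figure below is a placeholder assumption for illustration, not an NVIDIA specification.

```python
# Back-of-the-envelope sketch: why moving from FP16 to FP8 can
# roughly double math throughput on the same hardware. The peak
# figure below is an illustrative assumption, not an NVIDIA spec.

BITS_FP16 = 16
BITS_FP8 = 8

# Hypothetical peak tensor throughput at FP16, in TFLOPS (assumption).
peak_fp16_tflops = 1000

# Halving operand width lets the same datapath process twice as many
# operands per cycle, so the first-order estimate is a 2x speedup.
speedup = BITS_FP16 / BITS_FP8
peak_fp8_tflops = peak_fp16_tflops * speedup

print(f"Operand width: {BITS_FP16} -> {BITS_FP8} bits")
print(f"First-order throughput estimate: {peak_fp16_tflops} -> "
      f"{peak_fp8_tflops:.0f} TFLOPS ({speedup:.0f}x)")
```

In practice, the gain depends on keeping the reduced-precision math numerically stable, which is the role DeepSpeed and the H100 play in the announcement.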
Why Now?
The partnership is an unsurprising move for both companies. AI is a key growth pillar for Microsoft. The company’s vision is to bring “AI to every application, every business process, and every employee.” And it’s not the first time the company has built an AI supercomputer in Azure — the first was two years earlier in collaboration with OpenAI. With public cloud at mainstream adoption (87% of enterprises globally in 2022), positioning Azure as a key enabler for its AI tools and services is a logical move.
The major hyperscalers’ infrastructure services have reached parity in many respects. As such, the path to differentiation now runs through specialized services such as advanced compute capabilities for AI and ML, edge and hybrid computing offerings, and industry-specific solutions.
Microsoft’s strategy is to offer Azure customers cost-effective infrastructure for AI workloads. This dovetails nicely with Azure’s broader portfolio of services, which serves the large community of loyal Microsoft developers building the next generation of AI applications.
The Microsoft embrace of NVIDIA is an answer to Amazon Web Services’ (AWS) purpose-built AI/ML chips, Trainium and Inferentia, as well as a counter to Google’s Vertex AI, an integrated platform that constitutes a specialized AI cloud nested within Google Cloud Platform (GCP). Microsoft already had a strong card to play with Power BI, which is often the destination for models built on other clouds. Assuming that its rivals can’t easily replicate the NVIDIA deal, Microsoft can stake a claim to the entire AI/ML workflow.
The Microsoft deal is a notable win for NVIDIA, too. Its technology appears in almost every AI infrastructure solution and cloud service. Azure instances already feature a combination of NVIDIA’s A100 GPU and Quantum 200 Gb/s InfiniBand networking. GCP and AWS also use the A100, putting NVIDIA’s technology within reach of almost every US cloud customer. Of course, it isn’t just happenstance that NVIDIA is embedded in every major cloud provider. The groundwork was laid a decade ago, when the company decided to design and market its GPUs for cloud-based AI applications, right as the market for AI and cloud technologies was taking off.
What About Other Motivations?
Are there other motivations driving the timing of this partnership? Could Microsoft and NVIDIA be chasing their competitors? In April, Fujitsu announced that it would build the world’s fastest cloud-accessible supercomputer, leveraging the Fujitsu A64FX processor, which is known for its energy efficiency, and providing a Japan-native alternative to the US hyperscalers. In January, Meta announced a collaboration with NVIDIA to build an AI supercomputer that would host over 16,000 GPUs by summer 2022. Or there could be factors beyond competition at play: In September, the US government ordered NVIDIA to halt exports of its A100 and H100 chips to China.
What Does This Mean For You?
The obvious effects: AI becomes more accessible, adoption costs fall, innovation gets easier, and more organizations can build AI capabilities into their processes and products. In addition, access to supercomputing will open new avenues for breakthrough innovation and accelerate design and product development. For instance, product design that once demanded massive amounts of simulation and physical prototyping can be accelerated by replacing prototypes with rapid software calculations. Aircraft startup Boom Supersonic has already run 53 million compute hours on AWS and plans to use up to 100 million more. Microsoft is betting that its NVIDIA implementation will make Azure a cloud of choice for such workloads by combining raw compute power with seamless Power BI integration. Consequently, supercomputing will shift from expensive and exotic to just another cloud workload option that may be pricey but packs much more of a punch.
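For a sense of scale on the Boom Supersonic figure above, here is a rough sketch of how 53 million compute hours translates into wall-clock time on a large cloud cluster. The fleet size and core count are hypothetical assumptions chosen only to show the order of magnitude, and “compute hours” is assumed here to mean core-hours; neither figure comes from Boom or AWS.

```python
# Rough arithmetic: how long 53 million core-hours takes on a large
# (hypothetical) cloud cluster. The fleet size and vCPU count are
# illustrative assumptions, not figures from Boom Supersonic or AWS.

total_core_hours = 53_000_000

instances = 3_000          # assumed fleet size
vcpus_per_instance = 96    # e.g., a large compute-optimized VM (assumption)

cluster_cores = instances * vcpus_per_instance
wall_clock_hours = total_core_hours / cluster_cores

print(f"Cluster cores: {cluster_cores:,}")
print(f"Wall-clock time: {wall_clock_hours:,.0f} hours "
      f"(~{wall_clock_hours / 24:.1f} days)")
```

Under these assumptions, the whole campaign fits in roughly a week of cluster time, which is exactly the point: work that once required owning a supercomputer becomes a rentable, schedulable cloud job.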