As more business and artificial intelligence workloads move to the cloud, it's no surprise that the demand for computing resources remains unabated. Today's data center must provide a combination of nearly infinite capacity and low-latency processing. These requirements drive technology vendors such as ARM, Intel, and NVIDIA to develop new chip designs and software platforms to support high-performance computing.
There's a big pot of gold for the vendors who get it right. Research by Statistics forecasts the global data center chip market will grow to $15.64 billion by 2025, more than double the size recorded in 2017. It was clear from NVIDIA's GTC event that the company plans on getting an outsized share of this enormous growth opportunity. NVIDIA, a company known to many for its graphics processing units (GPUs) in gaming, also provides computing technology for data centers.
GPUs are not just for AI training anymore
NVIDIA is focused heavily on artificial intelligence, which represents the most computationally intensive workloads in the data center. Most companies think of NVIDIA's GPUs as a go-to computing resource for training AI models using large datasets, and the company has achieved great success in that market. Companies like Walmart swear by NVIDIA's GPUs.
However, AI computing extends beyond training. Broadly speaking, machine learning has several stages: preparing the data, training the model, and deploying the model into production for inference. The inference stage is where trained models are used to infer an outcome or result. While training is the sexy high-performance area of AI, inference is where companies harvest the fruits of training a model. Today, much of the inferencing work gets processed on Intel CPUs.
GPUs are expensive and have not been considered the right price-for-performance fit in the inferencing space: useful for the compute-intensive training in machine learning, but overkill for inference. At its recent GTC conference, NVIDIA aimed to change the dialogue by showcasing how the company can accelerate the entire machine learning pipeline. At the same time, companies are also clamoring for more performance at the higher end of the data processing spectrum.
To address these needs, NVIDIA introduced the A100, its eighth-generation GPU design and the first based on its Ampere architecture. The A100 is a multi-instance GPU designed for data center HPC and inference. According to NVIDIA, it delivers up to 20x speed improvements over Volta, with more than 54 billion transistors and third-generation Tensor Cores.
What's interesting is that the chip was designed with the flexibility to support both training and inference from a single chip. The A100 can efficiently scale to thousands of GPUs or, with NVIDIA Multi-Instance GPU (MIG) technology, be partitioned into as many as seven GPU instances to accelerate workloads of all sizes. It was clear to everyone that NVIDIA GPUs were the darling for training large models. However, the artificial intelligence market is broader than just training.
Inference workloads are a growth area in AI, and it's an area where companies such as Intel have excelled. To be successful in the long run, NVIDIA needed a solution that addresses the inference part of artificial intelligence.
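For readers who want a concrete picture of what partitioning one GPU into several instances means, here is a minimal toy sketch in Python. It is not NVIDIA's MIG API: the `ToyGpu` class, the `carve` method, the workload names, and the slice sizes are all hypothetical, and the only figure taken from the article is the limit of seven instances. Real MIG supports only specific instance profiles and placement rules, which this sketch ignores.

```python
from dataclasses import dataclass, field

TOTAL_SLICES = 7  # the article cites up to seven instances per A100


@dataclass
class ToyGpu:
    """Hypothetical model of one partitionable GPU; not an NVIDIA API."""
    free_slices: int = TOTAL_SLICES
    instances: list = field(default_factory=list)

    def carve(self, name: str, slices: int) -> bool:
        """Reserve `slices` for workload `name`; return False if they do not fit."""
        if slices < 1 or slices > self.free_slices:
            return False
        self.free_slices -= slices
        self.instances.append((name, slices))
        return True


gpu = ToyGpu()
gpu.carve("training-job", 4)           # one larger instance
for i in range(3):
    gpu.carve(f"inference-{i}", 1)     # three small instances
print(gpu.instances, "free:", gpu.free_slices)
print(gpu.carve("one-more", 1))        # False: all seven slices are in use
```

The point of the sketch is the trade-off the article describes: the same card can be handed to one large job or split among several small inference jobs, rather than leaving a full GPU underused.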
Change the data center, change the world
NVIDIA also spoke about the data center as the new unit of computing. It's bringing this concept to life with the EGX A100, which combines an Ampere-based A100 GPU with a Mellanox ConnectX-6 Dx NIC on one card. Putting the networking and the A100 GPU on the same card eliminates added steps and latency.
According to NVIDIA, its third-generation Tensor Cores accelerate diverse workloads and offer up to 6x higher out-of-the-box performance. Meanwhile, structural sparsity support delivers up to 2x more performance on top of the A100's other inference performance gains. What's the key takeaway for those who don't speak chip? NVIDIA is offering high performance that can be scaled up and down to support various types of computing. Additionally, Mellanox networking improves the ability to move data around in a smarter fashion.
The vision of treating the data center as the new computing unit is a good one. The concept frees developers to design systems that leverage low latency and radically scalable processing power. Increasingly, companies are looking to do AI and machine learning in the cloud, which means data center technology must scale at a rapid pace. Software tools to support this are critical, which is why NVIDIA announced tools such as Merlin. NVIDIA is creating a platform with tools that simplify complicated machine learning pipelines such as recommender systems and offer a framework for conversational AI.
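To make the structural sparsity claim a little more tangible, here is a rough Python sketch of the idea, assuming the 2:4 pattern generally described for Ampere, in which two of every four weights are zero. The function names `prune_2_of_4` and `sparse_dot` and the sample numbers are made up for illustration; this is plain Python, not how Tensor Cores or any NVIDIA library actually implement it.

```python
def prune_2_of_4(weights):
    """Zero the two smallest-magnitude values in every group of four weights."""
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    pruned = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Positions of the two largest-magnitude entries in this group.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        pruned.extend(w if i in keep else 0.0 for i, w in enumerate(group))
    return pruned


def sparse_dot(pruned, activations):
    """Dot product that skips zero weights and counts the multiplies it performs."""
    total, multiplies = 0.0, 0
    for w, a in zip(pruned, activations):
        if w != 0.0:
            total += w * a
            multiplies += 1
    return total, multiplies


weights = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.8, 0.01]  # illustrative values
pruned = prune_2_of_4(weights)
print(pruned)  # [0.9, 0.0, 0.0, -0.7, 0.0, 0.3, -0.8, 0.0]
_, multiplies = sparse_dot(pruned, [1.0] * len(pruned))
print(multiplies, "multiplies instead of", len(weights))
```

Half of the multiplications disappear, which is where the "up to 2x" figure plausibly comes from. Whether accuracy holds up after pruning depends on the model, and the sketch does not address that.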
Is it curtains for other chip vendors? Not just yet.
NVIDIA's latest announcements secure the company at least a short-term lead in the performance sector. While the NVIDIA solution is less expensive for inference than it has been in the past, it's still not exactly cheap. Granted, there's a large market of data-intensive industries that care about breakthrough performance, so it's a win for those customers. Can NVIDIA walk away with the lion's share of the data center chip business? I'm sure Intel will have something to say about that.
We're still in the early days of building the AI infrastructure of the future, but I expect the latest offering to be attractive to many data center providers. NVIDIA's timing is perfect. The world needs all the advancements it can get to support high-performance computing. The Ampere-based A100 and the acquisition of Mellanox are essential milestones in NVIDIA's progression to becoming a next-generation data center powerhouse. NVIDIA continues expanding its platform, which is a wise move, as platform plays are what win in the long run.