NVIDIA: Nvidia’s first Ampere GPU designed for data centers and AI
15 May 2020
New York, 15.5.2020
Jensen Huang, CEO of Nvidia, yesterday unveiled the A100, the first GPU based on the company’s next-generation Ampere architecture. It is designed for scientific computing, cloud graphics and data analytics, and is intended primarily for use in data centers.
Huang made no secret of the fact that the product responds to changes in the marketplace, with the pandemic causing a huge increase in demand for cloud computing. “This momentum is really quite good for our data center business … I expect Ampere to do remarkably well,” Huang said. “It is our best data center GPU ever made and it benefits from almost a decade of our data center experience.”
The A100 has more than 54 billion transistors, making it the world’s largest 7nm processor. “This is basically almost at the theoretical limit of what is possible in semiconductor manufacturing today,” Huang explains. “The largest chip the world has ever made, and the largest number of transistors in a calculating machine the world has ever made.”
Nvidia is enhancing its Tensor cores to make them easier for developers to use, and the A100 will also offer 19.5 teraflops of FP32 performance, 6,912 CUDA cores, 40 GB of memory and 1.6 TB/s of memory bandwidth. However, all this performance is not meant to power the latest version of Assassin’s Creed.
Instead, Nvidia is combining these GPUs into a stacked AI system that will power its supercomputers in data centers around the world. Similar to how Nvidia used its earlier Volta architecture to develop the Tesla V100 and DGX systems, a new DGX A100 AI system combines eight of these A100 GPUs into a single massive GPU.
The DGX A100 system promises 5 petaflops of performance from its eight A100s, which are linked by Nvidia’s third-generation NVLink. Together, the eight GPUs provide 320 GB of GPU memory with 12.4 TB/s of memory bandwidth. Nvidia also includes 15 TB of internal Gen4 NVMe storage for AI training. Researchers and scientists using DGX A100 systems will even be able to split workloads across up to 56 instances, distributing smaller tasks across the powerful GPUs.
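The aggregate figures can be checked with simple per-GPU arithmetic. The following is a rough sketch, not an official Nvidia calculation: it multiplies the per-GPU numbers quoted in this article by eight. The constant names are illustrative, and the even split of instances across GPUs is an assumption inferred from dividing 56 by 8.

```python
# Back-of-the-envelope check of the DGX A100 aggregate figures quoted in the article.
GPUS_PER_SYSTEM = 8           # from the article
MEMORY_PER_GPU_GB = 40        # from the article
BANDWIDTH_PER_GPU_TBS = 1.6   # from the article (a rounded figure)
MAX_INSTANCES = 56            # from the article

total_memory_gb = GPUS_PER_SYSTEM * MEMORY_PER_GPU_GB
total_bandwidth_tbs = GPUS_PER_SYSTEM * BANDWIDTH_PER_GPU_TBS
# Assumption: the instances are divided evenly across the GPUs.
instances_per_gpu = MAX_INSTANCES // GPUS_PER_SYSTEM

print(f"Total GPU memory: {total_memory_gb} GB")
print(f"Memory bandwidth from rounded per-GPU figure: {total_bandwidth_tbs:.1f} TB/s")
print(f"Instances per GPU: {instances_per_gpu}")
```

The memory total matches the quoted 320 GB. The bandwidth computed from the rounded 1.6 TB/s figure comes out slightly above the quoted 12.4 TB/s, which suggests the per-GPU number has been rounded up.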
Nvidia’s recent $6.9 billion acquisition of server networking supplier Mellanox also comes into play, as the DGX A100 includes nine 200 Gb/s network interfaces for a total bi-directional bandwidth of 3.6 Tb/s. As modern data centers adapt to increasingly diverse workloads, Mellanox technology will become increasingly important to Nvidia. Huang describes Mellanox as the most important “connective tissue” in the next generation of data centers.
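The bandwidth total can be reproduced in a few lines. This sketch assumes that each interface is full duplex and that the quoted total counts traffic in both directions; the variable names are illustrative.

```python
# Rough check of the quoted DGX A100 network bandwidth.
PORTS = 9               # from the article
PORT_SPEED_GBPS = 200   # from the article, per interface, per direction (assumed)

one_way_tbps = PORTS * PORT_SPEED_GBPS / 1000   # convert Gb/s to Tb/s
bidirectional_tbps = 2 * one_way_tbps           # assumption: both directions are counted

print(f"One direction: {one_way_tbps:.1f} Tb/s")
print(f"Bi-directional: {bidirectional_tbps:.1f} Tb/s")
```

Under these assumptions the result agrees with the 3.6 Tb/s figure Nvidia quotes.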
“When you look at how modern data centers are built, the workloads that can be handled are more diverse than ever,” Huang explains. “Our approach for the future is not to focus on the server itself, but to look at the entire data center as one computing unit. I believe that in the future, the world will think about data centers as one computing unit, and we will think about computing on a data center scale. No longer just personal computers or servers, but we will be working on a data center scale.”
Deliveries of Nvidia’s DGX A100 systems have already begun, and among the first applications is COVID-19 research at the US Argonne National Laboratory.
“We are using the most powerful supercomputers in America in the fight against COVID-19 and are performing AI modeling and simulation on the latest available technology such as the Nvidia DGX A100,” said Rick Stevens, deputy laboratory director for computing, environment and life sciences at Argonne. “The computing power of the new DGX A100 systems coming to Argonne will help researchers explore treatments and vaccines and study the spread of the virus, allowing scientists to do years of AI-accelerated work in months or days.”
Nvidia says Microsoft, Amazon, Google, Dell, Alibaba and many other major cloud service providers are also planning to integrate individual A100 GPUs into their own offerings. “The adoption and enthusiasm for Ampere from all hyperscalers and computer manufacturers around the world is truly unprecedented,” said Huang. “This is the fastest introduction of a new data center architecture we’ve ever had, and that’s understandable.”