When NVIDIA’s founder and CEO Jensen Huang waxed poetic about artificial intelligence in the past, it mostly felt like marketing bluster — the sort of lofty rhetoric we’ve come to expect from an executive with a never-ending supply of leather jackets. But this year, following the hype around OpenAI’s ChatGPT, Microsoft’s revamped Bing and a slew of other competitors, NVIDIA’s AI push finally seems to be leading somewhere.
The company’s GTC (GPU Technology Conference) has always been a platform to promote its hardware for the AI world. Now it’s practically a celebration of how well-positioned NVIDIA is to take advantage of this moment.
“We are at the iPhone moment for AI,” Huang said during his GTC keynote this morning. He was quick to point out NVIDIA’s role at the start of this AI wave: he personally brought a DGX AI supercomputer to OpenAI in 2016, hardware that was ultimately used to build ChatGPT. We’ve seen the DGX systems evolve over the years, but they’ve remained out of reach for many companies (the DGX A100 sold for $200,000 in 2020, which was half the price of its predecessor!). So what about everyone else?
That’s where NVIDIA’s new DGX Cloud comes in, an (obviously) online way to tap into the power of its AI supercomputers. Starting at a mere $36,999 a month for a single node, it’s meant to be a more flexible way for companies to scale up their AI needs. DGX Cloud can also work together with on-site DGX devices, since they’re all controlled with NVIDIA’s Base Command software.
NVIDIA says every DGX Cloud instance is powered by eight of its H100 or A100 GPUs with 80GB of VRAM each, bringing the total amount of memory to 640GB across the node. There’s high-performance storage, as you’d expect, as well as low-latency fabric that connects the systems together. That amount of power may make the cloud solution more tempting for existing DGX customers: why spend another $200,000 on a box, when you can do so much more for a lower monthly fee? DGX Cloud will be powered by Oracle’s Cloud Infrastructure to start, but NVIDIA says it will expand to Microsoft Azure next quarter, as well as Google Cloud and other providers “soon.”
So what are you supposed to do with all of those AI smarts? NVIDIA has also unveiled AI Foundations, an easier way for companies to develop their own Large Language Models (similar to ChatGPT) and generative AI. Large companies like Adobe, Getty Images and Shutterstock are already using it to build their own LLMs. It also ties directly into DGX Cloud with NeMo, a service specifically focused on language, as well as NVIDIA Picasso, an image, video and 3D service.
Alongside DGX Cloud, NVIDIA showed off four new inference platforms to tackle AI tasks, including NVIDIA L4, which offers “120x more AI-powered video performance than CPUs, combined with 99% better energy efficiency,” according to the company. L4 can also be used for work like video streaming, encoding and decoding, as well as generating AI video. There’s also NVIDIA L40, which is devoted to 2D and 3D image generation, as well as NVIDIA H100 NVL, an LLM solution with 94GB of memory and an accelerated Transformer Engine. (That helps deliver 12-times faster GPT-3 inference performance compared to the A100, according to NVIDIA.)
Finally, there’s NVIDIA Grace Hopper for Recommendation Models, an inference platform that does exactly what its name declares. Beyond recommendations, it can also power graph neural networks and vector databases.
If you’re curious about seeing NVIDIA L4 in action, it’ll be available to preview on Google Cloud G2 machines today. Google and NVIDIA have also announced that the generative AI video tool Descript, as well as the art app WOMBO, are both already using L4 over Google Cloud.