Intel Gaudi 3 And Xeon 6 Processors Arrive To Supercharge AI

Tech companies have unilaterally decided that generative AI is the next big thing, and vendors at every level of the supply chain are pouring billions of dollars into developing AI hardware and software. That obviously includes Intel, whose presentation today at its Intel Vision 2024 event was heavily focused on AI and the next-generation hardware platforms and software solutions that it will be offering to run it.

[Image: Intel Vision enterprise AI]

We’re typically hardware-focused around these parts—it’s in the name, after all—so we’re not going to go over the software in any great detail, but the general idea behind Intel’s software approach is that its customers can buy whatever Intel hardware suits their needs and the intermediate layers will take care of everything else. But what AI hardware does Intel offer? Primarily, Gaudi AI accelerators and Xeon processors with AI-focused instruction set extensions.

Intel Announces Powerful, New Gaudi 3 AI Accelerator

[Image: Gaudi advancements]

Intel has new hardware coming to both of those categories soon, and it announced both at its Vision event. Gaudi 3 is the successor to the Gaudi 2 AI accelerator, and Intel says that the upcoming 5nm chip will offer double the FP8 performance, quadruple the BF16 performance, double the network bandwidth, and 50% more memory bandwidth compared to Gaudi 2.

[Image: Gaudi specs]

In terms of overall specifications, this part is optimized both for handling very large models and for scaling out to massive clusters. Its architectural features and large 96MB SRAM cache give it an advantage in working with very large AI models, and Gaudi 3’s twenty-four on-package 200-Gigabit Ethernet connections allow it to scale, theoretically, all the way to 1,024 nodes in a single cluster. That would be some 8,192 Gaudi 3 chips, with theoretical FP8 compute throughput somewhere in the 15-exaflop range, according to Intel.
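
As a rough sanity check on those scale-out figures, the arithmetic can be sketched in a few lines of Python. Only the numbers quoted above come from Intel; the constant and function names are our own, and the per-chip throughput is simply what the 15-exaflop cluster figure implies, not a published specification.

```python
# Figures quoted in the article (Intel's claims).
NODES = 1_024                 # maximum nodes in a single cluster
CHIPS = 8_192                 # Gaudi 3 chips in that cluster
CLUSTER_FP8_FLOPS = 15e18     # "15-exaflop range" (approximate)
LINKS_PER_CHIP = 24           # on-package Ethernet connections
LINK_GBPS = 200               # gigabits per second per connection


def chips_per_node(chips: int, nodes: int) -> int:
    """Accelerators per node implied by the cluster totals."""
    return chips // nodes


def implied_fp8_flops_per_chip(cluster_flops: float, chips: int) -> float:
    """Per-chip FP8 throughput implied by dividing the cluster figure evenly."""
    return cluster_flops / chips


def raw_ethernet_gbps_per_chip(links: int, gbps_per_link: int) -> int:
    """Raw sum of all Ethernet link rates on one chip, in Gb/s."""
    return links * gbps_per_link


if __name__ == "__main__":
    print(chips_per_node(CHIPS, NODES))                       # 8
    per_chip = implied_fp8_flops_per_chip(CLUSTER_FP8_FLOPS, CHIPS)
    print(f"{per_chip / 1e15:.2f} PFLOPS")                    # 1.83 PFLOPS
    print(raw_ethernet_gbps_per_chip(LINKS_PER_CHIP, LINK_GBPS))  # 4800
```

Note that the 4,800 Gb/s result is just the raw sum of all twenty-four links; how those links are divided between in-node and scale-out traffic is not something this sketch tries to model.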

Intel Gaudi 3 Performance Expectations

[Image: Gaudi training]

Intel compares Gaudi 3 directly against NVIDIA’s Hopper H100 GPU in training a variety of models. In all of the example cases, Gaudi 3 comes out with a considerable performance advantage. Of course, Intel isn’t going to show us numbers that make Gaudi 3 look anything less than fantastic, right?

[Image: Gaudi inference]

Well, in inferencing, Gaudi 3 looks less dominant, although performance is certainly competitive and impressive. It’s really the big Falcon-180B model where Intel’s new part can stretch its legs and run away from NVIDIA’s GPU, with up to a 4x improvement in performance—although we have to stress that these numbers are all projections.

[Image: Gaudi efficiency]

A big point in favor of Gaudi 3 that Intel has been emphasizing to the press is its relative efficiency. Although Gaudi 3 is built on a 5nm process while NVIDIA’s H100 is fabricated on TSMC’s 4N, Intel claims that Gaudi 3 can turn in up to 2.3x the power efficiency of NVIDIA’s part. A significant part of that efficiency gain is no doubt down to the extremely large 96MB on-chip cache.

[Image: Gaudi infographic]

Gaudi 3 is coming to mezzanine (OAM) and PCIe add-in card formats as well as a universal baseboard form factor intended for use in big clusters. Intel says that developers may be able to migrate their models to run on Gaudi in as little as three lines of code; Stability AI says it took “less than a day” to move over to Intel’s hardware. These parts will be sampling to customers very soon, while mass production happens in the back half of the year.
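
For a sense of what such a small migration could look like, here is a minimal PyTorch sketch. It assumes Intel's Gaudi PyTorch bridge (the habana_frameworks package) and its lazy-mode pattern of targeting the "hpu" device and calling mark_step(); whether these are the exact three lines Intel has in mind is our assumption, and the helper names and toy model are hypothetical. The sketch falls back to the CPU when the bridge is not installed.

```python
import importlib.util


def gaudi_bridge_installed() -> bool:
    """True if the habana_frameworks package can be found on this system."""
    return importlib.util.find_spec("habana_frameworks") is not None


def choose_device(gaudi_available: bool) -> str:
    """Pick the PyTorch device string: "hpu" for Gaudi, otherwise "cpu"."""
    return "hpu" if gaudi_available else "cpu"


def train_one_step() -> float:
    """Run one toy training step and return the loss (requires PyTorch)."""
    import torch

    use_hpu = gaudi_bridge_installed()
    if use_hpu:
        # Migration line 1: load the Gaudi bridge, which registers "hpu".
        import habana_frameworks.torch.core as htcore

    # Migration line 2: target the Gaudi device instead of "cuda".
    device = torch.device(choose_device(use_hpu))

    model = torch.nn.Linear(4, 1).to(device)  # hypothetical toy model
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    inputs = torch.randn(8, 4, device=device)
    targets = torch.randn(8, 1, device=device)

    loss = torch.nn.functional.mse_loss(model(inputs), targets)
    optimizer.zero_grad()
    loss.backward()
    if use_hpu:
        # Migration line 3: in lazy mode, trigger execution of the graph.
        htcore.mark_step()
    optimizer.step()
    if use_hpu:
        htcore.mark_step()
    return loss.item()


if __name__ == "__main__":
    print(f"device: {choose_device(gaudi_bridge_installed())}")
    print(f"loss: {train_one_step():.4f}")
```

Real models will of course need more care than this, particularly around custom CUDA kernels and data loading, which a device swap alone does not address.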

Intel Xeon 6 Product Family Announced

[Image: Xeon P-core and E-core]

But what about the new Xeons we mentioned? Well, it turns out that Intel is wrapping up its Sierra Forest Xeons, which sport up to 288 Crestmont E-cores, into the same product family as its upcoming Granite Rapids CPUs. Those parts are more traditional Xeon CPUs, with lots of big and fast P-cores. The entirety of both processor families is collectively being branded “Xeon 6” by Intel.

[Image: Xeon efficiency]

Intel shared few details about either processor family at the moment, although the chipmaker did offer the slide above promising a yearly power reduction of over one megawatt if a customer replaces old 2nd-gen Xeons with brand spanking new Sierra Forest parts. We reckon there are a few more details on Sierra Forest because those chips are expected to launch any day now, while Granite Rapids will come “shortly after,” according to Intel.
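
To put a megawatt in perspective, the quick calculation below converts a sustained power reduction into energy over a year. It assumes the one-megawatt figure describes a continuous reduction in power draw, which is our reading of the slide rather than anything Intel spelled out, and the electricity price is a purely hypothetical placeholder.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def annual_energy_mwh(power_reduction_mw: float) -> float:
    """Energy saved in a year (MWh) by a sustained power reduction (MW)."""
    return power_reduction_mw * HOURS_PER_YEAR


def annual_cost_usd(power_reduction_mw: float, usd_per_kwh: float) -> float:
    """Cost of that energy at a given price per kWh."""
    kwh = annual_energy_mwh(power_reduction_mw) * 1_000  # 1 MWh = 1,000 kWh
    return kwh * usd_per_kwh


if __name__ == "__main__":
    print(annual_energy_mwh(1.0))  # 8760.0
    # $0.10 per kWh is a hypothetical price, not a figure from Intel.
    print(f"${annual_cost_usd(1.0, 0.10):,.0f}")  # $876,000
```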