Meta debuts new generation of AI chip

April 11, 2024 - 2:32 PM
The logo of Meta Platforms' business group is seen in Brussels, Belgium December 6, 2022. (Reuters/Yves Herman/File Photo)

Meta Platforms META.O unveiled details on Wednesday about the next generation of the company’s in-house artificial intelligence accelerator chip.

Why it is important

Reuters reported earlier this year that Meta planned to deploy a new version of a custom data center chip to address the swelling amount of computing power necessary to run AI products in Facebook, Instagram and WhatsApp. The chip, referred to internally as “Artemis,” will help Meta reduce its reliance on Nvidia’s NVDA.O AI chips and reduce its energy costs overall.

Key quote

“This chip’s architecture is fundamentally focused on providing the right balance of compute, memory bandwidth, and memory capacity for serving ranking and recommendation models,” the company wrote in a blog post.

Context

The new Meta Training and Inference Accelerator (MTIA) chip is part of a broad custom silicon effort at the company that includes looking at other hardware systems too. Beyond building the chips and hardware, Meta has made significant investments in developing the software necessary to harness the power of its infrastructure in the most efficient way.

The company is also spending billions on buying Nvidia and other AI chips: This year CEO Mark Zuckerberg said the company planned to acquire roughly 350,000 flagship H100 chips from Nvidia. Combined with other suppliers, Meta plans to accumulate the equivalent of 600,000 H100 chips this year, he said.

The numbers

Taiwan Semiconductor Manufacturing Co 2330.TW will produce the new chip on its “5nm” process. Meta said the new chip is capable of three times the performance of its first-generation processor.

What is next

The chip has been deployed in Meta’s data centers and is serving AI applications. The company said it has several programs underway “aimed at expanding the scope of MTIA, including support of (generative AI) workloads.”

—Reporting by Max A. Cherney in San Francisco; Editing by Chris Reese