Broadcom and AMD Collaborate to Enhance AI Infrastructure

Broadcom's recent announcement that its next-generation PCIe switches will support AMD's XGMI/Infinity Fabric marks a significant development in AI infrastructure. The collaboration addresses a key scaling challenge in AI systems: it is expected to enable more than eight GPUs per node without a loss of efficiency. XGMI-connected NICs also open up a novel approach to communication within AI training clusters. As AMD expands its ecosystem of PCIe/XGMI-based solutions, the partnership with Broadcom is poised to strengthen AMD's competitiveness in the AI market, offering an edge in scalability, efficiency, and performance.

In the ServeTheHome article "Next-Gen Broadcom PCIe Switches to Support AMD Infinity Fabric XGMI to Counter NVIDIA NVLink", Patrick Kennedy writes, “One of the really neat capabilities of AMD Infinity Fabric/ XGMI controllers is that they can serve multiple functions. AMD’s I/O controllers can do things like handle package-to-package connectivity as Infinity Fabric, PCIe Gen5 for cards, and CXL.”

Scaling Beyond the 8-GPU Server

AMD's Instinct MI300X GPU launch illustrates the company's drive to push past the traditional limits of AI hardware. The launch not only introduced powerful GPUs and APUs but also emphasized the need to scale these technologies across server clusters for efficient AI training. AMD's endorsement of PCIe/XGMI as the scaling method is a crucial industry shift, and other AI processor developers are likely to adopt the approach as a standard.

Broadcom's Pivotal Role in PCIe Switch Development

Broadcom's next-generation PCIe switches supporting XGMI/Infinity Fabric are essential for AMD's Infinity Fabric to enable seamless GPU server scaling. Jas Tremblay, Vice President and General Manager of the Data Center Solutions Group at Broadcom, highlighted the significance of this development, which is expected to allow more than eight GPUs per node without sacrificing efficiency, a vital advancement for AMD in the competitive AI sector.

Future Possibilities with XGMI-Connected NICs

The potential of XGMI-connected NICs extends beyond PCIe switches. Envision NICs communicating over XGMI/Infinity Fabric, on the same coherent fabric as CPUs and GPUs. This could significantly streamline communication within AI training clusters, offering an efficient alternative for RDMA transfers between GPU and NIC over a PCIe/XGMI-based fabric.

Looking Ahead: AI's Next Frontier

These advancements represent a future-oriented vision for Broadcom. As the industry strives to scale GPU density and maintain efficiency, Broadcom's PCIe/XGMI-based solutions are set to play a pivotal role. The collaboration and upcoming release of next-gen Infinity Fabric PCIe switches could be transformative for the AI market.

Liqid’s Strategic Advantage in Composable Infrastructure

Liqid is exceptionally well placed to leverage the announcement of Broadcom's next-generation PCIe switches supporting AMD's XGMI/Infinity Fabric. This strategic collaboration is a significant leap in AI infrastructure, addressing the challenge of scaling AI technologies beyond conventional limits. Liqid's expertise in composable infrastructure, which allows for the dynamic allocation and scaling of resources, aligns closely with AMD's focus on PCIe/XGMI for AI server scaling and with Broadcom's role in supporting XGMI/Infinity Fabric in PCIe switches. As the industry anticipates the transformative impact of these developments, Liqid's composable infrastructure solutions position us at the forefront of PCIe/XGMI-based solutions, making us a key player in the evolving AI infrastructure landscape.

Written by Sumit Puri. Posted on January 29, 2024, in Artificial Intelligence.

