Orange Silicon Valley composed one of the fastest single-node GPU supercomputers in the world with Liqid Matrix… Again

Once again, the two companies have collaborated to provide evidence that Liqid-composed single-socket GPU systems are among the world’s fastest and most efficient for AI-driven computing.

Traditional supercomputing architecture silos GPU resources, creating inefficiency and datacenter sprawl. Liqid Matrix-based composable disaggregated infrastructure (CDI) software unlocks cloud-flexibility and agility for on-prem supercomputing deployments.

Liqid is excited today to discuss the details of our ongoing work with Orange Silicon Valley (OrangeSV) to deliver some of the industry’s fastest and most adaptive data center-scale performance for AI workloads!

These new results build on earlier work by OrangeSV and Liqid focused on pooling GPU resources via Liqid composable software (see OrangeSV’s Medium article on its work with Liqid, “One of the world’s first adaptive AI supercomputing blocks with a Liqid heart”).

That previous work demonstrated Liqid’s ability to compose a multi-GPU, single-node supercomputer using any off-the-shelf server and NVIDIA RTX 8000 GPUs. Last year, the composable system powered by Liqid Matrix achieved one of the fastest ImageNet deep learning performance benchmarks ever documented.

Building upon that achievement, OrangeSV and Liqid recently delivered new single-server results that continue to demonstrate the superior performance of composable systems for AI workloads.

Natural Language Processing: OrangeSV and Liqid take AI training to the next level with CDI

This time, OrangeSV obtained A100 GPUs from NVIDIA, along with a Liqid PCIe Gen 4.0 composable infrastructure stack and Liqid Matrix composable software.

Working with Liqid and Orange’s in-house Innovation Data & AI team, OrangeSV conducted natural language processing (NLP) training with a standardized English-to-German translation task. The OrangeSV team used Facebook AI’s fairseq sequence modeling toolkit to train a Transformer model on PyTorch. PyTorch, the open source framework for accelerated machine learning developed by the social media company, is considered state of the art for AI+ML architectures.

For this exercise, OrangeSV selected one Dell server and, using Liqid Matrix CDI software and composable fabric, assigned it 16x A100 GPUs (40GB each), housed 8x GPUs per JBOG (Just a Bunch of GPUs), along with 16TB of Liqid NVM Express storage as a deep learning cache, where the training data was stored.

Server specs: Two-socket AMD EPYC 7H12 (64 cores per CPU), 1TB memory
The Liqid Matrix™ composable disaggregated infrastructure software platform
All NVIDIA A100 GPUs had peer-to-peer enabled across Liqid’s PCIe Fabric.
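
As a rough illustration of the configuration above, the following Python sketch tallies the GPU resources composed onto the single server. The class and field names are hypothetical and are not part of Liqid Matrix or any Liqid API; only the figures come from the configuration described in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComposedNode:
    """Hypothetical description of a composed node; not a Liqid Matrix API."""
    gpus: int            # total GPUs assigned to the server
    gpus_per_jbog: int   # GPUs housed in each JBOG chassis
    gpu_memory_gb: int   # memory per GPU, in GB

    @property
    def jbog_count(self) -> int:
        # Ceiling division: a partially filled chassis still counts as one.
        return -(-self.gpus // self.gpus_per_jbog)

    @property
    def total_gpu_memory_gb(self) -> int:
        return self.gpus * self.gpu_memory_gb


# Figures taken from the configuration described above.
node = ComposedNode(gpus=16, gpus_per_jbog=8, gpu_memory_gb=40)
print(node.jbog_count)           # 2
print(node.total_gpu_memory_gb)  # 640
```

In other words, the single server saw two JBOG chassis and 640GB of aggregate GPU memory across the fabric.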

Transformer for PyTorch NLP test results:

Orange executed the standard WMT14_en_de benchmark to collect a standard baseline:

*GPU: NVIDIA A100 running the WMT14_en_de training throughput benchmark using Transformer over PyTorch; batch size = 10,240*

Using this composable configuration, OrangeSV achieved a training throughput of 935,343 tokens/sec and reached minimal validation loss in under 1 hour and 49 minutes on the WMT14 en-de Transformer translation task, a standardized English-to-German translation benchmark. The result is significant because it is the company's highest training speed so far using commercially available general-purpose GPUs.

Orange also ran full training to minimal validation loss (the objective function the neural net is trying to minimize). Training began with a batch size of 10,240 tokens and eventually reached a maximum batch size of 16,000. This made it possible to achieve the throughput of 935,343 tokens/sec and to reach minimal validation loss in under 1 hour and 49 minutes, well within two hours.
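
For a sense of scale, the reported figures can be turned into a few back-of-the-envelope numbers. This sketch assumes the reported throughput was shared evenly across all 16 GPUs and sustained for the whole run. The results do not state either of those things, so treat the outputs as rough upper bounds and not as measured values.

```python
TOKENS_PER_SEC = 935_343          # reported training throughput
GPU_COUNT = 16                    # A100 GPUs composed onto the server
RUN_SECONDS = (1 * 60 + 49) * 60  # 1 hour 49 minutes, in seconds

# Assumption: throughput is divided evenly across GPUs.
per_gpu_tokens_per_sec = TOKENS_PER_SEC / GPU_COUNT

# Assumption: the reported throughput is sustained for the entire run.
max_tokens_processed = TOKENS_PER_SEC * RUN_SECONDS

print(f"{per_gpu_tokens_per_sec:,.0f} tokens/sec per GPU")
print(f"{max_tokens_processed:,} tokens at most in {RUN_SECONDS:,} seconds")
```

Under those assumptions, each GPU contributes roughly 58,000 tokens/sec, and the run processes on the order of six billion tokens before reaching minimal validation loss.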

Flexible, fast, and highly efficient: The value of composability speaks for itself in NLP training

Based on the current results, OrangeSV concluded, and Liqid concurred, that the company has built on its previous work and can again declare that it has configured one of the fastest single-node deep learning supercomputers by leveraging composable architecture and commercial off-the-shelf general-purpose GPUs.

The ability to build in this kind of performance from the ground up is essential to the continued evolution of AI applications upstream. With adaptive data performance that is disaggregated and change-ready, IT organizations can prepare for and keep pace with the next wave of AI+ML innovation, regardless of bare-metal resource requirements. Compose servers and GPUs across fabrics for efficient, adaptive data performance that can handle whatever new workloads the data demands.

To learn more about OrangeSV’s testing models and see further results, go here. To find out why a composable disaggregated system from Liqid is the right infrastructure for your ongoing AI requirements, download this free white paper outlining the benefits CDI brings to organizations that must adapt in real time to a constantly evolving data ecosystem. If you would like to speak with a Liqid CDI expert, go here.

Posted on July 16, 2021 in Composable Supercomputing/HPC

