Redefining Data Center Interconnections: The NVLink Spine and Its Capabilities

Editorial by: Qëndrim Demiraj

Technical Team Lead, QUAD A Development

In May 2025, NVIDIA CEO Jensen Huang made a bold claim during his Computex keynote: a single spine of the company’s newly announced NVLink Fusion technology can “move more traffic than the entire Internet.” The declaration, though sensational, was grounded in real technical achievement. At 130 terabytes per second (TB/s) of sustained throughput, or 1,040 terabits per second (Tb/s) since one byte equals eight bits, a single NVLink Fusion spine does indeed exceed the commonly cited 900 Tb/s peak bandwidth of the global Internet. Some sources estimate that real-world Internet traffic can peak above 1,200 Tb/s, which would narrowly edge out a single spine, but the comparison remains stunning and underscores the upgrade NVLink Fusion represents for AI and high-performance computing (HPC) infrastructure.
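
To see why the comparison holds, it helps to express both figures in the same unit. A quick back-of-envelope check in Python, using only the numbers above:

```python
# Compare the NVLink spine's throughput with the commonly cited
# Internet peak, converting bytes to bits so the units match.
SPINE_TB_PER_S = 130               # NVLink spine throughput, terabytes/s
INTERNET_PEAK_TBITS_PER_S = 900    # rough global Internet peak, terabits/s

spine_tbits_per_s = SPINE_TB_PER_S * 8   # 1 byte = 8 bits

print(f"Spine: {spine_tbits_per_s:,} Tb/s vs Internet: {INTERNET_PEAK_TBITS_PER_S:,} Tb/s")
print(f"Ratio: {spine_tbits_per_s / INTERNET_PEAK_TBITS_PER_S:.2f}x")
# -> Spine: 1,040 Tb/s vs Internet: 900 Tb/s (ratio 1.16x)
```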

The NVLink Fusion spine is not merely a metaphor. It is a physical structure comprising thousands of ultra-dense coaxial links and multiple NVLink switch chips. These spines form the backbone of a fully connected mesh network, enabling low-latency, high-throughput communication across all accelerators in a rack. When connected via a spine, the GPUs in a rack can communicate at a combined rate of up to 3,600 TB/s, making NVLink Fusion ideal for workloads like AI training and scientific simulations that rely on frequent inter-device data exchange.
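
To appreciate what the switched spine replaces, consider that a point-to-point full mesh would need a dedicated cable for every GPU pair, a count that grows quadratically with rack size. A minimal sketch (the smaller rack sizes are illustrative):

```python
from math import comb

# Direct links needed for a point-to-point full mesh, i.e. one
# dedicated connection per GPU pair: n * (n - 1) / 2.
for gpus in (8, 36, 72):
    print(f"{gpus:>2} GPUs -> {comb(gpus, 2):>4} pairwise links")
# 8 -> 28, 36 -> 630, 72 -> 2556
```

Wiring 2,556 dedicated links per rack would be impractical, which is why the spine concentrates that connectivity in switch chips instead.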

NVIDIA’s performance claims regarding NVLink Fusion are not marketing hyperbole. According to documentation and benchmarking details released during Computex 2025, the NVLink Fusion switch chip supports 144 NVLink ports and can handle up to 14.4 TB/s of non-blocking bandwidth. When scaled across a full 72-GPU rack, this configuration supports coherent memory access across the entire system, eliminating traditional bottlenecks introduced by PCIe and Ethernet fabrics.
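
The two switch figures are internally consistent: dividing the chip's capacity by its port count recovers the per-port NVLink rate. A quick check, again using only the numbers above:

```python
SWITCH_PORTS = 144            # NVLink ports per switch chip
SWITCH_CAPACITY_TB_S = 14.4   # non-blocking bandwidth, TB/s

per_port_gb_s = SWITCH_CAPACITY_TB_S * 1000 / SWITCH_PORTS
print(f"~{per_port_gb_s:.0f} GB/s per NVLink port")  # ~100 GB/s
```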

By comparison, PCIe 5.0, which is still common in many enterprise systems, tops out at around 128 GB/s of bidirectional bandwidth for an x16 slot. NVLink Fusion delivers up to 14 times more bandwidth per GPU while enabling direct GPU-to-GPU communication without CPU involvement. Such performance is essential for the parallelism demanded by modern AI training pipelines and large-scale language model computation.
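
The 14x figure follows from per-GPU arithmetic, assuming the commonly cited 1.8 TB/s per-GPU bandwidth of fifth-generation NVLink (an assumption; the per-GPU figure is not quoted above):

```python
PCIE5_X16_GB_S = 128         # PCIe 5.0 x16, bidirectional, per the article
NVLINK_PER_GPU_GB_S = 1800   # assumed NVLink 5 per-GPU bandwidth (1.8 TB/s)

print(f"~{NVLINK_PER_GPU_GB_S / PCIE5_X16_GB_S:.0f}x a PCIe 5.0 x16 slot")  # ~14x
```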

One of the most groundbreaking aspects of NVLink Fusion is its openness to third-party processors. In contrast to past NVLink versions, which were tightly bound to NVIDIA’s proprietary ecosystem, NVLink Fusion is being offered through chiplet licensing. This allows companies such as Qualcomm, Fujitsu, MediaTek, and Marvell to integrate NVLink chiplets into their own CPUs or ASICs. These partners can create semi-custom processors that directly interface with NVLink Fusion fabrics, enabling a level of integration previously reserved for NVIDIA’s own hardware.

The capabilities of NVLink Fusion have major implications for hyperscale cloud providers such as AWS, Google Cloud, and Microsoft Azure. By enabling faster communication between CPU and GPU instances within a data center, NVLink Fusion can increase the performance of AI-as-a-Service offerings. More tightly coupled racks mean faster training times, reduced inference latency, and better scaling for models that now span trillions of parameters.

For HPC environments, NVLink Fusion enables supercomputer-like mesh interconnects that reduce synchronization overhead and boost computing efficiency. For AI researchers and developers, the high bandwidth and low latency of NVLink Fusion translate directly into faster time-to-results, whether training a large transformer model or conducting high-fidelity simulations.
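
To make “synchronization overhead” concrete, here is a deliberately simplified, bandwidth-only estimate of a single ring all-reduce step. The model size, GPU count, and link bandwidths are illustrative assumptions, and real systems hide much of this cost by overlapping communication with compute:

```python
# Bandwidth-only time estimate for one gradient all-reduce.
# Hypothetical workload: FP16 gradients of a 1-trillion-parameter model
# synchronized across 72 data-parallel GPUs; latency and overlap ignored.
PARAMS = 1.0e12
BYTES_PER_PARAM = 2   # FP16
N_GPUS = 72

grad_bytes = PARAMS * BYTES_PER_PARAM
# A ring all-reduce moves ~2 * (N - 1) / N of the data over each GPU's links.
traffic_per_gpu = 2 * (N_GPUS - 1) / N_GPUS * grad_bytes

for name, bw_bytes_per_s in [
    ("NVLink, 1.8 TB/s per GPU (assumed)", 1.8e12),
    ("PCIe 5.0 x16, 128 GB/s", 128e9),
]:
    print(f"{name}: ~{traffic_per_gpu / bw_bytes_per_s:.1f} s per step")
# NVLink: ~2.2 s vs PCIe: ~30.8 s, a gap that compounds over thousands of steps.
```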

Importantly, NVLink Fusion’s cross-vendor support enables the emergence of sovereign AI stacks. Governments, research institutions, and regulated industries can now construct AI platforms using custom components while still benefiting from NVIDIA’s interconnect technology. This delivers performance at scale while maintaining data locality and control.

While Huang’s statement comparing NVLink Fusion’s spine to the global Internet is attention-grabbing, it is important to contextualize the claim. The Internet backbone’s traffic is distributed globally and shaped by dynamic usage patterns, whereas NVLink Fusion’s bandwidth operates within a localized, deterministic environment. Nonetheless, the raw speed is real, and the architecture is optimized for specific high-performance tasks that traditional networks cannot efficiently support.

Some skepticism remains about the extent to which peak throughput can be sustained across all workloads. Like any technology, NVLink Fusion will be limited by application-specific bottlenecks, thermal constraints, and surrounding infrastructure such as memory and power delivery. However, early results suggest that it delivers a transformative leap in effective data-center bandwidth for AI and HPC applications.

NVLink Fusion, with its spine-based interconnect architecture, signals a major step forward in computing infrastructure. Offering up to 130 TB/s per spine, support for heterogeneous computing environments, and the ability to scale efficiently across racks of accelerators, it redefines how modern data centers will be designed. Whether training trillion-parameter AI models or orchestrating scientific simulations, NVLink Fusion positions NVIDIA and its partners at the center of the most bandwidth-intensive computing challenges of the decade.

As AI workloads continue to grow in complexity and size, technologies like NVLink Fusion will become essential. The ability to efficiently move data, and not just compute it, may prove to be the defining factor in the race for AI supremacy. With NVLink Fusion, NVIDIA is making a compelling argument that the interconnect is just as important as the processor in the era of accelerated computing.
