We’re Building the Future of Data Infrastructure

Archive for the 'Ethernet Switching' Category

  • October 24, 2024

    Cloud-Managed Enterprise (CME) Switches Powered by SONiC

    By Gidi Navon, Senior Principal Architect, Marvell

    Open Networking is not a new concept. SONiC (Software for Open Networking in the Cloud) has been around for some time, and cloud-managed campus switches based on a proprietary NOS (Network Operating System) are also not new. What’s truly novel is the comprehensive solution that finally brings open-source SONiC to campus networks and adds a layer of cloud management and zero trust provisioning, all running on a cost-optimized hardware platform specifically tailored to campus networks: “Cloud Managed Enterprise” or CME.

    In recent years, Open Networking has offered hyperscale operators the option to use open-source software on a variety of merchant silicon, providing freedom from the lock-in imposed by the big system vendors. The next challenge was to bring similar benefits to campus networks, particularly to what is now referred to as CME.

    This blog will demonstrate how the Marvell® Prestera® switches, equipped with a comprehensive Software Development Kit, along with the collaborative efforts of a vibrant industry community called OpenLAN Switching (OLS), have created this open cloud-managed solution for campus networks.

    The power of a community

    The first thing to recognize is that it took teamwork from multiple companies to create this open solution. Under the umbrella of the Telecom Infra Project (TIP), various companies gathered and created a working group called OpenLAN. Within OpenLAN, two subgroups formed: OpenWi-Fi and OpenLAN Switching (OLS), which is the subgroup relevant to this discussion.

    Numerous companies actively participated in the collaborative effort. Figure 1 shows just some of them, organized according to their role in the solution.

    Figure 1: Breakdown of companies involved in OpenLAN Switching

  • June 11, 2024

    How AI Will Drive Cloud Switch Innovation

    This article is part five in a series on talks delivered at Accelerated Infrastructure for the AI Era, a one-day symposium held by Marvell in April 2024. 

    AI has fundamentally changed the network switching landscape. AI requirements are driving foundational shifts in the industry roadmap, expanding the use cases for cloud switching semiconductors and creating opportunities to redefine the terrain.

    Here’s how AI will drive cloud switching innovation.

    A changing network requires a change in scale

    In a modern cloud data center, the compute servers are connected to each other and to the internet through a network of high-bandwidth switches. The approach is like that of the internet itself, allowing operators to build a network of any size while mixing and matching products from various vendors to create a network architecture specific to their needs.

    Such a high-bandwidth switching network is critical for AI applications, and a higher-performing network can lead to a more profitable deployment.

    However, expanding and extending the general-purpose cloud network to AI isn’t quite as simple as just adding more building blocks. In the world of general-purpose computing, one or more workloads can fit on a single server CPU. In contrast, AI’s large datasets don’t fit on a single processor, whether it’s a CPU, GPU or other accelerated compute device (XPU), making it necessary to distribute the workload across multiple processors. These accelerated processors must function as a single computing element.

    AI calls for enhanced cloud switch architecture

    AI requires accelerated infrastructure to split workloads across many processors.

  • June 06, 2024

    Silicon Photonics Comes of Age

    This article is part four in a series on talks delivered at Accelerated Infrastructure for the AI Era, a one-day symposium held by Marvell in April 2024. 

    Silicon photonics—the technology of manufacturing the hundreds of components required for optical communications with CMOS processes—has been employed to produce coherent optical modules for metro and long-distance communications for years. The increasing bandwidth demands brought on by AI are now opening the door for silicon photonics to come inside data centers to enhance their economics and capabilities.  

    What’s inside an optical module?

    As the previous posts in this series noted, critical semiconductors like digital signal processors (DSPs), transimpedance amplifiers (TIAs) and drivers for producing optical modules have steadily improved in terms of performance and efficiency with each new generation of chips thanks to Moore’s Law and other factors.

    The same is not true for optics. Modulators, multiplexers, lenses, waveguides and other devices for managing light impulses have historically been delivered as discrete components.

    “Optics pretty much uses piece parts,” said Loi Nguyen, executive vice president and general manager of cloud optics at Marvell. “It is very hard to scale.”

    Lasers have been particularly challenging, with module developers forced to choose among a wide variety of technologies. Electro-absorption modulated lasers (EMLs) are currently the only commercially viable option capable of meeting the 200 Gbps speed necessary to support AI models. Often used for longer links, the EML is the laser of choice for 1.6T optical modules. Not only is fab capacity for EMLs constrained, but they are also incredibly expensive. Together, these factors make it difficult to scale at the rate needed for AI.

  • June 02, 2024

    A Deep Dive into the Copper and Optical Interconnects Weaving AI Clusters Together

    This article is part three in a series on talks delivered at Accelerated Infrastructure for the AI Era, a one-day symposium held by Marvell in April 2024.

    Twenty-five years ago, network bandwidth ran at 100 Mbps, and it was aspirational to think about moving to 1 Gbps over optical. Today, links are running at 1 Tbps over optical, or 10,000 times faster than the 100 Mbps links of that era.

    Another interesting fact: “Every single large language model today runs on compute clusters that are enabled by Marvell’s connectivity silicon,” said Achyut Shah, senior vice president and general manager of Connectivity at Marvell.

    To keep ahead of what customers need, Marvell continually seeks to boost capacity, speed, and performance of the digital signal processors (DSPs), transimpedance amplifiers or TIAs, drivers, firmware and other components inside interconnects. It’s an interdisciplinary endeavor involving expertise in high frequency analog, mixed signal, digital, firmware, software and other technologies. The following is a map to the different components and challenges shaping the future of interconnects and how that future will shape AI.

    Inside the Data Center

    From a high level, optical interconnects perform the task their name implies: they deliver data from one place to another while keeping errors from creeping in during transmission. Another important task, however, is enabling data center operators to scale quickly and reliably.

    “When our customers deploy networks, they don’t start deploying hundreds or thousands at a time,” said Shah. “They have these massive data center clusters—tens of thousands, hundreds of thousands and millions of (computing) units—that all need to work and come up at the exact same time. These are at multiple locations, across different data centers. The DSP helps ensure that they don’t have to fine tune every link by hand.”

    Optical Interconnect Module

  • May 23, 2024

    Scaling AI Means Scaling Interconnects

    This article is part two in a series on talks delivered at Accelerated Infrastructure for the AI Era, a one-day symposium held by Marvell in April 2024.

    Interconnects have played a key role in enabling technology since the dawn of computing. During World War II, Alan Turing used the Bombe, an electromechanical machine he designed, to help break the Nazis’ Enigma code. This machine, fast for its time, relied on massively parallel operation and numerous interconnects. Eighty years later, interconnects play a similar role for AI, providing a foundation for massively parallel problems. However, the growth of AI brings unique networking challenges, and Marvell is poised to meet the needs of this ever-growing market.

    What’s driving interconnect growth?

    Before 2023, the interconnect world was a different place. Interconnect speeds were driven by the pace of cloud data center server upgrades: upgrades occurred every four years, so interconnect speeds doubled every four years as well. In 2023, generative AI took the interconnect wheel, and demand for AI is now driving speeds to double every two years. And, while copper remains a viable technology for chip-to-chip and other short-reach connections, optical is the dominant medium for AI.
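
    To make the two cadences concrete, here is a minimal Python sketch of the compounding involved. The four-year and two-year doubling periods come from the paragraph above; the function name `speed_multiplier` and the eight-year horizon are illustrative assumptions, not figures from the talk.

```python
def speed_multiplier(years: float, doubling_period_years: float) -> float:
    """Growth factor after `years` if speed doubles every `doubling_period_years`."""
    return 2 ** (years / doubling_period_years)


# Doubling periods are taken from the text; the 8-year horizon is an
# arbitrary choice made only for illustration.
HORIZON_YEARS = 8
cloud_era = speed_multiplier(HORIZON_YEARS, 4)  # doubling every four years
ai_era = speed_multiplier(HORIZON_YEARS, 2)     # doubling every two years

print(f"Over {HORIZON_YEARS} years: cloud cadence x{cloud_era:.0f}, "
      f"AI cadence x{ai_era:.0f}")
```

    Under these assumptions, halving the doubling period does not merely double the end result: over the same eight years the multiplier is squared, going from 4x to 16x.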

    “Optical is the only technology that can give you the bandwidth and reach needed to connect hundreds and thousands and tens of thousands of servers across the whole data center,” said Dr. Loi Nguyen, executive vice president and general manager of Cloud Optics at Marvell. “No other technology can do the job—except optical.”

    AI doubles interconnect speed in half the time
