Use DDC to build AI networks? This might just be a beautiful illusion
ChatGPT, AIGC, large models... a parade of dazzling terms has emerged, and the commercial value of AI has drawn intense attention across society. As training models grow in scale, the data center networks that carry AI computing power have also become a hot topic. Improving compute efficiency, building high-performance networks... major vendors are each showing off their skills, racing to open a new "F1 track" for AI networks within the Ethernet industry. In this AI arms race, DDC made a high-profile entrance and overnight seemed to become synonymous with revolutionary technology for building high-performance AI networks. But is it really as beautiful as it looks? Let us analyze it in detail and judge calmly.
Born in 2019, the essence of DDC is replacing the chassis router with box routers
With the rapid growth of DCN (data center network) traffic, the need to upgrade DCI (data center interconnect) networks became increasingly urgent. But the expansion capacity of a DCI chassis router is limited by the size of its chassis, and the equipment draws a great deal of power: expanding a chassis places heavy demands on cabinet power and cooling, and the retrofit cost is high. Against this background, AT&T submitted box-router specifications based on commercial chips to OCP in 2019 and proposed the concept of DDC (Disaggregated Distributed Chassis). Put simply, DDC replaces the hardware units of a modular chassis, such as service line cards and fabric boards, with a cluster of low-power box devices interconnected by cables, with the whole cluster managed by a centralized or distributed NOS (network operating system), in order to break through the performance and power consumption bottlenecks of a single DCI chassis.
The advantages claimed by DDC include:
Breaking through the expansion limits of chassis equipment: capacity is expanded through multi-device clusters, without being constrained by chassis size;
Reducing single-point power consumption: multiple low-power box devices are deployed in a distributed fashion, dispersing the concentrated power draw and lowering cabinet power and cooling requirements;
Improving bandwidth utilization: instead of the hash-based forwarding of a traditional Ethernet network, DDC uses cell switching with cell-level load balancing, which helps improve bandwidth utilization;
Mitigating packet loss: the devices' large caches absorb the high convergence ratios of DCI scenarios. VOQ (Virtual Output Queue) technology first sorts packets arriving from the network into different virtual output queues, and a credit-based handshake then confirms that the receiving end has enough buffer space before those packets are sent, reducing packet loss caused by egress congestion (the sketch below illustrates the mechanism).
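To make the VOQ-plus-credit idea concrete, here is a minimal Python sketch of the mechanism as just described. The class names, the credit-granting policy, and the single-ingress topology are illustrative inventions, not taken from any real DDC NOS; in practice this logic runs in switch silicon.

```python
# Minimal sketch of VOQ plus credit-based flow control (illustrative only).
from collections import deque

class Egress:
    """Receiving side: grants credits only while buffer space remains."""
    def __init__(self, buffer_cells: int):
        self.free = buffer_cells

    def grant(self, wanted: int) -> int:
        granted = min(wanted, self.free)
        self.free -= granted          # reserve buffer before any data is sent
        return granted

    def drain(self, n: int):
        self.free += n                # buffer freed as cells are forwarded on

class Ingress:
    """Sending side: one virtual output queue (VOQ) per egress port."""
    def __init__(self, egresses):
        self.voqs = {port: deque() for port in egresses}
        self.egresses = egresses

    def enqueue(self, port, cell):
        self.voqs[port].append(cell)  # buffer at ingress, keyed by destination

    def schedule(self):
        # Transmit only cells for which the egress has reserved buffer, so an
        # oversubscribed egress port cannot cause loss inside the fabric.
        for port, voq in self.voqs.items():
            credits = self.egresses[port].grant(len(voq))
            for _ in range(credits):
                yield port, voq.popleft()

egress = {"p0": Egress(buffer_cells=4)}
ingress = Ingress(egress)
for i in range(10):
    ingress.enqueue("p0", f"cell{i}")
print(list(ingress.schedule()))       # only 4 cells released; 6 wait for credits
```

The key property is that the ingress never transmits a cell the egress has not already reserved buffer for, so congestion surfaces as ingress queueing rather than loss at the egress.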
The DDC solution was only a flash in the pan in the DCI scene
The idea seems perfect, but implementation has not been smooth sailing. DriveNets' Network Cloud is the industry's first and so far only commercial DDC solution, with the entire software stack adapted to generic white-box routers; yet no clear sales cases have appeared on the market to date. AT&T, the proposer of the DDC architecture, gray-deployed the DDC solution in its self-built IP backbone network in 2020, but there has been little follow-up since. Why did this splash make so few waves? That should be attributed to DDC's four major flaws.
Defect 1: Unreliable device management and control plane
Inside chassis equipment, the components interconnect their management and control plane over a highly integrated, highly reliable PCIe bus, and the whole device uses a dual-supervisor design to keep that plane highly available. DDC, by contrast, builds its multi-device cluster, and carries the cluster's management and control plane, over "replace it when it breaks" cables and modules. This does break through the scale limits of box devices, but the fragile interconnect puts the management and control plane at great risk. Even when just two devices are stacked, problems such as split brain and out-of-sync table entries can occur; on DDC's unreliable management and control plane, such problems are all the more likely.
Defect 2: Highly complex equipment NOS
The SONiC community has designed a distributed forwarding framework based on the VOQ architecture and continues to iterate on it to support DDC. But while white-box switches have plenty of deployment cases, few dare to take on a white-box "chassis". Building a distributed "white chassis" means handling not only the state of many devices in a cluster and the synchronization and management of their table entries, but also the system-level implementation of version upgrades, rollbacks, and hot patches across multiple devices. DDC raises the NOS complexity requirements for a cluster exponentially; there is still no mature commercial case in the industry, and the development risk is high.
Defect 3: Lack of an operations and maintenance solution
Networks are inherently unreliable, so Ethernet has accumulated many observability and troubleshooting features and tools, such as the familiar INT (In-band Network Telemetry) and MOD (Mirror on Drop). These tools can monitor specific flows and capture the characteristics of dropped packets in order to locate and troubleshoot faults. But the cell used by DDC is only a slice of a packet: it carries no five-tuple information such as IP addresses and ports, and cannot be associated with any specific service flow (see the sketch below). Once packets are lost inside a DDC, current operations methods cannot locate the loss point; the maintenance story is seriously lacking.
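To illustrate why this matters, here is a simplified Python sketch. The packet and cell layouts are stand-ins, not any real fabric format: a whole packet carries parseable IP/TCP headers, but a cell sliced from its middle is just opaque bytes with no five-tuple to recover.

```python
# Illustrative sketch: why per-flow tools work on packets but not on cells.
import struct

def five_tuple(packet: bytes):
    """Parse src/dst IP, protocol, src/dst port from a minimal IPv4+TCP header."""
    ihl = (packet[0] & 0x0F) * 4
    proto = packet[9]
    src_ip, dst_ip = packet[12:16], packet[16:20]
    src_port, dst_port = struct.unpack("!HH", packet[ihl:ihl + 4])
    return src_ip, dst_ip, proto, src_port, dst_port

def slice_into_cells(packet: bytes, cell_size: int = 64):
    # The fabric slices the packet at arbitrary byte offsets; only the first
    # cell happens to contain the IP/TCP headers, and even that sits behind an
    # opaque fabric header rather than being exposed to telemetry tools.
    return [packet[i:i + cell_size] for i in range(0, len(packet), cell_size)]

# Build a minimal fake IPv4+TCP header just to exercise the parser.
hdr = bytes([0x45, 0, 0, 40, 0, 0, 0, 0, 64, 6]) + b"\x00\x00" \
      + bytes([10, 0, 0, 1]) + bytes([10, 0, 1, 1]) \
      + struct.pack("!HH", 49152, 80) + b"payload" * 40

print(five_tuple(hdr))                     # the full packet yields a five-tuple
cells = slice_into_cells(hdr)
print(len(cells[1]), "opaque bytes")       # a middle cell has no headers at all
```

A dropped middle cell therefore cannot be mapped back to the flow it came from, which is exactly the blind spot described above.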
Defect 4: Cost increase
In order to break through the chassis size limitation, DDC must interconnect the devices in the cluster through high-speed cables and optical modules. That interconnect cost is far higher than in chassis equipment, where line cards and fabric boards are interconnected through PCB traces and on-board high-speed links; and the larger the cluster, the higher the interconnect cost.
At the same time, although DDC disperses power draw away from a single point, the overall power consumption of a cluster interconnected through cables and modules is higher than that of chassis equipment. For chips of the same generation, assuming the DDC cluster devices are interconnected with optical modules, cluster power consumption runs about 30% higher than a chassis device.
No rehashing leftovers: the DDC solution is not suitable for AI networks either
The immaturity and imperfection of the DDC solution drove its quiet exit from the DCI scene. Yet now, riding the AI wave, it has staged a comeback. The author believes DDC is equally unsuited to AI networks. Let us analyze this in detail.
Two core demands of AI networks: high throughput and low latency
The services carried by AI networks are characterized by a small number of flows, each with very large bandwidth. At the same time the traffic pattern is uneven, with frequent many-to-one and one-to-many bursts (All-to-All and All-Reduce). The network is therefore extremely prone to uneven load, low link utilization, and packet loss from transient congestion, and the computing power cannot be fully released.
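A toy Python model makes the hash problem concrete; the addresses, link count, and flow count below are made up for illustration. With only a handful of elephant flows, per-flow ECMP hashing rarely spreads them one per link.

```python
# Toy model of the ECMP hash problem in AI traffic: few, very large flows.
import zlib

N_LINKS = 8
# Eight RDMA-style flows (proto 17 = UDP, dst port 4791 = RoCEv2).
flows = [("10.0.0.%d" % i, "10.0.1.%d" % i, 17, 49152 + i, 4791)
         for i in range(N_LINKS)]

load = [0] * N_LINKS
for ft in flows:
    key = ",".join(map(str, ft)).encode()
    # Per-flow hashing pins every packet of a flow to one link, regardless
    # of how loaded that link already is.
    load[zlib.crc32(key) % N_LINKS] += 1

print(load)   # with so few flows, the spread is almost never one per link
```

With thousands of small flows the law of large numbers smooths this out; with eight elephant flows it routinely leaves some links saturated while others sit idle.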
DDC solves only the hash problem, and brings many new defects
DDC slices each packet into cells and sprays them out by polling across links based on reachability information, so the traffic load spreads across every link in a fairly balanced way, making full use of the bandwidth and indeed solving the hash problem well, as the sketch below illustrates.
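Here is a minimal Python sketch of that cell-spraying behavior; the link names, cell size, and the sequence-number fabric header are illustrative assumptions, not a real cell format.

```python
# Minimal sketch of cell switching: slice a packet into fixed-size cells and
# spray them round-robin across all fabric links known to be reachable.
from itertools import cycle

def spray(packet: bytes, links, cell_size: int = 256):
    """Slice `packet` into cells and assign them to links in round-robin order."""
    rr = cycle(links)                       # polling over reachable links
    cells = [packet[i:i + cell_size] for i in range(0, len(packet), cell_size)]
    # Each cell carries a small fabric header (here just a sequence number)
    # so the egress chip can reassemble the original packet in order.
    return [(next(rr), seq, cell) for seq, cell in enumerate(cells)]

assignments = spray(b"\x00" * 1024, links=["L0", "L1", "L2", "L3"])
for link, seq, cell in assignments:
    print(link, seq, len(cell))   # a 1024-byte packet -> 4 cells, one per link
```

Beyond solving the hash problem, however, DDC still has four major flaws in the AI scenario.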
Defect 1: Dependent on specific hardware, a closed private network with no generality
Cell switching and VOQ in the DDC architecture both rely on specific hardware chips for their implementation; existing DCN network equipment cannot be reused. The rapid development of Ethernet has been built on plug-and-play convenience, generality, and standardization. DDC is tied to its hardware and builds a closed private network over a proprietary switching protocol; it has no generality.
Defect 2: The large-cache design drives up network cost and is unsuitable for large-scale DCN networking
If the DDC solution enters the DCN, then on top of the high interconnect cost it must also carry the cost burden of the chip's large cache. DCN networks today use small-buffer devices with at most 64 MB of cache; DDC solutions inherited from DCI scenarios typically carry more than a gigabyte of HBM per chip. And compared with DCI, large-scale DCN networks are far more sensitive to cost; the calculation below shows why the buffers grow so large.
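A back-of-envelope calculation shows where the GB-class buffer requirement comes from in a high-convergence, loss-free design; all figures below are assumed purely for illustration.

```python
# How much buffer one egress port needs to absorb an incast burst without loss.
ingress_ports = 32          # senders converging on one egress (32:1 convergence)
port_rate_gbps = 400        # per-port line rate
burst_ms = 1                # all senders burst simultaneously for 1 ms

offered = ingress_ports * port_rate_gbps * burst_ms * 1e-3 / 8   # GB offered
drained = port_rate_gbps * burst_ms * 1e-3 / 8                   # GB drained
backlog_gb = offered - drained
print(f"backlog to buffer: {backlog_gb * 1000:.0f} MB")          # ~1550 MB

# A ~64 MB on-chip buffer absorbs only a sliver of this backlog, which is why
# DCI-class chips pair with GB-scale HBM; that HBM then dominates cost when
# the same design is pushed into a large DCN fabric.
```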
Defect 3: Higher static network latency, a mismatch for AI scenarios
For a high-performance AI network whose purpose is to release computing power, the goal is to shorten job completion time. DDC's large cache buffers packets, which inevitably adds static latency to hardware forwarding; on top of that, cell switching's slicing, encapsulation, and reassembly of packets add further forwarding delay. Comparative test data show DDC forwarding latency 1.4 times higher than that of a traditional Ethernet network.
Defect 4: As DC scale grows, DDC's reliability problems worsen
Compared with replacing chassis equipment in the DCI scenario, entering the DCN requires DDC to form a far larger cluster, at minimum a network POD. That puts the "boxes" further apart and the components more dispersed, which in turn raises the bar for the reliability of the cluster's management and control plane, for NOS state synchronization across devices, and for POD-level operations and maintenance. Every one of DDC's flaws will be magnified.
DDC is at best a transitional solution
Of course, no problem is unsolvable. Accept enough constraints, and this particular scenario can easily become a stage for big vendors to "show off their skills". But networks pursue reliability, simplicity, and efficiency, and reject complexity. Especially against today's backdrop of "doing more with fewer people", we must seriously weigh the cost of implementing DDC.
Faced with the load-balancing problem of AI scenarios, many deployments have already solved it through global static or dynamic orchestration of forwarding paths. In the future it can also be solved on the end side, with NICs performing per-packet spraying (Packet Spray) and out-of-order reassembly, as sketched below. DDC is therefore at best a short-term transitional solution.
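As a sketch of that end-side alternative, the following Python illustrates per-packet spraying with sequence-number reassembly at the receiving NIC. The class and field names are illustrative, not any vendor's API.

```python
# Packets of one flow are sprayed across all paths per-packet; the receiving
# NIC restores order before delivering to the application.
import heapq

class ReorderBuffer:
    """Receiving NIC side: delivers packets to the app in sequence order."""
    def __init__(self):
        self.expected = 0
        self.heap = []                  # min-heap of (seq, payload)

    def receive(self, seq: int, payload: bytes):
        heapq.heappush(self.heap, (seq, payload))
        delivered = []
        while self.heap and self.heap[0][0] == self.expected:
            delivered.append(heapq.heappop(self.heap)[1])
            self.expected += 1          # release the next in-order run
        return delivered

rob = ReorderBuffer()
# Packets 0..3 sprayed over different paths arrive out of order:
for seq in [1, 3, 0, 2]:
    out = rob.receive(seq, b"pkt%d" % seq)
    print(seq, [p.decode() for p in out])
# Arrival of seq 0 releases pkts 0 and 1; arrival of seq 2 releases 2 and 3.
```

Because the reordering happens at the endpoints, the fabric itself stays standard, stateless Ethernet, which is exactly the generality DDC gives up.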
After a deep dive, the driving force behind DDC may be DNX
Finally, let's talk about the mainstream network chip vendor Broadcom and compare its two familiar product lines, StrataXGS and StrataDNX. XGS follows the high-bandwidth, low-cost route, quickly shipping small-buffer, high-bandwidth chips, and continues to dominate DCN market share. StrataDNX, by contrast, carries the cost of a large cache and keeps the VOQ-plus-cell-switching story alive, hoping that DDC's entry into the DC will extend its life. There appear to be no such cases in North America; the domestic DDC push may be DNX's last straw.
Today, when GPUs and other hardware are subject to restrictions in our country, do we really need DDC? Let's leave more opportunities for domestically produced devices!