Full CPU offload via TOE100G-IP eliminates software TCP/IP overhead — enabling peak 100G throughput for FPGA-to-FPGA interconnects in AI clusters and data centers.
TCP transmission at 100G speed is a compute-intensive workload for CPUs, and sustaining high throughput is difficult without a hardware accelerator. The TOE100G-IP core is designed to fully offload TCP transmission from the CPU to pure hardware logic. Our demo application on an Alveo card achieves nearly 100G throughput with minimal CPU usage.
| Aspect | ⚠ Standard NIC (Problem) | ✓ TOE100G-IP + Alveo (Solution) |
|---|---|---|
| TCP/IP Processing | Handled entirely in host CPU software — device driver, network subsystem, TCP stack, socket interface all run on CPU cores | Fully offloaded to FPGA hardware logic inside the Alveo accelerator card — zero CPU involvement in packet processing |
| Peak Throughput | ~8,500 MB/s (~68 Gbps). Only ~68% of the 100G line rate is achieved per single TCP session | 12,300 MB/s (~98.4 Gbps). Near wire rate, close to saturating 100G Ethernet |
| Bandwidth Stability | Unpredictable — CPU context switches and OS scheduling cause bandwidth drops and jitter under load | Deterministic and jitter-free — FPGA pipeline is independent of OS scheduling, delivering consistent throughput |
| CPU Utilization | High CPU load — cores consumed by interrupt handling, memory copies, and stack processing | ~0% TCP/IP CPU load — CPU is fully free for application workloads and other tasks |
| Scalability | Limited by CPU core count and OS overhead — adding sessions degrades per-session performance | Multiple TOE100G-IP instances can be integrated; additional Alveo cards add independent 100G channels linearly |
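
The throughput figures in the table can be cross-checked with simple unit arithmetic. The sketch below assumes 1 MB = 10^6 bytes and a nominal line rate of 100 Gbps; the function names are illustrative and not part of any TOE100G-IP tooling.

```python
LINE_RATE_GBPS = 100.0  # nominal 100G Ethernet line rate (assumption)


def mb_per_s_to_gbps(mb_per_s: float) -> float:
    """Convert megabytes per second (1 MB = 10**6 bytes) to gigabits per second."""
    return mb_per_s * 8 / 1000


def line_rate_fraction(mb_per_s: float, line_rate_gbps: float = LINE_RATE_GBPS) -> float:
    """Return throughput as a fraction of the nominal line rate."""
    return mb_per_s_to_gbps(mb_per_s) / line_rate_gbps


if __name__ == "__main__":
    # Figures taken from the comparison table above.
    for label, rate in (("Standard NIC", 8500), ("TOE100G-IP + Alveo", 12300)):
        gbps = mb_per_s_to_gbps(rate)
        share = line_rate_fraction(rate)
        print(f"{label}: {gbps:.1f} Gbps ({share:.1%} of line rate)")
```

With the table's values, 8,500 MB/s corresponds to 68 Gbps and 12,300 MB/s to 98.4 Gbps.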
System architecture of 100G TCP/IP acceleration using AMD Alveo and TOE100G-IP. TCP/IP processing is offloaded from host software to FPGA hardware, while payload data moves through DMA buffers over PCIe/XRT. This reduces CPU and OS network-stack overhead, allowing full-duplex FPGA-to-FPGA 100G Ethernet transfer to approach maximum throughput.
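
As a rough guide to what "maximum throughput" means for TCP payload on a 100G link, the sketch below estimates the payload ceiling once framing and header overhead are subtracted. It assumes standard Ethernet framing without a VLAN tag and IPv4 and TCP headers without options. The MTU values of 1500 and 9000 bytes are illustrative and are not taken from the demo configuration.

```python
# Per-frame overhead on the wire, in bytes (standard Ethernet, no VLAN tag).
ETH_OVERHEAD = 8 + 14 + 4 + 12  # preamble+SFD, Ethernet header, FCS, inter-frame gap
IP_TCP_HEADERS = 20 + 20        # IPv4 header + TCP header, both without options


def max_tcp_goodput_gbps(mtu: int, line_rate_gbps: float = 100.0) -> float:
    """Upper bound on TCP payload throughput for full-size frames at a given MTU."""
    payload_bytes = mtu - IP_TCP_HEADERS   # TCP payload carried per frame
    wire_bytes = mtu + ETH_OVERHEAD        # bytes the frame occupies on the link
    return line_rate_gbps * payload_bytes / wire_bytes


if __name__ == "__main__":
    for mtu in (1500, 9000):  # illustrative MTU values (assumption)
        print(f"MTU {mtu}: at most {max_tcp_goodput_gbps(mtu):.2f} Gbps of TCP payload")
```

Under these assumptions the ceiling is just under 95 Gbps at a 1500-byte MTU and just over 99 Gbps at a 9000-byte MTU, so the achievable payload rate depends on the frame size in use.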
AI cluster acceleration using multiple AMD Alveo FPGA cards connected through 100G Ethernet. Each FPGA runs TOE100G-IP as a hardware TCP/IP pipeline, bypassing CPU network-stack bottlenecks and shared CPU limitations. Performance scales by adding more FPGA cards, enabling high-throughput, low-latency, parallel processing for data center and AI workloads.
| Item | Specification |
|---|---|
| FPGA Accelerator Card | AMD Alveo U50 or AMD Alveo U250 |
| Ethernet Connection | 100G QSFP28 cable |
| IP Core | TOE100G-IP: full TCP offload engine (hardware TCP/IP) |
| Host Software | TOE100DMATest application: client/server mode, half- and full-duplex tests |
| Host System | Linux-based turnkey system with a PCIe Gen3 or Gen4 slot |
| Reference Design | TOE100G-IP on Alveo Card Reference Design |
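
For comparison with the hardware path, a software-only TCP transfer can be timed with ordinary sockets. The following is a minimal sketch and is not the TOE100DMATest application: it sends data over the loopback interface with the operating system's TCP stack, so it exercises the CPU-side processing described above without touching a NIC or the FPGA. The function name and default sizes are hypothetical.

```python
import socket
import threading
import time


def measure_loopback_mb_per_s(total_bytes=64 * 1024 * 1024, chunk_size=1024 * 1024):
    """Send total_bytes over a loopback TCP connection.

    Returns (bytes_received, throughput in MB/s with 1 MB = 10**6 bytes).
    """
    received = 0

    def serve(listener):
        nonlocal received
        conn, _ = listener.accept()
        with conn:
            while True:
                data = conn.recv(chunk_size)
                if not data:  # sender closed the connection
                    break
                received += len(data)

    # Port 0 lets the OS pick a free port.
    with socket.create_server(("127.0.0.1", 0)) as listener:
        port = listener.getsockname()[1]
        receiver = threading.Thread(target=serve, args=(listener,))
        receiver.start()

        payload = bytes(chunk_size)  # zero-filled buffer
        start = time.perf_counter()
        with socket.create_connection(("127.0.0.1", port)) as client:
            sent = 0
            while sent < total_bytes:
                n = min(chunk_size, total_bytes - sent)
                client.sendall(payload[:n])
                sent += n
        receiver.join()  # wait until the receiver has drained all data
        elapsed = time.perf_counter() - start

    return received, received / elapsed / 1e6


if __name__ == "__main__":
    got, rate = measure_loopback_mb_per_s()
    print(f"received {got} bytes at {rate:.0f} MB/s over loopback")
```

The result depends heavily on the host CPU, the Python interpreter, and system load, so it indicates software-stack cost only and is not a substitute for measurements on real 100G hardware.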
A free evaluation demo for AMD Alveo U250 and U50 is publicly available, allowing you to directly verify 100G TCP/IP acceleration performance on real hardware.
For more details, please refer to the demo video and documentation published on our website.