12,000 MB/s TCP/IP Performance on 100G Ethernet | AMD Alveo

Problem vs. Solution

TCP transmission at 100G speed is a compute-intensive workload for CPUs, and it is difficult to achieve high sustained throughput without a hardware accelerator. The TOE100G-IP core is designed to fully offload TCP transmission from the CPU using pure hardware logic. Our demo application on an Alveo card achieves nearly 100G throughput with minimal CPU usage.

Aspect	⚠ Standard NIC (Problem)	✓ TOE100G-IP + Alveo (Solution)
TCP/IP Processing	Handled entirely in host CPU software — device driver, network subsystem, TCP stack, socket interface all run on CPU cores	Fully offloaded to FPGA hardware logic inside the Alveo accelerator card — zero CPU involvement in packet processing
Peak Throughput	~68 Gbps (~8,500 MB/s). Only ~68% of the 100G line rate is achieved per single TCP session	12,300 MB/s (~98.4 Gbps). Near wire rate, effectively saturating the 100G Ethernet link
Bandwidth Stability	Unpredictable — CPU context switches and OS scheduling cause bandwidth drops and jitter under load	Deterministic and jitter-free — FPGA pipeline is independent of OS scheduling, delivering consistent throughput
CPU Utilization	High CPU load — cores consumed by interrupt handling, memory copies, and stack processing	~0% TCP/IP CPU load — CPU is fully free for application workloads and other tasks
Scalability	Limited by CPU core count and OS overhead — adding sessions degrades per-session performance	Multiple TOE100G-IP instances can be integrated; additional Alveo cards add independent 100G channels linearly
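The table quotes throughput in both Gbps and MB/s. The conversion between the two can be checked with a short Python sketch. It assumes decimal units (1 MB = 10^6 bytes, 1 Gbps = 10^9 bit/s) and a nominal line rate of 100 Gbps. The function names are illustrative and are not part of any TOE100G-IP software.

```python
LINE_RATE_GBPS = 100.0  # assumption: nominal 100G Ethernet line rate, decimal units


def mbytes_to_gbps(mb_per_s: float) -> float:
    """Convert MB/s (10^6 bytes per second) to Gbps (10^9 bits per second)."""
    return mb_per_s * 8 / 1000


def line_rate_percent(mb_per_s: float) -> float:
    """Express a throughput in MB/s as a percentage of the nominal line rate."""
    return mbytes_to_gbps(mb_per_s) / LINE_RATE_GBPS * 100


if __name__ == "__main__":
    # Figures taken from the comparison table above.
    for label, rate in [("Standard NIC", 8500), ("TOE100G-IP + Alveo", 12300)]:
        print(f"{label}: {rate} MB/s = {mbytes_to_gbps(rate):.1f} Gbps "
              f"({line_rate_percent(rate):.1f}% of line rate)")
```

Under these assumptions, 8,500 MB/s corresponds to about 68 Gbps and 12,300 MB/s to about 98.4 Gbps, matching the figures in the table.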

Data Flow: TCP/IP Accelerator by TOE100G-IP

System architecture of 100G TCP/IP acceleration using AMD Alveo and TOE100G-IP. TCP/IP processing is offloaded from host software to FPGA hardware, while payload data moves through DMA buffers over PCIe/XRT. This reduces CPU and OS network-stack overhead, allowing full-duplex FPGA-to-FPGA 100G Ethernet transfer to approach maximum throughput.

TOE100G-IP System Architecture Flow Diagram

Measured Performance Results

Standard NIC (1 TCP session)

~68 Gbps ≈ 8,500 MB/s

Full Duplex (simultaneous TX+RX)

10,500 MB/s

Peak (half-duplex, no overhead)

12,300 MB/s (~98.4 Gbps), near 100G line rate

Test hardware: AMD Alveo U50 / U250 accelerator cards

Application: TOE100DMATest on two Turnkey systems

Frame type: Jumbo frames enabled for max efficiency

Peak achieved without data generation or verification overhead
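To see why jumbo frames matter for the peak figure, the following Python sketch estimates the upper bound on TCP payload throughput for a given MTU. It is a back-of-the-envelope model, not a description of the TOE100G-IP implementation. It assumes standard Ethernet framing without a VLAN tag, IPv4 and TCP headers without options, and a 9000-byte jumbo MTU; the page does not state the actual frame size used in the test.

```python
# Per-frame overhead on the wire, in bytes (standard Ethernet, no VLAN tag).
PREAMBLE_SFD = 8
ETH_HEADER = 14
FCS = 4
INTERFRAME_GAP = 12
IPV4_HEADER = 20  # assumption: no IP options
TCP_HEADER = 20   # assumption: no TCP options

LINE_RATE_MB_S = 12_500  # 100 Gbps expressed in MB/s (10^6 bytes per second)


def tcp_payload_ceiling(mtu: int) -> float:
    """Upper bound on TCP payload throughput (MB/s) for a given IP MTU."""
    payload = mtu - IPV4_HEADER - TCP_HEADER
    wire_bytes = mtu + ETH_HEADER + FCS + PREAMBLE_SFD + INTERFRAME_GAP
    return LINE_RATE_MB_S * payload / wire_bytes


if __name__ == "__main__":
    for mtu in (1500, 9000):  # 9000 is an assumed jumbo MTU
        print(f"MTU {mtu}: at most {tcp_payload_ceiling(mtu):.0f} MB/s of TCP payload")
```

Under these assumptions the ceiling is roughly 11,870 MB/s with a 1500-byte MTU and roughly 12,390 MB/s with a 9000-byte MTU, so the measured 12,300 MB/s peak is only reachable with jumbo frames.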

Ideal for Applications Requiring High-Speed FPGA Interconnect

An AI cluster can be accelerated by connecting multiple AMD Alveo FPGA cards through 100G Ethernet. Each FPGA runs TOE100G-IP as a hardware TCP/IP pipeline, so traffic bypasses the CPU network stack and does not compete for shared CPU resources. Performance scales by adding more FPGA cards, enabling high-throughput, low-latency, parallel processing for data center and AI workloads.

AI Cluster with AMD Alveo and TOE100G-IP over 100G Ethernet

System Requirements

FPGA Accelerator Card	AMD Alveo U50 or AMD Alveo U250
Ethernet Connection	100G QSFP28 Cable
IP Core	TOE100G-IP: full TCP Offload Engine (hardware TCP/IP)
Host Software	TOE100DMATest application: client/server mode, half- and full-duplex tests
Host System	Linux-based Turnkey system with PCIe Gen3/4 slot
Reference Design	TOE100G-IP on Alveo Card Reference Design
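Because payload data crosses PCIe between host memory and the card, the host slot must carry at least the TCP payload rate. The Python sketch below estimates raw PCIe link bandwidth per direction as a rough sanity check. The lane counts are assumptions (the requirements above specify only a Gen3/4 slot), and the result excludes packet-level protocol overhead, so usable bandwidth is lower than the figure computed here.

```python
TRANSFER_RATE_GT_S = {3: 8.0, 4: 16.0}  # per-lane transfer rate for PCIe Gen3 and Gen4
ENCODING = 128 / 130                    # 128b/130b line coding used by Gen3 and Gen4

PEAK_PAYLOAD_MB_S = 12_300  # peak TCP throughput reported on this page


def pcie_raw_mb_s(gen: int, lanes: int) -> float:
    """Raw per-direction link bandwidth in MB/s, before TLP/DLLP overhead."""
    return TRANSFER_RATE_GT_S[gen] * ENCODING / 8 * lanes * 1000


if __name__ == "__main__":
    # Lane widths below are hypothetical slot configurations, not stated requirements.
    for gen, lanes in [(3, 8), (3, 16), (4, 8), (4, 16)]:
        raw = pcie_raw_mb_s(gen, lanes)
        verdict = "above" if raw > PEAK_PAYLOAD_MB_S else "below"
        print(f"Gen{gen} x{lanes}: {raw:.0f} MB/s raw, {verdict} the 12,300 MB/s peak")
```

In this model a Gen3 x16 or Gen4 x8 link offers about 15,750 MB/s raw per direction, which leaves margin above the reported peak, while a Gen3 x8 link would fall short.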

Free Evaluation Demo

A free evaluation demo for AMD Alveo U250 and U50 is publicly available, allowing you to directly verify 100G TCP/IP acceleration performance on real hardware.

For more details, please refer to the demo video and documentation published on our website.

📄 Reference Design Document

📋 Demo Instruction Manual

💾 Free Evaluation Demo Bitfile (U250)

💾 Free Evaluation Demo Bitfile (U50)

DEMO VIDEO

100G TCP/IP Acceleration with AMD Alveo and TOE100G-IP: Up to 12,300 MB/s


Accelerating TCP/IP to 12,000 MB/s on 100G Ethernet with AMD Alveo
