Built on an AMD Alveo accelerator card and powered by multiple NVMeTCP25G-IP cores, this reference design enables a host system to access four remote NVMe SSDs simultaneously through four independent 25G Ethernet links — with a scalable FPGA architecture ready for more NVMe/TCP sessions and zero CPU involvement in protocol processing.
A standard NIC-based NVMe/TCP host relies on CPU software to handle all protocol layers — a bottleneck that grows with each additional remote SSD. This Alveo-based design offloads TCP/IP and NVMe/TCP processing into FPGA logic, keeping the CPU free for the application while the card handles data movement to host memory over PCIe/XRT.
| Aspect | ⚠ Standard NIC Approach | ✓ NVMeTCP25G-IP on Alveo |
|---|---|---|
| Protocol Processing | CPU processes TCP/IP stack and NVMe/TCP protocol | Hardware offload for TCP/IP and NVMe/TCP host operations |
| Data Path | Multiple intermediate data copies in host memory | Independent DMA paths for multiple 25G sessions |
| Scalability | Scaling to multiple remote SSDs increases CPU and memory load | FPGA scalability for additional NVMe/TCP channels |
The design combines FPGA protocol offload, independent 25G Ethernet channels, direct host-memory transfer, and a scalable IP-core architecture to create a practical platform for high-performance remote storage access.
NVMeTCP25G-IP integrates a TCP/IP stack and NVMe/TCP host functions in hardware for write/read access to remote NVMe SSDs, with zero CPU involvement in protocol processing.
Four 25G Ethernet connections operate simultaneously, allowing the host to access four remote NVMe SSDs in parallel with full per-session bandwidth isolation.
The Alveo card plugs into the host system as a PCIe accelerator, using XRT for register access and high-speed DMA transfers directly into host memory.
Additional NVMeTCP25G-IP instances can be integrated to expand the number of remote NVMe SSD sessions, scaling storage bandwidth for larger deployments.
The design is especially attractive where data is generated, stored, or processed across multiple remote locations but must be accessed by a central host with minimal CPU overhead.
Offload NVMe/TCP transport entirely to the Alveo FPGA, freeing host CPUs for training and inference. A GPU server fitted with the Alveo card connects through a 25G Ethernet switch to access model data, intermediate results, and inference artifacts in parallel, with no host-side TCP/IP or NVMe/TCP software overhead and consistently low latency across all sessions.
Sustain high-bandwidth, multi-stream media workflows without burdening the host CPU. A GPU server with the Alveo card routes 25GbE sessions through a switch to dedicated media servers — enabling simultaneous codec processing, frame-accurate playback, and adaptive HTTP delivery from independent NVMe storage targets.
Benchmarked on a single Alveo U50 card. All throughput figures are sustained transfer rates over 25G Ethernet to remote NVMe SSDs, with TCP/IP and NVMe/TCP fully offloaded to FPGA logic — the host CPU contributes 0% to protocol processing.
| Configuration | Read Speed | Write Speed | CPU Usage |
|---|---|---|---|
| NVMeTCP25G ×1 | 2,679 MB/s | 2,531 MB/s | 0% |
| NVMeTCP25G ×4 | ~9,500 MB/s | ~9,000 MB/s | 0% |
A single IP core sustains 2,679 MB/s read and 2,531 MB/s write, close to the 3,125 MB/s raw rate of a 25G Ethernet link before protocol overhead. Four independent NVMeTCP25G-IP instances deliver ~9,500 MB/s read and ~9,000 MB/s write, about 3.5 times the single-core figures, so throughput scales close to linearly with the number of cores. Because the entire protocol stack runs in FPGA logic, host CPU utilisation for protocol processing remains at 0% throughout.
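
The relationship between these figures and the 25G line rate can be checked with a few lines of arithmetic. The sketch below uses only the numbers from the benchmark table; it assumes that "MB/s" means 10^6 bytes per second and compares against the raw 25 Gb/s line rate, which ignores Ethernet, IP, TCP, and NVMe/TCP header overhead. The function names are illustrative and not part of any product API.

```python
# Figures copied from the benchmark table (MB/s, assumed to mean 10^6 bytes/s).
LINK_GBPS = 25  # nominal 25G Ethernet line rate
SINGLE = {"read": 2679, "write": 2531}  # one NVMeTCP25G-IP instance
QUAD = {"read": 9500, "write": 9000}    # four instances (approximate values)
INSTANCES = 4


def raw_link_mb_per_s(gbps: float) -> float:
    """Raw line rate in MB/s, before any protocol header overhead."""
    return gbps * 1000 / 8


def link_utilisation(mb_per_s: float, gbps: float = LINK_GBPS) -> float:
    """Fraction of the raw line rate used by the measured throughput."""
    return mb_per_s / raw_link_mb_per_s(gbps)


def scaling_efficiency(multi: float, single: float, n: int) -> float:
    """Measured n-instance throughput relative to n times the single rate."""
    return multi / (single * n)


for op in ("read", "write"):
    util = link_utilisation(SINGLE[op])
    eff = scaling_efficiency(QUAD[op], SINGLE[op], INSTANCES)
    print(f"{op}: {util:.1%} of raw line rate, "
          f"{eff:.1%} scaling efficiency at x{INSTANCES}")
```

Under these assumptions a single session uses roughly 86% (read) and 81% (write) of the raw link rate, and the four-instance configuration reaches roughly 89% of ideal linear scaling.
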
| Item | Specification |
|---|---|
| FPGA Accelerator Card | Xilinx Alveo U50 (16 nm UltraScale+ FPGA, PCIe Gen3 x16) |
| Network Interfaces | 100G to 4× 25G breakout cable |
| IP Core | NVMeTCP25G-IP, 4 instances, each managing one 25GbE session |
| Protocol Support | NVMe/TCP (NVMe over TCP/IP) |
| Host Interface | PCIe Gen3 x16 (standard add-in card slot) |
| Target System | Any Linux PC or server running the NVMe/TCP target driver (nvmet) with an NVMe SSD |
| Host OS | Ubuntu 20.04.1 |
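
For the target side listed in the table above, the Linux `nvmet` driver is normally configured through configfs. The sketch below only builds and prints the sequence of steps that would expose one NVMe SSD over TCP; it does not touch the system. The NQN, IP address, and device path are hypothetical placeholders, port 4420 is the conventional NVMe over Fabrics port, and the exact setup used for this demo is not stated in the source, so treat this as one plausible configuration rather than the documented one.

```python
from pathlib import PurePosixPath

# configfs root used by the Linux NVMe target (nvmet) driver.
NVMET_ROOT = PurePosixPath("/sys/kernel/config/nvmet")


def nvmet_tcp_plan(nqn, device, addr, port_id=1, nsid=1, svc_port=4420):
    """Return the ordered steps to export `device` over NVMe/TCP.

    All argument values are supplied by the caller; nothing is executed.
    """
    subsys = NVMET_ROOT / "subsystems" / nqn
    ns = subsys / "namespaces" / str(nsid)
    port = NVMET_ROOT / "ports" / str(port_id)
    return [
        ("modprobe", "nvmet"),
        ("modprobe", "nvmet-tcp"),
        # Create the subsystem and allow any host to connect (test setups only).
        ("mkdir", str(subsys)),
        ("write", str(subsys / "attr_allow_any_host"), "1"),
        # Attach the SSD as a namespace and enable it.
        ("mkdir", str(ns)),
        ("write", str(ns / "device_path"), device),
        ("write", str(ns / "enable"), "1"),
        # Create a TCP port listening on the target's 25G interface address.
        ("mkdir", str(port)),
        ("write", str(port / "addr_trtype"), "tcp"),
        ("write", str(port / "addr_adrfam"), "ipv4"),
        ("write", str(port / "addr_traddr"), addr),
        ("write", str(port / "addr_trsvcid"), str(svc_port)),
        # Linking the subsystem into the port makes it reachable.
        ("symlink", str(subsys), str(port / "subsystems" / nqn)),
    ]


def as_shell(step):
    """Render one step as the equivalent shell command (to be run as root)."""
    kind = step[0]
    if kind == "write":
        return f"echo {step[2]} > {step[1]}"
    if kind == "symlink":
        return f"ln -s {step[1]} {step[2]}"
    return f"{kind} {step[1]}"  # modprobe / mkdir


# Hypothetical values for a single target machine.
for step in nvmet_tcp_plan("nqn.2024-01.example:ssd0", "/dev/nvme0n1",
                           "192.168.25.10"):
    print(as_shell(step))
```

With four remote SSDs, the same plan would be repeated on each target machine with its own address and NQN, one per 25GbE session.
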
Watch the full demo of four NVMeTCP25G-IP sessions running simultaneously on the Alveo U50 card — real hardware, real NVMe SSDs, real 25G Ethernet links.
For more details, please refer to the demo videos and documentation published on our website.