

# NVMe-IP for PLDA PCIe reference design manual

Rev1.1 9-Oct-18

### 1 NVMe

NVM Express (NVMe) defines the interface through which a host controller accesses solid state drives (SSDs) over PCI Express. NVMe streamlines command submission and completion so that each command needs only two register writes: one to issue the command and one to acknowledge its completion. NVMe also supports parallel operation, allowing up to 64K commands in a single queue. As a result, both sequential and random access performance are improved.

In the PCIe SSD market, two standards are found: AHCI and NVMe. AHCI is the older standard, designed to interface with SATA hard disk drives, while NVMe is designed for non-volatile memory such as SSDs. A more detailed comparison of the AHCI and NVMe protocols can be found in the "A Comparison of NVMe and AHCI" document.

https://sata-io.org/system/files/member-downloads/NVMe%20and%20AHCI_%20_long_.pdf

Examples of NVMe storage devices are listed at http://www.nvmexpress.org/products/.

Generally, the user needs to install an NVMe driver to access an NVMe SSD, as shown in Figure 1-1. The physical connector of an NVMe SSD is a PCIe type such as an M.2 connector. NVMe-IP implements the NVMe driver and the tasks that would otherwise run on the CPU in pure hardware logic, so no CPU is required to access the NVMe SSD when NVMe-IP is used on an FPGA board.





### 2 Hardware overview



Figure 2-1 NVMe-IP demo hardware

The hardware system can be split into three groups according to the interface.

- 1) TestGen: TestGen is the example user logic that writes and reads data in this reference design. For a Write command, TestGen generates test data to U2IPFIFO at the highest speed allowed by flow control. For a Read command, TestGen reads test data from IP2UFIFO and verifies it, again at the highest speed allowed by flow control. TestGen uses a 256-bit data bus and runs in the UserClk domain at 200 MHz, giving a maximum bandwidth of 256 bit x 200 MHz = 6400 Mbyte/sec, which exceeds the maximum performance of a Gen3 SSD.
- 2) NVMe: NVMe-IP, connected to XpressRICH3, is used to interface with the NVMe SSD. The command and data interfaces of NVMe-IP use the dgIF typeS format. The command interface is controlled by the CPU, while the data interface is connected to FIFOs. IdenRAM (implemented as a simple dual-port RAM) connects the Identify interface of NVMe-IP to the CPU.
- 3) CPU: The user controls the test operation in the demo through the Serial console. The CPU firmware receives a command and its parameters from the user, then sets the parameters to the hardware through the AXI4-Lite bus. LAxi2Reg holds the register set of test parameters, each mapped to a different CPU address, and decodes the AXI4-Lite address to select the active parameter. For write access, the write data from the AXI4-Lite bus is stored in the parameter selected by the address. For read access, the value of the selected parameter is returned on the AXI4-Lite bus. Read access lets the CPU monitor the hardware status and display it to the user through the Serial console.

More details of the hardware are described as follows.



#### 2.1 TestGen

This module generates a test pattern to WrFf for a Write command, or reads data from RdFf and verifies it for a Read command, at the fastest possible speed in order to check system performance. The hardware inside TestGen is shown in Figure 2-2.





To start a Write operation, rWrTrans is asserted to '1' when WrPattStart from LAxi2Reg is asserted to '1'. While rWrTrans='1' (a Write command is operating) and WrFfAFull='0' (WrFf is ready to receive new data), rWrFfWrEn[0] is asserted to '1' to send test data to WrFf. If WrFfAFull='1', rWrFfWrEn[0] is de-asserted to '0' to pause the data transfer. rDataCnt is the data counter that tracks the total transfer size; it increments whenever rWrFfWrEn[0] is asserted. When all data have been transferred (rDataCnt=EndSize), rWrTrans and rWrFfWrEn[0] are de-asserted to '0' to stop the transfer.

For a Read operation, RdFfRdEn is generated by applying NOT logic to RdFfEmpty, so data is read whenever RdFf is not empty. rDataCnt increments when RdFfRdEn is asserted to '1'.
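The flow control described above can be sketched as a small cycle-based C model. This is only a behavioural illustration, not the RTL: the struct and function names are hypothetical, rDataCnt is assumed to count FIFO words (the manual does not state its unit), and any pipelining inside the real design is ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of the TestGen write-side registers. */
typedef struct {
    bool     wr_trans;  /* models rWrTrans: a Write transfer is active    */
    bool     wr_en;     /* models rWrFfWrEn[0]: a word is written to WrFf */
    uint64_t data_cnt;  /* models rDataCnt (assumed unit: one FIFO word)  */
} testgen_wr_t;

/* Advance the model by one UserClk cycle. */
static void testgen_wr_clock(testgen_wr_t *s, bool wr_patt_start,
                             bool wr_ff_afull, uint64_t end_size)
{
    /* The word enabled in the previous cycle has now been transferred. */
    if (s->wr_en) {
        s->data_cnt++;
    }
    /* WrPattStart begins a new transfer and restarts the counter. */
    if (wr_patt_start) {
        s->wr_trans = true;
        s->data_cnt = 0;
    }
    /* Stop once the requested amount (EndSize) has been sent. */
    if (s->wr_trans && s->data_cnt == end_size) {
        s->wr_trans = false;
    }
    /* Send data only while a transfer is active and WrFf is not almost full. */
    s->wr_en = s->wr_trans && !wr_ff_afull;
}

/* Read side: RdFfRdEn is simply NOT RdFfEmpty. */
static bool testgen_rd_en(bool rd_ff_empty)
{
    return !rd_ff_empty;
}
```

Calling `testgen_wr_clock()` once per simulated clock with `wr_ff_afull` set for a cycle shows the pause behaviour: the counter stops advancing one cycle later and resumes when the flag clears.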

Block no.1 in the lower part of Figure 2-2 shows the logic that generates the test pattern in the TestGen module. To create unique test data for each 512-byte block, the test pattern is designed as shown in Figure 2-4.



The test pattern consists of two parts: a 64-bit header in Dword#0 and Dword#1 of each 512-byte block, and test data in Dword#2 to Dword#127. The 64-bit header is the address value in 512-byte units. As shown in Figure 2-2, TrnAddr is loaded as the initial value of rTrnAddr. rTrnAddr is used as the 64-bit header of each 512-byte block and increments after every 512 bytes transferred. rDataCnt and the write/read enable signals are monitored to detect the end of each 512-byte transfer.

TestGen can generate five patterns: 32-bit increment, 32-bit decrement, all 0, all 1, and 32-bit LFSR. The 32-bit increment pattern is generated from the lower bits of rTrnAddr and rDataCnt. The decrement pattern is the bitwise NOT of the increment data. The equation of the 32-bit LFSR is $x^{31} + x^{21} + x + 1$. To create the 256-bit LFSR pattern, two sets of 32-bit LFSR logic are designed as shown in Figure 2-5.

The 1<sup>st</sup> DW of set#1 uses the lower 16 bits of Addr512B (the address in 512-byte units) and the upper 16 bits of NAddr512B (the bitwise NOT of Addr512B) as the initial value for generating the test pattern. Similarly, the 1<sup>st</sup> DW of set#2 uses the lower 16 bits of NAddr512B and the upper 16 bits of Addr512B.





Each LFSR logic set generates 128-bit LFSR data, so four 32-bit LFSR values must be produced within one clock cycle. The LFSR logic must therefore use a look-ahead style to generate PattD0/D2/D4/D6 or PattD1/D3/D5/D7 in the same clock cycle.
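As a rough illustration of the seeding and look-ahead idea, the C sketch below models one LFSR set in software. Several points are assumptions and not taken from the RTL: the function names are hypothetical, the seed is assumed to place the "lower 16 bits" in bits [15:0] and the "upper 16 bits" in bits [31:16], and the polynomial exponents are assumed to map to register bits 31, 21, 1, and 0 with the feedback shifted in at bit 0. In hardware the four values come from combinational look-ahead logic, whereas here they are simply computed by stepping the LFSR four times.

```c
#include <stdint.h>

/* Seed of set#1: lower 16 bits of Addr512B, upper 16 bits of NAddr512B. */
static uint32_t lfsr_seed_set1(uint32_t addr512b)
{
    uint32_t naddr512b = ~addr512b;
    return (naddr512b & 0xFFFF0000u) | (addr512b & 0x0000FFFFu);
}

/* Seed of set#2: lower 16 bits of NAddr512B, upper 16 bits of Addr512B. */
static uint32_t lfsr_seed_set2(uint32_t addr512b)
{
    uint32_t naddr512b = ~addr512b;
    return (addr512b & 0xFFFF0000u) | (naddr512b & 0x0000FFFFu);
}

/* One LFSR step. ASSUMPTION: taps at bits 31, 21, 1, 0; shift left. */
static uint32_t lfsr_step(uint32_t s)
{
    uint32_t fb = ((s >> 31) ^ (s >> 21) ^ (s >> 1) ^ s) & 1u;
    return (s << 1) | fb;
}

/* Software stand-in for look-ahead: four successive values per "clock".
 * Returns the last state so the caller can continue the sequence. */
static uint32_t lfsr_lookahead4(uint32_t s, uint32_t out[4])
{
    for (int i = 0; i < 4; i++) {
        s = lfsr_step(s);
        out[i] = s;
    }
    return s;
}
```

A look-ahead implementation in RTL would expand the four iterations into XOR equations of the current state, so that all four Dwords of a set are available in the same cycle.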

The 3-bit PattSel signal selects one of the five test patterns. The Header Inserter logic inserts the 64-bit header as the 1<sup>st</sup> and 2<sup>nd</sup> Dwords of each 512-byte block. After that, test data from the pattern counter is used as rWrFfWrData. For a Read command, rWrFfWrData is used as the expected value to compare with the read data from the FIFO (RdFfRdData). PattFail is asserted to '1' when data verification fails.
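To make the block layout concrete, the following C sketch fills one 512-byte block with the header and one of the four non-LFSR patterns. It is a software illustration under stated assumptions, not the hardware logic: `fill_block` and `patt_sel_t` are hypothetical names, the low half of the header is assumed to go in Dword#0, and the increment value is assumed to be the low address bits joined with the Dword index, which is only one possible reading of "lower bits of rTrnAddr and rDataCnt".

```c
#include <stdint.h>

#define DW_PER_BLOCK 128u  /* 512 bytes / 4 bytes per Dword */

typedef enum {             /* PattSel encoding from Table 2-1 */
    PATT_INC  = 0,
    PATT_DEC  = 1,
    PATT_ALL0 = 2,
    PATT_ALL1 = 3
} patt_sel_t;

/* Fill one 512-byte block: 64-bit header, then 126 Dwords of test data. */
static void fill_block(uint32_t dw[DW_PER_BLOCK], uint64_t addr512,
                       patt_sel_t sel)
{
    dw[0] = (uint32_t)addr512;          /* assumed: low half in Dword#0  */
    dw[1] = (uint32_t)(addr512 >> 32);  /* assumed: high half in Dword#1 */

    for (uint32_t i = 2; i < DW_PER_BLOCK; i++) {
        /* Assumed increment value: low address bits joined with Dword index. */
        uint32_t inc = ((uint32_t)addr512 << 7) | i;
        switch (sel) {
        case PATT_INC:  dw[i] = inc;          break;
        case PATT_DEC:  dw[i] = ~inc;         break;  /* NOT of increment */
        case PATT_ALL0: dw[i] = 0x00000000u;  break;
        case PATT_ALL1: dw[i] = 0xFFFFFFFFu;  break;
        }
    }
}
```

For a Read command the same function can produce the expected block, which is then compared Dword by Dword with the data read back; any mismatch corresponds to PattFail being asserted.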



#### 2.2 NVMe

The user interface of NVMe-IP uses the dgIF typeS format. The CMD interface is connected to LAxi2Reg to receive parameters from the user through the Serial console. The 256-bit data bus is connected to U2IPFIFO and IP2UFIFO. NVMe-IP connects to XpressRICH3, which creates PCIe packets and converts them to PCIe signals. The SSD is connected directly to XpressRICH3.



To support the Identify command, one additional RAM is connected in the system. IdenRAM is a simple dual-port RAM that stores the 8 Kbyte data output from the Identify command. IdenRAM is read by the CPU through LAxi2Reg.

#### 2.2.1 NVMe-IP for PLDA PCIe

NVMe-IP for PLDA PCIe implements the host side of the NVMe protocol to access an NVMe SSD. The user interface is kept simple by using the dgIF typeS format. NVMe-IP is designed to connect with XpressRICH3. More details of NVMe-IP are described in the datasheet. https://dgway.com/products/IP/NVMe-IP/dg\_nvmeip\_pldapcie\_data\_sheet\_en.pdf

#### 2.2.2 XpressRICH3 for Xilinx

This block is a PCIe soft IP core from PLDA. More details are described on the following website. https://www.plda.com/node/403



#### 2.3 CPU and Peripherals

The hardware is connected to the CPU through the AXI4-Lite bus, like other CPU peripherals. The hardware registers are mapped to CPU memory addresses, as shown in Table 2-1. LAxi2Reg is the module that interfaces with the CPU according to this memory map.

LAxi2Reg connects to several hardware modules in the system, such as TestGen, NVMe-IP, and IdenRAM, to interface with the control and status signals of each module. As shown in Figure 2-7, two clock domains are used in this block: CpuClk (CPU clock and AXI4-Lite bus) and UserClk (user clock domain for TestGen and the NVMe block).

AsyncAxiReg includes the asynchronous circuit between CpuClk and UserClk. More details of each module are described as follows.





#### 2.3.1 AsyncAxiReg

This module converts the AXI4-Lite signal interface to a register interface. It also transfers signals from the CpuClk domain to the UserClk domain. The timing diagram of the register interface is shown in Figure 2-8.

To write a register, the timing is the same as a RAM interface. RegWrEn is asserted to '1' together with valid values of RegAddr (register address in 32-bit units), RegWrData (write data of the register), and RegWrByteEn (byte enables of this access: bit[0] is the write enable for RegWrData[7:0], bit[1] for RegWrData[15:8], bit[2] for RegWrData[23:16], and bit[3] for RegWrData[31:24]).

To read a register, AsyncAxiReg asserts RegRdReq to '1' with a valid value of RegAddr (the register address, in 32-bit units). After that, the read data is returned on the RegRdData bus with RegRdValid asserted to '1'.
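The meaning of RegWrByteEn can be illustrated with a short C helper that merges write data into a 32-bit register one byte lane at a time. The function name is hypothetical, and the sketch only shows what a register that honours the byte enables would do; it does not describe any particular register in this design.

```c
#include <stdint.h>

/* Merge wr_data into old_val, updating only the byte lanes enabled in byte_en.
 * byte_en bit[0] covers bits [7:0], bit[1] covers [15:8], and so on. */
static uint32_t reg_apply_byte_en(uint32_t old_val, uint32_t wr_data,
                                  uint8_t byte_en)
{
    uint32_t mask = 0;
    for (int b = 0; b < 4; b++) {
        if (byte_en & (1u << b)) {
            mask |= 0xFFu << (8 * b);
        }
    }
    return (old_val & ~mask) | (wr_data & mask);
}
```

A full 32-bit write corresponds to `byte_en = 0xF`, in which case the result is simply `wr_data`.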





#### 2.3.2 UserReg

As shown in Figure 2-7, after RegWrEn or RegRdReq is asserted to '1' to request a register write or read, RegAddr is loaded to the address decoder to select the active register. For a register write, RegWrData is loaded as the new value of the active register. RegWrByteEn is not used in this module, so the CPU firmware must access the hardware registers through 32-bit pointers only.

For a read request, the CPU monitors status signals from several modules such as TestGen, NVMe-IP, and IdenRAM. To avoid timing constraint problems, the status signals are selected by a multiplexer with two-stage pipeline registers. RegRdValid is therefore asserted to '1' two clock cycles after RegRdReq is asserted. This two-cycle latency is created by two D flip-flops that generate RegRdValid from RegRdReq.

The memory map of the control and status signals inside the UserReg module is shown in Table 2-1.
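The two-cycle read latency can be modelled in C as a two-bit shift register clocked once per UserClk edge. This is a minimal behavioural sketch with hypothetical names, intended only to show how two flip-flops delay RegRdReq into RegRdValid.

```c
#include <stdbool.h>

/* Two D flip-flops in series between RegRdReq and RegRdValid. */
typedef struct {
    bool stage1;
    bool stage2;
} rd_valid_pipe_t;

/* One UserClk edge: shift RegRdReq through the pipeline.
 * The return value models RegRdValid after this edge. */
static bool rd_valid_clock(rd_valid_pipe_t *p, bool reg_rd_req)
{
    p->stage2 = p->stage1;
    p->stage1 = reg_rd_req;
    return p->stage2;
}
```

Driving the model with a single-cycle request returns false on the first edge and true on the second, matching the latency of the pipelined status multiplexer so that data and valid arrive together.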



Table 2-1 Register Map

| Address<br>Rd/Wr | Register Name<br>(Label in the "nvmexr1iptest.c") | Description |
|------------------|----------------------------------|------------------------------------------------------------------------------|
| BA+0x0000<br>Wr | User Address (Low) Reg<br>(USRADRL_REG) | [31:0]: Start address in 512-byte units<br>(UserAddr[31:0] of dgIF typeS) |
| BA+0x0004<br>Wr | User Address (High) Reg<br>(USRADRH_REG) | [15:0]: Start address in 512-byte units<br>(UserAddr[47:32] of dgIF typeS) |
| BA+0x0008<br>Wr | User Length (Low) Reg<br>(USRLENL_REG) | [31:0]: Transfer length in 512-byte units<br>(UserLen[31:0] of dgIF typeS) |
| BA+0x000C<br>Wr | User Length (High) Reg<br>(USRLENH_REG) | [15:0]: Transfer length in 512-byte units<br>(UserLen[47:32] of dgIF typeS) |
| BA+0x0010<br>Wr | User Command Reg<br>(USRCMD_REG) | [1:0]: User command (UserCmd of dgIF typeS for NVMe-IP)<br>"00": Identify, "10": Write SSD, "11": Read SSD<br>When this register is written, the design generates a command request to NVMe-IP to start a new command operation. |
| BA+0x0014<br>Wr | Test Pattern Reg<br>(PATTSEL_REG) | [2:0]: Test pattern select<br>"000": Increment, "001": Decrement, "010": All 0, "011": All 1, "100": LFSR |
| BA+0x0100<br>Rd | User Status Reg<br>(USRSTS_REG) | [0]: UserBusy of dgIF typeS ('0': Idle, '1': Busy)<br>[1]: UserError of dgIF typeS ('0': Normal, '1': Error)<br>[2]: Data verification fail ('0': Normal, '1': Error)<br>[4:3]: PCIe speed from IP ("00": No linkup, "01": PCIe Gen1, "10": PCIe Gen2, "11": PCIe Gen3) |
| BA+0x0104<br>Rd | Total disk size (Low) Reg<br>(LBASIZEL_REG) | [31:0]: Total capacity of SSD in 512-byte units<br>(LBASize[31:0] of dgIF typeS) |
| BA+0x0108<br>Rd | Total disk size (High) Reg<br>(LBASIZEH_REG) | [15:0]: Total capacity of SSD in 512-byte units<br>(LBASize[47:32] of dgIF typeS) |
| BA+0x010C<br>Rd | User Error Type Reg<br>(USRERRTYPE_REG) | [31:0]: User error status<br>(UserErrorType[31:0] of dgIF typeS) |
| BA+0x0114<br>Rd | Completion Status Reg<br>(COMPSTS_REG) | [15:0]: Status from Admin completion (AdmCompStatus[15:0] of NVMe-IP)<br>[31:16]: Status from I/O completion (IOCompStatus[15:0] of NVMe-IP) |
| BA+0x0118<br>Rd | NVMe CAP Reg<br>(NVMCAP_REG) | [31:0]: NVMeCAPReg[31:0] output from NVMe-IP |
| BA+0x011C<br>Rd | NVMe Test Pin Reg<br>(NVMTESTPIN_REG) | [31:0]: TestPin[31:0] output from NVMe-IP |
| BA+0x0120<br>Rd | Data Failure Address (Low) Reg<br>(RDFAILNOL_REG) | [31:0]: Latched value of failure address [31:0] in byte units from the Read command |
| BA+0x0124<br>Rd | Data Failure Address (High) Reg<br>(RDFAILNOH_REG) | [24:0]: Latched value of failure address [56:32] in byte units from the Read command |



| Address<br>Rd/Wr | Register Name<br>(Label in the "nvmeiptest.c") | Description |
|------------------|--------------------------------------------|----------------------------------------------------------------------------|
| BA+0x0140<br>Rd | Expected value Word0 Reg<br>(EXPPATW0_REG) | [31:0]: Latched value of expected data [31:0] from the Read command |
| BA+0x0144<br>Rd | Expected value Word1 Reg<br>(EXPPATW1_REG) | [31:0]: Latched value of expected data [63:32] from the Read command |
| BA+0x0148<br>Rd | Expected value Word2 Reg<br>(EXPPATW2_REG) | [31:0]: Latched value of expected data [95:64] from the Read command |
| BA+0x014C<br>Rd | Expected value Word3 Reg<br>(EXPPATW3_REG) | [31:0]: Latched value of expected data [127:96] from the Read command |
| BA+0x0150<br>Rd | Expected value Word4 Reg<br>(EXPPATW4_REG) | [31:0]: Latched value of expected data [159:128] from the Read command |
| BA+0x0154<br>Rd | Expected value Word5 Reg<br>(EXPPATW5_REG) | [31:0]: Latched value of expected data [191:160] from the Read command |
| BA+0x0158<br>Rd | Expected value Word6 Reg<br>(EXPPATW6_REG) | [31:0]: Latched value of expected data [223:192] from the Read command |
| BA+0x015C<br>Rd | Expected value Word7 Reg<br>(EXPPATW7_REG) | [31:0]: Latched value of expected data [255:224] from the Read command |
| BA+0x0180<br>Rd | Read value Word0 Reg<br>(RDPATW0_REG) | [31:0]: Latched value of read data [31:0] from the Read command |
| BA+0x0184<br>Rd | Read value Word1 Reg<br>(RDPATW1_REG) | [31:0]: Latched value of read data [63:32] from the Read command |
| BA+0x0188<br>Rd | Read value Word2 Reg<br>(RDPATW2_REG) | [31:0]: Latched value of read data [95:64] from the Read command |
| BA+0x018C<br>Rd | Read value Word3 Reg<br>(RDPATW3_REG) | [31:0]: Latched value of read data [127:96] from the Read command |
| BA+0x0190<br>Rd | Read value Word4 Reg<br>(RDPATW4_REG) | [31:0]: Latched value of read data [159:128] from the Read command |
| BA+0x0194<br>Rd | Read value Word5 Reg<br>(RDPATW5_REG) | [31:0]: Latched value of read data [191:160] from the Read command |
| BA+0x0198<br>Rd | Read value Word6 Reg<br>(RDPATW6_REG) | [31:0]: Latched value of read data [223:192] from the Read command |
| BA+0x019C<br>Rd | Read value Word7 Reg<br>(RDPATW7_REG) | [31:0]: Latched value of read data [255:224] from the Read command |
| BA+0x01C0<br>Rd | Current test byte (Low) Reg<br>(CURTESTSIZEL_REG) | [31:0]: Current test data size of TestGen module in byte units (bit[31:0]) |
| BA+0x01C4<br>Rd | Current test byte (High) Reg<br>(CURTESTSIZEH_REG) | [24:0]: Current test data size of TestGen module in byte units (bit[56:32]) |
| BA+0x2000<br>- 0x2FFF | Identify Controller Data<br>(IDENCTRL_REG) | 4 Kbyte Identify Controller Data Structure |
| BA+0x3000<br>- 0x3FFF | Identify Namespace Data<br>(IDENNAME_REG) | 4 Kbyte Identify Namespace Data Structure |
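
The register map above might be expressed in firmware roughly as follows. This is a sketch, not the contents of the actual firmware source: the offsets and labels are taken from Table 2-1, but the helper functions, the `usr_status_t` type, and the idea of passing the base address BA as a pointer are assumptions made for illustration. All accesses are 32-bit, as required by UserReg.

```c
#include <stdbool.h>
#include <stdint.h>

/* Register offsets from the base address BA (values from Table 2-1). */
#define USRADRL_REG   0x0000u
#define USRADRH_REG   0x0004u
#define USRLENL_REG   0x0008u
#define USRLENH_REG   0x000Cu
#define USRCMD_REG    0x0010u
#define PATTSEL_REG   0x0014u
#define USRSTS_REG    0x0100u
#define LBASIZEL_REG  0x0104u
#define LBASIZEH_REG  0x0108u

/* Hypothetical helper type: decoded fields of USRSTS_REG. */
typedef struct {
    bool    busy;         /* bit[0]: UserBusy                        */
    bool    error;        /* bit[1]: UserError                       */
    bool    verify_fail;  /* bit[2]: data verification fail          */
    uint8_t pcie_speed;   /* bit[4:3]: 0 = no linkup, 1..3 = Gen1..3 */
} usr_status_t;

/* 32-bit register access relative to the base address. */
static uint32_t reg_read(const volatile uint32_t *base, uint32_t offset)
{
    return base[offset / 4u];
}

static void reg_write(volatile uint32_t *base, uint32_t offset, uint32_t value)
{
    base[offset / 4u] = value;
}

static usr_status_t read_status(const volatile uint32_t *base)
{
    uint32_t v = reg_read(base, USRSTS_REG);
    usr_status_t st;
    st.busy        = (v & 0x1u) != 0;
    st.error       = (v & 0x2u) != 0;
    st.verify_fail = (v & 0x4u) != 0;
    st.pcie_speed  = (uint8_t)((v >> 3) & 0x3u);
    return st;
}

/* Combine the Low/High pair into the 48-bit SSD capacity (512-byte units). */
static uint64_t read_lba_size(const volatile uint32_t *base)
{
    uint64_t lo = reg_read(base, LBASIZEL_REG);
    uint64_t hi = reg_read(base, LBASIZEH_REG) & 0xFFFFu;
    return (hi << 32) | lo;
}
```

On real hardware `base` would point at the AXI4-Lite address assigned to LAxi2Reg; for host-side testing it can point at an ordinary array.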



### 3 CPU Firmware

After system boot-up, the CPU initializes its peripherals such as UART and Timer. Next, the CPU waits until the PCIe connection links up (PCISTS\_REG[0]='1'). Finally, the CPU waits until NVMe-IP completes its initialization process (USRSTS\_REG[0]='0').

To receive a command from the user, the Main menu is displayed on the console so that the user can select one of three commands (Identify, Write, or Read). The sequence of each command is described as follows.

#### 3.1 Identify Command

The sequence of the firmware when the user selects the Identify command is as follows.

- 1) Set USRCMD\_REG="00". The test logic then generates a command request to NVMe-IP, and the Busy flag (USRSTS\_REG[0]) changes from '0' to '1'.
- 2) The CPU waits until the operation is completed or an error is found by monitoring USRSTS\_REG. Bit[0] is de-asserted to '0' when the command is completed. Bit[1] is asserted to '1' when an error is detected, in which case an error message is displayed on the console. If the command completes, the data returned by the Identify command of NVMe-IP is stored in IdenRAM.
- 3) The CPU reads the Identify data from IdenRAM (IDENCTRL\_REG) and displays the SSD model name. In addition, the SSD capacity and LBA unit size are displayed by reading the NVMe-IP output (LBASIZEL\_REG and LBASIZEH\_REG).
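
The Identify sequence above can be sketched in C as follows. This is an illustrative outline rather than the actual firmware: the function name and constants other than the register labels are hypothetical, byte 0 of the Identify data is assumed to sit in bits [7:0] of the first 32-bit word, and the Busy flag is assumed to be visible by the first status read. The Model Number field position (bytes 24 to 63 of the Identify Controller data) follows the NVMe specification.

```c
#include <stdint.h>

#define USRCMD_REG    0x0010u
#define USRSTS_REG    0x0100u
#define IDENCTRL_REG  0x2000u
#define USRSTS_BUSY   0x1u   /* USRSTS_REG[0] */
#define USRSTS_ERROR  0x2u   /* USRSTS_REG[1] */
#define MODEL_OFFSET  24u    /* Model Number field starts at byte 24 */
#define MODEL_LEN     40u    /* Model Number field is 40 ASCII bytes */

/* Run Identify and copy the model name into model[MODEL_LEN + 1].
 * Returns 0 on success, -1 if the error flag is raised. */
static int identify_model(volatile uint32_t *base, char *model)
{
    uint32_t sts;

    base[USRCMD_REG / 4u] = 0x0u;          /* "00": Identify */

    do {                                   /* poll until idle or error */
        sts = base[USRSTS_REG / 4u];
        if (sts & USRSTS_ERROR) {
            return -1;
        }
    } while (sts & USRSTS_BUSY);

    for (uint32_t i = 0; i < MODEL_LEN; i++) {
        uint32_t byte_addr = IDENCTRL_REG + MODEL_OFFSET + i;
        uint32_t word = base[byte_addr / 4u];   /* 32-bit access only */
        model[i] = (char)((word >> (8u * (byte_addr % 4u))) & 0xFFu);
    }
    model[MODEL_LEN] = '\0';
    return 0;
}
```

The model name is extracted with shifts from 32-bit reads because UserReg does not support byte-wide access.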

#### 3.2 Write/Read Command

The sequence of the firmware when the user selects the Write/Read command is as follows.

- 1) Receive the start address, transfer length, and test pattern through the Serial console. If any input is invalid, the operation is cancelled.
- 2) Get all inputs and set their values to USRADRL/H\_REG, USRLENL/H\_REG, PATTSEL\_REG, and USRCMD\_REG (USRCMD\_REG="10" for the Write command and "11" for the Read command).
- 3) The CPU waits until the operation is completed or an error (other than a verification error) is found by monitoring USRSTS\_REG[2:0]. If USRSTS\_REG[2] (verification error) is asserted to '1', a verification error message is displayed, but the CPU keeps running until the end of the operation or until the user presses any key to cancel it.
- 4) While the command is running, the current transfer size read from CURTESTSIZEL/H\_REG is displayed every second. Finally, the test performance is displayed on the Serial console when the command is completed.
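
A possible shape for steps 1) and 2), plus the final performance calculation, is sketched below in C. The function names are hypothetical, the validity checks are assumptions (the manual does not list the exact rules), and "Mbyte" is assumed to mean 10^6 bytes. USRCMD\_REG is written last because writing it starts the command.

```c
#include <stdint.h>

#define USRADRL_REG  0x0000u
#define USRADRH_REG  0x0004u
#define USRLENL_REG  0x0008u
#define USRLENH_REG  0x000Cu
#define USRCMD_REG   0x0010u
#define PATTSEL_REG  0x0014u
#define CMD_WRITE    0x2u    /* "10": Write SSD */
#define CMD_READ     0x3u    /* "11": Read SSD  */

/* Program the transfer parameters and start the command.
 * addr512, len512 and lba_size are all in 512-byte units.
 * Returns 0 if the command was started, -1 if the inputs were rejected. */
static int start_transfer(volatile uint32_t *base, uint64_t addr512,
                          uint64_t len512, uint32_t patt_sel, uint32_t cmd,
                          uint64_t lba_size)
{
    /* Assumed validity rules: non-empty, inside the disk, known pattern. */
    if (len512 == 0 || addr512 >= lba_size ||
        len512 > lba_size - addr512 || patt_sel > 4u) {
        return -1;
    }
    if (cmd != CMD_WRITE && cmd != CMD_READ) {
        return -1;
    }
    base[USRADRL_REG / 4u] = (uint32_t)addr512;
    base[USRADRH_REG / 4u] = (uint32_t)(addr512 >> 32) & 0xFFFFu;
    base[USRLENL_REG / 4u] = (uint32_t)len512;
    base[USRLENH_REG / 4u] = (uint32_t)(len512 >> 32) & 0xFFFFu;
    base[PATTSEL_REG / 4u] = patt_sel;
    base[USRCMD_REG / 4u]  = cmd;   /* written last: starts the command */
    return 0;
}

/* Throughput in Mbyte/sec (assumed 10^6 bytes) from bytes and elapsed ms. */
static uint64_t mbyte_per_sec(uint64_t bytes, uint64_t elapsed_ms)
{
    if (elapsed_ms == 0) {
        return 0;
    }
    return (bytes / 1000u) / elapsed_ms;
}
```

After `start_transfer()` returns 0, the firmware would poll the status register and periodically read the current test size, as described in steps 3) and 4).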



### 4 Example Test Result

An example test result from running the demo system with a 512 GB Samsung 960 Pro is shown in Figure 4-1.



Using PCIe Gen3 on the ZCU102 board, write performance is about 2100 Mbyte/sec and read performance is about 3200 Mbyte/sec.



### 5 Revision History

| Revision | Date     | Description          |
|----------|----------|----------------------|
| 1.0      | 5-Feb-18 | Initial Release      |
| 1.1      | 9-Oct-18 | Add more information |

Copyright: 2018 Design Gateway Co,Ltd.