NVMe IP Core for AG5 Datasheet

Features

Applications

General Description

Functional Description

NVMe

·      NVMe Host Controller

·      Command Parameter

·      Data Buffer (256KB RAM)

·      NVMe Data Controller

PCIe

·      PCIe Config

·      PCIe Adapter

·      Async Control

User Logic

PCIe Hard IP from Altera (GTS AXI Streaming IP for PCIe)

Core I/O Signals

Timing Diagram

Initialization

Control Interface of dgIF typeS

Data Interface of dgIF typeS

IdenCtrl/IdenName

Shutdown

SMART

Secure Erase

Flush

Error

Verification Methods

Recommended Design Experience

Ordering Information

Revision History

 

 

  Core Facts

Provided with Core

Documentation: Reference design manual, Demo instruction manual

Design File Formats: Encrypted File

Instantiation Templates: VHDL

Reference Designs & Application Notes: Quartus Project, see Reference design manual

Additional Items: Demo on Sulfur Agilex5 E-Series board

Support

Support Provided by Design Gateway Co., Ltd.

 

Design Gateway Co., Ltd

E-mail: ip-sales@design-gateway.com

URL:    design-gateway.com

Features

·    Direct NVMe SSD access without the need for CPU or external memory

·    Include 256KB RAM for data buffering

·    Simple user interface through dgIF typeS

·    Support seven commands: Identify, Shutdown, Write, Read, SMART, Secure Erase, and Flush

·    Supported NVMe device specifications:

·    Base Class Code: 01h (mass storage)

·    Sub Class code: 08h (Non-volatile)

·    Programming Interface: 02h (NVMHCI)

·    MPSMIN (Memory Page Size Minimum): 0 (4KB)

·    MDTS (Maximum Data Transfer Size): At least 5 (128 KB) or 0 (no limitation)

·    MQES (Maximum Queue Entries Supported): At least 8

·    LBA unit: 512 bytes or 4 KB

·    User clock frequency: At least the PCIe clock frequency (300 MHz for Gen3)

·    PCIe Gen3 Hard IP interface: 128-bit AXI4-Stream bus

·    Available reference design

·    Sulfur Agilex5 E-Series development board with AB18-PCIeX16, AB19-M2PCI, or AB20-U2PCI adapter boards

·    Customized services available for:

·    Additional NVMe commands

·    RAM size modification

 

Table 1 Example Implementation Statistics

Family: Agilex 5

Example Device: A5ED065BB32AE5SR0

Fmax (MHz): 350

ALMs1: 2,846

Registers1: 6,488

Pin: -

Block Memory bit: 2,162,688

Design Tools: Quartus 24.3.1

Notes:

1)      The actual logic resource depends on the percentage of unrelated logic.

 

Applications

 

Figure 1 NVMe IP Application

 

The NVMe IP Core, integrated with the GTS AXI Streaming PCIe IP (PCIe hard IP) from Altera, provides an ideal solution for accessing an NVMe Gen3 SSD without the need for a CPU or external memory, such as DDR. The NVMe IP core includes a data buffer implemented using embedded memory blocks to store data transferred between the user logic and the NVMe SSD. The PCIe hard IP is configured as a 4-lane PCIe Gen3 interface, allowing for direct connection to NVMe SSDs.

To enhance transfer performance, a RAID0 architecture can be implemented using multiple NVMe IPs and PCIe hard IPs, as shown in Figure 1. A two-channel RAID0 system, comprising two NVMe IPs and two NVMe SSDs, can double the throughput (e.g., achieving 6 GB/s if each SSD supports 3 GB/s). This design is ideal for high-bandwidth applications, such as radar data acquisition or real-time video streaming.

Design Gateway also offers alternative IP cores for specific applications:

Multiple User NVMe IP Core – Enables multiple users to access an NVMe SSD simultaneously, supporting high-performance concurrent write/read operations.

https://dgway.com/muNVMe-IP_A_E.html

Random Access by Multiple User NVMe IP Core – Enables simultaneous write/read access by two users to the same SSD. Optimized for high random-access performance in systems with non-contiguous data access patterns.

https://dgway.com/rmNVMe-IP_A_E.html

NVMe IP Core for PCIe Switch – Enables access to multiple NVMe SSDs through a PCIe switch, effectively scaling storage capacity and supporting shared high-speed access.

https://dgway.com/NVMe-IP_A_E.html

NVMe IP Core with PCIe Soft IP – Provides NVMe SSD access without relying on PCIe hard IP, making it suitable for FPGA platforms that lack PCIe hard IP.

https://dgway.com/NVMeG4-IP_A_E.html

 

General Description

 

Figure 2 NVMe IP Block Diagram

 

The NVMe IP is a complete host controller solution that enables access to an NVMe Gen3 SSD using the NVM Express standard. The physical interface for the NVMe SSD is PCIe, and the lower-layer hardware is implemented using the GTS AXI Streaming PCIe IP from Altera.

The NVMe IP core implements seven NVMe commands: Identify, Shutdown, Write, Read, SMART, Secure Erase, and Flush. It utilizes two user interface groups to transfer commands and data. The Control interface transfers commands and their parameters, while the Data interface transfers data when required by the command. For Write/Read commands, the Control interface and Data interface use dgIF typeS, our standard interface for storage access. The Control interface of dgIF typeS includes start address, transfer length, and request signals, and the Data interface uses a standard FIFO interface.

SMART, Secure Erase, and Flush commands are Custom commands that use the Custom Cmd I/F for the control path and the Custom RAM I/F for the data path. Meanwhile, the Identify command uses its own data interface (Iden RAM I/F) and the same Control interface as the Write or Read command, as shown in Figure 2.

If abnormal conditions are detected during initialization or command operation, the NVMe IP may assert an error signal. The error status can be read from the IP for more details. Once the error cause is resolved, both the NVMe IP and the SSD must be reset.

To ensure continuous packet transmission until the end of the packet on the user interface of the PCIe hard IP, the user logic clock frequency must be equal to or greater than the PCIe clock frequency (300 MHz is recommended to maximize performance). During a transaction, valid data must be presented on every clock cycle between the start and end of the frame. This condition guarantees that the user interface bandwidth matches or exceeds the PCIe hard IP bandwidth, ensuring efficient data transfer without bottlenecks.

Overall, the NVMe IP provides a comprehensive solution for accessing NVMe SSDs. The IP core comes with reference designs on FPGA evaluation boards, allowing users to assess functionality and performance prior to purchase.

 

Functional Description

The NVMe IP operation is divided into three phases: IP initialization, Operating command, and Inactive status, as shown in Figure 3. Upon de-asserting the IP reset, the initialization phase begins, and the user should execute the Identify command to check the device status and capacity. During the Operating command phase, the user can perform write and read operations and execute Custom commands such as SMART, Secure Erase, and Flush. Finally, before powering down the system, it is recommended to execute the Shutdown command to ensure safe operation.

 

 

Figure 3 NVMe IP Operation Flow

 

The operational sequence of the NVMe IP can be outlined in the following steps.

1)     The IP waits for PCIe to be ready by monitoring the Linkup status from the PCIe IP core.

2)     The IP begins the initialization process by setting up flow control and configuring PCIe and NVMe registers. Upon successful completion of the initialization, the IP transitions to the Idle state, where it awaits a new command request from the user. If any errors are detected during the initialization process, the IP switches to the Inactive state, with UserError set to 1b.

3)     The first command from the user must be the Identify command (UserCmd=000b), which updates the LBASize (disk capacity) and LBAMode (LBA unit=512 bytes or 4 KB).

4)     The last command before powering down the system must be the Shutdown command (UserCmd=001b). This command is recommended to guarantee that the SSD is powered down in a proper sequence; without it, the integrity of previously written data cannot be guaranteed. After the Shutdown command completes, both the NVMe IP and the SSD enter the Inactive state, and no new command can be executed until the IP is reset.

5)     When executing a Write command (UserCmd=010b), the maximum data size for each command is 128 KB. If the total data length from the user exceeds 128 KB, the IP automatically repeats the following steps, 5i) – 5ii), until all data has been fully transferred.

i)       The IP waits until the write data, sent by the user, is sufficient for one command. The transfer size of each command in the NVMe IP is 128 KB, except for the last loop, which may be less than 128 KB.

ii)      The IP sends the Write command to the SSD and then waits for the status response from the SSD. The IP returns to the Idle state only when all the data has been completely transferred; otherwise, it goes back to step 5i) to send the next Write command.

6)     Similar to the Write command, when executing a Read command (UserCmd=011b) with a transfer size exceeding 128 KB, the IP must iterate through the following steps, 6i) – 6iii).

i)       If the remaining transfer size is zero, the IP proceeds to step 6iii). Otherwise, it waits until there is sufficient free space in the Data buffer of the NVMe IP for one command (either 128 KB or the remaining transfer size for the last loop).

ii)      The IP sends the Read command to the SSD and then returns to step 6i).

iii)     The IP waits until all the data has been completely transferred from the Data buffer to the user logic and then returns to the Idle state. Therefore, the Data buffer becomes empty after the Read command is completed.

7)     When executing a SMART command (UserCmd=100b and CtmSubmDW0-15=SMART), 512-byte data is returned upon operation completion.

i)       The IP sends the Get Log Page command to retrieve SMART/Health information from the SSD.

ii)      The 512-byte data response is received from the SSD, and the IP forwards this data through the Custom command RAM interface (CtmRamAddr=00h – 1Fh).

8)     When executing a Secure Erase command (UserCmd=100b and CtmSubmDW0-15=Secure Erase), no data transfer occurs during the operation.

i)       The IP sends the Secure Erase command to the SSD.

ii)      The IP waits until the SSD returns a status response to confirm the completion of the operation.

9)     When executing a Flush command (UserCmd=110b), no data transfer occurs during the operation.

i)       The IP sends the Flush command to the SSD.

ii)      The IP waits until the SSD returns a status response to confirm the completion of the operation.
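The 128 KB chunking rule in steps 5 and 6 can be sketched in software. The helper below is illustrative only (the IP performs this in hardware), and `split_transfer` is a hypothetical name; UserLen is assumed to be in 512-byte units, so one 128 KB command carries at most 256 blocks.

```python
def split_transfer(user_len_blocks, max_blocks=256):
    """Split a transfer given in 512-byte blocks into commands of at
    most 128 KB (256 blocks). The last command may carry fewer blocks,
    mirroring the repeated steps 5i)-5ii) and 6i)-6ii) above."""
    commands = []
    remaining = user_len_blocks
    while remaining > 0:
        chunk = min(remaining, max_blocks)
        commands.append(chunk)
        remaining -= chunk
    return commands
```

For example, a 600-block transfer (300 KB) is issued as three commands of 256, 256, and 88 blocks.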

 

To design the host controller for an NVMe SSD, the NVMe IP implements two protocols: NVMe and PCIe. The NVMe protocol is used to interface with the user, while the PCIe protocol is used to interface with the PCIe hard IP. Figure 2 shows the hardware inside the NVMe IP, which is split into two groups: NVMe and PCIe.

NVMe

The NVMe group supports seven commands, which are split into two categories: Admin commands and NVM commands. Admin commands include Identify, Shutdown, SMART, and Secure Erase, while NVM commands include Write, Read, and Flush. After executing a command, the status returned from the SSD is latched either to AdmCompStatus (for Admin commands) or IOCompStatus (for NVM commands), depending on the command type.

The parameters of a Write or Read command are configured through the Control interface of dgIF typeS, while the parameters of a SMART, Secure Erase, or Flush command are set via CtmSubmDW0-15 of the Custom Cmd I/F. The data for a Write or Read command is transferred through the FIFO interface, a part of dgIF typeS, and stored in the IP's Data buffer. Other command types use distinct data interfaces: Iden RAM I/F for the Identify command and Custom RAM I/F for the SMART command.

Further details of each submodule are described as follows.

·        NVMe Host Controller

The NVMe Host Controller serves as the core controller within the NVMe IP. It operates in two phases: the initialization phase and the command operation phase. The initialization phase runs once when the system is booted up, configuring the NVMe registers within the SSD. Once the initialization phase is completed, the controller enters the command operation phase, during which it controls the sequence of transmitted and received packets for each command.

To initiate the execution of each command, the command parameters are stored in the Command Parameter, facilitating packet creation. Subsequently, the packet is forwarded to the PCIe Adapter for converting NVMe packets into PCIe packets. After each command operation is executed, a status packet is received from the SSD. The controller decodes the status value, verifying whether the operation was completed successfully or an error occurred. In cases where the command involves data transfer, such as Write or Read command, the controller must handle the order of data packets, which are created and decoded by the NVMe Data controller.

·        Command Parameter

The Command Parameter module creates the command packet sent to the SSD and decodes the status packet returned from the SSD. The inputs and outputs of this module are controlled by the NVMe host controller. Typically, a command consists of 16 Dwords (1 Dword = 32 bits). When executing Identify, Shutdown, Write, and Read commands, all 16 Dwords are created by the Command parameter module, which are initialized by the user inputs on dgIF typeS. When executing SMART, Secure Erase, and Flush commands, all 16 Dwords are directly loaded via CtmSubmDW0-CtmSubmDW15 of Custom Cmd I/F.
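As an illustration of the 16-Dword command structure, the sketch below assembles an NVMe I/O submission entry in Python, following the field placement defined by the NVM Express base specification (DW0: opcode and command ID, DW1: namespace ID, DW10-11: starting LBA, DW12[15:0]: 0-based block count). The helper name is hypothetical and this is not the IP's internal implementation.

```python
def build_io_command(opcode, cid, nsid, slba, nlb_blocks):
    """Assemble a 16-Dword NVMe I/O submission entry (1 Dword = 32 bits).

    Per the NVM Express base specification: Write = opcode 01h,
    Read = opcode 02h. Unused Dwords remain zero.
    """
    dw = [0] * 16
    dw[0] = (cid << 16) | opcode        # CDW0: CID[31:16], OPC[7:0]
    dw[1] = nsid                        # NSID
    dw[10] = slba & 0xFFFFFFFF          # SLBA, lower Dword
    dw[11] = (slba >> 32) & 0xFFFFFFFF  # SLBA, upper Dword
    dw[12] = (nlb_blocks - 1) & 0xFFFF  # NLB is 0-based
    return dw
```

For example, a 128 KB Read (256 blocks of 512 bytes) starting at LBA 0x1_0000_0000 places 0 in DW10, 1 in DW11, and 255 in DW12.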

·        Data Buffer (256KB RAM)

The RAM is implemented using embedded memory blocks. The buffer stores data transferred between the user logic and the SSD during Write and Read command operations.

·        NVMe Data Controller

The NVMe Data Controller module is used when a command must transfer data, such as Identify, SMART, Write, and Read. This module manages three data interfaces for transferring data with the SSD.

1)     The FIFO interface is used with the Data buffer during the execution of Write or Read commands.

2)     The Custom RAM interface is used when executing SMART command.

3)     The Identify interface is used when executing Identify command.

The NVMe Data Controller is responsible for creating and decoding data packets. Similar to the Command Parameter module, the input and output signals of the NVMe Data Controller module are controlled by the NVMe Host Controller.

 

PCIe

The PCIe protocol is a widely adopted low-layer protocol for high-speed applications, and the NVMe protocol runs over it. Therefore, the NVMe layer can operate only after the PCIe layer completes its initialization. Three modules support the PCIe protocol: PCIe Config, PCIe Adapter, and Async Control. Additional details of these modules are provided below.

·        PCIe Config

During the initialization process, the PCIe Config module sets up the PCIe environment of the SSD via the AXI4-Lite interface.

·        PCIe Adapter

After the PCIe Config completes the PCIe environment setup, the PCIe Adapter converts command and data packets from the NVMe module into PCIe packets using a 128-bit Tx AXI4-Stream interface. It also performs the reverse conversion for incoming PCIe packets using a 128-bit Rx AXI4-Stream interface. The data flow on the Tx AXI4-Stream interface is managed using the Tx Credit port.

·        Async Control

Async Control incorporates asynchronous registers and buffers designed to facilitate clock domain crossing. The user clock frequency must match or exceed the PCIe clock frequency to ensure sufficient bandwidth for continuous packet data transmission. The majority of the logic within the NVMe IP operates in the user clock domain, while the PCIe hard IP operates in the PCIe clock domain.

User Logic

The user logic can be implemented using a small state machine responsible for sending commands along with their corresponding parameters. For instance, simple registers are used to specify parameters for Write or Read command, such as address and transfer size. Two separate FIFOs are connected to manage data transfer for Write and Read commands independently.

When executing the SMART and Identify commands, each data output interface connects to 2-Port RAM (one read port and one write port) with byte enable capability. Both the FIFO and RAM have a data width of 128 bits, while their memory depth can be configured to different values. Specifically, the data size for the Identify command is 8 KB, while the SMART command has a data size of 512 bytes.
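The memory depths implied by these data sizes follow from simple arithmetic, sketched below as a sanity check (the constant names are illustrative, not IP parameters): with a 128-bit (16-byte) bus, the 8 KB Identify data needs 512 entries, matching the IdenWrAddr range 0x000-0x1FF, and the 512-byte SMART data needs 32 entries, matching CtmRamAddr 00h-1Fh.

```python
DATA_WIDTH_BYTES = 128 // 8          # both FIFO and RAM use a 128-bit bus

IDEN_DATA_BYTES = 8 * 1024           # Identify command data size (8 KB)
SMART_DATA_BYTES = 512               # SMART command data size

# Minimum memory depth in 128-bit entries for each interface
iden_depth = IDEN_DATA_BYTES // DATA_WIDTH_BYTES
smart_depth = SMART_DATA_BYTES // DATA_WIDTH_BYTES
```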

PCIe Hard IP from Altera (GTS AXI Streaming IP for PCIe)

The PCIe Hard IP on Agilex5, also known as GTS AXI Streaming IP for PCIe, is used to interface with the NVMe IP through four key connections. These include the AXI4-ST Tx Port for transmitting packets, the AXI4-ST Rx Port for receiving PCIe packets, the Tx Credit Port for managing transmit flow control, and the AXI4-Lite interface for configuring PCIe settings.

This Hard IP implements the Transaction layer, Data Link layer, and Physical layer of the PCIe protocol. The number of NVMe SSDs that can be connected to a single FPGA device is limited by the number of available PCIe Hard IP blocks on that FPGA. More comprehensive details about GTS AXI Streaming IP for PCIe can be found in the documentation at the following link:

https://www.intel.com/content/www/us/en/docs/programmable/813754/24-3-1/introduction.html

Additionally, the IP requires a specific reset sequence, which must be implemented as described in the following Altera document:

https://www.intel.com/content/www/us/en/docs/programmable/813754/24-3-1/interface-reset-signals.html

This reset sequencer is provided as HDL code as part of the NVMe IP reference design included in the IP deliverables.

 

Core I/O Signals

Table 2 - Table 4 outline the I/O signals for NVMe IP.

Table 2 User I/O Signals (Synchronous with Clk)

Signal name

Dir

Description

Control I/F of dgIF typeS

RstB

In

Synchronous reset. Active low. It should be de-asserted to 1b when the Clk signal is stable.

Clk

In

User clock to run the NVMe IP. The frequency of this clock must be equal to or greater than the PCIeClk, which is the clock output from the PCIe hard IP. To achieve maximum performance, it is recommended that PCIeClk operates at 300 MHz.

UserCmd[2:0]

In

User Command. Valid when UserReq=1b. The possible values are

000b: Identify, 001b: Shutdown, 010b: Write SSD, 011b: Read SSD,

100b: SMART/Secure Erase, 110b: Flush, 101b/111b: Reserved.

UserAddr[47:0]

In

The start address to write/read from the SSD in 512-byte units. Valid when UserReq=1b.

If the LBA unit = 4 KB, UserAddr[2:0] must always be set to 000b to align with 4 KB unit.

If the LBA unit = 512 bytes, it is recommended to set UserAddr[2:0]=000b to align with the 4 KB size (SSD page size). A 4 KB-unaligned address reduces write/read performance on most SSDs.

UserLen[47:0]

In

The total transfer size to write/read from the SSD in 512-byte units. Valid from 1 to (LBASize-UserAddr). If the LBA unit = 4 KB, UserLen[2:0] must always be set to 000b to align with the 4 KB unit. This parameter is applicable when UserReq=1b.

UserReq

In

Set to 1b to initiate a new command request and reset to 0b after the IP starts the operation, signaled by UserBusy being set to 1b. This signal can only be asserted when the IP is in the Idle state (UserBusy=0b). Command parameters, including UserCmd, UserAddr, UserLen, and CtmSubmDW0-DW15, must maintain their values while UserReq is set to 1b. UserAddr and UserLen are inputs for Write/Read commands, while CtmSubmDW0-DW15 are inputs for the SMART, Secure Erase, or Flush command.

UserBusy

Out

Set to 1b when the IP is busy. A new request (UserReq=1b) must not be sent while the IP is busy.

LBASize[47:0]

Out

The total capacity of the SSD in 512-byte units. Default value is 0.

This value is valid after the Identify command is completed.

LBAMode

Out

The LBA unit size of the SSD (0b: 512 bytes, 1b: 4 KB). Default value is 0b.

This value is valid after the Identify command is completed.

UserError

Out

Error flag. Asserted to 1b when the UserErrorType is not equal to 0.

The flag is cleared to 0b by asserting RstB to 0b.

UserErrorType[31:0]

Out

Error status.

[0] – An error when PCIe class code is incorrect.

[1] – An error from Controller capabilities (CAP) register, which can occur due to various reasons.

- Memory Page Size Minimum (MPSMIN) is not equal to 0.

- NVM command set flag (bit 37 of CAP register) is not set to 1.

- Doorbell Stride (DSTRD) is not equal to 0.

- Maximum Queue Entries Supported (MQES) is less than 7.

More details of each register can be found in the NVMeCAPReg signal.

[2] – An error when the Admin completion entry is not received within the specified timeout.

[3] – An error when the status register in the Admin completion entry is not 0 or when the phase tag/command ID is invalid. Further information can be found in the AdmCompStatus signal.

[4] – An error when the IO completion entry is not received within the specified timeout.

[5] – An Error when the status register in the IO completion entry is not 0 or when the phase tag is invalid. More details are available in the IOCompStatus signal.

[6] – An error from unsupported LBA unit (not equal to 512 bytes or 4KB).

[7] – An error when configuration interface response is not successful.

[8] – An error when the received size of Transaction Layer Packet (TLP) is incorrect.

[9] – Reserved.

Bit[15:10] are mapped to Uncorrectable Error Status Register.

[10] – Mapped to Unsupported Request (UR) Status (bit[20]).

[11] – Mapped to Completer Abort (CA) Status (bit[15]).

[12] – Mapped to Unexpected Completion Status (bit[16]).

[13] – Mapped to Completion Timeout Status (bit[14]).

[14] – Mapped to Poisoned TLP Received Status (bit[12]).

[15] – Mapped to ECRC Error Status (bit[19]).

 


[23:16] – Reserved.

Bit[30:24] are also mapped to Uncorrectable Error Status Register.

[24] – Mapped to Data Link Protocol Error Status (bit[4]).

[25] – Mapped to Surprise Down Error Status (bit[5]).

[26] – Mapped to Receiver Overflow Status (bit[17]).

[27] – Mapped to Flow Control Protocol Error Status (bit[13]).

[28] – Mapped to Uncorrectable Internal Error Status (bit[22]).

[29] – Mapped to Malformed TLP Status (bit[18]).

[30] – Mapped to ACS Violation Status (bit[21]).

[31] – Reserved.

Note: Timeout period of bit[2]/[4] is determined by the TimeOutSet input.

Data I/F of dgIF typeS

UserFifoWrCnt[15:0]

In

Write data counter of the Receive FIFO, used to monitor the FIFO full status. When the FIFO becomes full, data transmission for the Read command temporarily halts. If the FIFO counter width is less than 16 bits, the upper bits must be padded with 1b to complete the 16-bit count.

UserFifoWrEn

Out

Asserted to 1b to write data to the Receive FIFO when executing the Read command.

UserFifoWrData[127:0]

Out

Write data bus of the Receive FIFO. Valid when UserFifoWrEn=1b.

UserFifoRdCnt[15:0]

In

Read data counter of the Transmit FIFO, used to monitor the amount of data stored in the FIFO. If the counter indicates an empty status, the transmission of data packets for the Write command temporarily pauses. If the FIFO counter width is less than 16 bits, the upper bits must be padded with 0b to complete the 16-bit count.

UserFifoEmpty

In

Unused for this IP.

UserFifoRdEn

Out

Asserted to 1b to read data from the Transmit FIFO when executing the Write command.

UserFifoRdData[127:0]

In

Read data returned from the Transmit FIFO.

Valid in the next clock after UserFifoRdEn is asserted to 1b.

NVMe IP Interface

IPVesion[31:0]

Out

IP version number.

TestPin[31:0]

Out

Reserved as an IP test point.

TimeOutSet[31:0]

In

Timeout value to wait for completion from SSD. The time unit is equal to 1/(Clk frequency).

When TimeOutSet is equal to 0, Timeout function is disabled.

AdmCompStatus[15:0]

Out

Status output from Admin Completion Entry.

[0] – Set to 1b when the Phase tag or Command ID in Admin Completion Entry is invalid.

[15:1] – Status field value of Admin Completion Entry

IOCompStatus[15:0]

Out

Status output from IO Completion Entry

[0] – Set to 1b when Phase tag in IO Completion Entry is invalid.

[15:1] – Status field value of IO Completion Entry

NVMeCAPReg[31:0]

Out

The parameter value of the NVMe capability register when UserErrorType[1] is asserted to 1b.

[15:0] – Maximum Queue Entries Supported (MQES)

[19:16] – Doorbell Stride (DSTRD)

[20] – NVM command set flag

[24:21] – Memory Page Size Minimum (MPSMIN)

[31:25] – Undefined

Identify Interface

IdenWrEn

Out

Asserted to 1b for sending data output from Identify command.

IdenWrDWEn[3:0]

Out

Dword (32-bit) enable of IdenWrData. Valid when IdenWrEn=1b.

1b: This Dword data is valid, 0b: This Dword data is not available.

Bits [0], [1], [2], and [3] correspond to IdenWrData[31:0], [63:32], [95:64], and [127:96], respectively.

IdenWrAddr[8:0]

Out

Index of IdenWrData in 128-bit unit. Valid when IdenWrEn=1b.

0x000-0x0FF is 4KB Identify controller data,

0x100-0x1FF is 4KB Identify namespace data.

IdenWrData[127:0]

Out

4KB Identify controller data or Identify namespace data. Valid when IdenWrEn=1b.

 


Custom Interface (Command and RAM)

CtmSubmDW0[31:0] – CtmSubmDW15[31:0]

In

16 Dwords of Submission queue entry for SMART, Secure Erase, or Flush command.

DW0: Command Dword0, DW1: Command Dword1, …, and DW15: Command Dword15.

These inputs must maintain their value during UserReq set to 1b and UserCmd=100b (SMART/Secure Erase) or 110b (Flush).

CtmCompDW0[31:0] –

CtmCompDW3[31:0]

Out

4 Dwords of Completion queue entry, output from SMART, Secure Erase, or Flush command.

DW0: Completion Dword0, DW1: Completion Dword1, …, and DW3: Completion Dword3.

CtmRamWrEn

Out

Asserted to 1b for sending data output from a Custom command, such as the SMART command.

CtmRamWrDWEn[3:0]

Out

Dword (32 bits) enable of CtmRamWrData. Valid when CtmRamWrEn=1b.

1b: This Dword data is valid, 0b: This Dword data is not available.

Bits [0], [1], [2], and [3] correspond to CtmRamWrData[31:0], [63:32], [95:64], and [127:96], respectively.

CtmRamAddr[8:0]

Out

Index of CtmRamWrData when SMART data is received. Valid when CtmRamWrEn=1b.

(Optional) Index to request data input through CtmRamRdData for customized Custom commands.

CtmRamWrData[127:0]

Out

512-byte data output from SMART command. Valid when CtmRamWrEn=1b.

CtmRamRdData[127:0]

In

(Optional) Data input for customized Custom commands.
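Software that reads back the IP status can decode the low UserErrorType bits described in Table 2, as in the Python sketch below. The bit-to-message table paraphrases the descriptions above; the function name is hypothetical, and the bits mapped to the Uncorrectable Error Status Register ([15:10], [30:24]) are omitted for brevity.

```python
# Messages paraphrased from the UserErrorType description in Table 2.
ERROR_BITS = {
    0: "Incorrect PCIe class code",
    1: "Unsupported Controller Capabilities (CAP) value",
    2: "Admin completion timeout",
    3: "Admin completion status error",
    4: "IO completion timeout",
    5: "IO completion status error",
    6: "Unsupported LBA unit",
    7: "Configuration interface response error",
    8: "Incorrect received TLP size",
}

def decode_user_error(user_error_type):
    """Return the human-readable reasons for each error bit set in a
    UserErrorType word (low bits only)."""
    return [name for bit, name in ERROR_BITS.items()
            if (user_error_type >> bit) & 1]
```

For example, a value with bits [1] and [4] set decodes to the CAP register error and the IO completion timeout.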

 

Table 3 Physical I/O Signals for PCIe Hard IP (Synchronous to PCIeClk)

Signal name

Dir

Description

Clock and Reset

PCIeRstB

In

Synchronous reset signal. Active low. De-assert to 1b when PCIe hard IP is not in reset state.

PCIeClk

In

Clock output from PCIe Hard IP. To achieve maximum performance, it is recommended to configure its frequency to 300 MHz.

Transmit Flow Control Credit Interface

TxCreditValid

In

Asserted to 1b to indicate that TxCreditData is valid.

TxCreditData[18:0]

In

Credit limit information and type of transmit credit.

[15:0] – Credit limit value.

[18:16] – Credit type, defined as follows.

000b: Posted Header Credit, 001b: Non-Posted Header Credit, 010b: Completion Header, 011b: Reserved,

100b: Posted Data Credit, 101b: Non-Posted Data Credit, 110b: Completion Data Credit, 111b: Reserved

PCIe Hard IP Rx Interface

PCIeRxReady

Out

Asserted to 1b to indicate that NVMe IP is ready to accept data. Data is transferred when both PCIeRxValid and PCIeRxReady are asserted in the same clock cycle.

PCIeRxValid

In

Asserted to 1b to indicate that PCIeRxData is valid.

PCIeRxKeep[15:0]

In

Bit[i] indicates that byte[i] of PCIeRxData contains valid data.

PCIeRxData[127:0]

In

Receive data bus. Valid when PCIeRxValid is asserted to 1b.

PCIeRxEOP

In

Asserted to 1b to indicate that this is the last cycle of the TLP. Valid when PCIeRxValid is asserted to 1b.

PCIe Hard IP Tx Interface

PCIeTxReady

In

Asserted to 1b to indicate that PCIe Hard IP is ready to accept data.

Data is transferred when both PCIeTxValid and PCIeTxReady are asserted in the same clock cycle.

PCIeTxValid

Out

Asserted to 1b to indicate that PCIeTxData is valid.

The NVMe IP maintains this signal asserted throughout the transmission of a TLP.

PCIeTxKeep[15:0]

Out

Bit[i] indicates that byte[i] of PCIeTxData contains valid data.

PCIeTxData[127:0]

Out

Data for transmission. Valid when PCIeTxValid is asserted to 1b.

PCIeTxEOP

Out

Asserted to 1b to indicate the last cycle of a TLP. Valid when PCIeTxValid is asserted to 1b.

 

Table 4 PCIe Hard IP Control and Status Responder Interface (Synchronous to CsrClk)

Signal name

Dir

Description

System and Link status signal

CsrRstB

In

Synchronous reset signal. Active low. De-assert this signal to 1b when the PCIe hard IP is not in reset state.

CsrClk

In

The frequency of CsrClk is recommended to be in the range of 100 MHz to 250 MHz.

PCIeLinkup

In

Asserted to 1b when LTSSM state of the PCIe hard IP is in L0 State.

LinkSpeed[1:0]

Out

Negotiated PCIe link speed that is read out from the PCIe Hard IP Configuration Space.

00b: Undefined,      01b: Gen1,             10b: Gen2,             11b: Gen3.

This register is valid after the NVMe IP has finished initialization (UserBusy=0b).

LinkWidth[2:0]

Out

Negotiated PCIe link width that is read out from the PCIe Hard IP Configuration Space.

000b: Undefined,    001b: x1 lane,         010b: x2 lane,         100b: x4 lane.

This register is valid after the NVMe IP has finished initialization (UserBusy=0b).

Control and Status Responder Interface

CsrAwValid

Out

Asserted to 1b to indicate that CsrAwAddr is valid.

CsrAwReady

In

Asserted to 1b to indicate that a transfer on CsrAwAddr can be accepted.

CsrAwAddr[19:0]

Out

The address of the first transfer in a write transaction.

CsrwValid

Out

Asserted to 1b to indicate that CsrwData is valid.

CsrwReady

In

Asserted to 1b to indicate that a transfer on CsrwData can be accepted.

CsrwData[31:0]

Out

Write data.

CsrwStrb[3:0]

Out

Bit[i] indicates that byte[i] of CsrwData holds valid data.

CsrbValid

In

Asserted to 1b to indicate that CsrbResp is valid.

CsrbReady

Out

Asserted to 1b to indicate that a transfer on CsrbResp can be accepted.

CsrbResp[1:0]

In

Write response to indicate the status of a write transfer.

CsrArValid

Out

Asserted to 1b to indicate that CsrArAddr is valid.

CsrArReady

In

Asserted to 1b to indicate that a transfer on CsrArAddr can be accepted.

CsrArAddr[19:0]

Out

The address of the first transfer in a read transaction.

CsrrValid

In

Asserted to 1b to indicate that CsrrData is valid.

CsrrReady

Out

Asserted to 1b to indicate that a transfer on CsrrData can be accepted.

CsrrData[31:0]

In

Read data.

CsrrResp[1:0]

In

Read response to indicate the status of a read transfer.

 

Timing Diagram

Initialization

 

 

Figure 4 Timing Diagram During Initialization Process

 

The initialization process of the NVMe IP follows the steps outlined below, as illustrated in the timing diagram:

1)     De-assert RstB to 1b once the Clk signal is stable.

2)     After completing the PCIe reset sequence, both PCIeRstB and CsrRstB should be de-asserted to 1b.

3)     The PCIe hard IP begins its initialization process and asserts PCIeLinkup to 1b once the LTSSM state reaches the L0 state. Since PCIeLinkup operates on CsrClk and is decoded from the LTSSM signal generated on PCIeClk, an asynchronous register must be used to properly synchronize the signal between clock domains.

4)     Once PCIe initialization is complete, the user can read the PCIe link speed and link width from the NVMe IP. These registers are synchronous with CsrClk, so the values must be converted to the user logic clock domain using asynchronous registers.

5)     When the NVMe IP completes its internal initialization, it de-asserts UserBusy to 0b, indicating readiness.

After all of the above steps are completed, the NVMe IP is ready to receive user commands.

 

Control Interface of dgIF typeS

The dgIF typeS signals can be split into two groups: the Control interface for sending commands and monitoring status, and the Data interface for transferring data streams in both directions.

Figure 5 shows an example of how to send a new command to the IP via the Control interface of dgIF typeS.

 

 

Figure 5 Control Interface of dgIF typeS Timing Diagram

1)     UserBusy must be equal to 0b before sending a new command request to confirm that the IP is Idle.

2)     Command and its parameters such as UserCmd, UserAddr, and UserLen must be valid when asserting UserReq to 1b to send the new command request.

3)     IP asserts UserBusy to 1b after starting the new command operation.

4)     After UserBusy is asserted to 1b, UserReq is de-asserted to 0b to finish the current request. New parameters for the next command can then be prepared on the bus. UserReq for the next command must not be asserted to 1b until the current command operation is finished.

5)     UserBusy is de-asserted to 0b after the command operation is completed. A new command request can then be initiated by asserting UserReq to 1b.

Note: The number of parameters used in each command is different. More details are described below.

·        Write and Read commands: UserCmd, UserAddr, and UserLen.

·        SMART, Secure Erase, and Flush commands: UserCmd and CtmSubmDW0-DW15.

·        Identify and Shutdown commands: UserCmd.

 

Data Interface of dgIF typeS

The Data interface of dgIF typeS is used to transfer the data stream when operating a Write or Read command, and it is compatible with a general FIFO interface. Figure 6 shows the data interface of dgIF typeS when transferring Write data to the IP during the Write command.

 

 

Figure 6 Transmit FIFO Interface for Write Command

 

The 16-bit FIFO read data counter (UserFifoRdCnt) indicates the amount of data stored in the Transmit FIFO. When sufficient data is available, a 512-byte burst (32 x 128-bit words) is transferred.

In the Write command, data is read from the Transmit FIFO until the total data is transferred completely. The process for transferring data is described as follows.

1)     Before starting a new burst transfer, the IP waits until at least 512 bytes of data are available in the Transmit FIFO. This is determined by monitoring UserFifoRdCnt[15:5], which must not be equal to zero.

2)     The IP asserts UserFifoRdEn to 1b for 32 clock cycles to read 512-byte data from the Transmit FIFO.

3)     UserFifoRdData becomes valid in the next clock cycle after asserting UserFifoRdEn to 1b, and 32 words are transferred continuously.

4)     After the 32nd data (D31) of each burst has been transferred, UserFifoRdEn is de-asserted to 0b.

5)     Repeat steps 1) – 4) to transfer the next 512-byte block until the total amount of data transferred matches the transfer size specified in the command.

6)     Once all data has been successfully transferred, UserBusy is de-asserted to 0b, indicating the completion of the Write operation.
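The burst-readiness check in step 1) can be sketched as a small host-side model. This is an illustrative Python snippet only, not part of the IP deliverables; the signal name follows the datasheet, and the unit assumption (UserFifoRdCnt counts 128-bit words) is inferred from the 32 x 128-bit burst described above.

```python
def write_burst_ready(user_fifo_rd_cnt: int) -> bool:
    """Model of the Transmit FIFO check in step 1).

    Assumes UserFifoRdCnt counts 128-bit (16-byte) words, so a
    512-byte burst needs at least 32 words. The IP checks that
    UserFifoRdCnt[15:5] is not equal to zero.
    """
    return (user_fifo_rd_cnt >> 5) != 0
```

With this model, a count of 31 words (496 bytes) is not enough to start a burst, while 32 words (512 bytes) is.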

 

 

Figure 7 Receive FIFO Interface for Read Command

 

When executing the Read command, data is transferred from the SSD to the Receive FIFO until the entire transfer is complete. The steps for transferring a burst of data are described below.

1)     Before starting a new burst transmission, UserFifoWrCnt[15:6] is checked to ensure that there is enough free space in the Receive FIFO, indicated by UserFifoWrCnt[15:6] not being all ones (i.e., not equal to 1023). The IP waits until enough free space is available and at least 512 bytes of data have been received from the SSD. Once both conditions are met, the new burst transmission begins.

2)     The IP asserts UserFifoWrEn to 1b for 32 clock cycles to transfer 512-byte data from the Data buffer to the user logic.

3)     After each 512-byte data transmission, UserFifoWrEn is de-asserted to 0b for one clock cycle. If there is more data to transfer, repeat steps 1) – 3) until total transferred data matches the transfer size specified in the command.

4)     Once all data has been successfully transferred, UserBusy is de-asserted to 0b, indicating that the Read operation is complete.
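The free-space check in step 1) can likewise be modeled on the host side. This is an illustrative Python sketch, not part of the IP deliverables; it assumes UserFifoWrCnt counts 128-bit (16-byte) words, consistent with the Transmit FIFO counter.

```python
def read_burst_ready(user_fifo_wr_cnt: int) -> bool:
    """Model of the Receive FIFO free-space check in step 1).

    The burst starts only when UserFifoWrCnt[15:6] is not all ones
    (i.e., not equal to 1023), meaning the FIFO still has room for
    more data.
    """
    return ((user_fifo_wr_cnt >> 6) & 0x3FF) != 1023
```

For example, an almost-full FIFO with UserFifoWrCnt=0xFFC0 blocks the burst, while an empty FIFO (count 0) allows it.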

 

IdenCtrl/IdenName

To ensure proper operation of the system, it is recommended to send the Identify command to the IP as the first command after the system boots up. This command updates important information about the SSD, such as its total capacity (LBASize) and LBA unit size (LBAMode), which are necessary for Write and Read commands to operate correctly. The following rules apply to the input parameters of these commands.

1)     The sum of the address (UserAddr) and transfer length (UserLen), inputs of Write and Read command, must not exceed the total capacity (LBASize) of the SSD.

2)     If LBAMode is 1b (LBA unit size is 4 KB), the three lower bits (bits[2:0]) of UserAddr and UserLen must be set to 0 to align with the 4 KB unit.
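The two parameter rules above can be expressed as a simple host-side validity check. This is an illustrative Python sketch, not part of the IP deliverables; signal names follow the datasheet.

```python
def rw_params_valid(user_addr: int, user_len: int,
                    lba_size: int, lba_mode: int) -> bool:
    """Check Write/Read command parameters against the Identify results.

    lba_size: total capacity (LBASize) reported by the Identify command.
    lba_mode: 0 -> 512-byte LBA unit, 1 -> 4 KB LBA unit.
    """
    # Rule 1: the address plus length must not exceed the capacity.
    if user_addr + user_len > lba_size:
        return False
    # Rule 2: in 4 KB mode, bits[2:0] of both address and length must be 0.
    if lba_mode == 1 and ((user_addr | user_len) & 0x7) != 0:
        return False
    return True
```

For example, with a 100-unit capacity, a transfer of 8 units at address 0 is valid in either mode, while address 1 in 4 KB mode violates the alignment rule.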

 

 

Figure 8 Identify Command Timing Diagram

 

When executing the Identify command, the following steps are taken.

1)     Send the Identify command to the IP (UserCmd=000b and UserReq=1b).

2)     The IP asserts UserBusy to 1b after receiving the Identify command.

3)     The IP returns 4KB Identify controller data to the user with IdenWrAddr equal to 0-255 and asserts IdenWrEn. IdenWrData and IdenWrDWEn are valid at the same clock as IdenWrEn=1b.

4)     The IP returns 4KB Identify namespace data to the user with IdenWrAddr equal to 256-511. IdenWrAddr[8] can be used to determine the data type as Identify controller data or Identify namespace data.

5)     UserBusy is de-asserted to 0b upon the Identify command completion.

6)     The LBASize and LBAMode of the SSD are simultaneously updated with the values obtained from the Identify command.

 

 

Figure 9 IdenWrDWEn Timing Diagram

 

The signal IdenWrDWEn is a 4-bit signal used to validate corresponding segments of a 128-bit data bus that carries 32-bit words. This is used for SSDs that return 4KB Identify controller and Identify namespace data one 32-bit word at a time, rather than a continuous stream.

To forward a single 32-bit word during a write cycle, one bit of IdenWrDWEn is asserted to 1b, as illustrated in Figure 9. Each bit of IdenWrDWEn corresponds to a specific 32-bit segment of IdenWrData: bit[0], [1], [2], and [3] of IdenWrDWEn map to bits[31:0], [63:32], [95:64], and [127:96] of IdenWrData, respectively.
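The bit-to-segment mapping above can be sketched as a small decoder. This is an illustrative Python model for user logic that captures the Identify data, not part of the IP deliverables.

```python
def valid_dwords(iden_wr_data: int, iden_wr_dwen: int):
    """Yield (dword_index, value) for each 32-bit word flagged valid.

    Bit[i] of IdenWrDWEn validates bits[32*i+31 : 32*i] of the
    128-bit IdenWrData word, as described in the datasheet.
    """
    for i in range(4):
        if (iden_wr_dwen >> i) & 1:
            yield i, (iden_wr_data >> (32 * i)) & 0xFFFF_FFFF
```

When IdenWrDWEn=1111b, all four Dwords of the 128-bit word are captured; when only one bit is set, only that Dword is taken.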

 

Shutdown

The Shutdown command should be sent as the last command before the system is powered down. The SSD ensures that the data in its internal cache is written to the flash memory before the shutdown process finishes. After the shutdown operation is completed, the NVMe IP and the SSD become inactive. If the SSD is powered down without executing the Shutdown command, the unsafe shutdown count, reported in the SMART command data, is incremented.

 

 

Figure 10 Shutdown Command Timing Diagram

The process for executing the Shutdown command is described below.

1)     Ensure that the IP is in an Idle state (UserBusy=0b) before sending the Shutdown command. The user must set UserReq=1b and UserCmd=001b to send the Shutdown command request.

2)     Once the NVMe IP runs the Shutdown command, UserBusy is asserted to 1b.

3)     To clear the current request, UserReq is de-asserted to 0b after UserBusy is asserted to 1b.

4)     UserBusy is de-asserted to 0b when the SSD is completely shut down. After the shutdown process is completed, the IP will not receive any further user commands.

 

SMART

The SMART command checks the health of the SSD. When this command is sent, the SSD returns 512 bytes of health information. The SMART command parameters are loaded from the CtmSubmDW0-DW15 signals on the Custom command interface. The user must set these 16 Dwords to the required constant values before asserting UserReq. Once the SMART data is returned, it can be accessed via the Custom RAM I/F, as shown in Figure 11.

 

 

Figure 11 SMART Command Timing Diagram

 

Below are the details of how to run the SMART command.

1)     Ensure that the NVMe IP is in the Idle state (UserBusy=0b) before issuing the command. All input parameters must maintain their values while UserReq is set to 1b to initiate the request. The CtmSubmDW0-DW15 fields must be set to fixed values for the SMART command as follows:

CtmSubmDW0                                           = 0000_0002h

CtmSubmDW1                                           = FFFF_FFFFh

CtmSubmDW2 – CtmSubmDW5                  = 0000_0000h

CtmSubmDW6                                            = 2000_0000h

CtmSubmDW7 – CtmSubmDW9                  = 0000_0000h

CtmSubmDW10                                          = 007F_0002h

CtmSubmDW11 – CtmSubmDW15              = 0000_0000h

2)     Once the command is accepted, the IP asserts UserBusy to 1b, indicating that the command is being executed.

3)     After the request has been acknowledged, de-assert UserReq to 0b to clear the current command. Next, the user logic may update the input parameters for the next command.

4)     The 512-byte SMART data is returned on the CtmRamWrData signal, with CtmRamWrEn set to 1b. CtmRamAddr ranges from 0 to 31, representing the index of each 16-byte segment within the 512-byte block. When CtmRamAddr=0, bytes 0-15 of SMART data are valid on CtmRamWrData. CtmRamWrDWEn indicates the validity of each 32-bit CtmRamWrData. If CtmRamWrDWEn=1111b, CtmRamWrData[127:0] are valid.

5)     When the SMART command has completed, UserBusy is de-asserted to 0b.
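The fixed CtmSubm values in step 1) can be cross-checked against the Get Log Page command layout in the NVMe base specification. The decoding below is an illustrative Python sketch under that assumption; it is not required for using the IP.

```python
# Fixed SMART parameters from step 1) of the datasheet.
dw0, dw10 = 0x0000_0002, 0x007F_0002

opcode = dw0 & 0xFF           # 02h = Get Log Page admin command
lid    = dw10 & 0xFF          # 02h = SMART / Health Information log page
numdl  = (dw10 >> 16) & 0xFFFF  # number of dwords, zero-based
nbytes = (numdl + 1) * 4      # 0x007F -> 128 dwords -> 512 bytes
```

This confirms that the constants request the 512-byte SMART/Health Information log, matching the size of the data returned on the Custom RAM I/F.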

 

Figure 12 CtmRamWrDWEn Timing Diagram

 

Similar to the Identify command, some SSDs return only one Dword (32 bits) of data at a time instead of streaming the 512-byte block continuously. In such cases, one bit of CtmRamWrDWEn is asserted to 1b during the write cycle to indicate that a specific 32-bit segment of CtmRamWrData is valid. Each bit of CtmRamWrDWEn corresponds to a 32-bit segment within the 128-bit CtmRamWrData signal as follows: bit[0], [1], [2], and [3] of CtmRamWrDWEn correspond to bits[31:0], [63:32], [95:64], and [127:96] of CtmRamWrData, respectively.
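Once the 512-byte SMART log has been assembled in user logic, a few health fields can be extracted. The field offsets below are assumptions taken from the SMART/Health Information log layout in the NVMe base specification, not from this datasheet; the Python sketch is illustrative only.

```python
def parse_smart(log: bytes) -> dict:
    """Extract a few fields from the 512-byte SMART/Health log.

    Assumed offsets (NVMe base spec): composite temperature in
    bytes 1-2 (Kelvin), percentage used in byte 5, and unsafe
    shutdowns in bytes 144-159 (little-endian).
    """
    assert len(log) == 512
    return {
        "temperature_c": int.from_bytes(log[1:3], "little") - 273,
        "percentage_used": log[5],
        "unsafe_shutdowns": int.from_bytes(log[144:160], "little"),
    }
```

The unsafe shutdown counter is the field that increments when the SSD is powered down without the Shutdown command, as noted in the Shutdown section.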

 

Secure Erase

The Secure Erase command erases all user data in the SSD. After the Secure Erase command is executed, the contents of the user data are indeterminate. Since this command may take a long time to complete, the user must disable the timeout timer of the IP by setting the TimeoutSet signal to zero.

 

 

Figure 13 Secure Erase Command Timing Diagram

 

Below are the details of how to run the Secure Erase command.

1)     Ensure that the NVMe IP is in the Idle state (UserBusy=0b) before issuing the command. All input parameters must maintain their value while UserReq is set to 1b to send the request. The TimeoutSet and CtmSubmDW0-DW15 fields must be set to fixed values for the Secure Erase command as follows:

TimeoutSet                                                 = 0000_0000h (Disable Timer)

CtmSubmDW0                                           = 0000_0080h

CtmSubmDW1                                           = 0000_0001h

CtmSubmDW2 – CtmSubmDW9                  = 0000_0000h

CtmSubmDW10                                          = 0000_0200h

CtmSubmDW11 – CtmSubmDW15              = 0000_0000h

2)     Once the command is accepted, the IP asserts UserBusy to 1b, indicating that the command is being executed.

3)     After the request has been acknowledged, de-assert UserReq to 0b to clear the current command. Next, the user logic may update the input parameters for the next command.

4)     When the Secure Erase command has completed, UserBusy is de-asserted to 0b. Following this, TimeoutSet can be set to a non-zero value to re-enable the timeout function of the IP.

Note: Some SSDs may experience a decrease in performance after extensive data transfers. In such cases, executing the Secure Erase command can help restore the SSD’s performance.
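As with the SMART command, the fixed Secure Erase constants can be cross-checked against the Format NVM command layout in the NVMe base specification. The decoding below is an illustrative Python sketch under that assumption; it is not required for using the IP.

```python
# Fixed Secure Erase parameters from step 1) of the datasheet.
dw0, dw1, dw10 = 0x0000_0080, 0x0000_0001, 0x0000_0200

opcode = dw0 & 0xFF          # 80h = Format NVM admin command
nsid   = dw1                 # namespace ID 1
ses    = (dw10 >> 9) & 0x7   # Secure Erase Settings: 001b = User Data Erase
lbaf   = dw10 & 0xF          # LBA format index 0 (keep current format)
```

This shows why the user data is indeterminate afterwards: the constants issue a Format NVM with the User Data Erase setting on namespace 1.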

 

Flush

SSDs typically enhance write performance by caching write data before writing it to the flash memory. However, unexpected power loss can result in data loss, as cached data may not yet be stored in flash memory. To avoid data loss, the Flush command can be used to force the SSD controller to write cached data to the flash memory.

 

 

Figure 14 Flush Command Timing Diagram

 

To execute the Flush command, follow the steps outlined below:

1)     Ensure the IP is in the Idle state (UserBusy=0b) before sending the command request. All input parameters must maintain their values while UserReq is set to 1b to send the request. For the Flush command, the CtmSubmDW0-DW15 fields must be set to fixed values as follows:

CtmSubmDW0                                           = 0000_0000h

CtmSubmDW1                                           = 0000_0001h

CtmSubmDW2 – CtmSubmDW15                = 0000_0000h

2)     Once the command is accepted, the IP asserts UserBusy to 1b, indicating that the command is being executed.

3)     After the request has been acknowledged, de-assert UserReq to 0b to clear the current command. Next, the user logic may update the input parameters for the next command.

4)     When the Flush command has completed, UserBusy is de-asserted to 0b.

Using the Flush command ensures that all data from the previous Write operations is guaranteed to be stored in flash memory, thus preventing data loss in the event of unexpected power loss.

 

Error

 

Figure 15 Error Flag Timing Diagram

 

If an error occurs during the initialization process or while executing a command, the UserError flag is set to 1b. To identify the type of error, the UserErrorType signal should be examined. Additionally, the NVMeCAPReg, AdmCompStatus, and IOCompStatus signals can be used to monitor error details after UserError is asserted.

If the error occurs during initialization, it is recommended to read the NVMeCAPReg signal to verify the capabilities of the connected NVMe SSD. If the error occurs during command execution, it is recommended to check the AdmCompStatus and IOCompStatus signals.

·        If bit[3] of UserErrorType is asserted, refer to the AdmCompStatus signal for more detailed error information.

·        If bit[5] of UserErrorType is asserted, refer to the IOCompStatus signal for further details.

The UserError flag is cleared only by asserting the RstB signal. Once the issue is resolved, assert RstB to 0b to reset the IP and clear the error status.

 

Verification Methods

The NVMe IP Core functionality was verified by simulation and proven on real hardware using the Sulfur Agilex5 E-Series Development board.

Recommended Design Experience

Experienced design engineers with knowledge of Quartus tools should be able to easily integrate this IP into their designs.

Ordering Information

This product is available directly from Design Gateway Co., Ltd. For pricing and additional information about this product, please refer to the contact information on the front page of this datasheet.

Revision History

Revision

Date (D-M-Y)

Description

1.00

18-Apr-25

Initial release