DDR4 SDRAMs are very prevalent in devices that use ASICs and FPGAs. In this article we explore the basics:
What a DDR4 SDRAM looks like on the inside
What goes on during basic operations such as READ & WRITE, and
A high-level picture of the SDRAM sub-system, i.e., what your ASIC/FPGA needs in order to talk to a DDR4 SDRAM memory
A good place to start is to look at some of the essential IOs and understand what their functions are. From there we'll dive deeper until we get to the basic unit that makes up a DRAM memory.
As you would expect, the DRAM has clock, reset, chip-select, address and data inputs. The table below has a little more detail about each of them. This is not a complete list of IOs; only the basic ones are listed here. Take a little time to carefully read what each IO does, especially the dual-function address inputs.
Symbol | Type | Function |
---|---|---|
RESET_n | Input | DRAM is active only when this signal is HIGH |
CS_n | Input | The memory looks at all the other inputs only if this is LOW. |
CKE | Input | Clock Enable. HIGH activates internal clock signals and device input buffers and output drivers. |
CK_t/CK_c | Input | Differential clock inputs. All address & control signals are sampled at the crossing of posedge of CK_t & negedge of CK_c. |
DQ/DQS | Inout | Data Bus & Data Strobe. This is how data is written in and read out. The strobe is essentially a data valid flag. |
RAS_n/A16 CAS_n/A15 WE_n/A14 | Input | These are dual function inputs. When ACT_n & CS_n are LOW, these are interpreted as Row Address Bits. When ACT_n is HIGH, these are interpreted as command pins to indicate READ, WRITE or other commands. |
ACT_n | Input | Activate command input |
BG0-1 BA0-1 | Input | Bank Group, Bank Address |
A0-13 | Input | Address inputs |
The top-level picture shows what a DRAM looks like on the outside. Going a level deeper, this is how memory is organized - in Bank Groups and Banks.
To READ from memory you provide an address and to WRITE to it you additionally provide data. This address provided by you, the user, is typically called "logical address". This logical address is translated to a physical address before it is presented to the DRAM. The physical address is made up of the following fields:
Bank Group
Bank
Row
Column
These individual fields are then used to identify the exact location in the memory to read from or write to.
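To make the translation concrete, here is a minimal sketch of slicing a logical address into the four fields. The field widths (those of an 8Gb x8 device) and the bit ordering (row, then bank, then bank group, then column, from MSB to LSB) are assumptions made for illustration; a real controller picks its own mapping, and the name `split_address` is hypothetical.

```python
from typing import NamedTuple


class PhysicalAddress(NamedTuple):
    bank_group: int
    bank: int
    row: int
    column: int


# Assumed field widths (8Gb x8 device): 2 bank-group bits, 2 bank bits,
# 16 row bits, 10 column bits.
BG_BITS, BA_BITS, ROW_BITS, COL_BITS = 2, 2, 16, 10


def split_address(logical: int) -> PhysicalAddress:
    """Slice a logical address into fields, lowest bits first.

    Assumed layout, MSB to LSB: row | bank | bank group | column.
    """
    column = logical & ((1 << COL_BITS) - 1)
    logical >>= COL_BITS
    bank_group = logical & ((1 << BG_BITS) - 1)
    logical >>= BG_BITS
    bank = logical & ((1 << BA_BITS) - 1)
    logical >>= BA_BITS
    row = logical & ((1 << ROW_BITS) - 1)
    return PhysicalAddress(bank_group, bank, row, column)


# Build an address from known fields, then split it again.
example = (((5 << BA_BITS | 2) << BG_BITS | 1) << COL_BITS) | 7
print(split_address(example))
```

Putting the column bits at the bottom means consecutive logical addresses land in the same row, which is one common reason for choosing a layout like this.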
Going down another level, this is what you'll see within each Bank.
Memory Arrays
Row Decoder
Column Decoder
Sense Amplifiers
Figure 3: Row & Column Decoding
Once the Bank Group and Bank have been identified, the Row part of the address activates a line in the memory array. This is called the "Word Line" and activating it reads data from the memory array into something called "Sense Amplifiers". The Column address then reads out a part of the word that was loaded into the Sense Amps. The width of the column is called the "Bit Line".
The width of a column is standard - it is either 4 bits, 8 bits or 16 bits wide, and DRAMs are classified as x4, x8 or x16 based on this column width. Another thing to note is that the width of the DQ data bus is the same as the column width. So, to simplify things, you can say that DRAMs are classified based on the width of the DQ bus.
Note: x16 devices have only 2 Bank Groups whereas x4 and x8 have 4 as shown in figure 2.
At the lowest level, a bit is essentially a capacitor that holds the charge and a transistor acting as a switch.
Figure 4: Bit Level
Since the capacitor discharges over time, the information eventually fades unless the capacitor is periodically REFRESHed. This is where the 'D' in DRAM comes from - it refers to Dynamic as opposed to SRAM (Static Random Access Memory).
DRAMs come in standard sizes and this is specified in the JEDEC spec. JEDEC is the standards committee that decides the design and roadmap of DDR memories. This is from section 2.7 of the DDR4 JEDEC specification (JESD79-4B).
Figure 5: Addressing
Let's try to make some more sense of the above table by hand-calculating two of the sizes
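As a rough sketch of such a hand calculation: the total size is the number of bank groups, times banks per group, times rows, times columns, times the column width. The row-address widths used below (16 bits for both the 8Gb x8 and 8Gb x16 parts) are my reading of the JESD79-4 addressing table and should be checked against the spec; the function name is hypothetical.

```python
GIBIBIT = 1 << 30  # the "Gb" in DRAM sizes is 2**30 bits


def capacity_bits(bank_groups: int, banks_per_group: int,
                  row_bits: int, col_bits: int, width: int) -> int:
    """Total number of bits stored in one DRAM device."""
    rows = 1 << row_bits
    columns = 1 << col_bits
    return bank_groups * banks_per_group * rows * columns * width


# 8Gb x8: 4 bank groups, 4 banks each, 16 row bits (assumed), 10 column bits
x8 = capacity_bits(4, 4, 16, 10, 8)
# 8Gb x16: only 2 bank groups, 16 row bits (assumed), 10 column bits
x16 = capacity_bits(2, 4, 16, 10, 16)

print(x8 // GIBIBIT, "Gb")
print(x16 // GIBIBIT, "Gb")
```

Note how the x16 device reaches the same total with half the bank groups, because each column is twice as wide.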
In the table above, there's a mention of Page Size. Page size is essentially the number of bits per row. Or, put another way, it is the number of bits loaded into the Sense Amps when a row is activated. Since the column address is 10 bits wide, there are 1K columns per row. So, for a x4 device the number of bits is 1K x 4 = 4K bits (or 512B). Similarly, for a x8 device it is 1KB and for a x16 device it is 2KB per page.
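The page size arithmetic is small enough to check in a few lines. This is just the calculation from the paragraph above written out, with a hypothetical helper name.

```python
COLUMN_ADDRESS_BITS = 10  # column address is 10 bits wide


def page_size_bytes(dq_width: int) -> int:
    """Bytes loaded into the Sense Amps when one row is activated."""
    columns_per_row = 1 << COLUMN_ADDRESS_BITS  # 1K columns
    return columns_per_row * dq_width // 8      # bits -> bytes


for width in (4, 8, 16):
    print(f"x{width}: {page_size_bytes(width)} bytes per page")
```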
When dealing with DRAMs you'll come across terminology such as Single-Rank, Dual-Rank or Quad-Rank. Rank is the highest logical unit and is typically used to increase the memory capacity of your system.
Say you need 16Gb of memory. Depending on what's available in the market and what is cheaper, you could have a single 16Gb memory die. You would call this a Single-Rank system because you need just 1 ChipSelect signal (CS_n) to read all the contents of the memory. Or you could choose to have 2 individual 8Gb discrete devices soldered down on the PCB (because 2x8Gb devices happen to be cheaper than 1x16Gb). In this case the 2 devices will be connected to the same address and data buses, but you will need 2 ChipSelects to separately address each device. Since you need two ChipSelects, this setup is called Dual-Rank.
Note: One other DRAM variety you may come across is a "Dual-Die Package" or DDP. In this case you'll have a single DRAM chip soldered on the board but internally within the package it'll have a stack of 2 dies. Each die will once again share address and data lines but will have separate chip selects, making it a Dual Rank device.
Figure 6: Rank
Another example - say you need an 8Gb memory and the interface to your chip is x8. Then you could pick a single 8Gb x8 device or two 4Gb x4 devices and connect them in a "width cascaded" fashion on the PCB. With width cascading, both DRAMs are connected to the same ChipSelects, Address and Command bus, but use different portions of the data bus (DQ & DQS). In the picture below, the first x4 DRAM is connected to DQ[3:0] and the second one to DQ[7:4].
Figure 7: DRAMs Width Cascading
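One way to picture width cascading is to follow a single byte on the x8 interface as it is split between the two x4 devices. The sketch below only models the DQ wiring described above; the function names are made up for illustration.

```python
from typing import Tuple


def split_byte_across_x4_devices(byte: int) -> Tuple[int, int]:
    """Return (nibble on DQ[3:0], nibble on DQ[7:4]) for one data beat."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected an 8-bit value")
    first_device = byte & 0xF           # wired to DQ[3:0]
    second_device = (byte >> 4) & 0xF   # wired to DQ[7:4]
    return first_device, second_device


def merge_nibbles(first_device: int, second_device: int) -> int:
    """Reassemble the byte that the controller sees on DQ[7:0]."""
    return (second_device << 4) | first_device


low, high = split_byte_across_x4_devices(0xA5)
print(hex(low), hex(high), hex(merge_nibbles(low, high)))
```

Both devices see the same command and address on every cycle, so they always act on the same location; only the data lanes differ.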
Read and write operations to the DDR4 SDRAM are burst oriented. A burst starts at a selected location (as specified by the user-provided address) and continues for a burst length of eight or a 'chopped' burst of four.
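To get a feel for what one burst moves, the sketch below multiplies the burst length by the DQ width, and divides the burst length by two to get the clock cycles of data transfer (DDR moves data on both clock edges). The helper names are hypothetical, and this counts data beats only, not the surrounding command timing.

```python
def bits_per_burst(dq_width: int, burst_length: int = 8) -> int:
    """Bits transferred by one device in a single burst."""
    return dq_width * burst_length


def data_clock_cycles(burst_length: int = 8) -> int:
    """Clock cycles spent transferring data: two beats per clock (DDR)."""
    return burst_length // 2


print(bits_per_burst(8), "bits per BL8 burst on a x8 device")
print(bits_per_burst(16), "bits per BL8 burst on a x16 device")
print(data_clock_cycles(8), "clocks of data for BL8")
print(data_clock_cycles(4), "clocks of data for BC4")
```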
Read and write operations are a 2-step process. It begins with the ACTIVATE Command (ACT_n & CS_n are made LOW for a clock cycle), which is then followed by a RD or WR command.
The address bits registered coincident with the ACTIVATE Command are used to select the Bank Group, Bank and Row to be activated (BG0-BG1 in x4/x8 and BG0 in x16 select the bank group; BA0-BA1 select the bank; A0-A17 select the row). This step is also called RAS - Row Address Strobe.
The address bits registered coincident with the Read or Write command are used to select the starting column location for the burst operation. This step is also referred to as CAS - Column Address Strobe.
Each bank has only one set of Sense Amps. Before a read/write to a different row in the same bank can be performed, the currently open row has to be de-activated using a PRECHARGE command. PRECHARGE is equivalent to closing the current file drawer in the cabinet: it causes the data in the Sense Amps to be written back into the row.
Instead of issuing an explicit PRECHARGE command to deactivate a row, the RDA (Read with Auto-Precharge) and WRA (Write with Auto-Precharge) commands can be used. These commands tell the DRAM to automatically deactivate/precharge the row once the read or write operation is complete. Since the column address uses only address bits A0-A9, A10, which is otherwise unused during CAS, is overloaded to indicate Auto-Precharge.
I'm constantly referring to something called "commands" - ACTIVATE command, PRECHARGE command, READ command, WRITE command. But in the very first picture of this article, there is no "Command" input to the DRAM. So how are these commands issued?
Well, the DRAM interprets the ACT_n, RAS_n, CAS_n & WE_n inputs as commands based on the truth table below.
Partial Command Truth-Table
The table above is only a subset of commands you can issue to the DRAM. The entire DDR4 command truth table is specified in section 4.1 of the JEDEC spec JESD79-4B.
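The decoding can be sketched as a small lookup. The pin encodings below are my understanding of the DDR4 command truth table and should be verified against section 4.1 of the spec; the function name and the short command mnemonics are chosen for illustration, and details such as CKE-dependent commands are left out.

```python
def decode_command(cs_n: int, act_n: int, ras_n: int, cas_n: int,
                   we_n: int, a10: int = 0) -> str:
    """Decode one clock edge's control pins into a command name."""
    if cs_n:
        return "DES"   # device deselected, other inputs are ignored
    if not act_n:
        return "ACT"   # RAS_n/CAS_n/WE_n carry row address bits A16/A15/A14
    table = {
        (0, 0, 0): "MRS",                      # mode register set
        (0, 0, 1): "REF",                      # refresh
        (0, 1, 0): "PREA" if a10 else "PRE",   # precharge all / one bank
        (1, 0, 0): "WRA" if a10 else "WR",     # write (A10 = auto-precharge)
        (1, 0, 1): "RDA" if a10 else "RD",     # read (A10 = auto-precharge)
        (1, 1, 0): "ZQC",                      # ZQ calibration
        (1, 1, 1): "NOP",
    }
    return table.get((ras_n, cas_n, we_n), "RFU")  # reserved for future use


print(decode_command(cs_n=0, act_n=0, ras_n=1, cas_n=0, we_n=1))
print(decode_command(cs_n=0, act_n=1, ras_n=1, cas_n=0, we_n=1))
print(decode_command(cs_n=0, act_n=1, ras_n=1, cas_n=0, we_n=1, a10=1))
```

The first two calls use the same RAS_n/CAS_n/WE_n levels, yet decode to ACT and RD respectively: ACT_n alone decides whether those three pins are address bits or a command.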
Figure 8 shows the timing diagram of a READ operation with burst length of 8 (BL8).
The first step is an ACT command. The value on the address bus at this time indicates the row address.
In the second step an RDA (Read with Auto-Precharge) is issued. The value on the address bus at this time is the column address.
The RDA command tells the DRAM to automatically PRECHARGE the bank after the read is complete.
Figure 9 shows the timing diagram of a WRITE operation.
The first step activates a row
Then 2 WRITE commands are issued. The first one to address COL and second one to COL+8.
The second write operation does not need an ACT before it because the row we intend to write to is already active in the Sense Amps
Also note that the first command is a plain WR, so this leaves the row active. The second command is a WRA which de-activates the row after the write completes.
Note: I sneaked something in here without much explanation. A16, A15 & A14 are not the only address bits with dual function. Auto-precharge is requested via A10, and BurstChop4 (BC4) or BurstLength8 (BL8) mode is selected via A12, if enabled in the mode register.
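The WRITE sequence described above can be sketched as a tiny command generator: one ACT for the row, a plain WR for every column except the last, and a WRA for the last so that the row is closed afterwards. This is an illustrative model only; it ignores all timing parameters, and `write_commands` is a hypothetical name.

```python
from typing import List, Tuple


def write_commands(row: int, columns: List[int]) -> List[Tuple[str, int]]:
    """Commands for writing several bursts into one row of one bank."""
    if not columns:
        raise ValueError("need at least one column to write")
    commands = [("ACT", row)]  # address bus carries the row address
    for index, column in enumerate(columns):
        is_last = index == len(columns) - 1
        # WRA is a write with A10 HIGH; plain WR keeps A10 LOW so the
        # row stays open in the Sense Amps for the next write.
        commands.append(("WRA" if is_last else "WR", column))
    return commands


COL = 0x40
for name, address in write_commands(row=0x12, columns=[COL, COL + 8]):
    print(f"{name:<3} 0x{address:03X}")
```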
Now that we've had a sufficiently long discussion about the DRAM, it is time to talk about what the ASIC or FPGA needs in order to talk to the DRAM. This is called the DRAM sub-system and it's made up of 3 components:
The DRAM memory itself, which comprises everything described above
A DDR PHY
A DDR Controller
Figure 10: DRAM Sub-System
There's a lot going on in the picture above, so let's break it down:
The DRAM is soldered down on the board. The PHY and controller, along with user logic are typically part of the same FPGA or ASIC.
The interface between the user-logic and the controller can be user defined and need not be standard
When the user-logic makes a read or write request to the controller, it issues a logical address
The controller then converts this logical address to a physical address and issues a command to the PHY
The Controller and PHY talk to each other over a standard interface called DFI (DDR PHY Interface). The DFI specification is available for download online.
The PHY then does all the lower level signaling and drives the physical interface to the DRAM.
This interface between the PHY and memory is specified in the JEDEC standard JESD79-4B
Think of the controller as the brains and the PHY as the brawn.
When you activate a row, the whole page is loaded into the Sense Amps, so multiple reads to an already open page are less expensive because you can skip the first step of row activation. The controller typically has the capability to re-order requests issued by the user to take advantage of this. To do the re-ordering it uses a small cache or TCAM and always returns the latest data, so you don't have to worry about stale data or collisions occurring because of this re-ordering done by the controller.
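A toy model shows why re-ordering pays off. The sketch below counts how many row activations a single bank needs for a stream of requests, first in arrival order and then with requests to the same row grouped together. It assumes one bank, ignores timing and data hazards, and is not how any particular controller is implemented; both function names are hypothetical.

```python
from typing import List


def count_activates(rows: List[int]) -> int:
    """Row activations needed if each row stays open until another is needed."""
    open_row = None
    activates = 0
    for row in rows:
        if row != open_row:   # different row: precharge, then activate
            activates += 1
            open_row = row
    return activates


def group_by_row(rows: List[int]) -> List[int]:
    """Re-order requests so accesses to the same row are back to back."""
    first_seen_order = list(dict.fromkeys(rows))
    return [row for key in first_seen_order for row in rows if row == key]


requests = [3, 7, 3, 7, 3]  # row targeted by each request, in arrival order
print(count_activates(requests))                # in-order
print(count_activates(group_by_row(requests)))  # re-ordered
```

With this request stream the in-order case activates a row for every access, while the grouped case needs only one activation per distinct row.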
The PHY contains the analog drivers and provides the capability to tweak registers to increase drive strength or change terminations, in order to improve signal integrity.
Let's wrap this up
The DRAM is organized as Bank Groups, Banks, Rows & Columns
The address issued by the user is called the Logical Address and it is converted to a Physical Address by the DRAM controller before it is presented to the memory
DDR4 DRAMs are classified as x4, x8 or x16 based on the width of the DQ data bus
You can depth cascade or width cascade DRAMs to achieve the required size
Read and write operations are a 2-step process. The 1st step activates a row, the 2nd step reads from or writes to the memory.
The DRAM sub-system comprises the memory, a PHY layer and a controller