Overcoming Industrial Data Storage Problems with 3D NAND
March 20, 2020
The use of mass storage in embedded industrial applications continues to grow as the demand for new features and functionality increases. Increasing NAND die capacities, faster interfaces, and a variety of managed NAND solutions have made more complex GUIs and applications possible, yet the challenge of finding solid state storage that can cope with the demands of extreme environments remains. Fortunately, the evolution of both NAND storage media and controller design means that more reliable and cost-effective options are now available.
Meeting the Demands of Extreme Environments
At the top of an embedded designer’s wish list for mass storage capabilities is usually high reliability. Also on the list is a need for high mechanical resistance to shock and vibration, which often rules out the use of removable storage in favor of soldered-down ball grid array (BGA) devices. Guaranteed operation over extended temperature ranges can be added to the list too. Furthermore, the ideal solution should be available for a long period of time to prevent expensive and time-consuming storage device requalification.
Real-World Use Case—Finding the Right Storage Solution
A brake management system in a train is a real-world use case where data integrity and power-fail data protection in SSDs are crucial. While transport system designers take great care to ensure a stable power supply, brown-outs are not completely preventable. Without built-in power-fail protection, there is a clear risk of data corruption. This could mean a substantial failure of the brake management system if the affected files are part of the OS or application software. A typical brake management system monitors key parameters such as total hours in use, brake efficiency, and temperature to inform critical maintenance schedules. A failure during the logging of this data could mean missed maintenance or unnecessary downtime, and increased maintenance costs.
Selecting the right SSD for this type of embedded application is critical. In many cases, single-level cell (SLC) NAND memory may be the ideal technology, offering both robust data retention and a high number of program-and-erase (P/E) cycles. However, the main issues with this type of technology are the lack of high capacity options and the higher cost of the memory. If we look at a lower cost technology like planar (2D) multi-level cell (MLC) NAND, which holds two bits per cell, we immediately get more economical, higher capacity options. In most cases, the available endurance is 3,000 to 10,000 P/E cycles, which is plenty for many applications.
Well, not quite.
Planar MLC NAND stores its two bits of data in one memory cell. These two bits belong to two different paired pages, which are programmed in separate stages. This means that if the power fails while writing to one page, the data in the paired page can be corrupted too. The host file system may be able to manage the page being written at the time of the power failure, but it will have no knowledge of the corrupted paired page until it tries to read that data at some later time. The paired page will then contain uncorrectable (UNC) data, where the charge state of each affected cell is indeterminate and cannot be resolved to a 0 or 1.
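The paired-page hazard can be illustrated with a toy model. The sketch below is not a description of any real NAND device: it assumes each cell is a (lower_bit, upper_bit) pair, that the second programming pass only moves charge in cells whose upper bit becomes 0, and that a power failure leaves those cells unreadable. The names program_lower, program_upper, and read_lower are hypothetical.

```python
# Toy model of one planar MLC word line (illustrative assumptions only).
# Each cell is a (lower_bit, upper_bit) tuple; None marks a cell whose
# charge was stranded between valid levels by a power failure.
ERASED = (1, 1)


def program_lower(cells, lower_bits):
    """Pass 1: store the lower page; upper bits stay erased (1)."""
    for i, bit in enumerate(lower_bits):
        cells[i] = (bit, 1)


def program_upper(cells, upper_bits, power_fails=False):
    """Pass 2: store the upper page by moving charge a second time.

    Toy assumption: only cells whose upper bit becomes 0 move charge,
    and a power failure leaves exactly those cells indeterminate.
    """
    for i, bit in enumerate(upper_bits):
        if bit == 1:
            continue
        cells[i] = None if power_fails else (cells[i][0], 0)


def read_lower(cells):
    """Return the lower page, or None if any cell is unreadable (UNC)."""
    if any(cell is None for cell in cells):
        return None
    return [cell[0] for cell in cells]


word_line = [ERASED] * 4
program_lower(word_line, [0, 1, 0, 1])
before = read_lower(word_line)   # lower page reads back correctly
program_upper(word_line, [0, 0, 1, 1], power_fails=True)
after = read_lower(word_line)    # lower page is now lost as well
print(before, after)
```

In this model the lower page was written successfully and never touched by the host again, yet it becomes unreadable because the interrupted upper-page pass disturbed the same cells.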
Traditional solutions to prevent this involve retaining power to the drive for a sufficient amount of time to allow the page program operation to complete. This could be achieved through on-board power loss protection capacitance to provide enough charge for the page program time plus some program latency. If the drive being used had a DRAM cache, the amount of stored energy would need to be significantly higher to prevent the cache contents from being lost. A typical power-loss protection (PLP) solution may look like the generic example in Figure 1.
Figure 1: Generic power retention circuit
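As a rough first-order estimate of the hold-up capacitance discussed above, the usable energy of a capacitor discharging from a starting voltage to the minimum operating voltage can be equated to the energy the drive draws during the hold-up time. The sketch below uses only that textbook relationship; the function name and every numeric value (0.5 W, 3 ms, 3.3 V, 2.7 V) are hypothetical placeholders, not figures from any datasheet, and it ignores converter efficiency, capacitor ESR, tolerance, and aging.

```python
def hold_up_capacitance(power_w, hold_time_s, v_start, v_min):
    """Estimate the capacitance (farads) needed to ride through a power loss.

    Energy needed:   E = P * t
    Energy usable:   E = 0.5 * C * (v_start**2 - v_min**2)
    Solving for C:   C = 2 * P * t / (v_start**2 - v_min**2)
    """
    if v_start <= v_min:
        raise ValueError("v_start must be greater than v_min")
    return 2.0 * power_w * hold_time_s / (v_start ** 2 - v_min ** 2)


# Hypothetical example values: replace with measured figures for a real design.
c_farads = hold_up_capacitance(
    power_w=0.5,        # assumed drive power during a page program
    hold_time_s=0.003,  # assumed page program time plus latency
    v_start=3.3,        # assumed supply rail
    v_min=2.7,          # assumed minimum operating voltage
)
print(f"{c_farads * 1e6:.0f} uF")
```

With these placeholder numbers the estimate comes out at roughly 830 µF. Flushing a DRAM cache would lengthen the hold-up time, which scales the required capacitance in direct proportion.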
A New Class of NAND Technology
Recent advances in memory architecture have enabled a new class of 3D NAND-based solid state storage solutions that eliminate the paired page issue. 3D NAND uses vertically stacked memory cell layers that can provide the same endurance as planar NAND flash with increased cost effectiveness and faster performance. With Micron’s industrial 3D MLC NAND, programming can now be achieved in a single pass where both pages are programmed at the same time. A representation of single-pass programming in Figure 2 shows the classic threshold voltage (Vt) distribution of cells in MLC NAND and how the charge state is decoded to the bit values for those cells.
Figure 2: Representation of single-pass programming
The upper and lower pages can be programmed by a NAND flash controller in one operation so the cell charge is moved to the required level for both pages simultaneously, effectively eliminating the possibility of data corruption in paired pages during power interrupts. It is the responsibility of the controller to ensure the pages within a block are sequentially programmed and that the lower and upper page addresses are on a shared word line (WL).
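The idea of moving each cell directly to its final charge level can be sketched as follows. The Gray-coded mapping from charge level to bit pair used here is an assumption chosen for illustration; the actual assignment in a given NAND device may differ, and the function names are hypothetical.

```python
# Assumed Gray-coded mapping from charge level (0 = erased, lowest Vt)
# to (lower_bit, upper_bit). Adjacent levels differ in exactly one bit.
LEVEL_TO_BITS = {0: (1, 1), 1: (1, 0), 2: (0, 0), 3: (0, 1)}
BITS_TO_LEVEL = {bits: level for level, bits in LEVEL_TO_BITS.items()}


def single_pass_targets(lower_page, upper_page):
    """Combine both pages into one target charge level per cell."""
    if len(lower_page) != len(upper_page):
        raise ValueError("pages on a shared word line must be the same length")
    return [BITS_TO_LEVEL[(lo, up)] for lo, up in zip(lower_page, upper_page)]


def decode(levels):
    """Recover the lower and upper pages from the stored charge levels."""
    lower = [LEVEL_TO_BITS[level][0] for level in levels]
    upper = [LEVEL_TO_BITS[level][1] for level in levels]
    return lower, upper


lower_page = [0, 1, 0, 1]
upper_page = [0, 0, 1, 1]
levels = single_pass_targets(lower_page, upper_page)
print(levels, decode(levels))
```

Because the target level is computed from both pages before any charge is moved, the sketch has no intermediate state in which one page is committed while its partner is still pending.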
Micron’s 3D NAND + Greenliant’s NANDrive Solution
With an intelligent controller, like those developed by Greenliant for use in its small form factor eMMC NANDrive BGA SSDs, and 3D MLC NAND’s single-pass programming feature, a brake management system designer can now ensure that stored data is not affected by sudden power losses.
The controller programs all states in a single step without disturbing adjacent cells, reducing risk to data already in place on the drive (called “data at rest”). Furthermore, the controller helps minimize corruption of data in transition or in flight (in temporary DRAM or SRAM cache buffers) through use of Micron’s advanced 3D NAND features.
If the power fails part way through the write operation, the host will typically use journaling or some other transaction failsafe protocol to determine that the last file written did not complete, and therefore that the data in that file should be ignored or replaced. If the application uses small writes, which optimally should be the size of a NAND page, sophisticated controller firmware will use advanced algorithms that leverage 3D NAND’s automatic read calibration to try to recover the last page, even if the power failed during the write operation.
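One common way for a host application to detect an incomplete last write is to frame each log record with its length and a checksum, then discard a torn trailing record on restart. The sketch below is a minimal, hypothetical example of that idea using an in-memory byte string. It is not the journaling scheme of any particular file system, nor a description of the controller firmware, and the record contents are invented.

```python
import struct
import zlib

# Record header: payload length and CRC32 of the payload, little-endian.
HEADER = struct.Struct("<II")


def encode_record(payload: bytes) -> bytes:
    """Frame a payload so that a torn write can be detected later."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def replay(log: bytes):
    """Return all complete records, stopping at the first torn one."""
    records, offset = [], 0
    while offset + HEADER.size <= len(log):
        length, crc = HEADER.unpack_from(log, offset)
        start = offset + HEADER.size
        payload = log[start:start + length]
        if len(payload) < length or zlib.crc32(payload) != crc:
            break  # incomplete or corrupted record: ignore it and the rest
        records.append(payload)
        offset = start + length
    return records


# Hypothetical brake-log entries; the last one is cut short by a power loss.
log = encode_record(b"hours=1204") + encode_record(b"brake_temp=71")
torn = log[:-5]
print(replay(log))
print(replay(torn))
```

Replaying the truncated log keeps the first record and drops the torn one, which is the "ignore or replace" behavior described above. Keeping each record at or below the NAND page size limits how much data a single interrupted write can affect.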
Controller adaptive threshold voltage tuning further enhances the ability of the controller to recover last page data. To retain data that could otherwise be lost due to the dielectric leakage caused by excessive P/E cycling, the controller may also periodically refresh data in the memory cells.
By implementing all of the aforementioned features, Greenliant’s industrial eMMC 5.1 SSDs with Micron’s 3D MLC NAND have successfully passed extensive power-fail testing (several thousand power-interrupt cycles) without data corruption in the brake management system.