The trends and challenges of high-capacity SSDs

Editor: Linda Zeng | Updated: 2019-10-27

The driving forces behind large-capacity SSDs

SSDs are significantly more reliable than mechanical hard disks because they contain none of an HDD's moving parts. The annual failure rate (AFR) of SSDs is usually below 1%, while HDDs typically fail at a rate of 4% to 5% per year.

Intel's data, based on 2-3 million SSDs, shows a failure rate below 0.2% since 2013; the 0.8% spike in the early AFR of the DC series SSDs was caused by teething problems in that product line.

While maintaining high reliability, SSDs also deliver a qualitative leap in performance over HDDs. Data centers, as major consumers of storage, constantly pursue efficiency and lower TCO (Total Cost of Ownership), and SSDs satisfy this demand in the following two ways.

1 High density

The increase in SSD capacity allows a storage array of a given size to be built from fewer drives. For example, mainstream enterprise SSDs used to top out at 3.2 TB, so 16 of them were needed to form a 50 TB storage array. With 15.3 TB SSDs, only 4 drives are required.
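A back-of-the-envelope check of the drive counts above (a minimal sketch in Python; the simple ceiling division ignores RAID overhead and formatted-capacity loss):

import math

def drives_needed(target_capacity_tb, drive_capacity_tb):
    # Number of drives needed to reach the target capacity,
    # ignoring RAID/parity overhead and formatting loss.
    return math.ceil(target_capacity_tb / drive_capacity_tb)

print(drives_needed(50, 3.2))   # 16 drives with 3.2 TB SSDs
print(drives_needed(50, 15.3))  # 4 drives with 15.3 TB SSDs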

High density does not just mean capacity; it also means high performance density. A system built from a dozen or so SSDs usually reaches 400,000 to 500,000 IOPS overall, which means it can carry more applications. Virtualization has strengthened this trend: a single storage system is fully capable of supporting enterprise virtual desktops, core databases, and web applications at the same time.

From a cost perspective, high density means higher application-server utilization (the NRDC 2014 data center efficiency assessment report found that CPU utilization of data center servers is only 10% to 15%) and fewer servers; it also means lower license fees for software and middleware, such as Oracle licenses. Most critical of all, it improves business response times: large report-analysis jobs drop from days to hours, virtual desktop start-up in a boot-storm scenario drops from tens of minutes to about a minute, and e-commerce order processing drops from minutes to seconds.

2 Lower operation and maintenance costs

Large-capacity SSDs effectively reduce the floor space occupied in the data center and lower system power consumption. The power consumption of SSDs with different interfaces is as follows:

Power consumption stays essentially flat as SSD capacity grows, because performance does not change much. For a storage system of the same capacity, using large-capacity SSDs therefore reduces power draw significantly, and it also shrinks the footprint. For a storage system with 100 TB of usable capacity, the SSD power and space consumption compare as follows (space is reduced by 75%, and overall SSD power consumption is also reduced by 75%):
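The 75% figure follows directly from the drive count, since per-drive power stays roughly flat across capacities as noted above. A rough sketch, where the 12 W per-drive figure is an assumed placeholder rather than a number from this article:

import math

def array_power(usable_tb, drive_tb, watts_per_drive=12.0):
    # Drive count and total SSD power for a given usable capacity.
    # watts_per_drive is an assumed, illustrative value.
    n = math.ceil(usable_tb / drive_tb)
    return n, n * watts_per_drive

small = array_power(100, 3.2)    # (32 drives, 384.0 W)
large = array_power(100, 15.3)   # (7 drives, 84.0 W)
print(1 - large[1] / small[1])   # ~0.78, roughly the 75% reduction cited above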

The challenges of system availability 

A larger SSD means a bigger impact when a single drive fails. In the earlier example, when 3.2 TB SSDs form a 50 TB storage array, each SSD holds on average about 6% of the system's data, so about 6% of the data must be reconstructed when one SSD fails. With 15.3 TB SSDs, each drive holds 25% of the data, so up to 25% must be rebuilt after a failure. The time required for reconstruction, and with it the risk of data loss, grows several times over.

In fact, the problem of growing RAID rebuild times has existed on the HDD side ever since large-capacity disks such as 4 TB and 6 TB appeared, as single-disk capacity kept increasing. Arrays built from SSDs face the same problem now that SSD capacity has caught up with, and even surpassed, that of hard disks.

An urgent need to introduce a new generation of data protection technology

Traditional RAID requires a designated hot-spare disk in the array. When a disk fails and data must be reconstructed, the write load concentrates on that hot spare, which becomes the bottleneck of the rebuild. As a result, rebuild time grows linearly with the amount of data to be reconstructed.

If traditional RAID is still used when disk or SSD capacity reaches 10 TB or more, you face long rebuild times and insufficient system reliability. This also explains why large-capacity drives are adopted so cautiously in data-intensive applications such as video surveillance storage, where most deployments still use HDDs of at most 4 TB. A simple calculation: for a 10 TB HDD rebuilt at 115 MB/s, reconstruction takes 10 TB / 115 MB/s ≈ 25 hours. Over such a long window, the array risks losing data if another disk fails in the meantime. Solving this problem requires work at three levels.
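The 25-hour figure can be reproduced in a few lines (a minimal sketch that assumes a constant 115 MB/s rebuild rate and counts the terabytes in binary units, as the arithmetic above does):

def rebuild_hours(capacity_tb, speed_mb_s=115.0):
    # Estimated single-spare rebuild time at a constant rebuild bandwidth.
    capacity_mb = capacity_tb * 1024 * 1024
    return capacity_mb / speed_mb_s / 3600

print(rebuild_hours(10))    # ~25.3 hours for a 10 TB HDD
print(rebuild_hours(15.3))  # ~38.8 hours for a 15.3 TB SSD at the same rate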

Improve the reliability of large-capacity drives

On the mechanical side, HDDs benefit from new recording technologies such as HAMR, BPMR, and HDMR, which push capacity toward 20-40 TB per drive. However, their failures remain unpredictable (SMART messages have essentially no predictive value), and once a drive fails, restoring the data takes a long time and severely affects business I/O. From this perspective, the mechanical disk is approaching the end of the road.

For SSDs, the failure rate is low and predictable because there are no mechanical parts; the dominant factor is cell wear. When introducing a large-capacity SSD, several measures help (see the sketch after this list):

• Replace damaged blocks using more OP (over-provisioning) space, or even shrink the visible SSD capacity to keep bad-block replacement possible;

• Correct NAND flash bit errors with stronger error correction codes such as LDPC;

• Reduce cell wear by reducing the amount of data written, for example through compression;

• Reduce cell wear by lowering SSD write amplification (WA), for example through multi-stream identification.
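The last two points both act on the same quantity, the write amplification factor: the ratio of NAND writes to host writes. A rough sketch of how compression and lower write amplification stretch drive life (the capacity, P/E cycle count, and workload figures are assumed placeholders, not vendor data):

def drive_life_years(capacity_tb, pe_cycles, host_writes_tb_per_day,
                     write_amplification=1.0, compression_ratio=1.0):
    # Rough endurance estimate: total NAND write budget divided by the
    # effective daily NAND write volume. All inputs are illustrative.
    total_nand_budget_tb = capacity_tb * pe_cycles
    daily_nand_writes_tb = host_writes_tb_per_day / compression_ratio * write_amplification
    return total_nand_budget_tb / daily_nand_writes_tb / 365

# 15.3 TB drive, 3,000 P/E cycles, 10 TB of host writes per day
print(drive_life_years(15.3, 3000, 10, write_amplification=3.0))  # ~4.2 years
print(drive_life_years(15.3, 3000, 10, write_amplification=1.5,
                       compression_ratio=2.0))                     # ~16.8 years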

In addition, some array vendors go further and handle SSD errors through proprietary interfaces, adding finer-grained monitoring, life-expectancy estimation, and error handling. These custom SSDs typically offer capacities of 8 TB, 52 TB, and more.

Improve array data redundancy rate

When single-disk or single-SSD capacity reaches several terabytes, a RAID layout that tolerates the loss of only one drive is no longer enough, because the required rebuild takes so long. If another SSD fails during that long rebuild window, the entire array cannot be recovered. Therefore, at least RAID 6-level protection, tolerating the loss of two drives, must be provided so that the system still has basic protection while data is being reconstructed.

Even RAID 6-level protection becomes insufficient once single-drive capacity reaches 10 TB and the rebuild time after a failure grows several times over, so erasure coding with more parity drives must be introduced to maintain reliability.

Raising data redundancy brings another problem: the impact on performance. The read-modify-write pattern of erasure coding imposes a heavy write penalty, and the usual remedy, aggregating writes into full-stripe writes, introduces system-level garbage collection, whose performance drops sharply when capacity utilization is high.
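A small sketch makes the write penalty concrete (the stripe geometry is an assumed k+m layout, and the I/O counts are the textbook read-modify-write cost rather than any vendor's implementation):

def small_write_ios(m_parity):
    # Read-modify-write cost of updating one data chunk in a k+m stripe:
    # read the old data chunk and each old parity chunk, write them all back.
    return {"reads": 1 + m_parity, "writes": 1 + m_parity}

def full_stripe_write_ios(k_data, m_parity):
    # Full-stripe write: no reads, but every chunk in the stripe is written.
    return {"reads": 0, "writes": k_data + m_parity}

print(small_write_ios(2))           # RAID 6-like: 3 reads + 3 writes per small update
print(full_stripe_write_ios(8, 2))  # 8+2 stripe: 0 reads + 10 writes per full stripe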

Flash-oriented storage architecture

The traditional RAID architecture should give way to a new, decentralized one in which the rebuild load is spread across all disks instead of being funneled into a single hot spare. Doing so shortens array rebuild time and dramatically improves system efficiency.
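To see why spreading the rebuild helps, compare the write bandwidth available in the two layouts (a minimal sketch using the same constant per-drive rebuild rate assumed earlier):

def parallel_rebuild_hours(capacity_tb, per_drive_mb_s, writer_drives):
    # Rebuild time when the reconstructed data is written by writer_drives
    # drives in parallel instead of a single hot spare.
    capacity_mb = capacity_tb * 1024 * 1024
    return capacity_mb / (per_drive_mb_s * writer_drives) / 3600

print(parallel_rebuild_hours(15.3, 115, 1))   # ~38.8 h writing to one hot spare
print(parallel_rebuild_hours(15.3, 115, 15))  # ~2.6 h when 15 surviving drives share the writes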

A more innovative approach is a share-nothing architecture in which data redundancy comes from replication. All drives take part in the rebuild concurrently: the redundant copies are simply read and written to a new location. RAID 5/RAID 6/erasure-code schemes, by contrast, must first read the surviving data, recompute the lost data, and then write it out, which is inefficient, consumes a lot of compute resources, and disturbs application I/O. Replication also improves write performance, because it avoids the write-penalty impact of RAID/EC.
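A rough comparison of the rebuild work per lost chunk under the two schemes (the k+m geometry is assumed for illustration; real systems vary):

def rebuild_cost_per_chunk(scheme, k_data=8):
    # Replication reads one surviving copy and rewrites it; k+m erasure
    # coding reads k surviving chunks and must decode before writing.
    if scheme == "replication":
        return {"chunks_read": 1, "decode_needed": False}
    if scheme == "erasure_code":
        return {"chunks_read": k_data, "decode_needed": True}
    raise ValueError(scheme)

print(rebuild_cost_per_chunk("replication"))             # read 1 copy, no decoding
print(rebuild_cost_per_chunk("erasure_code", k_data=8))  # read 8 chunks, then decode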

Some of the industry's innovative storage vendors are going further, solving performance and reliability problems through close cooperation between the storage controller and the SSDs. The general direction is for custom SSDs to expose more internal information to the storage controller, which can then take part in the SSD's flash management more directly. Performance and reliability improve further as a result, which in turn drives SSD capacities even larger.