Handing Data Processing over to the SSD Is Currently the Hottest Form of Computational Storage
Computational storage is a new development trend. What is it? Computational storage means handing the task of data processing over to the storage layer instead of moving the data to main memory and processing it on the host CPU.
Whenever a new trend emerges, innovative startups appear and develop different technical solutions along different technical routes. Computational storage, like many new types of technology, still lacks standards.
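To make the difference concrete, here is a minimal sketch in Python that simulates both paths and counts the bytes that would cross the host interface. Everything in it is an assumption made for illustration: the names `host_side_filter`, `in_storage_filter` and `TransferStats` are hypothetical, the "drive" is just a Python list, and no vendor API is involved.

```python
from dataclasses import dataclass


@dataclass
class TransferStats:
    bytes_over_host_link: int  # bytes that had to cross the host interface
    matches: list              # records that satisfied the predicate


def host_side_filter(records, predicate):
    """Conventional path: every record is moved to the host, then filtered."""
    moved = sum(len(r) for r in records)             # all data is shipped
    matches = [r for r in records if predicate(r)]   # host CPU does the work
    return TransferStats(moved, matches)


def in_storage_filter(records, predicate):
    """Computational-storage path: the (simulated) drive filters first."""
    matches = [r for r in records if predicate(r)]   # runs next to the data
    moved = sum(len(r) for r in matches)             # only results are shipped
    return TransferStats(moved, matches)


# Hypothetical data set: 1,000 log records, one in a hundred is an error.
records = [
    (b"ERROR" if i % 100 == 0 else b"INFO ") + b" payload-%04d" % i
    for i in range(1000)
]


def is_error(record):
    return record.startswith(b"ERROR")


host = host_side_filter(records, is_error)
drive = in_storage_filter(records, is_error)
print("host-side filter moved:", host.bytes_over_host_link, "bytes")
print("in-storage filter moved:", drive.bytes_over_host_link, "bytes")
```

Both paths return the same matches; only the amount of data that has to leave the drive differs, which is the point of the approach.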
To address this, SNIA has set up a technical working group dedicated to computational storage, co-chaired by NGD and SK Hynix, with more than 20 companies involved.
A report from 451 Research gives a brief introduction to computational storage and will appear on SNIA's official website.
The report, which briefly introduces a few innovative companies, is also available on the Blocks & Files website.
NGD implements in-place data processing in a simple way: it integrates an ARM Cortex-A53 processor into the NVMe SSD controller.
Data still has to be moved from the NAND flash chips to the processor, but this is done over a Common Flash Interface (CFI) that has three to six times the bandwidth of the host interface.
The advantage of this approach is that the processor runs a standard operating system (such as Ubuntu Linux), which means that any software that runs on Ubuntu can be used for in-place computing, and the NGD drive itself can still be used as a standard SSD.
NGD does not say specifically how much performance improves. It says only that, compared with the previous generation of hardware, image recognition is two orders of magnitude faster and Hadoop's data processing efficiency improves by more than 40%.
Samsung announced SmartSSD in October 2018, describing it as an intelligent subsystem rather than a storage device. A server fitted with multiple SmartSSDs resembles a computing cluster.
Each SmartSSD combines Samsung's 3D V-NAND TLC flash memory with a Xilinx Zynq FPGA, which includes ARM cores, as the computing unit. Samsung's SmartSSD is aimed at two types of workloads: data analysis on the one hand, and storage-related tasks such as data compression, deduplication, and encryption on the other.
Unlike NGD, SmartSSD can't run standard software. Samsung and Xilinx jointly developed a runtime library to run some software.
It is understood that these devices are currently being tested by hyperscale data center operators and storage system manufacturers.
Bigstream, a developer of data analysis and machine learning software, showed a demo that ran Apache Spark on Samsung SmartSSDs and improved performance by a factor of three to five.
ScaleFlux's CSS SSD also combines data storage with data processing. The CSS 1000 series, currently on sale as a PCIe card or U.2 drive, has a capacity of 2TB to 8TB, and the third generation will be available later this year.
Each CSS SSD is also based on a Xilinx FPGA, which processes data and acts as the flash controller. The CSS SSD is first integrated into the host server and storage environment through the ScaleFlux software module, and the computing functions can then be accessed through the APIs that the module exposes.
In addition, the flash translation layer (FTL) that would normally sit in the SSD controller has been taken out and placed in the software module. This means that it takes up a portion of the host's CPU resources, but ScaleFlux believes that running the FTL as system software has inherent advantages, such as allowing optimizations that adapt it to a particular workload.
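As a rough illustration of what an FTL running on the host has to keep track of, the following Python sketch implements a toy page-mapped translation layer. It is not ScaleFlux's design: the class name `HostFTL`, the append-only write pointer and the absence of garbage collection and wear leveling are all simplifying assumptions made for this example.

```python
class HostFTL:
    """Toy page-mapped flash translation layer kept in host memory.

    Hypothetical example; a production FTL also needs garbage collection,
    wear leveling and crash-safe persistence of the mapping table.
    """

    def __init__(self, num_physical_pages):
        self.l2p = {}                                # logical page -> physical page
        self.flash = [None] * num_physical_pages     # simulated NAND pages
        self.valid = [False] * num_physical_pages    # which pages hold live data
        self.next_free = 0                           # append-only write pointer

    def write(self, logical_page, data):
        if self.next_free >= len(self.flash):
            raise RuntimeError("no free pages; a real FTL would garbage-collect")
        old = self.l2p.get(logical_page)
        if old is not None:
            self.valid[old] = False   # NAND cannot be overwritten in place
        phys = self.next_free
        self.flash[phys] = data       # out-of-place write to a fresh page
        self.valid[phys] = True
        self.l2p[logical_page] = phys
        self.next_free += 1
        return phys

    def read(self, logical_page):
        # Raises KeyError if the logical page was never written.
        return self.flash[self.l2p[logical_page]]


ftl = HostFTL(num_physical_pages=4)
ftl.write(7, b"first version")
ftl.write(7, b"second version")   # remapped to a new physical page
print(ftl.read(7), ftl.valid)
```

The mapping table and the bookkeeping on every write are the part that consumes host CPU and memory; because they live in software, the placement policy can in principle be tuned per workload.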
Unfortunately, offloading work from the server to the CSS SSD requires code changes, but ScaleFlux provides off-the-shelf code packages to speed up application migration. These support Aerospike, Apache HBase, Hadoop and MySQL, the OpenZFS file system and the Ceph storage system.
ScaleFlux said that China's Alibaba intends to use the CSS SSD to accelerate PolarDB, a transactional and analytical database built by Alibaba. It is understood that, in order to use the drive, Alibaba changed its code and adapted it to the API provided by ScaleFlux.
Eideticom's NoLoad is an unusual take on computational storage. It is a 2.5-inch U.2 NVMe device in an SSD form factor, with a Xilinx FPGA accelerator and a small amount of memory inside.
Eideticom takes advantage of the PCIe bus, which can move data between NoLoad accelerators and NVMe SSD storage quickly, with little or no host CPU involvement.
The advantage is that, because the computational portion and the storage portion are separate, each can be scaled independently.
An earlier Eideticom demo connected 18 SSDs, totaling 160GB, to six NoLoads, which compressed the data on the drives while host CPU usage stayed below 5%.
According to Eideticom, a single NoLoad can compress or decompress data at more than 3GB/sec. Eideticom's main use cases are therefore data compression and deduplication, and it plans to strengthen its ability to accelerate data analysis in the future.
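The figures quoted above allow a back-of-the-envelope estimate. The sketch below assumes that throughput scales linearly across accelerators and that nothing else (the SSDs, the PCIe fabric) is the bottleneck; neither assumption comes from Eideticom, and the function name `compress_time_seconds` is made up for this example.

```python
def compress_time_seconds(data_gb, accelerators, gb_per_sec_each):
    """Idealized time to compress data_gb, assuming perfect linear scaling."""
    return data_gb / (accelerators * gb_per_sec_each)


# Numbers from the text: 160GB of data, six NoLoads, 3GB/sec each.
estimate = compress_time_seconds(160, 6, 3.0)
print(f"about {estimate:.1f} seconds under ideal scaling")
```

Under those assumptions the whole data set would be compressed in roughly nine seconds; real systems will be slower to the extent that the assumptions do not hold.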
When the New Zealand company Nyriad first developed the NSULATE system, it was intended for processing large-scale data for the SKA radio telescope.
NSULATE is not a hardware product but a Linux block device. It is a high-performance RAID solution for large-scale storage that uses an NVIDIA GPU as the storage controller to perform erasure coding with deep parity calculation, achieving a very high level of data protection.
Of course, an expensive GPU should not be used only as a storage controller. Nyriad says that the GPU can also be used for other workloads, such as machine learning and blockchain computing.
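The idea behind erasure coding can be shown with the simplest possible case: a single XOR parity block, as in RAID 5. The Python sketch below is only that. It runs on the CPU, tolerates the loss of exactly one block, and is not Nyriad's algorithm; deep parity schemes use many parity blocks and more involved arithmetic. The function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks):
    """Return the data blocks followed by one XOR parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover(stripe, lost_index):
    """Rebuild the single block at lost_index from the surviving blocks."""
    survivors = [b for i, b in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


stripe = encode([b"abcd", b"efgh", b"ijkl"])   # 3 data blocks + 1 parity block
rebuilt = recover(stripe, lost_index=1)        # pretend block 1 was lost
print(rebuilt == b"efgh")
```

Because XOR is its own inverse, combining all surviving blocks reproduces the missing one. The parity calculation is independent for every byte position, which is the kind of data-parallel work that suits a GPU.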
Computational storage is an emerging technology, and some analysts predict that this technology will soon become popular.
Emerging workloads such as machine learning and analytics require very fast data access, and computational storage should be a good fit for them. As storage-class memory (SCM) comes into use, the capabilities of computational storage will be amplified further.