WO2019047612A1 - Flash-based storage-path-optimized key-value storage management method - Google Patents

Flash-based storage-path-optimized key-value storage management method

Info

Publication number
WO2019047612A1
WO2019047612A1 PCT/CN2018/094909 CN2018094909W WO2019047612A1 WO 2019047612 A1 WO2019047612 A1 WO 2019047612A1 CN 2018094909 W CN2018094909 W CN 2018094909W WO 2019047612 A1 WO2019047612 A1 WO 2019047612A1
Authority
WO
WIPO (PCT)
Prior art keywords
key value
flash
value storage
storage management
management system
Prior art date
Application number
PCT/CN2018/094909
Other languages
English (en)
French (fr)
Inventor
陆游游
舒继武
张佳程
Original Assignee
清华大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 清华大学
Publication of WO2019047612A1 publication Critical patent/WO2019047612A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614Improving the reliability of storage systems
    • G06F3/0616Improving the reliability of storage systems in relation to life time, e.g. increasing Mean Time Between Failures [MTBF]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629Configuration or reconfiguration of storage systems
    • G06F3/0631Configuration or reconfiguration of storage systems by allocating resources to storage systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • G06F3/064Management of blocks

Definitions

  • The present invention relates to the field of flash memory storage technology, and in particular to a flash-based, storage-path-optimized key-value storage management method.
  • Flash memory is an electrically erasable, programmable memory. Compared with traditional disk media, flash offers high read/write bandwidth, low access latency, low power consumption, and strong stability, and it has become widespread in data centers, personal computers, and mobile devices. Flash is read and written in units of pages; before a page can be rewritten it must first be erased. Erasure is performed in units of blocks, and one flash block contains several hundred flash pages. Flash cells endure only a limited number of erase operations, i.e., each flash cell has a limited lifetime. The internal structure of a flash drive also differs markedly from that of a disk: inside a flash drive, flash chips are connected to the flash controller through different channels.
  • A flash chip packages multiple flash dies, and each die can execute commands independently. Each die contains multiple planes, each with its own registers, which allows pipelined command execution across planes. Through these different levels of concurrency, an SSD provides ample access bandwidth. This property is known as the internal parallelism of flash devices.
  • Current general-purpose SSDs are built on such flash drives: a flash translation layer (FTL) manages the flash's read/write/erase behavior and internal parallelism and exposes to the software system the same read/write interface as a traditional disk.
  • The flash translation layer provides three main functions: address mapping, garbage collection, and wear leveling.
  • Address mapping records the mapping between logical addresses and physical flash addresses during out-of-place updates; garbage collection selects and erases invalidated flash pages to reclaim free space for new writes; wear leveling spreads writes evenly across flash pages to ensure reliable data storage.
  • The flash translation layer also includes ECC (Error Correction Code) checking, bad-block management, and other functions.
  • The flash translation layer manages the special properties of flash and is transparent to upper-layer software. Existing software can therefore run on SSDs without modification, saving the cost of re-developing and re-deploying software.
  • Deploying existing software, such as key-value storage management software, directly on a flash storage system nevertheless incurs many problems and extra overheads.
  • Existing key-value storage management software runs on top of a file system.
  • The file system is functionally redundant with the flash translation layer: for example, both provide space allocation, garbage collection, and addressing. These redundant functions impose additional software overhead on performance.
  • The file system and the flash translation layer also suffer from semantic isolation: the flash translation layer cannot be optimized specifically for the upper-layer key-value storage software, and the file system cannot perceive the characteristics of the underlying flash or exploit its internal parallelism.
  • Another prominent problem is that, along the entire storage path of current key-value storage systems, all software middleware was designed and developed around the characteristics of disks; it has not been optimized for the characteristics of flash and key-value storage systems and cannot exploit the performance advantages of either. There is currently little research on how to optimize the storage path for flash-based key-value storage.
  • The present invention aims to solve at least one of the technical problems described above.
  • The object of the present invention is to provide a flash-based, storage-path-optimized key-value storage management method that improves the performance of the key-value storage system, reduces the amount of data written to the flash device, and extends the device's service life.
  • To this end, an embodiment of the present invention provides a flash-based, storage-path-optimized key-value storage management method comprising the following steps. S1: a key-value storage management system directly manages a raw flash device, bypassing the file system and the flash translation layer. S2: when the key-value storage management system allocates physical space, it uses a concurrent data layout method, distributing key-value files across the different flash channels of the flash device in units of flash blocks and storing each key-value file as an integer multiple of the flash block size. S3: when the key-value storage management system compresses data, it uses a dynamic compression method, dynamically choosing the number of flash channels used to write compressed data according to the access characteristics of foreground users. S4: when the key-value storage management system caches data, it uses a compression-aware caching algorithm that does not cache compressed data; the space saved is used to cache the users' read/write data. S5: when the key-value storage management system schedules requests, it uses a priority-based scheduling policy that preferentially schedules user requests and foreground compression requests, and determines the scheduling priority of erase requests according to the available space of the current flash storage device.
  • The flash-based, storage-path-optimized key-value storage management method according to the above embodiment may further have the following additional technical features:
  • In some examples, the raw flash device exports its internal structure information directly to user space and, through a specific interface, enables the key-value storage management system to manage the raw flash device directly in user mode, bypassing the file system and flash translation layer of the traditional key-value storage management architecture; the internal structure information of the device includes at least the number of flash channels, the flash block size, and the read/write and erase controls of the flash.
  • In some examples, the concurrent data layout method specifically includes: the key-value storage management system sets the length of a key-value file according to the number of flash channels and the flash block size reported by the raw flash device; when storing a key-value file, the key-value storage management system splits the file in units of the flash block length and distributes the resulting flash blocks across different flash channels; within the flash blocks, the data of the key-value file is distributed in a round-robin manner.
  • In some examples, the dynamic compression method specifically includes: when the key-value storage management system performs background compression of key-value data, it first determines the read/write ratio of foreground user requests; when the users' write ratio is greater than a first preset ratio, it writes the compressed key-value files using all flash channels; when the read ratio of user requests is higher than a second preset ratio, it writes the compressed files using half of the flash channels; the key-value storage management system determines the read/write ratio by recording the types of user requests issued between two consecutive compression operations; for a user's foreground compression request, the key-value storage management system writes data using all flash channels.
  • In some examples, the data compression procedure specifically includes: when compressing data, the key-value storage management system first reads the key-value files to be compressed, reading a fixed-length portion of multiple files into the cache at the same time; after the key-value data in the cache has been compressed, it reads the subsequent data of the multiple files, and so on, until all files to be compressed have been read; the key-value storage management system then writes the compressed data to the flash device, and the compression process ends.
  • In some examples, the compression-aware caching algorithm specifically includes: after the compression process starts, the key-value storage management system reads the first portion of the key-value files to be compressed into the cache and compresses the data in the cache; while it is compressing, its cache starts a prefetch process that preloads the next portion of the key-value files to be compressed into the cache; after the first portion has been compressed, the key-value storage management system compresses the prefetched portion, at which point the cache evicts the already-used data of the first portion; for foreground user read/write requests, the cache of the key-value storage management system uses a caching algorithm optimized for foreground requests.
  • In some examples, the caching algorithm optimized for foreground requests includes: for foreground users' read/write requests, cache management is performed with the flash page length as the caching granularity; no prefetching is performed for foreground users' read requests; when cache space is insufficient, cached data is replaced according to the least-recently-used principle.
  • In some examples, the priority-based scheduling policy specifically includes: for foreground users' read/write requests, the key-value storage management system assigns a high priority when scheduling; for read/write requests generated by background data compression operations, it assigns a low priority when scheduling; within the same priority level, read requests are scheduled before write requests; for erase requests, the key-value storage management system dynamically adjusts the priority during scheduling according to the current usage of the flash device.
  • In some examples, dynamically adjusting the priority of erase requests specifically includes: the key-value storage management system records the available space of the current flash device; when the available space is less than a third preset ratio of the total storage space, it gives erase requests the highest priority and schedules them first; when the available space is greater than the third preset ratio of the total storage space, it gives erase requests the lowest priority and schedules them last.
  • In some examples, the third preset ratio is 40%.
  • According to the flash-based, storage-path-optimized key-value storage management method of the embodiments of the present invention, the storage architecture removes the file system and flash translation layer of the traditional key-value storage management architecture and manages the storage of key-value data on flash directly in user space, eliminating the semantic isolation and redundant management caused by the file system and flash translation layer and avoiding the extra garbage collection and write amplification they introduce.
  • For the physical storage of key-value data, the concurrent data layout method sizes each key-value file as an integer multiple of the flash block and distributes the flash blocks belonging to the same key-value file across different flash channels, exploiting the internal parallelism of flash to increase the effective bandwidth of reads and writes and to reduce read/write latency.
  • When compressing key-value data, the dynamic compression method reduces the number of channels occupied by compressed-data writes according to the foreground users' read/write ratio, which lowers the interference of background compression with foreground reads and writes.
  • When caching key-value data, the compression-aware caching algorithm preferentially evicts compressed data and uses the freed space to store users' read/write data, which effectively improves the cache hit ratio.
  • When scheduling read/write requests to the flash, the priority-based scheduling policy preferentially schedules user requests and foreground compression requests, which reduces user-visible latency. The method therefore improves the performance of the key-value storage system, reduces the amount of data written to the flash device, and extends the device's service life.
  • FIG. 1 is a flowchart of a flash-based, storage-path-optimized key-value storage management method according to an embodiment of the present invention;
  • FIG. 2 is a schematic diagram of the implementation principle of a flash-based, storage-path-optimized key-value storage management method according to an embodiment of the present invention;
  • FIG. 3 is a schematic diagram of the concurrent data layout according to an embodiment of the present invention;
  • FIG. 4 is a schematic diagram of dynamic compression according to an embodiment of the present invention;
  • FIG. 5 is a schematic diagram of the compression-aware caching algorithm according to an embodiment of the present invention;
  • FIG. 6 is a schematic diagram of the priority-based scheduling policy according to an embodiment of the present invention.
  • In the description of the present invention, it should be noted that, unless otherwise expressly specified and limited, the terms "installed", "connected", and "coupled" are to be understood broadly: a connection may be fixed, detachable, or integral; it may be mechanical or electrical; and it may be direct, indirect through an intermediate medium, or an internal communication between two components.
  • Those of ordinary skill in the art can understand the specific meanings of the above terms in the present invention according to the specific circumstances.
  • FIG. 1 is a flowchart of a flash-based, storage-path-optimized key-value storage management method according to an embodiment of the present invention. As shown in FIG. 1, the method includes the following steps.
  • Step S1: the key-value storage management system directly manages the raw flash device, bypassing the file system and the flash translation layer.
  • The raw flash device is a flash storage device from which the flash translation layer of a conventional flash SSD has been removed.
  • The raw flash device exports its internal structure information directly to user space and, through a specific interface, enables the key-value storage management system to manage the raw flash device directly in user mode, bypassing the file system and flash translation layer of the traditional key-value storage management architecture.
  • The internal structure information of the device includes at least the number of flash channels, the flash block size, and the read/write and erase controls of the flash.
  • In other words, the raw flash device includes, for example, a flash drive with the flash translation layer removed, together with a kernel-mode device driver that passes the internal information and control commands of the flash device to the user-mode key-value storage management system.
  • The key-value storage management system directly manages the raw flash device, including: allocating physical flash pages for the key-value files to be written; reclaiming used physical flash space with erase requests, a process also known as garbage collection; caching the data being read and written; and scheduling the read, write, and erase requests issued to the flash device.
  • Step S2: when the key-value storage management system allocates physical space, it uses the concurrent data layout method, distributing key-value files across the different flash channels of the flash device in units of flash blocks and storing each key-value file as an integer multiple of the flash block size.
  • The concurrent data layout method specifically includes: the key-value storage management system sets the length of a key-value file according to the number of flash channels and the flash block size reported by the raw flash device, the key-value file length being the product of the number of flash channels and the flash block length; when storing a key-value file, the key-value storage management system splits the file in units of the flash block length and distributes the resulting flash blocks across different flash channels; within the flash blocks, the data of the key-value file is distributed in a round-robin manner.
  • Step S3: when the key-value storage management system compresses data, it uses the dynamic compression method, dynamically choosing the number of flash channels used to write compressed data according to the access characteristics of foreground users.
  • The dynamic compression method specifically includes: when the key-value storage management system performs background compression of key-value data, it first determines the read/write ratio of foreground user requests; when the users' write ratio is greater than a first preset ratio (i.e., the write ratio is high), the key-value storage management system writes the compressed key-value files using all flash channels; when the read ratio of user requests is higher than a second preset ratio (i.e., the read ratio is high), the key-value storage management system writes the compressed files using half of the flash channels; the key-value storage management system determines the read/write ratio by recording the types of user requests issued between two consecutive compression operations; for a user's foreground compression request, the key-value storage management system writes data using all flash channels.
  • The method of determining the read/write ratio of foreground user requests includes: the key-value storage management system records the numbers of foreground read and write requests between two compressions; by comparing the ratio of the numbers of read requests and write requests, it can determine whether the current users have a higher read ratio or a higher write ratio.
  • The data compression procedure specifically includes: when compressing data, the key-value storage management system first reads the key-value files to be compressed, reading a fixed-length portion of multiple files into the cache at the same time; after the key-value data in the cache has been compressed, it reads the subsequent data of the multiple files, and so on, until all files to be compressed have been read; the key-value storage management system then writes the compressed data to the flash device, and the compression process ends.
  • Step S4: when the key-value storage management system caches data, it uses the compression-aware caching algorithm, which does not cache compressed data; the space saved is used to cache the users' read/write data.
  • The compression-aware caching algorithm specifically includes: after the compression process starts, the key-value storage management system reads the first portion of the key-value files to be compressed into the cache and compresses the data in the cache; while the key-value storage management system is compressing, its cache starts a prefetch process that preloads the next portion of the key-value files to be compressed into the cache; after the first portion has been compressed, the key-value storage management system compresses the prefetched portion, at which point the cache evicts the already-used data of the first portion; for foreground user read/write requests, the cache of the key-value storage management system uses a caching algorithm optimized for foreground requests.
  • The caching algorithm optimized for foreground requests includes: for foreground users' read/write requests, cache management is performed with the flash page length as the caching granularity; no prefetching is performed for foreground users' read requests; when cache space is insufficient, cached data is replaced according to the least-recently-used principle.
  • Step S5: when the key-value storage management system schedules requests, it uses the priority-based scheduling policy, preferentially scheduling user requests and foreground compression requests and determining the scheduling priority of erase requests according to the available space of the current flash storage device.
  • The priority-based scheduling policy specifically includes: for foreground users' read/write requests, the key-value storage management system assigns a high priority when scheduling; for read/write requests generated by background data compression operations, it assigns a low priority when scheduling; within the same priority level, read requests are scheduled before write requests; for erase requests, the key-value storage management system dynamically adjusts the priority during scheduling according to the current usage of the flash device.
  • Dynamically adjusting the priority of erase requests specifically includes: the key-value storage management system records the available space of the current flash device; when the available space is less than a third preset ratio of the total storage space, it gives erase requests the highest priority and schedules them first; when the available space is greater than the third preset ratio of the total storage space, it gives erase requests the lowest priority and schedules them last.
  • The third preset ratio is, for example, 40%.
  • In other words, the key-value storage management system records the available space of the current flash device; when the available space is less than 40% of the total storage space, erase requests are given the highest priority and scheduled first; when the available space is greater than 40% of the total storage space, erase requests are given the lowest priority and scheduled last.
  • In summary, the key-value storage management system is deployed directly on the raw flash device, and the raw flash device passes the internal structure information of the flash and the read, write, and erase control interfaces to the key-value storage management system through a specific interface; the key-value storage management system distributes key-value files over the physical units of the flash using the concurrent data layout method; it uses the dynamic compression method to dynamically choose the number of flash channels used for compression according to the foreground users' read/write ratio; using the compression-aware caching algorithm, it applies a prefetch-and-evict-first policy to compressed key-value data and a no-prefetch, least-recently-used replacement policy to data requested by foreground users; and it uses the priority-based scheduling policy to schedule the read, write, and erase requests of the flash device, where the priority of users' read/write requests is higher than that of background compression requests and the priority of erase requests is inversely proportional to the available space of the current flash device.
  • The raw flash device is an SSD device from which the flash translation layer has been removed; it passes the internal information and control commands of the flash device to the user-mode key-value storage management software through a specific interface.
  • The internal information of the device includes: the number of channels of the flash device, the flash block length, the flash page length, and the flash chip capacity; the control commands of the device include: read flash page, write flash page, and erase flash block.
  • With this information and these commands, the key-value storage management system can directly control space management, garbage collection, and related work on the underlying flash device without the participation of the file system or the flash translation layer, eliminating the functional redundancy and semantic isolation caused by those two layers, reducing system software overhead, and improving overall system performance.
  • The middleware functions along the entire storage path are therefore specifically optimized: for data layout and distribution, the concurrent data layout optimization is designed; for data compression, the dynamic compression method is designed; for data caching, the compression-aware caching algorithm is designed; and for request scheduling, the priority-based scheduling policy is designed.
  • In other words, the method uses the key-value storage management system to manage the underlying flash device directly, bypassing the redundancy overhead of the file system and the flash translation layer and optimizing the storage path, thereby reducing the software overhead of the key-value storage management system during data storage and improving system performance; at the same time, it reduces the amount of data written to the flash device, which helps extend the device's lifetime.
  • FIG. 2 is a schematic diagram of the implementation principle of a flash-based, storage-path-optimized key-value storage management method according to an embodiment of the present invention.
  • The functional structure of the method is divided into three parts: the user-mode key-value storage management software, the kernel-mode raw flash driver, and the raw flash device.
  • The raw flash device consists of flash chips, flash channels, and flash control firmware, with the flash translation layer of a traditional SSD removed.
  • The raw flash driver runs in the kernel mode of the operating system and is responsible for passing the internal information of the raw flash device, including the number of flash channels, the flash page size, the flash block size, the flash chip capacity, and the read, write, and erase command interfaces, to the user-mode key-value storage management software; at the same time, the raw flash driver is also responsible for forwarding the read, write, and erase requests issued by the user-mode key-value storage management software to the raw flash device.
  • Using the internal flash information and the read, write, and erase control interfaces passed by the raw flash driver, the user-mode key-value storage management software manages the underlying flash hardware directly in user mode and optimizes the storage path through the concurrent data layout, dynamic compression, the compression-aware caching algorithm, and the priority-based scheduling policy.
  • FIG. 3 is a functional diagram of the concurrent data layout.
  • The key-value storage management software sets the length of a key-value file to the length of four flash blocks distributed across four flash channels, which is called a block group.
  • When the key-value storage management system writes a key-value file, the flash blocks corresponding to the file are distributed over different flash channels. The key-value file can therefore be written concurrently into blocks on the four flash channels, so the internal parallelism of the flash device is exploited for file writes.
  • Inside a key-value file, the key-value data is distributed across the four flash blocks in a round-robin manner.
  • When contiguous data in the key-value file is read, the consecutive key-value entries are spread over the four flash channels in round-robin order, so the data is read back from the flash device concurrently, and the internal parallelism of the flash device is exploited for file reads as well.
  • Furthermore, because a key-value file and a block group have the same length, the key-value storage management system manages the flash device in units of block groups, which reduces the overhead of managing the physical flash space.
  • A functional diagram of dynamic compression is shown in FIG. 4.
  • The key-value storage management system records the numbers of foreground read and write requests between two compressions; by comparing the ratio of the numbers of read requests and write requests, it can determine whether the current users have a higher read ratio or a higher write ratio. When the users' write ratio is high, the key-value storage management system writes the compressed key-value files using all flash channels to relieve the pressure of foreground data writes; as shown by the solid line in FIG. 4, the system writes the compressed files at full concurrency. When the read ratio of user requests is high, the key-value storage management system writes the compressed files using half of the flash channels to reduce the interference of background compression with foreground read requests and to lower the latency of user reads, as shown by the dashed line (half-concurrency compression) in FIG. 4. For a user's foreground compression request, the key-value storage management system writes data using all flash channels to reduce the user's waiting latency.
  • When compressing data, the key-value storage management system first reads the key-value files to be compressed, reading a fixed-length portion of multiple files to be compressed into the cache at the same time; after the key-value data in the cache has been compressed, it reads the subsequent data of the multiple files, and so on, until all files to be compressed have been read; the key-value storage management system then writes the compressed data to the flash device, and the compression process ends.
  • The functional structure of the compression-aware caching algorithm is shown in FIG. 5.
  • After the compression process starts, the key-value storage management system reads the first portions of the multiple key-value files to be compressed into the cache and compresses the data in the cache; while it is compressing, its cache starts a prefetch process that preloads the next portions of the key-value files to be compressed into the cache; after the first portions have been compressed, the key-value storage management system compresses the prefetched portions, at which point the cache evicts the already-used data of the first portions; for foreground user read/write requests, the cache of the key-value storage management system uses a caching algorithm optimized for foreground requests.
  • Foreground read/write requests are cached at the granularity of the flash page length; no prefetching is performed for foreground read requests; when cache space is insufficient, cached data is replaced according to the least-recently-used principle.
  • The key-value storage management system schedules requests using the priority-based scheduling policy, illustrated in FIG. 6.
  • For foreground users' read/write requests, the key-value storage management system assigns a high priority when scheduling; for read/write requests generated by background data compression operations, it assigns a low priority when scheduling; within the same priority level, read requests are scheduled before write requests; for erase requests, the key-value storage management system dynamically adjusts the priority during scheduling according to the current usage of the flash device.
  • The key-value storage management system records the available space of the current flash device; when the available space is less than 40% of the total storage space, erase requests are given the highest priority and scheduled first; when the available space is greater than 40% of the total storage space, erase requests are given the lowest priority and scheduled last.
  • In summary, according to the flash-based, storage-path-optimized key-value storage management method of the embodiments of the present invention, the storage architecture removes the file system and flash translation layer of the traditional key-value storage management architecture and manages the storage of key-value data on flash directly in user space, eliminating the semantic isolation and redundant management caused by the file system and flash translation layer and avoiding the extra garbage collection and write amplification they introduce.
  • For the physical storage of key-value data, the concurrent data layout method sizes each key-value file as an integer multiple of the flash block and distributes the flash blocks belonging to the same key-value file across different flash channels, exploiting the internal parallelism of flash to increase the effective bandwidth of reads and writes and to reduce read/write latency; when compressing key-value data, the dynamic compression method reduces the number of channels occupied by compressed-data writes according to the foreground users' read/write ratio, lowering the interference of background compression with foreground reads and writes; when caching key-value data, the compression-aware caching algorithm preferentially evicts compressed data and uses the freed space to store users' read/write data, effectively improving the cache hit ratio; when scheduling read/write requests to the flash, the priority-based scheduling policy preferentially schedules user requests and foreground compression requests, reducing user-visible latency. The method therefore improves the performance of the key-value storage system, reduces the amount of data written to the flash device, and extends the device's service life.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Memory System (AREA)

Abstract

The present invention provides a flash-based, storage-path-optimized key-value storage management method, comprising: a key-value storage management system directly manages a raw flash device, bypassing the file system and the flash translation layer; when allocating physical space, key-value files are distributed across the different flash channels of the flash device in units of flash blocks, and key-value storage files are stored as integer multiples of the flash block; when compressing data, the number of flash channels used to write the compressed data is chosen dynamically according to the access characteristics of foreground users; when caching data, a compression-aware caching algorithm is used and compressed data is not cached; when scheduling requests, compression requests from users and the foreground are scheduled preferentially, and the scheduling priority of erase requests is determined according to the available space of the current flash storage device. The invention improves the performance of the key-value storage system, reduces the amount of data written to the flash device, and extends the device's service life.

Description

Flash-based storage-path-optimized key-value storage management method
Cross-Reference to Related Applications
This application claims priority to Chinese Patent Application No. 201710812821.9, filed by 清华大学 on September 11, 2017 and entitled "Flash-based storage-path-optimized key-value storage management method".
Technical Field
The present invention relates to the field of flash memory storage technology, and in particular to a flash-based, storage-path-optimized key-value storage management method.
Background
Flash memory is an electrically erasable, programmable memory. Compared with traditional disk media, flash offers high read/write bandwidth, low access latency, low power consumption, and strong stability, and it has become widespread in data centers, personal computers, and mobile devices. Flash is read and written in units of pages; before a page can be rewritten, it must first be erased. Erasure is performed in units of blocks, and one flash block contains several hundred flash pages. Flash cells endure only a limited number of program/erase operations, i.e., each flash cell has a limited lifetime. The internal structure of a flash drive also differs significantly from that of a disk. Inside a flash drive, flash chips are connected to the flash controller through different channels. A flash chip packages multiple flash dies, and each die can execute commands independently. Each die contains multiple planes, each with its own registers, which allows pipelined command execution across planes. Through these different levels of concurrency, an SSD can provide ample access bandwidth. This property is known as the internal parallelism of flash devices.
Current general-purpose SSDs are built on top of such flash drives: a flash translation layer manages the read/write/erase behavior and internal parallelism of the flash and exposes to the software system the same read/write interface as a traditional disk. The flash translation layer provides three main functions: address mapping, garbage collection, and wear leveling. Address mapping records the mapping between logical addresses and physical flash addresses during out-of-place updates; garbage collection selects and erases invalidated flash pages to reclaim free space for new writes; wear leveling spreads writes evenly across flash pages to ensure reliable data storage. In addition, the flash translation layer includes ECC (Error Correction Code) checking, bad-block management, and other functions.
The flash translation layer manages the special properties of flash and is transparent to upper-layer software. Existing software can therefore run on SSDs without modification, saving the cost of re-developing and re-deploying software. However, deploying existing software, such as key-value storage management software, directly on a flash storage system still incurs many problems and extra overheads. Existing key-value storage management software runs on top of a file system. The file system is functionally redundant with the flash translation layer; for example, both provide space allocation, garbage collection, and addressing. These redundant functions impose additional software overhead on performance. At the same time, the file system and the flash translation layer suffer from semantic isolation: the flash translation layer cannot be optimized specifically for the upper-layer key-value storage software, while the file system cannot perceive the characteristics of the underlying flash or exploit its internal parallelism. Another prominent problem is that, along the entire storage path of current key-value storage systems, all software middleware was designed and developed around the characteristics of disks; it has not been optimized for the characteristics of flash and key-value storage systems and cannot exploit the performance advantages of either. There is currently little research on how to optimize the storage path for flash-based key-value storage.
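By way of illustration only (this sketch is not part of the patent, and all names in it are hypothetical), the following Python fragment shows the two flash translation layer functions that overlap with a file system, namely page-level address mapping for out-of-place updates and a free-space check of the kind that triggers garbage collection:

```python
# Toy illustration (hypothetical, not from the patent) of FTL-style
# out-of-place updates: a logical-to-physical page map plus tracking of
# invalidated pages that garbage collection would later erase block by block.
class ToyFTL:
    def __init__(self, num_blocks, pages_per_block):
        self.free_pages = [(b, p) for b in range(num_blocks)
                           for p in range(pages_per_block)]
        self.l2p = {}          # logical page -> (block, page)
        self.invalid = set()   # physical pages holding stale data

    def write(self, logical_page, data):
        # Out-of-place update: the old physical page is only marked invalid;
        # it is reclaimed later, one whole block at a time (garbage collection).
        if logical_page in self.l2p:
            self.invalid.add(self.l2p[logical_page])
        phys = self.free_pages.pop(0)
        self.l2p[logical_page] = phys
        return phys

    def needs_gc(self, threshold=0.25):
        # Trigger garbage collection when free space runs low.
        total = len(self.free_pages) + len(self.l2p) + len(self.invalid)
        return total > 0 and len(self.free_pages) / total < threshold
```

The file system on top of the FTL repeats very similar bookkeeping, which is exactly the redundancy the method described below removes.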
Summary of the Invention
The present invention aims to solve at least one of the technical problems described above.
To this end, the object of the present invention is to provide a flash-based, storage-path-optimized key-value storage management method that improves the performance of the key-value storage system, reduces the amount of data written to the flash device, and extends the device's service life.
To achieve the above object, an embodiment of the present invention provides a flash-based, storage-path-optimized key-value storage management method comprising the following steps. S1: a key-value storage management system directly manages a raw flash device, bypassing the file system and the flash translation layer. S2: when the key-value storage management system allocates physical space, it uses a concurrent data layout method, distributing key-value files across the different flash channels of the flash device in units of flash blocks, and storing each key-value file as an integer multiple of the flash block size. S3: when the key-value storage management system compresses data, it uses a dynamic compression method, dynamically choosing the number of flash channels used to write compressed data according to the access characteristics of foreground users. S4: when the key-value storage management system caches data, it uses a compression-aware caching algorithm that does not cache compressed data; the space saved is used to cache the users' read/write data. S5: when the key-value storage management system schedules requests, it uses a priority-based scheduling policy that preferentially schedules user requests and foreground compression requests, and determines the scheduling priority of erase requests according to the available space of the current flash storage device.
In addition, the flash-based, storage-path-optimized key-value storage management method according to the above embodiment of the present invention may further have the following additional technical features.
In some examples, in S1, the raw flash device exports its internal structure information directly to user space and, through a specific interface, enables the key-value storage management system to manage the raw flash device directly in user mode, bypassing the file system and flash translation layer of the traditional key-value storage management architecture, wherein the internal structure information of the device includes at least the number of flash channels, the flash block size, and the read/write and erase controls of the flash.
In some examples, in S2, the concurrent data layout method specifically includes: the key-value storage management system sets the length of a key-value file according to the number of flash channels and the flash block size reported by the raw flash device; when storing a key-value file, the key-value storage management system splits the file in units of the flash block length and distributes the resulting flash blocks across different flash channels; within the flash blocks, the data of the key-value file is distributed in a round-robin manner.
In some examples, in S3, the dynamic compression method specifically includes: when the key-value storage management system performs background compression of key-value data, it first determines the read/write ratio of foreground user requests; when the users' write ratio is greater than a first preset ratio, the key-value storage management system writes the compressed key-value files using all flash channels; when the read ratio of user requests is higher than a second preset ratio, the key-value storage management system writes the compressed files using half of the flash channels; the key-value storage management system determines the read/write ratio by recording the types of user requests issued between two consecutive compression operations; for a user's foreground compression request, the key-value storage management system writes data using all flash channels.
In some examples, the data compression procedure specifically includes: when compressing data, the key-value storage management system first reads the key-value files to be compressed, reading a fixed-length portion of multiple files into the cache at the same time; after the key-value data in the cache has been compressed, it reads the subsequent data of the multiple files, and so on, until all files to be compressed have been read; the key-value storage management system then writes the compressed data to the flash device, and the compression process ends.
In some examples, in S4, the compression-aware caching algorithm specifically includes: after the compression process starts, the key-value storage management system reads the first portion of the key-value files to be compressed into the cache and compresses the data in the cache; while the key-value storage management system is compressing, its cache starts a prefetch process that preloads the next portion of the key-value files to be compressed into the cache; after the first portion has been compressed, the key-value storage management system compresses the prefetched portion, at which point the cache evicts the already-used data of the first portion; for foreground user read/write requests, the cache of the key-value storage management system uses a caching algorithm optimized for foreground requests.
In some examples, when caching foreground user requests, the caching algorithm optimized for foreground requests includes: for foreground users' read/write requests, cache management is performed with the flash page length as the caching granularity; no prefetching is performed for foreground users' read requests; when cache space is insufficient, cached data is replaced according to the least-recently-used principle.
In some examples, in S5, the priority-based scheduling policy specifically includes: for foreground users' read/write requests, the key-value storage management system assigns a high priority when scheduling; for read/write requests generated by background data compression operations, it assigns a low priority when scheduling; within the same priority level, read requests are scheduled before write requests; for erase requests, the key-value storage management system dynamically adjusts the priority during scheduling according to the current usage of the flash device.
In some examples, dynamically adjusting the priority of erase requests specifically includes: the key-value storage management system records the available space of the current flash device; when the available space is less than a third preset ratio of the total storage space, the key-value storage management system gives erase requests the highest priority and schedules them first; when the available space is greater than the third preset ratio of the total storage space, the key-value storage management system gives erase requests the lowest priority and schedules them last.
In some examples, the third preset ratio is 40%.
According to the flash-based, storage-path-optimized key-value storage management method of the embodiments of the present invention, the storage architecture removes the file system and flash translation layer of the traditional key-value storage management architecture and manages the storage of key-value data on flash directly in user space, eliminating the semantic isolation and redundant management caused by the file system and flash translation layer and avoiding the extra garbage collection and write amplification they introduce. For the physical storage of key-value data, the concurrent data layout method sizes each key-value file as an integer multiple of the flash block and distributes the flash blocks belonging to the same key-value file across different flash channels, exploiting the internal parallelism of flash to increase the effective bandwidth of reads and writes and to reduce read/write latency. When compressing key-value data, the dynamic compression method reduces the number of channels occupied by compressed-data writes according to the foreground users' read/write ratio, which lowers the interference of background compression with foreground reads and writes. When caching key-value data, the compression-aware caching algorithm preferentially evicts compressed data and uses the freed space to store users' read/write data, which effectively improves the cache hit ratio. When scheduling read/write requests to the flash, the priority-based scheduling policy preferentially schedules user requests and foreground compression requests, which reduces user-visible latency. The method therefore improves the performance of the key-value storage system, reduces the amount of data written to the flash device, and extends the device's service life.
Additional aspects and advantages of the present invention will be given in part in the following description, and will in part become apparent from the description or be learned through practice of the invention.
Brief Description of the Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments in conjunction with the accompanying drawings, in which:
FIG. 1 is a flowchart of a flash-based, storage-path-optimized key-value storage management method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the implementation principle of a flash-based, storage-path-optimized key-value storage management method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the concurrent data layout according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of dynamic compression according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the compression-aware caching algorithm according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of the priority-based scheduling policy according to an embodiment of the present invention.
Detailed Description
Embodiments of the present invention are described in detail below; examples of the embodiments are shown in the accompanying drawings, in which identical or similar reference numerals denote identical or similar elements or elements with identical or similar functions throughout. The embodiments described below with reference to the drawings are exemplary and are intended only to explain the present invention; they should not be construed as limiting it.
In the description of the present invention, it should be understood that orientation or positional terms such as "center", "longitudinal", "transverse", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", and "outer" indicate orientations or positional relationships based on the drawings; they are used only for convenience and simplicity of description and do not indicate or imply that the referenced device or element must have a particular orientation or be constructed and operated in a particular orientation, and therefore cannot be construed as limiting the present invention. Furthermore, the terms "first" and "second" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance.
In the description of the present invention, it should be noted that, unless otherwise expressly specified and limited, the terms "installed", "connected", and "coupled" are to be understood broadly: a connection may be fixed, detachable, or integral; it may be mechanical or electrical; and it may be direct, indirect through an intermediate medium, or an internal communication between two elements. Those of ordinary skill in the art can understand the specific meanings of the above terms in the present invention according to the specific circumstances.
A flash-based, storage-path-optimized key-value storage management method according to embodiments of the present invention is described below with reference to the accompanying drawings.
FIG. 1 is a flowchart of a flash-based, storage-path-optimized key-value storage management method according to an embodiment of the present invention. As shown in FIG. 1, the method comprises the following steps.
Step S1: the key-value storage management system directly manages the raw flash device, bypassing the file system and the flash translation layer.
Specifically, in step S1, the raw flash device is a flash storage device from which the flash translation layer of a conventional flash SSD has been removed. The raw flash device exports its internal structure information directly to user space and, through a specific interface, enables the key-value storage management system to manage the raw flash device directly in user mode, bypassing the file system and flash translation layer of the traditional key-value storage management architecture. The internal structure information of the device includes at least the number of flash channels, the flash block size, and the read/write and erase controls of the flash.
In other words, the raw flash device includes, for example, a flash drive with the flash translation layer removed, together with a kernel-mode device driver that passes the internal information and control commands of the flash device to the user-mode key-value storage management system. The key-value storage management system directly manages the raw flash device, including: allocating physical flash pages for the key-value files to be written; reclaiming used physical flash space with erase requests, a process also known as garbage collection; caching the data being read and written; and scheduling the read, write, and erase requests issued to the flash device.
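As an illustration of the kind of geometry information and read/write/erase controls such a raw flash device might export to user space, here is a minimal sketch; the class and method names are assumptions made for this example and are not defined by the patent.

```python
from dataclasses import dataclass

# Hypothetical sketch of the geometry and control interface a raw flash
# device (no FTL) could expose to a user-mode key-value store.
@dataclass
class FlashGeometry:
    num_channels: int      # number of flash channels
    block_size: int        # bytes per flash block
    page_size: int         # bytes per flash page
    chip_capacity: int     # bytes per flash chip

class RawFlashDevice:
    def __init__(self, geometry: FlashGeometry):
        self.geometry = geometry
        blocks = geometry.chip_capacity // geometry.block_size
        pages = geometry.block_size // geometry.page_size
        # One in-memory "chip" per channel, addressed by (block, page).
        self.channels = [{(b, p): None for b in range(blocks) for p in range(pages)}
                         for _ in range(geometry.num_channels)]

    def read_page(self, channel, block, page):
        return self.channels[channel][(block, page)]

    def write_page(self, channel, block, page, data):
        # Flash semantics: a page can only be written after its block was erased.
        assert self.channels[channel][(block, page)] is None, "page must be erased first"
        self.channels[channel][(block, page)] = data

    def erase_block(self, channel, block):
        pages = self.geometry.block_size // self.geometry.page_size
        for p in range(pages):
            self.channels[channel][(block, p)] = None
```

For example, a device with 4 channels, 256 KiB blocks, 4 KiB pages, and 64 MiB chips could be modeled as RawFlashDevice(FlashGeometry(4, 256 * 1024, 4 * 1024, 64 * 1024 * 1024)).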
Step S2: when the key-value storage management system allocates physical space, it uses the concurrent data layout method, distributing key-value files across the different flash channels of the flash device in units of flash blocks, and storing each key-value file as an integer multiple of the flash block size.
Specifically, in step S2, the concurrent data layout method includes: the key-value storage management system sets the length of a key-value file according to the number of flash channels and the flash block size reported by the raw flash device, the key-value file length being the product of the number of flash channels and the flash block length; when storing a key-value file, the key-value storage management system splits the file in units of the flash block length and distributes the resulting flash blocks across different flash channels; within the flash blocks, the data of the key-value file is distributed in a round-robin manner.
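A minimal sketch of how such a layout could be computed is given below; it assumes that the key-value file length equals the number of channels times the flash block size (one block group) and that consecutive pages are assigned to channels in round-robin order. The helper name and the example geometry are illustrative only.

```python
# Sketch of the concurrent data layout: a key-value file spans one block per
# channel (a "block group"), and consecutive pages go to channels round-robin,
# so reads and writes of contiguous data hit all channels in parallel.
def block_group_layout(file_bytes, num_channels, block_size, page_size):
    assert file_bytes == num_channels * block_size, "file must fill one block group"
    pages_per_block = block_size // page_size
    total_pages = file_bytes // page_size
    assert total_pages == num_channels * pages_per_block
    layout = []  # one (channel, page_within_block) entry per logical page
    for logical_page in range(total_pages):
        channel = logical_page % num_channels          # round-robin over channels
        page_in_block = logical_page // num_channels   # next page in that channel's block
        layout.append((channel, page_in_block))
    return layout

# Example: 4 channels, 4 KiB pages, 256 KiB blocks -> a 1 MiB block group.
print(block_group_layout(1 << 20, 4, 256 * 1024, 4 * 1024)[:8])
```

The first eight logical pages land on channels 0, 1, 2, 3, 0, 1, 2, 3, which is what lets a sequential read or write of the file use all channels at once.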
Step S3: when the key-value storage management system compresses data, it uses the dynamic compression method, dynamically choosing the number of flash channels used to write compressed data according to the access characteristics of foreground users.
Specifically, in step S3, the dynamic compression method includes: when the key-value storage management system performs background compression of key-value data, it first determines the read/write ratio of foreground user requests; when the users' write ratio is greater than a first preset ratio (i.e., the write ratio is high), the key-value storage management system writes the compressed key-value files using all flash channels; when the read ratio of user requests is higher than a second preset ratio (i.e., the read ratio is high), the key-value storage management system writes the compressed files using half of the flash channels. The key-value storage management system determines the read/write ratio by recording the types of user requests issued between two consecutive compression operations. For a user's foreground compression request, the key-value storage management system writes data using all flash channels. The method of determining the read/write ratio of foreground user requests includes: the key-value storage management system records the numbers of foreground read and write requests between two compressions; by comparing the ratio of the numbers of read requests and write requests, it can determine whether the current users have a higher read ratio or a higher write ratio.
In the above process, the data compression procedure specifically includes: when compressing data, the key-value storage management system first reads the key-value files to be compressed, reading a fixed-length portion of multiple files into the cache at the same time; after the key-value data in the cache has been compressed, it reads the subsequent data of the multiple files, and so on, until all files to be compressed have been read; the key-value storage management system then writes the compressed data to the flash device, and the compression process ends.
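The channel-selection rule can be made concrete with the following sketch; the two thresholds stand in for the first and second preset ratios, and the window-based counters are an assumption about how the request types might be recorded between two compressions.

```python
# Sketch of dynamic compression (illustrative thresholds): count foreground
# reads and writes between two background compressions, then decide how many
# flash channels the compressed output may occupy.
class CompressionPlanner:
    def __init__(self, num_channels, write_heavy=0.5, read_heavy=0.5):
        self.num_channels = num_channels
        self.write_heavy = write_heavy   # stands in for the "first preset ratio"
        self.read_heavy = read_heavy     # stands in for the "second preset ratio"
        self.reads = 0
        self.writes = 0

    def record(self, is_write):
        # Called for every foreground request between two compressions.
        if is_write:
            self.writes += 1
        else:
            self.reads += 1

    def channels_for_compression(self, foreground=False):
        total = self.reads + self.writes
        write_ratio = self.writes / total if total else 0.0
        read_ratio = self.reads / total if total else 0.0
        self.reads = self.writes = 0          # start a new observation window
        if foreground:
            return self.num_channels          # user-issued compression: full concurrency
        if write_ratio > self.write_heavy:
            return self.num_channels          # write-heavy: drain compressed data fast
        if read_ratio > self.read_heavy:
            return self.num_channels // 2     # read-heavy: leave channels for user reads
        return self.num_channels

planner = CompressionPlanner(num_channels=8)
for _ in range(30):
    planner.record(is_write=False)            # a read-heavy window
print(planner.channels_for_compression())     # -> 4, i.e. half of the channels
```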
Step S4: when the key-value storage management system caches data, it uses the compression-aware caching algorithm, which does not cache compressed data; the space saved is used to cache the users' read/write data.
Specifically, in step S4, the compression-aware caching algorithm includes: after the compression process starts, the key-value storage management system reads the first portion of the key-value files to be compressed into the cache and compresses the data in the cache; while the key-value storage management system is compressing, its cache starts a prefetch process that preloads the next portion of the key-value files to be compressed into the cache; after the first portion has been compressed, the key-value storage management system compresses the prefetched portion, at which point the cache evicts the already-used data of the first portion; for foreground user read/write requests, the cache of the key-value storage management system uses a caching algorithm optimized for foreground requests.
When caching foreground user requests, the caching algorithm optimized for foreground requests includes: for foreground users' read/write requests, cache management is performed with the flash page length as the caching granularity; no prefetching is performed for foreground users' read requests; when cache space is insufficient, cached data is replaced according to the least-recently-used principle.
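A minimal sketch along these lines is shown below: foreground data is kept in a page-granularity LRU map and never prefetched, compression input is prefetched into a separate staging area and dropped as soon as it has been used, and compressed output is deliberately not cached. The class and method names are assumptions for this example.

```python
from collections import OrderedDict

# Sketch of the compression-aware cache (hypothetical interface).
class CompressionAwareCache:
    def __init__(self, capacity_pages):
        self.capacity = capacity_pages
        self.foreground = OrderedDict()   # page_id -> data, kept in LRU order
        self.staging = {}                 # prefetched input for background compression

    def get(self, page_id):
        # Foreground read path: page-granularity lookup, no prefetching on a miss.
        if page_id in self.foreground:
            self.foreground.move_to_end(page_id)
            return self.foreground[page_id]
        return None

    def put(self, page_id, data):
        # Foreground read/write path: insert, then evict least-recently-used pages.
        self.foreground[page_id] = data
        self.foreground.move_to_end(page_id)
        while self.foreground and len(self.foreground) + len(self.staging) > self.capacity:
            self.foreground.popitem(last=False)

    def prefetch_for_compression(self, page_id, data):
        self.staging[page_id] = data      # next portion of the files being compressed

    def take_for_compression(self, page_id):
        return self.staging.pop(page_id, None)   # used input leaves the cache immediately

    def store_compressed_output(self, page_id, data):
        pass  # compressed data goes to flash but is intentionally not cached
```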
Step S5: when the key-value storage management system schedules requests, it uses the priority-based scheduling policy, preferentially scheduling user requests and foreground compression requests and determining the scheduling priority of erase requests according to the available space of the current flash storage device.
Specifically, in step S5, the priority-based scheduling policy includes: for foreground users' read/write requests, the key-value storage management system assigns a high priority when scheduling; for read/write requests generated by background data compression operations, it assigns a low priority when scheduling; within the same priority level, read requests are scheduled before write requests; for erase requests, the key-value storage management system dynamically adjusts the priority during scheduling according to the current usage of the flash device.
The method by which the key-value storage management system dynamically adjusts the priority of erase requests specifically includes: the key-value storage management system records the available space of the current flash device; when the available space is less than a third preset ratio of the total storage space, it gives erase requests the highest priority and schedules them first; when the available space is greater than the third preset ratio of the total storage space, it gives erase requests the lowest priority and schedules them last. The third preset ratio is, for example, 40%. In other words, the key-value storage management system records the available space of the current flash device; when the available space is less than 40% of the total storage space, erase requests are given the highest priority and scheduled first; when the available space is greater than 40% of the total storage space, erase requests are given the lowest priority and scheduled last.
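The priority rules can be encoded as in the following sketch: foreground requests rank above background requests, reads rank above writes within the same level, and the priority of erase requests flips around a free-space threshold (40% here, matching the example above). The tuple encoding and class names are assumptions for this illustration.

```python
import heapq

# Sketch of the priority-based scheduler: lower tuples are scheduled first.
# Foreground (0) beats background (1); within a level reads (0) beat writes (1);
# erase priority depends on how much free space the flash device has left.
FOREGROUND, BACKGROUND = 0, 1
READ, WRITE, ERASE = 0, 1, 2

class RequestScheduler:
    def __init__(self, total_space, free_space, erase_threshold=0.40):
        self.total_space = total_space
        self.free_space = free_space
        self.erase_threshold = erase_threshold
        self._queue = []
        self._seq = 0   # FIFO tie-breaker within the same priority

    def _priority(self, source, op):
        if op == ERASE:
            # Little free space left: erases jump ahead of everything else;
            # otherwise they are scheduled after all other requests.
            urgent = self.free_space < self.erase_threshold * self.total_space
            return (-1,) if urgent else (2,)
        return (source, op)   # (foreground/background, read/write)

    def submit(self, source, op, payload):
        self._seq += 1
        heapq.heappush(self._queue, (self._priority(source, op), self._seq, op, payload))

    def next_request(self):
        if not self._queue:
            return None
        _, _, op, payload = heapq.heappop(self._queue)
        return op, payload

sched = RequestScheduler(total_space=100, free_space=30)   # 30% free -> erases are urgent
sched.submit(BACKGROUND, WRITE, "compressed block")
sched.submit(FOREGROUND, READ, "user get()")
sched.submit(FOREGROUND, ERASE, "reclaim block 7")
print(sched.next_request())   # -> (ERASE, 'reclaim block 7') because free space is low
```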
In summary, according to the flash-based, storage-path-optimized key-value storage management method of the embodiments of the present invention, the key-value storage management system is deployed directly on the raw flash device, and the raw flash device passes the internal structure information of the flash and the read, write, and erase control interfaces to the key-value storage management system through a specific interface; the key-value storage management system distributes key-value files over the physical units of the flash using the concurrent data layout method; it uses the dynamic compression method to dynamically choose the number of flash channels used for compression according to the foreground users' read/write ratio; using the compression-aware caching algorithm, it applies a prefetch-and-evict-first policy to compressed key-value data and a no-prefetch, least-recently-used replacement policy to data requested by foreground users; and it uses the priority-based scheduling policy to schedule the read, write, and erase requests of the flash device, where the priority of users' read/write requests is higher than that of background compression requests and the priority of erase requests is inversely proportional to the available space of the current flash device.
Further, the raw flash device is an SSD device from which the flash translation layer has been removed; it passes the internal information and control commands of the flash device to the user-mode key-value storage management software through a specific interface. The internal information of the device includes: the number of channels of the flash device, the flash block length, the flash page length, and the flash chip capacity; the control commands of the device include: read flash page, write flash page, and erase flash block. By passing this information and these control commands to the key-value storage management system, the key-value storage management system can directly control space management, garbage collection, and related work on the underlying flash device without the participation of the file system or the flash translation layer, eliminating the functional redundancy and semantic isolation caused by those two layers, reducing system software overhead, and improving overall system performance.
The middleware functions along the entire storage path are therefore specifically optimized: for data layout and distribution, the concurrent data layout optimization is designed; for data compression, the dynamic compression method is designed; for data caching, the compression-aware caching algorithm is designed; and for request scheduling, the priority-based scheduling policy is designed. These optimizations reduce the software overhead of the key-value storage management system during data storage and improve system performance; at the same time, they reduce the amount of data written to the flash device, which helps extend the device's lifetime. In other words, the method uses the key-value storage management system to manage the underlying flash device directly, bypassing the redundancy overhead of the file system and the flash translation layer and optimizing the storage path, thereby reducing the software overhead of the key-value storage management system during data storage and improving system performance, while also reducing the amount of data written to the flash device and helping to extend the device's lifetime.
To facilitate a better understanding of the present invention, the flash-based, storage-path-optimized key-value storage management method of the above embodiments is described below by way of example with reference to the accompanying drawings and specific embodiments.
FIG. 2 is a schematic diagram of the implementation principle of a flash-based, storage-path-optimized key-value storage management method according to an embodiment of the present invention. As shown in FIG. 2, in a specific embodiment the functional structure of the method is divided into three parts: the user-mode key-value storage management software, the kernel-mode raw flash driver, and the raw flash device. The raw flash device consists of flash chips, flash channels, and flash control firmware, with the flash translation layer of a traditional SSD removed. The raw flash driver runs in the kernel mode of the operating system; it is responsible for passing the internal information of the raw flash device, including the number of flash channels, the flash page size, the flash block size, the flash chip capacity, and the read, write, and erase command interfaces, to the user-mode key-value storage management software; at the same time, the raw flash driver is also responsible for forwarding the read, write, and erase requests issued by the user-mode key-value storage management software to the raw flash device. Using the internal flash information and the read, write, and erase control interfaces passed by the raw flash driver, the user-mode key-value storage management software manages the underlying flash hardware directly in user mode and optimizes the storage path through the concurrent data layout, dynamic compression, the compression-aware caching algorithm, and the priority-based scheduling policy.
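To make this three-part split concrete, the following sketch (hypothetical names, duck-typed around any device object offering read_page/write_page/erase_block, such as the RawFlashDevice sketch earlier) shows the driver doing nothing but reporting geometry and forwarding requests, while all policy stays in the user-mode store:

```python
# Sketch of the three-layer structure in FIG. 2: the kernel-mode driver only
# reports geometry and forwards read/write/erase requests; layout, compression,
# caching, and scheduling policy all live in the user-mode key-value store.
class RawFlashDriver:
    def __init__(self, device):
        self._device = device                     # any object with read_page/write_page/erase_block

    def geometry(self):
        return self._device.geometry              # channels, block size, page size, chip capacity

    def submit(self, op, *args):
        # op is one of "read_page", "write_page", "erase_block".
        return getattr(self._device, op)(*args)

class UserModeKVStore:
    def __init__(self, driver):
        self.driver = driver
        self.geometry = driver.geometry()         # consumed by layout, compression, scheduling

    def write_page(self, channel, block, page, data):
        return self.driver.submit("write_page", channel, block, page, data)

    def read_page(self, channel, block, page):
        return self.driver.submit("read_page", channel, block, page)

    def erase_block(self, channel, block):
        return self.driver.submit("erase_block", channel, block)
```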
FIG. 3 is a functional diagram of the concurrent data layout. In an embodiment of the present invention, the key-value storage management software sets the length of a key-value file to the length of four flash blocks distributed across four flash channels, which is called a block group. When the key-value storage management system writes a key-value file, the flash blocks corresponding to the file are distributed over different flash channels. The key-value file can therefore be written concurrently into blocks on the four flash channels, so the internal parallelism of the flash device is exploited for file writes. Inside a key-value file, the key-value data is distributed across the four flash blocks in a round-robin manner. When contiguous data in the key-value file is read, the consecutive key-value entries are spread over the four flash channels in round-robin order, so the data is read back from the flash device concurrently, and the internal parallelism of the flash device is exploited for file reads as well. Furthermore, because a key-value file and a block group have the same length, the key-value storage management system manages the flash device in units of block groups, which reduces the overhead of managing the physical flash space.
In an embodiment of the present invention, the functional diagram of dynamic compression is shown in FIG. 4. The key-value storage management system records the numbers of foreground read and write requests between two compressions; by comparing the ratio of the numbers of read requests and write requests, it can determine whether the current users have a higher read ratio or a higher write ratio. When the users' write ratio is high, the key-value storage management system writes the compressed key-value files using all flash channels to relieve the pressure of foreground data writes; as shown by the solid line in FIG. 4, the system writes the compressed files at full concurrency. When the read ratio of user requests is high, the key-value storage management system writes the compressed files using half of the flash channels to reduce the interference of background compression with foreground read requests and to lower the latency of user reads, as shown by the dashed line (half-concurrency compression) in FIG. 4. For a user's foreground compression request, the key-value storage management system writes data using all flash channels to reduce the user's waiting latency.
In an embodiment of the present invention, when compressing data the key-value storage management system first reads the key-value files to be compressed, reading a fixed-length portion of multiple files to be compressed into the cache at the same time; after the key-value data in the cache has been compressed, it reads the subsequent data of the multiple files, and so on, until all files to be compressed have been read; the key-value storage management system then writes the compressed data to the flash device, and the compression process ends. The functional structure of the compression-aware caching algorithm is shown in FIG. 5. After the compression process starts, the key-value storage management system reads the first portions of the multiple key-value files to be compressed into the cache and compresses the data in the cache; while it is compressing, its cache starts a prefetch process that preloads the next portions of the key-value files to be compressed into the cache; after the first portions have been compressed, the key-value storage management system compresses the prefetched portions, at which point the cache evicts the already-used data of the first portions. For foreground user read/write requests, the cache of the key-value storage management system uses a caching algorithm optimized for foreground requests: foreground read/write requests are cached at the granularity of the flash page length; no prefetching is performed for foreground read requests; and when cache space is insufficient, cached data is replaced according to the least-recently-used principle.
In an embodiment of the present invention, the key-value storage management system schedules requests using a priority-based scheduling policy, illustrated in FIG. 6. For foreground users' read/write requests, the key-value storage management system assigns a high priority when scheduling; for read/write requests generated by background data compression operations, it assigns a low priority when scheduling; within the same priority level, read requests are scheduled before write requests; for erase requests, the key-value storage management system dynamically adjusts the priority during scheduling according to the current usage of the flash device. The key-value storage management system records the available space of the current flash device; when the available space is less than 40% of the total storage space, erase requests are given the highest priority and scheduled first; when the available space is greater than 40% of the total storage space, erase requests are given the lowest priority and scheduled last.
In summary, according to the flash-based, storage-path-optimized key-value storage management method of the embodiments of the present invention, the storage architecture removes the file system and flash translation layer of the traditional key-value storage management architecture and manages the storage of key-value data on flash directly in user space, eliminating the semantic isolation and redundant management caused by the file system and flash translation layer and avoiding the extra garbage collection and write amplification they introduce. For the physical storage of key-value data, the concurrent data layout method sizes each key-value file as an integer multiple of the flash block and distributes the flash blocks belonging to the same key-value file across different flash channels, exploiting the internal parallelism of flash to increase the effective bandwidth of reads and writes and to reduce read/write latency. When compressing key-value data, the dynamic compression method reduces the number of channels occupied by compressed-data writes according to the foreground users' read/write ratio, lowering the interference of background compression with foreground reads and writes. When caching key-value data, the compression-aware caching algorithm preferentially evicts compressed data and uses the freed space to store users' read/write data, effectively improving the cache hit ratio. When scheduling read/write requests to the flash, the priority-based scheduling policy preferentially schedules user requests and foreground compression requests, reducing user-visible latency. The method therefore improves the performance of the key-value storage system, reduces the amount of data written to the flash device, and extends the device's service life.
In the description of this specification, references to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic references to the above terms do not necessarily refer to the same embodiment or example. Moreover, the particular features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples.
Although embodiments of the present invention have been shown and described, those of ordinary skill in the art will understand that various changes, modifications, substitutions, and variations may be made to these embodiments without departing from the principle and spirit of the present invention; the scope of the present invention is defined by the claims and their equivalents.

Claims (10)

  1. A flash-based, storage-path-optimized key-value storage management method, characterized by comprising the following steps:
    S1: directly managing a raw flash device through a key-value storage management system, bypassing the file system and the flash translation layer;
    S2: when the key-value storage management system allocates physical space, using a concurrent data layout method to distribute key-value files across the different flash channels of the flash device in units of flash blocks, and storing each key-value file as an integer multiple of the flash block size;
    S3: when the key-value storage management system compresses data, using a dynamic compression method to dynamically write the compressed data with a corresponding number of flash channels according to the access characteristics of foreground users;
    S4: when the key-value storage management system caches data, using a compression-aware caching algorithm that does not cache compressed data, the space saved being used to cache users' read/write data;
    S5: when the key-value storage management system schedules requests, using a priority-based scheduling policy that preferentially schedules user requests and foreground compression requests, and determining the scheduling priority of erase requests according to the available space of the current flash storage device.
  2. The flash-based, storage-path-optimized key-value storage management method according to claim 1, characterized in that, in S1, the raw flash device exports its internal structure information directly to user space and, through a specific interface, enables the key-value storage management system to manage the raw flash device directly in user mode, bypassing the file system and flash translation layer of the traditional key-value storage management architecture, wherein the internal structure information of the device includes at least the number of flash channels, the flash block size, and the read/write and erase controls of the flash.
  3. The flash-based, storage-path-optimized key-value storage management method according to claim 1, characterized in that, in S2, the concurrent data layout method specifically comprises:
    the key-value storage management system sets the length of a key-value file according to the number of flash channels and the flash block size reported by the raw flash device;
    when storing a key-value file, the key-value storage management system splits the file in units of the flash block length and distributes the resulting flash blocks across different flash channels;
    the data of the key-value file is distributed within the flash blocks in a round-robin manner.
  4. The flash-based, storage-path-optimized key-value storage management method according to claim 1, characterized in that, in S3, the dynamic compression method specifically comprises:
    when the key-value storage management system performs background compression of key-value data, it first determines the read/write ratio of foreground user requests; when the users' write ratio is greater than a first preset ratio, the key-value storage management system writes the compressed key-value files using all flash channels; when the read ratio of user requests is higher than a second preset ratio, the key-value storage management system writes the compressed files using half of the flash channels;
    the key-value storage management system determines the read/write ratio by recording the types of user requests issued between two consecutive compression operations;
    for a user's foreground compression request, the key-value storage management system writes data using all flash channels.
  5. The flash-based, storage-path-optimized key-value storage management method according to claim 4, characterized in that the data compression procedure specifically comprises:
    when compressing data, the key-value storage management system first reads the key-value files to be compressed, and reads a fixed-length portion of multiple files into the cache at the same time;
    after the key-value data in the cache has been compressed, the subsequent data of the multiple files is read, and so on, until all files to be compressed have been read;
    the key-value storage management system writes the compressed data to the flash device, and the compression process ends.
  6. The flash-based, storage-path-optimized key-value storage management method according to claim 1, characterized in that, in S4, the compression-aware caching algorithm specifically comprises:
    after the compression process starts, the key-value storage management system reads the first portion of the key-value files to be compressed into the cache and compresses the data in the cache;
    while the key-value storage management system is compressing, the cache of the key-value storage management system starts a prefetch process that preloads the next portion of the key-value files to be compressed into the cache;
    after the first portion has been compressed, the key-value storage management system compresses the prefetched portion, at which point the cache evicts the already-used data of the first portion;
    for foreground user read/write requests, the cache of the key-value storage management system uses a caching algorithm optimized for foreground requests.
  7. The flash-based, storage-path-optimized key-value storage management method according to claim 6, characterized in that, when caching foreground user requests, the caching algorithm optimized for foreground requests comprises:
    for foreground users' read/write requests, performing cache management with the flash page length as the caching granularity;
    performing no prefetching for foreground users' read requests;
    when cache space is insufficient, replacing cached data according to the least-recently-used principle.
  8. The flash-based, storage-path-optimized key-value storage management method according to claim 1, characterized in that, in S5, the priority-based scheduling policy specifically comprises:
    for foreground users' read/write requests, the key-value storage management system assigns a high priority when scheduling;
    for read/write requests generated by background data compression operations, the key-value storage management system assigns a low priority when scheduling;
    within the same priority level, read requests are scheduled before write requests;
    for erase requests, the key-value storage management system dynamically adjusts their priority during scheduling according to the current usage of the flash device.
  9. The flash-based, storage-path-optimized key-value storage management method according to claim 8, characterized in that the key-value storage management system dynamically adjusting the priority of erase requests specifically comprises:
    the key-value storage management system records the available space of the current flash device;
    when the available space is less than a third preset ratio of the total storage space, the key-value storage management system gives erase requests the highest priority, and erase requests are scheduled first;
    when the available space is greater than the third preset ratio of the total storage space, the key-value storage management system gives erase requests the lowest priority, and erase requests are scheduled last.
  10. The flash-based, storage-path-optimized key-value storage management method according to claim 9, characterized in that the third preset ratio is 40%.
PCT/CN2018/094909 2017-09-11 2018-07-06 Flash-based storage-path-optimized key-value storage management method WO2019047612A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710812821.9 2017-09-11
CN201710812821.9A CN107678685B (zh) 2017-09-11 2017-09-11 Flash-based storage-path-optimized key-value storage management method

Publications (1)

Publication Number Publication Date
WO2019047612A1 true WO2019047612A1 (zh) 2019-03-14

Family

ID=61135865

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/094909 WO2019047612A1 (zh) 2018-07-06 2019-03-14 Flash-based storage-path-optimized key-value storage management method

Country Status (2)

Country Link
CN (1) CN107678685B (zh)
WO (1) WO2019047612A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107678685B (zh) 2017-09-11 2020-01-17 清华大学 Flash-based storage-path-optimized key-value storage management method
CN108509353A (zh) * 2018-03-14 2018-09-07 清华大学 Method and apparatus for building object storage based on raw flash
CN113742304B (zh) * 2021-11-08 2022-02-15 杭州雅观科技有限公司 Data storage method for a hybrid cloud

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8634247B1 (en) * 2012-11-09 2014-01-21 Sandisk Technologies Inc. NAND flash based content addressable memory
CN106469198B (zh) * 2016-08-31 2019-10-15 华为技术有限公司 Key-value storage method, apparatus and system

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102436420A (zh) * 2010-10-20 2012-05-02 微软公司 使用辅助存储器的低ram空间、高吞吐量的持久键值存储
CN102929793A (zh) * 2011-08-08 2013-02-13 株式会社东芝 包括键-值存储的存储器***
US20170139594A1 (en) * 2015-11-17 2017-05-18 Samsung Electronics Co., Ltd. Key-value integrated translation layer
CN107066498A (zh) * 2016-12-30 2017-08-18 成都华为技术有限公司 键值kv存储方法和装置
CN107678685A (zh) * 2017-09-11 2018-02-09 清华大学 基于闪存的存储路径优化的键值存储管理方法

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113742127A (zh) * 2021-09-16 2021-12-03 重庆大学 Failure recovery method for a raw flash file system

Also Published As

Publication number Publication date
CN107678685A (zh) 2018-02-09
CN107678685B (zh) 2020-01-17

Similar Documents

Publication Publication Date Title
US10013177B2 (en) Low write amplification in solid state drive
US10489295B2 (en) Systems and methods for managing cache pre-fetch
US10095613B2 (en) Storage device and data processing method thereof
US9256527B2 (en) Logical to physical address mapping in storage systems comprising solid state memory devices
US8914600B2 (en) Selective data storage in LSB and MSB pages
Wu et al. GCaR: Garbage collection aware cache management with improved performance for flash-based SSDs
US10572379B2 (en) Data accessing method and data accessing apparatus
US10203876B2 (en) Storage medium apparatus, method, and program for storing non-contiguous regions
WO2019047612A1 (zh) 基于闪存的存储路径优化的键值存储管理方法
US20140082323A1 (en) Address mapping
KR20210130829A (ko) 캐시 이동을 비휘발성 대량 메모리 시스템에 제공하기 위한 장치 및 방법
CN110413537B (zh) 一种面向混合固态硬盘的闪存转换层及转换方法
US11144464B2 (en) Information processing device, access controller, information processing method, and computer program for issuing access requests from a processor to a sub-processor
US20170228191A1 (en) Systems and methods for suppressing latency in non-volatile solid state devices
WO2016056104A1 (ja) ストレージ装置、及び、記憶制御方法
US20190303019A1 (en) Memory device and computer system for improving read performance and reliability
Mativenga et al. RFTL: Improving performance of selective caching-based page-level FTL through replication
KR101180288B1 (ko) 하이브리드 메모리와 ssd 로 구성된 시스템에서의 읽기 캐시 및 쓰기 캐시 관리 방법
US20240020014A1 (en) Method for Writing Data to Solid-State Drive
CN110908595B (zh) 存储装置及信息处理***
JP6254986B2 (ja) 情報処理装置、アクセスコントローラ、および情報処理方法
JP6243884B2 (ja) 情報処理装置、プロセッサ、および情報処理方法
KR102088945B1 (ko) 메모리 컨트롤러 및 이를 포함하는 스토리지 디바이스
US11036414B2 (en) Data storage device and control method for non-volatile memory with high-efficiency garbage collection
CN114185492A (zh) 一种基于强化学习的固态硬盘垃圾回收算法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18852815

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18852815

Country of ref document: EP

Kind code of ref document: A1