WO2019047612A1 - Key value storage and management method for flash-based storage path optimization - Google Patents

Key value storage and management method for flash-based storage path optimization

Info

Publication number
WO2019047612A1
Authority
WO
WIPO (PCT)
Prior art keywords
key value
flash
value storage
storage management
management system
Prior art date
Application number
PCT/CN2018/094909
Other languages
French (fr)
Chinese (zh)
Inventor
陆游游
舒继武
张佳程
Original Assignee
清华大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 清华大学
Publication of WO2019047612A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614Improving the reliability of storage systems
    • G06F3/0616Improving the reliability of storage systems in relation to life time, e.g. increasing Mean Time Between Failures [MTBF]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629Configuration or reconfiguration of storage systems
    • G06F3/0631Configuration or reconfiguration of storage systems by allocating resources to storage systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • G06F3/064Management of blocks

Definitions

  • The present invention relates to the field of flash memory storage technologies, and in particular to a key-value storage management method based on flash-based storage path optimization.
  • Flash memory is an electrically erasable programmable memory. Compared with traditional disk media, flash memory offers high read/write bandwidth, low access latency, low power consumption, and high stability, and it is becoming widespread in data centers, personal computers, and mobile devices. Flash memory is read and written in units of pages. Before a flash page can be rewritten, it must first be erased. Flash memory is erased in units of blocks, and a flash block contains hundreds of flash pages. Flash cells can endure only a limited number of erase operations, i.e., each flash cell has a limited lifetime. The internal structure of a flash drive also differs significantly from that of a disk. Inside the flash drive, the flash chips are connected to the flash controller through different channels.
  • In a flash chip, a plurality of flash dies (granules) are packaged, and each die can execute instructions independently. Each die contains multiple flash planes (slices), each with separate registers, which allows pipelined instruction execution across the planes. Through these different levels of concurrency, SSDs provide ample access bandwidth. This feature is known as the internal concurrency of flash devices.
  • Current general-purpose SSDs are built on flash memory. They use a flash translation layer to manage the reading, writing, and erasing of the flash and its internal concurrency, and they provide the software system with the same read/write interface as a traditional disk.
  • The flash translation layer has three main functions: address mapping, garbage collection, and wear leveling.
  • Address mapping records the mapping between logical addresses and physical flash addresses during out-of-place updates of the flash memory; garbage collection selects and erases invalid flash pages to restore them to the idle state so that new data can continue to be written; wear leveling ensures reliable data storage by writing data evenly across the flash pages.
  • The flash translation layer also includes ECC (Error Correction Code), bad block management, and other functions.
  • The flash translation layer manages the unique properties of flash and is transparent to the upper layers of software. This allows existing software to run on SSDs without any modification, saving the cost of redeveloping and redeploying software.
  • Existing software, such as key-value storage management software, runs on top of the file system.
  • The file system is functionally redundant with the flash translation layer. For example, both the file system and the flash translation layer perform space allocation, garbage collection, and addressing. These redundant functions add software overhead and reduce performance.
  • The flash translation layer cannot be optimized specifically for the upper-level key-value storage management software, and the file system cannot perceive the characteristics of the underlying flash memory or exploit its internal concurrency.
  • Another outstanding problem is that all software middleware in the storage path of current key-value storage systems was designed and developed around the characteristics of disks. It is not optimized for the characteristics of flash memory and key-value storage systems, and so it cannot take advantage of the performance of either. There is currently little research on how to optimize the storage path for flash-based key-value storage.
  • The present invention aims to solve at least one of the above technical problems.
  • The object of the present invention is to provide a key-value storage management method based on flash-based storage path optimization, which improves the performance of the key-value storage system, reduces the amount of data written to the flash memory device, and prolongs the service life of the device.
  • An embodiment of the present invention provides a key-value storage management method based on flash-based storage path optimization, including the following steps: S1: directly managing a bare flash memory device through a key-value storage management system, bypassing the file system and the flash translation layer; S2: when the key-value storage management system allocates physical space, adopting a concurrent data layout method in which each key-value file is distributed across different flash channels of the flash device in units of flash blocks, and the key-value storage management system sets the size of a key-value file to an integer multiple of the flash block size; S3: when the key-value storage management system performs data compression, adopting a dynamic compression method that writes the compressed data using a number of flash channels chosen dynamically according to the access characteristics of the foreground user; S4: when the key-value storage management system caches data, adopting a compression-aware caching algorithm under which data that has already been compressed is not kept in the cache, and the space saved is used to cache the user's read and write data.
  • The key-value storage management method based on flash-based storage path optimization may further have the following additional technical features:
  • The bare flash device exports its internal structure information directly to user mode and, through a specific interface, enables the key-value storage management system to manage the bare flash device directly from user mode, bypassing the file system and the flash translation layer of the traditional key-value storage management system structure, wherein the internal structure information of the device includes at least the number of flash channels, the flash block size, and the read/write and erase control interfaces of the flash memory.
  • The concurrent data layout method specifically includes: the key-value storage management system sets the length of the key-value file according to the number of flash channels and the flash block size reported by the bare flash device; when storing a key-value file, the key-value storage management system divides it into units of the flash block length and distributes the resulting flash blocks to different flash channels; the data in the key-value file is distributed across the flash blocks in a round-robin manner.
  • The dynamic compression method specifically includes: when the key-value storage management system compresses key-value data in the background, it first determines the read/write ratio of the foreground user's requests; when the user's write ratio is greater than a first preset ratio, the key-value storage management system writes the compressed key-value file using all flash channels; when the user's read ratio is greater than a second preset ratio, the key-value storage management system writes the compressed file using half of the flash channels; the key-value storage management system determines the read/write ratio by recording the types of user requests between two consecutive compression operations; for a foreground compression request, the key-value storage management system uses all flash channels for data writing.
  • The method for compressing data specifically includes: when performing data compression, the key-value storage management system first reads the key-value files that need to be compressed, reading a fixed length from each of several files into the cache at the same time; after the key-value data in the cache has been compressed, it reads the subsequent data of those files, and so on, until all the files to be compressed have been read; the key-value storage management system then writes the compressed data to the flash device, and the compression process ends.
  • The compression-aware caching algorithm specifically includes: after the compression process starts, the key-value storage management system reads the first part of the key-value files to be compressed into the cache and compresses the data in the cache; while that compression is running, the cache of the key-value storage management system starts a prefetch process that preloads the next part of the key-value files to be compressed into the cache; after the first part has been compressed, the key-value storage management system compresses the prefetched part, and the cache evicts the first-part data that has already been used; for read and write requests from foreground users, the cache of the key-value storage management system adopts a caching algorithm optimized for foreground requests.
  • The caching algorithm optimized for foreground requests includes: for read and write requests from foreground users, the cache is managed at the granularity of a flash page; read requests from foreground users are not prefetched; when cache space is insufficient, cached data is replaced according to the least-recently-used principle.
  • The priority-based scheduling policy specifically includes: for read/write requests from foreground users, the key-value storage management system assigns a high priority when scheduling; for read/write requests generated by background data compression operations, the key-value storage management system assigns a low priority when scheduling; within the same priority level, read requests are scheduled before write requests; for erase requests, the key-value storage management system dynamically adjusts the priority according to the current usage of the flash device.
  • The key-value storage management system dynamically adjusts the priority of erase requests as follows: the key-value storage management system records the available space of the current flash device; when the available space is less than a third preset ratio of the total storage space, the key-value storage management system gives erase requests the highest priority, and erase requests are scheduled first; when the available space is greater than the third preset ratio of the total storage space, the key-value storage management system gives erase requests the lowest priority, and erase requests are scheduled last.
  • The third preset ratio is 40%.
  • At the level of the storage architecture, the key-value storage management method based on flash-based storage path optimization removes the file system and the flash translation layer found in the traditional key-value storage management system structure and manages the storage of key-value data on flash memory directly from user mode. This eliminates the semantic isolation and redundant management caused by the file system and the flash translation layer and avoids the overhead of additional garbage collection and write amplification.
  • For the physical layout of key-value data, the concurrent data layout method sets the size of the key-value file to an integer multiple of the flash block size and distributes the flash blocks belonging to the same key-value file to different flash channels. This exploits the internal concurrency of the flash, increases the effective bandwidth of data reads and writes, and reduces read and write latency.
  • When compressing key-value data, the dynamic compression method reduces the number of channels used for writing compressed data according to the read/write ratio of the foreground user, which reduces the interference of background compression with the foreground user's reads and writes.
  • When caching key-value data, the compression-aware caching algorithm evicts compression data first and uses more space to store the user's read and write data, which effectively improves the cache hit rate.
  • When scheduling read and write requests to the flash, the priority-based scheduling policy preferentially schedules user requests and foreground compression requests, which reduces the latency visible to the user. The method therefore improves the performance of the key-value storage system, reduces the amount of data written to the flash memory device, and prolongs the service life of the device.
  • FIG. 1 is a flow chart of a key-value storage management method based on flash-based storage path optimization according to an embodiment of the present invention;
  • FIG. 2 is a schematic diagram showing the implementation principle of a key-value storage management method based on flash-based storage path optimization according to an embodiment of the present invention;
  • FIG. 3 is a schematic diagram of the concurrent data layout in accordance with one embodiment of the present invention;
  • FIG. 4 is a schematic diagram of dynamic compression in accordance with one embodiment of the present invention;
  • FIG. 5 is a schematic diagram of the compression-aware caching algorithm according to an embodiment of the present invention;
  • FIG. 6 is a schematic diagram of the priority-based scheduling policy in accordance with one embodiment of the present invention.
  • In the description of the present invention, it should be noted that, unless otherwise explicitly specified and defined, the terms "installation", "connected", and "connection" are to be understood broadly: a connection may, for example, be fixed, detachable, or integral; it may be mechanical or electrical; it may be direct or indirect through an intermediate medium; or it may be internal communication between two components.
  • The specific meaning of the above terms in the present invention can be understood by those skilled in the art according to the specific case.
  • FIG. 1 is a flow chart of a key-value storage management method based on flash-based storage path optimization in accordance with one embodiment of the present invention. As shown in FIG. 1, the method includes the following steps:
  • Step S1: directly manage the bare flash device through the key-value storage management system, bypassing the file system and the flash translation layer.
  • The bare flash memory device refers to a flash memory storage device from which the flash translation layer of a conventional flash storage device has been removed.
  • The bare flash device exports its internal structure information directly to user mode and, through a specific interface, enables the key-value storage management system to manage the bare flash device directly from user mode, bypassing the file system and the flash translation layer of the traditional key-value storage management system structure.
  • The internal structure information of the device includes at least the number of flash channels, the flash block size, and the read/write and erase control interfaces of the flash memory.
  • The bare flash device includes, for example, a flash disk device from which the flash translation layer has been removed, together with a kernel-mode device driver that passes the flash device's internal information and control commands to the user-mode key-value storage management system.
  • Direct management of the bare flash device by the key-value storage management system includes: the key-value storage management system allocates physical flash pages for the key-value files to be written; the key-value storage management system uses erase requests to reclaim used physical flash space, a process also called garbage collection; the key-value storage management system caches the data that is read and written; the key-value storage management system schedules the read requests, write requests, and erase requests issued to the flash device.
  • Step S2: when the key-value storage management system allocates physical space, it adopts the concurrent data layout method: each key-value file is distributed across different flash channels of the flash device in units of flash blocks, and the key-value storage management system sets the size of a key-value file to an integer multiple of the flash block size.
  • The concurrent data layout method specifically includes: the key-value storage management system sets the length of the key-value file according to the number of flash channels and the flash block size reported by the bare flash device, wherein the length of the key-value file is the product of the number of flash channels and the flash block length; when storing a key-value file, the key-value storage management system divides it into units of the flash block length and distributes the resulting flash blocks to different flash channels; the data in the key-value file is distributed across the flash blocks in a round-robin manner.
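  • The layout rule in step S2 can be illustrated with a minimal Python sketch. It is not taken from the patent: the names `FlashGeometry` and `locate` are hypothetical, and the sketch assumes that the round-robin distribution is done at flash-page granularity, which the text does not state explicitly.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FlashGeometry:
    """Device information reported by the bare flash device (hypothetical container)."""
    channels: int          # number of flash channels
    pages_per_block: int   # flash pages in one flash block
    page_size: int         # bytes per flash page

    @property
    def block_size(self) -> int:
        return self.pages_per_block * self.page_size

    @property
    def file_size(self) -> int:
        # Key-value file length = number of flash channels x flash block length.
        return self.channels * self.block_size


def locate(geo: FlashGeometry, offset: int) -> Tuple[int, int]:
    """Map a byte offset inside a key-value file to (channel, page index in that channel's block).

    Assumption: consecutive pages of the file rotate over the channels.
    """
    if not 0 <= offset < geo.file_size:
        raise ValueError("offset outside the key-value file")
    logical_page = offset // geo.page_size
    channel = logical_page % geo.channels          # round-robin over the channels
    page_in_block = logical_page // geo.channels   # position inside that channel's block
    return channel, page_in_block
```

  • With four channels, the first four pages of a file land on channels 0 to 3, so a sequential read or write of the file touches all channels at once.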
  • Step S3: when the key-value storage management system performs data compression, it adopts the dynamic compression method and writes the compressed data using a number of flash channels chosen dynamically according to the access characteristics of the foreground user.
  • The dynamic compression method specifically includes: when the key-value storage management system compresses key-value data in the background, it first determines the read/write ratio of the foreground user's requests; when the user's write ratio is greater than the first preset ratio (i.e., the write ratio is high), the key-value storage management system uses all flash channels to write the compressed key-value file; when the read ratio of the user's requests is greater than the second preset ratio (i.e., the read ratio is high), the key-value storage management system uses half of the flash channels to write the compressed file; the key-value storage management system determines the read/write ratio by recording the types of user requests between two consecutive compression operations; for a foreground compression request, the key-value storage management system uses all flash channels for data writing.
  • The method for determining the read/write ratio of the foreground user's requests includes: the key-value storage management system records the number of read requests and write requests issued by the foreground user between two compressions; by comparing the number of read requests with the number of write requests, it determines whether the current user has a higher read ratio or a higher write ratio.
  • The method for compressing data specifically includes: when performing data compression, the key-value storage management system first reads the key-value files that need to be compressed, reading a fixed length from each of several files into the cache at the same time; after the key-value data in the cache has been compressed, it reads the subsequent data of those files, and so on, until all the files to be compressed have been read; the key-value storage management system then writes the compressed data to the flash device, and the compression process ends.
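  • The channel-selection rule of step S3 can be sketched as follows. This is an illustrative sketch only: the class name `CompressionChannelSelector`, the default thresholds of 0.5, and the fallback to all channels when neither threshold is exceeded are assumptions, since the text does not give values for the first and second preset ratios or define the in-between case.

```python
class CompressionChannelSelector:
    """Chooses how many flash channels a compression may write to (hypothetical helper)."""

    def __init__(self, total_channels: int,
                 write_ratio_threshold: float = 0.5,   # "first preset ratio" (assumed value)
                 read_ratio_threshold: float = 0.5     # "second preset ratio" (assumed value)
                 ) -> None:
        self.total_channels = total_channels
        self.write_ratio_threshold = write_ratio_threshold
        self.read_ratio_threshold = read_ratio_threshold
        self.reads = 0
        self.writes = 0

    def record_request(self, is_write: bool) -> None:
        """Count one foreground user request seen since the previous compression."""
        if is_write:
            self.writes += 1
        else:
            self.reads += 1

    def channels_for_next_compression(self, foreground: bool = False) -> int:
        """Return the channel count for this compression and start a new counting window."""
        reads, writes = self.reads, self.writes
        self.reads = self.writes = 0
        total = reads + writes
        if foreground or total == 0:
            return self.total_channels            # foreground compression: full concurrency
        if writes / total > self.write_ratio_threshold:
            return self.total_channels            # write-heavy: full concurrency
        if reads / total > self.read_ratio_threshold:
            return max(1, self.total_channels // 2)   # read-heavy: half of the channels
        return self.total_channels                # assumption: default to all channels
```

  • Resetting the counters on every call mirrors the statement that requests are recorded between two consecutive compression operations.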
  • Step S4: when the key-value storage management system caches data, it adopts the compression-aware caching algorithm: data that has already been compressed is not kept in the cache, and the space saved is used to cache the user's read and write data.
  • The compression-aware caching algorithm specifically includes: after the compression process starts, the key-value storage management system reads the first part of the key-value files to be compressed into the cache and compresses that data; while that compression is running, the cache of the key-value storage management system starts a prefetch process that preloads the next part of the key-value files to be compressed into the cache; after the first part has been compressed, the key-value storage management system compresses the prefetched part, and the cache evicts the first-part data that has already been used; for read and write requests from foreground users, the cache of the key-value storage management system adopts a caching algorithm optimized for foreground requests.
  • The caching algorithm optimized for foreground requests includes: for read and write requests from foreground users, the cache is managed at the granularity of a flash page; read requests from foreground users are not prefetched; when cache space is insufficient, cached data is replaced according to the least-recently-used principle.
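  • The two caching behaviours of step S4 can be sketched together in Python. The sketch is a simplification under stated assumptions: `CompressionAwareCache`, `load_page`, `load_part`, and `compress` are hypothetical names, the prefetch runs synchronously rather than overlapping with compression, and the staging area for compression data is not counted against the page capacity.

```python
from collections import OrderedDict
from typing import Any, Callable, Dict, Hashable, List, Sequence


class CompressionAwareCache:
    """Page-granular LRU cache for foreground requests plus a short-lived
    staging area for data that is being compressed (illustrative sketch)."""

    def __init__(self, capacity_pages: int) -> None:
        self.capacity = capacity_pages
        self.foreground: "OrderedDict[Hashable, Any]" = OrderedDict()  # LRU order
        self.staging: Dict[int, Any] = {}  # part index -> data awaiting compression

    def get(self, page_id: Hashable, load_page: Callable[[Hashable], Any]) -> Any:
        """Foreground read: one flash page per entry, no prefetching."""
        if page_id in self.foreground:
            self.foreground.move_to_end(page_id)       # mark as most recently used
            return self.foreground[page_id]
        data = load_page(page_id)
        self.put(page_id, data)
        return data

    def put(self, page_id: Hashable, data: Any) -> None:
        """Foreground write or read miss: insert and evict least recently used pages."""
        self.foreground[page_id] = data
        self.foreground.move_to_end(page_id)
        while len(self.foreground) > self.capacity:
            self.foreground.popitem(last=False)

    def compress_parts(self, parts: Sequence[Any],
                       load_part: Callable[[Any], Any],
                       compress: Callable[[Any], Any]) -> List[Any]:
        """Compress parts in order, prefetching part i+1 before part i is consumed
        and dropping each part from the cache as soon as it has been used."""
        results: List[Any] = []
        if not parts:
            return results
        self.staging[0] = load_part(parts[0])
        for i in range(len(parts)):
            if i + 1 < len(parts):
                self.staging[i + 1] = load_part(parts[i + 1])   # prefetch the next part
            results.append(compress(self.staging.pop(i)))        # used data leaves the cache
        return results
```

  • Keeping compression data out of the LRU structure means a large compression cannot push the user's hot pages out of the cache.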
  • Step S5: when the key-value storage management system schedules requests, it adopts the priority-based scheduling policy, preferentially schedules user requests and foreground compression requests, and determines the scheduling priority of erase requests according to the available space of the current flash storage device.
  • The priority-based scheduling policy specifically includes: for read/write requests from foreground users, the key-value storage management system assigns a high priority when scheduling; for read/write requests generated by background data compression operations, the key-value storage management system assigns a low priority when scheduling; within the same priority level, read requests are scheduled before write requests; for erase requests, the key-value storage management system dynamically adjusts the priority according to the current usage of the flash device.
  • The key-value storage management system dynamically adjusts the priority of erase requests as follows: the key-value storage management system records the available space of the current flash device; when the available space is less than the third preset ratio of the total storage space, the key-value storage management system gives erase requests the highest priority, and erase requests are scheduled first; when the available space is greater than the third preset ratio of the total storage space, the key-value storage management system gives erase requests the lowest priority, and erase requests are scheduled last.
  • The third preset ratio is, for example, 40%.
  • In that case, the key-value storage management system records the available space of the current flash memory device; when the available space is less than 40% of the total storage space, the key-value storage management system gives erase requests the highest priority, and erase requests are scheduled first; when the available space is greater than 40% of the total storage space, the key-value storage management system gives erase requests the lowest priority, and erase requests are scheduled last.
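  • The scheduling rules of step S5 can be sketched as a small Python scheduler. The names `priority` and `RequestScheduler` are hypothetical. The sketch recomputes priorities at dispatch time so that erase requests follow the current free-space ratio, uses arrival order as a tie-breaker, and treats a free-space ratio of exactly 40% as "not low"; the last two points are assumptions, because the text does not cover them.

```python
import itertools
from typing import Any, List, Optional, Tuple

ERASE_THRESHOLD = 0.40  # the "third preset ratio" from the text


def priority(kind: str, foreground: bool, free_ratio: float) -> Tuple[int, int]:
    """Smaller tuples are scheduled first. kind is 'read', 'write' or 'erase'."""
    if kind == "erase":
        # Highest priority when space is scarce, lowest otherwise.
        return (0, 0) if free_ratio < ERASE_THRESHOLD else (3, 0)
    level = 1 if foreground else 2            # foreground before background compression
    return (level, 0 if kind == "read" else 1)  # reads before writes within a level


class RequestScheduler:
    def __init__(self) -> None:
        self._pending: List[Tuple[int, str, bool, Any]] = []
        self._seq = itertools.count()   # arrival order, used as a tie-breaker

    def submit(self, kind: str, foreground: bool = True, payload: Any = None) -> None:
        self._pending.append((next(self._seq), kind, foreground, payload))

    def next_request(self, free_ratio: float) -> Optional[Tuple[str, bool, Any]]:
        """Pick the best pending request given the current free-space ratio."""
        if not self._pending:
            return None
        best = min(self._pending,
                   key=lambda r: (priority(r[1], r[2], free_ratio), r[0]))
        self._pending.remove(best)
        return best[1], best[2], best[3]
```

  • A linear scan keeps the sketch short; a production scheduler would more likely keep one queue per priority level.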
  • The key-value storage management system is deployed directly on the bare flash memory device, and the bare flash memory device uses the specific interface to pass the flash memory's internal structure information and its read, write, and erase control interfaces to the key-value storage management system; the key-value storage management system distributes key-value files over the physical units of the flash memory through the concurrent data layout method; the key-value storage management system adopts the dynamic compression method to select the number of flash channels used for compression according to the foreground user's read/write ratio; the key-value storage management system uses the compression-aware caching algorithm, applying a strategy of prefetching and prioritized eviction to key-value data being compressed, and a strategy of no prefetching with least-recently-used replacement to data requested by foreground users; the key-value storage management system uses the priority-based scheduling policy to schedule read, write, and erase requests to the flash device, wherein the priority of the user's read and write requests is higher than that of background compression requests, and the priority of erase requests rises as the available space of the current flash memory device falls.
  • The bare flash device is an SSD device from which the flash translation layer has been removed and which passes the internal information and control commands of the flash device to the user-mode key-value storage management software through a specific interface.
  • The internal information of the device includes: the number of channels of the flash device, the flash block length, the flash page length, and the capacity of the flash chips; the control commands of the device include: read flash page, write flash page, and erase flash block.
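  • To make the exported information and the three control commands concrete, the following Python sketch simulates a bare flash device in memory. It is purely illustrative: the class `RawFlashDevice`, its method names, and the addressing by (channel, block, page) are assumptions and do not describe the actual driver interface. It enforces the flash rule stated earlier that a page must be erased before it is rewritten and that erasure works on whole blocks.

```python
from typing import Dict, Tuple


class RawFlashDevice:
    """In-memory stand-in for a bare flash device without a translation layer."""

    def __init__(self, channels: int, blocks_per_channel: int,
                 pages_per_block: int, page_size: int) -> None:
        self.channels = channels
        self.blocks_per_channel = blocks_per_channel
        self.pages_per_block = pages_per_block
        self.page_size = page_size
        self._pages: Dict[Tuple[int, int, int], bytes] = {}  # absent key = erased page

    def info(self) -> Dict[str, int]:
        """Internal information a driver could hand to user-mode software."""
        block_size = self.pages_per_block * self.page_size
        return {
            "channels": self.channels,
            "page_size": self.page_size,
            "block_size": block_size,
            "capacity": self.channels * self.blocks_per_channel * block_size,
        }

    def _check(self, channel: int, block: int, page: int = 0) -> None:
        if not (0 <= channel < self.channels
                and 0 <= block < self.blocks_per_channel
                and 0 <= page < self.pages_per_block):
            raise IndexError("address outside the device")

    def write_page(self, channel: int, block: int, page: int, data: bytes) -> None:
        self._check(channel, block, page)
        if len(data) > self.page_size:
            raise ValueError("data larger than one flash page")
        key = (channel, block, page)
        if key in self._pages:
            raise RuntimeError("page must be erased before it is rewritten")
        self._pages[key] = data

    def read_page(self, channel: int, block: int, page: int) -> bytes:
        self._check(channel, block, page)
        return self._pages.get((channel, block, page), b"")  # erased pages read as empty

    def erase_block(self, channel: int, block: int) -> None:
        """Erase works on whole blocks: every page in the block becomes writable again."""
        self._check(channel, block)
        for page in range(self.pages_per_block):
            self._pages.pop((channel, block, page), None)
```

  • Because rewriting requires a block erase, the software above this interface has to perform its own space allocation and garbage collection, which is the role the key-value storage management system takes over.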
  • The key-value storage management system can directly control the space management and garbage collection of the underlying flash memory device without the participation of the file system and the flash translation layer, eliminating the functional redundancy and semantic isolation originally caused by these two layers, reducing system software overhead, and improving the overall performance of the system.
  • The middleware functions along the entire storage path have been specifically optimized: for data layout, a concurrent data layout is designed; for data compression, a dynamic compression method is designed; for data caching, a compression-aware caching algorithm is designed; for request scheduling, a priority-based scheduling policy is designed.
  • The method uses the key-value storage management system to manage the underlying flash memory device directly, bypassing the redundant overhead of the file system and the flash translation layer and optimizing the storage path, thereby reducing the software overhead of the key-value storage management system during data storage and improving system performance; at the same time, it reduces the amount of data written to the flash device, which helps extend the life of the device.
  • FIG. 2 is a schematic diagram showing the implementation principle of a key-value storage management method based on flash-based storage path optimization according to an embodiment of the present invention.
  • The functional structure of the key-value storage management method based on flash-based storage path optimization is mainly divided into three parts: the user-mode key-value storage management software, the kernel-mode bare flash driver, and the bare flash device.
  • The bare flash device consists of flash chips, flash channels, and flash control firmware from which the flash translation layer of a traditional SSD has been removed.
  • The bare flash driver runs in the kernel mode of the operating system. It is responsible for passing the internal information of the bare flash device (the number of flash channels, flash page size, flash block size, and flash chip capacity) and the read, write, and erase command interfaces to the user-mode key-value storage management software; at the same time, the bare flash driver forwards read, write, and erase requests from the user-mode key-value storage management software to the bare flash device.
  • The user-mode key-value storage management software uses the internal flash information and the read, write, and erase control interfaces passed up by the bare flash driver to manage the underlying flash hardware directly from user mode, and it optimizes the storage path through the concurrent data layout, dynamic compression, the compression-aware caching algorithm, and the priority-based scheduling policy.
  • FIG. 3 is a functional diagram of the concurrent data layout.
  • The key-value storage management software sets the length of the key-value file to the length of four flash blocks distributed over four flash channels, called a block group.
  • When the key-value storage management system writes a key-value file, the flash blocks corresponding to that file are distributed over different flash channels. The key-value file can therefore be written concurrently into blocks on the four flash channels, so the write operation exploits the internal concurrency of the flash device.
  • The key-value data is distributed over the four flash blocks in a round-robin manner.
  • Because consecutive key-value data is distributed over the four flash channels in a round-robin manner, the data is read out of the flash device concurrently, so file reads also exploit the internal concurrency of the flash device.
  • The key-value storage management system manages the flash device in units of block groups, which reduces the overhead of managing the physical space of the flash memory.
  • Figure 4 is a functional schematic diagram of dynamic compression.
  • The key-value storage management system records the number of foreground user read and write requests between two compressions. Comparing the number of read requests with the number of write requests shows whether the current workload is read-heavy or write-heavy. When the write ratio is high, the system uses all flash channels to write the compressed key-value file, which relieves the pressure of foreground data writes (full-concurrency compression).
  • When the read ratio is high, the system uses half of the flash channels to write the compressed file, which reduces the interference of background compression with foreground read requests and lowers read latency, as shown by the dotted line in Figure 4 (semi-concurrency compression). For a foreground compression request from the user, the system uses all flash channels to write data, which reduces the user's waiting time.
  • When compressing data, the key-value storage management system first reads the key-value files to be compressed, reading a fixed length from each of several files into the cache at the same time. After the key-value data in the cache has been compressed, it reads the subsequent data of those files, and so on, until all files to be compressed have been read. The system then writes the compressed data to the flash device, and the compression process ends.
  • The functional structure diagram of the compression-aware caching algorithm is shown in Figure 5.
  • After the compression process starts, the key-value storage management system reads the first part of the key-value files to be compressed into the cache and compresses that data. While the system is compressing, its cache starts a prefetch process that preloads the next part of those key-value files into the cache. Once the first part has been compressed, the system compresses the prefetched part, and the cache evicts the first part, which has already been used.
  • For foreground user read and write requests, the cache of the key-value storage management system uses a caching algorithm optimized for foreground requests.
  • The flash page length is used as the cache granularity; foreground read requests are not prefetched; when cache space runs short, cached data is replaced according to the least-recently-used principle.
  • The key-value storage management system schedules requests with a priority-based scheduling policy, as shown in the priority-based scheduling policy diagram of Figure 6.
  • For read and write requests from foreground users, the key-value storage management system assigns a high priority when scheduling. For read and write requests generated by background data compression, it assigns a low priority.
  • Within the same priority level, read requests are scheduled ahead of write requests. For erase requests, the system adjusts the priority dynamically according to the current usage of the flash device.
  • The key-value storage management system records the available space of the flash device. When the available space is less than 40% of the total storage space, erase requests receive the highest priority and are scheduled first. When the available space is greater than 40% of the total storage space, erase requests receive the lowest priority and are scheduled last.
  • In its storage architecture, the key-value storage management method with flash-based storage path optimization removes the file system and the flash translation layer found in the traditional key-value storage management system structure and manages the storage of key-value data on flash directly in user mode.
  • This eliminates the semantic isolation and redundant management caused by the file system and the flash translation layer, and avoids the extra garbage collection and write amplification they incur.
  • For the physical storage of data, the concurrent data layout method sizes each key-value file as an integral multiple of the flash block and distributes the flash blocks of the same key-value file across different flash channels, so as to exploit the internal concurrency of the flash memory.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Memory System (AREA)

Abstract

Provided is a key-value storage management method with flash-based storage path optimization, the method comprising: a key-value storage management system directly managing a bare flash device, bypassing the file system and the flash translation layer; when allocating physical space, distributing a key-value file across different flash channels of the flash device in units of flash blocks, and storing each key-value file as an integral multiple of the flash block size; when compressing data, dynamically choosing the number of flash channels used to write the compressed data according to the access characteristics of foreground users; when caching data, using a compression-aware caching algorithm that does not cache compression data; and when scheduling requests, preferentially scheduling user requests and foreground compression requests, and determining the priority of erase requests according to the available space of the flash storage device. The invention improves the performance of the key-value storage system, reduces the amount of data written to the flash device, and prolongs the service life of the device.

Description

Key-value storage management method with flash-based storage path optimization
Cross-reference to related applications
This application claims priority to Chinese Patent Application No. 201710812821.9, entitled "Key-value storage management method with flash-based storage path optimization", filed by Tsinghua University on September 11, 2017.
Technical field
The present invention relates to the field of flash memory storage technologies, and in particular to a key-value storage management method with flash-based storage path optimization.
Background
Flash memory is an electrically erasable programmable memory. Compared with traditional magnetic disk media, flash memory offers high read/write bandwidth, low access latency, low power consumption, and high stability, and it is becoming common in data centers, personal computers, and mobile devices. Flash memory is read and written in units of pages; before a page can be rewritten, it must first be erased. Flash memory is erased in units of blocks, and one flash block contains several hundred flash pages. A flash cell endures only a limited number of erase cycles, so each flash cell has a limited lifetime. The internal structure of a flash drive also differs significantly from that of a magnetic disk. Inside a flash drive, flash chips are connected to the flash controller through different channels. Each flash chip packages multiple flash dies, and each die can execute commands independently. Each die contains multiple flash planes, and each plane has its own registers, which allows pipelined command execution across planes. Through these levels of concurrency, a solid-state drive can provide ample access bandwidth. This property is called the internal concurrency of flash devices.
Today's general-purpose solid-state drives are built on flash drives. They use a flash translation layer to manage flash reads, writes, erases, and internal concurrency, and they present to the software system the same read/write interface as a traditional disk. The flash translation layer has three main functions: address mapping, garbage collection, and wear leveling. Address mapping records the mapping between logical addresses and physical flash addresses during out-of-place updates. Garbage collection selects and erases invalid flash pages to return them to the free state for new writes. Wear leveling keeps data storage reliable by spreading writes evenly across flash pages. The flash translation layer also provides ECC (Error Correction Code) checking, bad block management, and other functions.
The flash translation layer manages the properties specific to flash and is transparent to the software above it. Existing software can therefore run on solid-state drives without modification, which saves the cost of redeveloping and redeploying software. However, deploying existing software such as key-value storage management software directly on a flash storage system still brings many problems and extra overhead. Existing key-value storage management software runs on top of a file system, and the file system is functionally redundant with the flash translation layer: both perform space allocation, garbage collection, and addressing. These redundant functions add software overhead. There is also semantic isolation between the file system and the flash translation layer. The flash translation layer cannot be optimized specifically for the key-value storage management software above it, and the file system cannot see the characteristics of the underlying flash or exploit its internal concurrency. Another prominent problem is that, along the entire storage path of current key-value storage systems, all software middleware was designed and developed around the characteristics of magnetic disks. It has not been optimized for the characteristics of flash and of key-value storage systems, so it cannot realize the performance advantages of either. There is currently little research on how to optimize the storage path for flash-based key-value storage.
Summary of the invention
The present invention aims to solve at least one of the above technical problems.
To this end, the object of the present invention is to provide a key-value storage management method with flash-based storage path optimization that improves the performance of the key-value storage system, reduces the amount of data written to the flash device, and prolongs the service life of the device.
To achieve this object, an embodiment of the present invention provides a key-value storage management method with flash-based storage path optimization, comprising the following steps. S1: the key-value storage management system manages the bare flash device directly, bypassing the file system and the flash translation layer. S2: when the key-value storage management system allocates physical space, it uses a concurrent data layout method that distributes a key-value file across different flash channels of the flash device in units of flash blocks, and it stores each key-value file as an integral multiple of the flash block size. S3: when the key-value storage management system compresses data, it uses a dynamic compression method that chooses the number of flash channels used to write the compressed data according to the access characteristics of foreground users. S4: when the key-value storage management system caches data, it uses a compression-aware caching algorithm that does not cache compression data, and the space saved is used to cache users' read and write data. S5: when the key-value storage management system schedules requests, it uses a priority-based scheduling policy that preferentially schedules user requests and foreground compression requests, and it determines the priority of erase requests according to the available space of the flash storage device.
In addition, the key-value storage management method with flash-based storage path optimization according to the above embodiment of the present invention may have the following additional technical features:
In some examples, in S1, the bare flash device exports its internal structure information directly to user mode and, through a specific interface, lets the key-value storage management system manage the bare flash device directly in user mode, bypassing the file system and the flash translation layer of the traditional key-value storage management system structure. The internal structure information of the device includes at least the number of flash channels, the flash block size, and the read, write, and erase controls of the flash memory.
In some examples, in S2, the concurrent data layout method specifically includes: the key-value storage management system sets the length of a key-value file from the number of flash channels and the flash block size passed up by the bare flash device; when storing a key-value file, the system splits the file into units of the flash block length and distributes the resulting flash blocks across different flash channels; the data of the key-value file is distributed among the flash blocks in round-robin fashion.
In some examples, in S3, the dynamic compression method specifically includes: when the key-value storage management system compresses key-value data in the background, it first determines the read/write ratio of foreground user requests. When the user's write ratio is greater than a first preset ratio, the system uses all flash channels to write the compressed key-value file. When the read ratio of the user's requests is higher than a second preset ratio, the system uses half of the flash channels to write the compressed file. The system determines the read/write ratio by recording the types of user requests between two consecutive compression operations. For a foreground compression request from the user, the system uses all flash channels to write data.
In some examples, the method for compressing data specifically includes: when compressing data, the key-value storage management system first reads the key-value files that need to be compressed, reading a fixed length from each of several files into the cache at the same time; after the key-value data in the cache has been compressed, it reads the subsequent data of those files, and so on, until all files to be compressed have been read; the system then writes the compressed data to the flash device, and the compression process ends.
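The chunked read loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: files are modeled as in-memory `bytes` objects, and `read_rounds`, `compress_files`, and the `merge_chunks` callback are hypothetical names standing in for whatever the system actually does to the cached data in each round.

```python
from typing import Callable, Iterator, List, Sequence


def read_rounds(files: Sequence[bytes], chunk_len: int) -> Iterator[List[bytes]]:
    """Yield one fixed-length chunk from every input file per round.

    Files shorter than the current offset contribute an empty chunk.
    """
    if chunk_len <= 0:
        raise ValueError("chunk_len must be positive")
    longest = max((len(f) for f in files), default=0)
    for offset in range(0, longest, chunk_len):
        yield [f[offset:offset + chunk_len] for f in files]


def compress_files(
    files: Sequence[bytes],
    chunk_len: int,
    merge_chunks: Callable[[List[bytes]], bytes],
) -> bytes:
    """Process all files round by round and return the combined output.

    merge_chunks is a hypothetical stand-in for the per-round compression
    step; the output would be written to flash once every round is done.
    """
    out = bytearray()
    for chunks in read_rounds(files, chunk_len):
        out += merge_chunks(chunks)
    return bytes(out)
```

Only one round of chunks needs to be held in the cache at a time, which is what keeps the cache footprint of compression bounded regardless of file size.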
In some examples, in S4, the compression-aware caching algorithm specifically includes: after the compression process starts, the key-value storage management system reads the first part of the key-value files to be compressed into the cache and compresses the data in the cache; while the system is compressing, its cache starts a prefetch process that preloads the next part of those key-value files into the cache; once the first part has been compressed, the system compresses the prefetched part, and the cache evicts the first part, which has already been used; for foreground user read and write requests, the cache of the system uses a caching algorithm optimized for foreground requests.
In some examples, when caching foreground user requests, the caching algorithm optimized for foreground requests includes: for foreground read and write requests, managing the cache with the flash page length as the cache granularity; performing no prefetching for foreground read requests; and, when cache space runs short, replacing cached data according to the least-recently-used principle.
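The foreground caching rules above (page-sized entries, no prefetch, least-recently-used replacement) can be illustrated with a small sketch. The class name `PageCache`, its methods, and the `load_page` callback are assumptions made for illustration; the source does not specify any of these interfaces.

```python
from collections import OrderedDict
from typing import Callable


class PageCache:
    """LRU cache whose entries are single flash pages (illustrative only)."""

    def __init__(self, capacity_pages: int, page_size: int) -> None:
        if capacity_pages < 1 or page_size < 1:
            raise ValueError("capacity_pages and page_size must be positive")
        self.capacity = capacity_pages
        self.page_size = page_size
        self._pages: "OrderedDict[int, bytes]" = OrderedDict()

    def page_no(self, offset: int) -> int:
        """Cache granularity is one flash page, so offsets map to page numbers."""
        return offset // self.page_size

    def put(self, page_no: int, data: bytes) -> None:
        self._pages[page_no] = data
        self._pages.move_to_end(page_no)  # mark as most recently used
        while len(self._pages) > self.capacity:
            self._pages.popitem(last=False)  # evict the least recently used page

    def get(self, page_no: int, load_page: Callable[[int], bytes]) -> bytes:
        if page_no in self._pages:
            self._pages.move_to_end(page_no)
            return self._pages[page_no]
        # Miss: load only the requested page; neighbours are not prefetched.
        data = load_page(page_no)
        self.put(page_no, data)
        return data
```

Skipping prefetch on foreground reads fits workloads where key lookups are effectively random, so neighbouring pages are unlikely to be needed soon.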
In some examples, in S5, the priority-based scheduling policy specifically includes: for read and write requests from foreground users, the key-value storage management system assigns a high priority when scheduling; for read and write requests generated by background data compression, it assigns a low priority; within the same priority level, read requests are scheduled ahead of write requests; for erase requests, the system adjusts the priority dynamically according to the current usage of the flash device.
In some examples, the key-value storage management system dynamically adjusts the priority of erase requests as follows: the system records the available space of the flash device; when the available space is less than a third preset ratio of the total storage space, erase requests receive the highest priority and are scheduled first; when the available space is greater than the third preset ratio of the total storage space, erase requests receive the lowest priority and are scheduled last.
In some examples, the third preset ratio is 40%.
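The ordering rules above can be sketched with a priority queue. This is a simplified illustration under assumptions: the names `priority` and `Scheduler` are hypothetical, priorities are fixed at submission time rather than re-evaluated as free space changes, and the case where free space equals exactly 40% (which the text leaves open) is treated as low priority.

```python
import heapq
import itertools
from typing import Any, Optional, Tuple

ERASE_THRESHOLD = 0.40  # the "third preset ratio" given in the text


def priority(kind: str, origin: Optional[str], free_ratio: float) -> Tuple[int, int]:
    """Return a sort key; a smaller tuple is scheduled earlier.

    kind is "read", "write", or "erase"; origin is "foreground" or
    "background" (ignored for erase requests).
    """
    if kind == "erase":
        # Highest priority when space is short, lowest otherwise.
        return (0, 0) if free_ratio < ERASE_THRESHOLD else (3, 0)
    level = 1 if origin == "foreground" else 2
    return (level, 0 if kind == "read" else 1)  # reads before writes


class Scheduler:
    def __init__(self) -> None:
        self._heap: list = []
        self._seq = itertools.count()  # FIFO tie-break within one priority

    def submit(self, kind: str, origin: Optional[str],
               free_ratio: float, payload: Any = None) -> None:
        key = priority(kind, origin, free_ratio)
        heapq.heappush(self._heap, (key, next(self._seq), kind, origin, payload))

    def pop(self) -> Tuple[str, Optional[str], Any]:
        _, _, kind, origin, payload = heapq.heappop(self._heap)
        return kind, origin, payload
```

Deferring erases while space is plentiful keeps slow erase operations out of the way of user requests, while promoting them when space is short prevents writes from stalling on a full device.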
The key-value storage management method with flash-based storage path optimization according to the embodiments of the present invention has the following advantages. In the storage architecture, it removes the file system and the flash translation layer of the traditional key-value storage management system structure and manages the storage of key-value data on flash directly in user mode, which eliminates the semantic isolation and redundant management caused by the file system and the flash translation layer and avoids the extra garbage collection and write amplification they incur. In the physical storage of key-value data, the concurrent data layout method sizes each key-value file as an integral multiple of the flash block and distributes the flash blocks of the same key-value file across different flash channels, which exploits the internal concurrency of flash, raises the effective bandwidth of reads and writes, and lowers read and write latency. When compressing key-value data, the dynamic compression method reduces the number of channels occupied by compression writes according to the foreground read/write ratio, which reduces the interference of background compression with foreground reads and writes. When caching key-value data, the compression-aware caching algorithm evicts compression data first and leaves more space for users' read and write data, which effectively raises the cache hit rate. When scheduling flash read and write requests, the priority-based scheduling policy schedules user requests and foreground compression requests first, which reduces user-visible latency. The method therefore improves the performance of the key-value storage system, reduces the amount of data written to the flash device, and prolongs the service life of the device.
Additional aspects and advantages of the invention will be set forth in part in the description that follows, will in part become apparent from that description, or will be learned through practice of the invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:
Figure 1 is a flowchart of a key-value storage management method with flash-based storage path optimization according to one embodiment of the present invention;
Figure 2 is a schematic diagram of the implementation principle of the key-value storage management method with flash-based storage path optimization according to one embodiment of the present invention;
Figure 3 is a schematic diagram of the concurrent data layout according to one embodiment of the present invention;
Figure 4 is a schematic diagram of dynamic compression according to one embodiment of the present invention;
Figure 5 is a schematic diagram of the compression-aware caching algorithm according to one embodiment of the present invention;
Figure 6 is a schematic diagram of the priority-based scheduling policy according to one embodiment of the present invention.
Detailed description
Embodiments of the present invention are described in detail below. Examples of the embodiments are illustrated in the drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements with the same or similar functions. The embodiments described below with reference to the drawings are exemplary, are intended only to explain the invention, and are not to be construed as limiting it.
In the description of the present invention, it is to be understood that orientation or position terms such as "center", "longitudinal", "lateral", "upper", "lower", "front", "back", "left", "right", "vertical", "horizontal", "top", "bottom", "inside", and "outside" are based on the orientations or positions shown in the drawings. They are used only to simplify the description; they do not indicate or imply that the device or element referred to must have a particular orientation or be constructed and operated in a particular orientation, and they are therefore not to be construed as limiting the invention. Moreover, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
In the description of the present invention, it should be noted that, unless otherwise explicitly specified and defined, the terms "installed", "coupled", and "connected" are to be understood broadly. For example, a connection may be fixed, detachable, or integral; it may be mechanical or electrical; it may be direct or indirect through an intermediate medium; and it may be an internal communication between two elements. A person of ordinary skill in the art can understand the specific meaning of these terms in the present invention according to the specific situation.
A key-value storage management method with flash-based storage path optimization according to an embodiment of the present invention is described below with reference to the drawings.
Figure 1 is a flowchart of a key-value storage management method with flash-based storage path optimization according to one embodiment of the present invention. As shown in Figure 1, the method includes the following steps:
Step S1: the key-value storage management system manages the bare flash device directly, bypassing the file system and the flash translation layer.
Specifically, in step S1, the bare flash device is a flash storage device from which the flash translation layer of a traditional flash solid-state drive has been removed. The bare flash device exports its internal structure information directly to user mode and, through a specific interface, lets the key-value storage management system manage the bare flash device directly in user mode, bypassing the file system and the flash translation layer of the traditional key-value storage management system structure. The internal structure information of the device includes at least the number of flash channels, the flash block size, and the read, write, and erase controls of the flash memory.
In other words, the bare flash device includes, for example, a flash drive with the flash translation layer removed and a kernel-mode device driver that passes the flash device's internal information and control commands to the user-mode key-value storage management system. Direct management of the bare flash device by the key-value storage management system includes the following: the system allocates physical flash pages for the key-value files to be written; the system uses erase requests to reclaim used physical flash space, a process also called garbage collection; the system caches the data that is read and written; and the system schedules the read, write, and erase requests sent to the flash device.
Step S2: when the key-value storage management system allocates physical space, it uses a concurrent data layout method that distributes a key-value file across different flash channels of the flash device in units of flash blocks, and it stores each key-value file as an integral multiple of the flash block size.
Specifically, in step S2, the concurrent data layout method includes: the key-value storage management system sets the length of a key-value file from the number of flash channels and the flash block size passed up by the bare flash device, where the length of a key-value file is the product of the number of flash channels and the flash block length; when storing a key-value file, the system splits the file into units of the flash block length and distributes the resulting flash blocks across different flash channels; the data of the key-value file is distributed among the flash blocks in round-robin fashion.
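A short sketch may make the layout arithmetic concrete. The file length follows the rule stated above (number of channels times block length). The striping granularity is an assumption: the text says only that data is distributed round-robin among the blocks, and this sketch stripes at flash-page granularity. `FlashGeometry` and `locate` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FlashGeometry:
    channels: int         # number of flash channels reported by the driver
    pages_per_block: int  # flash pages in one flash block
    page_size: int        # bytes per flash page

    @property
    def block_size(self) -> int:
        return self.pages_per_block * self.page_size

    @property
    def file_length(self) -> int:
        # Key-value file length = channel count x flash block length.
        return self.channels * self.block_size


def locate(geo: FlashGeometry, offset: int) -> Tuple[int, int, int]:
    """Map a byte offset in a key-value file to (channel, page in block, byte in page)."""
    if not 0 <= offset < geo.file_length:
        raise ValueError("offset outside the key-value file")
    logical_page, byte_in_page = divmod(offset, geo.page_size)
    # Round-robin (assumed page granularity): consecutive logical pages
    # land on consecutive channels, one block per channel.
    page_in_block, channel = divmod(logical_page, geo.channels)
    return channel, page_in_block, byte_in_page
```

With four channels, 256 pages per block, and 4 KiB pages, a file is 4 MiB long and any run of four consecutive pages touches all four channels, so a sequential read or write of the file can keep every channel busy.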
Step S3: When the key-value storage management system compresses data, it uses a dynamic compression method that, according to the access characteristics of foreground users, dynamically selects the number of flash channels used to write the compressed data.
Specifically, in step S3, the dynamic compression method comprises the following. Before compressing key-value data in the background, the key-value storage management system first determines the read/write ratio of foreground user requests. When the proportion of writes exceeds a first preset ratio (that is, the workload is write-heavy), the system uses all flash channels to write the compressed key-value files; when the proportion of reads exceeds a second preset ratio (that is, the workload is read-heavy), the system uses half of the flash channels to write the compressed files. The system determines the read/write ratio by recording the types of user requests issued between two consecutive compression operations. For a user's foreground compression request, the system uses all flash channels to write data. To determine the read/write ratio, the system counts the foreground user read and write requests between two compressions; comparing the number of read requests with the number of write requests shows whether the current workload is read-heavy or write-heavy.
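As a rough sketch of the channel selection in step S3, the following Python class counts foreground requests between two compressions and picks the channel count. The class and method names are hypothetical. The text does not give values for the first and second preset ratios, so the 0.5 defaults are assumptions, as is the fallback to all channels when neither threshold is exceeded or no requests were recorded.

```python
class ChannelSelector:
    """Chooses how many flash channels a compression run may write to."""

    def __init__(self, num_channels, write_threshold=0.5, read_threshold=0.5):
        self.num_channels = num_channels
        self.write_threshold = write_threshold  # "first preset ratio" (assumed value)
        self.read_threshold = read_threshold    # "second preset ratio" (assumed value)
        self.reads = 0
        self.writes = 0

    def record(self, is_write):
        """Count one foreground user request."""
        if is_write:
            self.writes += 1
        else:
            self.reads += 1

    def channels_for_compression(self, foreground=False):
        """Return the channel count for the next compression and reset the counters."""
        total = self.reads + self.writes
        channels = self.num_channels  # default: full parallelism
        if not foreground and total > 0:
            write_ratio = self.writes / total
            read_ratio = self.reads / total
            if write_ratio <= self.write_threshold and read_ratio > self.read_threshold:
                channels = max(1, self.num_channels // 2)  # read-heavy: half the channels
        self.reads = self.writes = 0  # start counting for the next interval
        return channels
```

Resetting the counters on every call means each decision reflects only the requests seen since the previous compression, which matches the "between two consecutive compression operations" wording.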
In the above process, data is compressed as follows. The key-value storage management system first reads the key-value files to be compressed, reading a fixed-length portion of several files into the cache at the same time. Once the key-value data in the cache has been compressed, the system reads the next portion of those files, and so on, until all files to be compressed have been read. The system then writes the compressed data to the flash device, and the compression process ends.
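The round-by-round reading described above might look like the following Python sketch. The names are hypothetical, files are modeled as in-memory byte strings, and `compress_batch` stands in for whatever compression routine the system actually uses, which the text does not specify.

```python
from typing import Callable, Iterator, List


def read_in_rounds(files: List[bytes], chunk_len: int) -> Iterator[List[bytes]]:
    """Yield one batch per round: the next chunk_len bytes of every unfinished file."""
    offset = 0
    while any(offset < len(f) for f in files):
        yield [f[offset:offset + chunk_len] for f in files if offset < len(f)]
        offset += chunk_len


def compress_files(files: List[bytes], chunk_len: int,
                   compress_batch: Callable[[List[bytes]], bytes]) -> List[bytes]:
    """Compress the files round by round; the caller writes the result to flash."""
    return [compress_batch(batch) for batch in read_in_rounds(files, chunk_len)]
```

Only one fixed-length portion per file is resident at a time, so the memory needed for a compression run is bounded by the chunk length times the number of files.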
Step S4: When the key-value storage management system caches data, it uses a compression-aware cache algorithm that does not cache compressed data; the space saved is used to cache the user's read and write data.
Specifically, in step S4, the compression-aware cache algorithm comprises the following. After the compression process starts, the key-value storage management system reads the first part of the key-value files to be compressed into the cache and compresses the data in the cache. While this compression is in progress, the cache starts a prefetch process that loads the next part of the key-value files to be compressed into the cache in advance. After the first part has been compressed, the system compresses the prefetched next part, and the cache evicts the first part, which has already been used. For foreground user read and write requests, the cache uses a cache algorithm optimized for foreground requests.
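The prefetch-then-evict order of step S4 can be sketched sequentially in Python. This is only an illustration of the ordering: a real system would overlap the prefetch with the compression, and the function and parameter names are hypothetical.

```python
def compress_with_prefetch(part_ids, load, compress):
    """Compress parts in order, loading part i+1 before part i is compressed.

    load(part_id) returns the data of one part; compress(data) returns its
    compressed form. A used part is dropped as soon as the next one takes over.
    """
    results = []
    if not part_ids:
        return results
    current = load(part_ids[0])               # first part is read into the cache
    for next_id in list(part_ids[1:]) + [None]:
        prefetched = load(next_id) if next_id is not None else None
        results.append(compress(current))     # compress while the next part is ready
        current = prefetched                  # the used part leaves the cache
    return results
```

At most two parts are held at any moment (the one being compressed and the prefetched one), which is what frees the rest of the cache for foreground data.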
When caching the requests of foreground users, the cache algorithm optimized for foreground requests works as follows: read and write requests of foreground users are cached at the granularity of one flash page; read requests of foreground users are not prefetched; and when cache space runs short, cached data is replaced according to the least-recently-used (LRU) principle.
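A minimal Python sketch of such a foreground cache is shown below, assuming an in-memory `OrderedDict` as the LRU structure. The class name and the `load_page` callback are hypothetical; the text specifies only page granularity, no prefetching, and LRU replacement.

```python
from collections import OrderedDict


class ForegroundPageCache:
    """Page-granular LRU cache for foreground requests, without prefetching."""

    def __init__(self, capacity_pages, page_size):
        self.capacity_pages = capacity_pages
        self.page_size = page_size
        self._pages = OrderedDict()  # page number -> page data, oldest first

    def get(self, offset, load_page):
        """Return the flash page containing byte offset, loading it on a miss."""
        page_no = offset // self.page_size
        if page_no in self._pages:
            self._pages.move_to_end(page_no)  # mark as most recently used
            return self._pages[page_no]
        data = load_page(page_no)             # only the requested page is read
        self._pages[page_no] = data
        if len(self._pages) > self.capacity_pages:
            self._pages.popitem(last=False)   # evict the least recently used page
        return data
```

Skipping prefetch keeps a foreground miss from pulling in pages that were never requested, so cache capacity is spent only on data users actually touched.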
Step S5: When the key-value storage management system schedules requests, it uses a priority-based scheduling policy that gives precedence to user requests and foreground compression requests, and it determines the scheduling priority of erase requests from the space currently available on the flash storage device.
Specifically, in step S5, the priority-based scheduling policy comprises the following. Read and write requests from foreground users are scheduled with high priority. Read and write requests generated by background data compression are scheduled with low priority. Within the same priority level, read requests are scheduled ahead of write requests. For erase requests, the key-value storage management system adjusts the priority dynamically according to the current usage of the flash device.
The key-value storage management system adjusts the priority of erase requests dynamically as follows. The system records the space currently available on the flash device. When the available space is less than a third preset ratio of the total storage space, the system gives erase requests the highest priority, and they are scheduled first; when the available space is greater than the third preset ratio of the total storage space, the system gives erase requests the lowest priority, and they are scheduled last. The third preset ratio is, for example, 40%: erase requests are scheduled first when less than 40% of the total storage space is available, and last when more than 40% is available.
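The scheduling rules of step S5 can be sketched with a priority queue in Python. The class is hypothetical and makes two simplifying assumptions the text does not settle: the priority of an erase request is fixed when it is submitted rather than re-evaluated as free space changes, and available space of exactly 40% is treated as the low-priority case.

```python
import heapq
from itertools import count

FOREGROUND, BACKGROUND = "foreground", "background"


class PriorityScheduler:
    ERASE_THRESHOLD = 0.40  # the example "third preset ratio" from the text

    def __init__(self, total_space, free_space):
        self.total_space = total_space
        self.free_space = free_space
        self._queue = []
        self._seq = count()  # tie-breaker that keeps FIFO order within a priority

    def _priority(self, op, source):
        if op == "erase":
            low_space = self.free_space < self.ERASE_THRESHOLD * self.total_space
            return (0, 0) if low_space else (3, 0)  # first when space is short, else last
        level = 1 if source == FOREGROUND else 2    # foreground before background
        return (level, 0 if op == "read" else 1)    # reads before writes in a level

    def submit(self, op, source=BACKGROUND):
        entry = (self._priority(op, source), next(self._seq), op, source)
        heapq.heappush(self._queue, entry)

    def next_request(self):
        _, _, op, source = heapq.heappop(self._queue)
        return op, source
```

With plenty of free space, erases drain only after all reads and writes; once free space drops below the threshold, newly submitted erases jump ahead of foreground traffic so that space is reclaimed before writes stall.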
In summary, in the flash-based, storage-path-optimized key-value storage management method according to embodiments of the present invention, the key-value storage management system is deployed directly on a bare flash device, and the bare flash device passes the internal structure information of the flash memory and its read, write, and erase control interfaces to the key-value storage management system through a specific interface. The system uses the concurrent data layout method to distribute key-value files over the physical units of the flash memory. It uses the dynamic compression method to select, according to the read/write ratio of foreground users, the number of flash channels used for compression. Through the compression-aware cache algorithm, it applies prefetching and priority eviction to key-value data being compressed, and applies no prefetching and least-recently-used replacement to data requested by foreground users. It uses the priority-based scheduling policy to schedule the read, write, and erase requests sent to the flash device, where user read and write requests have higher priority than background compression requests and the priority of erase requests varies inversely with the space currently available on the flash device.
Further, the bare flash device is a solid-state disk device from which the flash translation layer has been removed and which, through a specific interface, passes the flash device's internal information and control commands to the user-mode key-value storage management software. The internal information of the device includes the number of channels of the flash device, the flash block length, the flash page length, and the flash chip capacity; the control commands of the device include the read-flash-page, write-flash-page, and erase-flash-block commands. Passing this information and these commands to the key-value storage management system lets the system directly control space management, garbage collection, and similar tasks on the underlying flash device without involving a file system or a flash translation layer. This eliminates the functional redundancy and semantic isolation caused by those two layers, reduces system software overhead, and improves overall system performance.
The method therefore applies a specific optimization to each middleware function along the storage path: a concurrent data layout for data placement, a dynamic compression method for data compression, a compression-aware cache algorithm for data caching, and a priority-based scheduling policy for request scheduling. In other words, the method uses the key-value management system to manage the underlying flash device directly, bypassing the redundant overhead of the file system and the flash translation layer and optimizing the storage path. This reduces the software overhead of the key-value storage management system when storing data and improves system performance; it also reduces the amount of data written to the flash device, which helps extend device lifetime.
To facilitate a better understanding of the present invention, the flash-based, storage-path-optimized key-value storage management method of the above embodiments is described below by way of example, with reference to the accompanying drawings and specific embodiments.
FIG. 2 is a schematic diagram of the implementation principle of the flash-based, storage-path-optimized key-value storage management method according to an embodiment of the present invention. As shown in FIG. 2, in a specific embodiment the functional structure of the method is divided into three main parts: the user-mode key-value storage management software, the kernel-mode bare flash driver, and the bare flash device. The bare flash device consists of flash chips, flash channels, and flash control firmware, with the flash translation layer of a conventional solid-state disk removed. The bare flash driver runs in the kernel mode of the operating system. It passes the internal information of the bare flash device (the number of flash channels, the flash page size, the flash block size, and the flash chip capacity) and the read, write, and erase command interfaces to the user-mode key-value storage management software; it also forwards the read, write, and erase requests issued by that software to the bare flash device. Using the flash internal information and the read, write, and erase control interfaces provided by the bare flash driver, the user-mode key-value storage management software manages the underlying flash hardware directly from user mode and optimizes the storage path through the concurrent data layout, dynamic compression, the compression-aware cache algorithm, and the priority-based scheduling policy.
FIG. 3 is a functional diagram of the concurrent data layout. In an embodiment of the invention, the key-value storage management software sets the length of a key-value file to the length of four flash blocks, which are distributed across four flash channels and are called a block group. When the key-value storage management system writes a key-value file, the file's flash blocks lie on different flash channels, so the file can be written in parallel to the blocks on all four channels; file writes thus exploit the internal parallelism of the flash device. Inside a key-value file, the key-value data is distributed over the four flash blocks in a round-robin manner. When consecutive data in a key-value file is read, the consecutive key-value pairs are spread round-robin over the four flash channels and are therefore read from the flash device concurrently; file reads thus also exploit the device's internal parallelism. Moreover, because a key-value file and a block group have the same length, the system manages the flash device in units of block groups, which reduces the overhead of managing physical flash space.
In an embodiment of the invention, dynamic compression works as shown in FIG. 4. The key-value storage management system counts the foreground user read and write requests between two compressions; comparing the number of read requests with the number of write requests shows whether the current workload is read-heavy or write-heavy. When the workload is write-heavy, the system uses all flash channels to write the compressed key-value files, which relieves the pressure from foreground user writes; this full-parallelism case is shown by the solid lines in FIG. 4. When the workload is read-heavy, the system uses half of the flash channels to write the compressed files, which reduces the interference of background compression with foreground user reads and lowers read latency; this half-parallelism case is shown by the dashed lines in FIG. 4. For a user's foreground compression request, the system uses all flash channels to write data, which reduces the time the user waits.
In an embodiment of the invention, when compressing data the key-value storage management system first reads the key-value files to be compressed, reading a fixed-length portion of several of those files into the cache at the same time. Once the key-value data in the cache has been compressed, the system reads the next portion of those files, and so on, until all files to be compressed have been read; the system then writes the compressed data to the flash device, and the compression process ends. The functional structure of the compression-aware cache algorithm is shown in FIG. 5. After the compression process starts, the system reads the first part of the key-value files to be compressed into the cache and compresses the data in the cache. While this compression is in progress, the cache starts a prefetch process that loads the next part of those key-value files into the cache in advance. After the first part has been compressed, the system compresses the prefetched next part, and the cache evicts the first part, which has already been used. For foreground user read and write requests, the cache uses a cache algorithm optimized for foreground requests: such requests are cached at the granularity of one flash page; foreground read requests are not prefetched; and when cache space runs short, cached data is replaced according to the least-recently-used principle.
In an embodiment of the invention, the key-value storage management system schedules requests with the priority-based scheduling policy shown in FIG. 6. Read and write requests from foreground users are scheduled with high priority. Read and write requests generated by background data compression are scheduled with low priority. Within the same priority level, read requests are scheduled ahead of write requests. For erase requests, the system adjusts the priority dynamically according to the current usage of the flash device: it records the space currently available on the flash device; when the available space is less than 40% of the total storage space, erase requests receive the highest priority and are scheduled first; when the available space is greater than 40% of the total storage space, erase requests receive the lowest priority and are scheduled last.
In summary, the flash-based, storage-path-optimized key-value storage management method according to embodiments of the present invention has the following properties. In terms of storage architecture, it removes the file system and the flash translation layer found in the structure of a conventional key-value storage management system and manages the storage of key-value data on flash directly from user mode. This eliminates the semantic isolation and redundant management introduced by the file system and the flash translation layer and avoids the extra garbage collection and write amplification they cause. For the physical storage of key-value data, it uses the concurrent data layout method: the size of a key-value file is an integer multiple of the flash block size, and the flash blocks belonging to one key-value file are distributed across different flash channels, which exploits the internal parallelism of flash, raises the effective bandwidth of reads and writes, and lowers read and write latency. When compressing key-value data, it uses the dynamic compression method, which reduces the number of channels used to write compressed data according to the read/write ratio of foreground users and so reduces the interference of background compression with foreground reads and writes. When caching key-value data, it uses the compression-aware cache algorithm, which evicts compression data first and leaves more space for the user's read and write data, effectively raising the cache hit rate. When scheduling flash read and write requests, it uses the priority-based scheduling policy, which gives precedence to user requests and foreground compression requests and so reduces user-visible latency. The method therefore improves the performance of the key-value storage system, reduces the amount of data written to the flash device, and extends the service life of the device.
In the description of this specification, a reference to the terms "one embodiment", "some embodiments", "example", "specific example", "some examples", or the like means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, such references do not necessarily refer to the same embodiment or example. Furthermore, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present invention have been shown and described, a person of ordinary skill in the art will understand that various changes, modifications, substitutions, and variations may be made to these embodiments without departing from the principles and spirit of the present invention; the scope of the invention is defined by the claims and their equivalents.

Claims (10)

  1. A flash-based, storage-path-optimized key-value storage management method, characterized by comprising the following steps:
    S1: managing a bare flash device directly through a key-value storage management system, bypassing the file system and the flash translation layer;
    S2: when the key-value storage management system allocates physical space, using a concurrent data layout method to distribute key-value files, in units of flash blocks, across different flash channels of the flash device, the key-value storage management system storing each key-value file as an integer multiple of the flash block size;
    S3: when the key-value storage management system compresses data, using a dynamic compression method that, according to the access characteristics of foreground users, dynamically selects the number of flash channels used to write the compressed data;
    S4: when the key-value storage management system caches data, using a compression-aware cache algorithm that does not cache compressed data, the space saved being used to cache the user's read and write data;
    S5: when the key-value storage management system schedules requests, using a priority-based scheduling policy that gives precedence to user requests and foreground compression requests, and determining the scheduling priority of erase requests from the space currently available on the flash storage device.
  2. The flash-based, storage-path-optimized key-value storage management method according to claim 1, characterized in that, in S1, the bare flash device exports the internal structure information of the device directly to user mode and, through a specific interface, enables the key-value storage management system to manage the bare flash device directly from user mode, so as to bypass the file system and the flash translation layer in the structure of a conventional key-value storage management system, wherein the internal structure information of the device includes at least the number of flash channels, the flash block size, and the read, write, and erase controls of the flash memory.
  3. The flash-based, storage-path-optimized key-value storage management method according to claim 1, characterized in that, in S2, the concurrent data layout method specifically comprises:
    the key-value storage management system setting the length of a key-value file from the number of flash channels and the flash block size reported by the bare flash device;
    the key-value storage management system, when storing a key-value file, splitting the file into pieces of one flash block length each and distributing the different flash blocks across different flash channels;
    the data in the key-value file being distributed over the flash blocks in a round-robin manner.
  4. The flash-based, storage-path-optimized key-value storage management method according to claim 1, characterized in that, in S3, the dynamic compression method specifically comprises:
    the key-value storage management system, before compressing key-value data in the background, first determining the read/write ratio of foreground user requests; when the proportion of user writes is greater than a first preset ratio, the key-value storage management system using all flash channels to write the compressed key-value files; and when the proportion of reads among user requests is greater than a second preset ratio, the key-value storage management system using half of the flash channels to write the compressed files;
    the key-value storage management system determining the read/write ratio by recording the types of user requests issued between two consecutive compression operations;
    for a user's foreground compression request, the key-value storage management system using all flash channels to write data.
  5. The flash-based, storage-path-optimized key-value storage management method according to claim 4, characterized in that the method of compressing data specifically comprises:
    the key-value storage management system, when compressing data, first reading the key-value files to be compressed, reading a fixed-length portion of several files into the cache at the same time;
    after the key-value data in the cache has been compressed, reading the next portion of those files, and so on, until all files to be compressed have been read;
    the key-value storage management system writing the compressed data to the flash device, whereupon the compression process ends.
  6. The key-value storage management method with flash-based storage path optimization according to claim 1, wherein, in S4, the compression-aware cache algorithm specifically comprises:
    after the compression process starts, the key-value storage management system reads the first part of the key-value files to be compressed into the cache and compresses the data in the cache;
    while the key-value storage management system is compressing, its cache starts a prefetch process that preloads the next part of the data of the key-value files to be compressed into the cache;
    after the first part of the data has been compressed, the key-value storage management system compresses the prefetched next part of the data, and the cache evicts the first part, which has already been used;
    for foreground user read and write requests, the cache of the key-value storage management system uses a cache algorithm optimized for foreground requests.
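
The overlap of compression and prefetching in the claim above can be sketched as a simple two-stage pipeline. The function name and the callbacks `load_part` and `compress_part` are hypothetical; the sketch only illustrates the ordering (compress part i while part i+1 loads, then evict part i) and makes no claim about how the real system divides files into parts.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def compress_with_prefetch(part_count: int,
                           load_part: Callable[[int], bytes],
                           compress_part: Callable[[bytes], bytes]) -> List[bytes]:
    """Compress part i while part i+1 is being loaded in the background."""
    outputs: List[bytes] = []
    if part_count == 0:
        return outputs
    cache: Dict[int, bytes] = {0: load_part(0)}  # first part is read up front
    with ThreadPoolExecutor(max_workers=1) as pool:
        for i in range(part_count):
            # Start prefetching the next part before compressing this one.
            nxt = pool.submit(load_part, i + 1) if i + 1 < part_count else None
            outputs.append(compress_part(cache[i]))
            del cache[i]  # the consumed part is evicted from the cache
            if nxt is not None:
                cache[i + 1] = nxt.result()
    return outputs
```

Because a part is never read again once it has been compressed, evicting it immediately keeps the cache footprint at no more than two parts at a time.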
  7. The key-value storage management method with flash-based storage path optimization according to claim 6, wherein, when caching the requests of foreground users, the cache algorithm optimized for foreground requests comprises:
    for foreground user read and write requests, managing the cache with the flash page length as the cache granularity;
    performing no prefetching for foreground user read requests;
    when cache space is insufficient, replacing cached data according to the least-recently-used (LRU) principle.
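
A page-granularity LRU cache of the kind described in the claim above can be sketched with `collections.OrderedDict`. The class name `ForegroundPageCache`, the `load_page` callback, and the 4096-byte page size are assumptions for illustration; the claim only says that the flash page length is the cache granularity.

```python
from collections import OrderedDict
from typing import Callable

PAGE_SIZE = 4096  # assumed flash page length in bytes


def page_of(offset: int) -> int:
    """Map a byte offset to the flash page that contains it."""
    return offset // PAGE_SIZE


class ForegroundPageCache:
    """LRU cache keyed by flash page number, with no read-ahead."""

    def __init__(self, capacity_pages: int,
                 load_page: Callable[[int], bytes]) -> None:
        self.capacity = capacity_pages
        self.load_page = load_page
        self.pages: "OrderedDict[int, bytes]" = OrderedDict()

    def read(self, page_no: int) -> bytes:
        if page_no in self.pages:
            self.pages.move_to_end(page_no)  # mark as most recently used
            return self.pages[page_no]
        data = self.load_page(page_no)  # only the requested page is fetched
        self._insert(page_no, data)
        return data

    def write(self, page_no: int, data: bytes) -> None:
        self._insert(page_no, data)

    def _insert(self, page_no: int, data: bytes) -> None:
        self.pages[page_no] = data
        self.pages.move_to_end(page_no)
        while len(self.pages) > self.capacity:
            self.pages.popitem(last=False)  # evict the least recently used
```

Skipping prefetch on this path fits point lookups, where the neighbouring pages are unlikely to be requested next.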
  8. The key-value storage management method with flash-based storage path optimization according to claim 1, wherein, in S5, the priority-based scheduling policy specifically comprises:
    for foreground user read and write requests, the key-value storage management system assigns them a high priority when scheduling;
    for read and write requests generated by background data compression operations, the key-value storage management system assigns them a low priority when scheduling;
    within the same priority level, read requests are scheduled ahead of write requests;
    for erase requests, the key-value storage management system dynamically adjusts their priority during scheduling according to the current usage of the flash device.
  9. The key-value storage management method with flash-based storage path optimization according to claim 8, wherein the key-value storage management system dynamically adjusting the priority of erase requests specifically comprises:
    the key-value storage management system records the available space of the current flash device;
    when the available space is less than a third preset ratio of the total storage space, the key-value storage management system assigns erase requests the highest priority, so that erase requests are scheduled first;
    when the available space is greater than the third preset ratio of the total storage space, the key-value storage management system assigns erase requests the lowest priority, so that erase requests are scheduled last.
  10. The key-value storage management method with flash-based storage path optimization according to claim 9, wherein the third preset ratio is 40%.
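
The scheduling order laid out in claims 8 to 10 can be sketched as a sort key. The `Request` type and the function names are hypothetical, ties are broken by arrival order as an assumption, and free space exactly equal to 40% is treated as the low-priority case because the claims only address the "less than" and "greater than" cases.

```python
from dataclasses import dataclass
from typing import List, Tuple

THIRD_PRESET_RATIO = 0.40  # value stated in claim 10


@dataclass(frozen=True)
class Request:
    op: str                   # "read", "write", or "erase"
    foreground: bool = False  # ignored for erase requests
    seq: int = 0              # arrival order, used only to break ties


def priority(req: Request, free_ratio: float) -> Tuple[int, int]:
    """Smaller tuples are scheduled earlier."""
    if req.op == "erase":
        # Low free space: erase runs first; otherwise erase runs last.
        return (0, 0) if free_ratio < THIRD_PRESET_RATIO else (3, 0)
    level = 1 if req.foreground else 2  # foreground ahead of background
    return (level, 0 if req.op == "read" else 1)  # reads ahead of writes


def schedule(requests: List[Request], free_ratio: float) -> List[Request]:
    """Return the requests in the order they would be issued."""
    return sorted(requests, key=lambda r: priority(r, free_ratio) + (r.seq,))
```

For example, with 60% of the device free, a pending erase is issued after every read and write; once free space drops to 30%, the same erase moves to the front of the queue.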
PCT/CN2018/094909 2017-09-11 2018-07-06 Key value storage and management method for flash-based storage path optimization WO2019047612A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710812821.9A CN107678685B (en) 2017-09-11 2017-09-11 Key value storage management method based on flash memory storage path optimization
CN201710812821.9 2017-09-11

Publications (1)

Publication Number Publication Date
WO2019047612A1 (en) 2019-03-14

Family

ID=61135865

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/094909 WO2019047612A1 (en) 2017-09-11 2018-07-06 Key value storage and management method for flash-based storage path optimization

Country Status (2)

Country Link
CN (1) CN107678685B (en)
WO (1) WO2019047612A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113742127A (en) * 2021-09-16 2021-12-03 重庆大学 Fault recovery method for bare flash memory file system

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107678685B (en) * 2017-09-11 2020-01-17 清华大学 Key value storage management method based on flash memory storage path optimization
CN108509353A (en) * 2018-03-14 2018-09-07 清华大学 Object storage construction method based on naked flash memory and device
CN113742304B (en) * 2021-11-08 2022-02-15 杭州雅观科技有限公司 Data storage method of hybrid cloud

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102436420A (en) * 2010-10-20 2012-05-02 微软公司 Low RAM space, high-throughput persistent key-value store using secondary memory
CN102929793A (en) * 2011-08-08 2013-02-13 株式会社东芝 Memory system including key-value store
US20170139594A1 (en) * 2015-11-17 2017-05-18 Samsung Electronics Co., Ltd. Key-value integrated translation layer
CN107066498A (en) * 2016-12-30 2017-08-18 成都华为技术有限公司 Key assignments KV storage methods and device
CN107678685A (en) * 2017-09-11 2018-02-09 清华大学 The key assignments memory management method of store path optimization based on flash memory

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8634247B1 (en) * 2012-11-09 2014-01-21 Sandisk Technologies Inc. NAND flash based content addressable memory
CN106469198B (en) * 2016-08-31 2019-10-15 华为技术有限公司 Key assignments storage method, apparatus and system

Also Published As

Publication number Publication date
CN107678685B (en) 2020-01-17
CN107678685A (en) 2018-02-09

Similar Documents

Publication Publication Date Title
US10013177B2 (en) Low write amplification in solid state drive
US10489295B2 (en) Systems and methods for managing cache pre-fetch
US10095613B2 (en) Storage device and data processing method thereof
US9256527B2 (en) Logical to physical address mapping in storage systems comprising solid state memory devices
US8914600B2 (en) Selective data storage in LSB and MSB pages
Wu et al. GCaR: Garbage collection aware cache management with improved performance for flash-based SSDs
US10572379B2 (en) Data accessing method and data accessing apparatus
US10203876B2 (en) Storage medium apparatus, method, and program for storing non-contiguous regions
WO2019047612A1 (en) Key value storage and management method for flash-based storage path optimization
US20140082323A1 (en) Address mapping
KR20210130829A (en) Apparatus and method to provide cache move with nonvolatile mass memory system
CN110413537B (en) Flash translation layer facing hybrid solid state disk and conversion method
US11144464B2 (en) Information processing device, access controller, information processing method, and computer program for issuing access requests from a processor to a sub-processor
US20170228191A1 (en) Systems and methods for suppressing latency in non-volatile solid state devices
WO2016056104A1 (en) Storage device and memory control method
US20190303019A1 (en) Memory device and computer system for improving read performance and reliability
Mativenga et al. RFTL: Improving performance of selective caching-based page-level FTL through replication
KR101180288B1 (en) Method for managing the read and write cache in the system comprising hybrid memory and ssd
US20240020014A1 (en) Method for Writing Data to Solid-State Drive
CN110908595B (en) Storage device and information processing system
JP6254986B2 (en) Information processing apparatus, access controller, and information processing method
JP6243884B2 (en) Information processing apparatus, processor, and information processing method
KR102088945B1 (en) Memory controller and storage device including the same
US11036414B2 (en) Data storage device and control method for non-volatile memory with high-efficiency garbage collection
CN114185492A (en) Solid state disk garbage recycling algorithm based on reinforcement learning

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18852815

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18852815

Country of ref document: EP

Kind code of ref document: A1