CN113312300A - Nonvolatile memory caching method integrating data transmission and storage - Google Patents


Info

Publication number
CN113312300A
Authority
CN
China
Prior art keywords
data
cache
hard disk
write
request
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110670041.1A
Other languages
Chinese (zh)
Other versions
CN113312300B (en)
Inventor
康亮
童飞文
马名
马可
Current Assignee
Shanghai Phegda Technology Co ltd
SHANGHAI DRAGONNET TECHNOLOGY CO LTD
Original Assignee
Shanghai Phegda Technology Co ltd
SHANGHAI DRAGONNET TECHNOLOGY CO LTD
Priority date
Filing date
Publication date
Application filed by Shanghai Phegda Technology Co ltd, SHANGHAI DRAGONNET TECHNOLOGY CO LTD filed Critical Shanghai Phegda Technology Co ltd
Priority to CN202110670041.1A priority Critical patent/CN113312300B/en
Publication of CN113312300A publication Critical patent/CN113312300A/en
Application granted granted Critical
Publication of CN113312300B publication Critical patent/CN113312300B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/16Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G06F15/163Interprocessor communication
    • G06F15/173Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
    • G06F15/17306Intercommunication techniques
    • G06F15/17331Distributed shared memory [DSM], e.g. remote direct memory access [RDMA]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629Configuration or reconfiguration of storage systems
    • G06F3/0631Configuration or reconfiguration of storage systems by allocating resources to storage systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • G06F3/064Management of blocks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • G06F3/0644Management of space entities, e.g. partitions, extents, pools
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0673Single storage device
    • G06F3/0679Non-volatile semiconductor memory device, e.g. flash memory, one time programmable memory [OTP]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Computer Hardware Design (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Memory System Of A Hierarchy Structure (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a nonvolatile memory caching method that integrates data transmission and storage. Compared with the prior art, the method achieves zero memory copy over the entire storage-system access path; it establishes a cache system with separated read and write paths, in which the read cache uses linear mapping and the write cache uses a log mode; and when dirty cache data is swapped out to the back-end hard disk, duplicate write requests are removed and random write requests are sorted and merged. The method therefore offers high speed and high performance.

Description

Nonvolatile memory caching method integrating data transmission and storage
Technical Field
The invention relates to a data storage method, in particular to a nonvolatile memory caching method integrating data transmission and storage.
Background
A non-volatile memory (NVRAM) is a non-volatile storage medium with memory access characteristics: in hardware it occupies a memory slot, and in software it supports memory-address access. At present, both the price and the performance of NVRAM per unit capacity lie between those of memory and the solid state disk (SSD), as does the capacity of a single NVRAM device. In current servers, NVRAM capacity is limited by factors such as the number of memory slots and price, so NVRAM is well suited to serve as a cache for hard disks (especially SSDs).
A storage device or system must guarantee a certain granularity of atomic operation to avoid data corruption on failure. The atomic operation granularity of a block device is generally one sector (512 bytes), while that of a nonvolatile memory device is generally the width of one atomic CPU store (8 bytes); therefore, when nonvolatile memory is used as storage, software must at least simulate sector-level atomicity.
Currently, nonvolatile cache software systems are generally built on top of the block device subsystem and must access the cache medium through a block device interface; that is, the nonvolatile memory must be simulated as a block device and data access realized through memory copies. This causes the following drawbacks:
1. the block device interface must pass through the generic block layer, which increases NVRAM access latency;
2. Remote Direct Memory Access (RDMA) transfers cannot directly target a block device and require an intermediate memory copy through the operating system;
3. data transfer between memory and NVRAM must be copied by the CPU, consuming large amounts of CPU resources and slowing storage-system response;
4. when NVRAM simulates a block device, extra CPU resources are spent to provide sector-level atomicity, and even then only the atomicity of each sector within a write request is guaranteed, not the atomicity of the whole write request;
5. existing cache techniques generally use a combined read/write cache with relatively small cache granularity, so data cannot be merged effectively and heavy data exchange makes the hard disk a performance bottleneck;
6. a combined read/write cache cannot guarantee the integrity of a data write request, so a write request may be only partially applied.
Disclosure of Invention
The present invention aims to overcome the above defects of the prior art and provide a high-speed, high-performance nonvolatile memory caching method that merges data transmission and storage. The method merges the transmission protocol into data storage, achieves zero memory copy in the cache system, guarantees the atomicity of write requests, minimizes system latency, reduces CPU resource consumption, and exploits the memory characteristics of NVRAM, thereby improving cache-system performance.
The purpose of the invention can be realized by the following technical scheme:
a nonvolatile memory caching method integrating data transmission and storage directly accesses nonvolatile memory data of a service end node in a network transmission process through an RDMA technology, and executes corresponding caching operation based on a data request and a mapping relation between caching resources and a hard disk space.
Furthermore, the nonvolatile memory corresponds to a cache resource pool:
the nonvolatile memory is divided into a data area containing a plurality of data blocks (chunks) for storing cache data, a metadata area that creates 2 metadata blocks (meta0 and meta1) for each chunk, and an index area that creates a validity index for each chunk;
one quadruple (index, meta0, meta1, chunk) constitutes one cache data resource, and all the quadruples in the nonvolatile memory constitute the cache resource pool.
Further, an atom updating method is adopted to update the metadata of the data area.
Furthermore, in the hard disk space, each hard disk is allocated a unique hard disk ID and is divided into a plurality of logical spaces by linear mapping according to the data block size; a per-disk Hashtable is constructed keyed by the logical space offset, and each logical space has at least one cache data block.
Further, the data request contains at least one triplet (ID, offset, length), where ID is the hard disk ID, offset is the position offset within the accessed hard disk, and length is the length of the data accessed.
After receiving the data request, the required logical space is queried or created based on the hard disk ID and the corresponding hard disk Hashtable; the related nonvolatile memory data block is queried and the data mapped to the cache data block; the memory address of the data request within the cache data block is calculated; and an RDMA operation is established to read/write that memory address on the nonvolatile memory data block directly.
Further, when the data request is a read request, the following caching steps are executed:
101) judging whether the corresponding read cache data block Rchunk exists in the required logical space; if yes, executing step 102); if not, applying for a cache data resource from the cache resource pool, constructing a read cache data block Rchunk, recording the hard disk ID, the logical space offset and the initialization bitmap in the corresponding metadata, and executing step 102);
102) according to the bitmap in the Rchunk metadata, calculating by linear mapping whether the cache region required by the current read request is valid data;
103) judging whether the cache region required by the current read request contains data marked invalid in the bitmap; if so, loading the data segments into Rchunk by piecewise linear mapping data loading; if not, executing step 104);
104) calculating the memory address of the current read request in Rchunk, and reading that memory address directly using RDMA.
Further, the piecewise linear mapping data loading is specifically:
calculating the valid bits of the bitmap required in the logical space according to the offset and length of the read request, and calculating the bitmap of the data segments to be loaded by combining it with the valid bitmap in the read cache data block.
Further, when the data request is a write request, the following caching steps are executed:
201) judging whether a corresponding write cache data block Wchunk exists in the required logical space; if so, executing step 202); if not, applying for a cache data resource from the cache resource pool, constructing the write cache data block Wchunk, and recording the hard disk ID, the logical space offset and a globally unique ID in the corresponding metadata, the globally unique ID being realized as a sequence ID allocated by the cache pool;
202) according to the length in the current write request, calculating whether the remaining space of the write cache data block Wchunk can accommodate appending the write request in log write mode; if so, executing step 204); if not, executing step 203);
203) marking the current write cache data block Wchunk as Schunk, immediately applying for a new cache data resource from the cache resource pool as the Wchunk, starting a background thread to synchronize the data in the Schunk to the hard disk, and returning to step 202);
204) calculating the append address required by the write request in log write mode, and then writing to that memory address directly using RDMA.
Further, synchronizing the data in the Schunk to the hard disk specifically comprises:
1a) creating a linear address mapping table in the logical space;
1b) determining the log write data by retrieving and checking the data in the write request log header WLH on the write cache data block;
1c) mapping the log data addresses into the linear address mapping table in log order, so that at the same address a later-written log overrides an earlier-written one;
1d) merging and updating the hard disk data in the order of valid addresses in the linear address mapping table.
Further, in log write mode, the write request is written into the cache data block as a log, each log consisting of a write request log header WLH and write request DATA, specifically:
2a) appending a write request log header WLH with the write request status flag set to in-progress;
2b) appending the write request DATA;
2c) modifying the write request status flag in the WLH to complete;
2d) updating the WLH integrity check value.
Compared with the prior art, the invention has the following beneficial effects:
(1) The invention directly accesses the nonvolatile memory data of the server node during network transmission through RDMA technology, makes maximal use of the memory characteristics of the nonvolatile memory, reduces unnecessary software-stack overhead, reduces storage-system latency, and achieves zero memory copy over the entire storage-system access path.
(2) Combined with RDMA, all data access in the cache system is zero memory copy, greatly reducing CPU resource consumption.
(3) The invention adopts a read/write-separated cache design, so read and write cache data can be accessed in different ways: the read cache uses piecewise linear mapping and the write cache uses log appending, so hard disk data can be loaded on demand with little invalid data loaded; and even if the system fails during a write request, the atomicity of the whole write request can still be ensured.
(4) When dirty cache data is swapped out to the back-end hard disk, duplicate write requests are removed and random write requests are sorted and merged.
Drawings
FIG. 1 is a schematic diagram of the principles of the present invention;
FIG. 2 is a non-volatile memory data area layout of the present invention;
FIG. 3 is a metadata update diagram of the present invention;
FIG. 4 is a cache resource map of the present invention;
FIG. 5 is a data segment loading diagram of the present invention;
FIG. 6 is a data synchronization diagram of the present invention.
Detailed Description
The invention is described in detail below with reference to the figures and specific embodiments. The present embodiment is implemented on the premise of the technical solution of the present invention, and a detailed implementation manner and a specific operation process are given, but the scope of the present invention is not limited to the following embodiments.
The embodiment provides a nonvolatile memory caching method that fuses data transmission and storage. As shown in fig. 1, the data transmission layer and the storage layer are fused: the data of the server node is accessed directly during network transmission through Remote Direct Memory Access (RDMA), and the corresponding caching operation is executed based on the data request and the mapping between cache resources and hard disk space.
The method supports segmented loading of cache data and supports sorting, merging and swapping out of cache data. The key technologies are as follows:
1) creating a pool of cache resources
The nonvolatile memory is formatted as shown in fig. 2 into a data area, a metadata area and an index area. The data area is divided into fixed-size data blocks (chunks), which serve as the basic units of cache allocation. The metadata area stores the metadata of each data block in the data area; to guarantee transactional metadata updates, two metadata spaces are allocated for each data block, that is, 2 metadata blocks (meta0 and meta1) are created for each chunk, one valid and the other invalid. The index area stores the validity index of the metadata area: a validity index (index) is created for each chunk, sized exactly at the atomic granularity of the nonvolatile memory, recording which metadata block of the pair is currently valid.
Four-tuples (index, meta0, meta1, chunk) can be obtained based on the above division, one four-tuple represents one cache data resource, and all the four-tuples in the nonvolatile memory form one cache resource pool.
As shown in fig. 2, each chunk in the metadata area corresponds to two metadata blocks (meta0, meta1). When the metadata of a chunk is updated, as shown in fig. 3, the valid metadata block is not overwritten; instead, the invalid metadata block is written, and then the index area is modified to point to the newly written metadata block. Since index updates are atomic operations, the metadata update is guaranteed to be atomic.
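The two-slot update described above can be sketched as follows. This is a minimal illustration with hypothetical names, assuming (as stated for the index area) that an index-sized store is atomic on the NVRAM; on real hardware a persist barrier would precede the index flip.

```python
class CacheResource:
    """One (index, meta0, meta1, chunk) quadruple from the resource pool (sketch)."""

    def __init__(self):
        self.meta = [None, None]   # the meta0 / meta1 slots
        self.index = 0             # which slot is currently valid (atomic cell)

    def update_metadata(self, new_meta):
        inactive = 1 - self.index       # pick the slot that is NOT currently valid
        self.meta[inactive] = new_meta  # write the full new metadata into it
        # On real NVRAM a persistence barrier (e.g. cache-line writeback + fence)
        # would go here, before the index is flipped.
        self.index = inactive           # single atomic store makes it valid

r = CacheResource()
r.update_metadata({"disk_id": 3, "offset": 0})
```

Because the previously valid slot is never touched, a crash at any point leaves either the old or the new metadata fully intact.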
2) Building a mapping of cache resources to hard disk space
A unique ID within the cache system is allocated to each hard disk, and all hard disks are organized into a red-black tree keyed by hard disk ID.
According to the chunk size, each hard disk is divided into a plurality of logical spaces by linear mapping; each logical space manages its dynamically allocated NVRAM data blocks, and the logical space ID is the ID of the hard disk it belongs to plus its offset within that hard disk.
Physical resources are created and allocated to logical spaces dynamically according to read/write requests; taking the offset of the logical space within the hard disk as the key, a Hashtable is constructed for each hard disk for querying logical space information.
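The lookup path through the per-disk structures might look like the following sketch. A plain dict stands in for the red-black tree of disks, and the chunk size and all names are illustrative assumptions, not taken from the patent.

```python
CHUNK_SIZE = 4 * 1024 * 1024   # assumed data block (chunk) size

# disk_id -> {logical_space_offset: logical_space}; the outer dict stands in
# for the red-black tree keyed by hard disk ID.
disks = {}

def get_logical_space(disk_id, offset, create=True):
    """Find (or lazily create) the logical space covering a disk offset."""
    spaces = disks.setdefault(disk_id, {})          # per-disk Hashtable
    key = (offset // CHUNK_SIZE) * CHUNK_SIZE       # linear mapping: chunk-align
    if key not in spaces and create:
        spaces[key] = {"disk_id": disk_id, "offset": key,
                       "r_chunk": None, "w_chunks": []}  # empty logical space
    return spaces.get(key)
```

A request for disk 1 at offset `CHUNK_SIZE + 100` therefore resolves to the logical space starting at `CHUNK_SIZE`, which is created on first access.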
According to the read and write requests of the logical spaces, different NVRAMs are distributed to each logical space to serve as cache data blocks, the distributed NVRAMs serve as NVRAM data blocks, and the physical memory addresses of the NVRAMs serve as RDMA memory access addresses to conduct data transmission. The transfer behavior at this time is also an NVRAM memory access operation.
The NVRAM read-only cache data block carries a 64-bit validity bitmap in its metadata area, so required data can be loaded on demand according to the amount of data requested and marked valid accordingly. The NVRAM write-only cache data block is written in log mode; when a data block of a logical space is swapped out, all write operations in that space are de-duplicated, sorted and merged before being swapped out to the hard disk.
As shown in FIG. 4, each logical space may be associated with 0-1 read cache data blocks and 0-2 write cache data blocks, but there is at least one cache data block per logical space.
3) Piecewise linear mapping data loading
If the data request is a read request, a read-only data block is allocated from the cache resource pool, and hard disk data is loaded using linear address mapping.
The read cache data block is divided into a plurality of data segments, and a bitmap stored in the corresponding metadata area identifies the valid data on the block; according to this bitmap and the linear mapping, the data segments that the read request needs to load are calculated.
As shown in fig. 5, the valid bits of the required bitmap within the logical space are calculated from the offset and length of the read request, and then combined with the valid bitmap of the read cache data block to obtain the bitmap of the data segments to be loaded:
load bitmap=read bitmap&(~valid bitmap)
Data at the corresponding positions is then loaded from the hard disk according to the load bitmap, and the valid bitmap is updated.
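The bitmap calculation above can be sketched as follows. The chunk size, segment count and function names are illustrative assumptions; only the `load = read & ~valid` formula comes from the patent.

```python
CHUNK_SIZE = 4 * 1024 * 1024       # assumed chunk size
SEGMENTS = 64                      # 64-bit bitmap: one bit per data segment
SEG_SIZE = CHUNK_SIZE // SEGMENTS

def read_bitmap(offset, length):
    """Bits covering the byte range [offset, offset+length) within the chunk."""
    first = offset // SEG_SIZE
    last = (offset + length - 1) // SEG_SIZE
    bm = 0
    for i in range(first, last + 1):
        bm |= 1 << i
    return bm

def segments_to_load(offset, length, valid_bitmap):
    """load bitmap = read bitmap & (~valid bitmap), masked to 64 bits."""
    return read_bitmap(offset, length) & ~valid_bitmap & ((1 << SEGMENTS) - 1)
```

For example, a read spanning the first two segments of a chunk whose first segment is already valid yields a load bitmap selecting only the second segment.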
4) Log writing mode
If the data request is a write request, 1-2 write-only data blocks are allocated from the cache resource pool, and the write request data is stored in log mode.
As shown in fig. 4, the write request is written into the cache DATA block in a log manner, and each log is divided into a write request log header (WLH) and write request DATA (DATA); the WLH includes a data request triple (ID, offset, length), a globally unique ID of the chunk where the data request triple is located, a write request status, and a WLH integrity check value.
The log writing process is divided into the following steps:
a) appending a write request log header (WLH) with the write request status flag set to in-progress;
b) appending the write request DATA;
c) modifying the write request status flag in the WLH to complete;
d) updating the WLH integrity check value.
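Steps a)-d) can be sketched as below. The WLH size, class names and the use of CRC32 as the integrity check are assumptions for illustration; the patent only specifies the fields and the ordering of the four steps.

```python
import zlib

class WriteChunk:
    """Append-only write cache block; each log = WLH header + DATA (sketch)."""
    IN_PROGRESS, COMPLETE = 0, 1
    WLH_SIZE = 64                     # assumed fixed header size in bytes

    def __init__(self, uid, capacity):
        self.uid, self.capacity = uid, capacity
        self.logs, self.used = [], 0

    def remaining(self):
        return self.capacity - self.used

    def append(self, disk_id, offset, data):
        need = self.WLH_SIZE + len(data)
        if need > self.remaining():
            return False              # caller must swap in a fresh Wchunk
        # a) append WLH with status set to in-progress
        wlh = {"id": disk_id, "offset": offset, "length": len(data),
               "chunk_uid": self.uid, "status": self.IN_PROGRESS, "crc": 0}
        # b) append the write request DATA
        self.logs.append((wlh, bytes(data)))
        # c) mark the status complete; d) update the integrity check value
        wlh["status"] = self.COMPLETE
        wlh["crc"] = zlib.crc32(bytes(data))
        self.used += need
        return True
```

Because the status flag is flipped only after the data is in place, a log found in-progress after a crash can be discarded wholesale, which is what preserves whole-request atomicity.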
5) Data synchronization
As shown in FIG. 6, the present invention uses a linear address mapping table to merge and sort the updated hard disk data. The main process is as follows:
a) creating a linear address mapping table in the logical space;
b) determining the log write data by retrieving and checking the data in the WLH on the write cache data block;
c) mapping the log data addresses into the linear address mapping table in log order, so that at the same address a later-written log overrides an earlier-written one;
d) merging and updating the hard disk data in the order of valid addresses in the linear address mapping table.
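A self-contained sketch of this replay-and-merge pass is given below. Logs are modeled as dicts with `offset`, `data` and `status` fields (1 meaning complete); the extent-collection strategy and all names are illustrative assumptions.

```python
def merge_logs(logs, space_size):
    """Replay complete logs in write order into a linear address map; later
    logs overwrite earlier ones at the same address (de-duplication), then
    emit the merged extents in ascending address order for the hard disk."""
    image = bytearray(space_size)
    written = [False] * space_size
    for log in logs:                       # log order = original write order
        if log["status"] != 1:             # 1 = complete; skip torn logs
            continue
        o, data = log["offset"], log["data"]
        image[o:o + len(data)] = data
        for i in range(o, o + len(data)):
            written[i] = True
    # collect contiguous dirty extents in ascending address order
    extents, i = [], 0
    while i < space_size:
        if not written[i]:
            i += 1
            continue
        j = i
        while j < space_size and written[j]:
            j += 1
        extents.append((i, bytes(image[i:j])))
        i = j
    return extents
```

Sorting falls out for free: walking the table from low to high addresses turns scattered random writes into a few sequential hard disk writes.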
6) Data request access
The maximum length of a client data request is limited: the read request length must be less than or equal to the cache data block length, and the write request length plus the write request log header length must be as well. The logical space is queried or created upon data request access.
Each data request contains at least one triplet (ID, offset, length): the hard disk ID, the position offset within the accessed hard disk, and the length of the data accessed. Through the ID, the hard disk to be accessed is found in the hard disk red-black tree; then, via the Hashtable built for each hard disk, the logical space where the data request falls is found. If the logical space does not exist, an empty logical space is created and inserted into the Hashtable.
Based on the acquired logical space, inquiring the related nonvolatile memory data block, mapping data to the cache data block, calculating the memory address of the data request in the cache data block, establishing RDMA operation, and directly reading/writing the memory address on the nonvolatile memory data block by the client node.
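The address calculation at the end of this path reduces to simple arithmetic under the linear mapping; the sketch below assumes a chunk-sized logical space and hypothetical names.

```python
CHUNK_SIZE = 4 * 1024 * 1024   # assumed cache data block size

def rdma_target_address(chunk_base_addr, request_offset):
    """Memory address for the RDMA read/write: the chunk linearly maps one
    chunk-sized logical space, so the in-chunk offset is the request's disk
    offset modulo the chunk size (illustrative assumption)."""
    return chunk_base_addr + (request_offset % CHUNK_SIZE)
```

The client node then issues an RDMA READ or WRITE against this address directly, with no server-side CPU copy.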
7) Read request access
And when the data request is a read request, finding or creating a logic space corresponding to the read request according to the data request access in 6).
As shown in FIG. 4, if there is no corresponding read cache data block (Rchunk) in the logical space, a cache data resource is applied for from the cache resource pool, and the hard disk ID, the logical space offset and the initialization bitmap are recorded in the cache data metadata.
If a corresponding Rchunk exists in the logical space, then according to the bitmap in the Rchunk metadata, whether the cache region required by the current read request is valid data is calculated by linear mapping.
If the cache region required by the current read request contains data marked invalid in the bitmap, the piecewise linear mapping data loading described in 3) is used to load the data segments into Rchunk.
If the entire cache region required by the current read request is valid in the bitmap, the memory address of the current read request in Rchunk is calculated, and that memory address is read directly using RDMA.
8) Write request access
And when the data request is a write request, finding or creating a logic space corresponding to the write request according to the data request access in the step 6).
As shown in fig. 4, if there is no corresponding write cache data block (Wchunk) in the logical space, a cache data resource is applied for from the cache resource pool, and the hard disk ID, the logical space offset and the globally unique ID are recorded in the Wchunk metadata. The globally unique ID is realized by a sequence ID maintained by the cache pool, incremented each time a cache resource is allocated from the pool.
If a corresponding Wchunk exists in the logical space, whether the remaining space of the write cache data block can accommodate appending the write request in log mode is calculated from the length of the current write request: the write request log header (WLH) length plus the write request data length must be less than or equal to the Wchunk remaining space.
If the current Wchunk space is insufficient, it is marked as Schunk, and a new cache data resource is immediately applied for from the cache resource pool as the Wchunk; a background thread is started to synchronize the data in the Schunk to the hard disk using the data synchronization method in 5).
If the Wchunk space is sufficient, the log writing mode described in 4) is started immediately: the append address required by the write request is calculated, and then that memory address is written directly using RDMA.
The above functions, if implemented in the form of software functional units and sold or used as a separate product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The foregoing detailed description of the preferred embodiments of the invention has been presented. It should be understood that numerous modifications and variations could be devised by those skilled in the art in light of the present teachings without departing from the inventive concepts. Therefore, the technical solutions available to those skilled in the art through logic analysis, reasoning and limited experiments based on the prior art according to the concept of the present invention should be within the scope of protection defined by the claims.

Claims (10)

1. A nonvolatile memory caching method integrating data transmission and storage is characterized in that nonvolatile memory data of a service end node is directly accessed in a network transmission process through an RDMA technology, and corresponding caching operation is executed based on a data request and a mapping relation between caching resources and a hard disk space.
2. The method as claimed in claim 1, wherein the nonvolatile memory corresponds to a cache resource pool,
the nonvolatile memory is divided into a data area having a plurality of chunks chunk for storing cache data, a metadata area creating 2 metadata chunks meta0 and meta1 for each chunk, and an index area creating a valid index for each chunk,
one quadruple (index, meta0, meta1, chunk) constitutes one cache data resource, and all the quadruples in the nonvolatile memory constitute the cache resource pool.
3. The fused data transmission and storage nonvolatile memory caching method according to claim 2, wherein an atomic updating method is adopted to update the metadata of the data area.
4. The nonvolatile memory caching method for fusing data transmission and storage according to claim 2, wherein each hard disk in the hard disk space is allocated a unique hard disk ID, each hard disk is divided into a plurality of logical spaces by linear mapping according to the data block size, a hard disk Hashtable is constructed keyed by the logical space offset, and each logical space has at least one cache data block.
5. The method of claim 4, wherein the data request comprises at least one triplet (ID, offset, length), the ID indicating the hard disk ID, the offset indicating the position offset within the accessed hard disk, and the length indicating the length of the data accessed on the hard disk,
after receiving the data request, inquiring or creating a required logic space based on the hard disk ID and the corresponding hard disk Hashtable, inquiring the related nonvolatile memory data block, mapping data to the cache data block, calculating the memory address of the data request in the cache data block, establishing RDMA operation, and directly reading/writing the memory address on the nonvolatile memory data block.
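The address computation in claim 5 can be sketched as follows (the 4 KiB logical-space size and the single-chunk restriction are assumptions for illustration); in the real system the resulting address would be handed to an RDMA read or write.

```python
CHUNK_SIZE = 4096  # assumed logical-space size

def cache_address(offset: int, length: int):
    """Map a (offset, length) pair to (logical space, address within chunk)."""
    space = offset // CHUNK_SIZE   # which logical space the request hits
    start = offset % CHUNK_SIZE    # offset inside the cached data block
    # simplifying assumption: a request does not cross a chunk boundary
    assert start + length <= CHUNK_SIZE, "request crosses a chunk boundary"
    return space, start
```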
6. The nonvolatile memory caching method integrating data transmission and storage according to claim 5, wherein when the data request is a read request, the following caching steps are performed:
101) judging whether a corresponding read cache data block Rchunk exists in the required logical space; if yes, executing step 102); if not, applying for a cache data resource from the cache resource pool, constructing a read cache data block Rchunk, recording the hard disk ID, the logical space offset and an initialized bitmap in the corresponding metadata, and then executing step 102);
102) calculating, according to the bitmap in the Rchunk metadata and in the linear mapping manner, whether the cache region required by the current read request contains valid data;
103) judging whether the cache region required by the current read request contains invalid data in the bitmap; if yes, loading the missing data segments into Rchunk in the piecewise linear mapping data loading mode and then executing step 104); if not, executing step 104);
104) calculating the memory address of the current read request in Rchunk, and reading the memory address directly using RDMA.
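Steps 101)-104) can be sketched as follows (the sector granularity of the bitmap, the chunk size, and the `load_from_disk` callback are all illustrative assumptions): only sectors whose bitmap bits are invalid are loaded from the hard disk, after which the requested range is served from the cache block, via RDMA in the actual method.

```python
CHUNK_SIZE = 4096
SECTOR = 512                      # assumed bitmap granularity
SECTORS = CHUNK_SIZE // SECTOR    # one bitmap bit per sector

class ReadChunk:
    """Stand-in for an Rchunk: validity bitmap plus data area."""
    def __init__(self):
        self.bitmap = [False] * SECTORS   # True = sector holds valid data
        self.data = bytearray(CHUNK_SIZE)

def needed_sectors(start: int, length: int):
    """Sectors the request touches (step 102, linear mapping)."""
    return range(start // SECTOR, (start + length - 1) // SECTOR + 1)

def read(chunk: ReadChunk, start: int, length: int, load_from_disk):
    # step 103: load only the sectors whose bitmap bits are invalid
    for s in needed_sectors(start, length):
        if not chunk.bitmap[s]:
            chunk.data[s * SECTOR:(s + 1) * SECTOR] = load_from_disk(s)
            chunk.bitmap[s] = True
    # step 104: in the real system this address would be read via RDMA
    return bytes(chunk.data[start:start + length])
```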
7. The nonvolatile memory caching method integrating data transmission and storage according to claim 6, wherein the piecewise linear mapping data loading mode specifically comprises:
calculating the valid bits of the bitmap required in the logical space according to the offset and length of the read request, and deriving the bitmap of the data segments to be loaded by combining it with the initialized bitmap in the read cache data block.
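Reduced to bitmap arithmetic, the piecewise calculation above might look like this (the 512-byte bitmap granularity is an assumption): the bits the request needs, minus the bits already valid, give the segments to load.

```python
SECTOR = 512  # assumed bitmap granularity

def required_bits(offset: int, length: int) -> set:
    """Bitmap bits the read request needs to be valid."""
    return set(range(offset // SECTOR, (offset + length - 1) // SECTOR + 1))

def bits_to_load(offset: int, length: int, valid: set) -> set:
    """Segments to load = required bits minus the already-valid bitmap."""
    return required_bits(offset, length) - valid
```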
8. The nonvolatile memory caching method integrating data transmission and storage according to claim 5, wherein when the data request is a write request, the following caching steps are performed:
201) judging whether a corresponding write cache data block Wchunk exists in the required logical space; if yes, executing step 202); if not, applying for a cache data resource from the cache resource pool, constructing the write cache data block Wchunk, and recording the hard disk ID, the logical space offset and a globally unique ID in the corresponding metadata, wherein the globally unique ID is implemented based on the allocation sequence ID of the cache pool;
202) calculating, according to the length in the current write request, whether the remaining space of the write cache data block Wchunk is sufficient to append the write request in the log write mode; if yes, executing step 204); if not, executing step 203);
203) marking the current write cache data block Wchunk as an Schunk, immediately applying to the cache resource pool for a new cache data resource as the Wchunk, meanwhile starting a background thread to synchronize the data in the Schunk to the hard disk, and returning to step 202);
204) calculating the append data address required by the write request in the log write mode, and then writing to the memory address directly using RDMA.
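Steps 201)-204) can be sketched as a log-append loop (the chunk and header sizes are assumptions, and the RDMA write is stood in for by a local append): when the current Wchunk cannot hold the next log entry, it is sealed as an Schunk for background sync and a fresh Wchunk is taken from the pool.

```python
CHUNK_SIZE = 4096
WLH_SIZE = 32  # assumed write-request-log-header size

class WriteCache:
    """Stand-in for the Wchunk/Schunk handling of claim 8."""
    def __init__(self):
        self.wchunk = bytearray()   # current write chunk, used as a log area
        self.sealed = []            # Schunks awaiting synchronization to disk

    def append(self, data: bytes) -> int:
        need = WLH_SIZE + len(data)
        if len(self.wchunk) + need > CHUNK_SIZE:   # step 202 fails
            self.sealed.append(self.wchunk)        # step 203: seal as Schunk
            self.wchunk = bytearray()              # new Wchunk from the pool
        addr = len(self.wchunk)                    # step 204: append address
        # header placeholder followed by the data (real system: RDMA write)
        self.wchunk += b"\0" * WLH_SIZE + data
        return addr
```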
9. The nonvolatile memory caching method integrating data transmission and storage according to claim 8, wherein synchronizing the data in the Schunk to the hard disk specifically comprises:
1a) creating a linear address mapping table in the logical space;
1b) determining the log-written data by retrieving and verifying the data in the write request log headers WLH on the write cache data block;
1c) mapping the log data addresses into the linear address mapping table in log order, so that the address space written by a later log overrides that written by an earlier log;
1d) merging and updating the hard disk data according to the valid address order in the linear address mapping table.
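Step 1c)'s replay into the linear address mapping table might be sketched as follows (a plain dict stands in for the table; later log entries overwrite earlier mappings for the same address, which is what makes the log-write mode safe to merge):

```python
def build_linear_map(log_entries):
    """Replay [(disk_offset, data), ...] in log (append) order.

    Returns a per-byte address mapping table; a later log entry for the
    same address overrides an earlier one, matching step 1c).
    """
    table = {}
    for offset, data in log_entries:
        for i, byte in enumerate(data):
            table[offset + i] = byte
    return table
```

Step 1d) would then walk the table in address order and merge contiguous runs into hard disk updates.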
10. The nonvolatile memory caching method integrating data transmission and storage according to claim 8, wherein in the log write mode the write request is written into the cache data block in log form, and each log is divided into a write request log header WLH and write request DATA, specifically:
2a) appending a write request log header WLH, wherein the write request status flag is in progress;
2b) appending write request DATA;
2c) modifying the status flag of the write request in the WLH to be complete;
2d) updating the WLH integrity check value.
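Steps 2a)-2d) describe a crash-consistent append; a minimal sketch, assuming a WLH layout of status, length, offset and a CRC32 check value (the field layout and sizes are assumptions, not from the patent):

```python
import struct
import zlib

ST_IN_PROGRESS, ST_COMPLETE = 0, 1
WLH_FMT = "<IIQI"  # status, data length, disk offset, checksum (assumed layout)

def append_record(log: bytearray, offset: int, data: bytes) -> None:
    pos = len(log)
    # 2a) append the WLH with status = in progress (checksum not yet valid)
    log += struct.pack(WLH_FMT, ST_IN_PROGRESS, len(data), offset, 0)
    # 2b) append the write request DATA
    log += data
    # 2c) flip the status flag in the WLH to complete
    struct.pack_into("<I", log, pos, ST_COMPLETE)
    # 2d) update the WLH integrity check value
    struct.pack_into("<I", log, pos + 16, zlib.crc32(data))
```

Because the status flag and checksum are written last, a record interrupted mid-append is detectable during the retrieval-and-verification pass of claim 9 step 1b).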
CN202110670041.1A 2021-06-17 2021-06-17 Nonvolatile memory caching method integrating data transmission and storage Active CN113312300B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110670041.1A CN113312300B (en) 2021-06-17 2021-06-17 Nonvolatile memory caching method integrating data transmission and storage


Publications (2)

Publication Number Publication Date
CN113312300A true CN113312300A (en) 2021-08-27
CN113312300B CN113312300B (en) 2024-05-03

Family

ID=77379177

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110670041.1A Active CN113312300B (en) 2021-06-17 2021-06-17 Nonvolatile memory caching method integrating data transmission and storage

Country Status (1)

Country Link
CN (1) CN113312300B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113868192A (en) * 2021-12-03 2021-12-31 深圳市杉岩数据技术有限公司 Data storage device and method and distributed data storage system
CN114328298A (en) * 2022-03-14 2022-04-12 南京芯驰半导体科技有限公司 System and method for mapping addresses of on-chip memories for vector access
CN115981875A (en) * 2023-03-21 2023-04-18 人工智能与数字经济广东省实验室(广州) Incremental update method, apparatus, device, medium, and product for memory storage systems
CN117348809A (en) * 2023-10-08 2024-01-05 中电云计算技术有限公司 Read cache method and system based on log writing

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008123198A (en) * 2006-11-10 2008-05-29 Toshiba Corp Storage cluster system having cache consistency guarantee function
US7676628B1 (en) * 2006-03-31 2010-03-09 Emc Corporation Methods, systems, and computer program products for providing access to shared storage by computing grids and clusters with large numbers of nodes
CN102521147A (en) * 2011-11-17 2012-06-27 曙光信息产业(北京)有限公司 Management method by using rapid non-volatile medium as cache
US20130185508A1 (en) * 2012-01-12 2013-07-18 Fusion-Io, Inc. Systems and methods for managing cache admission
GB201318712D0 (en) * 2013-10-23 2013-12-04 Ibm Persistent caching system and method for operating a persistant caching system
US9164702B1 (en) * 2012-09-07 2015-10-20 Google Inc. Single-sided distributed cache system
CN109710183A (en) * 2018-12-17 2019-05-03 杭州宏杉科技股份有限公司 A kind of method of data synchronization and device
CN111611223A (en) * 2020-05-20 2020-09-01 清华大学 Nonvolatile data access method, system, electronic device and medium
CN112463714A (en) * 2020-11-30 2021-03-09 海光信息技术股份有限公司 Remote direct memory access method, heterogeneous computing system and electronic equipment


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhu Ping: "Research on Pass-Through Access Strategy for HPC Mass Storage ***", Journal of Computer Research and Development, no. 08 *



Similar Documents

Publication Publication Date Title
CN113312300A (en) Nonvolatile memory caching method integrating data transmission and storage
US11301379B2 (en) Access request processing method and apparatus, and computer device
US11030156B2 (en) Key-value store with partial data access
JP6678230B2 (en) Storage device
CN106326229B (en) File storage method and device of embedded system
CN103399823B (en) The storage means of business datum, equipment and system
US11030092B2 (en) Access request processing method and apparatus, and computer system
CN114860163A (en) Storage system, memory management method and management node
US10126962B2 (en) Adapted block translation table (BTT)
WO2016123748A1 (en) Flash memory storage system and read/write and delete methods therefor
US11307997B2 (en) Logical to physical data storage mapping
WO2024099448A1 (en) Memory release method and apparatus, memory recovery method and apparatus, and computer device and storage medium
CN113138945A (en) Data caching method, device, equipment and medium
CN111443874B (en) Solid-state disk memory cache management method and device based on content awareness and solid-state disk
CN117992360A (en) Storage system and storage method
US20060107002A1 (en) Method, system, and program for an adaptor to read and write to system memory
CN116226232A (en) Persistent memory data storage method and system for distributed database
EP4336336A1 (en) Data compression method and apparatus
CN114741028A (en) OCSD-based persistent key value storage method, device and system
US11586353B2 (en) Optimized access to high-speed storage device
Shu et al. Towards unaligned writes optimization in cloud storage with high-performance ssds
CN112988034B (en) Distributed system data writing method and device
JPH0282332A (en) Input/output buffer system for indexing indexed file
CN118069043A (en) High-performance data storage software management method
CN116185949A (en) Cache storage method and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant