WO2023029624A1 - Storage block collection method and related apparatus - Google Patents

Storage block collection method and related apparatus

Info

Publication number
WO2023029624A1
Authority
WO
WIPO (PCT)
Prior art keywords
storage
granularity
logical unit
block
data
Prior art date
Application number
PCT/CN2022/096184
Other languages
French (fr)
Chinese (zh)
Inventor
陈朝阳
兰国语
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 filed Critical 华为技术有限公司
Publication of WO2023029624A1 publication Critical patent/WO2023029624A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers

Definitions

  • the present application relates to the field of storage technologies, and in particular to a storage block recovery method and a related device.
  • the current distributed storage system architecture can be shown in FIG. 1 , which provides a client, a storage engine, and a distributed storage device.
  • The user can see a storage space and can store data in it. The storage engine then writes the data to be stored by the client into a logical unit created by the storage engine, and stores the data written into the logical unit in the distributed storage device, through an append-write interface, in one of several redundant storage modes.
  • The space used to store data in the logical unit can be reclaimed through a garbage collection (GC) mechanism, which releases the corresponding storage resources so that other data can be stored.
  • The embodiments of the present application provide a storage block recovery method and a related device, which are used to speed up the reclamation of garbage data in the distributed storage system.
  • An embodiment of the present application provides a storage block recovery method, which is applied to a distributed storage system including a storage engine and a distributed storage device. The storage block reclamation device obtains the recovery granularity of a first logical unit created by the storage engine; the storage block reclamation device then determines, according to the recovery granularity, a first storage block in a reclaimable state in the storage area to be reclaimed of the first logical unit; finally, the storage block reclamation device marks the first storage block as a free block that can be used for data storage, and marks the second storage block corresponding to the first storage block in the distributed storage device as a free block that can be used for data storage.
  • The storage area to be reclaimed in the first logical unit is a storage area in which the written data is marked as garbage data.
  • For example, if the size of the first logical unit is 128 megabytes (MB) and the data written in the storage area from a storage offset of 4MB to a storage offset of 25MB is marked as garbage data, then the storage area from the 4MB offset to the 25MB offset is the storage area to be reclaimed.
  • The storage block reclamation device reclaims the storage resources of a storage block according to the recovery granularity determined by the storage engine when creating the logical unit, and directly marks the reclaimable storage block as a free block. This removes the need to first move the valid data in the storage block to be reclaimed into other storage space before the storage resources can be reclaimed, thereby speeding up the reclamation of garbage data in the distributed storage system.
  • The storage block reclamation device may obtain the recovery granularity as follows: it obtains the redundant storage mode of the first logical unit, and then determines the recovery granularity of the first logical unit according to that redundant storage mode.
  • The manner in which the storage block reclamation device determines the recovery granularity of the first logical unit according to the redundant storage mode may include the following possible implementations:
  • When the redundant storage mode of the first logical unit is copy redundancy, the storage block reclamation device uses the minimum write granularity of the distributed storage device under copy redundancy as the recovery granularity of the first logical unit. That is, if the minimum write granularity in the distributed storage device is 1MB, then the recovery granularity of the first logical unit is 1MB.
  • When the redundant storage mode of the first logical unit is erasure code (EC) algorithm redundancy, the storage block reclamation device determines the recovery granularity of the first logical unit from the stripe depth of the EC algorithm redundancy and the minimum write granularity of the distributed storage device.
  • When the stripe depth is greater than the minimum write granularity, the recovery granularity of the first logical unit is the stripe depth multiplied by the number of data fragments of the EC algorithm redundancy; when the stripe depth is smaller than the minimum write granularity, the recovery granularity of the first logical unit is the minimum write granularity multiplied by the number of data fragments of the EC algorithm redundancy. For example, if the redundancy mode of the EC algorithm is EC 4+2, the minimum write granularity is 1MB, and the stripe depth is 2MB, then the recovery granularity of the first logical unit is 2MB × 4 = 8MB. If the redundancy mode of the EC algorithm is EC 4+2, the minimum write granularity is 1MB, and the stripe depth is 0.5MB, then the recovery granularity of the first logical unit is 1MB × 4 = 4MB.
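
The granularity rule described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `recovery_granularity_mb`, its parameters, and the mode labels `"copy"` and `"ec"` are hypothetical and do not come from the application. The text does not spell out the case where the stripe depth equals the minimum write granularity; both products are equal there, so the sketch simply takes the larger of the two values.

```python
def recovery_granularity_mb(mode, min_write_mb, stripe_depth_mb=None, data_fragments=None):
    """Return the recovery granularity (in MB) of a logical unit.

    mode is a hypothetical label: "copy" for copy redundancy,
    "ec" for EC algorithm redundancy.
    """
    if mode == "copy":
        # Copy redundancy: the minimum write granularity is used directly.
        return min_write_mb
    if mode == "ec":
        if stripe_depth_mb is None or data_fragments is None:
            raise ValueError("EC mode needs a stripe depth and a data fragment count")
        # Use whichever is larger, the stripe depth or the minimum write
        # granularity, multiplied by the number of data fragments.
        return max(stripe_depth_mb, min_write_mb) * data_fragments
    raise ValueError(f"unknown redundant storage mode: {mode!r}")


# The examples from the text, with EC 4+2 (4 data fragments):
print(recovery_granularity_mb("copy", 1))        # 1MB minimum write granularity
print(recovery_granularity_mb("ec", 1, 2, 4))    # stripe depth 2MB
print(recovery_granularity_mb("ec", 1, 0.5, 4))  # stripe depth 0.5MB
```

Only the number of data fragments enters the calculation; the two verification fragments of EC 4+2 do not change the granularity in the examples given.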
  • The storage block reclamation device may determine the first storage block in the reclaimable state in the storage area to be reclaimed of the first logical unit as follows: the storage block reclamation device divides the first logical unit into a plurality of intermediate storage blocks according to a first order and the recovery granularity, where the first order is the order in which storage blocks are occupied, first come first served, as data is written into the first logical unit; the storage block reclamation device then obtains the storage offset of each of the plurality of intermediate storage blocks; when the storage offsets indicate that some of the intermediate storage blocks are located in the storage area to be reclaimed, the intermediate storage blocks located in the storage area to be reclaimed are determined to be first storage blocks in the reclaimable state.
  • For example, the size of the first logical unit is 128MB and the recovery granularity is 4MB. The storage block reclamation device divides the first logical unit, according to the recovery granularity, into the storage blocks 0-4MB, 4MB-8MB, 8MB-12MB, ..., 124MB-128MB. If the storage area to be reclaimed in the first logical unit is 4MB-25MB, the storage block reclamation device can determine that the five intermediate storage blocks lying entirely within that area (4MB-8MB through 20MB-24MB) are first storage blocks in the reclaimable state.
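
The division and selection step can be sketched as follows. The function name `reclaimable_blocks` and its parameters are hypothetical, and the sketch assumes that an intermediate storage block counts as reclaimable only when it lies entirely inside the storage area to be reclaimed, which is consistent with the count of five blocks in the 128MB example.

```python
def reclaimable_blocks(unit_size_mb, granularity_mb, reclaim_start_mb, reclaim_end_mb):
    """Split a logical unit into granularity-sized blocks, in write order,
    and return the (start, end) offsets of blocks fully inside the
    storage area to be reclaimed."""
    blocks = []
    offset = 0
    while offset + granularity_mb <= unit_size_mb:
        start, end = offset, offset + granularity_mb
        # Assumption: a block is reclaimable only if it holds no data
        # outside the garbage region.
        if start >= reclaim_start_mb and end <= reclaim_end_mb:
            blocks.append((start, end))
        offset = end
    return blocks


# 128MB logical unit, 4MB recovery granularity, garbage region 4MB-25MB.
print(reclaimable_blocks(128, 4, 4, 25))
```

Under this assumption the trailing 24MB-25MB fragment of the garbage region is not reclaimed in this pass, because the block 24MB-28MB may still hold valid data.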
  • When the first logical unit can no longer accept append writes, the storage block reclamation device may also write data into the first storage block through a first interface, where the first interface supports intermediate writing, that is, writing at a position inside the logical unit rather than only appending at its end.
  • an embodiment of the present application provides a device for reclaiming a storage block, which has a function of implementing the behavior of the device for reclaiming a storage block in the first aspect.
  • This function may be implemented by hardware, or may be implemented by executing corresponding software on the hardware.
  • the hardware or software includes one or more modules corresponding to the above functions.
  • the apparatus includes a unit or a module for performing each step of the above first aspect.
  • The device includes: an acquiring module, configured to acquire the recovery granularity of the first logical unit created by the storage engine; a determining module, configured to determine, according to the recovery granularity, a first storage block in a reclaimable state in the storage area to be reclaimed of the first logical unit; and a processing module, configured to mark the first storage block as a free block that can be used for data storage, and to mark the second storage block corresponding to the first storage block in the distributed storage device as a free block that can be used for data storage.
  • a storage module is also included for storing necessary program instructions and data of the storage block recovery device.
  • the apparatus includes: a processor and a transceiver, where the processor is configured to support the apparatus for reclaiming storage blocks to perform corresponding functions in the method provided in the first aspect above.
  • The transceiver is used to direct communication between the storage block reclamation device and other devices, and to receive the data involved in the above method that is sent by the client device.
  • The device may further include a memory, which is coupled with the processor and stores the necessary program instructions and data of the storage block reclamation device.
  • When the device is a chip in the storage block reclamation device, the chip includes a processing module and a transceiver module. The transceiver module, which may be, for example, an input/output interface, a pin, or a circuit on the chip, obtains the recovery granularity of the first logical unit created by the storage engine and transmits this information to other chips or modules coupled with the chip. The processing module determines, according to the recovery granularity, the first storage block in the reclaimable state in the storage area to be reclaimed of the first logical unit, marks the first storage block as a free block that can be used for data storage, and marks the second storage block corresponding to the first storage block in the distributed storage device as a free block that can be used for data storage.
  • the processing module can execute the computer-executed instructions stored in the storage unit, so as to support the storage block reclamation device to execute the method provided in the first aspect above.
  • The storage unit may be a storage unit in the chip, such as a register or a cache, or a storage unit located outside the chip, such as a read-only memory (ROM) or another type of static storage device that can store static information and instructions, or a random access memory (RAM).
  • The device includes a communication interface and a logic circuit, where the communication interface is used to obtain the recovery granularity of the first logical unit created by the storage engine, and the logic circuit is used to determine, according to the recovery granularity, the first storage block in the reclaimable state in the storage area to be reclaimed of the first logical unit.
  • The processor mentioned in any of the above may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the program execution of the data transmission method in the above aspects.
  • the embodiments of the present application provide a computer-readable storage medium, where the computer storage medium stores computer instructions, and the computer instructions are used to execute the method described in any possible implementation manner of any one of the above-mentioned aspects.
  • the embodiments of the present application provide a computer program including instructions, which, when run on a computer, cause the computer to execute the method described in any one of the above aspects.
  • The present application provides a system-on-a-chip, which includes a processor configured to support the storage block reclamation device in implementing the functions involved in the above aspects, such as generating or processing the data and/or information involved in the above method.
  • the system-on-a-chip further includes a memory, and the memory is configured to store necessary program instructions and data of the storage block reclamation device, so as to realize functions in any one of the above-mentioned aspects.
  • the system-on-a-chip may consist of chips, or may include chips and other discrete devices.
  • Fig. 1 is an exemplary system architecture diagram of a distributed storage system
  • FIG. 2 is an exemplary architecture diagram of a copy redundant storage mode
  • Fig. 3 is an exemplary architecture diagram of EC 4+2 redundant storage mode
  • FIG. 4 is a schematic diagram of an embodiment of a method for recovering storage blocks in the embodiment of the present application.
  • Fig. 5a is an exemplary flow chart of writing data in the distributed storage system in the embodiment of the present application.
  • Fig. 5b is an exemplary flow chart of modifying data in the distributed storage system in the embodiment of the present application.
  • Fig. 6a is an exemplary flowchart of data recovery by the storage engine in the embodiment of the present application.
  • Fig. 6b is an exemplary flow chart of reclaiming data by a distributed storage device in the embodiment of the present application.
  • FIG. 7 is an exemplary flow chart of rewriting the reclaimed space in the embodiment of the present application.
  • FIG. 8 is a schematic diagram of an embodiment of a storage block recovery device in the embodiment of the present application.
  • FIG. 9 is a schematic diagram of another embodiment of the device for recovering storage blocks in the embodiment of the present application.
  • The naming or numbering of the steps in this application does not mean that the steps in the method flow must be executed in the time or logical sequence indicated by the naming or numbering. The execution order of named or numbered steps may be changed according to the technical purpose to be achieved, as long as the same or similar technical effect can be achieved.
  • The division of units presented in this application is a logical division, and there may be other division methods in actual application. For example, multiple units can be combined or integrated into another system, or some features can be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection between units may be electrical or in other similar forms, which is not limited in this application.
  • The units or subunits described as separate components may or may not be physically separated, may or may not be physical units, or may be distributed across multiple circuit units, and some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of this application.
  • the distributed storage system includes a client, a storage engine, and a distributed storage device.
  • The user can see a storage space (also called the file system, such as the D drive) and can store data in it. The storage engine then writes the data that the client has written to the storage space into a logical unit created by the storage engine, and stores the data written into the logical unit in the distributed storage device, through an append-write interface, in one of several redundant storage modes.
  • the redundant storage mode includes copy redundancy or EC algorithm redundancy.
  • An exemplary scheme of copy redundancy is shown in FIG. 2. That is, when data A written in the logical unit is written into the distributed storage device, data A can be copied into three copies, data A1, data A2 and data A3, and the three copies are then stored in local storage 1, local storage 2, and local storage 3 respectively.
  • An exemplary scheme of EC algorithm redundancy, EC 4+2, is shown in FIG. 3: the data is spread over 4 nodes for storing data (local storage 1 to local storage 4) plus 2 nodes for storing verification data (local storage 5 to local storage 6). That is, when data A written in the logical unit is written into the distributed storage device, data A is fragmented with a predetermined stripe depth into data 1, data 2, data 3 and data 4; at the same time, 2 pieces of verification data are calculated from data 1, data 2, data 3 and data 4.
  • The 6 pieces of data are then distributed and stored in local storage 1 to local storage 6.
  • The stripe depth indicates the size of each data fragment in EC algorithm redundancy. For example, suppose the stripe depth of EC 4+2 is 1MB and 4MB of data needs to be written to the distributed storage device. Then 1MB of data is written to each of local storage 1 to local storage 4, and the verification data calculated from the 4MB of data is written to local storage 5 and local storage 6.
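
The placement of data fragments in the EC 4+2 example can be sketched as below. The function name `stripe_layout` is hypothetical, sizes are plain numbers in MB, and the round-robin placement for data longer than one stripe is an assumption that the text does not state. The sketch only computes where data fragments go; it does not compute the verification data, whose encoding is not described in this section.

```python
def stripe_layout(data_len_mb, stripe_depth_mb, data_nodes=4, parity_nodes=2):
    """Return (layout, parity): layout maps each data node to a list of
    (offset, length) fragments; parity lists the nodes that would hold
    verification data."""
    layout = {f"local storage {i + 1}": [] for i in range(data_nodes)}
    offset = 0
    index = 0
    while offset < data_len_mb:
        length = min(stripe_depth_mb, data_len_mb - offset)
        # Assumption: fragments are placed round-robin across data nodes.
        node = f"local storage {(index % data_nodes) + 1}"
        layout[node].append((offset, length))
        offset += length
        index += 1
    parity = [f"local storage {data_nodes + j + 1}" for j in range(parity_nodes)]
    return layout, parity


# 4MB of data with a 1MB stripe depth: one 1MB fragment per data node.
layout, parity = stripe_layout(4, 1)
for node, fragments in layout.items():
    print(node, fragments)
print("verification data on:", parity)
```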
  • The space used to store data in the logical unit can be reclaimed through a garbage collection (GC) mechanism, which releases the corresponding storage resources so that other data can be stored.
  • An embodiment of the present application provides the following storage block recovery method, which is applied to a distributed storage system including a storage engine and a distributed storage device. The storage block reclamation device obtains the recovery granularity of a first logical unit created by the storage engine; the storage block reclamation device then determines, according to the recovery granularity, a first storage block in a reclaimable state in the storage area to be reclaimed of the first logical unit; finally, the storage block reclamation device marks the first storage block as a free block that can be used for data storage, and marks the second storage block corresponding to the first storage block in the distributed storage device as a free block that can be used for data storage.
  • GSM Global System of Mobile Communication
  • CDMA Code Division Multiple Access
  • WCDMA Wideband Code Division Multiple Access
  • LTE Long Term Evolution
  • FDD Frequency Division Duplex
  • TDD Time Division Duplex
  • UMTS Universal Mobile Telecommunication System
  • 5G communication system and future wireless communication systems, etc.
  • the client in the distributed storage system may be a user equipment, and this application describes various embodiments in conjunction with the user equipment.
  • User equipment may also refer to terminal equipment, an access terminal, a subscriber unit, a subscriber station, a mobile station, a remote station, a remote terminal, a mobile device, a user terminal, a terminal, a wireless communication device, a user agent or a user device.
  • The access terminal can be a cellular phone, a cordless phone, a Session Initiation Protocol (SIP) phone, a Wireless Local Loop (WLL) station, a Personal Digital Assistant (PDA), a handheld device with a wireless communication function, a computing device or other processing device connected to a wireless modem, a vehicle-mounted device, a wearable device, a terminal device in a 5G network, or a terminal device in a future evolved PLMN network, etc.
  • SIP Session Initiation Protocol
  • WLL Wireless Local Loop
  • PDA Personal Digital Assistant
  • an embodiment of the method for reclaiming the storage block in the embodiment of the application includes:
  • the storage block reclamation device acquires the reclamation granularity of the first logical unit created by the storage engine.
  • the storage engine in the distributed storage system will create the first logical unit, and the first logical unit is used to store data.
  • the storage engine determines the redundant storage mode of the first logical unit when creating the first logical unit.
  • the storage block reclaiming device can determine the reclaiming granularity of the first logical unit according to the redundant storage mode of the first logical unit.
  • When the redundant storage mode of the first logical unit is copy redundancy, the storage block reclamation device uses the minimum write granularity of the distributed storage device under copy redundancy as the recovery granularity of the first logical unit. That is, if the minimum write granularity in the distributed storage device is 1MB, then the recovery granularity of the first logical unit is 1MB.
  • When the redundant storage mode of the first logical unit is EC algorithm redundancy, the storage block reclamation device determines the recovery granularity of the first logical unit from the stripe depth of the EC algorithm redundancy and the minimum write granularity of the distributed storage device.
  • When the stripe depth is greater than the minimum write granularity, the recovery granularity of the first logical unit is the stripe depth multiplied by the number of data fragments of the EC algorithm redundancy; when the stripe depth is smaller than the minimum write granularity, the recovery granularity of the first logical unit is the minimum write granularity multiplied by the number of data fragments of the EC algorithm redundancy. For example, if the redundancy mode of the EC algorithm is EC 4+2, the minimum write granularity is 1MB, and the stripe depth is 2MB, then the recovery granularity of the first logical unit is 2MB × 4 = 8MB. If the redundancy mode of the EC algorithm is EC 4+2, the minimum write granularity is 1MB, and the stripe depth is 0.5MB, then the recovery granularity of the first logical unit is 1MB × 4 = 4MB.
  • The storage block reclamation device determines, according to the recovery granularity, a first storage block in a reclaimable state in the storage area to be reclaimed of the first logical unit.
  • The storage area to be reclaimed of the first logical unit is a storage area that stores data that has been marked as garbage.
  • For example, at a first moment, the distributed storage system writes first data into the first logical unit created by the storage engine in the distributed storage system; if, at a second moment, the distributed storage system modifies the first data to obtain modified data, the modified data is written into another logical unit created by the storage engine. The first data can then be confirmed as garbage data, and the storage area storing the first data can be confirmed as a storage area to be reclaimed.
  • The specific process of writing data in the distributed storage system may be as shown in FIG. 5a. Assume that the storage engine is a key-value (KV) storage engine and that the user writes first data (the first data includes lba and length).
  • The storage engine looks for a logical unit that is not full, and calls the append-write interface of the distributed layer to write the first data into the logical unit according to the mapping relationship stored in the hash table.
  • A redundant storage mode is provided between the logical unit and the distributed storage device, and the first data in the logical unit is written into the distributed storage device according to that redundant storage mode. The logical unit created by the storage engine may also store metadata indicating the storage location at which the first data is written in the distributed storage device.
  • If the first data is modified into second data, the specific operation may be as shown in FIG. 5b.
  • The KV storage engine generates two requests: a write request, which writes the modified second data, and a delete request, which confirms the first data as garbage data that needs to be deleted.
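
The way a modification turns old data into garbage can be sketched with a toy append-only store. Everything here is hypothetical: the class name `AppendOnlyKV`, the `valid` and `garbage` tables, and the simplification that all data lands in a single logical unit (the text notes that modified data may go to another logical unit). Only offsets and lengths are tracked, not the data itself.

```python
class AppendOnlyKV:
    """Toy key-value engine that only appends and tracks garbage regions."""

    def __init__(self):
        self.valid = {}        # key -> (offset, length) of the live value
        self.garbage = []      # (offset, length) regions awaiting reclamation
        self.write_offset = 0  # next append position in the logical unit

    def put(self, key, length):
        """Write or modify a value; a modification acts as write plus delete."""
        old = self.valid.get(key)
        if old is not None:
            # Delete request: the previous version becomes garbage data.
            self.garbage.append(old)
        # Write request: the new version is appended, never overwritten in place.
        location = (self.write_offset, length)
        self.valid[key] = location
        self.write_offset += length
        return location


kv = AppendOnlyKV()
kv.put("lba0", 4)   # first data
kv.put("lba0", 4)   # second data replaces the first
print(kv.valid, kv.garbage)
```

The regions collected in `garbage` correspond to what the text calls the storage area to be reclaimed.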
  • the storage engine may also be other storage engines, as long as distributed storage can be realized.
  • the storage block reclaiming device can determine the first storage block in reclaimable state in the storage area to be reclaimed as follows:
  • The storage block reclamation device divides the first logical unit into a plurality of intermediate storage blocks according to a first order and the recovery granularity, where the first order is the order in which storage blocks are occupied, first come first served, as data is written into the first logical unit; the storage block reclamation device then obtains the storage offset of each of the plurality of intermediate storage blocks; when the storage offsets indicate that some of the intermediate storage blocks are located in the storage area to be reclaimed, the intermediate storage blocks located in the storage area to be reclaimed are determined to be first storage blocks in the reclaimable state.
  • For example, the size of the first logical unit is 128MB and the recovery granularity is 4MB. The storage block reclamation device divides the first logical unit, according to the recovery granularity, into the storage blocks 0-4MB, 4MB-8MB, 8MB-12MB, ..., 124MB-128MB. If the storage area to be reclaimed in the first logical unit is 4MB-25MB, the storage block reclamation device can determine that the five intermediate storage blocks lying entirely within that area (4MB-8MB through 20MB-24MB) are first storage blocks in the reclaimable state.
  • The storage block reclamation device marks the first storage block as a free block that can be used for data storage, and marks the second storage block corresponding to the first storage block in the distributed storage device as a free block that can be used for data storage.
  • The storage block reclamation device sends a reclamation command to the distributed storage system to reclaim the first storage block and the second storage block corresponding to the first storage block in the distributed storage device, and finally marks the first storage block and the second storage block as free blocks that can be used for data storage.
  • When the distributed storage system determines, according to the specified granularity, that the first storage block is eligible for reclamation, it directly sets a mark bit on the first storage block, thereby marking it as reclaimed space; at the same time, the second storage block is also marked as reclaimed space.
  • When the distributed storage system performs garbage data collection, it can look up the required data in the garbage hash table in order to generate a reclamation instruction.
  • The distributed storage system marks the first storage block in the first logical unit as reclaimed space according to the reclamation instruction.
  • the second storage block is also marked as reclaimed space, that is, the data block corresponding to the first storage block in the local storage is also marked as reclaimed space, as shown in FIG. 6b.
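
The marking step can be sketched as follows. The function name `reclaim_blocks`, the state strings, and the one-to-one mapping from a first storage block to a single second storage block identifier are all assumptions made for illustration; under EC redundancy a first storage block would correspond to fragments on several local storages.

```python
def reclaim_blocks(unit_state, device_state, unit_to_device, first_blocks):
    """Mark each first storage block, and the second storage block it maps
    to in the distributed storage device, as reclaimed space.

    No valid data is copied elsewhere: only the marks change.
    """
    for block in first_blocks:
        second_block = unit_to_device[block]
        unit_state[block] = "reclaimed"
        device_state[second_block] = "reclaimed"
    return unit_state, device_state


# Hypothetical state: blocks are named by their offset range in MB.
unit_state = {"4-8": "garbage", "8-12": "garbage", "24-28": "valid"}
device_state = {"data block 1": "used", "data block 2": "used", "data block 3": "used"}
unit_to_device = {"4-8": "data block 1", "8-12": "data block 2", "24-28": "data block 3"}

reclaim_blocks(unit_state, device_state, unit_to_device, ["4-8", "8-12"])
print(unit_state)
print(device_state)
```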
  • The details can be as follows: when the first logical unit created by the storage engine is full, or has been marked as no longer writable, and the storage engine needs to write second data, the storage engine may write into the space marked as reclaimed in the first logical unit through a first interface, where the first interface supports intermediate writing. In this way, once the second data is written, the space marked as reclaimed in the first logical unit created by the storage engine (such as the first storage block) holds data again (that is, the holes in the first logical unit created by the storage engine are filled), and the reclaimed marker in the first logical unit created by the storage engine is deleted.
  • The user may search the reclaimed hash table for reclaimed space to write into. After writing, the space is removed from the reclaimed hash table and added to the hash table of valid data. After the logical unit has been written, the distributed storage device in the distributed storage system can write the new data into the data block corresponding to the reclaimed space (data block 3 as shown in FIG. 7) and clear the reclaimed flag for that data block.
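
Rewriting into reclaimed space can be sketched with two hash tables, one for reclaimed space and one for valid data. The class name `ReclaimedSpace` and its methods are hypothetical, and the sketch assumes that a write consumes a whole reclaimed block even when the data is smaller than the block, which the text does not specify.

```python
class ReclaimedSpace:
    """Toy bookkeeping for holes in a logical unit that is otherwise full."""

    def __init__(self):
        self.reclaimed = {}  # offset -> length of each reclaimed block
        self.valid = {}      # offset -> length of each block holding valid data

    def mark_reclaimed(self, offset, length):
        """Record a block as reclaimed space."""
        self.valid.pop(offset, None)
        self.reclaimed[offset] = length

    def write_into_hole(self, length):
        """Find a reclaimed block large enough for the data, move it from the
        reclaimed table to the valid table, and return its offset.
        Returns None when no reclaimed block fits."""
        for offset, hole_length in list(self.reclaimed.items()):
            if length <= hole_length:
                del self.reclaimed[offset]   # clear the reclaimed marker
                self.valid[offset] = length  # the hole now holds valid data
                return offset
        return None


space = ReclaimedSpace()
space.mark_reclaimed(4, 4)
space.mark_reclaimed(8, 4)
print(space.write_into_hole(3), space.reclaimed, space.valid)
```

The returned offset is the position at which an interface supporting intermediate writing would place the second data.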
  • the device for reclaiming storage blocks includes corresponding hardware structures and/or software modules for performing various functions.
  • The present application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a certain function is executed by hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementation should not be regarded as exceeding the scope of the present application.
  • the embodiment of the present application may divide the storage block recovery device into functional modules according to the above method example, for example, each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module.
  • the above-mentioned integrated modules can be implemented in the form of hardware or in the form of software function modules. It should be noted that the division of modules in the embodiment of the present application is schematic, and is only a logical function division, and there may be other division methods in actual implementation.
  • FIG. 8 is a schematic diagram of the hardware structure of the storage block reclamation device in the embodiment of the present application.
  • the storage block reclamation device may be a possible implementation manner of the distributed storage system in the embodiment of the present application.
  • the device for reclaiming storage blocks at least includes a processor 804 , a memory 803 , and a transceiver 802 , and the memory 803 is further used to store instructions 8031 and data 8032 .
  • the device for recovering storage blocks may further include an antenna 806 , an input/output (Input/Output, I/O) interface 810 and a bus 812 .
  • I/O input/output
  • The transceiver 802 further includes a transmitter 8021 and a receiver 8022.
  • The processor 804, the transceiver 802, the memory 803, and the I/O interface 810 are communicatively connected to each other through the bus 812, and the antenna 806 is connected to the transceiver 802.
  • The processor 804 may be a general-purpose processor, such as but not limited to a central processing unit (CPU), or a special-purpose processor, such as but not limited to a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or a field programmable gate array (FPGA).
  • The processor 804 may also be a neural processing unit (NPU).
  • The processor 804 may also be a combination of multiple processors.
  • The processor 804 may be used to execute the relevant steps of the storage block reclamation method in the method embodiments.
  • The processor 804 may be a processor specially designed to perform the above steps and/or operations, or a processor that performs them by reading and executing the instructions 8031 stored in the memory 803.
  • The processor 804 may need the data 8032 while performing the above steps and/or operations.
  • The transceiver 802 includes a transmitter 8021 and a receiver 8022.
  • The transmitter 8021 is used to send signals through the antenna 806.
  • The receiver 8022 is used to receive signals through at least one of the antennas 806.
  • The transmitter 8021 may be used to perform, through at least one of the antennas 806, the operations performed by the receiving module or the transceiver module of the storage block reclamation device when the storage block reclamation method of the method embodiments is applied to that device.
  • The transceiver 802 is used to support the storage block reclamation device in performing the aforementioned receiving and sending functions.
  • A processor having processing functions is regarded as the processor 804.
  • The receiver 8022 may also be called an input port or a receiving circuit, and the transmitter 8021 may also be called a transmitting circuit.
  • The processor 804 can execute the instructions stored in the memory 803 to control the transceiver 802 to receive and/or send messages, thereby completing the functions of the storage block reclamation device in the method embodiments of the present application.
  • The function of the transceiver 802 may be realized by a transceiver circuit or a dedicated transceiver chip.
  • Receiving a message by the transceiver 802 may be understood as the transceiver 802 inputting a message, and sending a message by the transceiver 802 may be understood as the transceiver 802 outputting a message.
  • The memory 803 can be any of various types of storage media, such as random access memory (RAM), read-only memory (ROM), non-volatile RAM (NVRAM), programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), flash memory, optical memory, and registers.
  • The memory 803 is specifically used to store the instructions 8031 and the data 8032.
  • The processor 804 can read and execute the instructions 8031 stored in the memory 803 to perform the steps and/or operations in the method embodiments of the present application.
  • The data 8032 may be needed while performing those steps and/or operations.
  • The storage block reclamation device may further include an I/O interface 810, which is used to receive instructions and/or data from peripheral devices and to output instructions and/or data to peripheral devices.
  • FIG. 9 is a schematic diagram of an embodiment of the storage block reclamation device in an embodiment of the present application.
  • The storage block reclamation device 900 includes: an acquisition module 901, configured to acquire the reclaim granularity of a first logical unit created by the storage engine;
  • a determination module 902, configured to determine, according to the reclaim granularity, a first storage block in a reclaimable state within the to-be-reclaimed storage area of the first logical unit;
  • a processing module 903, configured to mark the first storage block as a free block available for data storage, and to mark the second storage block corresponding to the first storage block in the distributed storage device as a free block available for data storage.
  • The acquisition module 901 is specifically configured to obtain the redundant storage mode of the first logical unit, and to determine the reclaim granularity of the first logical unit according to the redundant storage mode.
  • When the redundant storage mode is replica redundancy, the acquisition module 901 is specifically configured to determine that the minimum write granularity of the replica redundancy in the distributed storage device is the reclaim granularity of the first logical unit.
  • When the redundant storage mode is erasure code (EC) redundancy, the acquisition module 901 is specifically configured to determine the reclaim granularity of the first logical unit according to the stripe depth of the EC redundancy and the minimum write granularity of the distributed storage device.
  • The acquisition module 901 is specifically configured to: when the stripe depth is greater than the minimum write granularity, determine that the product of the stripe depth and the number of data fragments of the EC redundancy is the reclaim granularity of the first logical unit;
  • when the minimum write granularity is greater than the stripe depth, determine that the product of the minimum write granularity and the number of data fragments of the EC redundancy is the reclaim granularity of the first logical unit.
  • The determination module 902 is specifically configured to: divide the first logical unit into a plurality of intermediate storage blocks according to a first order and the reclaim granularity, where the first order is the order in which storage blocks are occupied, from first to last, as data is written into the first logical unit; obtain the storage offset of each of the plurality of intermediate storage blocks; and, when the storage offsets indicate that some of the intermediate storage blocks lie within the to-be-reclaimed storage area, determine that the intermediate storage blocks lying within the to-be-reclaimed storage area are first storage blocks in a reclaimable state.
  • The storage block reclamation device further includes a writing module 904, configured to write data into the first storage block through a first interface when the first logical unit cannot perform append writes, where the first interface supports intermediate writes.
  • The storage block reclamation device in the above embodiments may be a chip applied in the storage block reclamation device, or another combined device or component that can implement the functions of the storage block reclamation device.
  • The transceiver module in the storage block reclamation device may be a transceiver, and the processing module may be a processor, such as a chip.
  • When the storage block reclamation device is a chip system, the receiving part of the transceiver module may be an input port of the chip system, the sending part of the transceiver module may be an output interface of the chip system, and the processing module may be a processor of the chip system, for example a central processing unit (CPU).
  • The memory included in the storage block reclamation device is mainly used to store software programs and data, for example the programs described in the above embodiments.
  • The storage block reclamation device also has the following functions:
  • a processor, configured to: obtain the reclaim granularity of the first logical unit created by the storage engine; determine, according to the reclaim granularity, a first storage block in a reclaimable state within the to-be-reclaimed storage area of the first logical unit; mark the first storage block as a free block available for data storage; and mark the second storage block corresponding to the first storage block in the distributed storage device as a free block available for data storage.
  • The processor is specifically configured to obtain the redundant storage mode of the first logical unit, and to determine the reclaim granularity of the first logical unit according to the redundant storage mode.
  • When the redundant storage mode is replica redundancy, the processor is specifically configured to determine that the minimum write granularity of the replica redundancy in the distributed storage device is the reclaim granularity of the first logical unit.
  • When the redundant storage mode is erasure code (EC) redundancy, the processor is specifically configured to determine the reclaim granularity of the first logical unit according to the stripe depth of the EC redundancy and the minimum write granularity of the distributed storage device.
  • The processor is specifically configured to: when the stripe depth is greater than the minimum write granularity, determine that the product of the stripe depth and the number of data fragments of the EC redundancy is the reclaim granularity of the first logical unit;
  • when the minimum write granularity is greater than the stripe depth, determine that the product of the minimum write granularity and the number of data fragments of the EC redundancy is the reclaim granularity of the first logical unit.
  • The processor is specifically configured to: divide the first logical unit into a plurality of intermediate storage blocks according to a first order and the reclaim granularity, where the first order is the order in which storage blocks are occupied, from first to last, as data is written into the first logical unit; obtain the storage offset of each of the plurality of intermediate storage blocks; and, when the storage offsets indicate that some of the intermediate storage blocks lie within the to-be-reclaimed storage area, determine that the intermediate storage blocks lying within the to-be-reclaimed storage area are first storage blocks in a reclaimable state.
  • The processor is configured to write data into the first storage block through a first interface when the first logical unit cannot perform append writes, where the first interface supports intermediate writes.
  • An embodiment of the present application also provides a processing device.
  • The processing device includes a processor and an interface; the processor is configured to execute the storage block reclamation method of any one of the above method embodiments.
  • The processing device may be a chip, and the processor may be implemented in hardware or in software.
  • When implemented in hardware, the processor may be a logic circuit, an integrated circuit, or the like; when implemented in software, the processor may be a general-purpose processor that operates by reading software code stored in a memory.
  • The memory may be integrated in the processor, or may be located outside the processor and exist independently.
  • The hardware processing circuit may include an application-specific integrated circuit (ASIC) or a programmable logic device (PLD); the PLD may in turn include a field programmable gate array (FPGA), a complex programmable logic device (CPLD), and so on.
  • Such a hardware processing circuit may be a separately packaged semiconductor chip (for example, packaged as an ASIC), or may be integrated with other circuits (such as a CPU or DSP) and packaged as one semiconductor chip. For example, multiple hardware circuits and a CPU may be formed on one silicon substrate and packaged as a single chip, also called a system on chip (SoC). Alternatively, circuits implementing FPGA functions and a CPU may be formed on one silicon substrate and packaged as a single chip, also called a system on a programmable chip (SoPC).
  • An embodiment of the present application also provides a computer-readable storage medium including instructions which, when run on a computer, cause the computer to control the storage block reclamation device to execute any one of the implementations shown in the foregoing method embodiments.
  • An embodiment of the present application also provides a computer program product that includes computer program code; when the computer program code runs on a computer, the computer executes any one of the implementations shown in the foregoing method embodiments.
  • An embodiment of the present application also provides a chip system including a memory and a processor. The memory is used to store a computer program, and the processor is used to call and run the computer program from the memory, so that the chip executes any one of the implementations shown in the foregoing method embodiments.
  • An embodiment of the present application also provides a chip system including a processor, where the processor is configured to call and run a computer program, so that the chip executes any one of the implementations shown in the foregoing method embodiments.
  • The device embodiments described above are only illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • The connection relationship between modules indicates that they have a communication connection, which may be implemented as one or more communication buses or signal lines.
  • The essence of the technical solution of this application, or the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a readable storage medium, such as a computer floppy disk, USB flash drive, removable hard disk, ROM, RAM, magnetic disk, or optical disc, and includes several instructions that cause a computer device to execute the methods described in the embodiments of the present application.
  • All or part of the above embodiments may be implemented by software, hardware, firmware, or any combination thereof.
  • When implemented in software, they may be implemented in whole or in part in the form of a computer program product.
  • The computer program product includes one or more computer instructions.
  • The computer can be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device.
  • The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, first network device or second network device, computing device, or data center to another website, computer, first network device or second network device, computing device, or data center.
  • The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a first network device or second network device or a data center, that integrates one or more available media.
  • The available medium may be a magnetic medium (such as a floppy disk, a hard disk, or a magnetic tape), an optical medium (such as a DVD), or a semiconductor medium (such as a solid state disk (SSD)).
  • "B corresponding to A" means that B is associated with A and that B can be determined according to A.
  • Determining B according to A does not mean determining B only according to A; B may also be determined according to A and/or other information.
  • The disclosed system, device, and method can be implemented in other ways.
  • The device embodiments described above are only illustrative.
  • The division of units is only a logical functional division; other divisions are possible in an actual implementation.
  • Multiple units or components can be combined or integrated into another system, or some features may be ignored or not implemented.
  • The mutual coupling, direct coupling, or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be electrical, mechanical, or of another form.
  • A unit described as a separate component may or may not be physically separate, and a component displayed as a unit may or may not be a physical unit; that is, it may be located in one place or distributed across multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • The functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • The integrated units can be implemented in the form of hardware or in the form of software functional units.
  • If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • The technical solution of the present application, in essence, or the part that contributes over the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a storage block reclamation device, or the like) to execute all or part of the steps of the methods in the embodiments of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Provided in the present application are a storage block collection method and a related apparatus, which are used for increasing the collection speed of garbage data in a distributed storage system. The method is applied to a distributed storage system which comprises a storage engine and a distributed storage device, and the method comprises: a storage block collection apparatus acquiring a collection granularity of a first logic unit that is created by a storage engine; the storage block collection apparatus then determining, according to the collection granularity, a first storage block that is in a collectable state and is in a storage area to be subjected to collection of the first logic unit; and finally, the storage block collection apparatus marking the first storage block as a free block that can be used for data storage, and marking a second storage block, which is in a distributed storage device and corresponds to the first storage block, as a free block that can be used for data storage.

Description

A method for reclaiming storage blocks and related devices
This application claims priority to Chinese Patent Application No. 202111033548.2, filed with the China National Intellectual Property Administration on September 3, 2021 and entitled "A method for reclaiming storage blocks and related devices", which is incorporated herein by reference in its entirety.
Technical Field
The present application relates to the field of storage technologies, and in particular to a method for reclaiming storage blocks and related devices.
Background
With the rapid development of computer and network technologies, the growth of disk capacity and data bus bandwidth can no longer keep up with the storage demands of massive data. Storing massive data has therefore become a pressing problem in the development of Internet technology, and distributed storage system technology has been proposed to address it.
A current distributed storage system architecture, shown in FIG. 1, provides a client, a storage engine, and a distributed storage device. On the client, the user sees a storage space and can store data in it. The storage engine writes the data to be stored by the client into a logical unit created by the storage engine and then, through an append-write interface, stores the data written into the logical unit in the distributed storage device using one of several redundant storage modes. In such a distributed storage system, to improve the utilization of storage resources, a garbage collection (GC) mechanism can reclaim the space used to store data in the logical unit, releasing the corresponding storage resources so that they can store other data.
At present, however, the valid data on a logical unit usually has to be migrated as a whole before the logical unit can be reclaimed and the corresponding storage resources released, which makes reclamation slow.
Summary
Embodiments of the present application provide a method for reclaiming storage blocks and related devices, which are used to speed up the reclamation of garbage data in a distributed storage system.
In a first aspect, an embodiment of the present application provides a method for reclaiming storage blocks, applied to a distributed storage system that includes a storage engine and a distributed storage device. A storage block reclamation device obtains the reclaim granularity of a first logical unit created by the storage engine; the storage block reclamation device then determines, according to the reclaim granularity, a first storage block in a reclaimable state within the to-be-reclaimed storage area of the first logical unit; finally, the storage block reclamation device marks the first storage block as a free block available for data storage, and marks the second storage block corresponding to the first storage block in the distributed storage device as a free block available for data storage.
It should be understood that the to-be-reclaimed storage area in the first logical unit is a storage area whose written data has been marked as garbage data. For example, if the size of the first logical unit is 128 megabytes (MB) and the data written in the storage area from storage offset 4 MB to storage offset 25 MB is marked as garbage data, then the storage area from offset 4 MB to offset 25 MB is the to-be-reclaimed storage area.
In this embodiment, the storage block reclamation device reclaims storage resources at the reclaim granularity that the storage engine determined when it created the logical unit, and directly marks storage blocks in a reclaimable state as free blocks. The storage resource can thus be reclaimed without first moving the valid data in the blocks to be reclaimed to other storage space, which speeds up the reclamation of garbage data in the distributed storage system.
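The marking step can be pictured with a minimal Python sketch. Everything named here is an assumption made for illustration: the state dictionaries, the `mark_reclaimed` function, and in particular `block_map`, a hypothetical lookup from a block of the logical unit to the device-side block or blocks backing it. The application does not describe how that correspondence is stored.

```python
FREE = "free"
IN_USE = "in_use"


def mark_reclaimed(first_block, unit_state, device_state, block_map):
    """Mark one reclaimable block free in the logical unit and in the device.

    No data is copied: only the state of the block changes. block_map is a
    hypothetical mapping from a logical-unit block to its device-side blocks.
    """
    unit_state[first_block] = FREE
    for second_block in block_map[first_block]:
        device_state[second_block] = FREE


# Logical unit with three 4 MB blocks, keyed by start offset in MB.
unit_state = {0: IN_USE, 4: IN_USE, 8: IN_USE}
# Device-side blocks; the identifiers are made up for illustration.
device_state = {"d0": IN_USE, "d1": IN_USE, "d2": IN_USE}
block_map = {0: ["d0"], 4: ["d1"], 8: ["d2"]}

# Reclaim the block that starts at offset 4 MB.
mark_reclaimed(4, unit_state, device_state, block_map)
```

In this sketch the neighbouring blocks at offsets 0 MB and 8 MB stay in use, which reflects the point of the paragraph above: reclamation acts on individual blocks without migrating the rest of the logical unit.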
Optionally, the storage block reclamation device obtains the reclaim granularity of the first logical unit as follows: the storage block reclamation device obtains the redundant storage mode of the first logical unit, and then determines the reclaim granularity of the first logical unit according to that redundant storage mode.
Based on this solution, the storage block reclamation device may determine the reclaim granularity of the first logical unit according to the redundant storage mode in the following possible ways:
In one possible implementation, when the redundant storage mode of the first logical unit is replica redundancy, the storage block reclamation device takes the minimum write granularity of the replica redundancy in the distributed storage device as the reclaim granularity of the first logical unit. That is, if the minimum write granularity in the distributed storage device is 1 MB, the reclaim granularity of the first logical unit is 1 MB.
In another possible implementation, when the redundant storage mode of the first logical unit is erasure code (EC) redundancy, the storage block reclamation device determines the reclaim granularity of the first logical unit according to the stripe depth of the EC redundancy and the minimum write granularity of the distributed storage device. Specifically, when the stripe depth is greater than the minimum write granularity, the product of the stripe depth and the number of data fragments of the EC redundancy is the reclaim granularity of the first logical unit; when the minimum write granularity is greater than the stripe depth, the product of the minimum write granularity and the number of data fragments of the EC redundancy is the reclaim granularity of the first logical unit. For example, with EC 4+2, a minimum write granularity of 1 MB, and a stripe depth of 2 MB, the reclaim granularity of the first logical unit is 2 MB × 4 = 8 MB. With EC 4+2, a minimum write granularity of 1 MB, and a stripe depth of 0.5 MB, the reclaim granularity of the first logical unit is 1 MB × 4 = 4 MB.
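The two rules above can be summarised in a short Python sketch. The function name, the `"replica"` and `"ec"` mode strings, and the parameter names are assumptions chosen for illustration. The text only covers the cases where one value is strictly greater than the other; the sketch uses `max`, which also handles equal values, where both products coincide.

```python
def reclaim_granularity(mode, min_write_mb, stripe_depth_mb=None, data_fragments=None):
    """Return the reclaim granularity of a logical unit, in MB.

    mode is "replica" or "ec" (hypothetical labels for the two redundant
    storage modes). For EC, data_fragments is the number of data fragments,
    for example 4 for EC 4+2.
    """
    if mode == "replica":
        # Replica redundancy: the minimum write granularity is used directly.
        return min_write_mb
    if mode == "ec":
        if stripe_depth_mb is None or data_fragments is None:
            raise ValueError("EC mode needs stripe_depth_mb and data_fragments")
        # The larger of stripe depth and minimum write granularity,
        # multiplied by the number of data fragments.
        return max(stripe_depth_mb, min_write_mb) * data_fragments
    raise ValueError(f"unknown redundant storage mode: {mode}")


replica_result = reclaim_granularity("replica", min_write_mb=1)
ec_deep_stripe = reclaim_granularity("ec", min_write_mb=1, stripe_depth_mb=2, data_fragments=4)
ec_shallow_stripe = reclaim_granularity("ec", min_write_mb=1, stripe_depth_mb=0.5, data_fragments=4)
```

With the inputs taken from the examples in the text, the three calls give 1 MB, 8 MB, and 4 MB respectively.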
Optionally, the storage block reclamation device determines, according to the reclaim granularity, the first storage block in a reclaimable state within the to-be-reclaimed storage area of the first logical unit as follows. The storage block reclamation device divides the first logical unit into a plurality of intermediate storage blocks according to a first order and the reclaim granularity, where the first order is the order in which storage blocks are occupied, from first to last, as data is written into the first logical unit. The storage block reclamation device then obtains the storage offset of each intermediate storage block. When the storage offsets indicate that some of the intermediate storage blocks lie within the to-be-reclaimed storage area, the intermediate storage blocks lying within that area are determined to be first storage blocks in a reclaimable state. For example, suppose the size of the first logical unit is 128 MB and the reclaim granularity is 4 MB. The storage block reclamation device divides the first logical unit at that granularity into storage blocks 0-4 MB, 4 MB-8 MB, 8 MB-12 MB, ..., 124 MB-128 MB. If the to-be-reclaimed storage area in the first logical unit is 4 MB-25 MB, the storage block reclamation device can determine that the five intermediate storage blocks 4 MB-8 MB, 8 MB-12 MB, 12 MB-16 MB, 16 MB-20 MB, and 20 MB-24 MB are first storage blocks in a reclaimable state.
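The selection described above can be sketched in a few lines of Python. The function and parameter names are assumptions for illustration, sizes are whole megabytes, and the sketch assumes the logical unit size is a multiple of the reclaim granularity. A block counts as reclaimable only when it lies entirely inside the to-be-reclaimed area, which is why the partially covered block 24 MB-28 MB is left out in the example.

```python
def reclaimable_blocks(unit_size_mb, granularity_mb, region_start_mb, region_end_mb):
    """Return (start, end) offsets of blocks lying fully inside the reclaim area.

    The logical unit is cut into fixed-size blocks in write order, starting
    at offset 0, and each block is compared against the to-be-reclaimed area.
    """
    blocks = []
    for start in range(0, unit_size_mb - granularity_mb + 1, granularity_mb):
        end = start + granularity_mb
        if start >= region_start_mb and end <= region_end_mb:
            blocks.append((start, end))
    return blocks


# Example from the text: 128 MB unit, 4 MB granularity, area 4 MB-25 MB.
selected = reclaimable_blocks(128, 4, 4, 25)
```

For these inputs `selected` holds the five blocks (4, 8), (8, 12), (12, 16), (16, 20), and (20, 24), matching the example in the paragraph above.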
Optionally, when the first logical unit cannot perform append writes, the storage block reclamation device may write data into the first storage block through a first interface, where the first interface supports intermediate writes.
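One way to read this fallback is sketched below in Python. The `LogicalUnit` class, its fields, and the `"append"` and `"intermediate"` labels are all hypothetical. The application only states that a first interface supporting intermediate writes is used once append writes are no longer possible, so the policy of reusing the oldest reclaimed block is an assumption of the sketch.

```python
class LogicalUnit:
    """Toy model of a logical unit with an append pointer and reclaimed blocks."""

    def __init__(self, size_mb, granularity_mb):
        self.size_mb = size_mb
        self.granularity_mb = granularity_mb
        self.append_offset = 0   # next offset for an append write
        self.free_blocks = []    # start offsets of blocks marked free

    def mark_free(self, start_mb):
        self.free_blocks.append(start_mb)

    def write(self, length_mb):
        """Return (interface, offset) describing where the data would go."""
        if self.append_offset + length_mb <= self.size_mb:
            offset = self.append_offset
            self.append_offset += length_mb
            return ("append", offset)
        # Append is impossible: fall back to a reclaimed block, if one fits.
        if self.free_blocks and length_mb <= self.granularity_mb:
            return ("intermediate", self.free_blocks.pop(0))
        raise RuntimeError("no space left in logical unit")


unit = LogicalUnit(size_mb=8, granularity_mb=4)
first_write = unit.write(8)    # fills the unit through the append path
unit.mark_free(4)              # the block at offset 4 MB has been reclaimed
second_write = unit.write(4)   # append no longer fits, so the free block is reused
```

Here the first write goes through the append path at offset 0, and the second lands in the reclaimed block at offset 4 MB through the intermediate-write path.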
第二方面,本申请实施例提供一种存储块回收装置,该装置具有实现上述第一方面中存储块回收装置行为的功能。该功能可以通过硬件实现,也可以通过硬件执行相应的软件实现。该硬件或软件包括一个或多个与上述功能相对应的模块。In a second aspect, an embodiment of the present application provides a device for reclaiming a storage block, which has a function of implementing the behavior of the device for reclaiming a storage block in the first aspect. This function may be implemented by hardware, or may be implemented by executing corresponding software on the hardware. The hardware or software includes one or more modules corresponding to the above functions.
在一个可能的实现方式中,该装置包括用于执行以上第一方面各个步骤的单元或模块。例如,该装置包括:获取模块,用于获取所述存储引擎创建的第一逻辑单元的回收粒度;确定模块,用于根据所述回收粒度确定所述第一逻辑单元的待回收存储区域内的处于可回收状态的第一存储块;处理模块,用于将所述第一存储块标记为能够用于数据存储的空闲块,并将所述第一存储块在所述分布式存储设备中的对应的第二存储块标记为能够用于数 据存储的空闲块。In a possible implementation manner, the apparatus includes a unit or a module for performing each step of the above first aspect. For example, the device includes: an acquiring module, configured to acquire the recovery granularity of the first logical unit created by the storage engine; a determining module, configured to determine the storage area of the first logical unit to be recovered according to the recovery granularity A first storage block in a recoverable state; a processing module, configured to mark the first storage block as a free block that can be used for data storage, and store the first storage block in the distributed storage device The corresponding second storage block is marked as a free block that can be used for data storage.
Optionally, the apparatus further includes a storage module, configured to store the program instructions and data necessary for the storage block reclamation apparatus.
In a possible implementation, the apparatus includes a processor and a transceiver. The processor is configured to support the storage block reclamation apparatus in performing the corresponding functions in the method provided in the first aspect. The transceiver is used for communication between the storage block reclamation apparatus and other apparatuses, and receives the data involved in the above method that is sent by a client apparatus. Optionally, the apparatus may further include a memory, which is coupled to the processor and stores the program instructions and data necessary for the storage block reclamation apparatus.
In a possible implementation, when the apparatus is a chip in the storage block reclamation apparatus, the chip includes a processing module and a transceiver module. The transceiver module may be, for example, an input/output interface, a pin, or a circuit on the chip; it obtains the reclamation granularity of the first logical unit created by the storage engine and transfers this information to other chips or modules coupled to the chip. The processing module may be, for example, a processor, which is configured to determine, according to the reclamation granularity, the first storage block that is in a reclaimable state within the to-be-reclaimed storage area of the first logical unit, to mark the first storage block as a free block that can be used for data storage, and to mark the second storage block corresponding to the first storage block in the distributed storage device as a free block that can be used for data storage. The processing module can execute computer-executable instructions stored in a storage unit, so as to support the storage block reclamation apparatus in performing the method provided in the first aspect. Optionally, the storage unit may be a storage unit inside the chip, such as a register or a cache, or a storage unit outside the chip, such as a read-only memory (ROM) or another type of static storage device that can store static information and instructions, or a random access memory (RAM).
In a possible implementation, the apparatus includes a communication interface and a logic circuit. The communication interface is configured to obtain the reclamation granularity of the first logical unit created by the storage engine. The logic circuit is configured to determine, according to the reclamation granularity, the first storage block that is in a reclaimable state within the to-be-reclaimed storage area of the first logical unit, to mark the first storage block as a free block that can be used for data storage, and to mark the second storage block corresponding to the first storage block in the distributed storage device as a free block that can be used for data storage.
The processor mentioned anywhere above may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling program execution of the data transmission methods of the above aspects.
In a third aspect, an embodiment of this application provides a computer-readable storage medium. The computer storage medium stores computer instructions, and the computer instructions are used to perform the method described in any possible implementation of any one of the above aspects.
In a fourth aspect, an embodiment of this application provides a computer program containing instructions that, when run on a computer, cause the computer to perform the method described in any one of the above aspects.
In a fifth aspect, this application provides a chip system. The chip system includes a processor, configured to support the storage block reclamation apparatus in implementing the functions involved in the above aspects, for example, generating or processing the data and/or information involved in the above method. In a possible design, the chip system further includes a memory, configured to store the program instructions and data necessary for the storage block reclamation apparatus, so as to implement the functions of any one of the above aspects. The chip system may consist of a chip, or may include a chip and other discrete components.
Description of Drawings
FIG. 1 is an exemplary system architecture diagram of a distributed storage system;
FIG. 2 is an exemplary architecture diagram of the replica redundancy storage mode;
FIG. 3 is an exemplary architecture diagram of the EC 4+2 redundancy storage mode;
FIG. 4 is a schematic diagram of an embodiment of the storage block reclamation method in an embodiment of this application;
FIG. 5a is an exemplary flowchart of writing data in the distributed storage system in an embodiment of this application;
FIG. 5b is an exemplary flowchart of modifying data in the distributed storage system in an embodiment of this application;
FIG. 6a is an exemplary flowchart of the storage engine reclaiming data in an embodiment of this application;
FIG. 6b is an exemplary flowchart of the distributed storage device reclaiming data in an embodiment of this application;
FIG. 7 is an exemplary flowchart of writing again into reclaimed space in an embodiment of this application;
FIG. 8 is a schematic diagram of an embodiment of the storage block reclamation apparatus in an embodiment of this application;
FIG. 9 is a schematic diagram of another embodiment of the storage block reclamation apparatus in an embodiment of this application.
Detailed Description
To make the objectives, technical solutions, and advantages of this application clearer, the embodiments of this application are described below with reference to the accompanying drawings. Clearly, the described embodiments are only some, not all, of the embodiments of this application. A person of ordinary skill in the art will appreciate that, as new application scenarios emerge, the technical solutions provided in the embodiments of this application are equally applicable to similar technical problems.
The terms "first", "second", and the like in the specification, claims, and accompanying drawings of this application are used to distinguish between similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that terms used in this way are interchangeable where appropriate, so that the embodiments described here can be implemented in an order other than that illustrated or described here. In addition, the terms "include" and "have" and any variants thereof are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or modules is not necessarily limited to the steps or modules expressly listed, but may include other steps or modules that are not expressly listed or that are inherent to the process, method, product, or device. The naming or numbering of steps in this application does not mean that the steps in a method flow must be performed in the temporal or logical order indicated by the naming or numbering; the execution order of named or numbered steps may be changed according to the technical objective to be achieved, provided that the same or a similar technical effect is achieved. The division into units in this application is a logical division, and other divisions are possible in an actual implementation. For example, multiple units may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the couplings, direct couplings, or communication connections shown or discussed may be implemented through certain interfaces, and indirect couplings or communication connections between units may be electrical or of another similar form; none of this is limited in this application. Furthermore, units or subunits described as separate components may or may not be physically separate, may or may not be physical units, or may be distributed across multiple circuit units; some or all of them may be selected according to actual needs to achieve the objectives of the solutions of this application.
For ease of understanding, the distributed storage system is described below:
In an exemplary solution shown in FIG. 1, the distributed storage system includes a client, a storage engine, and a distributed storage device. On the client, the user sees a storage space (also called a file system, such as drive D) and can store data in it. The storage engine writes the to-be-stored data that the client wrote into the storage space into a logical unit created by the storage engine. The data written into the logical unit is then stored in the distributed storage device through an append-write interface, using one of several redundant storage modes.
The redundant storage mode is either replica redundancy or erasure coding (EC) redundancy. FIG. 2 shows an exemplary replica redundancy scheme. When data A written in the logical unit is written to the distributed storage device, data A can be copied into three replicas, data A1, data A2, and data A3, which are then stored in local storage 1, local storage 2, and local storage 3, respectively.
FIG. 3 shows an exemplary EC redundancy scheme, described using EC 4+2 as an example: data is sharded across 4 data nodes plus 2 nodes that store parity data. The nodes that store data correspond to local storage 1 through local storage 4, and the nodes that store parity data correspond to local storage 5 and local storage 6. When data A written in the logical unit is written to the distributed storage device, data A can be sharded at a predetermined stripe depth into data 1, data 2, data 3, and data 4, and 2 pieces of parity data are computed from data 1, data 2, data 3, and data 4. The 6 pieces of data are then distributed across local storage 1 through local storage 6. The stripe depth indicates the size of each data shard under EC redundancy. For example, if the stripe depth of EC 4+2 is 1 MB and 4 MB of data needs to be written to the distributed storage device, 1 MB of data is written to each of local storage 1 through local storage 4, and the parity data computed from these 4 MB of data is written to local storage 5 and local storage 6.
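The sharding step described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `stripe_shards` is hypothetical, a short final stripe is zero-padded, and the computation of the 2 parity shards is deliberately left out because the text does not specify the code used.

```python
def stripe_shards(data: bytes, stripe_depth: int, data_shards: int = 4) -> list:
    """Split data into stripes of data_shards shards, each stripe_depth bytes long."""
    if stripe_depth <= 0 or data_shards <= 0:
        raise ValueError("stripe_depth and data_shards must be positive")
    stripe_size = stripe_depth * data_shards
    stripes = []
    for start in range(0, len(data), stripe_size):
        # Assumption: a short final stripe is zero-padded to full width.
        stripe = data[start:start + stripe_size].ljust(stripe_size, b"\0")
        stripes.append([stripe[i * stripe_depth:(i + 1) * stripe_depth]
                        for i in range(data_shards)])
    return stripes


# 16 bytes with a stripe depth of 4 bytes: one stripe, one shard per data node.
# Parity shards for local storage 5 and 6 are not computed in this sketch.
shards = stripe_shards(b"ABCDEFGHIJKLMNOP", stripe_depth=4)[0]
for node, shard in enumerate(shards, start=1):
    print(f"local storage {node}: {shard!r}")
```

With a stripe depth of 1 MB and 4 MB of input, the same function would yield one stripe of four 1 MB shards, matching the example in the text.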
In the distributed storage system, to improve the utilization of storage resources, a garbage collection (GC) mechanism can reclaim the space used to store data in a logical unit, releasing the corresponding storage resources so that they can be used to store other data. At present, however, the valid data on a logical unit usually has to be migrated as a whole before the logical unit can be reclaimed to release the corresponding storage resources, which makes data reclamation slow.
To solve this problem, an embodiment of this application provides the following storage block reclamation method, applied to a distributed storage system that includes a storage engine and a distributed storage device. The storage block reclamation apparatus obtains the reclamation granularity of a first logical unit created by the storage engine. The storage block reclamation apparatus then determines, according to the reclamation granularity, a first storage block that is in a reclaimable state within the to-be-reclaimed storage area of the first logical unit. Finally, the storage block reclamation apparatus marks the first storage block as a free block that can be used for data storage, and marks the second storage block corresponding to the first storage block in the distributed storage device as a free block that can be used for data storage.
In the technical solutions of the embodiments of this application, the devices may also communicate with one another over various communication systems, for example: a Global System for Mobile Communications (GSM) system, a Code Division Multiple Access (CDMA) system, a Wideband Code Division Multiple Access (WCDMA) system, a Long Term Evolution (LTE) system, an LTE Frequency Division Duplex (FDD) system, an LTE Time Division Duplex (TDD) system, a Universal Mobile Telecommunications System (UMTS), a 5G communication system, and future wireless communication systems.
In this application, the client in the distributed storage system may be user equipment, and the embodiments are described with reference to user equipment. User equipment (UE) may also refer to a terminal device, an access terminal, a subscriber unit, a subscriber station, a mobile station, a remote station, a remote terminal, a mobile device, a user terminal, a terminal, a wireless communication device, a user agent, or a user apparatus. An access terminal may be a cellular phone, a cordless phone, a Session Initiation Protocol (SIP) phone, a Wireless Local Loop (WLL) station, a personal digital assistant (PDA), a handheld device with a wireless communication function, a computing device or another processing device connected to a wireless modem, a vehicle-mounted device, a wearable device, a terminal device in a 5G network, a terminal device in a future evolved PLMN network, or the like.
The storage block reclamation method of the embodiments of this application is described below with reference to the figures. As shown in FIG. 4, an embodiment of the storage block reclamation method includes:
401. The storage block reclamation apparatus obtains the reclamation granularity of the first logical unit created by the storage engine.
In this embodiment, when a user stores data through the distributed storage system, the storage engine in the distributed storage system creates the first logical unit, which is used to store data. When creating the first logical unit, the storage engine determines the redundant storage mode of the first logical unit. The storage block reclamation apparatus can then determine the reclamation granularity of the first logical unit according to that redundant storage mode, specifically as follows:
In one possible implementation, when the redundant storage mode of the first logical unit is replica redundancy, the storage block reclamation apparatus takes the minimum write granularity of replica redundancy in the distributed storage device as the reclamation granularity of the first logical unit. That is, if the minimum write granularity in the distributed storage device is 1 MB, the reclamation granularity of the first logical unit is 1 MB.
In another possible implementation, when the redundant storage mode of the first logical unit is EC redundancy, the storage block reclamation apparatus determines the reclamation granularity of the first logical unit according to the stripe depth of the EC redundancy and the minimum write granularity of the distributed storage device. Specifically, when the stripe depth is greater than the minimum write granularity, the reclamation granularity of the first logical unit is the product of the stripe depth and the number of data shards of the EC redundancy; when the minimum write granularity is greater than the stripe depth, the reclamation granularity of the first logical unit is the product of the minimum write granularity and the number of data shards of the EC redundancy. For example, if the EC redundancy mode is EC 4+2, the minimum write granularity is 1 MB, and the stripe depth is 2 MB, the reclamation granularity of the first logical unit is 2 MB × 4 = 8 MB. If the EC redundancy mode is EC 4+2, the minimum write granularity is 1 MB, and the stripe depth is 0.5 MB, the reclamation granularity of the first logical unit is 1 MB × 4 = 4 MB.
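The two rules above can be summarized in a short sketch. The function name `reclaim_granularity` and the mode strings are hypothetical. The text does not say what happens when the stripe depth equals the minimum write granularity; using `max` here is an assumption that treats the two cases as giving the same result.

```python
MB = 1024 * 1024


def reclaim_granularity(mode: str, min_write: int,
                        stripe_depth: int = 0, data_shards: int = 0) -> int:
    """Return the reclamation granularity in bytes for a logical unit."""
    if mode == "replica":
        # Replica redundancy: the minimum write granularity is used directly.
        return min_write
    if mode == "ec":
        # EC redundancy: the larger of stripe depth and minimum write
        # granularity, multiplied by the number of data shards.
        # Assumption: the equal case is handled the same way via max().
        return max(stripe_depth, min_write) * data_shards
    raise ValueError(f"unknown redundancy mode: {mode}")


# The examples from the text, for EC 4+2 (4 data shards):
print(reclaim_granularity("ec", 1 * MB, stripe_depth=2 * MB, data_shards=4) // MB)   # 8
print(reclaim_granularity("ec", 1 * MB, stripe_depth=MB // 2, data_shards=4) // MB)  # 4
print(reclaim_granularity("replica", 1 * MB) // MB)                                  # 1
```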
402. The storage block reclamation apparatus determines, according to the reclamation granularity, a first storage block that is in a reclaimable state within the to-be-reclaimed storage area of the first logical unit.
In this embodiment, the to-be-reclaimed storage area of the first logical unit is the storage area that holds data already marked as garbage. During data storage in the distributed storage system, at a first moment the distributed storage system writes first data into the first logical unit created by the storage engine. If at a second moment the distributed storage system modifies the first data to obtain modified data, the modified data is written into another logical unit created by the storage engine. The first data can then be identified as garbage data, and the storage area holding the first data can be identified as a to-be-reclaimed storage area.
In an exemplary solution, the process of writing data in the distributed storage system may be as shown in FIG. 5a. Assume that the storage engine is a key-value (KV) storage engine. The first data written by the user through the client includes lba and length, where lba indicates the starting position in the volume that the user sees on the client (for example, drive D) and length indicates the length of the first data. When the user writes the first data, the key and value of the first data are stored in a hash table, where the key indicates the lba of the first data and the value indicates the log file into which the first data is written and its storage offset. The storage engine then looks for a logical unit that is not yet full and calls the append-write interface of the distributed layer to write the first data into that logical unit according to the mapping stored in the hash table. The redundant storage mode between the logical unit and the distributed storage device is also provided, and the first data in the logical unit is written to the distributed storage device according to that redundant storage mode. The logical unit created by the storage engine may also store metadata indicating the storage location of the first data in the distributed storage device.
When the first data is modified into second data, the operation may be as shown in FIG. 5b. The KV storage engine generates two requests: a write request, which writes the modified second data, and a delete request, which identifies the first data as garbage data that needs to be deleted. After the second data has been written successfully, the location where the first data was originally written is marked as garbage in the hash table and placed in the garbage hash table. It should be understood that the storage engine may be a storage engine other than a KV storage engine, as long as distributed storage can be implemented.
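The bookkeeping for a write followed by a modification can be sketched as below. This is a simplified model, not the engine's actual data layout: the class `KVIndex`, its fields, and the choice to let the caller pick the target logical unit are all assumptions made for illustration, and the redundancy write to the distributed storage device is omitted.

```python
class KVIndex:
    """Toy index: a valid-data hash table plus a garbage hash table."""

    def __init__(self):
        self.valid = {}        # lba -> (unit_id, offset, length)
        self.garbage = {}      # (unit_id, offset) -> length
        self.next_offset = {}  # unit_id -> next append position (hypothetical)

    def write(self, lba: int, length: int, unit_id: int) -> int:
        """Append-write `length` bytes for `lba` into logical unit `unit_id`."""
        offset = self.next_offset.get(unit_id, 0)
        self.next_offset[unit_id] = offset + length
        old = self.valid.get(lba)
        # The new location becomes the valid one only after the write.
        self.valid[lba] = (unit_id, offset, length)
        if old is not None:
            # The location of the overwritten data is recorded as garbage.
            old_unit, old_offset, old_length = old
            self.garbage[(old_unit, old_offset)] = old_length
        return offset


index = KVIndex()
index.write(lba=0, length=4, unit_id=1)  # first data goes to logical unit 1
index.write(lba=0, length=4, unit_id=2)  # modified data goes to logical unit 2
print(index.valid)    # {0: (2, 0, 4)}
print(index.garbage)  # {(1, 0): 4}
```

The garbage table is what a later reclamation pass would consult to find the to-be-reclaimed storage area of each logical unit.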
The storage block reclamation apparatus may determine the first storage block that is in a reclaimable state within the to-be-reclaimed storage area as follows:
The storage block reclamation apparatus divides the first logical unit into a plurality of intermediate storage blocks according to a first order and the reclamation granularity, where the first order is the order in which storage blocks are occupied, from first to last, as data is written to the first logical unit. The storage block reclamation apparatus then obtains the storage offset of each of the intermediate storage blocks. When the storage offsets indicate that some of the intermediate storage blocks are located within the to-be-reclaimed storage area, the apparatus determines that those intermediate storage blocks are first storage blocks in a reclaimable state. For example, the size of the first logical unit is 128 MB and the reclamation granularity is 4 MB. The storage block reclamation apparatus divides the first logical unit, according to the reclamation granularity, into storage blocks 0-4 MB, 4 MB-8 MB, 8 MB-12 MB, ..., 124 MB-128 MB. If the to-be-reclaimed storage area in the first logical unit is 4 MB-25 MB, the storage block reclamation apparatus can determine that the five intermediate storage blocks 4 MB-8 MB, 8 MB-12 MB, 12 MB-16 MB, 16 MB-20 MB, and 20 MB-24 MB are first storage blocks in a reclaimable state. The 24 MB-28 MB block is not included because it extends beyond the to-be-reclaimed storage area.
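The block selection in this example can be reproduced with a few lines of code. The sketch assumes that a block is reclaimable only when it lies entirely inside the to-be-reclaimed storage area and that the logical unit size is a multiple of the granularity; the function name `reclaimable_blocks` is hypothetical, and sizes are given in MB for readability.

```python
def reclaimable_blocks(unit_size: int, granularity: int,
                       region_start: int, region_end: int) -> list:
    """Return (start, end) offsets of blocks fully inside [region_start, region_end)."""
    blocks = []
    for start in range(0, unit_size, granularity):
        end = start + granularity
        if end > unit_size:
            break  # assumption: a trailing partial block is never reclaimed
        if start >= region_start and end <= region_end:
            blocks.append((start, end))
    return blocks


# The example from the text: 128 MB unit, 4 MB granularity, area 4 MB-25 MB.
print(reclaimable_blocks(128, 4, 4, 25))
# [(4, 8), (8, 12), (12, 16), (16, 20), (20, 24)]
```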
403. The storage block reclamation apparatus marks the first storage block as a free block that can be used for data storage, and marks the second storage block corresponding to the first storage block in the distributed storage device as a free block that can be used for data storage.
In this embodiment, after the first storage block is confirmed as a reclaimable storage block, the storage block reclamation apparatus issues a reclamation instruction to the distributed storage system to reclaim the first storage block and the corresponding second storage block in the distributed storage device, and finally marks the first storage block and the second storage block as free blocks that can be used for data storage.
In this embodiment, after determining that the first storage block qualifies for reclamation at the specified granularity, the distributed storage system directly marks the first storage block with a flag bit, thereby marking it as reclaimed space, and marks the second storage block as reclaimed space as well. As shown in FIG. 6a, when the distributed storage system reclaims garbage data, it can look up data that meets the requirements in the garbage hash table and generate a reclamation instruction. According to the reclamation instruction, the distributed storage system marks the first storage block in the first logical unit as reclaimed space. The second storage block is marked as reclaimed space at the same time; that is, the data block in local storage that corresponds to the first storage block is also marked as reclaimed space, as shown in FIG. 6b. After garbage collection is performed in this way, the first storage block no longer corresponds to the second storage block. Therefore, if an (abnormal) flow attempts to read the first storage block, all zeros are returned externally, which is equivalent to the first storage block having been deleted.
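The flag-based marking and the all-zeros read can be illustrated with a small model. Everything here is an assumption made for the sketch: the class `ReclaimState`, the one-to-one pairing of first and second storage blocks by index, and the use of in-memory lists as flag bits.

```python
class ReclaimState:
    """Toy model of a logical unit and the reclaimed flags of its blocks."""

    def __init__(self, unit_size: int, granularity: int):
        self.granularity = granularity
        self.data = bytearray(unit_size)
        blocks = unit_size // granularity
        self.unit_reclaimed = [False] * blocks    # flags for first storage blocks
        self.device_reclaimed = [False] * blocks  # flags for second storage blocks

    def reclaim(self, block: int) -> None:
        """Mark a block as reclaimed on both sides; no data is migrated."""
        self.unit_reclaimed[block] = True
        self.device_reclaimed[block] = True

    def read(self, block: int) -> bytes:
        """Reading a reclaimed block returns all zeros."""
        if self.unit_reclaimed[block]:
            return bytes(self.granularity)
        start = block * self.granularity
        return bytes(self.data[start:start + self.granularity])


state = ReclaimState(unit_size=16, granularity=4)
state.data[4:8] = b"abcd"
print(state.read(1))  # b'abcd'
state.reclaim(1)
print(state.read(1))  # b'\x00\x00\x00\x00'
```

The point of the sketch is that reclamation only flips flags, which is why it avoids the whole-unit data migration described earlier.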
It can be understood that, in the storage engine of the distributed storage system, if a logical unit created by the storage engine has been reclaimed at the specified granularity, the amount of valid data in that logical unit decreases. After the distributed storage system has been running for a long time, there will be a large number of logical units marked as "reclaimed", which increases the management burden and the number of logical units. The holes in the logical units created by the storage engine therefore need to be reused. Specifically: when the first logical unit created by the storage engine is full, or has been marked as no longer accepting append writes, and the storage engine needs to write second data, the storage engine can write into the space marked as reclaimed in the first logical unit through a first interface, where the first interface supports intermediate writing. After the second data is written, the space marked as reclaimed in the first logical unit (such as the first storage block) holds data again, that is, the hole in the first logical unit has been filled, and the reclaimed mark in the first logical unit is cleared. In an exemplary solution, as shown in FIG. 7, when the user writes new data, reclaimed space can be looked up in the reclaimed hash table and written to. After the write, the space is removed from the reclaimed hash table and added to the hash table of valid data. After the write to the logical unit is complete, the distributed storage device in the distributed storage system can write the new data into the data block corresponding to the reclaimed space (data block 3 in FIG. 7) and clear the reclaimed mark of that data block.
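A minimal sketch of this reuse path is shown below. The class `HoleReuse` and its tables are hypothetical, and the sketch assumes that a write consumes a whole reclaimed block even if the new data is shorter; the text does not describe how partially filled holes are tracked.

```python
class HoleReuse:
    """Toy model of writing new data into reclaimed space."""

    def __init__(self):
        self.reclaimed = {}  # (unit_id, offset) -> length of the reclaimed block
        self.valid = {}      # lba -> (unit_id, offset, length)

    def write_into_hole(self, lba: int, length: int):
        """Place new data in the first reclaimed block that is large enough.

        Returns (unit_id, offset), or None when no reclaimed block fits.
        """
        for (unit_id, offset), hole_length in list(self.reclaimed.items()):
            if hole_length >= length:
                # Remove the block from the reclaimed table and record the
                # new data in the valid-data table.
                del self.reclaimed[(unit_id, offset)]
                self.valid[lba] = (unit_id, offset, length)
                return unit_id, offset
        return None


store = HoleReuse()
store.reclaimed[(1, 8)] = 4  # block at offset 8 of logical unit 1 was reclaimed
print(store.write_into_hole(lba=100, length=4))  # (1, 8)
print(store.reclaimed)                           # {}
print(store.valid)                               # {100: (1, 8, 4)}
```

When `write_into_hole` returns `None`, a caller would presumably fall back to an append write into a logical unit that still has room.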
The storage block reclamation method in the embodiments of this application has been described above. It can be understood that, to implement the above functions, the storage block reclamation apparatus includes hardware structures and/or software modules corresponding to each function. A person skilled in the art should readily appreciate that, in combination with the modules and algorithm steps of the examples described in the embodiments disclosed here, this application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this application.
本申请实施例可以根据上述方法示例对存储块回收装置进行功能模块的划分,例如,可以对应各个功能划分各个功能模块,也可以将两个或两个以上的功能集成在一个处理模块中。上述集成的模块既可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。需要说明的是,本申请实施例中对模块的划分是示意性的,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式。The embodiment of the present application may divide the storage block recovery device into functional modules according to the above method example, for example, each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module. The above-mentioned integrated modules can be implemented in the form of hardware or in the form of software function modules. It should be noted that the division of modules in the embodiment of the present application is schematic, and is only a logical function division, and there may be other division methods in actual implementation.
The storage block reclamation apparatus of the present application is described in detail below. FIG. 8 is a schematic diagram of the hardware structure of the storage block reclamation apparatus in an embodiment of the present application. The storage block reclamation apparatus may be one possible implementation of the distributed storage system in the embodiments of the present application. As shown in FIG. 8, the storage block reclamation apparatus includes at least a processor 804, a memory 803, and a transceiver 802; the memory 803 is further configured to store instructions 8031 and data 8032. Optionally, the storage block reclamation apparatus may further include an antenna 806, an input/output (I/O) interface 810, and a bus 812. The transceiver 802 further includes a transmitter 8021 and a receiver 8022. In addition, the processor 804, the transceiver 802, the memory 803, and the I/O interface 810 are communicatively connected to one another through the bus 812, and the antenna 806 is connected to the transceiver 802.
The processor 804 may be a general-purpose processor, such as, but not limited to, a central processing unit (CPU), or a special-purpose processor, such as, but not limited to, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or a field programmable gate array (FPGA). The processor 804 may also be a neural processing unit (NPU). In addition, the processor 804 may be a combination of multiple processors. In particular, in the technical solutions provided by the embodiments of the present application, the processor 804 may be configured to perform the relevant steps of the storage block reclamation method in the method embodiments. The processor 804 may be a processor specially designed to perform those steps and/or operations, or a processor that performs them by reading and executing the instructions 8031 stored in the memory 803. The processor 804 may need to use the data 8032 while performing those steps and/or operations.
The transceiver 802 includes a transmitter 8021 and a receiver 8022. In an optional implementation, the transmitter 8021 is configured to send signals through the antenna 806, and the receiver 8022 is configured to receive signals through at least one of the antennas 806. In particular, in the technical solutions provided by the embodiments of the present application, the transmitter 8021 may specifically be configured to perform, through at least one of the antennas 806, for example, the operations performed by the receiving module or the transceiver module of the storage block reclamation apparatus when the storage block reclamation method of the method embodiments is applied to that apparatus.
In the embodiments of the present application, the transceiver 802 is configured to support the storage block reclamation apparatus in performing the foregoing receiving and sending functions. A processor having processing functions is regarded as the processor 804. The receiver 8022 may also be referred to as an input port, a receiving circuit, or the like, and the transmitter 8021 may also be referred to as a transmitting circuit or the like.
The processor 804 may be configured to execute the instructions stored in the memory 803 to control the transceiver 802 to receive and/or send messages, thereby completing the functions of the storage block reclamation apparatus in the method embodiments of the present application. In one implementation, the functions of the transceiver 802 may be realized by a transceiver circuit or a dedicated transceiver chip. In the embodiments of the present application, the transceiver 802 receiving a message may be understood as the transceiver 802 inputting a message, and the transceiver 802 sending a message may be understood as the transceiver 802 outputting a message.
The memory 803 may be any of various types of storage media, such as random access memory (RAM), read-only memory (ROM), non-volatile RAM (NVRAM), programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), flash memory, optical memory, and registers. The memory 803 is specifically configured to store the instructions 8031 and the data 8032. The processor 804 may perform the steps and/or operations of the method embodiments of the present application by reading and executing the instructions 8031 stored in the memory 803, and may need to use the data 8032 while doing so.
Optionally, the storage block reclamation apparatus may further include the I/O interface 810, which is configured to receive instructions and/or data from peripheral devices and to output instructions and/or data to peripheral devices.
Referring to FIG. 9, FIG. 9 is a schematic diagram of one embodiment of the storage block reclamation apparatus in the embodiments of the present application. The storage block reclamation apparatus 900 includes: an obtaining module 901, configured to obtain the reclamation granularity of a first logical unit created by the storage engine;
a determining module 902, configured to determine, according to the reclamation granularity, a first storage block that is in a reclaimable state within the to-be-reclaimed storage area of the first logical unit; and
a processing module 903, configured to mark the first storage block as a free block available for data storage, and to mark the second storage block that corresponds to the first storage block in the distributed storage device as a free block available for data storage.
Optionally, the obtaining module 901 is specifically configured to obtain the redundant storage mode of the first logical unit, and to determine the reclamation granularity of the first logical unit according to the redundant storage mode.
Optionally, the redundant storage mode is replica redundancy, and the obtaining module 901 is specifically configured to determine the minimum write granularity of the replica redundancy in the distributed storage device as the reclamation granularity of the first logical unit.
Optionally, the redundant storage mode is erasure coding (EC) algorithm redundancy, and the obtaining module 901 is specifically configured to determine the reclamation granularity of the first logical unit according to the stripe depth of the EC algorithm redundancy and the minimum write granularity of the distributed storage device.
Optionally, the obtaining module 901 is specifically configured to: when the stripe depth is greater than the minimum write granularity, determine the product of the stripe depth and the number of data fragments of the EC algorithm redundancy as the reclamation granularity of the first logical unit;
and when the minimum write granularity is greater than the stripe depth, determine the product of the minimum write granularity and the number of data fragments of the EC algorithm redundancy as the reclamation granularity of the first logical unit.
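As a non-limiting illustration, the granularity rules above can be sketched in Python as follows. The function name `reclaim_granularity`, its parameters, and the byte values in the usage lines are assumptions made for this sketch. The sketch also reads "number of data fragments" as the count of data fragments only (excluding parity fragments), which is an interpretation rather than something stated explicitly here. Taking the larger of the two sizes covers both stated cases, and when the two sizes are equal either product gives the same value.

```python
def reclaim_granularity(mode, min_write_granularity,
                        stripe_depth=None, data_fragments=None):
    """Return the reclamation granularity of a logical unit, in bytes.

    mode: "replica" for replica redundancy, "ec" for EC algorithm redundancy.
    All names and units here are illustrative assumptions.
    """
    if mode == "replica":
        # Replica redundancy: the minimum write granularity is used directly.
        return min_write_granularity
    if mode == "ec":
        if stripe_depth is None or data_fragments is None:
            raise ValueError("EC mode needs stripe_depth and data_fragments")
        # EC redundancy: the larger of stripe depth and minimum write
        # granularity, multiplied by the number of data fragments.
        return max(stripe_depth, min_write_granularity) * data_fragments
    raise ValueError(f"unknown redundant storage mode: {mode!r}")


replica_g = reclaim_granularity("replica", 4096)
ec_deep_stripe = reclaim_granularity("ec", 4096, stripe_depth=8192, data_fragments=4)
ec_coarse_write = reclaim_granularity("ec", 8192, stripe_depth=4096, data_fragments=4)
```

With these assumed values, a stripe depth of 8192 bytes and four data fragments yield a granularity of 32768 bytes, so a reclaimed block always covers whole stripes across all data fragments.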
Optionally, the determining module 902 is specifically configured to: divide the first logical unit into a plurality of intermediate storage blocks according to a first order and the reclamation granularity, where the first order is the order in which storage blocks are occupied, from first to last, as data is written to the first logical unit; obtain the storage offset corresponding to each of the plurality of intermediate storage blocks; and, when the storage offsets indicate that some of the intermediate storage blocks are located within the to-be-reclaimed storage area, determine the intermediate storage blocks located within the to-be-reclaimed storage area as the first storage blocks in the reclaimable state.
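As a non-limiting illustration, the division into intermediate storage blocks and the selection of reclaimable blocks can be sketched in Python as follows. The function name `reclaimable_blocks` and its parameters are assumptions made for this sketch. It further assumes that the first order corresponds to ascending offsets, that the to-be-reclaimed storage area is a single half-open byte range, and that "located within" means lying wholly inside that range.

```python
def reclaimable_blocks(unit_size, granularity, region_start, region_end):
    """Return the offsets of granularity-sized blocks wholly inside the region.

    The logical unit is cut into blocks of `granularity` bytes in ascending
    offset order; a block qualifies only if [offset, offset + granularity)
    lies entirely within [region_start, region_end).
    """
    if granularity <= 0:
        raise ValueError("granularity must be positive")
    offsets = []
    # Only full blocks are considered; a shorter tail, if any, is skipped.
    for offset in range(0, unit_size - granularity + 1, granularity):
        block_end = offset + granularity
        if offset >= region_start and block_end <= region_end:
            offsets.append(offset)
    return offsets


# A 64 KiB unit, 16 KiB granularity, to-be-reclaimed area [10000, 50000).
blocks = reclaimable_blocks(65536, 16384, 10000, 50000)
```

In this usage, the block at offset 0 starts before the area and the block at offset 49152 ends after it, so only the two middle blocks qualify and the partially covered edges are left untouched.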
Optionally, the storage block reclamation apparatus further includes a writing module 904. When appended writes can no longer be performed on the first logical unit, the writing module 904 is configured to write data into the first storage block through a first interface, where the first interface supports intermediate writing.
The storage block reclamation apparatus in the foregoing embodiments may be a chip applied in a storage block reclamation apparatus, or another combined device or component that can implement the functions of the storage block reclamation apparatus. The transceiver module of the storage block reclamation apparatus may be a transceiver, and the processing module may be a processor, such as a chip. When the storage block reclamation apparatus is a chip system, the receiving part of the transceiver module may be an input port of the chip system, the sending part of the transceiver module may be an output interface of the chip system, and the processing module may be a processor of the chip system, for example, a central processing unit (CPU).
In the embodiments of the present application, the memory included in the storage block reclamation apparatus is mainly used to store software programs and data, for example, the programs described in the foregoing embodiments. The storage block reclamation apparatus also has the following functions:
The processor is configured to: obtain the reclamation granularity of a first logical unit created by the storage engine; determine, according to the reclamation granularity, a first storage block that is in a reclaimable state within the to-be-reclaimed storage area of the first logical unit; and mark the first storage block as a free block available for data storage, and mark the second storage block that corresponds to the first storage block in the distributed storage device as a free block available for data storage.
Optionally, the processor is specifically configured to obtain the redundant storage mode of the first logical unit, and to determine the reclamation granularity of the first logical unit according to the redundant storage mode.
Optionally, the redundant storage mode is replica redundancy, and the processor is specifically configured to determine the minimum write granularity of the replica redundancy in the distributed storage device as the reclamation granularity of the first logical unit.
Optionally, the redundant storage mode is erasure coding (EC) algorithm redundancy, and the processor is specifically configured to determine the reclamation granularity of the first logical unit according to the stripe depth of the EC algorithm redundancy and the minimum write granularity of the distributed storage device.
Optionally, the processor is specifically configured to: when the stripe depth is greater than the minimum write granularity, determine the product of the stripe depth and the number of data fragments of the EC algorithm redundancy as the reclamation granularity of the first logical unit;
and when the minimum write granularity is greater than the stripe depth, determine the product of the minimum write granularity and the number of data fragments of the EC algorithm redundancy as the reclamation granularity of the first logical unit.
Optionally, the processor is specifically configured to: divide the first logical unit into a plurality of intermediate storage blocks according to a first order and the reclamation granularity, where the first order is the order in which storage blocks are occupied, from first to last, as data is written to the first logical unit; obtain the storage offset corresponding to each of the plurality of intermediate storage blocks; and, when the storage offsets indicate that some of the intermediate storage blocks are located within the to-be-reclaimed storage area, determine the intermediate storage blocks located within the to-be-reclaimed storage area as the first storage blocks in the reclaimable state.
Optionally, the processor is configured to write data into the first storage block through a first interface when appended writes can no longer be performed on the first logical unit, where the first interface supports intermediate writing.
An embodiment of the present application further provides a processing apparatus. The processing apparatus includes a processor and an interface, and the processor is configured to perform the storage block reclamation method of any one of the foregoing method embodiments.
It should be understood that the processing apparatus may be a chip, and the processor may be implemented by hardware or by software. When implemented by hardware, the processor may be a logic circuit, an integrated circuit, or the like. When implemented by software, the processor may be a general-purpose processor that operates by reading software code stored in a memory; the memory may be integrated in the processor, or may be located outside the processor and exist independently.
Here, "implemented by hardware" means that the functions of the foregoing modules or units are realized by a hardware processing circuit that has no capability of processing program instructions. The hardware processing circuit may be composed of discrete hardware components or may be an integrated circuit. To reduce power consumption and size, it is usually implemented as an integrated circuit. The hardware processing circuit may include an application-specific integrated circuit (ASIC) or a programmable logic device (PLD), where the PLD may in turn include a field programmable gate array (FPGA), a complex programmable logic device (CPLD), and so on. Such a hardware processing circuit may be a separately packaged semiconductor chip (for example, packaged as an ASIC), or may be integrated with other circuits (such as a CPU or DSP) and packaged as one semiconductor chip. For example, multiple hardware circuits and a CPU may be formed on one silicon substrate and packaged as a single chip, which is also called a system on chip (SoC); alternatively, circuits implementing FPGA functions and a CPU may be formed on a silicon substrate and packaged as a single chip, which is also called a system on a programmable chip (SoPC).
An embodiment of the present application further provides a computer-readable storage medium including instructions that, when run on a computer, cause the computer to control the storage block reclamation apparatus to perform any one of the implementations shown in the foregoing method embodiments.
An embodiment of the present application further provides a computer program product. The computer program product includes computer program code that, when run on a computer, causes the computer to perform any one of the implementations shown in the foregoing method embodiments.
An embodiment of the present application further provides a chip system including a memory and a processor, where the memory is configured to store a computer program, and the processor is configured to call the computer program from the memory and run it, so that the chip performs any one of the implementations shown in the foregoing method embodiments.
An embodiment of the present application further provides a chip system including a processor, where the processor is configured to call and run a computer program, so that the chip performs any one of the implementations shown in the foregoing method embodiments.
It should also be noted that the apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of an embodiment. In addition, in the drawings of the apparatus embodiments provided by the present application, a connection between modules indicates that they have a communication connection, which may specifically be implemented as one or more communication buses or signal lines.
From the description of the foregoing implementations, a person skilled in the art can clearly understand that the present application may be implemented by software plus the necessary general-purpose hardware, and may of course also be implemented by dedicated hardware, including application-specific integrated circuits, dedicated CPUs, dedicated memories, dedicated components, and the like. In general, any function performed by a computer program can readily be implemented by corresponding hardware, and the specific hardware structure used to implement the same function may take many forms, such as analog circuits, digital circuits, or dedicated circuits. For the present application, however, a software implementation is the preferred implementation in most cases. Based on this understanding, the technical solution of the present application, in essence, or the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a readable storage medium, such as a computer floppy disk, USB flash drive, removable hard disk, ROM, RAM, magnetic disk, or optical disc, and includes several instructions for causing a computer device to perform the methods described in the embodiments of the present application.
The foregoing embodiments may be implemented wholly or partly by software, hardware, firmware, or any combination thereof. When implemented by software, they may be implemented wholly or partly in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, first network device or second network device, computing device, or data center to another website, computer, first network device or second network device, computing device, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any usable medium that a computer can store, or a data storage device, such as a first network device or second network device or a data center, that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a DVD), a semiconductor medium (for example, a solid state disk (SSD)), or the like.
It should be understood that references throughout the specification to "one embodiment" or "an embodiment" mean that a particular feature, structure, or characteristic related to the embodiment is included in at least one embodiment of the present application. Therefore, occurrences of "in one embodiment" or "in an embodiment" throughout the specification do not necessarily refer to the same embodiment. Furthermore, these particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. It should be understood that, in the various embodiments of the present application, the sequence numbers of the foregoing processes do not imply an order of execution; the order of execution of the processes should be determined by their functions and internal logic and should not constitute any limitation on the implementation of the embodiments of the present application.
In addition, the terms "system" and "network" are often used interchangeably herein. It should be understood that, in the embodiments of the present application, "B corresponding to A" means that B is associated with A and that B can be determined according to A. It should also be understood, however, that determining B according to A does not mean determining B according to A alone; B may also be determined according to A and/or other information.
A person of ordinary skill in the art can appreciate that the units and algorithm steps of the examples described in combination with the embodiments disclosed herein can be implemented by electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed by hardware or by software depends on the particular application and the design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of the present application.
A person skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is merely a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Moreover, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or of other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of an embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a storage block reclamation apparatus, or the like) to perform all or some of the steps of the methods of the embodiments of the present application.
In summary, the foregoing descriptions are merely preferred embodiments of the technical solutions of the present application and are not intended to limit the scope of protection of the present application. Any modification, equivalent replacement, improvement, or the like made within the principles of the present application shall fall within the scope of protection of the present application.

Claims (18)

  1. A method for reclaiming storage blocks, applied to a distributed storage system comprising a storage engine and a distributed storage device, characterized in that the method comprises:
    obtaining a reclamation granularity of a first logical unit created by the storage engine;
    determining, according to the reclamation granularity, a first storage block that is in a reclaimable state within a to-be-reclaimed storage area of the first logical unit; and
    marking the first storage block as a free block available for data storage, and marking a second storage block, which corresponds to the first storage block in the distributed storage device, as a free block available for data storage.
  2. The method according to claim 1, characterized in that the obtaining a reclamation granularity of a first logical unit created by the storage engine comprises:
    obtaining a redundant storage mode of the first logical unit; and
    determining the reclamation granularity of the first logical unit according to the redundant storage mode.
  3. The method according to claim 2, characterized in that the redundant storage mode is replica redundancy, and the determining the reclamation granularity of the first logical unit according to the redundant storage mode comprises:
    determining the minimum write granularity of the replica redundancy in the distributed storage device as the reclamation granularity of the first logical unit.
  4. The method according to claim 2, characterized in that the redundant storage mode is error correction code (EC) algorithm redundancy, and the determining the reclamation granularity of the first logical unit according to the redundant storage mode comprises:
    determining the reclamation granularity of the first logical unit according to a stripe depth of the EC algorithm redundancy and a minimum write granularity of the distributed storage device.
  5. The method according to claim 4, characterized in that the determining the reclamation granularity of the first logical unit according to a stripe depth of the EC algorithm redundancy and a minimum write granularity of the distributed storage device comprises:
    when the stripe depth is greater than the minimum write granularity, determining the product of the stripe depth and the number of data fragments of the EC algorithm redundancy as the reclamation granularity of the first logical unit; and
    when the minimum write granularity is greater than the stripe depth, determining the product of the minimum write granularity and the number of data fragments of the EC algorithm redundancy as the reclamation granularity of the first logical unit.
  6. The method according to any one of claims 1 to 5, characterized in that the determining, according to the reclamation granularity, a first storage block that is in a reclaimable state within a to-be-reclaimed storage area of the first logical unit comprises:
    dividing the first logical unit into a plurality of intermediate storage blocks according to a first order and the reclamation granularity, the first order being the order in which storage blocks are occupied, from first to last, in the process of writing data to the first logical unit;
    obtaining a storage offset corresponding to each of the plurality of intermediate storage blocks; and
    when the storage offsets indicate that the plurality of intermediate storage blocks include an intermediate storage block located within the to-be-reclaimed storage area, determining the intermediate storage block located within the to-be-reclaimed storage area as the first storage block in the reclaimable state.
  7. The method according to any one of claims 1 to 6, characterized in that the method further comprises:
    when an append write cannot be performed on the first logical unit, writing data to the first storage block through a first interface, the first interface supporting intermediate writes.
  8. A storage block reclamation apparatus, characterized in that it comprises:
    an obtaining module, configured to obtain a reclamation granularity of a first logical unit created by the storage engine;
    a determining module, configured to determine, according to the reclamation granularity, a first storage block that is in a reclaimable state within a to-be-reclaimed storage area of the first logical unit; and
    a processing module, configured to mark the first storage block as a free block available for data storage, and to mark a second storage block, which corresponds to the first storage block in the distributed storage device, as a free block available for data storage.
  9. The storage block reclamation apparatus according to claim 8, characterized in that the obtaining module is specifically configured to obtain a redundant storage mode of the first logical unit, and to determine the reclamation granularity of the first logical unit according to the redundant storage mode.
  10. The storage block reclamation apparatus according to claim 9, characterized in that the redundant storage mode is replica redundancy, and the obtaining module is specifically configured to determine the minimum write granularity of the replica redundancy in the distributed storage device as the reclamation granularity of the first logical unit.
  11. The storage block reclamation apparatus according to claim 9, characterized in that the redundant storage mode is error correction code (EC) algorithm redundancy, and the obtaining module is specifically configured to determine the reclamation granularity of the first logical unit according to a stripe depth of the EC algorithm redundancy and a minimum write granularity of the distributed storage device.
  12. The storage block reclamation apparatus according to claim 11, characterized in that the obtaining module is specifically configured to: when the stripe depth is greater than the minimum write granularity, determine the product of the stripe depth and the number of data fragments of the EC algorithm redundancy as the reclamation granularity of the first logical unit; and
    when the minimum write granularity is greater than the stripe depth, determine the product of the minimum write granularity and the number of data fragments of the EC algorithm redundancy as the reclamation granularity of the first logical unit.
  13. The storage block reclamation apparatus according to any one of claims 8 to 12, characterized in that the determining module is specifically configured to: divide the first logical unit into a plurality of intermediate storage blocks according to a first order and the reclamation granularity, the first order being the order in which storage blocks are occupied, from first to last, in the process of writing data to the first logical unit; obtain a storage offset corresponding to each of the plurality of intermediate storage blocks; and, when the storage offsets indicate that the plurality of intermediate storage blocks include an intermediate storage block located within the to-be-reclaimed storage area, determine the intermediate storage block located within the to-be-reclaimed storage area as the first storage block in the reclaimable state.
  14. The storage block reclamation apparatus according to any one of claims 8 to 13, characterized in that the storage block reclamation apparatus further comprises a writing module, and when an append write cannot be performed on the first logical unit, the writing module is configured to write data to the first storage block through a first interface, the first interface supporting intermediate writes.
  15. A storage block reclamation apparatus, characterized in that it comprises a processor and a memory, the processor being coupled to the memory, wherein
    the memory is configured to store a program; and
    the processor is configured to execute the program in the memory, so that the storage block reclamation apparatus performs the method according to any one of claims 1 to 7.
  16. A computer program, characterized in that, when the computer program is run on a computer, the computer is caused to perform the method according to any one of claims 1 to 7.
  17. A computer-readable storage medium, characterized in that it comprises a program, and when the program is run on a computer, the computer is caused to perform the method according to any one of claims 1 to 7.
  18. A chip system, characterized in that the chip system comprises one or more processors and a memory, the memory stores program instructions, and when the program instructions are executed by the one or more processors, the method according to any one of claims 1 to 7 is performed.
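
As an illustration only, the granularity rules of claims 3 to 5 and the block selection of claim 6 can be sketched in Python. The function names, the use of byte units, the "replica" and "ec" mode strings, the handling of the case where the stripe depth equals the minimum write granularity, and the reading of "located within the to-be-reclaimed storage area" as full containment are all assumptions made for this sketch; none of them is taken from the application.

```python
from typing import List, Tuple

KIB = 1024


def reclamation_granularity(mode: str, min_write: int,
                            stripe_depth: int = 0,
                            data_fragments: int = 0) -> int:
    """Return a reclamation granularity in bytes (hypothetical helper).

    mode           -- "replica" or "ec" (labels invented for this sketch)
    min_write      -- minimum write granularity of the storage device
    stripe_depth   -- EC stripe depth, used only when mode == "ec"
    data_fragments -- number of EC data fragments, used only when mode == "ec"
    """
    if min_write <= 0:
        raise ValueError("min_write must be positive")
    if mode == "replica":
        # Claim 3: the minimum write granularity is the reclamation granularity.
        return min_write
    if mode == "ec":
        if stripe_depth <= 0 or data_fragments <= 0:
            raise ValueError("EC mode needs stripe_depth and data_fragments")
        # Claim 5: the larger of stripe depth and minimum write granularity,
        # multiplied by the number of data fragments. The claims do not state
        # the equal case; max() gives the same product either way (assumption).
        return max(stripe_depth, min_write) * data_fragments
    raise ValueError(f"unknown redundant storage mode: {mode!r}")


def reclaimable_blocks(unit_size: int, granularity: int,
                       region_start: int, region_end: int
                       ) -> List[Tuple[int, int]]:
    """Split a logical unit into granularity-sized blocks in write order and
    return the (start, end) offsets of blocks inside [region_start, region_end).

    Assumption: a block counts as "located within" the region only when it is
    fully contained in it.
    """
    if granularity <= 0:
        raise ValueError("granularity must be positive")
    blocks = []
    # Offsets grow in the order in which an append-style writer occupies blocks.
    for start in range(0, unit_size - granularity + 1, granularity):
        end = start + granularity
        if start >= region_start and end <= region_end:
            blocks.append((start, end))
    return blocks


if __name__ == "__main__":
    # Example values chosen for the sketch: 4 data fragments, 8 KiB stripe
    # depth, 4 KiB minimum write granularity, 256 KiB logical unit.
    g = reclamation_granularity("ec", 4 * KIB, 8 * KIB, 4)
    print(g, reclaimable_blocks(256 * KIB, g, 40 * KIB, 140 * KIB))
```

With the example values above, the granularity is 8 KiB x 4 = 32 KiB, and only the blocks starting at 64 KiB and 96 KiB lie wholly inside the region from 40 KiB to 140 KiB. A real implementation would then mark those blocks, and the corresponding blocks in the distributed storage device, as free, as recited in claim 1.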
PCT/CN2022/096184 2021-09-03 2022-05-31 Storage block collection method and related apparatus WO2023029624A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111033548.2A CN115757192A (en) 2021-09-03 2021-09-03 Recovery method of storage block and related device
CN202111033548.2 2021-09-03

Publications (1)

Publication Number Publication Date
WO2023029624A1 true WO2023029624A1 (en) 2023-03-09

Family

ID=85332599

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/096184 WO2023029624A1 (en) 2021-09-03 2022-05-31 Storage block collection method and related apparatus

Country Status (2)

Country Link
CN (1) CN115757192A (en)
WO (1) WO2023029624A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170091107A1 (en) * 2015-09-25 2017-03-30 Dell Products, L.P. Cache load balancing by reclaimable block migration
CN108776614A (en) * 2018-05-03 2018-11-09 华为技术有限公司 The recovery method and device of memory block
CN110399310A (en) * 2018-04-18 2019-11-01 杭州宏杉科技股份有限公司 A kind of recovery method and device of memory space
CN111007985A (en) * 2019-10-31 2020-04-14 苏州浪潮智能科技有限公司 Compatible processing method, system and equipment for space recovery of storage system
CN113176858A (en) * 2021-05-07 2021-07-27 锐捷网络股份有限公司 Data processing method, storage system and storage device

Also Published As

Publication number Publication date
CN115757192A (en) 2023-03-07

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22862766

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE