CN111813347B - Garbage recycling space management method and device and computer readable storage medium - Google Patents

Publication number
CN111813347B
Authority
CN
China
Prior art keywords
block
selection module
recovery
data
inefficiency
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010724347.6A
Other languages
Chinese (zh)
Other versions
CN111813347A (en
Inventor
刘晓瑞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Jinan data Technology Co ltd
Original Assignee
Inspur Jinan data Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Jinan data Technology Co ltd filed Critical Inspur Jinan data Technology Co ltd
Priority to CN202010724347.6A priority Critical patent/CN111813347B/en
Publication of CN111813347A publication Critical patent/CN111813347A/en
Application granted granted Critical
Publication of CN111813347B publication Critical patent/CN111813347B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/064 Management of blocks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/214 Database migration support
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0647 Migration mechanisms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0652 Erasing, e.g. deleting, data cleaning, moving of data to a wastebasket

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • Quality & Reliability (AREA)
  • Memory System (AREA)

Abstract

The application discloses a garbage recycling space management method and device and a computer readable storage medium. The method comprises the steps of constructing a plurality of block recovery selection modules in advance, wherein each block recovery selection module is divided into a plurality of inefficiency intervals with different inefficiency ranges. When a target data block with fully written data is detected to exist, the target data block is transferred to a corresponding inefficiency interval in a target block recovery selection module according to the inefficiency of the target data block, and the target data block is adjusted to the corresponding inefficiency interval at intervals of a first preset time based on the current inefficiency; when the block recovery selection module with the highest inefficient interval meeting the garbage recovery condition exists, the data blocks are sequentially selected from the block recovery selection modules from old to new according to the using time sequence of the data blocks to perform garbage recovery, so that the data blocks with more invalid data can be selected to reduce the amount of migration data in the process of performing garbage recovery on the data blocks, and meanwhile, the invalid migration can be reduced to reduce the system overhead.

Description

Garbage recycling space management method and device and computer readable storage medium
Technical Field
The present application relates to the field of full flash memory array technologies, and in particular, to a garbage collection space management method and apparatus, and a computer-readable storage medium.
Background
In a full flash memory array, in order to reduce the write amplification of the solid-state disks (SSDs) in the array and prolong their service life, storage array software usually processes host writes in a log-structured, redirect-on-write mode, converting the random small IO generated by host writes into large sequential IO written to the SSDs. As a result, a host overwrite generates garbage data, and garbage collection is required to reclaim, in a timely manner, the storage pool space containing invalid data. The space occupied by garbage data cannot be reused to receive host data until it is reclaimed by garbage collection.
However, the granularity of storage pool space management, i.e., the data block, is much larger than the sector, which is the smallest unit of host data. A block may therefore contain only partially invalid data, and reclaiming such a block requires first moving the valid data it contains to another location. It is therefore preferable to select blocks containing a large amount of invalid data for reclamation, so as to reduce the amount of data to be migrated; but because the storage pool capacity is large and the number of blocks is large, such blocks are not easy to find. In addition, if the migrated data was only just written by the host, it is likely to become garbage data soon, so the migration is an invalid relocation: useless work that wastes storage array resources.
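The trade-off described above can be made concrete with a little arithmetic. The following is a minimal sketch, not part of the application: the function name `migration_cost` and the byte-based accounting are assumptions made for illustration. It relies only on the observation that the valid data of a block must be rewritten elsewhere before the block is reclaimed, so the net space gained equals the invalid data in the block.

```python
from typing import Tuple


def migration_cost(block_bytes: int, invalid_bytes: int) -> Tuple[int, float]:
    """Return (bytes to migrate, bytes migrated per byte of space gained).

    Hypothetical helper: the valid part of the block has to be moved before
    the block can be reclaimed, and it occupies space again at its new
    location, so the net gain is only the invalid part.
    """
    if not 0 <= invalid_bytes <= block_bytes:
        raise ValueError("invalid_bytes must be between 0 and block_bytes")
    valid_bytes = block_bytes - invalid_bytes
    if invalid_bytes == 0:
        # Reclaiming a fully valid block moves data without gaining any space.
        return valid_bytes, float("inf")
    return valid_bytes, valid_bytes / invalid_bytes


# A block that is 80% invalid moves 20 units to gain 80;
# a block that is 20% invalid moves 80 units to gain only 20.
mostly_invalid = migration_cost(100, 80)
mostly_valid = migration_cost(100, 20)
```

Under these assumptions the mostly invalid block costs 0.25 migrated bytes per byte gained, while the mostly valid block costs 4 bytes per byte gained, which is why blocks with more invalid data are the preferred candidates.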
In view of this, how to select a data block with more invalid data in the garbage collection process to reduce the amount of migration data and reduce invalid migration to reduce system overhead is a technical problem to be solved by those skilled in the art.
Disclosure of Invention
The application provides a garbage collection space management method and device and a computer readable storage medium, which can select data blocks with more invalid data to reduce the amount of migration data and can reduce invalid migration to reduce system overhead in the process of garbage collection of data blocks.
In order to solve the above technical problems, embodiments of the present invention provide the following technical solutions:
an embodiment of the present invention provides a method for managing a garbage collection space, including:
pre-constructing a block recovery packet, wherein each block recovery selection module in the block recovery packet is divided into a plurality of inefficiency intervals with different inefficiency ranges;
when a target data block with fully written data is detected to exist, the target data block is transferred to a corresponding inefficiency interval in a target block recovery selection module according to the inefficiency of the target data block, and the target data block is adjusted to the corresponding inefficiency interval at intervals of a first preset time based on the current inefficiency;
when there is a block recovery selection module whose highest inefficiency interval meets the garbage recovery condition, sequentially selecting data blocks from the block recovery selection modules, from old to new according to the time order of use by data blocks, to perform garbage recovery;
each block recovery selection module is provided with a longest life cycle, which represents the time from when data first exists in the module until the module is emptied, and the data blocks of a block recovery selection module that reaches its longest life cycle are moved into the corresponding inefficiency intervals of the next adjacent block recovery selection module; the inefficiency represents the proportion of invalid data; the target block recovery selection module is the most recently used block recovery selection module, selected from the block recovery packet according to the time order of use by data blocks.
Optionally, when there is a block recycling selection module in which the highest inefficiency interval satisfies the garbage recycling condition, sequentially selecting data blocks from the block recycling selection modules according to the used time sequence of the data blocks from old to new to perform garbage recycling includes:
if there is only one block recovery selection module whose highest inefficiency interval meets the garbage recovery condition, selecting data blocks from that block recovery selection module for garbage recovery;
and if there are a plurality of block recovery selection modules whose highest inefficiency intervals meet the garbage recovery condition, selecting data blocks from these block recovery selection modules in sequence, from old to new according to the time order of use by data blocks, to perform garbage recovery.
Optionally, when it is detected that there is a target data block with fully written data, migrating the target data block to a corresponding inefficiency interval in a target block recycling selection module according to the inefficiency of the target data block includes:
when detecting that a target data block with fully written data exists, calculating the inefficiency of the target data block;
selecting a block recycling selection module which is used most recently from the block recycling packets according to the using time sequence of the data blocks as a target block recycling selection module;
judging whether the target block recovery selection module already contains a data block;
and if the target block recovery selection module does not contain the data block, transferring the target data block to a corresponding inefficiency interval in the target block recovery selection module according to the inefficiency, taking the current time as the non-empty starting time of the target block recovery selection module, and calculating to obtain the time for the target block recovery selection module to be emptied.
Optionally, the block recovery packet contains at least one block recovery selection module whose longest life cycle is infinite, and the module with the infinite longest life cycle is the block recovery selection module whose time of use by data blocks is the latest.
Optionally, the moving the data block in the block recycling selection module reaching the longest life cycle into the corresponding inefficiency interval of the next adjacent block recycling selection module includes:
at intervals of a second preset time, judging in turn, in the order of time of use by data blocks from the second farthest to the nearest, whether each corresponding block recovery selection module has reached its clearing time;
if so, moving the data block in each inefficiency interval of the first block recovery selection module reaching the clearing time into the corresponding inefficiency interval of the next block recovery selection module; the time of the first block recovery selection module used by the data block is a first time; and the next block recovery selection module is a block recovery selection module of which the use time of the data block in the block recovery packet is adjacent to the first time and is later than the first time.
Another aspect of the embodiments of the present invention provides a garbage collection space management apparatus, including:
a preprocessing module, configured to construct a block recovery packet in advance, wherein each block recovery selection module in the block recovery packet is divided into a plurality of inefficiency intervals with different inefficiency ranges; each block recovery selection module is provided with a longest life cycle, which represents the time from when data first exists in the module until the module is emptied, and the data blocks of a block recovery selection module that reaches its longest life cycle are moved into the corresponding inefficiency intervals of the next adjacent block recovery selection module;
the data block migration module is used for migrating the target data block to a corresponding inefficiency interval in the target block recovery selection module according to the inefficiency of the target data block when the target data block with fully written data is detected to exist, and adjusting the target data block to the corresponding inefficiency interval based on the current inefficiency at intervals of a first preset time; the inefficiency represents a fractional amount of invalid data; the target block recovery selection module is a block recovery selection module which selects the most recently used block from the block recovery packet according to the use time sequence of the data blocks;
and the garbage collection module is used for selecting the data blocks from the block collection selection modules in sequence from old to new according to the use time sequence of the data blocks to be collected for garbage collection when the block collection selection module with the highest inefficient interval meeting the garbage collection condition exists.
Optionally, the garbage recycling module includes:
the first recovery submodule is used for selecting a data block from the block recovery selection module which satisfies the garbage recovery condition in the highest inefficiency interval to carry out garbage recovery if only one block recovery selection module which satisfies the garbage recovery condition in the highest inefficiency interval is provided;
and the second recovery submodule is used for selecting the data blocks from the recovery selection modules in sequence from old to new according to the use time sequence of the data blocks to be recovered for garbage recovery if a plurality of block recovery selection modules of which the highest inefficient intervals meet garbage recovery conditions exist.
Optionally, the data block migration module includes:
the inefficiency calculation sub-module is used for calculating the inefficiency of the target data block when the target data block with fully written data is detected;
the set selection submodule is used for selecting a block recovery selection module which is used most recently from the block recovery packet according to the use time sequence of the data blocks to be used as a target block recovery selection module;
the judgment submodule is used for judging whether the target block recovery selection module already contains a data block;
and the set data processing sub-module is used for migrating the target data block to a corresponding inefficiency interval in the target block recovery selection module according to the inefficiency if the target block recovery selection module does not contain the data block, taking the current time as the non-empty starting time of the target block recovery selection module, and calculating to obtain the time when the target block recovery selection module is emptied.
An embodiment of the present invention further provides a garbage collection space management apparatus, including a processor, where the processor is configured to implement the steps of the garbage collection space management method according to any one of the preceding items when executing a computer program stored in a memory.
Finally, an embodiment of the present invention provides a computer-readable storage medium, where a garbage collection space management program is stored on the computer-readable storage medium, and when executed by a processor, the garbage collection space management program implements the steps of the garbage collection space management method according to any of the foregoing embodiments.
The technical solution provided by the application has the following advantages: data blocks are assigned to block recovery selection modules according to the time order in which they are used; within each block recovery selection module, data blocks are assigned to different inefficiency intervals according to their current inefficiency; and the inefficiency intervals of the data blocks in each module are adjusted periodically. When data blocks are selected for recovery, those in the non-empty highest inefficiency interval of the earliest module are selected preferentially. Data blocks with more invalid data are thus selected for garbage recovery, which effectively reduces the amount of migrated data; invalid migration is also reduced, which effectively reduces system overhead and improves the overall performance and service quality of the full flash memory array.
In addition, the embodiment of the invention also provides a corresponding implementation device and a computer readable storage medium for the garbage collection space management method, so that the method has higher practicability, and the device and the computer readable storage medium have corresponding advantages.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions of the related art, the drawings required to be used in the description of the embodiments or the related art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
Fig. 1 is a schematic flow chart of a garbage collection space management method according to an embodiment of the present invention;
Fig. 2 is a schematic flow chart of another garbage collection space management method according to an embodiment of the present invention;
Fig. 3 is a schematic flow chart of another garbage collection space management method according to an embodiment of the present invention;
Fig. 4 is a structural diagram of a garbage collection space management apparatus according to an embodiment of the present invention;
Fig. 5 is a structural diagram of another specific embodiment of the garbage collection space management apparatus according to an embodiment of the present invention.
Detailed Description
In order that those skilled in the art will better understand the disclosure, reference will now be made in detail to the embodiments of the disclosure as illustrated in the accompanying drawings. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first," "second," "third," "fourth," and the like in the description and claims of this application and in the above-described drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "comprising" and "having," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements but may include other steps or elements not expressly listed.
Having described the technical solutions of the embodiments of the present invention, various non-limiting embodiments of the present application are described in detail below.
Referring to fig. 1, fig. 1 is a schematic flow chart of a garbage collection space management method according to an embodiment of the present invention, where the embodiment of the present invention includes the following:
s101: a chunk recycle package including a plurality of chunk recycle selection modules is pre-constructed.
In this step, a plurality of block reclamation selection modules are constructed in the full flash storage array, and the block reclamation selection modules are used for storing data blocks which are fully written with data. Each block recycling selection module is divided into a plurality of inefficiency intervals with different inefficiency ranges, that is, each block recycling selection module comprises a plurality of inefficiency ranges, each inefficiency range has one inefficiency range, the inefficiency ranges of the inefficiency ranges are not overlapped, the number of the inefficiency ranges and the corresponding inefficiency ranges can be selected based on actual conditions, and the method is not limited in this respect. For example, if the inefficiency is a percentage of the total data blocks occupied by the invalid data, each block recycling selection module may set 10 inefficiency intervals, and the inefficiency ranges of each inefficiency interval are 0-10%, 11% -20%, … …, 81% -90%, and 91% -100%. Each inefficiency interval in the block recycling selection module stores a data block with inefficiency in the inefficiency range, the inefficiency of the data block refers to the proportion of invalid data, and the data block can be obtained through calculation of all data of the invalid data/data block and can also be obtained through calculation of the invalid data/valid data, which does not affect the implementation of the application, that is, the inefficiency of the data block needs to be matched with the located inefficiency interval, and the inefficiency value needs to be in the inefficiency range of the located inefficiency interval.
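As an illustration of the structure described in this step, the following minimal sketch models one block recovery selection module as a list of inefficiency intervals and maps an inefficiency value to its interval. The names `SelectionModule`, `upper_bounds`, and `interval_index` are hypothetical and do not appear in the application; the inefficiency is assumed to be the percentage of the whole block occupied by invalid data, and each interval is described by its inclusive upper bound.

```python
from bisect import bisect_left
from dataclasses import dataclass, field
from typing import List, Sequence


def inefficiency_percent(invalid_bytes: int, total_bytes: int) -> float:
    """Inefficiency as invalid data / all data of the block, in percent."""
    return 100.0 * invalid_bytes / total_bytes


@dataclass
class SelectionModule:
    # Inclusive upper bound of each interval, ascending, e.g. 10, 20, ..., 100.
    upper_bounds: Sequence[int]
    intervals: List[List[str]] = field(init=False)

    def __post_init__(self) -> None:
        # One list of block identifiers per inefficiency interval.
        self.intervals = [[] for _ in self.upper_bounds]

    def interval_index(self, pct: float) -> int:
        """Index of the first interval whose upper bound is >= pct."""
        if not 0 <= pct <= self.upper_bounds[-1]:
            raise ValueError("inefficiency out of range")
        return bisect_left(self.upper_bounds, pct)

    def add(self, block_id: str, pct: float) -> int:
        """Store a fully written block in the interval matching its inefficiency."""
        idx = self.interval_index(pct)
        self.intervals[idx].append(block_id)
        return idx


# The ten intervals from the example: 0-10%, 11%-20%, ..., 91%-100%.
module = SelectionModule(upper_bounds=list(range(10, 101, 10)))
```

Describing intervals by their upper bounds keeps the ranges non-overlapping by construction; a fractional value such as 10.5% is assigned to the next interval up, which is a choice of this sketch rather than something the text specifies.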
Each block recovery selection module is provided with a longest life cycle, which represents the time from when data first exists in the module until the module is emptied, i.e., the longest time that a data block can be kept in the module. Those skilled in the art can determine the longest life cycle of each block recovery selection module according to the actual application scenario. Whether each block recovery selection module has reached its longest life cycle can be checked periodically, for example at intervals of a second preset time such as 1 min. If a block recovery selection module has reached its longest life cycle, all data blocks in it are moved into the corresponding inefficiency intervals of the next adjacent block recovery selection module. For a given block recovery selection module B, the module A ordered before B was used by data blocks earlier than B, and the module C ordered after B was used by data blocks later than B; when B reaches its clearing time, all data blocks in B are moved into the corresponding inefficiency intervals in C.
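The periodic life-cycle check can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation of the application: the names `Module` and `expire_pass` are hypothetical, times are plain seconds, every module is given three intervals, and an infinite longest life cycle is modelled as `None`. The pass walks the list from the back so that a block moves at most one module per pass, and it starts the timer of a receiving module that was empty; both are design choices of the sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Module:
    name: str
    max_life: Optional[float]  # longest life cycle in seconds; None means infinite
    intervals: List[List[str]] = field(default_factory=lambda: [[], [], []])
    non_empty_since: Optional[float] = None  # time the module became non-empty

    def is_empty(self) -> bool:
        return not any(self.intervals)

    def expired(self, now: float) -> bool:
        return (
            self.max_life is not None
            and self.non_empty_since is not None
            and now >= self.non_empty_since + self.max_life
        )


def expire_pass(modules: List[Module], now: float) -> None:
    """Move the blocks of every expired module into the next adjacent module.

    `modules` is ordered so that modules[i + 1] is the "next adjacent" module
    of modules[i]. The last module is never a source in this sketch.
    """
    for i in range(len(modules) - 2, -1, -1):
        src, dst = modules[i], modules[i + 1]
        if not src.expired(now):
            continue
        if dst.is_empty() and not src.is_empty():
            dst.non_empty_since = now  # assumption: receiving blocks starts the timer
        for k, blocks in enumerate(src.intervals):
            dst.intervals[k].extend(blocks)  # same inefficiency interval in the next module
            blocks.clear()
        src.non_empty_since = None  # the source is empty again


a, b, c = Module("A", 60.0), Module("B", 120.0), Module("C", None)
a.intervals[2].append("blk1")
a.non_empty_since = 0.0
expire_pass([a, b, c], now=60.0)  # A has reached its longest life cycle
```

After the call, `blk1` sits in the third interval of B, and B's own life cycle starts counting from the time of the move.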
S102: when it is detected that a target data block fully written with data exists, the target data block is migrated to the corresponding inefficiency interval in the target block recovery selection module according to its inefficiency, and at intervals of a first preset time the target data block is adjusted to the inefficiency interval corresponding to its current inefficiency.
In this step, the target block recovery selection module is the most recently used block recovery selection module, selected from the block recovery packet according to the time order of use by data blocks. For example, suppose the block recovery packet includes block recovery selection modules A, B, and C, created in that order. When none of the modules has yet been used by a data block and a data block fully written with data appears, the data block is first placed in module A. If module A reaches its longest life cycle, all data blocks in module A are moved into module B; if module B reaches its longest life cycle, all data blocks in module B are moved into module C; and the longest life cycle of module C can be set to infinity. When the block recovery selection modules in the block recovery packet have already been used by data blocks and a target data block fully written with data is detected, the target data block is put into the most recently used block recovery selection module. For example, if module A was used by a data block at time t1 (e.g., 17:14:12), module B at time t2 (e.g., 17:14:13), and module C at time t3 (e.g., 17:14:14), with t1 < t2 < t3, then the target data block is stored in module C.
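The selection rule in this example can be sketched in a few lines. The function name `pick_target` and the dictionary of last-use times are assumptions made for illustration; the sketch only encodes the two cases given above: if no module has been used yet, the first-created module is chosen, otherwise the module with the latest use time is chosen.

```python
from datetime import time
from typing import Dict, Optional


def pick_target(last_used: Dict[str, Optional[time]]) -> str:
    """Return the name of the most recently used module.

    `last_used` maps module names, in creation order, to the time each module
    was last used by a data block, or None if it has never been used.
    """
    used = {name: t for name, t in last_used.items() if t is not None}
    if not used:
        # No module has been used yet: start with the first-created one.
        return next(iter(last_used))
    return max(used, key=lambda name: used[name])


# The example from the text: t1 < t2 < t3, so module C is the target.
target = pick_target({
    "A": time(17, 14, 12),
    "B": time(17, 14, 13),
    "C": time(17, 14, 14),
})
```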
It can be understood that, as the user continues to write data, the amount of invalid data in the data blocks keeps growing. Each block recovery selection module can therefore be scanned periodically to check whether the inefficiency of each data block still matches the inefficiency interval in which the block is currently located, and the interval of each data block is adjusted accordingly. The first preset time can be determined according to actual requirements, and the application places no limit on it. For example, suppose the target block recovery selection module includes 3 inefficiency intervals, with ranges 0-35%, 36-70%, and 71-100% respectively. The target data block is moved into the target block recovery selection module at time t1 with an inefficiency of 31%, so it is placed in the first inefficiency interval. The inefficiency is recalculated every ΔT, for example 1 min. If, after 2ΔT, at time t2, the inefficiency of the target data block is 80%, the target data block is adjusted from the first inefficiency interval to the third inefficiency interval.
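The periodic adjustment can be illustrated with the three-interval example above. This is a minimal sketch: the names `UPPER_BOUNDS` and `rebucket` are hypothetical, the current inefficiency of a block is supplied by a caller-provided lookup, and the scheduling of the scan every ΔT is left to the caller.

```python
from bisect import bisect_left
from typing import Callable, List

# Inclusive upper bounds of the three intervals: 0-35%, 36-70%, 71-100%.
UPPER_BOUNDS = [35, 70, 100]


def rebucket(intervals: List[List[str]], current_pct: Callable[[str], float]) -> int:
    """Move each block to the interval matching its current inefficiency.

    Returns the number of blocks that changed interval. Moves are collected
    first so the lists are not modified while they are being scanned.
    """
    moves = []
    for idx, blocks in enumerate(intervals):
        for block_id in blocks:
            new_idx = bisect_left(UPPER_BOUNDS, current_pct(block_id))
            if new_idx != idx:
                moves.append((block_id, idx, new_idx))
    for block_id, old_idx, new_idx in moves:
        intervals[old_idx].remove(block_id)
        intervals[new_idx].append(block_id)
    return len(moves)


pct = {"target": 31.0}            # inefficiency at t1
intervals = [["target"], [], []]  # so the block starts in the first interval
pct["target"] = 80.0              # inefficiency observed at t2
moved = rebucket(intervals, pct.__getitem__)
```

With these inputs the block leaves the first interval and ends up in the third, matching the worked example.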
S103: when there is a block recovery selection module whose highest inefficiency interval meets the garbage recovery condition, data blocks are selected from the block recovery selection modules in sequence, from old to new according to the time order of use by data blocks, for garbage recovery.
The application can preset a garbage recovery condition. During garbage recovery, when the storage array detects a block recovery selection module that meets the garbage recovery condition, it performs the garbage recovery operation; if no block recovery selection module meets the condition, it continues monitoring. The garbage recovery condition in this step can be determined based on the inefficiency of the data blocks. Because each module is divided into a plurality of inefficiency intervals, whether the condition is met can be judged interval by interval. For example, if garbage recovery is to be performed on data blocks whose inefficiency exceeds 80%, the condition is evaluated on the inefficiency intervals whose ranges lie above 80%. Of course, the garbage recovery condition may also be defined in terms of the intervals themselves, for example by performing garbage recovery only on the data blocks in the highest inefficiency interval of each block recovery selection module. Those skilled in the art can determine the garbage recovery condition according to actual requirements, and the application is not limited in this respect. During garbage recovery, the relatively earliest data blocks are recovered first, which effectively reduces invalid migration, improves garbage recovery efficiency, and reduces system overhead.
In the technical solution provided by the embodiment of the present invention, data blocks are assigned to block recovery selection modules according to the time order in which they are used; within each block recovery selection module, data blocks are assigned to different inefficiency intervals according to their current inefficiency; and the inefficiency intervals of the data blocks in each module are adjusted periodically. When data blocks are selected for recovery, those in the non-empty highest inefficiency interval of the earliest module are selected preferentially. Data blocks with more invalid data are thus selected for garbage recovery, which effectively reduces the amount of migrated data; invalid migration is also reduced, which effectively reduces system overhead and improves the overall performance and service quality of the full flash memory array.
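The selection order described in this step can be sketched as a small generator. The name `select_for_gc` and the tuple representation of a module are assumptions made for illustration, and the garbage recovery condition is taken to be the interval-based variant mentioned above: a module qualifies when its highest inefficiency interval is non-empty.

```python
from typing import Iterator, List, Tuple

# A module is represented here as (name, intervals), where intervals[-1]
# is the highest inefficiency interval.
ModuleView = Tuple[str, List[List[str]]]


def select_for_gc(modules: List[ModuleView]) -> Iterator[Tuple[str, str]]:
    """Yield (module name, block id) pairs in garbage recovery order.

    `modules` must be ordered from the oldest time of use to the newest.
    Modules whose highest interval is empty are skipped.
    """
    for name, intervals in modules:
        for block_id in intervals[-1]:
            yield name, block_id


modules = [
    ("old", [[], ["b1"], ["b2", "b3"]]),
    ("mid", [["b4"], [], []]),
    ("new", [[], [], ["b5"]]),
]
order = list(select_for_gc(modules))
```

Here the blocks in the highest interval of the oldest module come first, the middle module is skipped because its highest interval is empty, and the block in the newest module comes last. Blocks in lower intervals are not selected at all under this condition.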
The foregoing embodiment does not limit how step S102 is executed. This embodiment provides one implementation of target data block migration; as shown in fig. 2, S102 includes the following steps:
S1021: when a fully written target data block is detected, calculate the inefficiency of the target data block.
S1022: select, from the block recovery packet, the block recovery selection module most recently used according to the time order in which data blocks were used, as the target block recovery selection module.
S1023: judge whether the target block recovery selection module already contains a data block, and if not, execute S1024.
S1024: migrate the target data block to the corresponding inefficiency interval in the target block recovery selection module according to its inefficiency, take the current time as the non-empty start time of the target block recovery selection module, and calculate the time at which the target block recovery selection module is to be emptied.
In this step, if the target data block is the first data block in the target block recovery selection module, the longest life cycle of the target block recovery selection module may be obtained first, and the time at which the module is to be emptied is then calculated from the start time at which the target data block was moved into the module.
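Steps S1021 to S1024 can be sketched as follows, purely as an illustration. The sketch assumes that inefficiency is computed as invalid pages divided by total pages, that the list `modules` is ordered from oldest to newest so that its last element is the most recently used module, and that a block arriving in a non-empty module is simply added without resetting the timer (the description above does not spell out that branch). All names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

NUM_INTERVALS = 5  # assumed number of equal-width inefficiency intervals


@dataclass
class Module:
    max_lifetime: float  # longest life cycle, in seconds (assumed unit)
    intervals: list = field(
        default_factory=lambda: [[] for _ in range(NUM_INTERVALS)]
    )
    non_empty_since: Optional[float] = None  # non-empty start time
    emptied_at: Optional[float] = None       # time at which it is emptied

    def is_empty(self) -> bool:
        return not any(self.intervals)


def migrate_full_block(block_id, invalid_pages, total_pages, modules, now):
    # S1021: calculate the inefficiency of the fully written block.
    inefficiency = invalid_pages / total_pages
    # S1022: the most recently used module is the target (assumed: last item).
    target = modules[-1]
    # S1023 / S1024: the first block to arrive starts the module's timer.
    if target.is_empty():
        target.non_empty_since = now
        target.emptied_at = now + target.max_lifetime
    idx = min(int(inefficiency * NUM_INTERVALS), NUM_INTERVALS - 1)
    target.intervals[idx].append(block_id)
    return idx
```

In this sketch a module with an infinite longest life cycle can be represented with `max_lifetime=float("inf")`, in which case the computed emptying time is also infinite and the module is never emptied.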
According to the embodiment of the invention, data blocks are preferentially stored in the most recently used block recovery selection module, so that data blocks are placed in the same set as far as possible, which reduces the time spent on emptying operations and improves working efficiency.
It is understood that, during garbage collection, there may be one block recovery selection module that meets the garbage collection condition, a plurality of such modules, or none at all. On this basis, as an alternative embodiment shown in fig. 3, the present application also provides an implementation of garbage collection, in which S103 may include the following steps:
S1031: if only one block recovery selection module has a highest inefficiency interval that meets the garbage collection condition, select data blocks from that block recovery selection module for garbage collection.
S1032: if a plurality of block recovery selection modules have highest inefficiency intervals that meet the garbage collection condition, select data blocks from these block recovery selection modules in order from oldest to newest, according to the time order in which data blocks were used, for garbage collection.
Of course, when the garbage collection condition is not met, the system may continue monitoring. When data blocks are selected for recovery, the data blocks in the highest inefficiency interval of each block recovery selection module are preferred; when the highest non-empty inefficiency intervals of several modules have the same inefficiency range, the data blocks in the earliest module are preferred. This helps reduce the amount of migrated data and also reduces invalid migration.
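The selection order of S1031 and S1032, together with the tie-breaking rule just described, can be sketched as follows. The sketch assumes that `modules` is ordered from oldest to newest, that each module is a list of intervals with the lowest interval first, and that the garbage collection condition is expressed as a minimum interval index `min_interval`; these names and that form of the condition are hypothetical.

```python
from typing import Optional


def highest_non_empty(intervals) -> Optional[int]:
    """Index of the highest interval that holds a block, or None if empty."""
    for idx in range(len(intervals) - 1, -1, -1):
        if intervals[idx]:
            return idx
    return None


def pick_next_block(modules, min_interval):
    """Return (module index, block id) of the next block to collect.

    Returns None when no module meets the assumed condition, in which
    case the caller keeps monitoring.
    """
    best = None  # (interval index, module index)
    for m_idx, intervals in enumerate(modules):  # oldest module first
        top = highest_non_empty(intervals)
        if top is None or top < min_interval:
            continue
        # Strict ">" keeps the older module when two modules tie.
        if best is None or top > best[0]:
            best = (top, m_idx)
    if best is None:
        return None
    top, m_idx = best
    return m_idx, modules[m_idx][top].pop(0)
```

When only one module qualifies, the loop reduces to S1031; when several qualify with the same highest interval, repeated calls drain them from oldest to newest as in S1032.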
As another optional implementation, so that data blocks containing more invalid data can be selected during garbage collection, reducing both the amount of migrated data and the invalid migration and thereby the system overhead, the block recovery packet contains at least one block recovery selection module whose longest life cycle is infinite; this is the block recovery selection module most recently used by data blocks. One way to process a block recovery selection module that has reached its longest life cycle may be:
at intervals of a second preset time, judging in turn, following the time order in which data blocks used the modules, from the second-farthest time to the nearest time, whether each block recovery selection module has reached its emptying time;
if so, moving the data blocks in each inefficiency interval of the first block recovery selection module that has reached its emptying time into the corresponding inefficiency interval of the next block recovery selection module; the time at which the first block recovery selection module was used by data blocks is a first time; the next block recovery selection module is the block recovery selection module in the block recovery packet whose use time is adjacent to and later than the first time.
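The emptying of expired modules can be pictured with the sketch below. It assumes that `modules` is ordered from oldest to newest, that the newest module has an infinite longest life cycle and is therefore never a source, and that the scan runs from the oldest module towards the newest; the scan direction, the dictionary keys `"intervals"` and `"emptied_at"`, and the function name are assumptions of this sketch. The sketch also does not start a timer on the receiving module, since the description above does not address that point.

```python
def empty_expired_modules(modules, now):
    """Move blocks out of every module whose emptying time has passed.

    Each module is a dict with:
      "intervals"  -- list of lists of block ids, lowest interval first
      "emptied_at" -- emptying time, or None if no timer is running
    """
    # The last (most recently used) module is never emptied.
    for i in range(len(modules) - 1):
        src, dst = modules[i], modules[i + 1]
        deadline = src["emptied_at"]
        if deadline is None or now < deadline:
            continue
        # Blocks keep their inefficiency interval when they move.
        for idx, bucket in enumerate(src["intervals"]):
            dst["intervals"][idx].extend(bucket)
            bucket.clear()
        src["emptied_at"] = None
```

Because blocks stay in the same inefficiency interval when they move, the receiving module can be searched for recovery candidates in the same way as before the move.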
It should be noted that there is no strict execution order among the steps in the present application; as long as the logical order is respected, the steps may be executed simultaneously or in a preset order. Fig. 1 to fig. 3 are only schematic and do not imply that this is the only possible execution order.
The embodiment of the invention also provides a device corresponding to the garbage collection space management method, which makes the method more practical. The device can be described from the perspective of functional modules and from the perspective of hardware. The garbage collection space management device according to the embodiment of the present invention is introduced below; the device described below and the method described above may be referred to in correspondence with each other.
From the perspective of functional modules, referring to fig. 4, fig. 4 is a structural diagram of a garbage collection space management device according to an embodiment of the present invention. In a specific implementation, the device may include:
a preprocessing module 401, configured to pre-construct a block recovery packet, where each block recovery selection module in the block recovery packet is divided into multiple inefficiency intervals with different inefficiency ranges; each block recovery selection module is provided with a longest life cycle, which represents the period from when data exists in the module until the data is emptied, and the data blocks of a block recovery selection module that reaches its longest life cycle are moved into the corresponding inefficiency intervals of the next adjacent block recovery selection module.
a data block migration module 402, configured to, when a fully written target data block is detected, migrate the target data block to the corresponding inefficiency interval in the target block recovery selection module according to the inefficiency of the target data block, and to adjust the target data block to the corresponding inefficiency interval at intervals of a first preset time based on its current inefficiency; the inefficiency represents the proportion of invalid data in the current data block; the target block recovery selection module is the block recovery selection module in the block recovery packet that was most recently used according to the time order in which data blocks were used.
a garbage collection module 403, configured to, when there is a block recovery selection module whose highest inefficiency interval meets the garbage collection condition, select data blocks from the block recovery selection modules in order from oldest to newest, according to the time order in which data blocks were used, for garbage collection.
Optionally, in some implementations of this embodiment, the garbage collection module 403 may further include:
a first recovery submodule, configured to, if only one block recovery selection module has a highest inefficiency interval that meets the garbage collection condition, select data blocks from that block recovery selection module for garbage collection;
a second recovery submodule, configured to, if a plurality of block recovery selection modules have highest inefficiency intervals that meet the garbage collection condition, select data blocks from these block recovery selection modules in order from oldest to newest, according to the time order in which data blocks were used, for garbage collection.
Optionally, in other implementations of this embodiment, the data block migration module 402 may further include:
an inefficiency calculation submodule, configured to calculate the inefficiency of the target data block when a fully written target data block is detected;
a set selection submodule, configured to select, from the block recovery packet, the block recovery selection module most recently used according to the time order in which data blocks were used, as the target block recovery selection module;
a judgment submodule, configured to judge whether the target block recovery selection module already contains a data block;
a set data processing submodule, configured to, if the target block recovery selection module does not contain a data block, migrate the target data block to the corresponding inefficiency interval in the target block recovery selection module according to its inefficiency, take the current time as the non-empty start time of the target block recovery selection module, and calculate the time at which the target block recovery selection module is to be emptied.
In some other optional implementations, the preprocessing module 401 may further be configured to:
judge in turn, according to the time order in which data blocks used the modules, whether each block recovery selection module has reached its emptying time;
if so, move the data blocks in each inefficiency interval of the first block recovery selection module that has reached its emptying time into the corresponding inefficiency interval of the next block recovery selection module; the time at which the first block recovery selection module was used by data blocks is a first time; the next block recovery selection module is the block recovery selection module in the block recovery packet whose use time is adjacent to and later than the first time.
The functions of the functional modules of the garbage collection space management device according to the embodiment of the present invention may be specifically implemented according to the method in the above method embodiment, and the specific implementation process may refer to the related description of the above method embodiment, which is not described herein again.
Therefore, in the process of garbage collection of the data blocks, the embodiment of the invention can select the data blocks with more invalid data to reduce the amount of the migration data, and can reduce the invalid migration to reduce the system overhead.
The garbage collection space management device has been described above from the perspective of functional modules; the present application further provides a garbage collection space management device described from the perspective of hardware. Fig. 5 is a structural diagram of another garbage collection space management device according to an embodiment of the present application. As shown in fig. 5, the device comprises a memory 50 for storing a computer program;
and a processor 51, configured to implement the steps of the garbage collection space management method according to any of the above embodiments when executing the computer program.
The processor 51 may include one or more processing cores, such as a 5-core processor, an 8-core processor, and the like. The processor 51 may be implemented in at least one hardware form of a DSP (Digital Signal Processor), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 51 may also include a main processor and a coprocessor, where the main processor, also called a Central Processing Unit (CPU), processes data in the awake state, and the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, the processor 51 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 51 may further include an AI (Artificial Intelligence) processor for handling computing operations related to machine learning.
Memory 50 may include one or more computer-readable storage media, which may be non-transitory. Memory 50 may also include high speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In this embodiment, the memory 50 is at least used for storing a computer program 501, wherein after being loaded and executed by the processor 51, the computer program can implement the relevant steps of the garbage collection space management method disclosed in any one of the foregoing embodiments. In addition, the resources stored in the memory 50 may also include an operating system 502, data 503, and the like, and the storage manner may be a transient storage manner or a permanent storage manner. Operating system 502 may include Windows, Unix, Linux, etc. Data 503 may include, but is not limited to, data corresponding to test results, and the like.
In some embodiments, the garbage collection space management device may further include a display 52, an input/output interface 53, a communication interface 54, a power supply 55, and a communication bus 56.
It will be appreciated by those skilled in the art that the configuration shown in fig. 5 does not constitute a limitation of the garbage collection space management device, which may include more or fewer components than those shown, such as sensors 57.
The functions of the functional modules of the garbage collection space management device according to the embodiment of the present invention may be specifically implemented according to the method in the above method embodiment, and the specific implementation process may refer to the related description of the above method embodiment, which is not described herein again.
Therefore, in the process of garbage collection of the data blocks, the embodiment of the invention can select the data blocks with more invalid data to reduce the amount of the migration data, and can reduce the invalid migration to reduce the system overhead.
It is to be understood that, if the garbage collection space management method in the above embodiments is implemented in the form of a software functional unit and sold or used as a stand-alone product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence or in part, or in whole, may be embodied in the form of a software product that is stored in a storage medium and that, when executed, performs all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrically erasable programmable ROM, a register, a hard disk, a removable magnetic disk, a CD-ROM, a magnetic or optical disk, and other media capable of storing program code.
Based on this, an embodiment of the present invention further provides a computer-readable storage medium storing a garbage collection space management program which, when executed by a processor, implements the steps of the garbage collection space management method according to any one of the above embodiments.
The functions of the functional modules of the computer-readable storage medium according to the embodiment of the present invention may be specifically implemented according to the method in the foregoing method embodiment, and the specific implementation process may refer to the related description of the foregoing method embodiment, which is not described herein again.
Therefore, in the process of garbage collection of the data blocks, the embodiment of the invention can select the data blocks with more invalid data to reduce the amount of the migration data, and can reduce the invalid migration to reduce the system overhead.
The embodiments are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for the same or similar parts the embodiments may be referred to one another. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, it is described relatively briefly; for relevant details, refer to the description of the method.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The garbage collection space management method, the garbage collection space management device and the computer-readable storage medium provided by the present application are described in detail above. The principles and embodiments of the present invention are explained herein using specific examples, which are presented only to assist in understanding the method and its core concepts. It should be noted that, for those skilled in the art, it is possible to make various improvements and modifications to the present invention without departing from the principle of the present invention, and those improvements and modifications also fall within the scope of the claims of the present application.

Claims (9)

1. A garbage collection space management method is characterized by comprising the following steps:
pre-constructing a block recovery packet, wherein each block recovery selection module in the block recovery packet is divided into a plurality of inefficiency intervals with different inefficiency ranges; the block recovery packet at least comprises a block recovery selection module with the longest life cycle being infinite length, and the block recovery selection module with the longest life cycle being infinite length is a block recovery selection module used by the data block at the latest;
when a target data block with fully written data is detected to exist, the target data block is transferred to a corresponding inefficiency interval in a target block recovery selection module according to the inefficiency of the target data block, and the target data block is adjusted to the corresponding inefficiency interval at intervals of a first preset time based on the current inefficiency;
when a block recovery selection module with the highest inefficient interval meeting garbage recovery conditions exists, sequentially selecting data blocks from the block recovery selection modules from old to new according to the using time sequence of the data blocks to perform garbage recovery;
each block recovery selection module is provided with a longest life cycle, which represents the period from when data exists in the module until the data is emptied, and the data blocks of a block recovery selection module that reaches its longest life cycle are moved into the corresponding inefficiency intervals of the next adjacent block recovery selection module; the inefficiency represents the proportion of invalid data; the target block recovery selection module is the block recovery selection module in the block recovery packet that was most recently used according to the time order in which data blocks were used.
2. The garbage collection space management method according to claim 1, wherein the selecting the data blocks from the block collection selection modules in order from old to new according to the time sequence of the use of the data blocks to perform garbage collection when the block collection selection module having the highest inefficient section satisfying the garbage collection condition exists comprises:
if only one block recovery selection module with the highest inefficiency interval meeting the garbage recovery conditions is available, selecting a data block from that block recovery selection module for garbage recovery;
and if a plurality of block recovery selection modules with the highest inefficient intervals meeting the garbage recovery conditions exist, selecting the data blocks from the block recovery selection modules in sequence from old to new according to the using time sequence of the data blocks to perform garbage recovery.
3. The garbage collection space management method according to claim 2, wherein when it is detected that there is a target data block with fully written data, migrating the target data block to a corresponding inefficiency interval in a target block collection selection module according to the inefficiency of the target data block comprises:
when detecting that a target data block with fully written data exists, calculating the inefficiency of the target data block;
selecting a block recycling selection module which is used most recently from the block recycling packets according to the using time sequence of the data blocks to be used as a target block recycling selection module;
judging whether the target block recovery selection module already contains a data block;
and if the target block recovery selection module does not contain the data block, transferring the target data block to a corresponding inefficiency interval in the target block recovery selection module according to the inefficiency, taking the current time as the non-empty starting time of the target block recovery selection module, and calculating to obtain the time for the target block recovery selection module to be emptied.
4. The garbage collection space management method of claim 1, wherein the moving the data block in the block collection selection module reaching the longest life cycle into the corresponding inefficiency interval of the next adjacent block collection selection module comprises:
judging whether the corresponding block recovery selection module reaches the clearing time or not in turn from the time next to the time nearest according to the use time sequence of the data blocks at intervals of second preset time;
if so, moving the data block in each inefficiency interval of the first block recovery selection module reaching the clearing time into the corresponding inefficiency interval of the next block recovery selection module; the time of the first block recovery selection module used by the data block is a first time; and the next block recovery selection module is a block recovery selection module of which the use time of the data block in the block recovery packet is adjacent to the first time and is later than the first time.
5. A garbage collection space management device, characterized by comprising:
a preprocessing module, configured to pre-construct a block recovery packet, wherein each block recovery selection module in the block recovery packet is divided into a plurality of inefficiency intervals with different inefficiency ranges; each block recovery selection module is provided with a longest life cycle, which represents the period from when data exists in the module until the data is emptied, and the data blocks of a block recovery selection module that reaches its longest life cycle are moved into the corresponding inefficiency intervals of the next adjacent block recovery selection module; the block recovery packet contains at least one block recovery selection module whose longest life cycle is infinite, and the block recovery selection module whose longest life cycle is infinite is the block recovery selection module most recently used by data blocks;
the data block migration module is used for migrating the target data block to a corresponding inefficiency interval in the target block recovery selection module according to the inefficiency of the target data block when the target data block with fully written data is detected to exist, and adjusting the target data block to the corresponding inefficiency interval based on the current inefficiency at intervals of a first preset time; the inefficiency represents a fractional amount of invalid data; the target block recovery selection module is a block recovery selection module which selects the most recently used block from the block recovery packet according to the use time sequence of the data blocks;
and the garbage collection module is used for selecting the data blocks from the block collection selection modules in sequence from old to new according to the use time sequence of the data blocks to be collected for garbage collection when the block collection selection module with the highest inefficient interval meeting the garbage collection condition exists.
6. The garbage collection space management device according to claim 5, wherein the garbage collection module comprises:
the first recovery sub-module is used for selecting a data block from the block recovery selection module with the highest inefficiency interval meeting the garbage recovery condition for garbage recovery if the highest inefficiency interval meets the garbage recovery condition and only one block recovery selection module is provided;
and the second recovery submodule is used for selecting the data blocks from the recovery selection modules in sequence from old to new according to the use time sequence of the data blocks to be recovered for garbage recovery if a plurality of block recovery selection modules of which the highest inefficient intervals meet garbage recovery conditions exist.
7. The garbage collection space management apparatus of claim 6, wherein the data block migration module comprises:
the inefficiency calculation sub-module is used for calculating the inefficiency of the target data block when the target data block with fully written data is detected;
the set selection submodule is used for selecting a block recovery selection module which is used most recently from the block recovery packet according to the use time sequence of the data blocks to be used as a target block recovery selection module;
the judgment submodule is used for judging whether the target block recovery selection module already contains a data block;
and the set data processing sub-module is used for migrating the target data block to a corresponding inefficiency interval in the target block recovery selection module according to the inefficiency if the target block recovery selection module does not contain the data block, taking the current time as the non-empty starting time of the target block recovery selection module, and calculating to obtain the time for the target block recovery selection module to be emptied.
8. A garbage collection space management apparatus comprising a processor for implementing the steps of the garbage collection space management method according to any one of claims 1 to 4 when executing a computer program stored in a memory.
9. A computer-readable storage medium, having a garbage collection space management program stored thereon, which when executed by a processor, performs the steps of the garbage collection space management method according to any one of claims 1 to 4.
CN202010724347.6A 2020-07-24 2020-07-24 Garbage recycling space management method and device and computer readable storage medium Active CN111813347B (en)


Publications (2)

Publication Number Publication Date
CN111813347A (en) 2020-10-23
CN111813347B (en) 2022-06-07

Family

ID=72862592


Country Status (1)

Country Link
CN (1) CN111813347B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113687774A (en) * 2021-07-19 2021-11-23 锐捷网络股份有限公司 Space recovery method, device and equipment
CN113608702A (en) * 2021-08-18 2021-11-05 合肥大唐存储科技有限公司 Method and device for realizing data processing, computer storage medium and terminal
CN113849422A (en) * 2021-09-26 2021-12-28 苏州浪潮智能科技有限公司 Method, device and equipment for selecting garbage collection target block and readable storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103577339A (en) * 2012-07-27 2014-02-12 深圳市腾讯计算机***有限公司 Method and system for storing data
WO2015085732A1 (en) * 2013-12-10 2015-06-18 中兴通讯股份有限公司 Terminal memory processing method and apparatus, and terminal
CN106648837A (en) * 2016-12-30 2017-05-10 携程旅游网络技术(上海)有限公司 Virtual machine life cycle management system and virtual machine life cycle management method
CN109710541A (en) * 2018-12-06 2019-05-03 天津津航计算技术研究所 For the optimization method of NAND Flash main control chip Greedy garbage reclamation
CN110018794A (en) * 2019-04-11 2019-07-16 苏州浪潮智能科技有限公司 A kind of rubbish recovering method, device, storage system and readable storage medium storing program for executing

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10366004B2 (en) * 2016-07-26 2019-07-30 Pure Storage, Inc. Storage system with elective garbage collection to reduce flash contention

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
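P. Pufek; H. Grgić; B. Mihaljević. "Analysis of Garbage Collection Algorithms and Memory Management in Java". IEEE. 2019. *
Analysis of memory object management problems in distributed data processing systems; Zhang Xiong et al.; ZTE Technology Journal; 2016-04-30 (No. 02); full text *
Garbage collection algorithm based on an embedded Java virtual machine; Chen Ning et al.; Journal of Computer Applications; 2005-01-28 (No. 01); full text *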

Also Published As

Publication number Publication date
CN111813347A (en) 2020-10-23

Similar Documents

Publication Publication Date Title
CN111813347B (en) Garbage recycling space management method and device and computer readable storage medium
CN110597616B (en) Memory allocation method and device for neural network
CN111090398B (en) Garbage recycling method, device and equipment for solid state disk and readable storage medium
CN104166606B (en) File backup method and main storage device
CN103336744B (en) Garbage collection method for a solid-state storage device and system thereof
CN111124305A (en) Solid state disk wear leveling method and device and computer readable storage medium
CN109284233B (en) Garbage recovery method of storage system and related device
CN105511806B (en) Method and mobile terminal for processing write requests
CN103473343A (en) File management method, device and terminal
CN111324303A (en) SSD garbage recycling method and device, computer equipment and storage medium
CN110673789A (en) Metadata storage management method, device, equipment and storage medium of solid state disk
CN106293497A (en) Method and device for reclaiming junk data in a shingled-recording-aware file system
CN103927263A (en) Garbage recycling method and garbage recycling device
CN108153594A (en) Resource defragmentation method and electronic device for an artificial intelligence cloud platform
CN112306408A (en) Storage block processing method, device, equipment and storage medium
CN113535483B (en) File backup method and device and computing equipment
CN109189739B (en) Cache space recovery method and device
CN107704341A (en) File access pattern method, apparatus and electronic equipment
CN113190503B (en) File system capacity expansion method and device, electronic equipment and storage medium
CN102169464B (en) Caching method and device used for non-volatile memory, and intelligent card
CN110018793B (en) Host IO processing control method and device, terminal and readable storage medium
CN114995770B (en) Data processing method, device, equipment, system and readable storage medium
CN109032762A (en) Virtual machine rollback method and related device
CN110704241B (en) Method, device, equipment and medium for recovering file metadata
CN102981964A (en) Method and system of data storage space management

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant