WO2016188098A1 - Method and apparatus for recovering garbage data in a shingle-aware file system - Google Patents

Method and apparatus for recovering garbage data in a shingle-aware file system

Info

Publication number
WO2016188098A1
Authority
WO
WIPO (PCT)
Prior art keywords
storage tape
storage
recovered
band
determining
Prior art date
Application number
PCT/CN2015/097908
Other languages
English (en)
French (fr)
Inventor
曾令仿
张泽浩
李俊
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation

Definitions

  • The present invention relates to the field of data storage, and in particular to a method and apparatus for recovering garbage data in a shingle-aware file system.
  • Shingled (tile) recording is a technique for increasing the storage density of a disk.
  • The principle is that adjacent tracks on the disk partially overlap, so that data is recorded like overlapping roof tiles; hence the name.
  • A shingled-recording disk is generally divided into multiple storage bands (Band). Each Band supports only append writes and does not support in-place updates. Therefore, a shingled-recording disk requires garbage collection to improve space utilization.
  • SAFS Shingle-Aware File System
  • DSAFS Device-Side SAFS
  • HSAFS Host-side SAFS
  • CSAFS Cooperative SAFS
  • DSAFS is similar to the Flash Translation Layer (FTL) in Solid State Drives (SSD): it presents the Shingled Write Disk (SWD) to the host as a block device.
  • FTL Flash Translation Layer
  • SSD Solid State Drives
  • SWD Shingled Write Disk
  • The invention provides a garbage data recovery method, a host end, and a device end, which can improve garbage data recovery efficiency.
  • In a first aspect, a method for recovering garbage data in a shingle-aware file system is provided. The file system includes a host end and a device end, and the method includes: the host end determines a to-be-recovered storage band among a plurality of storage bands; the host end determines the type of the to-be-recovered storage band, where the types include a hot storage band and a cold storage band; the host end determines a garbage data recovery policy according to the type of the to-be-recovered storage band; and the host end sends the garbage data recovery policy to the device end, so that the device end performs garbage data recovery processing on the to-be-recovered storage band according to the policy.
  • Determining the to-be-recovered storage band among the plurality of storage bands includes: determining the invalid data block utilization of each of the plurality of storage bands; and determining the storage band with the largest invalid data block utilization as the to-be-recovered storage band.
  • Determining a garbage data recovery policy according to the type of the to-be-recovered storage band includes: when the type of the to-be-recovered storage band is a hot storage band, determining a first target storage band among the hot storage bands of the plurality of storage bands, and determining the garbage data recovery policy, which instructs the device end to store the valid data blocks of the to-be-recovered storage band into the free data blocks of the first target storage band; when the type of the to-be-recovered storage band is a cold storage band, determining a second target storage band among the cold storage bands of the plurality of storage bands, and determining the garbage data recovery policy, which instructs the device end to store the valid data blocks of the to-be-recovered storage band into the free data blocks of the second target storage band.
  • Determining the first target storage band among the hot storage bands of the plurality of storage bands includes: when the type of the to-be-recovered storage band is a hot storage band, determining the valid data block utilization of the to-be-recovered storage band; determining the free data block utilization of the hot storage bands among the plurality of storage bands; when at least one hot storage band other than the to-be-recovered storage band has a free data block utilization greater than or equal to the valid data block utilization of the to-be-recovered storage band, determining the hot storage band with the smallest free data block utilization among the at least one hot storage band as the first target storage band; and when no hot storage band other than the to-be-recovered storage band has a free data block utilization greater than or equal to the valid data block utilization of the to-be-recovered storage band, determining the to-be-recovered storage band as the first target storage band.
  • Determining the second target storage band among the cold storage bands of the plurality of storage bands includes: when the type of the to-be-recovered storage band is a cold storage band, determining the valid data block utilization of the to-be-recovered storage band; determining the free data block utilization of the cold storage bands among the plurality of storage bands; when at least one cold storage band other than the to-be-recovered storage band has a free data block utilization greater than or equal to the valid data block utilization of the to-be-recovered storage band, determining the cold storage band with the smallest free data block utilization among the at least one cold storage band as the second target storage band; and when no cold storage band other than the to-be-recovered storage band has a free data block utilization greater than or equal to the valid data block utilization of the to-be-recovered storage band, determining the to-be-recovered storage band as the second target storage band.
  • Determining the type of the to-be-recovered storage band includes: determining, according to the number of reads and writes, the hotness (degree of hot and cold) of each data block in the to-be-recovered storage band; determining the sum of the hotness of the data blocks in the to-be-recovered storage band as the hotness of the to-be-recovered storage band; and determining that the to-be-recovered storage band is a hot storage band when its hotness is greater than or equal to a first threshold, or a cold storage band when its hotness is less than the first threshold.
  • The method further includes: determining the hotness of each of the plurality of storage bands; determining a storage band whose hotness is greater than or equal to a second threshold as a hot storage band; and determining a storage band whose hotness is less than the second threshold as a cold storage band.
  • The host end sending the garbage data recovery policy to the device end, so that the device end performs garbage data recovery processing on the to-be-recovered storage band according to the policy, includes: the host end sends the garbage data recovery policy to the device end, where the policy includes indications of the to-be-recovered storage band and the target storage band, so that the device end, according to the policy, stores the valid data of the to-be-recovered storage band into the free data blocks of the target storage band and recovers the invalid data blocks of the to-be-recovered storage band, where the target storage band is the first target storage band or the second target storage band.
  • In a second aspect, a host is provided. The host includes a network interface, a memory, and a processor; the memory stores a set of programs, and the processor is configured to invoke the programs stored in the memory so that the host performs the method of the first aspect or of any possible implementation of the first aspect.
  • A host in a shingle-aware file system includes: a first determining module, configured to determine a to-be-recovered storage band among a plurality of storage bands; a second determining module, configured to determine the type of the to-be-recovered storage band, where the types include a hot storage band and a cold storage band; a third determining module, configured to determine a garbage data recovery policy according to the type determined by the second determining module; and a sending module, configured to send the garbage data recovery policy determined by the third determining module to the device end, so that the device end performs garbage data recovery processing on the to-be-recovered storage band according to the policy.
  • The first determining module is specifically configured to: determine the invalid data block utilization of each of the plurality of storage bands; and determine the storage band with the largest invalid data block utilization as the to-be-recovered storage band.
  • The third determining module is specifically configured to: when the type of the to-be-recovered storage band is a hot storage band, determine a first target storage band among the hot storage bands of the plurality of storage bands, and determine the garbage data recovery policy, which instructs the device end to store the valid data blocks of the to-be-recovered storage band into the free data blocks of the first target storage band; when the type of the to-be-recovered storage band is a cold storage band, determine a second target storage band among the cold storage bands of the plurality of storage bands, and determine the garbage data recovery policy, which instructs the device end to store the valid data blocks of the to-be-recovered storage band into the free data blocks of the second target storage band.
  • The third determining module is specifically configured to: when the type of the to-be-recovered storage band is a hot storage band, determine the valid data block utilization of the to-be-recovered storage band; determine the free data block utilization of the hot storage bands among the plurality of storage bands; when at least one hot storage band other than the to-be-recovered storage band has a free data block utilization greater than or equal to the valid data block utilization of the to-be-recovered storage band, determine the hot storage band with the smallest free data block utilization among the at least one hot storage band as the first target storage band; and when no hot storage band other than the to-be-recovered storage band has a free data block utilization greater than or equal to the valid data block utilization of the to-be-recovered storage band, determine the to-be-recovered storage band as the first target storage band.
  • The third determining module is specifically configured to: when the type of the to-be-recovered storage band is a cold storage band, determine the valid data block utilization of the to-be-recovered storage band; determine the free data block utilization of the cold storage bands among the plurality of storage bands; when at least one cold storage band other than the to-be-recovered storage band has a free data block utilization greater than or equal to the valid data block utilization of the to-be-recovered storage band, determine the cold storage band with the smallest free data block utilization among the at least one cold storage band as the second target storage band; and when no cold storage band other than the to-be-recovered storage band has a free data block utilization greater than or equal to the valid data block utilization of the to-be-recovered storage band, determine the to-be-recovered storage band as the second target storage band.
  • The second determining module is specifically configured to: determine, according to the number of reads and writes, the hotness of each data block in the to-be-recovered storage band; determine the sum of the hotness of the data blocks in the to-be-recovered storage band as the hotness of the to-be-recovered storage band; and determine that the to-be-recovered storage band is a hot storage band when its hotness is greater than or equal to the first threshold, or a cold storage band when its hotness is less than the first threshold.
  • The second determining module is further configured to: determine the hotness of each of the plurality of storage bands; determine a storage band whose hotness is greater than or equal to the second threshold as a hot storage band; and determine a storage band whose hotness is less than the second threshold as a cold storage band.
  • A shingle-aware file system includes a device end and the host end of the third aspect or of any possible implementation of the third aspect.
  • The device end is configured to: receive a garbage data recovery policy sent by the host end, where the policy includes indications of the to-be-recovered storage band and the target storage band; and recover the to-be-recovered storage band according to the received policy.
  • The device end is further configured to: determine the to-be-recovered storage band and the target storage band according to the garbage data recovery policy; store the valid data of the to-be-recovered storage band into the free data blocks of the target storage band; and recover the invalid data blocks of the to-be-recovered storage band.
  • The host end determines the to-be-recovered storage band, determines a garbage data recovery policy according to the type of the to-be-recovered storage band, and sends the policy to the device end. The device end recovers the garbage data in the to-be-recovered storage band according to the policy. This avoids the shortcoming that garbage collection in DSAFS cannot obtain the I/O access and semantic information of upper-layer applications, and avoids the host CPU usage caused by the data copying of garbage collection in HSAFS, so garbage collection efficiency can be improved.
  • FIG. 1 is a schematic flow chart of a method for recycling garbage data in a tile recording-aware file system according to an embodiment of the present invention.
  • FIG. 2 is another schematic flowchart of a method for recovering garbage data in a tile recording-aware file system according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of a To Clean/Clean To garbage collection method according to an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of a self-recycling garbage collection method in accordance with an embodiment of the present invention.
  • FIG. 5 is still another schematic flowchart of a method for recovering garbage data in a tile recording-aware file system according to an embodiment of the present invention.
  • FIG. 6 is a schematic block diagram of a host side in a tile record aware file system in accordance with an embodiment of the present invention.
  • FIG. 7 is a schematic block diagram of a device end in a tile recording aware file system in accordance with an embodiment of the present invention.
  • FIG. 8 is a schematic block diagram of a tile recording aware file system in accordance with an embodiment of the present invention.
  • FIG. 9 is another schematic block diagram of a host side in a tile recording aware file system in accordance with an embodiment of the present invention.
  • FIG. 1 is a schematic flowchart of a method 100 for recovering garbage data in a tile-aware file system according to an embodiment of the present invention.
  • The tile-aware file system includes a host end and a device end, and the method 100 can be performed by the host end.
  • the method 100 includes:
  • S120 Determine the type of the to-be-recovered storage band, where the types of the to-be-recovered storage band include a hot storage band and a cold storage band.
  • S140 Send the garbage data recovery policy to the device end, so that the device end performs garbage data recovery processing on the to-be-recovered storage band according to the garbage data recovery policy.
  • The host end determines the to-be-recovered Band among a plurality of storage bands and determines a garbage data recovery policy according to the type of the to-be-recovered Band. The policy may indicate the to-be-recovered Band and the target Band used to store the valid data blocks of the to-be-recovered Band. The host end sends the policy to the device end, so that the device end recovers the garbage data in the to-be-recovered Band according to the policy.
  • The host end thus determines the to-be-recovered storage band, determines a garbage data recovery policy according to its type, and sends the policy to the device end, and the device end recovers the garbage data in the to-be-recovered storage band according to the policy. This avoids the shortcoming that garbage collection in DSAFS cannot obtain the I/O access and semantic information of upper-layer applications, and avoids the host CPU usage caused by the data copying of garbage collection in HSAFS, so garbage collection efficiency can be improved.
  • The host end determines the to-be-recovered Band among a plurality of storage bands. Specifically, the host end may determine the invalid data block utilization of each of the Bands and determine the Band with the largest invalid data block utilization as the to-be-recovered Band.
  • The invalid data block utilization of a Band equals the ratio of the number of invalid data blocks (Invalid blocks) in the Band to the total number of data blocks in the Band, where all data blocks comprise the valid data blocks (Valid blocks), invalid data blocks (Invalid blocks), and free data blocks (Free blocks) in the Band.
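The selection rule above can be sketched as follows. This is a minimal illustration only: the `Band` class, its field names, and `pick_band_to_recover` are hypothetical names chosen for the example and do not come from the patent text.

```python
from dataclasses import dataclass


@dataclass
class Band:
    """Hypothetical per-Band block counters (names are assumptions)."""
    band_id: int
    valid: int    # number of valid data blocks
    invalid: int  # number of invalid data blocks
    free: int     # number of free data blocks

    def invalid_utilization(self) -> float:
        # invalid blocks / (valid + invalid + free) blocks
        return self.invalid / (self.valid + self.invalid + self.free)


def pick_band_to_recover(bands):
    """Return the Band with the largest invalid data block utilization."""
    return max(bands, key=lambda b: b.invalid_utilization())


bands = [Band(0, 6, 2, 2), Band(1, 3, 6, 1), Band(2, 1, 1, 8)]
print(pick_band_to_recover(bands).band_id)  # 1 (utilizations: 0.2, 0.6, 0.1)
```

If several Bands tie for the largest utilization, `max` returns the first one encountered; the text does not say how ties should be broken.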
  • The host end determines the type of the to-be-recovered Band; the types of the to-be-recovered Band include a hot storage band (Hot Band) and a cold storage band (Cold Band).
  • The host end can determine the hotness (degree of hot and cold) of each data block in the to-be-recovered Band according to the number of reads and writes, and determine the sum of the hotness of all data blocks in the to-be-recovered Band as the hotness of the to-be-recovered Band.
  • When the hotness of the to-be-recovered Band is greater than or equal to the first threshold, the type of the to-be-recovered Band is determined to be a hot Band; when the hotness of the to-be-recovered Band is less than the first threshold, the type of the to-be-recovered Band is determined to be a cold Band.
  • the first threshold may be determined based on empirical values, and the invention is not limited thereto.
  • The hotness of each data block may be determined from read/write operations by using a Multiple Bloom Filter algorithm, but the present invention is not limited thereto.
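A minimal sketch of the threshold test described above. It assumes, purely for illustration, that the hotness of a data block is its raw read/write count; the Multiple Bloom Filter estimation mentioned in the text is not implemented here, and the function names and the threshold value are hypothetical.

```python
def band_hotness(block_access_counts):
    """Hotness of a Band: the sum of the hotness of its data blocks.

    Assumption: each block's hotness is its raw read/write count.
    """
    return sum(block_access_counts)


def classify_band(block_access_counts, first_threshold):
    """Return 'hot' if the Band hotness reaches the threshold, else 'cold'."""
    if band_hotness(block_access_counts) >= first_threshold:
        return "hot"
    return "cold"


print(classify_band([5, 3, 2], first_threshold=10))  # hot  (10 >= 10)
print(classify_band([1, 0, 2], first_threshold=10))  # cold (3 < 10)
```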
  • The host end may also determine the type of each Band.
  • The host end can determine the hotness of the data blocks included in each Band.
  • The sum of the hotness of all data blocks included in a Band is the hotness of that Band; in this way the hotness of each Band is determined.
  • When the hotness of a Band is greater than or equal to the second threshold, the type of the Band is determined to be a hot Band; when the hotness of the Band is less than the second threshold, the type of the Band is determined to be a cold Band.
  • The second threshold may be determined according to an empirical value. Optionally, the Bands may be sorted by hotness, the 10% of Bands with the highest hotness determined to be hot Bands, and the remaining 90% determined to be cold Bands; the second threshold is then the hotness of the least hot Band among that 10%. The present invention is not limited thereto.
  • The type of the to-be-recovered Band can also be determined by ranking. For example, when the hotness of the to-be-recovered Band is within the top 10% of all Bands, the to-be-recovered Band is determined to be a hot Band; otherwise it is a cold Band. The present invention is not limited thereto.
  • The first threshold and the second threshold may or may not be equal, and the present invention is not limited thereto.
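The ranking variant can be sketched as below. The function name, the dictionary layout, and rounding the hot count up with `math.ceil` are assumptions made for the example; the text only states that the hottest 10% of Bands are treated as hot.

```python
import math


def classify_by_rank(hotness_by_band, hot_fraction=0.10):
    """Mark the hottest fraction of Bands as hot and the rest as cold.

    hotness_by_band maps a Band identifier to its hotness and must be
    non-empty. Returns (types, second_threshold), where second_threshold
    is the hotness of the least hot Band that was still classified as hot.
    """
    ranked = sorted(hotness_by_band, key=hotness_by_band.get, reverse=True)
    n_hot = max(1, math.ceil(len(ranked) * hot_fraction))  # assumption: round up
    hot_ids = set(ranked[:n_hot])
    types = {b: ("hot" if b in hot_ids else "cold") for b in ranked}
    second_threshold = hotness_by_band[ranked[n_hot - 1]]
    return types, second_threshold


hotness = {band_id: band_id * 10 for band_id in range(10)}  # Bands 0..9
types, threshold = classify_by_rank(hotness)
print(types[9], types[8], threshold)  # hot cold 90
```

Note that a pure rank cut and a threshold cut can differ when several Bands share the boundary hotness; this sketch cuts by rank.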
  • The host end determines a garbage data recovery policy according to the type of the to-be-recovered Band. Specifically, after the host end determines the type of the to-be-recovered Band, for example that it is a hot Band, the host end determines all hot Bands other than the to-be-recovered Band among all the Bands and determines the free data block utilization of each of these hot Bands.
  • When the free data block utilization of at least one hot Band is greater than or equal to the valid data block utilization of the to-be-recovered Band, the Band with the smallest free data block utilization among the at least one hot Band is determined as the first target Band; when no hot Band has a free data block utilization greater than or equal to the valid data block utilization of the to-be-recovered Band, the to-be-recovered Band itself is determined as the first target Band.
  • Similarly, when the to-be-recovered Band is a cold Band, the host end determines all cold Bands (Cold Bands) other than the to-be-recovered Band among the plurality of Bands and determines, among these cold Bands, at least one cold Band whose free data block utilization is greater than or equal to the valid data block utilization of the to-be-recovered Band. The Band with the smallest free data block utilization among the at least one cold Band is determined as the second target Band; when no such cold Band exists, the to-be-recovered Band itself is determined as the second target Band.
  • the target Band is used to store a valid data block in the to-be-recovered Band, and the target Band includes a first target Band and a second target Band.
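The target selection just described amounts to a best-fit search restricted to Bands of the same type. The sketch below is illustrative only; the `Band` class, the `kind` field, and `pick_target_band` are hypothetical names, not identifiers from the patent.

```python
from dataclasses import dataclass


@dataclass
class Band:
    """Hypothetical Band descriptor (field names are assumptions)."""
    band_id: int
    kind: str     # "hot" or "cold"
    valid: int    # number of valid data blocks
    invalid: int  # number of invalid data blocks
    free: int     # number of free data blocks

    def _total(self) -> int:
        return self.valid + self.invalid + self.free

    def valid_utilization(self) -> float:
        return self.valid / self._total()

    def free_utilization(self) -> float:
        return self.free / self._total()


def pick_target_band(victim, bands):
    """Best fit among same-type Bands; fall back to the victim itself."""
    candidates = [
        b for b in bands
        if b.band_id != victim.band_id
        and b.kind == victim.kind
        and b.free_utilization() >= victim.valid_utilization()
    ]
    if not candidates:
        return victim  # no suitable Band: the victim is its own target
    return min(candidates, key=lambda b: b.free_utilization())


victim = Band(0, "hot", valid=2, invalid=7, free=1)
bands = [victim,
         Band(1, "hot", 5, 0, 5),
         Band(2, "hot", 6, 1, 3),
         Band(3, "cold", 1, 0, 9),
         Band(4, "hot", 9, 0, 1)]
print(pick_target_band(victim, bands).band_id)  # 2

cold_victim = Band(5, "cold", valid=8, invalid=2, free=0)
print(pick_target_band(cold_victim, [cold_victim, Band(6, "cold", 9, 0, 1)]).band_id)  # 5
```

Choosing the smallest sufficient free utilization keeps emptier Bands available for later, larger recoveries.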
  • The garbage data recovery processing mode may be To Clean/Clean To or self-recycling (Self-GC).
  • When the determined target Band is a Band other than the to-be-recovered Band, the To Clean/Clean To mode is adopted: the to-be-recovered Band is the To Clean Band and the target Band is the Clean To Band.
  • When the determined target Band is the to-be-recovered Band itself, the self-recycling (Self-GC) mode is adopted.
  • The free data block utilization of a Band equals the ratio of the number of free data blocks (Free blocks) in the Band to the number of all data blocks in the Band, where all data blocks comprise the valid data blocks (Valid blocks), invalid data blocks (Invalid blocks), and free data blocks (Free blocks) in the Band.
  • The host end determines a garbage data recovery policy according to the determined recovery mode. Specifically, when the target Band determined by the host end (the first target Band or the second target Band) is not the to-be-recovered Band itself, the To Clean/Clean To recovery mode is adopted.
  • The garbage data recovery policy can be determined as <To_Clean_Band, To_Clean_Valid_Bitmap, To_Clean_Valid_Size, Clean_To_Band>, where To_Clean_Band is an identifier of the to-be-recovered Band, such as a Band ID or a name, and the Band indicated by this identifier will have its garbage cleaned; To_Clean_Valid_Bitmap is a bitmap of the data blocks in the to-be-recovered Band, indicating which blocks in the Band are valid data blocks and which are invalid data blocks, so that the device end can identify the valid data.
  • To_Clean_Valid_Size is the total size of all valid data in the to-be-recovered Band.
  • Clean_To_Band is an identifier of the target Band; the Band indicated by this identifier is the target of the to-be-recovered Band being cleaned.
  • That is, the valid data blocks of the to-be-recovered Band being cleaned are written to the target Band.
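One way to picture the four-field policy is as a small immutable record. The sketch below is an assumption-laden illustration: the class name `GcPolicy`, the helper `build_policy`, the use of integers as Band identifiers, and the 4096-byte block size are all hypothetical; only the four field names mirror the text.

```python
from dataclasses import dataclass
from typing import Tuple

BLOCK_SIZE = 4096  # hypothetical block size in bytes; not specified in the text


@dataclass(frozen=True)
class GcPolicy:
    """<To_Clean_Band, To_Clean_Valid_Bitmap, To_Clean_Valid_Size, Clean_To_Band>"""
    to_clean_band: int                       # identifier of the Band to clean
    to_clean_valid_bitmap: Tuple[bool, ...]  # True marks a valid data block
    to_clean_valid_size: int                 # total size of valid data, in bytes
    clean_to_band: int                       # identifier of the target Band


def build_policy(victim_id, valid_bitmap, target_id, block_size=BLOCK_SIZE):
    """Assemble a policy record from a per-block validity bitmap."""
    bitmap = tuple(bool(v) for v in valid_bitmap)
    return GcPolicy(victim_id, bitmap, sum(bitmap) * block_size, target_id)


policy = build_policy(victim_id=3, valid_bitmap=[1, 0, 1, 1, 0], target_id=7)
print(policy.to_clean_valid_size)  # 12288 (3 valid blocks * 4096 bytes)
```

Carrying the valid size alongside the bitmap lets the receiver check up front that the target has enough free space, although the text does not state that the device end performs such a check.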
  • The host end may send the determined garbage data recovery policy to the device end, so that the device end performs garbage data recovery processing on the to-be-recovered Band according to the policy: the valid data blocks in the to-be-recovered Band are retained and the invalid data blocks are deleted, thereby releasing the space of the Band.
  • The magnitude of the sequence numbers of the above processes does not imply an order of execution. The order of execution of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.
  • The host end determines the to-be-recovered Band and determines a garbage data recovery policy according to the type of the to-be-recovered Band, so that data blocks in a hot storage band (Hot Band) remain in hot storage bands and data blocks in a Cold Band remain in cold storage bands.
  • A Hot Band contains hot data blocks, which are updated frequently, so few valid data blocks remain in it when garbage data is recovered; a Cold Band contains cold data blocks, which are updated infrequently and basically require no garbage collection. This accelerates garbage collection and reduces copy operations on valid data blocks.
  • In addition, the host end sends the garbage data recovery policy to the device end, and the device end recovers the garbage data in the to-be-recovered Band according to the policy. This avoids the shortcoming that garbage collection in DSAFS cannot obtain the I/O access and semantic information of upper-layer applications, and avoids the host CPU usage caused by garbage collection in HSAFS, which can improve garbage collection efficiency.
  • With reference to FIG. 1, a method for recovering garbage data in a tile recording-aware file system according to an embodiment of the present invention has been described in detail above from the perspective of the host end.
  • A method for recovering garbage data in a tile recording-aware file system according to an embodiment of the present invention is described below from the perspective of the device end in conjunction with FIG.
  • the method 200 includes:
  • S220 Recover the to-be-recovered storage band according to the garbage data recovery policy.
  • The host end determines the garbage data recovery policy and sends it to the device end, and the device end recovers the garbage data in the to-be-recovered Band according to the policy. This avoids the shortcoming that garbage collection in DSAFS cannot obtain the I/O access and semantic information of upper-layer applications, and avoids the host CPU usage caused by the data copying of garbage collection in HSAFS, which can improve garbage collection efficiency.
  • The device end receives the garbage data recovery policy sent by the host end; the policy instructs the device end to perform garbage data recovery processing on the storage band (Band) that needs to be recovered.
  • The device end recovers garbage data according to the garbage data recovery policy. Specifically, the device end may first determine the to-be-recovered Band and the target Band according to the policy.
  • When the recovery mode is To Clean/Clean To, the garbage data recovery policy can be <To_Clean_Band, To_Clean_Valid_Bitmap, To_Clean_Valid_Size, Clean_To_Band>. According to the policy, the device end determines the to-be-recovered Band (To_Clean_Band) and the target Band (Clean_To_Band), merges the valid data blocks of the to-be-recovered Band into the target Band, and deletes the invalid data blocks in the to-be-recovered Band, thereby freeing space.
  • In the self-recycling (Self-GC) mode, a solid-line box in the to-be-recovered Band denotes a valid data block and a dashed-line box denotes an invalid data block. The to-be-recovered Band is recovered in place: the invalid data blocks are deleted and the valid data blocks are retained.
  • Specifically, the valid data blocks of the to-be-recovered Band (such as A1, A2, A3, and A7 in Figure 3) are copied to the random-access memory (RAM) in the device end, and the to-be-recovered Band is then emptied. The valid data blocks held in RAM are then copied back to the emptied Band, which in this mode is also the target Band.
  • After each data block is written, Band Write Pointer = Band Write Pointer + 1. Therefore, when the to-be-recovered Band is emptied, its Band Write Pointer is reset to 0; after the valid data in RAM has been written back, the Band Write Pointer points to the position following data block A7.
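The self-recycling steps can be sketched with a Band modelled as a Python list of block slots. This is an illustration under stated assumptions: the function name `self_gc`, the list model, and the block labels A4 to A6 are hypothetical (the text names only A1, A2, A3, and A7).

```python
def self_gc(band, valid_bitmap):
    """Compact a Band in place and return its new Band Write Pointer.

    band         : list of block slots (None marks an empty slot)
    valid_bitmap : one truthy/falsy flag per slot, truthy = valid block
    """
    # 1. Copy the valid data blocks to a RAM buffer.
    ram = [blk for blk, ok in zip(band, valid_bitmap) if ok]
    # 2. Empty the Band and reset its write pointer to 0.
    for i in range(len(band)):
        band[i] = None
    write_pointer = 0
    # 3. Append the buffered blocks sequentially, advancing the pointer by 1.
    for blk in ram:
        band[write_pointer] = blk
        write_pointer += 1
    return write_pointer


band = ["A1", "A2", "A3", "A4", "A5", "A6", "A7", None]
pointer = self_gc(band, [1, 1, 1, 0, 0, 0, 1, 0])
print(band[:4], pointer)  # ['A1', 'A2', 'A3', 'A7'] 4
```

After the call the write pointer indexes the slot right after A7, matching the description above.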
  • In the To Clean/Clean To mode, a solid-line box in the to-be-recovered Band denotes a valid data block and a dashed-line box denotes an invalid data block. In the target Band, a labeled solid-line box denotes a valid data block and an unlabeled solid-line box denotes a free data block. The device end writes the valid data blocks of the to-be-recovered Band into the free data blocks of the target Band and then recovers the to-be-recovered Band as a whole, for example by deleting its data and releasing the space.
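A comparable sketch for the To Clean/Clean To mode, again modelling Bands as lists of block slots. The function name `clean_to`, the explicit write-pointer argument, and the block labels are assumptions for illustration; the text does not describe the device-side interface at this level.

```python
def clean_to(victim, valid_bitmap, target, target_write_pointer):
    """Append the victim's valid blocks to the target, then empty the victim.

    Returns the target's new write pointer. Raises ValueError if the
    target runs out of free slots.
    """
    for blk, ok in zip(victim, valid_bitmap):
        if not ok:
            continue  # invalid blocks are simply dropped
        if target_write_pointer >= len(target):
            raise ValueError("target Band has no free data block left")
        target[target_write_pointer] = blk  # append-only write
        target_write_pointer += 1
    # Recover the victim Band as a whole.
    for i in range(len(victim)):
        victim[i] = None
    return target_write_pointer


victim = ["A1", "A2", "A3", "A4"]
target = ["B1", "B2", None, None, None]
new_pointer = clean_to(victim, [1, 0, 1, 0], target, target_write_pointer=2)
print(target, new_pointer)  # ['B1', 'B2', 'A1', 'A3', None] 4
```

The host-side rule that the target's free utilization must be at least the victim's valid utilization is what should keep the `ValueError` branch from triggering when Bands are of equal size.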
  • The magnitude of the sequence numbers of the above processes does not imply an order of execution. The order of execution of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.
  • The host end determines the garbage data recovery policy and sends it to the device end, and the device end recovers the garbage data in the to-be-recovered Band according to the policy. This avoids the shortcoming that garbage collection in DSAFS cannot obtain the I/O access and semantic information of upper-layer applications, and avoids the host CPU usage caused by the data copying of garbage collection in HSAFS, which can improve garbage collection efficiency.
  • FIG. 5 is still another schematic flowchart of a method for recovering garbage data in a tile recording-aware file system according to an embodiment of the present invention, as shown in FIG. 5:
  • the host determines the to-be-recovered Band in a plurality of storage bands.
  • The host side can separately calculate the invalid data block utilization of each Band. The invalid data block utilization of a Band equals the ratio of the number of invalid data blocks (Invalid blocks) in the Band to the total number of data blocks in the Band, where the total includes the valid blocks (Valid blocks), invalid blocks (Invalid blocks), and free blocks (Free blocks) in the Band.
  • The Band with the largest invalid data block utilization among the multiple Bands is determined as the Band to be reclaimed.
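  • As a rough illustration of this selection step, the sketch below computes the invalid data block utilization of each Band and picks the largest. The `BandCounts` type, the band names, and the block counts are hypothetical values chosen for the example.

```python
from typing import Dict, NamedTuple


class BandCounts(NamedTuple):
    valid: int
    invalid: int
    free: int


def invalid_utilization(c: BandCounts) -> float:
    """Invalid blocks divided by all blocks (valid + invalid + free)."""
    total = c.valid + c.invalid + c.free
    return c.invalid / total if total else 0.0


def pick_band_to_reclaim(bands: Dict[str, BandCounts]) -> str:
    """Return the name of the band with the largest invalid data block utilization."""
    return max(bands, key=lambda name: invalid_utilization(bands[name]))


bands = {
    "band0": BandCounts(valid=4, invalid=3, free=1),
    "band1": BandCounts(valid=2, invalid=5, free=1),
    "band2": BandCounts(valid=6, invalid=1, free=1),
}
chosen = pick_band_to_reclaim(bands)
print(chosen)
```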
  • the host determines the type of the to-be-recovered Band.
  • When the type of the to-be-recovered Band is determined to be a hot storage band (Hot Band), the process goes to S303; when the type of the to-be-recovered Band is determined to be a cold storage band (Cold Band), the process goes to S304.
  • The host determines all the hot bands other than the Band to be reclaimed among the plurality of Bands and determines the free data block utilization of each of these hot bands. The free data block utilization of a Band equals the ratio of the number of free data blocks in the Band to the number of all data blocks in the Band.
  • The host determines all the cold bands other than the Band to be reclaimed among the plurality of Bands and determines the free data block utilization of each of these cold bands; the free data block utilization is calculated in the same way as for the hot bands.
  • Among these cold bands, the host determines those whose free data block utilization is greater than or equal to the valid data block utilization of the to-be-recovered Band. If at least one such cold band exists, the at least one cold band is determined as a candidate cold band and S306 is executed; if no such cold band exists, S307 is executed.
  • the host side determines, in the candidate hot band, the candidate hot band that minimizes the idle block utilization as the target band.
  • the host side determines, in the candidate cold Band, the candidate cold Band that minimizes the idle data block utilization as the target Band.
  • the host determines the to-be-recovered Band as the target Band.
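  • The target selection in the steps above can be sketched as a best-fit search among Bands of the same type. The sketch assumes that the threshold is the valid (effective) data block utilization of the Band to be reclaimed, as in the later description of the third determining module; `BandInfo` and `choose_target` are hypothetical names.

```python
from typing import Dict, NamedTuple


class BandInfo(NamedTuple):
    hot: bool        # True for a hot band, False for a cold band
    valid: int
    invalid: int
    free: int

    @property
    def total(self) -> int:
        return self.valid + self.invalid + self.free


def choose_target(to_clean: str, bands: Dict[str, BandInfo]) -> str:
    """Pick the target band for `to_clean`; fall back to the band itself (Self-GC)."""
    src = bands[to_clean]
    needed = src.valid / src.total
    # Candidates: other bands of the same type with enough free space.
    candidates = [
        name for name, b in bands.items()
        if name != to_clean and b.hot == src.hot and b.free / b.total >= needed
    ]
    if not candidates:
        return to_clean
    # Among the candidates, take the one with the smallest free utilization.
    return min(candidates, key=lambda name: bands[name].free / bands[name].total)


bands = {
    "h0": BandInfo(hot=True, valid=2, invalid=5, free=1),
    "h1": BandInfo(hot=True, valid=3, invalid=0, free=5),
    "h2": BandInfo(hot=True, valid=5, invalid=1, free=2),
    "c0": BandInfo(hot=False, valid=1, invalid=0, free=7),
}
print(choose_target("h0", bands), choose_target("c0", bands))
```

  • Choosing the candidate with the smallest sufficient free utilization keeps emptier Bands available for later, larger moves; when no candidate exists, returning the Band itself corresponds to the Self-GC case.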
  • the host determines a garbage data recovery policy and sends the policy to the device.
  • the garbage data recovery policy includes instructing the device side to adopt the method of To Clean/Clean To or Self-GC to perform garbage data recovery.
  • The garbage data recovery policy is <To_Clean_Band, To_Clean_Valid_Bitmap, To_Clean_Valid_Size, Clean_To_Band>, where To_Clean_Band is an identifier of the Band to be reclaimed, such as a Band ID or a name; the Band indicated by this identifier will undergo garbage cleanup.
  • To_Clean_Valid_Bitmap is a bitmap of the valid data blocks in the Band to be reclaimed.
  • To_Clean_Valid_Size is the total size of all valid data in the Band to be reclaimed.
  • Clean_To_Band is an identifier of the target Band, that is, the Band into which the valid data blocks of the Band to be reclaimed are written.
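  • One possible in-memory form of this four-field policy is sketched below. The field types, the fixed `BLOCK_SIZE`, and the helper `build_policy` are assumptions for illustration; the embodiment only names the four fields.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

BLOCK_SIZE = 4096  # bytes per data block; an assumed value, not given by the embodiment


@dataclass(frozen=True)
class GarbagePolicy:
    to_clean_band: int                        # To_Clean_Band
    to_clean_valid_bitmap: Tuple[bool, ...]   # To_Clean_Valid_Bitmap
    to_clean_valid_size: int                  # To_Clean_Valid_Size, in bytes
    clean_to_band: int                        # Clean_To_Band

    @property
    def method(self) -> str:
        # If the target is the band itself, the band is recovered in place.
        if self.clean_to_band == self.to_clean_band:
            return "Self-GC"
        return "To Clean/Clean To"


def build_policy(to_clean_band: int, valid_flags: Iterable[int],
                 clean_to_band: int) -> GarbagePolicy:
    bitmap = tuple(bool(flag) for flag in valid_flags)
    valid_size = sum(bitmap) * BLOCK_SIZE
    return GarbagePolicy(to_clean_band, bitmap, valid_size, clean_to_band)


policy = build_policy(3, [1, 1, 1, 0, 0, 0, 1], clean_to_band=5)
print(policy.method, policy.to_clean_valid_size)
```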
  • The device side processes the to-be-recovered Band using the self-recovery (Self-GC) method, deleting the invalid data blocks and freeing up space.
  • The device side processes the to-be-recovered Band in the To Clean/Clean To manner and releases the space of the Band to be reclaimed.
  • It should be understood that the sequence numbers of the above processes do not imply an order of execution. The order of execution of each process should be determined by its function and internal logic, and the sequence numbers should not constitute any limitation on the implementation process of the embodiments of the present invention.
  • The host side determines the storage band to be recovered and determines a garbage data recovery policy according to the type of that storage band. The host side sends the policy to the device side, and the device side recovers the garbage data in the storage band to be recovered according to the policy. This avoids the drawback of garbage collection in DSAFS, which cannot obtain the I/O access and semantic information of upper-layer applications, and also avoids the occupation of the host CPU by the data copying of garbage collection in HSAFS, so garbage collection efficiency can be improved.
  • The method for recovering garbage data in a tile recording-aware file system according to an embodiment of the present invention is described in detail above with reference to FIG. 1 to FIG. 5. The apparatus for recovering garbage data in a tile recording-aware file system according to an embodiment of the present invention is described below with reference to FIG. 6 and FIG. 7.
  • the apparatus for recovering garbage data in the tile recording-aware file system of the embodiment of the present invention includes a host end and a device end.
  • the host side 400 in the tile record sensing file system according to the embodiment of the present invention includes:
  • a first determining module 410 configured to determine, in a plurality of storage bands, a storage tape to be recycled
  • a second determining module 420 configured to determine a type of the storage tape to be recycled, the type of the storage tape to be recycled includes a hot storage tape and a cold storage tape;
  • the third determining module 430 is configured to determine a garbage data recovery policy according to the type of the to-be-recovered storage tape determined by the second determining module 420;
  • the sending module 440 is configured to send the garbage data recovery policy determined by the third determining module 430 to the device end, so that the device end performs garbage data recycling processing on the to-be-recovered storage tape according to the garbage data recycling policy.
  • The host side determines the to-be-recovered Band among the plurality of storage bands (Bands) through the first determining module 410 and determines the type of the to-be-recovered Band through the second determining module 420. The third determining module 430 determines a garbage data recovery policy according to the type of the to-be-recovered Band; the policy may indicate the to-be-recovered Band and a target Band. The sending module 440 of the host side sends the garbage data recovery policy determined by the third determining module 430 to the device side, so that the device side recovers the garbage data in the to-be-recovered Band according to the policy.
  • The host side determines the storage band to be recovered through the first determining module and determines a garbage data recovery policy according to the type of that storage band. The host side sends the garbage data recovery policy to the device side through the sending module, so that the device side can recover the garbage data in the storage band to be recovered according to the policy. This avoids the drawback of garbage collection in DSAFS, which cannot obtain the I/O access and semantic information of upper-layer applications, and also avoids the occupation of the host CPU by the data copying of garbage collection in HSAFS, so garbage collection efficiency can be improved.
  • the first determining module 410 of the host side determines the to-be-recovered Band in a plurality of storage bands. Specifically, the host side may determine the invalid data block utilization of each of the multiple bands, and determine the Band with the largest invalid data block utilization as the to-be-recovered Band.
  • The invalid data block utilization of a Band equals the ratio of the number of invalid data blocks (Invalid blocks) in the Band to the total number of data blocks in the Band, where the total includes the valid blocks (Valid blocks), invalid blocks (Invalid blocks), and free blocks (Free blocks) in the Band.
  • The second determining module 420 of the host side determines the type of the to-be-recovered Band determined by the first determining module 410. The type of the to-be-recovered Band includes a hot storage band (Hot Band) and a cold storage band (Cold Band).
  • The host side can determine the hot and cold degree of each data block in the to-be-recovered Band according to its number of reads and writes, and determine the sum of the hot and cold degrees of all the data blocks in the to-be-recovered Band as the hot and cold degree of the to-be-recovered Band.
  • When the hot and cold degree of the to-be-recovered Band is greater than or equal to a first threshold, the type of the to-be-recovered Band is determined to be a hot band; when the hot and cold degree of the to-be-recovered Band is less than the first threshold, the type of the to-be-recovered Band is determined to be a cold band.
  • the first threshold may be determined based on empirical values, and the invention is not limited thereto.
  • The hot and cold degree of each data block may be determined from its read/write operations by using the Multiple Bloom Filter algorithm, but the present invention is not limited thereto.
  • The host side may also determine the type of each Band. Specifically, the host side can determine the hot and cold degree of the data blocks included in each Band; the sum of the hot and cold degrees of all the data blocks included in a Band is the hot and cold degree of that Band. When the hot and cold degree of a Band is greater than or equal to a second threshold, the type of the Band is determined to be a hot Band; when the hot and cold degree of the Band is less than the second threshold, the type of the Band is determined to be a cold Band.
  • The second threshold may be determined according to an empirical value. Optionally, the Bands may be sorted by hot and cold degree, the 10% of Bands with the largest hot and cold degree determined to be hot Bands, and the remaining 90% determined to be cold Bands; in this case the second threshold is the hot and cold degree of the Band with the smallest hot and cold degree among that 10%. The present invention is not limited thereto.
  • The type of the to-be-recovered Band can also be determined by ranking. For example, when the hot and cold degree of the to-be-recovered Band ranks in the top 10% of all the Bands, the to-be-recovered Band is determined to be a hot Band; otherwise it is a cold Band. The present invention is not limited thereto.
  • the first threshold and the second threshold may or may not be equal, and the present invention is not limited thereto.
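  • The two classification approaches described above (a fixed threshold, or ranking with the top 10% treated as hot) can be sketched as follows. Plain access counts stand in for the per-block hot and cold degree, the Multiple Bloom Filter step is not modeled, and rounding the 10% cut upward is an assumption; all names are hypothetical.

```python
from typing import Dict, List


def band_hotness(block_access_counts: List[int]) -> int:
    """Hot and cold degree of a band: the sum over its data blocks."""
    return sum(block_access_counts)


def classify_by_threshold(hotness: int, threshold: int) -> str:
    return "hot" if hotness >= threshold else "cold"


def classify_by_rank(hotness_by_band: Dict[str, int], hot_percent: int = 10) -> Dict[str, str]:
    """Label the top `hot_percent` percent of bands (by hotness) as hot, the rest as cold."""
    ranked = sorted(hotness_by_band, key=lambda name: hotness_by_band[name], reverse=True)
    n_hot = (len(ranked) * hot_percent + 99) // 100   # round up (assumed rounding rule)
    hot = set(ranked[:n_hot])
    return {name: ("hot" if name in hot else "cold") for name in hotness_by_band}


hotness = {f"band{i}": band_hotness([i, 2 * i]) for i in range(10)}
labels = classify_by_rank(hotness)
print(labels)
```

  • With ten Bands and a 10% cut, exactly one Band is labeled hot in this sketch; the hot and cold degree of that Band then plays the role of the second threshold.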
  • The third determining module 430 of the host side determines the garbage data recovery policy according to the type of the to-be-recovered Band. Specifically, after the host side determines the type of the to-be-recovered Band, for example that it is a hot band, the host side determines all the hot bands other than the to-be-recovered Band among all the Bands and determines the free data block utilization of each of these hot bands.
  • When the free data block utilization of at least one hot band is greater than or equal to the valid data block utilization of the to-be-recovered Band, the Band with the smallest free data block utilization among the at least one hot band is determined as the first target Band; when no hot band has a free data block utilization greater than or equal to the valid data block utilization of the to-be-recovered Band, the to-be-recovered Band itself is determined as the first target Band.
  • Similarly, when the to-be-recovered Band is a cold band, the host side determines all the cold bands (Cold Bands) other than the to-be-recovered Band among the plurality of Bands and determines among them at least one cold band whose free data block utilization is greater than or equal to the valid data block utilization of the to-be-recovered Band; the Band with the smallest free data block utilization among the at least one cold band is determined as the second target Band. When no such cold band exists, the to-be-recovered Band itself is determined as the second target Band.
  • the target Band is used to store a valid data block in the to-be-recovered Band, and the target Band includes a first target Band and a second target Band.
  • the garbage data recycling processing method may include To Clean/Clean To and Self-GC.
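  • When the target Band is not the to-be-recovered Band itself, the To Clean/Clean To method is adopted, in which the to-be-recovered storage band is the To Clean Band and the target storage band is the Clean To Band;
  • when the target Band is the to-be-recovered Band itself, the self-recovery (Self-GC) method is adopted.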
  • The free data block utilization of a Band equals the ratio of the number of free data blocks in the Band to the number of all data blocks in the Band, where all data blocks include the valid data blocks (Valid blocks), invalid data blocks (Invalid blocks), and free data blocks (Free blocks) in the Band.
  • The host side determines a garbage data recovery policy according to the determined recovery method. Specifically, when the target Band determined by the host side (the first target Band or the second target Band) is not the to-be-recovered Band itself, the To Clean/Clean To recovery method is adopted.
  • The garbage data recovery policy can be determined as <To_Clean_Band, To_Clean_Valid_Bitmap, To_Clean_Valid_Size, Clean_To_Band>, where To_Clean_Band is an identifier of the Band to be reclaimed, such as a Band ID or a name, and the Band indicated by this identifier will undergo garbage cleanup. To_Clean_Valid_Bitmap is a bitmap of the valid data blocks in the to-be-recovered Band; it indicates which blocks of the Band are valid data blocks and which are invalid data blocks, which tells the device side where the valid data is.
  • To_Clean_Valid_Size is the total size of all valid data in the Band to be reclaimed.
  • Clean_To_Band is an identifier of the target Band; the Band indicated by this identifier is the target of the to-be-recovered Band being cleaned, that is, the valid data blocks of the to-be-recovered Band are written to this target Band.
  • the sending module of the host may send the garbage data recovery policy determined by the third determining module 430 to the device, so that the device performs garbage data recovery processing on the to-be-recovered Band according to the garbage data recovery policy.
  • the valid data block in the to-be-recovered Band is reserved, and the invalid data block is deleted, thereby releasing the space of the Band.
  • The host side 400 in the tile recording-aware file system according to the embodiment of the present invention may correspond to the entity performing the method 100 of the embodiment of the present invention, and the above and other operations and/or functions of the modules in the host side 400 are respectively intended to implement the corresponding processes of the method in FIG. 1. For brevity, details are not repeated here.
  • The host side in the tile recording-aware file system of the embodiment of the present invention determines the to-be-recovered Band and determines a garbage data recovery policy according to the type of the to-be-recovered Band, so that the data blocks of a hot storage band (Hot Band) remain in hot storage bands and the data blocks of a cold storage band (Cold Band) remain in cold storage bands.
  • Because a Hot Band holds hot data blocks, which are updated frequently, few valid data blocks remain when its garbage data is recovered; a Cold Band holds cold data blocks, which are updated infrequently, so it basically requires no garbage collection. This accelerates the copying of valid data blocks during garbage collection.
  • The host side sends the garbage data recovery policy to the device side, so that the device side can recover the garbage data in the to-be-recovered Band according to the policy. This avoids the drawback of garbage collection in DSAFS, which cannot obtain the I/O access and semantic information of upper-layer applications, and also avoids the occupation of the host CPU by the data copying of garbage collection in HSAFS, so garbage collection efficiency can be improved.
  • FIG. 7 shows a schematic block diagram of a device end in a tile recording aware file system in accordance with another embodiment of the present invention.
  • the device end 500 in the tile recording-aware file system according to the embodiment of the present invention includes:
  • the receiving module 510 is configured to receive a garbage data recovery policy sent by the host, where the garbage data recovery policy includes an indicator of a storage tape to be recycled and a target storage tape;
  • the processing module 520 is configured to recover the to-be-recovered storage tape according to the garbage data recovery policy received by the receiving module 510.
  • The host side sends the determined garbage data recovery policy to the device side; the receiving module of the device side receives the garbage data recovery policy, and the device side recovers the garbage data in the storage band to be recovered according to the policy. This avoids the drawback of garbage collection in DSAFS, which cannot obtain the I/O access and semantic information of upper-layer applications, and also avoids the occupation of the host CPU by the data copying of garbage collection in HSAFS, so garbage collection efficiency can be improved.
  • the receiving module 510 of the device receives the garbage data recovery policy sent by the host, and the garbage data recovery policy instructs the device to perform garbage data recovery processing on the storage tape (Band) that needs to be recycled.
  • The garbage data recovery policy can be <To_Clean_Band, To_Clean_Valid_Bitmap, To_Clean_Valid_Size, Clean_To_Band>. According to the policy, the device side determines the to-be-recovered Band (To_Clean_Band) and the target Band (Clean_To_Band), merges the valid data blocks of the to-be-recovered Band into the target Band, and deletes the invalid data blocks of the to-be-recovered Band, thereby freeing up space.
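  • On the device side, handling a received policy might be sketched as below. The sketch assumes each Band is a simple list whose length acts as the write pointer, and it omits capacity checks and the To_Clean_Valid_Size field; the function name `apply_policy` and its parameters are hypothetical.

```python
from typing import Dict, List, Sequence


def apply_policy(to_clean_band: int, valid_bitmap: Sequence[bool], clean_to_band: int,
                 bands: Dict[int, List[str]]) -> None:
    """Apply a garbage data recovery policy to `bands` in place.

    The bitmap selects the blocks of the band to be reclaimed that are still valid.
    If `clean_to_band` equals `to_clean_band`, this degenerates to in-place recovery.
    """
    source = bands[to_clean_band]
    survivors = [blk for blk, keep in zip(source, valid_bitmap) if keep]
    bands[to_clean_band] = []                                  # empty the reclaimed band
    bands[clean_to_band] = bands[clean_to_band] + survivors    # append valid blocks to target


# To Clean/Clean To: band 1 is reclaimed into band 2.
bands = {1: ["A1", "A2", "A3"], 2: ["B1"]}
apply_policy(1, [True, False, True], 2, bands)

# Self-GC: band 7 is its own target.
solo = {7: ["A1", "A2", "A3", "A4"]}
apply_policy(7, [True, False, False, True], 7, solo)
print(bands, solo)
```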
  • A solid-line box in the to-be-recovered Band represents a valid data block, and a dotted-line box represents an invalid data block. The to-be-recovered Band is recovered in the self-recovery manner, that is, the invalid data blocks are deleted and the valid data blocks are retained.
  • The valid data blocks of the to-be-recovered Band (such as A1, A2, A3, and A7 in Figure 3) are copied to the random-access memory (RAM) of the device side, and the to-be-recovered Band is then emptied. The valid data blocks held in the RAM are then copied back into the emptied to-be-recovered Band, which here is also the target Band.
  • The Band Write Pointer is incremented after each write (Band Write Pointer = Band Write Pointer + 1). Therefore, when the to-be-recovered Band is emptied, its Band Write Pointer is reset to 0. After the valid data in the RAM has been written back, the Band Write Pointer indicates the data block following data block A7.
  • A solid-line box in the to-be-recovered Band represents a valid data block, and a dotted-line box represents an invalid data block.
  • In the target Band, a labeled solid-line box represents a valid data block and an unlabeled solid-line box represents a free data block. The device side writes the valid data blocks of the to-be-recovered Band into the free data blocks of the target Band and then reclaims the to-be-recovered Band as a whole, for example by deleting its data and releasing its space.
  • The device side 500 in the tile recording-aware file system according to the embodiment of the present invention may correspond to the entity performing the method 200 of the embodiment of the present invention, and the above and other operations and/or functions of the modules in the device side 500 are respectively intended to implement the corresponding processes of the methods in FIG. 2 to FIG. 4. For brevity, details are not repeated here.
  • The host side sends the determined garbage data recovery policy to the device side; the receiving module of the device side receives the garbage data recovery policy, and the device side recovers the garbage data in the storage band to be recovered according to the policy. This avoids the drawback of garbage collection in DSAFS, which cannot obtain the I/O access and semantic information of upper-layer applications, and also avoids the occupation of the host CPU by the data copying of garbage collection in HSAFS, so garbage collection efficiency can be improved.
  • an embodiment of the present invention further provides a tile record sensing file system 600, including a host end 610 and a device end 620.
  • The host side is configured to determine a storage band to be recovered, determine a garbage data recovery policy according to the type of the storage band to be recovered, and send the garbage data recovery policy to the device side. The device side is configured to receive the garbage data recovery policy sent by the host side, where the policy includes indicators of the storage band to be recovered and the target storage band, and to recover the storage band to be recovered according to the received garbage data recovery policy.
  • the host side 610 may be the host side 400 as shown in FIG. 6, and the device side 620 may be the device side 500 as shown in FIG.
  • the host 610 may further include a first determining module 410, a second determining module 420, a third determining module 430, and a sending module 440 as shown in FIG. 6.
  • The device side 620 may further include the receiving module 510 and the processing module 520 as shown in FIG. 7.
  • The tile recording-aware file system of the embodiment of the present invention includes a host side and a device side. The host side sends the determined garbage data recovery policy to the device side; the receiving module of the device side receives the garbage data recovery policy, and the device side recovers the garbage data in the storage band to be recovered according to the policy. This avoids the drawback of garbage collection in DSAFS, which cannot obtain the I/O access and semantic information of upper-layer applications, and also avoids the occupation of the host CPU by the data copying of garbage collection in HSAFS, so garbage collection efficiency can be improved.
  • an embodiment of the present invention further provides a host end 700, which includes a processor 710, a memory 720, a bus system 730, and a transceiver 740.
  • The processor 710, the memory 720, and the transceiver 740 are coupled by the bus system 730. The memory 720 is configured to store instructions, and the processor 710 is configured to execute the instructions stored in the memory 720 to control the transceiver 740 to receive signals.
  • The processor 710 is configured to: determine a storage band to be recovered among a plurality of storage bands; determine the type of the storage band to be recovered, where the type includes a hot storage band and a cold storage band; and determine a garbage data recovery policy according to the type of the storage band to be recovered. The transceiver 740 is configured to send the garbage data recovery policy to the device side, so that the device side performs garbage data recovery processing on the storage band to be recovered according to the garbage data recovery policy.
  • The host side in the tile recording-aware file system of the embodiment of the present invention determines the to-be-recovered Band and determines a garbage data recovery policy according to the type of the to-be-recovered Band, so that the data blocks of a hot storage band (Hot Band) remain in hot storage bands and the data blocks of a cold storage band (Cold Band) remain in cold storage bands.
  • Because a Hot Band holds hot data blocks, which are updated frequently, few valid data blocks remain when its garbage data is recovered; a Cold Band holds cold data blocks, which are updated infrequently, so it basically requires no garbage collection. This accelerates the copying of valid data blocks during garbage collection.
  • The processor 710 may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
  • The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
  • the memory 720 can include read only memory and random access memory and provides instructions and data to the processor 710. A portion of the memory 720 can also include a non-volatile random access memory. For example, the memory 720 can also store information of the device type.
  • the bus system 730 may include a power bus, a control bus, a status signal bus, and the like in addition to the data bus. However, for clarity of description, various buses are labeled as bus system 730 in the figure.
  • each step of the foregoing method may be completed by an integrated logic circuit of hardware in the processor 710 or an instruction in a form of software.
  • the steps of the method disclosed in the embodiments of the present invention may be directly implemented as a hardware processor, or may be performed by a combination of hardware and software modules in the processor.
  • the software module can be located in a conventional storage medium such as random access memory, flash memory, read only memory, programmable read only memory or electrically erasable programmable memory, registers, and the like.
  • the storage medium is located in memory 720, and processor 710 reads the information in memory 720 and, in conjunction with its hardware, performs the steps of the above method. To avoid repetition, it will not be described in detail here.
  • The processor 710 may call the program code stored in the memory 720 to perform the following operations: determining the invalid data block utilization of each of the plurality of storage bands; and determining the storage band with the largest invalid data block utilization as the storage band to be recovered.
  • The processor 710 may call the program code stored in the memory 720 to perform the following operations: when the type of the storage band to be recovered is a hot storage band, determining a first target storage band among the hot storage bands of the plurality of storage bands, and determining the garbage data recovery policy, which instructs the device side to store the valid data blocks of the storage band to be recovered into the free data blocks of the first target storage band; when the type of the storage band to be recovered is a cold storage band, determining a second target storage band among the cold storage bands of the plurality of storage bands, and determining the garbage data recovery policy, which instructs the device side to store the valid data blocks of the storage band to be recovered into the free data blocks of the second target storage band.
  • The processor 710 may invoke the program code stored in the memory 720 to perform the following operations: when the type of the storage band to be recovered is a hot storage band, determining the valid data block utilization of the storage band to be recovered; determining the free data block utilization of the hot storage bands among the plurality of storage bands; when, among the hot storage bands other than the storage band to be recovered, at least one hot storage band has a free data block utilization greater than or equal to the valid data block utilization of the storage band to be recovered, determining the hot storage band with the smallest free data block utilization among the at least one hot storage band as the first target storage band; and when, among the hot storage bands other than the storage band to be recovered, no hot storage band has a free data block utilization greater than or equal to the valid data block utilization of the storage band to be recovered, determining the storage band to be recovered as the first target storage band.
  • The processor 710 may invoke the program code stored in the memory 720 to perform the following operations: when the type of the storage band to be recovered is a cold storage band, determining the valid data block utilization of the storage band to be recovered; determining the free data block utilization of the cold storage bands among the plurality of storage bands; when, among the cold storage bands other than the storage band to be recovered, at least one cold storage band has a free data block utilization greater than or equal to the valid data block utilization of the storage band to be recovered, determining the cold storage band with the smallest free data block utilization among the at least one cold storage band as the second target storage band; and when, among the cold storage bands other than the storage band to be recovered, no cold storage band has a free data block utilization greater than or equal to the valid data block utilization of the storage band to be recovered, determining the storage band to be recovered as the second target storage band.
  • The processor 710 may call the program code stored in the memory 720 to perform the following operations: determining the hot and cold degree of each data block in the storage band to be recovered according to its number of reads and writes; determining the sum of the hot and cold degrees of the data blocks in the storage band to be recovered as the hot and cold degree of the storage band to be recovered; and when the hot and cold degree of the storage band to be recovered is greater than or equal to the first threshold, determining that the type of the storage band to be recovered is a hot storage band, or when the hot and cold degree of the storage band to be recovered is less than the first threshold, determining that the type of the storage band to be recovered is a cold storage band.
  • The processor 710 may invoke the program code stored in the memory 720 to perform the following operations: determining the hot and cold degree of each of the plurality of storage bands; determining the storage bands whose hot and cold degree is greater than or equal to the second threshold as hot storage bands; and determining the storage bands whose hot and cold degree is less than the second threshold as cold storage bands.
  • The transceiver 740 is configured to send the garbage data recovery policy to the device side, where the policy includes indicators of the storage band to be recovered and the target storage band, so that the device side, according to the policy, stores the valid data of the storage band to be recovered into the free data blocks of the target storage band and reclaims the invalid data blocks of the storage band to be recovered. The target storage band includes the first target storage band or the second target storage band.
  • The host side in the tile recording-aware file system of the embodiment of the present invention determines the to-be-recovered Band and determines a garbage data recovery policy according to the type of the to-be-recovered Band, so that the data blocks of a hot storage band (Hot Band) remain in hot storage bands and the data blocks of a cold storage band (Cold Band) remain in cold storage bands.
  • Because a Hot Band holds hot data blocks, which are updated frequently, few valid data blocks remain when its garbage data is recovered; a Cold Band holds cold data blocks, which are updated infrequently, so it basically requires no garbage collection. This accelerates the copying of valid data blocks during garbage collection.
  • In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into units is only a division by logical function, and an actual implementation may divide them differently. Multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or of another form.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • If the functions are implemented in the form of software functional units and sold or used as a standalone product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Memory System (AREA)

Abstract

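A method and apparatus for reclaiming garbage data in a shingle-aware file system. The shingle-aware file system includes a host side and a device side, and the method includes: the host side determines a storage band to be reclaimed from among multiple storage bands (S110); the host side determines the type of the band to be reclaimed, the type being a hot storage band or a cold storage band (S120); the host side determines a garbage collection policy according to the type of the band to be reclaimed (S130); and the host side sends the garbage collection policy to the device side so that the device side performs garbage collection on the band to be reclaimed according to the policy (S140). The method and apparatus avoid the drawback of DSAFS, in which garbage collection cannot obtain the I/O access and semantic information of upper-layer applications, and also avoid the host CPU load caused by garbage-collection data copying in HSAFS, so garbage collection efficiency can be improved.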

Description

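Method and apparatus for reclaiming garbage data in a shingle-aware file system
Technical Field
The present invention relates to the field of data storage, and in particular to a method and apparatus for reclaiming garbage data in a shingle-aware file system.
Background
Shingled recording is a technique for increasing disk storage density. Its principle is that adjacent tracks on the disk partially overlap, so data is recorded like overlapping roof shingles, hence the name. A shingled disk is generally divided into multiple storage bands (Bands). Each Band supports only append writes and does not support in-place updates, so shingled disks need garbage collection to improve resource utilization.
Shingle-aware file systems (Shingle-Aware File System, "SAFS") can be grouped into three classes: device-side SAFS ("DSAFS"), host-side SAFS ("HSAFS"), and cooperative SAFS ("CSAFS"). DSAFS resembles the Flash Translation Layer ("FTL") in Solid State Drives ("SSD"): it presents a block device to the host side, while the Shingled Write Disk ("SWD") device side handles garbage collection, address mapping, and similar tasks. HSAFS resembles a dedicated flash file system: the host side handles data layout, address mapping, and similar tasks, and can therefore optimize more for the application than DSAFS can.
In a SAFS, garbage collection strongly affects overall system performance. In the prior art, however, DSAFS lacks the I/O access and semantic information of upper-layer applications, while in HSAFS and CSAFS the host side is responsible for data layout, address mapping, and similar tasks, and the large amount of data copying involved affects host-side operation. Garbage collection in existing SAFS is therefore not effective.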
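Summary
The present invention provides a garbage data reclamation method, a host side, and a device side, which can improve garbage collection efficiency.
In a first aspect, a method for reclaiming garbage data in a shingle-aware file system is provided. The shingle-aware file system includes a host side and a device side, and the method includes: the host side determines a storage band to be reclaimed from among multiple storage bands; the host side determines the type of the band to be reclaimed, the type being a hot storage band or a cold storage band; the host side determines a garbage collection policy according to the type of the band to be reclaimed; and the host side sends the garbage collection policy to the device side so that the device side performs garbage collection on the band to be reclaimed according to the policy.
With reference to the first aspect, in one implementation of the first aspect, determining the band to be reclaimed from among multiple storage bands includes: determining the invalid-block utilization of each of the multiple storage bands; and determining the storage band with the highest invalid-block utilization as the band to be reclaimed.
With reference to the first aspect and the foregoing implementation, in another implementation of the first aspect, determining the garbage collection policy according to the type of the band to be reclaimed includes: when the band to be reclaimed is a hot storage band, determining a first target storage band from among the hot storage bands of the multiple storage bands, and determining the garbage collection policy, which instructs the device side to store the valid data blocks of the band to be reclaimed into free data blocks of the first target storage band; and when the band to be reclaimed is a cold storage band, determining a second target storage band from among the cold storage bands of the multiple storage bands, and determining the garbage collection policy, which instructs the device side to store the valid data blocks of the band to be reclaimed into free data blocks of the second target storage band.
With reference to the first aspect and the foregoing implementations, in another implementation of the first aspect, determining the first target storage band from among the hot storage bands when the band to be reclaimed is a hot storage band includes: determining the valid-block utilization of the band to be reclaimed; determining the free-block utilization of the hot storage bands among the multiple storage bands; when, among the hot storage bands other than the band to be reclaimed, at least one hot storage band has a free-block utilization greater than or equal to the valid-block utilization of the band to be reclaimed, determining the one with the lowest free-block utilization among them as the first target storage band; and when no such hot storage band exists, determining the band to be reclaimed itself as the first target storage band.
With reference to the first aspect and the foregoing implementations, in another implementation of the first aspect, determining the second target storage band from among the cold storage bands when the band to be reclaimed is a cold storage band includes: determining the valid-block utilization of the band to be reclaimed; determining the free-block utilization of the cold storage bands among the multiple storage bands; when, among the cold storage bands other than the band to be reclaimed, at least one cold storage band has a free-block utilization greater than or equal to the valid-block utilization of the band to be reclaimed, determining the one with the lowest free-block utilization among them as the second target storage band; and when no such cold storage band exists, determining the band to be reclaimed itself as the second target storage band.
With reference to the first aspect and the foregoing implementations, in another implementation of the first aspect, determining the type of the band to be reclaimed includes: determining the hotness of each data block in the band to be reclaimed according to the number of reads and writes; determining the sum of the hotness of all data blocks in the band to be reclaimed as the hotness of that band; when the hotness of the band to be reclaimed is greater than or equal to a first threshold, determining that the band is a hot storage band; or, when the hotness of the band to be reclaimed is less than the first threshold, determining that the band is a cold storage band.
With reference to the first aspect and the foregoing implementations, in another implementation of the first aspect, the method further includes: determining the hotness of each of the multiple storage bands; determining the storage bands whose hotness is greater than or equal to a second threshold as hot storage bands; and determining the storage bands whose hotness is less than the second threshold as cold storage bands.
With reference to the first aspect and the foregoing implementations, in another implementation of the first aspect, the host side sending the garbage collection policy to the device side so that the device side performs garbage collection on the band to be reclaimed includes: the host side sends the garbage collection policy to the device side, the policy including indicators of the band to be reclaimed and of the target storage band, so that the device side, according to the policy, stores the valid data of the band to be reclaimed into free data blocks of the target storage band and reclaims the invalid data blocks of the band to be reclaimed. The target storage band is the first target storage band or the second target storage band.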
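In a second aspect, a host is provided. The host includes a network interface, a memory, and a processor, where the memory stores a set of programs and the processor is configured to call the programs stored in the memory so that the host performs the method of the first aspect or of any one of the foregoing possible implementations of the first aspect.
In a third aspect, a host side in a shingle-aware file system is provided. The host side includes: a first determining module, configured to determine a storage band to be reclaimed from among multiple storage bands; a second determining module, configured to determine the type of the band to be reclaimed, the type being a hot storage band or a cold storage band; a third determining module, configured to determine a garbage collection policy according to the type determined by the second determining module; and a sending module, configured to send the garbage collection policy determined by the third determining module to the device side so that the device side performs garbage collection on the band to be reclaimed according to the policy.
With reference to the third aspect, in one implementation of the third aspect, the first determining module is specifically configured to: determine the invalid-block utilization of each of the multiple storage bands; and determine the storage band with the highest invalid-block utilization as the band to be reclaimed.
With reference to the third aspect and the foregoing implementation, in another implementation of the third aspect, the third determining module is specifically configured to: when the band to be reclaimed is a hot storage band, determine a first target storage band from among the hot storage bands of the multiple storage bands, and determine the garbage collection policy, which instructs the device side to store the valid data blocks of the band to be reclaimed into free data blocks of the first target storage band; and when the band to be reclaimed is a cold storage band, determine a second target storage band from among the cold storage bands of the multiple storage bands, and determine the garbage collection policy, which instructs the device side to store the valid data blocks of the band to be reclaimed into free data blocks of the second target storage band.
With reference to the third aspect and the foregoing implementations, in another implementation of the third aspect, the third determining module is specifically configured to: when the band to be reclaimed is a hot storage band, determine the valid-block utilization of the band to be reclaimed; determine the free-block utilization of the hot storage bands among the multiple storage bands; when, among the hot storage bands other than the band to be reclaimed, at least one hot storage band has a free-block utilization greater than or equal to the valid-block utilization of the band to be reclaimed, determine the one with the lowest free-block utilization among them as the first target storage band; and when no such hot storage band exists, determine the band to be reclaimed itself as the first target storage band.
With reference to the third aspect and the foregoing implementations, in another implementation of the third aspect, the third determining module is specifically configured to: when the band to be reclaimed is a cold storage band, determine the valid-block utilization of the band to be reclaimed; determine the free-block utilization of the cold storage bands among the multiple storage bands; when, among the cold storage bands other than the band to be reclaimed, at least one cold storage band has a free-block utilization greater than or equal to the valid-block utilization of the band to be reclaimed, determine the one with the lowest free-block utilization among them as the second target storage band; and when no such cold storage band exists, determine the band to be reclaimed itself as the second target storage band.
With reference to the third aspect and the foregoing implementations, in another implementation of the third aspect, the second determining module is specifically configured to: determine the hotness of each data block in the band to be reclaimed according to the number of reads and writes; determine the sum of the hotness of all data blocks in the band to be reclaimed as the hotness of that band; when the hotness of the band to be reclaimed is greater than or equal to a first threshold, determine that the band is a hot storage band; or, when the hotness of the band to be reclaimed is less than the first threshold, determine that the band is a cold storage band.
With reference to the third aspect and the foregoing implementations, in another implementation of the third aspect, the second determining module is further configured to: determine the hotness of each of the multiple storage bands; determine the storage bands whose hotness is greater than or equal to a second threshold as hot storage bands; and determine the storage bands whose hotness is less than the second threshold as cold storage bands.
In a fourth aspect, a shingle-aware file system is provided. The system includes a device side and the host side of the third aspect or of any one of the foregoing possible implementations of the third aspect, where the device side is configured to: receive the garbage collection policy sent by the host side, the policy including indicators of the band to be reclaimed and of the target storage band; and reclaim the band to be reclaimed according to the received garbage collection policy.
With reference to the fourth aspect, in one implementation of the fourth aspect, the device side is further configured to: determine the band to be reclaimed and the target storage band according to the garbage collection policy; store the valid data of the band to be reclaimed into free data blocks of the target storage band; and reclaim the invalid data blocks of the band to be reclaimed.
Based on the foregoing technical solutions, in the method and apparatus for reclaiming garbage data of the embodiments of the present invention, the host side determines the band to be reclaimed, determines a garbage collection policy according to the type of that band, and sends the policy to the device side, which reclaims the garbage data in the band according to the policy. This avoids the drawback of DSAFS, in which garbage collection cannot obtain the I/O access and semantic information of upper-layer applications, and also avoids the host CPU load caused by garbage-collection data copying in HSAFS, so garbage collection efficiency can be improved.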
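Brief Description of the Drawings
To describe the technical solutions of the embodiments of the present invention more clearly, the accompanying drawings needed for the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a method for reclaiming garbage data in a shingle-aware file system according to an embodiment of the present invention.
Fig. 2 is another schematic flowchart of a method for reclaiming garbage data in a shingle-aware file system according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of the To Clean/Clean To garbage collection mode according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of the Self-GC garbage collection mode according to an embodiment of the present invention.
Fig. 5 is yet another schematic flowchart of a method for reclaiming garbage data in a shingle-aware file system according to an embodiment of the present invention.
Fig. 6 is a schematic block diagram of a host side in a shingle-aware file system according to an embodiment of the present invention.
Fig. 7 is a schematic block diagram of a device side in a shingle-aware file system according to an embodiment of the present invention.
Fig. 8 is a schematic block diagram of a shingle-aware file system according to an embodiment of the present invention.
Fig. 9 is another schematic block diagram of a host side in a shingle-aware file system according to an embodiment of the present invention.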
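Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Evidently, the described embodiments are some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
Fig. 1 shows a schematic flowchart of a method 100 for reclaiming garbage data in a shingle-aware file system according to an embodiment of the present invention. The shingle-aware file system includes a host side and a device side, and the method 100 may be performed by the host side. As shown in Fig. 1, the method 100 includes:
S110: determine a storage band to be reclaimed from among multiple storage bands;
S120: determine the type of the band to be reclaimed, the type being a hot storage band or a cold storage band;
S130: determine a garbage collection policy according to the type of the band to be reclaimed;
S140: send the garbage collection policy to the device side so that the device side performs garbage collection on the band to be reclaimed according to the policy.
Specifically, in a CSAFS, the host side determines the Band to be reclaimed from among multiple storage bands (Bands) and determines a garbage collection policy according to the type of that Band. The policy may include the Band to be reclaimed and a target Band used to store the valid data blocks of the Band to be reclaimed. The host side sends the policy to the device side so that the device side reclaims the garbage data in the Band to be reclaimed according to the policy.
Therefore, in the method for reclaiming garbage data in a shingle-aware file system of this embodiment of the present invention, the host side determines the band to be reclaimed, determines a garbage collection policy according to the type of that band, and sends the policy to the device side, which reclaims the garbage data in the band according to the policy. This avoids the drawback of DSAFS, in which garbage collection cannot obtain the I/O access and semantic information of upper-layer applications, and also avoids the host CPU load caused by garbage-collection data copying in HSAFS, so garbage collection efficiency can be improved.
In S110, the host side determines the Band to be reclaimed from among multiple storage bands (Bands). Specifically, the host side may determine the invalid-block utilization of each Band and select the Band with the highest invalid-block utilization as the Band to be reclaimed. Optionally, the invalid-block utilization of a Band equals the ratio of the number of invalid blocks in the Band to the total number of data blocks, where the total includes the Band's valid blocks, invalid blocks, and free blocks.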
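The selection rule in S110 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `Band` record, its field names, and the function names are hypothetical, and only the ratio (invalid blocks over valid + invalid + free blocks) and the "pick the maximum" rule come from the text.

```python
from dataclasses import dataclass


@dataclass
class Band:
    """Hypothetical per-band block counters kept by the host side."""
    band_id: int
    valid: int    # number of valid blocks
    invalid: int  # number of invalid blocks
    free: int     # number of free blocks

    @property
    def total(self) -> int:
        # Total = valid + invalid + free, as defined in the text.
        return self.valid + self.invalid + self.free


def invalid_utilization(band: Band) -> float:
    """Invalid blocks divided by all blocks in the band."""
    return band.invalid / band.total if band.total else 0.0


def pick_band_to_clean(bands):
    """Return the band with the highest invalid-block utilization."""
    return max(bands, key=invalid_utilization)
```

With three bands holding 2, 5, and 1 invalid blocks out of 10, `pick_band_to_clean` would return the second one.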
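In S120, the host side determines the type of the Band to be reclaimed, which is a hot storage band (Hot Band) or a cold storage band (Cold Band). Specifically, the host side may determine the hotness of each data block in the Band to be reclaimed according to its number of reads and writes, and take the sum of the hotness of all data blocks in the Band as the hotness of the Band. When the hotness of the Band to be reclaimed is greater than or equal to a first threshold, the Band is classified as a Hot Band; when it is less than the first threshold, the Band is classified as a Cold Band. Optionally, the first threshold may be determined empirically; the present invention is not limited thereto. Optionally, the hotness of each data block may be determined from read and write operations using the Multiple Bloom Filter hot-data identification algorithm, but the present invention is not limited thereto.
Optionally, as an embodiment, the host side may also determine the type of every Band. Specifically, the host side may determine the hotness of the data blocks in each Band; the sum of the hotness of all data blocks in a Band is the hotness of that Band. When a Band's hotness is greater than or equal to a second threshold, the Band is classified as a Hot Band; when it is less than the second threshold, the Band is classified as a Cold Band. Optionally, the second threshold may be determined empirically. Alternatively, the Bands may be sorted by hotness, with the hottest 10% classified as Hot Bands and the remaining 90% as Cold Bands, in which case the second threshold is the hotness of the least hot Band within that top 10%; the present invention is not limited thereto. Optionally, the type of the Band to be reclaimed may also be determined by ranking: for example, if its hotness is within the top 10% of all Bands it is a Hot Band, otherwise it is a Cold Band; the present invention is not limited thereto. Optionally, the first threshold and the second threshold may or may not be equal; the present invention is not limited thereto.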
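The classification described above can be illustrated with a short sketch. It assumes per-block hotness values are already available (the text mentions read/write counts or a Multiple Bloom Filter scheme, neither of which is reproduced here); the function names, the `hot_percent` parameter, and the floor-rounding of the top-10% cutoff are illustrative assumptions.

```python
def band_hotness(block_hotness):
    """Band hotness is the sum of the hotness of its data blocks."""
    return sum(block_hotness)


def classify(hotness, threshold):
    """A band is hot when its hotness is >= the threshold, else cold."""
    return "hot" if hotness >= threshold else "cold"


def threshold_from_ranking(band_hotness_values, hot_percent=10):
    """Derive the threshold as the lowest hotness within the hottest
    `hot_percent` percent of bands (rounding down, at least one band;
    the rounding rule is an assumption, not stated in the text)."""
    if not band_hotness_values:
        raise ValueError("at least one band is required")
    ranked = sorted(band_hotness_values, reverse=True)
    n_hot = max(1, (len(ranked) * hot_percent) // 100)
    return ranked[n_hot - 1]
```

For ten bands, only the single hottest band ends up at or above the derived threshold; with twenty bands, the two hottest do.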
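In S130, the host side determines the garbage collection policy according to the type of the Band to be reclaimed. Specifically, once the host side has determined the type of the Band to be reclaimed, for example a Hot Band, it identifies all Hot Bands other than the Band to be reclaimed and determines the free-block utilization of each of them. When at least one Hot Band has a free-block utilization greater than or equal to the valid-block utilization of the Band to be reclaimed, the host side selects, among those Hot Bands, the one with the lowest free-block utilization as the first target Band. When no Hot Band has a free-block utilization greater than or equal to the valid-block utilization of the Band to be reclaimed, the Band to be reclaimed itself is determined as the first target Band.
Optionally, as an embodiment, when the Band to be reclaimed is a Cold Band, the host side likewise identifies all Cold Bands other than the Band to be reclaimed and looks among them for Cold Bands whose free-block utilization is greater than or equal to the valid-block utilization of the Band to be reclaimed. Among those, the Cold Band with the lowest free-block utilization is determined as the second target Band. When no such Cold Band exists, the Band to be reclaimed itself is determined as the second target Band.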
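The target-selection rule amounts to a best-fit search among bands of the same type. The sketch below is illustrative only: the `Band` record and function names are hypothetical, and it assumes all bands have the same number of blocks so that comparing utilizations is equivalent to comparing block counts.

```python
from dataclasses import dataclass


@dataclass
class Band:
    """Hypothetical per-band block counters."""
    band_id: int
    valid: int
    invalid: int
    free: int

    @property
    def total(self) -> int:
        return self.valid + self.invalid + self.free


def valid_utilization(band: Band) -> float:
    return band.valid / band.total


def free_utilization(band: Band) -> float:
    return band.free / band.total


def choose_target(to_clean: Band, same_type_bands):
    """Pick the band of the same type (hot or cold) whose free space fits
    the valid data most tightly; fall back to the band itself (Self-GC)."""
    need = valid_utilization(to_clean)
    candidates = [
        b for b in same_type_bands
        if b.band_id != to_clean.band_id and free_utilization(b) >= need
    ]
    if not candidates:
        return to_clean  # no other band has enough room
    return min(candidates, key=free_utilization)
```

Choosing the smallest sufficient free-block utilization keeps emptier bands available for later, larger reclamations.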
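In this embodiment of the present invention, the target Band is used to store the valid data blocks of the Band to be reclaimed, and is either the first target Band or the second target Band. Specifically, the garbage collection modes may include To Clean/Clean To and self reclamation (Self-GC). When the target Band is a Band other than the Band to be reclaimed, the To Clean/Clean To mode is used: the band to be reclaimed is the To Clean Band and the target band is the Clean To Band. When the target Band is the Band to be reclaimed itself, the Self-GC mode is used.
In this embodiment of the present invention, the free-block utilization of a Band equals the ratio of the number of free blocks in the Band to the total number of data blocks in the Band, where the total includes the Band's valid blocks, invalid blocks, and free blocks.
In this embodiment of the present invention, the host side determines the garbage collection policy according to the chosen reclamation mode. Specifically, when the target Band determined by the host side (the first or the second target Band) is not the Band to be reclaimed itself, the To Clean/Clean To mode is used and the garbage collection policy may be <To_Clean_Band, To_Clean_Valid_Bitmap, To_Clean_Valid_Size, Clean_To_Band>, where: To_Clean_Band is an identifier of the Band to be reclaimed, such as a Band ID or name, and the Band it identifies will be garbage-collected; To_Clean_Valid_Bitmap is a bitmap of the valid data blocks in the Band being reclaimed, indicating which blocks on the Band are valid and which are invalid, so that the device side can keep the valid blocks and treat the invalid blocks as garbage; To_Clean_Valid_Size is the total size of all valid data in the Band to be reclaimed; and Clean_To_Band is an identifier of the target Band, that is, the Band into which the valid data blocks of the reclaimed Band will be written. Optionally, when the target Band is the Band to be reclaimed itself, that is, the Self-GC mode is used, the policy determined by the host side may be Clean_To_Band=-1, where Clean_To_Band denotes the target Band, which is the Band to be reclaimed; this instructs the device side to keep the valid data blocks in that Band and delete the invalid data blocks.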
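One way to picture the policy tuple is as a small record. The sketch below is an assumption-laden illustration: the four field names mirror the tuple in the text, but the Python types, the `build_policy` helper, the `block_size` parameter, and the choice to carry the bitmap in the Self-GC case are all hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

SELF_GC = -1  # value the text gives for Clean_To_Band in Self-GC mode


@dataclass(frozen=True)
class GcPolicy:
    """Mirror of <To_Clean_Band, To_Clean_Valid_Bitmap,
    To_Clean_Valid_Size, Clean_To_Band>."""
    to_clean_band: int
    to_clean_valid_bitmap: Tuple[bool, ...]
    to_clean_valid_size: int
    clean_to_band: int


def build_policy(to_clean_id: int, valid_bitmap: Sequence[bool],
                 block_size: int, target_id: int) -> GcPolicy:
    """Build the policy on the host side. If the chosen target is the band
    itself, encode Self-GC with Clean_To_Band = -1."""
    clean_to = SELF_GC if target_id == to_clean_id else target_id
    valid_size = sum(valid_bitmap) * block_size  # total size of valid data
    return GcPolicy(to_clean_id, tuple(valid_bitmap), valid_size, clean_to)
```

For example, `build_policy(3, [True, False, True], 4096, 7)` describes moving two valid blocks (8192 bytes) from band 3 into band 7.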
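In S140, the host side may send the determined garbage collection policy to the device side so that the device side performs garbage collection on the Band to be reclaimed according to the policy, keeping the valid data blocks of that Band and deleting its invalid data blocks, thereby freeing space in the Band.
It should be understood that, in the various embodiments of the present invention, the sequence numbers of the foregoing processes do not imply an order of execution. The order of execution of the processes should be determined by their functions and internal logic, and should not limit the implementation of the embodiments of the present invention in any way.
Therefore, in the method for reclaiming garbage data in a shingle-aware file system of this embodiment of the present invention, the host side determines the Band to be reclaimed and determines a garbage collection policy according to its type, so that data blocks from a hot storage band (Hot Band) stay in hot storage bands and data blocks from a cold storage band (Cold Band) stay in cold storage bands. A Hot Band holds hot data blocks that are updated frequently, so few of its blocks are still valid when it is reclaimed, whereas a Cold Band holds cold data blocks that are rarely updated and seldom needs garbage collection. This speeds up the copying of valid data blocks during garbage collection.
In addition, the host side sends the garbage collection policy to the device side, and the device side reclaims the garbage data in the Band to be reclaimed according to the policy. This avoids the drawback of DSAFS, in which garbage collection cannot obtain the I/O access and semantic information of upper-layer applications, and also avoids the host CPU load caused by garbage-collection data copying in HSAFS, so garbage collection efficiency can be improved.
The method for reclaiming garbage data in a shingle-aware file system according to an embodiment of the present invention has been described in detail above from the perspective of the host side with reference to Fig. 1. It is described below from the perspective of the device side with reference to Fig. 2.
Fig. 2 shows a schematic flowchart of a method 200 for reclaiming garbage data in a shingle-aware file system according to another embodiment of the present invention. The method is mainly performed by the device side. As shown in Fig. 2, the method 200 includes:
S210: receive the garbage collection policy sent by the host side, the policy including indicators of the storage band to be reclaimed and of the target storage band;
S220: reclaim the band to be reclaimed according to the garbage collection policy.
Therefore, in the method for reclaiming garbage data in a shingle-aware file system of this embodiment of the present invention, the host side determines the garbage collection policy and sends it to the device side, and the device side reclaims the garbage data in the Band to be reclaimed according to the policy. This avoids the drawback of DSAFS, in which garbage collection cannot obtain the I/O access and semantic information of upper-layer applications, and also avoids the host CPU load caused by garbage-collection data copying in HSAFS, so garbage collection efficiency can be improved.
In S210, the device side receives the garbage collection policy sent by the host side. The policy instructs the device side to perform garbage collection on the storage band (Band) that needs to be reclaimed.
In S220, the device side reclaims garbage data according to the garbage collection policy. Specifically, the device side may first determine the Band to be reclaimed and the target Band from the policy, and then examine the policy to decide which reclamation mode to use. For example, when the received policy is Clean_To_Band=-1, that is, the target Band indicated by Clean_To_Band equals -1, the Band to be reclaimed and the target Band are the same Band and the Self-GC mode is used. When the received policy is not the Self-GC policy, the To Clean/Clean To mode is used. For example, the policy may be <To_Clean_Band, To_Clean_Valid_Bitmap, To_Clean_Valid_Size, Clean_To_Band>; according to this policy the device side determines the Band to be reclaimed (To_Clean_Band) and the target Band (Clean_To_Band), merges the valid data blocks of the Band to be reclaimed into the target Band, and deletes the invalid data blocks of the Band to be reclaimed, thereby freeing space.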
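The device-side decision in S220 reduces to a check on the Clean_To_Band field. A minimal sketch, assuming the field arrives as an integer; the function name and the returned mode labels are hypothetical.

```python
SELF_GC = -1  # value the text gives for Clean_To_Band in Self-GC mode


def select_gc_mode(clean_to_band: int) -> str:
    """Return the reclamation mode implied by the Clean_To_Band field."""
    if clean_to_band == SELF_GC:
        return "self-gc"           # band to be reclaimed is its own target
    return "to-clean/clean-to"     # valid blocks move to another band
```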
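Specifically, for the Self-GC mode, as shown in Fig. 3, a solid box in the Band to be reclaimed denotes a valid data block and a dashed box denotes an invalid data block. Applying Self-GC to the Band means deleting the invalid data blocks and keeping the valid ones. The valid data blocks of the Band (A1, A2, A3, and A7 in Fig. 3) are copied into the random-access memory ("RAM") of the device side, the Band is then cleared, and the valid data blocks held in RAM are copied back into the cleared Band, which is also the target Band. Because adjacent tracks in shingled recording partially overlap, writing any track i destroys the data in the tracks adjacent to it (for example, track i+1). Each Band can therefore only be appended to (data can only be written to the Band sequentially). If the append-only rule is not followed, for example if A1 A2 A3 A4 are written to the Band sequentially and data is then written at the position of A2, the data in A3 and A4 is destroyed. Each Band therefore has an associated variable called the Band Write Pointer, which corresponds to the solid black arrow in Fig. 3: any data to be written to the Band can only be written at the position indicated by the Band Write Pointer, after which the Band Write Pointer is incremented by 1. Accordingly, once the Band to be reclaimed has been cleared, its Band Write Pointer is reset to 0. After the valid data in RAM has been written back, the Band Write Pointer points to the data block following A7.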
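The Self-GC sequence (copy valid blocks to RAM, clear the band, append them back) can be modeled in a few lines. This is a toy in-memory model under stated assumptions: the `Band` class, its methods, and the list used as the RAM buffer are hypothetical stand-ins for device firmware, and the only behavior taken from the text is the append-only write pointer and the three-step sequence.

```python
class Band:
    """Toy append-only band: writes land only at the write pointer."""

    def __init__(self, capacity: int):
        self.blocks = [None] * capacity
        self.write_pointer = 0

    def append(self, data) -> None:
        if self.write_pointer >= len(self.blocks):
            raise IOError("band is full")
        self.blocks[self.write_pointer] = data
        self.write_pointer += 1  # pointer advances by one block per write

    def reset(self) -> None:
        """Clear the band and rewind the write pointer to 0."""
        self.blocks = [None] * len(self.blocks)
        self.write_pointer = 0


def self_gc(band: Band, valid_bitmap) -> None:
    # 1. Copy the valid blocks into a RAM buffer.
    ram = [band.blocks[i] for i in range(band.write_pointer) if valid_bitmap[i]]
    # 2. Clear the whole band; the write pointer returns to 0.
    band.reset()
    # 3. Append the valid blocks back sequentially.
    for data in ram:
        band.append(data)
```

After reclaiming a band holding A1..A7 in which only A1, A2, A3, and A7 are valid, the band holds those four blocks contiguously and the write pointer is 4, i.e. just past A7.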
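Specifically, for the To Clean/Clean To mode, as shown in Fig. 4, a solid box in the Band to be reclaimed (To Clean) denotes a valid data block and a dashed box denotes an invalid data block. In the target Band (Clean To), a labeled solid box denotes a valid data block and an unlabeled solid box denotes a free data block. The device side adds the valid data blocks of the Band to be reclaimed at the positions of the free data blocks in the target Band and then reclaims the Band to be reclaimed as a whole, for example by deleting its data to free the space.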
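The To Clean/Clean To mode can be modeled the same way. As before, this is a toy model under assumptions: the `Band` class and `clean_to` function are hypothetical, and the capacity check before copying is an added safeguard that the text implies (the host only picks targets with enough free blocks) but does not spell out for the device side.

```python
class Band:
    """Toy append-only band: writes land only at the write pointer."""

    def __init__(self, capacity: int):
        self.blocks = [None] * capacity
        self.write_pointer = 0

    def free_blocks(self) -> int:
        return len(self.blocks) - self.write_pointer

    def append(self, data) -> None:
        if self.write_pointer >= len(self.blocks):
            raise IOError("band is full")
        self.blocks[self.write_pointer] = data
        self.write_pointer += 1

    def reset(self) -> None:
        self.blocks = [None] * len(self.blocks)
        self.write_pointer = 0


def clean_to(src: Band, valid_bitmap, dst: Band) -> None:
    """Append the valid blocks of `src` to `dst`, then reclaim `src` whole."""
    valid = [src.blocks[i] for i in range(src.write_pointer) if valid_bitmap[i]]
    if len(valid) > dst.free_blocks():
        raise ValueError("target band lacks free blocks")
    for data in valid:
        dst.append(data)  # lands in the target band's free blocks
    src.reset()           # the whole source band becomes free space
```

Unlike Self-GC, no RAM staging of the whole band is needed in this model, because the valid blocks are appended directly after the target band's existing data.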
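It should be understood that, in the various embodiments of the present invention, the sequence numbers of the foregoing processes do not imply an order of execution. The order of execution of the processes should be determined by their functions and internal logic, and should not limit the implementation of the embodiments of the present invention in any way.
Therefore, in the method for reclaiming garbage data in a shingle-aware file system of this embodiment of the present invention, the host side determines the garbage collection policy and sends it to the device side, and the device side reclaims the garbage data in the Band to be reclaimed according to the policy. This avoids the drawback of DSAFS, in which garbage collection cannot obtain the I/O access and semantic information of upper-layer applications, and also avoids the host CPU load caused by garbage-collection data copying in HSAFS, so garbage collection efficiency can be improved.
The method is illustrated below with a specific embodiment. Fig. 5 shows yet another schematic flowchart of a method for reclaiming garbage data in a shingle-aware file system according to an embodiment of the present invention. As shown in Fig. 5:
In S301, the host side determines the Band to be reclaimed from among multiple storage bands (Bands). The host side may compute the invalid-block utilization of each Band, which equals the ratio of the number of invalid blocks in the Band to the total number of data blocks, where the total includes the Band's valid blocks, invalid blocks, and free blocks. The Band with the highest invalid-block utilization is determined as the Band to be reclaimed.
In S302, the host side determines the type of the Band to be reclaimed. When the Band to be reclaimed is a hot storage band (Hot Band), go to S303; when it is a cold storage band (Cold Band), go to S304.
In S303, the host side identifies all Hot Bands other than the Band to be reclaimed and determines the free-block utilization of each of them. The free-block utilization of a Band equals the ratio of the number of free blocks in the Band to the total number of data blocks in the Band, where the total includes the Band's valid blocks, invalid blocks, and free blocks. Among these Hot Bands, the host side looks for Hot Bands whose free-block utilization is greater than or equal to the valid-block utilization of the Band to be reclaimed. If at least one such Hot Band exists, those Hot Bands are taken as candidate Hot Bands and S305 is performed; if none exists, S307 is performed.
In S304, the host side identifies all Cold Bands other than the Band to be reclaimed and determines the free-block utilization of each of them, computed in the same way as for Hot Bands. Among these Cold Bands, the host side looks for Cold Bands whose free-block utilization is greater than or equal to the valid-block utilization of the Band to be reclaimed. If at least one such Cold Band exists, those Cold Bands are taken as candidate Cold Bands and S306 is performed; if none exists, S307 is performed.
In S305, the host side determines the candidate Hot Band with the lowest free-block utilization as the target Band.
In S306, the host side determines the candidate Cold Band with the lowest free-block utilization as the target Band.
In S307, the host side determines the Band to be reclaimed itself as the target Band.
In S308, the host side determines the garbage collection policy and sends it to the device side. The policy instructs the device side to reclaim garbage data in either the To Clean/Clean To mode or the Self-GC mode. When the target Band was determined in S305 or S306, the To Clean/Clean To mode is used and the policy is <To_Clean_Band, To_Clean_Valid_Bitmap, To_Clean_Valid_Size, Clean_To_Band>, where: To_Clean_Band is an identifier of the Band to be reclaimed, such as a Band ID or name, and the Band it identifies will be garbage-collected; To_Clean_Valid_Bitmap is a bitmap of the valid data blocks in the Band being reclaimed, indicating which blocks on the Band are valid and which are invalid, so that the device side can keep the valid blocks and treat the invalid blocks as garbage; To_Clean_Valid_Size is the total size of all valid data in the Band to be reclaimed; and Clean_To_Band is an identifier of the target Band, that is, the Band into which the valid data blocks of the reclaimed Band will be written. When the target Band was determined in S307, the Self-GC mode is used and the policy is Clean_To_Band=-1, where Clean_To_Band denotes the target Band, which is the Band to be reclaimed; this instructs the device side to keep the valid data blocks in that Band and delete the invalid data blocks.
In S309, the device side receives the garbage collection policy sent by the host side and checks whether Clean_To_Band in the policy is -1. If the policy is Clean_To_Band=-1, S310 is performed; otherwise S311 is performed.
In S310, the device side processes the Band to be reclaimed in the Self-GC mode, deleting the invalid data blocks and freeing space.
In S311, the device side processes the Band to be reclaimed in the To Clean/Clean To mode, freeing the space of the Band to be reclaimed.
It should be understood that, in the various embodiments of the present invention, the sequence numbers of the foregoing processes do not imply an order of execution. The order of execution of the processes should be determined by their functions and internal logic, and should not limit the implementation of the embodiments of the present invention in any way.
Therefore, in the method for reclaiming garbage data in a shingle-aware file system of this embodiment of the present invention, the host side determines the band to be reclaimed, determines a garbage collection policy according to the type of that band, and sends the policy to the device side, which reclaims the garbage data in the band according to the policy. This avoids the drawback of DSAFS, in which garbage collection cannot obtain the I/O access and semantic information of upper-layer applications, and also avoids the host CPU load caused by garbage-collection data copying in HSAFS, so garbage collection efficiency can be improved.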
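The method for reclaiming garbage data in a shingle-aware file system according to embodiments of the present invention has been described in detail above with reference to Fig. 1 to Fig. 5. The apparatus for reclaiming garbage data in a shingle-aware file system according to embodiments of the present invention is described below with reference to Fig. 6 and Fig. 7.
The apparatus for reclaiming garbage data in a shingle-aware file system of the embodiments of the present invention includes a host side and a device side. As shown in Fig. 6, a host side 400 in a shingle-aware file system according to an embodiment of the present invention includes:
a first determining module 410, configured to determine a storage band to be reclaimed from among multiple storage bands;
a second determining module 420, configured to determine the type of the band to be reclaimed, the type being a hot storage band or a cold storage band;
a third determining module 430, configured to determine a garbage collection policy according to the type of the band to be reclaimed determined by the second determining module 420;
a sending module 440, configured to send the garbage collection policy determined by the third determining module 430 to the device side so that the device side performs garbage collection on the band to be reclaimed according to the policy.
Specifically, in a CSAFS, the host side uses the first determining module 410 to determine the Band to be reclaimed from among multiple storage bands (Bands) and uses the second determining module 420 to determine the type of that Band. The third determining module 430 determines a garbage collection policy according to the type of the Band to be reclaimed. The policy may include the Band to be reclaimed and a target Band used to store the valid data blocks of the Band to be reclaimed. The sending module 440 of the host side sends the policy determined by the third determining module 430 to the device side so that the device side reclaims the garbage data in the Band to be reclaimed according to the policy.
Therefore, in the host side in a shingle-aware file system of this embodiment of the present invention, the first determining module determines the band to be reclaimed, a garbage collection policy is determined according to the type of that band, and the sending module sends the policy to the device side so that the device side reclaims the garbage data in the band according to the policy. This avoids the drawback of DSAFS, in which garbage collection cannot obtain the I/O access and semantic information of upper-layer applications, and also avoids the host CPU load caused by garbage-collection data copying in HSAFS, so garbage collection efficiency can be improved.
In this embodiment of the present invention, the first determining module 410 of the host side determines the Band to be reclaimed from among multiple storage bands (Bands). Specifically, the host side may determine the invalid-block utilization of each Band and select the Band with the highest invalid-block utilization as the Band to be reclaimed. Optionally, the invalid-block utilization of a Band equals the ratio of the number of invalid blocks in the Band to the total number of data blocks, where the total includes the Band's valid blocks, invalid blocks, and free blocks.
In this embodiment of the present invention, the second determining module 420 of the host side determines the type of the Band to be reclaimed determined by the first determining module 410, the type being a hot storage band (Hot Band) or a cold storage band (Cold Band). Specifically, the host side may determine the hotness of each data block in the Band to be reclaimed according to its number of reads and writes, and take the sum of the hotness of all data blocks in the Band as the hotness of the Band. When the hotness of the Band to be reclaimed is greater than or equal to a first threshold, the Band is classified as a Hot Band; when it is less than the first threshold, the Band is classified as a Cold Band. Optionally, the first threshold may be determined empirically; the present invention is not limited thereto. Optionally, the hotness of each data block may be determined from read and write operations using the Multiple Bloom Filter hot-data identification algorithm, but the present invention is not limited thereto.
Optionally, as an embodiment, the host side may also determine the type of every Band. Specifically, the host side may determine the hotness of the data blocks in each Band; the sum of the hotness of all data blocks in a Band is the hotness of that Band. When a Band's hotness is greater than or equal to a second threshold, the Band is classified as a Hot Band; when it is less than the second threshold, the Band is classified as a Cold Band. Optionally, the second threshold may be determined empirically. Alternatively, the Bands may be sorted by hotness, with the hottest 10% classified as Hot Bands and the remaining 90% as Cold Bands, in which case the second threshold is the hotness of the least hot Band within that top 10%; the present invention is not limited thereto. Optionally, the type of the Band to be reclaimed may also be determined by ranking: for example, if its hotness is within the top 10% of all Bands it is a Hot Band, otherwise it is a Cold Band; the present invention is not limited thereto. Optionally, the first threshold and the second threshold may or may not be equal; the present invention is not limited thereto.
In this embodiment of the present invention, the third determining module 430 of the host side determines the garbage collection policy according to the type of the Band to be reclaimed. Specifically, once the host side has determined the type of the Band to be reclaimed, for example a Hot Band, it identifies all Hot Bands other than the Band to be reclaimed and determines the free-block utilization of each of them. When at least one Hot Band has a free-block utilization greater than or equal to the valid-block utilization of the Band to be reclaimed, the host side selects, among those Hot Bands, the one with the lowest free-block utilization as the first target Band. When no Hot Band has a free-block utilization greater than or equal to the valid-block utilization of the Band to be reclaimed, the Band to be reclaimed itself is determined as the first target Band.
Optionally, as an embodiment, when the Band to be reclaimed is a Cold Band, the host side likewise identifies all Cold Bands other than the Band to be reclaimed and looks among them for Cold Bands whose free-block utilization is greater than or equal to the valid-block utilization of the Band to be reclaimed. Among those, the Cold Band with the lowest free-block utilization is determined as the second target Band. When no such Cold Band exists, the Band to be reclaimed itself is determined as the second target Band.
In this embodiment of the present invention, the target Band is used to store the valid data blocks of the Band to be reclaimed, and is either the first target Band or the second target Band. Specifically, the garbage collection modes may include To Clean/Clean To and self reclamation (Self-GC). When the target Band is a Band other than the Band to be reclaimed, the To Clean/Clean To mode is used: the band to be reclaimed is the To Clean Band and the target band is the Clean To Band. When the target Band is the Band to be reclaimed itself, the Self-GC mode is used.
In this embodiment of the present invention, the free-block utilization of a Band equals the ratio of the number of free blocks in the Band to the total number of data blocks in the Band, where the total includes the Band's valid blocks, invalid blocks, and free blocks.
In this embodiment of the present invention, the host side determines the garbage collection policy according to the chosen reclamation mode. Specifically, when the target Band determined by the host side (the first or the second target Band) is not the Band to be reclaimed itself, the To Clean/Clean To mode is used and the garbage collection policy may be <To_Clean_Band, To_Clean_Valid_Bitmap, To_Clean_Valid_Size, Clean_To_Band>, where: To_Clean_Band is an identifier of the Band to be reclaimed, such as a Band ID or name, and the Band it identifies will be garbage-collected; To_Clean_Valid_Bitmap is a bitmap of the valid data blocks in the Band being reclaimed, indicating which blocks on the Band are valid and which are invalid, so that the device side can keep the valid blocks and treat the invalid blocks as garbage; To_Clean_Valid_Size is the total size of all valid data in the Band to be reclaimed; and Clean_To_Band is an identifier of the target Band, that is, the Band into which the valid data blocks of the reclaimed Band will be written. Optionally, when the target Band is the Band to be reclaimed itself, that is, the Self-GC mode is used, the policy determined by the host side may be Clean_To_Band=-1, where Clean_To_Band denotes the target Band, which is the Band to be reclaimed; this instructs the device side to keep the valid data blocks in that Band and delete the invalid data blocks.
In this embodiment of the present invention, the sending module of the host side may send the garbage collection policy determined by the third determining module 430 to the device side so that the device side performs garbage collection on the Band to be reclaimed according to the policy, keeping the valid data blocks of that Band and deleting its invalid data blocks, thereby freeing space in the Band.
It should be understood that the host side 400 in a shingle-aware file system according to this embodiment of the present invention may correspond to the entity that performs the method 100 of the embodiments of the present invention, and that the foregoing and other operations and/or functions of the modules of the host side 400 implement the corresponding procedures of the method in Fig. 1. For brevity, details are not repeated here.
Therefore, the host side in a shingle-aware file system of this embodiment of the present invention determines the Band to be reclaimed and determines a garbage collection policy according to its type, so that data blocks from a hot storage band (Hot Band) stay in hot storage bands and data blocks from a cold storage band (Cold Band) stay in cold storage bands. A Hot Band holds hot data blocks that are updated frequently, so few of its blocks are still valid when it is reclaimed, whereas a Cold Band holds cold data blocks that are rarely updated and seldom needs garbage collection. This speeds up the copying of valid data blocks during garbage collection.
In addition, the host side sends the garbage collection policy to the device side so that the device side reclaims the garbage data in the Band to be reclaimed according to the policy. This avoids the drawback of DSAFS, in which garbage collection cannot obtain the I/O access and semantic information of upper-layer applications, and also avoids the host CPU load caused by garbage-collection data copying in HSAFS, so garbage collection efficiency can be improved.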
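Fig. 7 shows a schematic block diagram of a device side in a shingle-aware file system according to another embodiment of the present invention. As shown in Fig. 7, a device side 500 in a shingle-aware file system according to an embodiment of the present invention includes:
a receiving module 510, configured to receive the garbage collection policy sent by the host side, the policy including indicators of the storage band to be reclaimed and of the target storage band;
a processing module 520, configured to reclaim the band to be reclaimed according to the garbage collection policy received by the receiving module 510.
Therefore, for the device side in a shingle-aware file system of this embodiment of the present invention, the host side sends the determined garbage collection policy to the device side, and the receiving module of the device side receives the policy, according to which the garbage data in the band to be reclaimed is reclaimed. This avoids the drawback of DSAFS, in which garbage collection cannot obtain the I/O access and semantic information of upper-layer applications, and also avoids the host CPU load caused by garbage-collection data copying in HSAFS, so garbage collection efficiency can be improved.
In this embodiment of the present invention, the receiving module 510 of the device side receives the garbage collection policy sent by the host side. The policy instructs the device side to perform garbage collection on the storage band (Band) that needs to be reclaimed.
In this embodiment of the present invention, the processing module 520 of the device side reclaims garbage data according to the garbage collection policy received by the receiving module 510. Specifically, the device side may first determine the Band to be reclaimed and the target Band from the policy, and then examine the policy to decide which reclamation mode to use. For example, when the received policy is Clean_To_Band=-1, that is, the target Band indicated by Clean_To_Band equals -1, the Band to be reclaimed and the target Band are the same Band and the Self-GC mode is used. When the received policy is not the Self-GC policy, the To Clean/Clean To mode is used. For example, the policy may be <To_Clean_Band, To_Clean_Valid_Bitmap, To_Clean_Valid_Size, Clean_To_Band>; according to this policy the device side determines the Band to be reclaimed (To_Clean_Band) and the target Band (Clean_To_Band), merges the valid data blocks of the Band to be reclaimed into the target Band, and deletes the invalid data blocks of the Band to be reclaimed, thereby freeing space.
Specifically, for the Self-GC mode, as shown in Fig. 3, a solid box in the Band to be reclaimed denotes a valid data block and a dashed box denotes an invalid data block. Applying Self-GC to the Band means deleting the invalid data blocks and keeping the valid ones. The valid data blocks of the Band (A1, A2, A3, and A7 in Fig. 3) are copied into the random-access memory ("RAM") of the device side, the Band is then cleared, and the valid data blocks held in RAM are copied back into the cleared Band, which is also the target Band. Because adjacent tracks in shingled recording partially overlap, writing any track i destroys the data in the tracks adjacent to it (for example, track i+1). Each Band can therefore only be appended to (data can only be written to the Band sequentially). If the append-only rule is not followed, for example if A1 A2 A3 A4 are written to the Band sequentially and data is then written at the position of A2, the data in A3 and A4 is destroyed. Each Band therefore has an associated variable called the Band Write Pointer, which corresponds to the solid black arrow in Fig. 3: any data to be written to the Band can only be written at the position indicated by the Band Write Pointer, after which the Band Write Pointer is incremented by 1. Accordingly, once the Band to be reclaimed has been cleared, its Band Write Pointer is reset to 0. After the valid data in RAM has been written back, the Band Write Pointer points to the data block following A7.
Specifically, for the To Clean/Clean To mode, as shown in Fig. 4, a solid box in the Band to be reclaimed (To Clean) denotes a valid data block and a dashed box denotes an invalid data block. In the target Band (Clean To), a labeled solid box denotes a valid data block and an unlabeled solid box denotes a free data block. The device side adds the valid data blocks of the Band to be reclaimed at the positions of the free data blocks in the target Band and then reclaims the Band to be reclaimed as a whole, for example by deleting its data to free the space.
It should be understood that the device side 500 in a shingle-aware file system according to this embodiment of the present invention may correspond to the entity that performs the method 200 of the embodiments of the present invention, and that the foregoing and other operations and/or functions of the modules of the device side 500 implement the corresponding procedures of the methods in Fig. 2 to Fig. 4. For brevity, details are not repeated here.
Therefore, for the device side in a shingle-aware file system of this embodiment of the present invention, the host side sends the determined garbage collection policy to the device side, and the receiving module of the device side receives the policy, according to which the garbage data in the band to be reclaimed is reclaimed. This avoids the drawback of DSAFS, in which garbage collection cannot obtain the I/O access and semantic information of upper-layer applications, and also avoids the host CPU load caused by garbage-collection data copying in HSAFS, so garbage collection efficiency can be improved.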
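As shown in Fig. 8, an embodiment of the present invention further provides a shingle-aware file system 600, which includes a host side 610 and a device side 620. The host side is configured to determine the storage band to be reclaimed, determine a garbage collection policy according to the type of that band, and send the policy to the device side. The device side is configured to receive the garbage collection policy sent by the host side, the policy including indicators of the band to be reclaimed and of the target storage band, and to reclaim the band to be reclaimed according to the received policy.
The host side 610 may be the host side 400 shown in Fig. 6, and the device side 620 may be the device side 500 shown in Fig. 7.
The host side 610 may further include the first determining module 410, the second determining module 420, the third determining module 430, and the sending module 440 shown in Fig. 6; the device side 620 may further include the receiving module 510 and the processing module 520 shown in Fig. 7.
Therefore, in the shingle-aware file system of this embodiment of the present invention, which includes a host side and a device side, the host side sends the determined garbage collection policy to the device side, and the receiving module of the device side receives the policy, according to which the garbage data in the band to be reclaimed is reclaimed. This avoids the drawback of DSAFS, in which garbage collection cannot obtain the I/O access and semantic information of upper-layer applications, and also avoids the host CPU load caused by garbage-collection data copying in HSAFS, so garbage collection efficiency can be improved.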
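As shown in Fig. 9, an embodiment of the present invention further provides a host side 700, which includes a processor 710, a memory 720, a bus system 730, and a transceiver 740. The processor 710, the memory 720, and the transceiver 740 are connected through the bus system 730. The memory 720 is configured to store instructions, and the processor 710 is configured to execute the instructions stored in the memory 720 to control the transceiver 740 to receive signals. The processor 710 is configured to: determine a storage band to be reclaimed from among multiple storage bands; determine the type of the band to be reclaimed, the type being a hot storage band or a cold storage band; and determine a garbage collection policy according to the type of the band to be reclaimed. The transceiver 740 is configured to send the garbage collection policy to the device side so that the device side performs garbage collection on the band to be reclaimed according to the policy.
Therefore, the host side in a shingle-aware file system of this embodiment of the present invention determines the Band to be reclaimed and determines a garbage collection policy according to its type, so that data blocks from a hot storage band (Hot Band) stay in hot storage bands and data blocks from a cold storage band (Cold Band) stay in cold storage bands. A Hot Band holds hot data blocks that are updated frequently, so few of its blocks are still valid when it is reclaimed, whereas a Cold Band holds cold data blocks that are rarely updated and seldom needs garbage collection. This speeds up the copying of valid data blocks during garbage collection.
It should be understood that, in this embodiment of the present invention, the processor 710 may be a central processing unit ("CPU"), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor.
The memory 720 may include a read-only memory and a random access memory, and provides instructions and data to the processor 710. A part of the memory 720 may further include a non-volatile random access memory. For example, the memory 720 may also store device type information.
In addition to a data bus, the bus system 730 may include a power bus, a control bus, a status signal bus, and the like. For clarity of description, however, all the buses are labeled as the bus system 730 in the figure.
In an implementation, the steps of the foregoing method may be completed by an integrated logic circuit of hardware in the processor 710 or by instructions in the form of software. The steps of the method disclosed with reference to the embodiments of the present invention may be performed directly by a hardware processor, or by a combination of hardware and software modules in the processor. The software module may be located in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 720; the processor 710 reads the information in the memory 720 and completes the steps of the foregoing method in combination with its hardware. To avoid repetition, details are not described here.
Optionally, as an embodiment, the processor 710 may call the program code stored in the memory 720 to perform the following operations: determining the invalid-block utilization of each of the multiple storage bands; and determining the storage band with the highest invalid-block utilization as the band to be reclaimed.
Optionally, as an embodiment, the processor 710 may call the program code stored in the memory 720 to perform the following operations: when the band to be reclaimed is a hot storage band, determining a first target storage band from among the hot storage bands of the multiple storage bands, and determining the garbage collection policy, which instructs the device side to store the valid data blocks of the band to be reclaimed into free data blocks of the first target storage band; and when the band to be reclaimed is a cold storage band, determining a second target storage band from among the cold storage bands of the multiple storage bands, and determining the garbage collection policy, which instructs the device side to store the valid data blocks of the band to be reclaimed into free data blocks of the second target storage band.
Optionally, as an embodiment, the processor 710 may call the program code stored in the memory 720 to perform the following operations: when the band to be reclaimed is a hot storage band, determining the valid-block utilization of the band to be reclaimed; determining the free-block utilization of the hot storage bands among the multiple storage bands; when, among the hot storage bands other than the band to be reclaimed, at least one hot storage band has a free-block utilization greater than or equal to the valid-block utilization of the band to be reclaimed, determining the one with the lowest free-block utilization among them as the first target storage band; and when no such hot storage band exists, determining the band to be reclaimed itself as the first target storage band.
Optionally, as an embodiment, the processor 710 may call the program code stored in the memory 720 to perform the following operations: when the band to be reclaimed is a cold storage band, determining the valid-block utilization of the band to be reclaimed; determining the free-block utilization of the cold storage bands among the multiple storage bands; when, among the cold storage bands other than the band to be reclaimed, at least one cold storage band has a free-block utilization greater than or equal to the valid-block utilization of the band to be reclaimed, determining the one with the lowest free-block utilization among them as the second target storage band; and when no such cold storage band exists, determining the band to be reclaimed itself as the second target storage band.
Optionally, as an embodiment, the processor 710 may call the program code stored in the memory 720 to perform the following operations: determining the hotness of each data block in the band to be reclaimed according to the number of reads and writes; determining the sum of the hotness of all data blocks in the band to be reclaimed as the hotness of that band; when the hotness of the band to be reclaimed is greater than or equal to a first threshold, determining that the band is a hot storage band; or, when the hotness of the band to be reclaimed is less than the first threshold, determining that the band is a cold storage band.
Optionally, as an embodiment, the processor 710 may call the program code stored in the memory 720 to perform the following operations: determining the hotness of each of the multiple storage bands; determining the storage bands whose hotness is greater than or equal to a second threshold as hot storage bands; and determining the storage bands whose hotness is less than the second threshold as cold storage bands.
Optionally, as an embodiment, the transceiver 740 is configured to send the garbage collection policy to the device side, the policy including indicators of the band to be reclaimed and of the target storage band, so that the device side, according to the policy, stores the valid data of the band to be reclaimed into free data blocks of the target storage band and reclaims the invalid data blocks of the band to be reclaimed. The target storage band is the first target storage band or the second target storage band.
Therefore, the host side in a shingle-aware file system of this embodiment of the present invention determines the Band to be reclaimed and determines a garbage collection policy according to its type, so that data blocks from a hot storage band (Hot Band) stay in hot storage bands and data blocks from a cold storage band (Cold Band) stay in cold storage bands. A Hot Band holds hot data blocks that are updated frequently, so few of its blocks are still valid when it is reclaimed, whereas a Cold Band holds cold data blocks that are rarely updated and seldom needs garbage collection. This speeds up the copying of valid data blocks during garbage collection.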
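A person of ordinary skill in the art may appreciate that the units and algorithm steps of the examples described with reference to the embodiments disclosed herein can be implemented by electronic hardware or by a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the particular application and the design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such an implementation should not be considered beyond the scope of the present invention.
A person skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices, and units described above may be found in the corresponding processes of the foregoing method embodiments, and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into units is only a division by logical function, and an actual implementation may divide them differently. Multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or of another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as a standalone product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing describes only specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any variation or replacement that a person skilled in the art can readily conceive of within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.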

Claims (18)

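  1. A method for reclaiming garbage data in a shingle-aware file system, characterized in that the shingle-aware file system includes a host side and a device side, and the method includes:
    the host side determining a storage band to be reclaimed from among multiple storage bands;
    the host side determining the type of the band to be reclaimed, the type being a hot storage band or a cold storage band;
    the host side determining a garbage collection policy according to the type of the band to be reclaimed;
    the host side sending the garbage collection policy to the device side so that the device side performs garbage collection on the band to be reclaimed according to the garbage collection policy.
  2. The method according to claim 1, characterized in that determining the band to be reclaimed from among multiple storage bands includes:
    determining the invalid-block utilization of each of the multiple storage bands;
    determining the storage band with the highest invalid-block utilization as the band to be reclaimed.
  3. The method according to claim 2, characterized in that determining the garbage collection policy according to the type of the band to be reclaimed includes:
    when the band to be reclaimed is a hot storage band, determining a first target storage band from among the hot storage bands of the multiple storage bands;
    determining the garbage collection policy, the policy instructing the device side to store the valid data blocks of the band to be reclaimed into free data blocks of the first target storage band;
    when the band to be reclaimed is a cold storage band, determining a second target storage band from among the cold storage bands of the multiple storage bands;
    determining the garbage collection policy, the policy instructing the device side to store the valid data blocks of the band to be reclaimed into free data blocks of the second target storage band.
  4. The method according to claim 3, characterized in that determining the first target storage band from among the hot storage bands of the multiple storage bands when the band to be reclaimed is a hot storage band includes:
    when the band to be reclaimed is a hot storage band, determining the valid-block utilization of the band to be reclaimed;
    determining the free-block utilization of the hot storage bands among the multiple storage bands;
    when, among the hot storage bands other than the band to be reclaimed, at least one hot storage band has a free-block utilization greater than or equal to the valid-block utilization of the band to be reclaimed,
    determining the hot storage band with the lowest free-block utilization among the at least one hot storage band as the first target storage band;
    when, among the hot storage bands other than the band to be reclaimed, no hot storage band has a free-block utilization greater than or equal to the valid-block utilization of the band to be reclaimed,
    determining the band to be reclaimed as the first target storage band.
  5. The method according to claim 3, characterized in that determining the second target storage band from among the cold storage bands of the multiple storage bands when the band to be reclaimed is a cold storage band includes:
    when the band to be reclaimed is a cold storage band, determining the valid-block utilization of the band to be reclaimed;
    determining the free-block utilization of the cold storage bands among the multiple storage bands;
    when, among the cold storage bands other than the band to be reclaimed, at least one cold storage band has a free-block utilization greater than or equal to the valid-block utilization of the band to be reclaimed,
    determining the cold storage band with the lowest free-block utilization among the at least one cold storage band as the second target storage band;
    when, among the cold storage bands other than the band to be reclaimed, no cold storage band has a free-block utilization greater than or equal to the valid-block utilization of the band to be reclaimed,
    determining the band to be reclaimed as the second target storage band.
  6. The method according to any one of claims 1 to 5, characterized in that determining the type of the band to be reclaimed includes:
    determining the hotness of each data block in the band to be reclaimed according to the number of reads and writes;
    determining the sum of the hotness of all data blocks in the band to be reclaimed as the hotness of the band to be reclaimed;
    when the hotness of the band to be reclaimed is greater than or equal to a first threshold, determining that the band to be reclaimed is a hot storage band; or
    when the hotness of the band to be reclaimed is less than the first threshold, determining that the band to be reclaimed is a cold storage band.
  7. The method according to any one of claims 3 to 6, characterized in that the method further includes:
    determining the hotness of each of the multiple storage bands;
    determining the storage bands whose hotness is greater than or equal to a second threshold as hot storage bands;
    determining the storage bands whose hotness is less than the second threshold as cold storage bands.
  8. The method according to any one of claims 3 to 7, characterized in that the host side sending the garbage collection policy to the device side so that the device side performs garbage collection on the band to be reclaimed according to the garbage collection policy includes:
    the host side sending the garbage collection policy to the device side, the policy including indicators of the band to be reclaimed and of the target storage band, so that the device side, according to the policy, stores the valid data of the band to be reclaimed into free data blocks of the target storage band and reclaims the invalid data blocks of the band to be reclaimed, the target storage band being the first target storage band or the second target storage band.
  9. A host, characterized by including a network interface, a memory, and a processor, where the memory stores a set of programs and the processor is configured to call the programs stored in the memory so that the host performs the method according to any one of claims 1 to 8.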
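  10. A host side in a shingle-aware file system, characterized in that the host side includes:
    a first determining module, configured to determine a storage band to be reclaimed from among multiple storage bands;
    a second determining module, configured to determine the type of the band to be reclaimed, the type being a hot storage band or a cold storage band;
    a third determining module, configured to determine a garbage collection policy according to the type of the band to be reclaimed determined by the second determining module;
    a sending module, configured to send the garbage collection policy determined by the third determining module to the device side so that the device side performs garbage collection on the band to be reclaimed according to the garbage collection policy.
  11. The host side according to claim 10, characterized in that the first determining module is specifically configured to:
    determine the invalid-block utilization of each of the multiple storage bands;
    determine the storage band with the highest invalid-block utilization as the band to be reclaimed.
  12. The host side according to claim 11, characterized in that the third determining module is specifically configured to:
    when the band to be reclaimed is a hot storage band, determine a first target storage band from among the hot storage bands of the multiple storage bands;
    determine the garbage collection policy, the policy instructing the device side to store the valid data blocks of the band to be reclaimed into free data blocks of the first target storage band;
    when the band to be reclaimed is a cold storage band, determine a second target storage band from among the cold storage bands of the multiple storage bands;
    determine the garbage collection policy, the policy instructing the device side to store the valid data blocks of the band to be reclaimed into free data blocks of the second target storage band.
  13. The host side according to claim 12, characterized in that the third determining module is specifically configured to:
    when the band to be reclaimed is a hot storage band, determine the valid-block utilization of the band to be reclaimed;
    determine the free-block utilization of the hot storage bands among the multiple storage bands;
    when, among the hot storage bands other than the band to be reclaimed, at least one hot storage band has a free-block utilization greater than or equal to the valid-block utilization of the band to be reclaimed,
    determine the hot storage band with the lowest free-block utilization among the at least one hot storage band as the first target storage band;
    when, among the hot storage bands other than the band to be reclaimed, no hot storage band has a free-block utilization greater than or equal to the valid-block utilization of the band to be reclaimed,
    determine the band to be reclaimed as the first target storage band.
  14. The host side according to claim 12, characterized in that the third determining module is specifically configured to:
    when the band to be reclaimed is a cold storage band, determine the valid-block utilization of the band to be reclaimed;
    determine the free-block utilization of the cold storage bands among the multiple storage bands;
    when, among the cold storage bands other than the band to be reclaimed, at least one cold storage band has a free-block utilization greater than or equal to the valid-block utilization of the band to be reclaimed,
    determine the cold storage band with the lowest free-block utilization among the at least one cold storage band as the second target storage band;
    when, among the cold storage bands other than the band to be reclaimed, no cold storage band has a free-block utilization greater than or equal to the valid-block utilization of the band to be reclaimed,
    determine the band to be reclaimed as the second target storage band.
  15. The host side according to any one of claims 10 to 14, characterized in that the second determining module is specifically configured to:
    determine the hotness of each data block in the band to be reclaimed according to the number of reads and writes;
    determine the sum of the hotness of all data blocks in the band to be reclaimed as the hotness of the band to be reclaimed;
    when the hotness of the band to be reclaimed is greater than or equal to a first threshold, determine that the band to be reclaimed is a hot storage band; or
    when the hotness of the band to be reclaimed is less than the first threshold, determine that the band to be reclaimed is a cold storage band.
  16. The host side according to any one of claims 12 to 15, characterized in that the second determining module is further configured to:
    determine the hotness of each of the multiple storage bands;
    determine the storage bands whose hotness is greater than or equal to a second threshold as hot storage bands;
    determine the storage bands whose hotness is less than the second threshold as cold storage bands.
  17. A shingle-aware file system, characterized in that the system includes a device side and the host side according to claims 10 to 16, the device side being configured to:
    receive the garbage collection policy sent by the host side, the policy including indicators of the band to be reclaimed and of the target storage band;
    reclaim the band to be reclaimed according to the received garbage collection policy.
  18. The system according to claim 17, characterized in that the device side is further configured to:
    determine the band to be reclaimed and the target storage band according to the garbage collection policy;
    store the valid data of the band to be reclaimed into free data blocks of the target storage band;
    reclaim the invalid data blocks of the band to be reclaimed.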
PCT/CN2015/097908 2015-05-27 2015-12-18 Method and apparatus for reclaiming garbage data in a shingle-aware file system WO2016188098A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510279079.0A CN106293497B (zh) 2015-05-27 2015-05-27 Method and apparatus for reclaiming garbage data in a shingle-aware file system
CN201510279079.0 2015-05-27

Publications (1)

Publication Number Publication Date
WO2016188098A1 true WO2016188098A1 (zh) 2016-12-01

Family

ID=57393537

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/097908 WO2016188098A1 (zh) 2015-05-27 2015-12-18 Method and apparatus for reclaiming garbage data in a shingle-aware file system

Country Status (2)

Country Link
CN (1) CN106293497B (zh)
WO (1) WO2016188098A1 (zh)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
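CN110554970A (zh) * 2018-05-31 2019-12-10 北京忆恒创源科技有限公司 Garbage collection method that significantly reduces write amplification, and storage device
CN110945486B (zh) 2018-06-30 2022-06-10 华为技术有限公司 Storage fragment management method and terminal
CN109189345B (zh) * 2018-09-18 2022-03-04 郑州云海信息技术有限公司 Online data organization method, apparatus, device, and storage medium
CN109783020B (zh) * 2018-12-28 2020-05-22 西安交通大学 Garbage collection method based on an SSD-SMR hybrid key-value storage system
CN116166570A (zh) * 2019-07-31 2023-05-26 华为技术有限公司 Garbage collection method and apparatus
CN110515550B (zh) * 2019-08-21 2022-03-29 深圳忆联信息系统有限公司 Method and apparatus for separating hot and cold data in a SATA solid-state drive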

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101526923A (zh) * 2009-04-02 2009-09-09 成都市华为赛门铁克科技有限公司 Data processing method and apparatus, and flash storage system
US8521972B1 (en) * 2010-06-30 2013-08-27 Western Digital Technologies, Inc. System and method for optimizing garbage collection in data storage
US20140115239A1 (en) * 2012-10-22 2014-04-24 Samsung Electronics Co., Ltd. Method of managing data in nonvolatile memory device
CN103955433A (zh) * 2014-05-09 2014-07-30 华为技术有限公司 Shingled magnetic recording hard disk, and method and apparatus for writing data to a shingled magnetic recording hard disk
US20140281129A1 (en) * 2013-03-15 2014-09-18 Tal Heller Data tag sharing from host to storage systems
CN104156317A (zh) * 2014-08-08 2014-11-19 浪潮(北京)电子信息产业有限公司 Erase/write management method and system for non-volatile flash memory
CN104461390A (zh) * 2014-12-05 2015-03-25 华为技术有限公司 Method and apparatus for writing data to a shingled magnetic recording (SMR) hard disk

Also Published As

Publication number Publication date
CN106293497A (zh) 2017-01-04
CN106293497B (zh) 2019-03-19

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 15893166

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 15893166

Country of ref document: EP

Kind code of ref document: A1