WO2017143972A1 - Data processing method and apparatus - Google Patents

Data processing method and apparatus

Info

Publication number
WO2017143972A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
page data
page
type
block
Prior art date
Application number
PCT/CN2017/074290
Other languages
French (fr)
Chinese (zh)
Inventor
杨洪章
罗圣美
王志坤
Original Assignee
中兴通讯股份有限公司
Priority date
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2017143972A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0647 Migration mechanisms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0626 Reducing size or complexity of storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device
    • G06F3/0674 Disk device

Definitions

  • The embodiments of the present invention relate to the field of communications, and in particular to a data processing method and apparatus.
  • Solid State Drives (SSDs) are a new generation of hard drives that apply advanced semiconductor technology to high-capacity mobile storage. Because an SSD has no mechanical parts such as a magnetic head, no head movement is needed to locate data, so an SSD starts up faster; and because there is no seek time, its read and write speeds are also superior to those of a mechanical hard disk.
  • SSDs have advantages over traditional disks: high reliability, high shock resistance, low power consumption, low noise, and more. For this reason, SSDs are beginning to gain popularity in both personal and enterprise applications.
  • SSDs also have some shortcomings, such as erase-before-write, a limited number of erase cycles, and the need for garbage collection.
  • Erase-before-write means that the solid state hard disk has three operations (read, write and erase) and that a page must be erased before it is written; that is, data cannot be overwritten in place. For example, to modify data that has already been written, the old data must be marked invalid and the new data written to free space.
  • This erase-before-write characteristic greatly reduces the write performance of the solid state hard disk.
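The out-of-place update described above can be sketched as follows. This is a minimal illustration only: the `Flash` class, its fields and the page-state names are hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass, field

FREE, VALID, INVALID = "free", "valid", "invalid"

@dataclass
class Flash:
    """Toy flash device: a flat array of pages (illustrative names only)."""
    states: list
    data: list
    mapping: dict = field(default_factory=dict)  # logical page -> physical page

    def write(self, logical, value):
        # A written page cannot be overwritten in place, so take an erased page.
        target = self.states.index(FREE)  # raises ValueError if no page is free
        old = self.mapping.get(logical)
        if old is not None:
            self.states[old] = INVALID  # the old copy becomes garbage
        self.states[target] = VALID
        self.data[target] = value
        self.mapping[logical] = target
        return target

flash = Flash(states=[FREE] * 4, data=[None] * 4)
flash.write(7, "v1")
flash.write(7, "v2")  # the update lands on a new physical page
```

After the second write, physical page 0 holds stale data and can only be reused once its block has been erased.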
  • The limited number of erase cycles means that a solid state hard disk can generally be erased only 100,000 to one million times.
  • For garbage collection (GC), the method commonly used in the prior art is the classic Greedy algorithm: the data block containing the most invalid pages is selected for garbage collection, and a data block in which all pages are invalid is reclaimed first. In other words, when free space in the solid state hard disk is insufficient, the valid pages in the data recovery block are relocated and the invalid pages in that block are erased, which implements garbage collection of the solid state hard disk.
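As a rough sketch of the Greedy selection just described, the victim block is simply the one with the most invalid pages. The function name, the input shape and the tie-breaking rule (lowest block id) are assumptions made for illustration.

```python
def greedy_victim(invalid_pages_per_block):
    """Return the id of the block holding the most invalid pages.

    invalid_pages_per_block: dict mapping block id -> number of invalid pages.
    Ties go to the lowest block id (an arbitrary choice for this sketch).
    """
    return max(sorted(invalid_pages_per_block),
               key=lambda blk: invalid_pages_per_block[blk])

victim = greedy_victim({0: 3, 1: 10, 2: 10})
```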
  • However, the SSD does not subdivide the valid page data. After relocation, the cold dirty page data among the valid page data will still be replaced from the cache, and the replaced dirty page data must then be relocated again: the data that has just been moved to the new location is marked invalid, and the new data from the cache is written to an updated location in the SSD.
  • This large number of unnecessary secondary relocations of valid pages during use of the solid state hard disk greatly increases its overhead and thereby reduces the efficiency of data processing in the solid state hard disk.
  • The embodiments of the invention provide a data processing method and apparatus, so as to at least solve the problem in the related art that data processing efficiency is low because of secondary data relocation.
  • A data processing method includes: acquiring a reclaim request, wherein the reclaim request is used to request data recovery of page data in a solid state hard disk; obtaining, in response to the reclaim request, a first type of page data from the valid page data in a cache, wherein the first type of page data indicates page data that is to be replaced from the cache to the solid state hard disk; and relocating the first type of page data from the cache to a predetermined relocation location in the solid state hard disk, wherein the predetermined relocation location is the storage location of the valid page data after the data recovery is performed.
  • Obtaining the first type of page data from the valid page data in the cache in response to the reclaim request includes: obtaining an access frequency and a modification identifier of the valid page data in the cache; and obtaining a page type of the valid page data according to the access frequency and the modification identifier, wherein the page type of the valid page data includes the first type of page data and a second type of page data, and the second type of page data indicates page data that is not to be replaced from the cache to the solid state hard disk.
  • The second type of page data includes first page data and second page data. Obtaining the page type of the valid page data according to the access frequency and the modification identifier includes: taking page data whose modification identifier indicates unmodified as the first page data; taking page data whose modification identifier indicates modified and whose access frequency is greater than or equal to a first predetermined threshold as the second page data; and taking page data whose modification identifier indicates modified and whose access frequency is less than the first predetermined threshold as the first type of page data.
  • Before relocating the first type of page data from the cache to the predetermined relocation location in the solid state hard disk, the method further includes: determining, according to at least the first type of page data, a data recovery block of the solid state hard disk, wherein the page types of the page data in each data block in the solid state hard disk include unwritten page data, invalid page data and the valid page data, and the data blocks include the data recovery block; and performing the data recovery on the data recovery block.
  • Determining the data recovery block of the solid state hard disk according to at least the first type of page data includes: obtaining a data recovery rate of each data block according to the first type of page data in the cache and the page data in each data block in the solid state hard disk; and determining the data recovery block according to the data recovery rate.
  • Obtaining the data recovery rate of each data block according to the first type of page data in the cache and the page data in each data block in the solid state hard disk includes repeating the following steps until every data block in the solid state hard disk has been traversed: obtaining the block identifier of the current data block; obtaining the first type of page data and the invalid page data identified by the block identifier; and obtaining the data recovery rate r of the current data block from a, b, P and B, where r represents the data recovery rate of the current data block, a represents the number of pages of invalid page data of the current data block in the solid state hard disk, b represents the number of pages of the first type of page data in the cache, P represents the page size, and B represents the block size.
  • Performing the data recovery on the data recovery block includes: relocating the valid page data in the data recovery block to the predetermined relocation location and marking that valid page data as invalid page data; and erasing the invalid page data in the data recovery block.
  • The method further includes: determining the predetermined relocation location according to the size of the unwritten page data in the data blocks of the solid state hard disk other than the data recovery block.
  • A data processing apparatus includes: a first obtaining unit, configured to acquire a reclaim request, wherein the reclaim request is used to request data recovery of page data in a solid state hard disk; a second obtaining unit, configured to obtain, in response to the reclaim request, a first type of page data from the valid page data in a cache, wherein the first type of page data indicates page data that is to be replaced from the cache to the solid state hard disk; and a relocation unit, configured to relocate the first type of page data from the cache to a predetermined relocation location in the solid state hard disk, wherein the predetermined relocation location is the storage location of the valid page data after the data recovery is performed.
  • The second obtaining unit includes: a first acquiring module, configured to acquire an access frequency and a modification identifier of the valid page data in the cache; a second acquiring module, configured to obtain a page type of the valid page data according to the access frequency and the modification identifier, wherein the page type of the valid page data includes the first type of page data and a second type of page data, and the second type of page data indicates page data that is not to be replaced from the cache to the solid state hard disk; and a separating module, configured to separate the valid page data according to its page type to obtain the first type of page data.
  • The second type of page data includes first page data and second page data. The second acquiring module obtains the page type of the valid page data by: taking page data whose modification identifier indicates unmodified as the first page data; taking page data whose modification identifier indicates modified and whose access frequency is greater than or equal to a first predetermined threshold as the second page data; and taking page data whose modification identifier indicates modified and whose access frequency is less than the first predetermined threshold as the first type of page data.
  • The apparatus further includes: a first determining unit, configured to determine, before the first type of page data is relocated from the cache to the predetermined relocation location in the solid state hard disk, a data recovery block of the solid state hard disk according to at least the first type of page data, wherein the page types of the page data in each data block in the solid state hard disk include unwritten page data, invalid page data and the valid page data, and the data blocks include the data recovery block; and a recovery unit, configured to perform the data recovery on the data recovery block.
  • The first determining unit includes: a third acquiring module, configured to obtain a data recovery rate of each data block according to the first type of page data in the cache and the page data in each data block in the solid state hard disk; and a determining module, configured to determine the data recovery block according to the data recovery rate.
  • The third acquiring module includes a processing submodule, configured to repeat the following steps until every data block in the solid state hard disk has been traversed: obtaining the block identifier of the current data block; obtaining the first type of page data and the invalid page data identified by the block identifier; and obtaining the data recovery rate r of the current data block from a, b, P and B, where r represents the data recovery rate of the current data block, a represents the number of pages of invalid page data of the current data block in the solid state hard disk, b represents the number of pages of the first type of page data in the cache, P represents the page size, and B represents the block size.
  • The recovery unit includes: a relocation module, configured to relocate the valid page data in the data recovery block to the predetermined relocation location and mark that valid page data as invalid page data; and an erasing module, configured to erase the invalid page data in the data recovery block.
  • The apparatus further includes: a second determining unit, configured to determine, before the first type of page data is relocated from the cache to the predetermined relocation location in the solid state hard disk, the predetermined relocation location according to the size of the unwritten page data in the data blocks of the solid state hard disk other than the data recovery block.
  • a computer storage medium is further provided, and the computer storage medium may store an execution instruction for executing the data processing method in the foregoing embodiment.
  • When data recovery is performed on page data in the solid state hard disk, the first type of page data, which is to be replaced from the cache to the SSD, is relocated directly to the predetermined relocation location in the SSD, without first being replaced to the SSD and then relocated again. This overcomes the problem of low data processing efficiency caused by secondary relocation of data in the related art, improves the efficiency of data processing, reduces the number of data relocations and the extra overhead that the SSD incurs during data recovery and cache replacement, and improves the performance of the SSD.
  • FIG. 1 is a flow chart of an alternative data processing method in accordance with an embodiment of the present invention.
  • FIG. 2 is a schematic structural diagram of an optional data block according to an embodiment of the present invention.
  • FIG. 3 is a flow chart of another alternative data processing method in accordance with an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of an alternative data processing apparatus in accordance with an embodiment of the present invention.
  • In this embodiment, a data processing method is provided. FIG. 1 is a flow chart of an optional data processing method according to an embodiment of the present invention. As shown in FIG. 1, the process includes the following steps:
  • Step S102: acquire a reclaim request, where the reclaim request is used to request data recovery of page data in the solid state hard disk;
  • Step S104: in response to the reclaim request, obtain the first type of page data from the valid page data in the cache, where the first type of page data indicates page data that is to be replaced from the cache to the solid state hard disk;
  • Step S106: relocate the first type of page data from the cache to a predetermined relocation location in the SSD, where the predetermined relocation location is the storage location of the valid page data after data recovery is performed.
  • The foregoing data processing method may be, but is not limited to being, applied to the garbage collection process of a solid state hard disk. That is, in this embodiment, when a request for data recovery of page data in the solid state hard disk is acquired, the first type of page data among the valid page data, which is stored in the cache and is to be replaced from the cache to the SSD, is moved directly to the predetermined relocation location in the SSD, without first being replaced to the SSD and then relocated again. This overcomes the problem of low data processing efficiency caused by secondary relocation of data in the related art, improves data processing efficiency, and greatly reduces the overhead caused by data relocation in the solid state hard disk.
  • The SSD includes a Flash Translation Layer (FTL). The FTL maps logical addresses to physical addresses through a mapping table, records page types, and detects free space, triggering data recovery when free space is insufficient. For example, when the number of erased blocks in the SSD is less than 20% of the total number of data blocks, a reclaim request is triggered to request data recovery of the page data in the SSD.
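The free-space check in the example above can be sketched as a one-line predicate. The function name and parameters are hypothetical; the 20% figure is the example value from the text, and integer arithmetic is used to avoid floating-point rounding at the boundary.

```python
def should_trigger_reclaim(erased_blocks, total_blocks, threshold_percent=20):
    """True when erased blocks make up less than threshold_percent of all blocks."""
    return erased_blocks * 100 < threshold_percent * total_blocks
```

With 100 data blocks, a reclaim request would be raised once fewer than 20 erased blocks remain.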
  • The page type of the valid page data includes the first type of page data and the second type of page data, where the first type of page data is page data that is to be replaced from the cache to the SSD, and the second type of page data is page data that is not to be replaced from the cache to the SSD.
  • Obtaining the first type of page data from the valid page data in the cache in response to the reclaim request includes: obtaining the access frequency and the modification identifier of the valid page data in the cache; obtaining the page type of the valid page data according to the access frequency and the modification identifier; and separating the valid page data according to its page type to obtain the first type of page data.
  • A cache layer (Cache Layer), which temporarily caches valid page data, queues the pages in the cache according to the Least Recently Used (LRU) algorithm, forming an LRU queue.
  • The LRU queue may be, but is not limited to being, divided into hot (HOT) pages and cold (COOL) pages according to a predetermined threshold. For example, if the predetermined threshold is 10, the 10% of pages at the tail of the LRU queue are marked as cold (COOL) pages and the preceding 90% of pages are marked as hot (HOT) pages.
  • The LRU queue may be, but is not limited to being, ordered by access frequency.
  • Page data modified in the cache layer is marked as a dirty (DIRTY) page, and unmodified page data is marked as a clean (CLEAN) page.
  • A dirty page among the cold pages is called a cold dirty (COOL DIRTY) page (identified by CD), and a dirty page among the hot pages is called a hot dirty (HOT DIRTY) page (identified by HD).
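The marking scheme above can be sketched as a single pass over the LRU queue. The function name, the list-plus-set representation and the rounding of the cold share (floor division) are assumptions for illustration; only the CD, HD and CLEAN labels and the 10% example threshold come from the text.

```python
def mark_pages(lru_queue, dirty, cold_percent=10):
    """Label each cached page as CLEAN, HD (hot dirty) or CD (cold dirty).

    lru_queue: page ids ordered from most to least recently used.
    dirty: set of page ids that were modified in the cache.
    cold_percent: share of the queue tail treated as cold.
    """
    n_cold = len(lru_queue) * cold_percent // 100
    first_cold = len(lru_queue) - n_cold  # index where the cold tail begins
    marks = {}
    for position, page in enumerate(lru_queue):
        if page not in dirty:
            marks[page] = "CLEAN"  # unmodified, whether hot or cold
        elif position >= first_cold:
            marks[page] = "CD"     # modified and in the cold tail
        else:
            marks[page] = "HD"     # modified but still hot
    return marks

marks = mark_pages(list(range(10)), dirty={0, 9})
```

In this sketch only the pages labeled CD would count as the first type of page data.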
  • Page data held in the cache exists as two copies: one in the SSD and one in the cache. For a clean page the two copies are identical; for a dirty page the copy in the cache is the latest page data and the copy in the SSD is the old page data. That is, a cold dirty page in the solid state hard disk and the cold dirty page in the cache are different copies of the same page: the cache stores the new data of the cold dirty page, and the solid state hard disk stores its old data.
  • The cold dirty page storing the old data in the solid state hard disk may be marked as invalid page data, and the new cold dirty page data in the cache is written directly to the updated location in the SSD (such as the predetermined relocation location after data recovery). This avoids first moving the cold dirty page to the solid state hard disk through cache replacement and then relocating it a second time during data recovery, which overcomes the problem of low data processing efficiency caused by secondary relocation of data in the related art, improves the efficiency of data processing, and greatly reduces the overhead caused by relocation in the solid state hard disk.
  • Each data block in the solid state hard disk may include, but is not limited to, the following five types of page data: unwritten page data (which can be represented by an unwritten page), invalid page data (which can be represented by an invalid page), and valid page data (which can be represented by a valid page), where the valid page data includes clean page data (which can be represented by a clean page), hot dirty page data (which can be represented by a hot dirty page), and cold dirty page data (which can be represented by a cold dirty page).
  • the unwritten page data is a free space in the data block, has been erased or never allocated, and can directly write data.
  • Clean page data means that data has been written to the page and the page data has not been modified in the cache; hot dirty page data means that the page data has been modified in the cache but has not been replaced from the cache because it is accessed frequently.
  • the cache is used to store valid page data, which may include, but is not limited to, the following three types of page data: clean page data, hot dirty page data, and cold dirty page data.
  • the page data of the second type may include, but is not limited to, clean page data and hot dirty page data
  • the first type of page data may include, but is not limited to, cold dirty page data.
  • Because the clean page data in the cache is consistent with the content stored in the solid state hard disk, in this embodiment the clean page data may be, but is not limited to being, treated together with the hot dirty page data as the second type of page data, which is not replaced from the cache to the SSD.
  • Before the first type of page data is relocated from the cache to the predetermined relocation location in the SSD, the method further includes:
  • Performing data recovery on the data recovery block may include: relocating the valid page data in the data recovery block to the predetermined relocation location and marking that valid page data as invalid page data; and erasing the invalid page data in the data recovery block.
  • Determining the data recovery block of the solid state hard disk according to at least the first type of page data may be, but is not limited to: obtaining the data recovery rate of each data block in the solid state hard disk according to at least the first type of page data, and determining the data recovery block (for example, determining its block identifier) by comparing the obtained data recovery rates.
  • the manner of determining the data recovery block by comparing the obtained data recovery rates includes at least one of the following:
  • Obtaining the data recovery rate of each data block in the solid state hard disk according to at least the first type of page data may include, but is not limited to, determining the data recovery rate according to the first type of page data and the invalid page data in each data block in the solid state hard disk.
  • Before the first type of page data is relocated from the cache to the predetermined relocation location in the SSD, the method further includes: determining the predetermined relocation location according to the size of the unwritten page data in the data blocks of the SSD other than the data recovery block.
  • In this way, the valid page data in the data recovery block can be completely relocated to the area corresponding to the unwritten page data in other data blocks.
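One way to read the selection rule above is as a search for a block, other than the data recovery block, whose unwritten area is large enough for everything that must be moved. The sketch below assumes sizes are counted in pages and that the first fitting block is taken; both choices, and all names, are illustrative rather than taken from the text.

```python
def choose_relocation_block(unwritten_pages, recovery_block, pages_needed):
    """Return the id of a block that can absorb the relocated pages, or None.

    unwritten_pages: dict mapping block id -> number of unwritten pages.
    recovery_block: id of the block being reclaimed (never a valid target).
    pages_needed: number of valid pages that must be relocated.
    """
    for blk in sorted(unwritten_pages):
        if blk != recovery_block and unwritten_pages[blk] >= pages_needed:
            return blk
    return None

target = choose_relocation_block({0: 64, 1: 3, 2: 40}, recovery_block=0, pages_needed=10)
```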
  • the method includes:
  • the system triggers a recycling request.
  • In this way, the first type of page data among the valid page data, which is to be replaced from the cache to the solid state hard disk, is relocated directly to the predetermined relocation location in the SSD, without first being replaced to the SSD and then relocated again. This overcomes the problem of low data processing efficiency caused by secondary relocation of data in the related art and improves data processing efficiency.
  • It also reduces the number of data relocations and the additional overhead incurred by the SSD during data recovery and cache replacement, and improves the performance of the SSD.
  • Obtaining the first type of page data from the valid page data in the cache in response to the reclaim request includes:
  • the page type of the valid page data includes the first type of page data and the second type of page data, and the second type of page data indicates page data that is not to be replaced from the cache to the solid state hard disk;
  • S3: separating the valid page data according to the page type of the valid page data to obtain the first type of page data.
  • The second type of page data includes the first page data and the second page data. Obtaining the page type of the valid page data according to the access frequency and the modification identifier includes: taking page data whose modification identifier indicates unmodified as the first page data; taking page data whose modification identifier indicates modified and whose access frequency is greater than or equal to the first predetermined threshold as the second page data; and taking page data whose modification identifier indicates modified and whose access frequency is less than the first predetermined threshold as the first type of page data.
  • The FTL in the SSD detects the free space of the SSD in real time. After the FTL detects that free space is insufficient and triggers the reclaim request, the system starts to traverse the LRU queue in the cache and marks the page type of the page data in the LRU queue.
  • The cache layer marks the 10% of pages at the tail of the LRU queue as cold pages and the remaining pages as hot pages. The cold pages and the hot pages are then traversed separately: the dirty pages among the cold pages are marked as cold dirty pages (CD), the dirty pages among the hot pages are marked as hot dirty pages (HD), and the marked page types are notified to the FTL in the SSD.
  • By separating the valid page data, the first type of page data (i.e., the cold dirty pages) is obtained, so that it can be relocated directly to the storage location of the valid page data after data recovery. This avoids the two relocations otherwise incurred by cache replacement and data recovery and reduces overhead.
  • Before the first type of page data is moved from the cache to the predetermined relocation location in the SSD, the method further includes:
  • S1: determining, according to at least the first type of page data, a data recovery block of the solid state hard disk, where the page types of the page data in each data block in the solid state hard disk include unwritten page data, invalid page data and valid page data, and the data blocks include the data recovery block;
  • the determining, by the foregoing at least the first type of page data, the data recovery block of the solid state hard disk includes:
  • the manner of determining the data recovery block by comparing the obtained data recovery rates includes at least one of the following:
  • The mapping table of the cache layer not only stores the page type marked for the page data, but also stores the preset location information of the page data after it is replaced to the SSD (such as the block identifier of the data block).
  • The FTL may count the number of cold dirty pages marked by the cache as the first type of page data, and the number of invalid pages in each data block of the solid state hard disk. Taking the data block as the unit, it obtains the number of invalid pages and the number of cold dirty pages in each data block, and from these two numbers calculates the data recovery rate of each data block.
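The per-block counting described above can be sketched with two tallies keyed by block identifier. The input shape (one block id per invalid page, and one block id per cold dirty page as recorded in the cache mapping table) is an assumption for illustration, as are the names.

```python
from collections import Counter

def count_per_block(invalid_page_blocks, cold_dirty_page_blocks):
    """Return {block id: (invalid page count, cold dirty page count)}.

    invalid_page_blocks: iterable with one block id per invalid page in the SSD.
    cold_dirty_page_blocks: iterable with one block id per cold dirty page in
        the cache, taken from the block identifier stored for that page.
    """
    invalid = Counter(invalid_page_blocks)
    cold_dirty = Counter(cold_dirty_page_blocks)
    blocks = sorted(set(invalid) | set(cold_dirty))
    return {blk: (invalid[blk], cold_dirty[blk]) for blk in blocks}

counts = count_per_block([0, 0, 1], [1, 2])
```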
  • the data recovery rate is used to accurately locate the data recovery block for data recovery in the solid state hard disk, thereby realizing accurate and efficient data recovery of the solid state hard disk, thereby ensuring data processing efficiency in the solid state hard disk.
  • obtaining data recovery rates of each data block according to the first type of page data in the cache and the page data in each data block in the solid state hard disk includes:
  • r represents the data recovery rate of the current data block
  • a represents the number of pages of the invalid page data of the current data block in the solid state hard disk
  • b represents the number of pages of the first type of page data in the cache
  • P represents the page size
  • B represents the block size.
  • the unit of read/write operations in the SSD is a page; the page size is usually 2 KB, and the access latency is generally 15 µs to 200 µs.
  • the unit of the erase operation is a block; the block size is usually 128 KB, and erasing a block takes about 2 ms.
  • the data recovery rate of each data block in the solid state hard disk is sequentially calculated in the above manner, thereby ensuring the accuracy of the determined data recovery block, thereby realizing accurate and efficient data recovery of the solid state hard disk.
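As a rough illustration of the per-block calculation described above, the following Python sketch scores each block from its count of invalid pages (a) and the count of cold dirty pages reported by the cache (b). The formula itself is not reproduced in this text, so the expression r = (a + b) × P / B is an assumption, as are the function names and the choice of the highest-rate block as the data recovery block.

```python
PAGE_SIZE = 2 * 1024      # P: page size in bytes (2 KB, as stated above)
BLOCK_SIZE = 128 * 1024   # B: block size in bytes (128 KB, as stated above)


def recovery_rate(invalid_pages, cold_dirty_pages,
                  page_size=PAGE_SIZE, block_size=BLOCK_SIZE):
    """Assumed form r = (a + b) * P / B; the source formula is not shown."""
    return (invalid_pages + cold_dirty_pages) * page_size / block_size


def pick_recovery_block(blocks):
    """blocks maps block id -> (a, b); returns the id with the highest rate.

    Taking the maximum is an assumption about how the rates are compared.
    """
    return max(blocks, key=lambda bid: recovery_rate(*blocks[bid]))


# Hypothetical per-block counts: (invalid pages, cold dirty pages).
blocks = {"blk0": (10, 2), "blk1": (20, 12), "blk2": (30, 0)}
chosen = pick_recovery_block(blocks)
```

With the sizes quoted above, a block holds 64 pages, so under this assumed formula a block with 32 invalid or cold dirty pages has a rate of 0.5.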
  • data recovery for data recovery blocks includes:
  • the clean pages and the hot dirty pages in the data recovery block can be copied directly to the predetermined relocation location, and the location information corresponding to this page data in the FTL is modified.
  • the corresponding valid page data in the data recovery block is marked as invalid page data (which can be represented by a stale page).
  • the cold dirty page in the data recovery block is also marked as invalid page data, and the latest data of the cold dirty page in the cache is copied to a predetermined relocation location. Then, the latest data of the cold dirty page in the cache is deleted, and the location information corresponding to the page data in the FTL is modified.
  • the page data in the data recovery block is erased, and the data recovery block is marked as “erased”, which completes data recovery for the SSD and releases free space.
  • performing data recovery in this manner allows both the first type and the second type of page data among the valid page data to be relocated to the predetermined relocation location in a single pass, which avoids a secondary relocation of the first type of page data and reduces the overhead of the SSD.
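The recovery steps listed above can be sketched in Python as follows. This is a minimal model under stated assumptions: the `Block` class, the state names, the dictionary-based cache and FTL, and the decision to record a relocated cold dirty page as clean are all illustrative choices that do not come from the source.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    # page id -> (state, data); state is "clean", "hot_dirty",
    # "cold_dirty" or "invalid" (names are illustrative).
    pages: dict = field(default_factory=dict)
    erased: bool = False


def reclaim(victim, target, target_id, cache, ftl):
    """Relocate each valid page of `victim` exactly once, then erase it."""
    for pid, (state, data) in list(victim.pages.items()):
        if state == "invalid":
            continue
        if state == "cold_dirty":
            # Write the newest copy from the cache and delete it there, so a
            # later cache replacement does not move this page a second time.
            target.pages[pid] = ("clean", cache.pop(pid))
        else:
            # Clean and hot dirty pages are copied as they are.
            target.pages[pid] = (state, data)
        victim.pages[pid] = ("invalid", None)  # old copy becomes invalid
        ftl[pid] = target_id                   # update location information
    victim.pages.clear()                       # erase the whole block
    victim.erased = True


victim = Block({1: ("clean", "A"), 2: ("hot_dirty", "B-old"),
                3: ("cold_dirty", "C-old"), 4: ("invalid", "x")})
target = Block()
cache = {2: "B-new", 3: "C-new"}   # newest copies of the dirty pages
ftl = {1: "blk7", 2: "blk7", 3: "blk7"}
reclaim(victim, target, "blk9", cache, ftl)
```

In this sketch the cold dirty page 3 reaches the target block with its newest data in one move, while the hot dirty page 2 keeps its newer copy in the cache.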
  • before relocating the first type of page data from the cache to the predetermined relocation location in the SSD, the method further includes:
  • for example, the size of the valid page data in the data recovery block is counted, the FTL (which is updated in real time) is queried for unwritten pages in other data blocks of the SSD that can hold valid page data of that size, and the data block found is used as the predetermined relocation location of the valid page data in the data recovery block.
  • the predetermined relocation location is determined according to the size of the unwritten page data in the data blocks of the SSD other than the data recovery block, which ensures that the valid page data in the data recovery block can be relocated in full.
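A possible way to choose the relocation location from the unwritten pages of the other blocks is sketched below. The function name, the dictionary standing in for what the FTL reports, and the best-fit policy are assumptions made for illustration; the source only requires that the chosen location can hold all valid page data of the data recovery block.

```python
def find_relocation_block(unwritten_pages, victim_id, pages_needed):
    """Return the id of a block, other than the victim, that can hold the data.

    unwritten_pages maps block id -> number of unwritten pages, as a
    stand-in for what the FTL would report. Returns None if no block fits.
    """
    candidates = [
        (free, bid) for bid, free in unwritten_pages.items()
        if bid != victim_id and free >= pages_needed
    ]
    # Best fit (smallest sufficient block) is an illustrative policy only.
    return min(candidates)[1] if candidates else None


unwritten = {"blk0": 64, "blk1": 5, "blk2": 12}
location = find_relocation_block(unwritten, victim_id="blk0", pages_needed=10)
```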
  • a data processing apparatus is also provided, which is used to implement the above embodiments and preferred embodiments; what has already been described is not repeated.
  • the term “module” may refer to a combination of software and/or hardware that implements a predetermined function.
  • although the apparatus described in the following embodiments is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
  • FIG. 4 is a schematic diagram of an optional data processing apparatus according to an embodiment of the present invention. As shown in FIG. 4, the apparatus includes:
  • the first obtaining unit 402 is configured to acquire a recycling request, where the recycling request is used to request data recovery of page data in the solid state hard disk;
  • the second obtaining unit 404 is configured to obtain, in response to the recycling request, the first type of page data from the cached valid page data, wherein the first type of page data indicates the page data that is about to be replaced from the cache and stored to the solid state hard disk;
  • the relocation unit 406 is configured to relocate the first type of page data from the cache to a predetermined relocation location in the SSD, wherein the predetermined relocation location is a storage location of the valid page data after performing data recovery.
  • the foregoing data processing apparatus may be applied to, but is not limited to, the garbage collection process of an SSD. That is, in this embodiment, when a recycling request for data recovery of page data in the SSD is obtained, the first type of page data among the valid page data, which is about to be replaced from the cache and stored to the SSD, can be relocated directly to the predetermined relocation location in the SSD, without first replacing the first type of page data to the SSD and then relocating it again. This overcomes the low data processing efficiency caused by secondary relocation of data in the related art, improves data processing efficiency, and greatly reduces the overhead that data relocation imposes on the SSD.
  • the SSD includes a Flash Translation Layer (FTL), which is used to map logical addresses to physical addresses through a mapping table, to record page types, and to detect free space and trigger data recovery when free space is insufficient. For example, when the number of erased blocks in the SSD is less than 20% of the total number of data blocks, a recycling request for data recovery of page data in the SSD is triggered.
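The free-space check that triggers a recycling request can be expressed as a one-line comparison. The sketch below uses the 20% figure from the example above; the function and parameter names are hypothetical, and integer arithmetic is used only to avoid floating-point edge cases at the boundary.

```python
def should_trigger_recycling(erased_blocks, total_blocks, threshold_percent=20):
    """True when erased blocks are fewer than threshold_percent of all blocks."""
    return erased_blocks * 100 < threshold_percent * total_blocks


# Example with a hypothetical SSD of 1000 blocks.
trigger_at_199 = should_trigger_recycling(199, 1000)
trigger_at_200 = should_trigger_recycling(200, 1000)
```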
  • the page types of the valid page data include the first type of page data and the second type of page data, wherein the first type of page data is page data that is about to be replaced from the cache and stored to the SSD, and the second type of page data is page data that is not replaced from the cache to the SSD.
  • the second obtaining unit 404 includes: (1) a first obtaining module, configured to obtain the access frequency and the modification identifier of the valid page data in the cache; (2) a second obtaining module, configured to obtain the page type of the valid page data according to the access frequency and the modification identifier, wherein the page types of the valid page data include the first type of page data and the second type of page data, and the second type of page data indicates page data that is not replaced from the cache and stored to the SSD; and (3) a separation module, configured to separate the valid page data according to its page type to obtain the first type of page data.
  • a cache layer for temporarily caching valid page data queues the pages in the cache according to the Least Recently Used (LRU) algorithm, forming an LRU queue.
  • the LRU queue may be, but is not limited to being, divided into hot (HOT) pages and cold (COOL) pages according to a predetermined threshold. For example, if the predetermined threshold is 10, the 10% of pages at the tail of the LRU queue are marked as cold (COOL) pages, and the preceding 90% of pages are marked as hot (HOT) pages.
  • the LRU queues may be, but are not limited to, ordered according to the access frequency.
  • the page data modified in the cache layer is marked as a dirty (DIRTY) page.
  • Unmodified page data is marked as a CLEAN page.
  • a dirty page among the cold pages is called a cold dirty (COOL DIRTY) page (identified by CD), and a dirty page among the hot pages is called a hot dirty (HOT DIRTY) page (identified by HD).
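The marking scheme described above (cold tail of the LRU queue, dirty versus clean) can be sketched as follows. The function name, the list and set representations, and rounding the cold share up to a whole page are assumptions; the sketch also collapses hot and cold clean pages into a single CLEAN mark for brevity.

```python
import math


def mark_pages(lru_queue, dirty, cold_percent=10):
    """lru_queue lists page ids from most to least recently used.

    Returns page id -> "CD" (cold dirty), "HD" (hot dirty) or "CLEAN".
    """
    # Rounding up is an assumption; the text only gives the percentage.
    cold_count = math.ceil(len(lru_queue) * cold_percent / 100)
    cold = set(lru_queue[len(lru_queue) - cold_count:])
    marks = {}
    for pid in lru_queue:
        if pid not in dirty:
            marks[pid] = "CLEAN"
        elif pid in cold:
            marks[pid] = "CD"
        else:
            marks[pid] = "HD"
    return marks


queue = list(range(20))            # page 0 is the most recently used
marks = mark_pages(queue, dirty={3, 18})
```

Here pages 18 and 19 form the cold tail, so the dirty page 18 is marked CD while the dirty page 3 is marked HD.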
  • there are two copies of each page of data held in the cache: one copy in the SSD and one copy in the cache. For a clean page, the two copies are identical; for a dirty page, the copy in the cache is the latest page data and the copy in the SSD is the old page data. That is, the cold dirty page in the SSD and the cold dirty page in the cache are different copies of the same page: the cache stores the new data of the cold dirty page and the SSD stores its old data.
  • the cold dirty page storing the old data in the SSD may be marked as invalid page data, and the new data of the cold dirty page in the cache is written directly to the updated location in the SSD (for example, the predetermined relocation location after data recovery is performed). This avoids first moving the cold dirty page to the SSD through cache replacement and then relocating it a second time during data recovery. It thereby overcomes the low data processing efficiency caused by secondary relocation of data in the related art and, in addition to improving data processing efficiency, greatly reduces the overhead that data relocation imposes on the SSD.
  • each data block in the SSD may include, but is not limited to, the following five types of page data: unwritten page data (represented by an unwritten page), invalid page data (represented by an invalid page), and valid page data (represented by a valid page), wherein the valid page data includes clean page data (represented by a clean page), hot dirty page data (represented by a hot dirty page), and cold dirty page data (represented by a cold dirty page).
  • unwritten page data is free space in the data block that has been erased or never allocated and to which data can be written directly.
  • clean page data means that data has been written to the page and the page data has not been modified in the cache; hot dirty page data means that the page data has been modified in the cache but, because it is accessed frequently, has not been replaced out of the cache.
  • the above cache is used to store valid page data, which may include, but is not limited to, the following three types of page data: clean page data, hot dirty page data, and cold dirty page data.
  • the page data of the second type may include, but is not limited to, clean page data and hot dirty page data
  • the first type of page data may include, but is not limited to, cold dirty page data.
  • because the clean page data among the valid page data in the cache is consistent with the content stored in the SSD, in this embodiment the clean page data, like the hot dirty page data, may be, but is not limited to being, treated as the second type of page data, which is not replaced from the cache and stored to the SSD.
  • the foregoing apparatus further includes: (1) a first determining unit, configured to determine, before the first type of page data is relocated from the cache to the predetermined relocation location in the SSD, a data recovery block of the SSD according to at least the first type of page data, wherein the page types of the page data in each data block of the SSD include unwritten page data, invalid page data, and valid page data, and the data blocks include the data recovery block; and (2) a recycling unit, configured to perform data recovery on the data recovery block.
  • the foregoing recycling unit performs data recovery on the data recovery block by relocating the valid page data in the data recovery block to the predetermined relocation location and marking that valid page data as invalid page data, and then erasing the invalid page data in the data recovery block.
  • the first determining unit is configured to determine the data recovery block of the SSD according to at least the first type of page data by obtaining the data recovery rate of each data block in the SSD according to at least the first type of page data, and then comparing the data recovery rates to determine the data recovery block (for example, determining the block identifier of the data recovery block).
  • the manner of determining the data recovery block by comparing the obtained data recovery rates includes at least one of the following:
  • obtaining the data recovery rate of each data block in the SSD according to at least the first type of page data may include, but is not limited to, determining the data recovery rate according to the first type of page data and the invalid page data in each data block of the SSD.
  • before the first type of page data is relocated from the cache to the predetermined relocation location in the SSD, the predetermined relocation location is determined according to the size of the unwritten page data in the data blocks of the SSD other than the data recovery block, so that the valid page data can be relocated in full to the area corresponding to the unwritten page data in those other data blocks.
  • the data processing apparatus can implement data recovery of the solid state hard disk by the following steps:
  • the system triggers a recycling request.
  • the first type of page data among the valid page data, which is about to be replaced from the cache and stored to the SSD, is relocated directly to the predetermined relocation location in the SSD, without first replacing it to the SSD and then relocating it again. This overcomes the low data processing efficiency caused by secondary relocation of data in the related art and improves data processing efficiency.
  • it also reduces the number of data relocations and the additional overhead incurred by the SSD during data recovery and cache replacement, and improves the performance of the SSD.
  • the second obtaining unit includes:
  • the first obtaining module is configured to obtain the access frequency and the modification identifier of the valid page data in the cache;
  • the second obtaining module is configured to obtain the page type of the valid page data according to the access frequency and the modification identifier, wherein the page types of the valid page data include the first type of page data and the second type of page data, and the second type of page data indicates page data that is not replaced from the cache and stored to the SSD;
  • the separation module is configured to separate the valid page data according to the page type of the valid page data to obtain the first type of page data.
  • the second type of page data may include first page data and second page data, where the second obtaining module obtains the page type of the valid page data by: taking page data whose modification identifier indicates that it has not been modified as the first page data; taking page data whose modification identifier indicates that it has been modified and whose access frequency is greater than or equal to a first predetermined threshold as the second page data; and taking page data whose modification identifier indicates that it has been modified and whose access frequency is less than the first predetermined threshold as the first type of page data.
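The three-way rule above maps directly onto a small classification function. The sketch below is illustrative only: the function names, the string labels, and the `(modified, access_frequency)` tuple layout are assumptions, while the comparisons follow the rule as stated.

```python
def classify(modified, access_frequency, threshold):
    """Map a cached page to its page type using the rule in the text."""
    if not modified:
        return "first_page_data"     # unmodified: belongs to the second type
    if access_frequency >= threshold:
        return "second_page_data"    # modified, frequently accessed: second type
    return "first_type"              # modified, rarely accessed: to be replaced


def separate_first_type(pages, threshold):
    """pages maps page id -> (modified, access_frequency)."""
    return [pid for pid, (modified, freq) in pages.items()
            if classify(modified, freq, threshold) == "first_type"]


pages = {"p1": (False, 1), "p2": (True, 9), "p3": (True, 2), "p4": (True, 5)}
first_type = separate_first_type(pages, threshold=5)
```

A page whose access frequency equals the threshold (p4 here) falls into the second type, because the rule uses "greater than or equal to".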
  • the FTL in the SSD detects the free space of the SSD in real time. After it detects that the free space is insufficient and triggers the recycling request, the system starts to traverse the LRU queue in the cache and marks the page type of the page data in the LRU queue.
  • the cache layer marks the 10% of pages at the tail of the LRU queue as cold pages and the remaining pages as hot pages. The cold pages and the hot pages are then traversed separately: dirty pages among the cold pages are marked as cold dirty pages (CD), dirty pages among the hot pages are marked as hot dirty pages (HD), and the FTL in the SSD is notified of the marked page types.
  • the first type of page data (i.e., the cold dirty pages) can be obtained by separating the valid page data according to page type, so that the first type of page data can be relocated conveniently and directly to the storage location of the valid page data after data recovery, which avoids the two relocations of the cache replacement and data recovery processes and reduces overhead.
  • the first determining unit is configured to determine, before the first type of page data is relocated from the cache to the predetermined relocation location in the SSD, a data recovery block of the SSD according to at least the first type of page data, wherein the page types of the page data in each data block of the SSD include unwritten page data, invalid page data, and valid page data, and the data blocks include the data recovery block;
  • the recycling unit is configured to perform data recovery on the data recovery block.
  • the first determining unit includes:
  • the third obtaining module is configured to obtain a data recovery rate of each data block according to the first type of page data in the cache and the page data in each data block in the solid state hard disk;
  • the determination module is configured to determine the data recovery block based on the data recovery rate.
  • the manner of determining the data recovery block by comparing the obtained data recovery rates includes at least one of the following:
  • the mapping table of the cache layer, which serves as the cache, stores not only the page type marked for each page of data but also the preset location information that the page data will have once it is replaced to the SSD (such as the block identifier of the data block).
  • the FTL may count, per data block, the number of cold dirty pages marked by the cache as the first type of page data and the number of invalid pages in that data block of the SSD, and then calculate the data recovery rate of each data block from these two counts.
  • the data recovery rate is used to locate precisely the data recovery block on which data recovery is to be performed in the SSD, which makes data recovery of the SSD accurate and efficient and in turn preserves data processing efficiency in the SSD.
  • the third obtaining module includes:
  • r represents the data recovery rate of the current data block
  • a represents the number of pages of the invalid page data of the current data block in the solid state hard disk
  • b represents the number of pages of the first type of page data in the cache
  • P represents the page size
  • B represents the block size.
  • the unit of read/write operations in the SSD is a page; the page size is usually 2 KB, and the access latency is generally 15 µs to 200 µs.
  • the unit of the erase operation is a block; the block size is usually 128 KB, and erasing a block takes about 2 ms.
  • the data recovery rate of each data block in the solid state hard disk is sequentially calculated in the above manner, thereby ensuring the accuracy of the determined data recovery block, thereby realizing accurate and efficient data recovery of the solid state hard disk.
  • the recycling unit includes:
  • the relocation module is configured to relocate the valid page data in the data recovery block to a predetermined relocation location, and mark the valid page data as invalid page data;
  • the erase module is set to erase the invalid page data in the data recovery block.
  • the first type of page data is a cold dirty page
  • the second type of page data is a clean page and a hot dirty page.
  • during recovery, the clean pages and the hot dirty pages in the data recovery block can be copied directly to the predetermined relocation location, and the location information corresponding to this page data in the FTL is modified.
  • the corresponding valid page data in the data recovery block is marked as invalid page data (which can be represented by a stale page).
  • the cold dirty page in the data recovery block is also marked as invalid page data, and the latest data of the cold dirty page in the cache is copied to a predetermined relocation location. Then, the latest data of the cold dirty page in the cache is deleted, and the location information corresponding to the page data in the FTL is modified.
  • the page data in the data recovery block is erased, and the data recovery block is marked as “erased”, which completes data recovery for the SSD and releases free space.
  • performing data recovery in this manner allows both the first type and the second type of page data among the valid page data to be relocated to the predetermined relocation location in a single pass, which avoids a secondary relocation of the first type of page data and reduces the overhead of the SSD.
  • the second determining unit is configured to determine, before the first type of page data is relocated from the cache to the predetermined relocation location in the SSD, the predetermined relocation location according to the size of the unwritten page data in the data blocks of the SSD other than the data recovery block.
  • for example, the size of the valid page data in the data recovery block is counted, the FTL (which is updated in real time) is queried for unwritten pages in other data blocks of the SSD that can hold valid page data of that size, and the data block found is used as the predetermined relocation location of the valid page data in the data recovery block.
  • the predetermined relocation location is determined according to the size of the unwritten page data in the data blocks of the SSD other than the data recovery block, which ensures that the valid page data in the data recovery block can be relocated in full.
  • each of the above modules may be implemented by software or hardware.
  • for the latter, this may be achieved in, but is not limited to, the following manner: the foregoing modules are all located in the same processor, or the modules are located in multiple processors.
  • Embodiments of the present invention also provide a storage medium.
  • the foregoing storage medium may be configured to store program code for performing the following steps:
  • the first type of page data is obtained from the cached valid page data in response to the recycling request, wherein the first type of page data indicates the page data that is about to be replaced from the cache and stored to the solid state hard disk;
  • the first type of page data is relocated from the cache to a predetermined relocation location in the solid state hard disk, wherein the predetermined relocation location is a storage location of the valid page data after performing data recovery.
  • the foregoing storage medium may include, but is not limited to, a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, and a magnetic memory.
  • the modules or steps of the present invention described above can be implemented by a general-purpose computing device; they can be concentrated on a single computing device or distributed across a network of multiple computing devices. Alternatively, they may be implemented by program code executable by a computing device, so that they may be stored in a storage device and executed by the computing device. In some cases, the steps shown or described may be performed in an order different from the one herein, or they may be fabricated separately as individual integrated circuit modules, or a plurality of the modules or steps may be fabricated as a single integrated circuit module.
  • the invention is not limited to any specific combination of hardware and software.
  • the first type of page data among the valid page data, which is about to be replaced from the cache and stored to the SSD, is relocated directly to the predetermined relocation location in the SSD, without first replacing the first type of page data to the SSD and then relocating it again. This overcomes the low data processing efficiency caused by secondary relocation of data in the related art and improves data processing efficiency.
  • the number of data relocations and the additional overhead incurred by the SSD during data recovery and cache replacement are reduced, and the performance of the SSD is improved.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

Embodiments of the present invention provide a data processing method and apparatus. The method comprises: acquiring a reclamation request, wherein the reclamation request is used for requesting data reclamation for page data in a solid state drive (SSD); in response to the reclamation request, acquiring a first type of page data from cached valid page data, wherein the first type of page data is used to indicate the page data to be replaced from a cache and stored to the SSD; migrating the first type of page data from the cache to a predetermined migration location in the SSD, wherein the predetermined migration location is a storage location for the valid page data after the execution of data reclamation. The present invention solves the problem in the related art wherein data processing is inefficient due to secondary migration of data, thereby improving data processing efficiency.

Description

Data processing method and apparatus
Technical field
The embodiments of the present invention relate to the field of communications and, in particular, to a data processing method and apparatus.
Background
A solid state drive (SSD) is a new generation of drive that results from applying advanced semiconductor technology to high-capacity mobile storage. Because it has no mechanical structure such as a magnetic head and does not need to move a head to locate data, an SSD starts up faster, and because there is no seek time, its storage and read/write speeds are also better than those of a mechanical hard disk. In addition to high performance, the advantages of SSDs over traditional disks include high reliability, strong shock resistance, low power consumption, and low noise. For these reasons, SSDs are gradually becoming widespread in both personal and enterprise applications.
However, SSDs also have some shortcomings, such as erase-before-write, a limited number of erase cycles, and garbage collection. 1) Erase-before-write: an SSD supports three operations (read, write, and erase), and a location must be erased before it is written, so data cannot be overwritten in place. For example, to modify data that has already been written, the old data must first be marked invalid and the new data then written to free space. This characteristic greatly reduces the write performance of SSDs. 2) Limited number of erase cycles: an SSD can generally be erased one hundred thousand to one million times. Once a unit is worn out it can no longer be used, and the data it stores must be migrated to another unit; when the number of damaged units exceeds a certain amount, the entire SSD becomes unusable. 3) Garbage collection (GC): when there is no free space for write operations, some free space must be released; that is, all valid pages are relocated into one or several data blocks, and the blocks that no longer contain valid data pages are erased to release free space.
At present, garbage collection for SSDs in the prior art commonly uses the classic greedy algorithm, which selects the data blocks containing the most invalid pages for garbage collection; data blocks whose pages are all invalid are reclaimed first. That is, when free space in the SSD is insufficient, the valid pages in the data recovery block of the SSD are relocated and the invalid pages in the data recovery block are erased, which accomplishes garbage collection for the SSD. However, in the existing garbage collection process, the SSD does not subdivide the valid page data. After the relocation, the cold dirty page data among the valid page data is replaced out of the cache, and that cold dirty page data must then be relocated again: the data that has just been moved to the new location is marked invalid, and the new data in the cache is written to an updated location in the SSD. Performing such a large number of needless secondary relocations of valid pages during use greatly increases the overhead of the SSD and thus reduces the efficiency of data processing in the SSD.
No effective solution to the above problem has yet been proposed.
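For reference, the greedy victim selection used in the related art can be sketched in a few lines of Python. The function name and the dictionary of per-block invalid page counts are illustrative assumptions; the sketch shows only the baseline policy, not the method of this application.

```python
def greedy_victim(invalid_counts):
    """Return the block id with the most invalid pages (classic greedy GC)."""
    return max(invalid_counts, key=invalid_counts.get)


# Hypothetical counts of invalid pages per block (64 pages per block).
invalid_counts = {"blk0": 12, "blk1": 64, "blk2": 30}
victim = greedy_victim(invalid_counts)
```

Because this policy looks only at invalid pages, it cannot see that some valid pages in the chosen block are cold dirty pages whose newer copies are about to leave the cache.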
Summary of the invention
The embodiments of the present invention provide a data processing method and apparatus to at least solve the problem in the related art of low data processing efficiency caused by secondary relocation of data.
According to one aspect of the embodiments of the present invention, a data processing method is provided, including: obtaining a recycling request, wherein the recycling request is used to request data recovery of page data in an SSD; in response to the recycling request, obtaining a first type of page data from the cached valid page data, wherein the first type of page data indicates page data that is about to be replaced from the cache and stored to the SSD; and relocating the first type of page data from the cache to a predetermined relocation location in the SSD, wherein the predetermined relocation location is the storage location of the valid page data after the data recovery is performed.
Optionally, obtaining the page data of the first type from the valid page data in the cache in response to the reclamation request includes: obtaining the access frequency and the modification flag of the valid page data in the cache; determining the page type of the valid page data according to the access frequency and the modification flag, where the page types of the valid page data include the page data of the first type and page data of a second type, the page data of the second type being page data that is not evicted from the cache to the SSD; and separating the valid page data according to its page type to obtain the page data of the first type.
Optionally, the page data of the second type includes first page data and second page data, and determining the page type of the valid page data according to the access frequency and the modification flag includes: treating page data whose modification flag indicates that it has not been modified as the first page data; treating page data whose modification flag indicates that it has been modified and whose access frequency is greater than or equal to a first predetermined threshold as the second page data; and treating page data whose modification flag indicates that it has been modified and whose access frequency is less than the first predetermined threshold as the page data of the first type.
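By way of illustration only, the three-way classification described above could be sketched as follows. The names `PageKind`, `CachedPage`, `modified`, `access_freq`, and `classify` are hypothetical and do not appear in the application; the sketch assumes the access frequency is an integer count.

```python
from dataclasses import dataclass
from enum import Enum


class PageKind(Enum):
    CLEAN = "clean"            # "first page data" (part of the second type)
    HOT_DIRTY = "hot_dirty"    # "second page data" (part of the second type)
    COLD_DIRTY = "cold_dirty"  # page data of the first type


@dataclass
class CachedPage:
    modified: bool    # hypothetical field standing in for the modification flag
    access_freq: int  # hypothetical field standing in for the access frequency


def classify(page: CachedPage, threshold: int) -> PageKind:
    """Classify a cached valid page by modification flag and access frequency."""
    if not page.modified:
        return PageKind.CLEAN
    if page.access_freq >= threshold:
        return PageKind.HOT_DIRTY
    return PageKind.COLD_DIRTY
```

Only pages classified as `COLD_DIRTY` in this sketch would be separated out as page data of the first type.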
Optionally, before relocating the page data of the first type from the cache to the predetermined relocation location in the SSD, the method further includes: determining a data reclamation block of the SSD at least according to the page data of the first type, where the page types of the page data in each data block of the SSD include unwritten page data, invalid page data, and the valid page data, and the data blocks include the data reclamation block; and performing the data reclamation on the data reclamation block.
Optionally, determining the data reclamation block of the SSD at least according to the page data of the first type includes: obtaining the data reclamation rate of each data block according to the page data of the first type in the cache and the page data in each data block of the SSD; and determining the data reclamation block according to the data reclamation rates.
Optionally, obtaining the data reclamation rate of each data block according to the page data of the first type in the cache and the page data in each data block of the SSD includes repeating the following steps until every data block in the SSD has been traversed: obtaining the block identifier of the current data block; obtaining the page data of the first type and the invalid page data identified by the block identifier; and obtaining the data reclamation rate of the current data block as follows:
Figure PCTCN2017074290-appb-000001
where r denotes the data reclamation rate of the current data block, a denotes the number of pages of invalid page data of the current data block in the SSD, b denotes the number of pages of the page data of the first type in the cache, P denotes the page size, and B denotes the block size.
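The formula itself is an image (Figure PCTCN2017074290-appb-000001) and is not reproduced in this text. Purely as an assumption based on the symbol definitions, one plausible reading is that r is the fraction of the block's bytes that reclamation would free, i.e. r = (a + b) × P / B. The sketch below implements only that assumed reading; the function name and parameter names are hypothetical.

```python
def reclamation_rate(invalid_pages: int, cold_dirty_pages: int,
                     page_size: int, block_size: int) -> float:
    """Assumed reading of the formula: r = (a + b) * P / B.

    ASSUMPTION: the exact expression is in an image that is not available
    here; this form is inferred from the definitions of a, b, P and B only.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return (invalid_pages + cold_dirty_pages) * page_size / block_size
```

Under this reading, a block of 64 pages of 4096 bytes with 3 invalid pages and 1 matching first-type page in the cache would have a rate of 4/64.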
Optionally, performing the data reclamation on the data reclamation block includes: relocating the valid page data in the data reclamation block to the predetermined relocation location and marking that valid page data as invalid page data; and erasing the invalid page data in the data reclamation block.
Optionally, before relocating the page data of the first type from the cache to the predetermined relocation location in the SSD, the method further includes: determining the predetermined relocation location according to the size of the unwritten page data in the data blocks of the SSD other than the data reclamation block.
According to another aspect of the embodiments of the present invention, a data processing apparatus is provided, including: a first obtaining unit, configured to obtain a reclamation request, where the reclamation request is used to request data reclamation of page data in an SSD; a second obtaining unit, configured to obtain, in response to the reclamation request, page data of a first type from the valid page data in a cache, where the page data of the first type is page data that is about to be evicted from the cache and stored to the SSD; and a relocation unit, configured to relocate the page data of the first type from the cache to a predetermined relocation location in the SSD, where the predetermined relocation location is the storage location of the valid page data after the data reclamation is performed.
Optionally, the second obtaining unit includes: a first obtaining module, configured to obtain the access frequency and the modification flag of the valid page data in the cache; a second obtaining module, configured to determine the page type of the valid page data according to the access frequency and the modification flag, where the page types of the valid page data include the page data of the first type and page data of a second type, the page data of the second type being page data that is not evicted from the cache to the SSD; and a separation module, configured to separate the valid page data according to its page type to obtain the page data of the first type.
Optionally, the page data of the second type includes first page data and second page data, and the second obtaining module determines the page type of the valid page data by: treating page data whose modification flag indicates that it has not been modified as the first page data; treating page data whose modification flag indicates that it has been modified and whose access frequency is greater than or equal to a first predetermined threshold as the second page data; and treating page data whose modification flag indicates that it has been modified and whose access frequency is less than the first predetermined threshold as the page data of the first type.
Optionally, the apparatus further includes: a first determining unit, configured to determine, before the page data of the first type is relocated from the cache to the predetermined relocation location in the SSD, a data reclamation block of the SSD at least according to the page data of the first type, where the page types of the page data in each data block of the SSD include unwritten page data, invalid page data, and the valid page data, and the data blocks include the data reclamation block; and a reclamation unit, configured to perform the data reclamation on the data reclamation block.
Optionally, the first determining unit includes: a third obtaining module, configured to obtain the data reclamation rate of each data block according to the page data of the first type in the cache and the page data in each data block of the SSD; and a determining module, configured to determine the data reclamation block according to the data reclamation rates.
Optionally, the third obtaining module includes a processing submodule, configured to repeat the following steps until every data block in the SSD has been traversed: obtaining the block identifier of the current data block; obtaining the page data of the first type and the invalid page data identified by the block identifier; and obtaining the data reclamation rate of the current data block as follows:
Figure PCTCN2017074290-appb-000002
where r denotes the data reclamation rate of the current data block, a denotes the number of pages of invalid page data of the current data block in the SSD, b denotes the number of pages of the page data of the first type in the cache, P denotes the page size, and B denotes the block size.
Optionally, the reclamation unit includes: a relocation module, configured to relocate the valid page data in the data reclamation block to the predetermined relocation location and mark that valid page data as invalid page data; and an erasing module, configured to erase the invalid page data in the data reclamation block.
Optionally, the apparatus further includes: a second determining unit, configured to determine, before the page data of the first type is relocated from the cache to the predetermined relocation location in the SSD, the predetermined relocation location according to the size of the unwritten page data in the data blocks of the SSD other than the data reclamation block.
An embodiment of the present invention further provides a computer storage medium. The computer storage medium may store execution instructions for performing the data processing method of the foregoing embodiments.
According to the embodiments of the present invention, when a reclamation request for data reclamation of page data in the SSD is obtained, the page data of the first type, that is, the valid page data that is about to be evicted from the cache and stored to the SSD, is relocated in a single step directly to the predetermined relocation location in the SSD, instead of first being evicted to the SSD and then relocated again. This overcomes the problem in the related art of low data processing efficiency caused by the secondary relocation of data, and thereby improves data processing efficiency. It also reduces the number of data relocations performed during data reclamation and cache eviction, together with the extra overhead they cause, which improves the performance of the SSD.
Brief Description of the Drawings
The drawings described here provide a further understanding of the present invention and form a part of this application. The exemplary embodiments of the present invention and their descriptions are used to explain the invention and do not unduly limit it. In the drawings:
FIG. 1 is a flowchart of an optional data processing method according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of an optional data block according to an embodiment of the present invention;
FIG. 3 is a flowchart of another optional data processing method according to an embodiment of the present invention; and
FIG. 4 is a schematic diagram of an optional data processing apparatus according to an embodiment of the present invention.
Detailed Description
The present invention is described in detail below with reference to the drawings and in conjunction with the embodiments. It should be noted that, where no conflict arises, the embodiments in this application and the features in the embodiments may be combined with one another.
It should be noted that the terms "first", "second", and the like in the specification, the claims, and the above drawings are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence.
Embodiment 1
This embodiment provides a data processing method. FIG. 1 is a flowchart of an optional data processing method according to an embodiment of the present invention. As shown in FIG. 1, the procedure includes the following steps:
Step S102, obtaining a reclamation request, where the reclamation request is used to request data reclamation of page data in the SSD;
Step S104, in response to the reclamation request, obtaining page data of a first type from the valid page data in the cache, where the page data of the first type is page data that is about to be evicted from the cache and stored to the SSD;
Step S106, relocating the page data of the first type from the cache to a predetermined relocation location in the SSD, where the predetermined relocation location is the storage location of the valid page data after the data reclamation is performed.
Optionally, in this embodiment, the data processing method may be applied to, but is not limited to, garbage collection in an SSD. That is, when page data that constitutes garbage in the SSD is reclaimed and a reclamation request for the page data in the SSD is obtained, the page data of the first type, that is, the valid page data that is about to be evicted from the cache and stored to the SSD, is relocated in a single step directly to the predetermined relocation location in the SSD, instead of first being evicted to the SSD and then relocated again. This overcomes the problem in the related art of low data processing efficiency caused by the secondary relocation of data, so that data processing efficiency is improved and the overhead that data relocation imposes on the SSD is greatly reduced.
Optionally, in this embodiment, the SSD includes a flash translation layer (FTL). The FTL maps logical addresses to physical addresses through a mapping table, marks the page types in the SSD, and monitors free space, triggering data reclamation when free space is insufficient. For example, when the number of erased blocks in the SSD falls below 20% of the total number of data blocks, a reclamation request for data reclamation of the page data in the SSD is triggered.
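As a minimal sketch of the example trigger condition above, assuming the FTL tracks simple counts of erased and total blocks: the function name and the `threshold_percent` parameter are hypothetical, and the 20% figure is only the example value given in the text.

```python
def should_trigger_reclamation(erased_blocks: int, total_blocks: int,
                               threshold_percent: int = 20) -> bool:
    """Return True when erased blocks make up less than the threshold share.

    Integer arithmetic is used so the comparison at exactly 20% is exact.
    """
    if total_blocks <= 0:
        raise ValueError("total_blocks must be positive")
    return erased_blocks * 100 < threshold_percent * total_blocks
```

With 100 blocks in total, this sketch triggers at 19 erased blocks but not at 20.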
Optionally, in this embodiment, the page types of the valid page data include page data of a first type and page data of a second type, where the page data of the first type is page data that is about to be evicted from the cache and stored to the SSD, and the page data of the second type is page data that is not evicted from the cache to the SSD.
Optionally, in this embodiment, obtaining the page data of the first type from the valid page data in the cache in response to the reclamation request includes: obtaining the access frequency and the modification flag of the valid page data in the cache; determining the page type of the valid page data according to the access frequency and the modification flag; and separating the valid page data according to its page type to obtain the page data of the first type.
For example, the cache layer, which temporarily caches the valid page data, arranges the pages in the cache into a queue according to the Least Recently Used (LRU) algorithm (the LRU queue). The LRU queue may be, but is not limited to being, divided into hot (HOT) pages and cold (COOL) pages according to a predetermined threshold. For example, if the predetermined threshold is 10%, the last 10% of the LRU queue is marked as cold (COOL) pages and the first 90% is marked as hot (HOT) pages. The LRU queue may be, but is not limited to being, ordered by access frequency.
Further, page data that has been modified in the cache layer is marked as a dirty (DIRTY) page, and page data that has not been modified is marked as a clean (CLEAN) page. A dirty page among the cold pages is called a cold dirty (COOL DIRTY) page, denoted CD, and a dirty page among the hot pages is called a hot dirty (HOT DIRTY) page, denoted HD.
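The marking of the LRU queue described above might be sketched as follows. This is an illustrative sketch only: the function name, the representation of the queue as a list of `(page_id, is_dirty)` tuples ordered from head to tail, and the choice to round the cold share down are all assumptions, not details from the application.

```python
from typing import Dict, List, Tuple


def label_lru_queue(queue: List[Tuple[str, bool]],
                    cold_percent: int = 10) -> Dict[str, str]:
    """Label each page as CLEAN, HD (hot dirty) or CD (cold dirty).

    `queue` runs from the head (most recently used) to the tail (least
    recently used); each entry is (page_id, is_dirty).
    """
    n = len(queue)
    cold_count = n * cold_percent // 100  # assumption: round down
    first_cold = n - cold_count           # index where the cold tail starts
    labels: Dict[str, str] = {}
    for index, (page_id, is_dirty) in enumerate(queue):
        if not is_dirty:
            labels[page_id] = "CLEAN"
        elif index >= first_cold:
            labels[page_id] = "CD"
        else:
            labels[page_id] = "HD"
    return labels
```

In a ten-page queue with the 10% example threshold, only the last entry can be labelled CD, and only if it is dirty.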
It should be noted that page data in the cache has two copies: one in the SSD and one in the cache. For a clean page, the two copies are identical. For a dirty page, the copy in the cache holds the latest page data and the copy in the SSD holds the old page data. In other words, a cold dirty page in the SSD and the corresponding cold dirty page in the cache are different copies of the same page: the cache copy stores the new data, and the SSD copy stores the old data. In this embodiment, cache eviction and data reclamation are combined, so that when a cold dirty page in the cache is evicted, the cold dirty page in the SSD that stores the old data can be marked as invalid page data, and the new data of the cold dirty page in the cache can be written directly to the updated location in the SSD (for example, the predetermined relocation location after data reclamation is performed). This avoids first moving the cold dirty page to the SSD through cache eviction and then relocating it a second time during data reclamation. The problem in the related art of low data processing efficiency caused by the secondary relocation of data is thus overcome, so that data processing efficiency is improved and the overhead that data relocation imposes on the SSD is greatly reduced.
For example, as shown in FIG. 2, each data block in the SSD may include, but is not limited to, the following five types of page data: unwritten page data (unwritten pages), invalid page data (invalid pages), and valid page data (valid pages), where the valid page data comprises clean page data (clean pages), hot dirty page data (hot dirty pages), and cold dirty page data (cold dirty pages). Unwritten page data is free space in the data block that has been erased or has never been allocated, so data can be written to it directly. Clean page data is a page to which data has been written and whose data has not been modified in the cache. Hot dirty page data is page data that has been modified in the cache but, because it is accessed frequently, has not yet been evicted from the cache. Cold dirty page data is page data that has been modified in the cache, is not accessed frequently, and is about to be evicted by the cache replacement algorithm. Invalid page data is page data that has been modified and whose new data has already been written to another location, so the old data is invalid. In addition, in this embodiment the cache stores valid page data, which may include, but is not limited to, the following three types of page data: clean page data, hot dirty page data, and cold dirty page data.
That is, in this embodiment, the page data of the second type may include, but is not limited to, clean page data and hot dirty page data, and the page data of the first type may include, but is not limited to, cold dirty page data. It should be noted that, because the clean page data among the valid page data has the same content in the cache as in the SSD, in this embodiment clean page data may be, but is not limited to being, treated together with hot dirty page data as page data of the second type, which is not evicted from the cache to the SSD.
Optionally, in this embodiment, before the page data of the first type is relocated from the cache to the predetermined relocation location in the SSD, the method further includes:
S1, determining a data reclamation block of the SSD at least according to the page data of the first type;
S2, performing data reclamation on the data reclamation block.
Optionally, in this embodiment, performing data reclamation on the data reclamation block may include: relocating the valid page data in the data reclamation block to the predetermined relocation location and marking that valid page data as invalid page data; and erasing the invalid page data in the data reclamation block.
Optionally, in this embodiment, determining the data reclamation block of the SSD at least according to the page data of the first type may include, but is not limited to: obtaining the data reclamation rate of each data block in the SSD at least according to the page data of the first type, and determining the data reclamation block (for example, determining its block identifier) by comparing the obtained data reclamation rates.
Optionally, in this embodiment, the manner of determining the data reclamation block by comparing the obtained data reclamation rates includes at least one of the following:
1) the data block with the highest data reclamation rate is used as the data reclamation block;
2) if several data blocks share the same, highest data reclamation rate, the numbers of pages of the first type in these data blocks are compared, and the block with the larger number is used as the data reclamation block;
3) if several data blocks share the same, highest data reclamation rate and contain the same number of pages of the first type, the numbers of pages of the second type in these data blocks are compared, and the block with the smaller number is used as the data reclamation block;
4) if several data blocks share the same, highest data reclamation rate and contain the same number of pages of the first type and the same number of pages of the second type, the data block with the largest block identifier is used as the data reclamation block.
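The four rules above can be read as a single ordered comparison. The sketch below applies all four in sequence, which is an assumption, since the text requires only "at least one" of them; `BlockStats` and its field names are hypothetical, and ties on the rate are detected by exact equality of the stored values.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class BlockStats:
    block_id: int
    rate: float        # data reclamation rate of the block
    first_type: int    # number of first-type (cold dirty) pages
    second_type: int   # number of second-type (clean and hot dirty) pages


def pick_reclamation_block(blocks: Iterable[BlockStats]) -> BlockStats:
    """Highest rate; then most first-type pages; then fewest second-type
    pages; then largest block identifier."""
    return max(
        blocks,
        key=lambda b: (b.rate, b.first_type, -b.second_type, b.block_id),
    )
```

Negating `second_type` in the key lets a single `max` call prefer the smaller count for rule 3 while preferring larger values for the other rules.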
Optionally, in this embodiment, obtaining the data reclamation rate of each data block in the SSD at least according to the page data of the first type may include, but is not limited to: determining the data reclamation rate according to the page data of the first type and the invalid page data in each data block of the SSD.
Optionally, in this embodiment, before the page data of the first type is relocated from the cache to the predetermined relocation location in the SSD, the method further includes: determining the predetermined relocation location according to the size of the unwritten page data in the data blocks of the SSD other than the data reclamation block, so that all of the valid page data in the SSD can be relocated to the areas corresponding to unwritten page data in those other data blocks.
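One way the relocation location might be chosen from the unwritten space of the other blocks is sketched below. The text says only that the location is determined from the size of the unwritten page data outside the reclamation block; selecting a single destination block, and preferring the tightest fit, are assumptions made for this illustration, and all names are hypothetical.

```python
from typing import Dict, Optional


def choose_relocation_block(unwritten_pages: Dict[int, int],
                            reclamation_block: int,
                            pages_needed: int) -> Optional[int]:
    """Return the id of a block, other than the reclamation block, whose
    unwritten pages can hold `pages_needed` pages, or None if none can.

    `unwritten_pages` maps block id -> number of unwritten pages.
    Assumption: the tightest fit wins; ties go to the smaller block id.
    """
    candidates = [
        (free, block_id)
        for block_id, free in unwritten_pages.items()
        if block_id != reclamation_block and free >= pages_needed
    ]
    if not candidates:
        return None
    return min(candidates)[1]
```

A fuller design could spread the relocated pages over several blocks when no single block has enough unwritten pages.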
This is described below with reference to the following example. As shown in FIG. 3, the method includes:
S302, the system triggers a reclamation request;
S304, obtaining the page types of all page data in the cache, and obtaining the page data of the first type from them;
S306, calculating the data reclamation rate of each data block in the SSD;
S308, determining the data reclamation block;
S310, selecting the predetermined relocation location;
S312, copying the page data of the second type in the data reclamation block directly to the predetermined relocation location;
S314, marking the page data in the data reclamation block that corresponds to the page data of the first type as invalid pages, and copying the latest page data of the first type in the cache to the predetermined relocation location;
S316, erasing the data reclamation block.
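Steps S312 to S316 can be illustrated with a small in-memory model. This is a sketch under stated assumptions rather than the claimed implementation: flash pages are modelled as a dictionary from a hypothetical logical page number to data, the destination is a plain list, and a first-type page is removed from the cache once written, which the text does not state explicitly.

```python
from typing import Dict, List, Tuple


def reclaim_block(victim_valid: Dict[int, str],
                  cold_dirty_cache: Dict[int, str],
                  destination: List[Tuple[int, str]]) -> None:
    """Relocate one reclamation block, writing each logical page exactly once.

    victim_valid:      valid pages of the reclamation block (page number -> data)
    cold_dirty_cache:  first-type pages held in the cache (page number -> new data)
    destination:       the predetermined relocation location
    """
    # S312: second-type pages are copied as they are stored on flash.
    for page_no, data in victim_valid.items():
        if page_no not in cold_dirty_cache:
            destination.append((page_no, data))
    # S314: the stale flash copy of a first-type page is skipped (treated as
    # invalid) and the newest data is written straight from the cache.
    for page_no in victim_valid:
        if page_no in cold_dirty_cache:
            destination.append((page_no, cold_dirty_cache.pop(page_no)))
    # S316: erase the block.
    victim_valid.clear()
```

In this model a cold dirty page reaches flash with a single write, whereas evicting it only after reclamation would have cost a copy of the stale data plus a later write of the new data.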
According to the embodiment provided in this application, when a reclamation request for data reclamation of page data in the SSD is obtained, the page data of the first type, that is, the valid page data that is about to be evicted from the cache and stored to the SSD, is relocated in a single step directly to the predetermined relocation location in the SSD, instead of first being evicted to the SSD and then relocated again. This overcomes the problem in the related art of low data processing efficiency caused by the secondary relocation of data, and thereby improves data processing efficiency. It also reduces the number of data relocations performed during data reclamation and cache eviction, together with the extra overhead they cause, which improves the performance of the SSD.
As an optional solution, obtaining the page data of the first type from the valid page data in the cache in response to the reclamation request includes:
S1, obtaining the access frequency and the modification flag of the valid page data in the cache;
S2, determining the page type of the valid page data according to the access frequency and the modification flag, where the page types of the valid page data include the page data of the first type and page data of a second type, the page data of the second type being page data that is not evicted from the cache to the SSD;
S3, separating the valid page data according to its page type to obtain the page data of the first type.
Optionally, in this embodiment, the page data of the second type includes first page data and second page data, and determining the page type of the valid page data according to the access frequency and the modification flag includes: treating page data whose modification flag indicates that it has not been modified as the first page data; treating page data whose modification flag indicates that it has been modified and whose access frequency is greater than or equal to a first predetermined threshold as the second page data; and treating page data whose modification flag indicates that it has been modified and whose access frequency is less than the first predetermined threshold as the page data of the first type.
That is, in this embodiment, the FTL in the solid-state drive monitors the free space of the drive in real time. After it detects that free space is insufficient and triggers a reclaim request, the system starts traversing the LRU queue in the cache and marks a page type for each page in the queue.
For example, assuming the predetermined threshold used as the boundary is 10%, the cache layer marks the pages in the last 10% of the LRU queue (the tail) as cold pages and the remaining pages as hot pages. It then traverses the cold pages and the hot pages separately, marks the dirty pages among the cold pages as cold dirty (CD) pages and the dirty pages among the hot pages as hot dirty (HD) pages, and notifies the FTL in the solid-state drive of the marked page types.
In other words, once the page types marked in the cache have been obtained, the first type of page data (the cold dirty pages) can be separated out and relocated once, to the storage location that valid page data will occupy after data reclamation. This avoids the two relocations that performing cache replacement and data reclamation separately would require, and so reduces overhead.
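The LRU-based marking in the example above might look roughly like the following sketch. The function name `label_lru_queue`, the string labels, and the round-up rule for the size of the cold region are assumptions made for illustration; the description only states that the last 10% of the queue is cold.

```python
def label_lru_queue(lru_queue, dirty_ids, cold_percent=10):
    """Label every page in an LRU queue.

    lru_queue: page ids ordered from head (most recently used) to tail (least recently used).
    dirty_ids: set of page ids that have been modified in the cache.
    Returns a dict mapping page id to "CLEAN", "HD" (hot dirty) or "CD" (cold dirty).
    """
    # Number of pages in the cold tail, rounded up (the rounding is an assumption).
    n_cold = (len(lru_queue) * cold_percent + 99) // 100
    first_cold = len(lru_queue) - n_cold

    labels = {}
    for index, page_id in enumerate(lru_queue):
        if page_id not in dirty_ids:
            labels[page_id] = "CLEAN"
        elif index >= first_cold:
            labels[page_id] = "CD"  # dirty page in the cold tail
        else:
            labels[page_id] = "HD"  # dirty page in the hot region
    return labels
```

With a ten-page queue and the default 10% boundary, only the single page at the tail can be labeled cold dirty; the labels would then be passed to the FTL.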
As an optional solution, before relocating the first type of page data from the cache to the predetermined relocation location in the solid-state drive, the method further includes:
S1: determining a data reclamation block of the solid-state drive at least according to the first type of page data, where the page types of the page data in each data block of the solid-state drive include unwritten page data, invalid page data, and valid page data, and the data blocks include the data reclamation block;
S2: performing data reclamation on the data reclamation block.
Optionally, in this embodiment, determining the data reclamation block of the solid-state drive at least according to the first type of page data includes:
S12: obtaining the data reclamation rate of each data block according to the first type of page data in the cache and the page data in each data block of the solid-state drive;
S14: determining the data reclamation block according to the data reclamation rates.
Optionally, in this embodiment, the data reclamation block is determined by comparing the obtained data reclamation rates in at least one of the following ways:
1) The data block with the highest data reclamation rate is taken as the data reclamation block;
2) If several data blocks share the same, highest data reclamation rate, the numbers of first-type pages in those data blocks are compared, and the block with the larger number is taken as the data reclamation block;
3) If several data blocks share the same, highest data reclamation rate and also contain the same number of first-type pages, the numbers of second-type pages in those data blocks are compared, and the block with the smaller number is taken as the data reclamation block;
4) If several data blocks share the same, highest data reclamation rate and contain the same number of first-type pages and the same number of second-type pages, the data block with the largest block identifier is taken as the data reclamation block.
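One way to read the four rules above is as a single cascade of tie-breakers. The sketch below assumes that reading (the text says "at least one of", so other combinations are possible); `BlockStats` and `pick_reclamation_block` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockStats:
    block_id: int
    reclamation_rate: float
    first_type_pages: int   # e.g. cold dirty pages mapped to this block
    second_type_pages: int  # e.g. clean and hot dirty pages in this block


def pick_reclamation_block(blocks):
    """Choose the block to reclaim.

    Priority: highest reclamation rate, then most first-type pages,
    then fewest second-type pages, then largest block identifier.
    """
    return max(
        blocks,
        key=lambda b: (
            b.reclamation_rate,
            b.first_type_pages,
            -b.second_type_pages,  # negated so that fewer pages ranks higher
            b.block_id,
        ),
    )
```

Python compares the key tuples element by element, so a later criterion is consulted only when all earlier ones are equal. Rates are compared for exact equality here; a real implementation working with floating-point rates might prefer integer page counts.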
It should be noted that, in this embodiment, the mapping table of the cache layer stores not only the page type marked for each page, but also the corresponding preset location information (such as the block identifier of the data block) that the page data will have after it is evicted and written to the solid-state drive.
For example, assuming the first type of page data is cold dirty pages, the FTL can count, block by block, the cold dirty pages that the cache has marked as first-type page data and the invalid pages in each data block of the solid-state drive. It then uses the number of invalid pages and the number of cold dirty pages of each data block to calculate that block's data reclamation rate.
With the embodiments provided in this application, the data reclamation rate is used to accurately locate the data block in the solid-state drive on which data reclamation is to be performed. This makes data reclamation on the solid-state drive accurate and efficient, which in turn safeguards the efficiency of data processing in the solid-state drive.
As an optional solution, obtaining the data reclamation rate of each data block according to the first type of page data in the cache and the page data in each data block of the solid-state drive includes:
S1: repeating the following steps until every data block in the solid-state drive has been traversed:
S12: obtaining the block identifier of the current data block;
S14: obtaining the first type of page data and the invalid page data identified by the block identifier;
S16: obtaining the data reclamation rate of the current data block as follows:
Figure PCTCN2017074290-appb-000003
where r denotes the data reclamation rate of the current data block, a denotes the number of pages of invalid page data in the current data block of the solid-state drive, b denotes the number of pages of first-type page data in the cache, P denotes the page size, and B denotes the block size.
It should be noted that, in this embodiment, reads and writes in the solid-state drive are performed in units of pages; the page size is typically 2 KB and the access latency is generally 15 µs to 200 µs. Erases are performed in units of blocks; the block size is typically 128 KB, and erasing one block costs about 2 ms.
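The formula itself is an image that is not reproduced in this text. One reading consistent with the symbol definitions is r = (a + b) * P / B, that is, the fraction of a block's bytes occupied by invalid pages plus first-type (cold dirty) pages. The sketch below assumes that reading, and also assumes that b is counted per block (first-type pages mapped to the current block), as step S14 suggests; neither assumption is confirmed by the source, and the function names are hypothetical.

```python
def reclamation_rate(invalid_pages, first_type_pages, page_size, block_size):
    """Assumed form of the rate: r = (a + b) * P / B.

    invalid_pages    -- a, invalid pages in the current block
    first_type_pages -- b, first-type (cold dirty) pages associated with the block
    page_size        -- P, in bytes
    block_size       -- B, in bytes
    """
    return (invalid_pages + first_type_pages) * page_size / block_size


def rates_for_blocks(invalid_counts, first_type_counts, page_size, block_size):
    """Traverse all blocks (step S1) and compute one rate per block identifier."""
    return {
        block_id: reclamation_rate(
            invalid_counts[block_id],
            first_type_counts.get(block_id, 0),
            page_size,
            block_size,
        )
        for block_id in invalid_counts
    }
```

With the typical sizes given above (2 KB pages, 128 KB blocks, so 64 pages per block), a block with 20 invalid pages and 12 cold dirty pages would have a rate of 0.5 under this reading.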
With the embodiments provided in this application, the data reclamation rate of each data block in the solid-state drive is calculated in turn in the manner described above. This ensures that the data reclamation block is determined accurately, and thereby achieves accurate and efficient data reclamation on the solid-state drive.
As an optional solution, performing data reclamation on the data reclamation block includes:
S1: relocating the valid page data in the data reclamation block to the predetermined relocation location, and marking that valid page data in the data reclamation block as invalid page data;
S2: erasing the invalid page data in the data reclamation block.
For example, taking cold dirty pages as the first type of page data and clean pages and hot dirty pages as the second type of page data, during data reclamation the clean pages and hot dirty pages in the data reclamation block can be copied directly to the predetermined relocation location, and the location information for those pages in the FTL is updated. The corresponding valid page data in the data reclamation block is then marked as invalid page data (which may be represented as invalid pages).
Next, the cold dirty pages in the data reclamation block are also marked as invalid page data, and the latest data of each such cold dirty page, held in the cache, is copied to the predetermined relocation location. That latest data is then deleted from the cache, and the location information for the page in the FTL is updated.
Finally, the page data in the data reclamation block is erased and the block is marked as "erased", which completes data reclamation on the solid-state drive and frees space.
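The reclamation sequence described above can be sketched with plain dictionaries standing in for flash blocks, the cache, and the FTL mapping table. This is a toy model under stated assumptions, not the device firmware: `reclaim_block` and all of its parameters are hypothetical, and the victim dictionary is assumed to hold only the block's valid pages.

```python
def reclaim_block(victim, cache, cold_dirty_ids, destination, mapping, destination_id):
    """Relocate every valid page of the victim block once, then erase the block.

    victim         -- dict: page id -> data stored in the block being reclaimed (valid pages only)
    cache          -- dict: page id -> latest data held in the cache
    cold_dirty_ids -- set of page ids marked as first-type (cold dirty) pages
    destination    -- dict standing in for the predetermined relocation location
    mapping        -- dict standing in for the FTL table: page id -> block id
    """
    for page_id in list(victim):
        if page_id in cold_dirty_ids:
            # Cold dirty page: the on-flash copy is stale, so write the latest
            # copy from the cache and remove it from the cache (it is now evicted).
            destination[page_id] = cache.pop(page_id)
        else:
            # Clean or hot dirty page: copy the on-flash data directly.
            destination[page_id] = victim[page_id]
        mapping[page_id] = destination_id  # update the location information in the FTL

    # All pages in the victim are now invalid; erasing is modeled by clearing the dict.
    victim.clear()
    return "erased"
```

In this model each page is written exactly once, whereas evicting the cold dirty page first and reclaiming its block afterwards would write that page twice.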
With the embodiments provided in this application, reclaiming data from the solid-state drive in this way ensures that both the first type and the second type of page data among the valid page data are relocated to the predetermined relocation location in one step. This avoids a second relocation of the first type of page data and reduces the overhead of the solid-state drive.
As an optional solution, before relocating the first type of page data from the cache to the predetermined relocation location in the solid-state drive, the method further includes:
S1: determining the predetermined relocation location according to the size of the unwritten page data in the data blocks of the solid-state drive other than the data reclamation block.
For example, the size of the valid page data in the data reclamation block is computed; the FTL, which is updated in real time, is consulted to find unwritten pages in the other data blocks of the solid-state drive that can hold valid page data of that size; and the data block found is used as the predetermined relocation location for the valid page data of the data reclamation block.
With the embodiments provided in this application, the predetermined relocation location is determined according to the size of the unwritten page data in the data blocks of the solid-state drive other than the data reclamation block, which ensures that all of the valid page data in the data reclamation block can be relocated.
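Selecting the relocation location might be sketched as a search over per-block counts of unwritten pages. The first-fit order by block identifier and the name `choose_relocation_block` are assumptions; the description only requires that the chosen block is not the reclamation block and has enough unwritten space.

```python
def choose_relocation_block(unwritten_pages, victim_id, pages_needed):
    """Return the id of a block that can receive all valid pages of the victim.

    unwritten_pages -- dict: block id -> number of unwritten pages (as tracked by the FTL)
    victim_id       -- id of the data reclamation block, which must be skipped
    pages_needed    -- number of valid pages that have to be relocated
    Returns None when no single block has enough unwritten pages.
    """
    for block_id in sorted(unwritten_pages):
        if block_id != victim_id and unwritten_pages[block_id] >= pages_needed:
            return block_id
    return None
```

A caller would need a fallback (for example, spreading pages over several blocks) when `None` is returned; the description does not cover that case.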
From the description of the above implementations, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus the necessary general-purpose hardware platform, or by hardware alone, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, may be embodied as a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present invention.
Embodiment 2
This embodiment further provides a data processing apparatus, which is used to implement the above embodiments and preferred implementations; what has already been described is not repeated. As used below, the term "module" refers to a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the following embodiments is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
FIG. 4 is a schematic diagram of an optional data processing apparatus according to an embodiment of the present invention. As shown in FIG. 4, the apparatus includes:
1) a first obtaining unit 402, configured to obtain a reclaim request, where the reclaim request is used to request data reclamation of page data in the solid-state drive;
2) a second obtaining unit 404, configured to obtain, in response to the reclaim request, the first type of page data from the valid page data in the cache, where the first type of page data indicates page data that is about to be evicted from the cache and written to the solid-state drive;
3) a relocation unit 406, configured to relocate the first type of page data from the cache to the predetermined relocation location in the solid-state drive, where the predetermined relocation location is the storage location of valid page data after data reclamation has been performed.
Optionally, in this embodiment, the data processing apparatus may be applied to, but is not limited to, garbage collection in a solid-state drive. That is, when page data that has become garbage in the solid-state drive is reclaimed, and a reclaim request for the page data in the solid-state drive is received, the first type of page data, that is, the valid page data about to be evicted from the cache and written to the solid-state drive, is relocated in a single step directly to the predetermined relocation location in the solid-state drive. It is not first evicted to the solid-state drive and then relocated a second time. This overcomes the low data processing efficiency that the related art suffers because of secondary relocation of data, and so improves data processing efficiency while also greatly reducing the overhead that data relocation imposes on the solid-state drive.
Optionally, in this embodiment, the solid-state drive includes a flash translation layer (FTL). The FTL maps logical addresses to physical addresses through a mapping table, marks the page types in the solid-state drive, and monitors free space, triggering data reclamation when space runs short. For example, when the number of erased blocks in the solid-state drive falls below 20% of the total number of data blocks, a reclaim request is triggered, requesting data reclamation of the page data in the solid-state drive.
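The trigger condition in the example above can be written as a one-line check. This is a minimal sketch with a hypothetical function name; whether exactly 20% should also trigger reclamation is not clear from the text, and a strict "below" comparison is assumed here.

```python
def should_trigger_reclaim(erased_blocks, total_blocks, threshold_percent=20):
    """Return True when erased blocks make up less than threshold_percent of all blocks.

    Integer arithmetic is used so the comparison is exact.
    """
    return erased_blocks * 100 < total_blocks * threshold_percent
```

For a drive with 100 blocks, 19 erased blocks would trigger a reclaim request and 20 would not, under the strict reading assumed above.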
Optionally, in this embodiment, the page types of the valid page data include the first type of page data and the second type of page data, where the first type of page data is page data that is about to be evicted from the cache and written to the solid-state drive, and the second type of page data is page data that is not evicted from the cache to the solid-state drive.
Optionally, in this embodiment, the second obtaining unit 404 includes: (1) a first obtaining module, configured to obtain the access frequency and the modification flag of the valid page data in the cache; (2) a second obtaining module, configured to obtain the page type of the valid page data according to the access frequency and the modification flag, where the page types of the valid page data include the first type of page data and the second type of page data, and the second type of page data indicates page data that is not evicted from the cache to the solid-state drive; and (3) a separation module, configured to separate the valid page data according to its page type to obtain the first type of page data.
For example, the cache layer, which temporarily caches valid page data, arranges the pages in the cache into a queue according to the least recently used (LRU) algorithm; this queue is the LRU queue. The LRU queue may be, but is not limited to being, divided into hot (HOT) pages and cold (COOL) pages according to a predetermined threshold. For example, if the predetermined threshold is 10%, the last 10% of the LRU queue (the tail) is marked as cold (COOL) pages and the first 90% is marked as hot (HOT) pages. The LRU queue may be, but is not limited to being, ordered by access frequency.
Further, page data that has been modified in the cache layer is marked as a dirty (DIRTY) page, and page data that has not been modified is marked as a clean (CLEAN) page. A dirty page among the cold pages is called a cold dirty (COOL DIRTY) page, identified by CD, and a dirty page among the hot pages is called a hot dirty (HOT DIRTY) page, identified by HD.
It should be noted that page data in the cache exists as two copies: one in the solid-state drive and one in the cache. For a clean page, the two copies are identical. For a dirty page, the copy in the cache holds the latest page data and the copy in the solid-state drive holds the old page data. In other words, a cold dirty page in the solid-state drive and the cold dirty page in the cache are different copies of the same page: the copy in the cache holds the new data, and the copy in the solid-state drive holds the old data. In this embodiment, cache replacement and data reclamation are combined so that, when a cold dirty page is evicted from the cache, the cold dirty page holding the old data in the solid-state drive can be marked as invalid page data, and the new data held in the cache can be written directly to the updated location in the solid-state drive (such as the predetermined relocation location used after data reclamation). This avoids first moving the cold dirty page into the solid-state drive through cache replacement and then relocating it a second time during data reclamation. It thereby overcomes the low data processing efficiency that the related art suffers because of secondary relocation of data, improving data processing efficiency while also greatly reducing the overhead that data relocation imposes on the solid-state drive.
For example, as shown in FIG. 2, each data block in the solid-state drive may include, but is not limited to, the following five types of page data: unwritten page data (unwritten pages), invalid page data (invalid pages), and valid page data (valid pages), where valid page data comprises clean page data (clean pages), hot dirty page data (hot dirty pages), and cold dirty page data (cold dirty pages). Unwritten page data is free space in the data block that has been erased or has never been allocated, so data can be written to it directly. Clean page data is a page to which data has been written and whose data has not been modified in the cache. Hot dirty page data is page data that has been modified in the cache but, because it is accessed frequently, has not yet been evicted from the cache. Cold dirty page data is page data that has been modified in the cache, is not accessed frequently, and is about to be evicted by the cache replacement algorithm. Invalid page data is page data that has been modified and whose new data has been written elsewhere, so that the old data is no longer valid. In addition, in this embodiment, the cache is used to store valid page data, which may include, but is not limited to, the following three types of page data: clean page data, hot dirty page data, and cold dirty page data.
That is, in this embodiment, the second type of page data may include, but is not limited to, clean page data and hot dirty page data, and the first type of page data may include, but is not limited to, cold dirty page data. It should be noted that, because the content of clean page data in the cache is identical to the content stored in the solid-state drive, clean page data may be, but is not limited to being, treated together with hot dirty page data as second-type page data that is not evicted from the cache to the solid-state drive.
Optionally, in this embodiment, the apparatus further includes: (1) a first determining unit, configured to determine, before the first type of page data is relocated from the cache to the predetermined relocation location in the solid-state drive, a data reclamation block of the solid-state drive at least according to the first type of page data, where the page types of the page data in each data block of the solid-state drive include unwritten page data, invalid page data, and valid page data, and the data blocks include the data reclamation block; and (2) a reclamation unit, configured to perform data reclamation on the data reclamation block.
Optionally, in this embodiment, the reclamation unit performs data reclamation on the data reclamation block through the following steps: relocating the valid page data in the data reclamation block to the predetermined relocation location and marking that valid page data as invalid page data; and erasing the invalid page data in the data reclamation block.
Optionally, in this embodiment, the first determining unit determines the data reclamation block of the solid-state drive at least according to the first type of page data through the following steps: obtaining the data reclamation rate of each data block in the solid-state drive at least according to the first type of page data, and determining the data reclamation block (for example, determining its block identifier) by comparing the obtained data reclamation rates.
Optionally, in this embodiment, the data reclamation block is determined by comparing the obtained data reclamation rates in at least one of the following ways:
1) The data block with the highest data reclamation rate is taken as the data reclamation block;
2) If several data blocks share the same, highest data reclamation rate, the numbers of first-type pages in those data blocks are compared, and the block with the larger number is taken as the data reclamation block;
3) If several data blocks share the same, highest data reclamation rate and also contain the same number of first-type pages, the numbers of second-type pages in those data blocks are compared, and the block with the smaller number is taken as the data reclamation block;
4) If several data blocks share the same, highest data reclamation rate and contain the same number of first-type pages and the same number of second-type pages, the data block with the largest block identifier is taken as the data reclamation block.
Optionally, in this embodiment, obtaining the data reclamation rate of each data block in the solid-state drive at least according to the first type of page data may include, but is not limited to: determining the data reclamation rate according to the first type of page data and the invalid page data in each data block of the solid-state drive.
Optionally, in this embodiment, before the first type of page data is relocated from the cache to the predetermined relocation location in the solid-state drive, the predetermined relocation location is determined according to the size of the unwritten page data in the data blocks of the solid-state drive other than the data reclamation block, so that all of the valid page data can be relocated to the regions of unwritten page data in those other data blocks.
This is described in detail with the following example. As shown in FIG. 3, the data processing apparatus can perform data reclamation on the solid-state drive through the following steps:
S302: the system triggers a reclaim request;
S304: obtaining the page types of all page data in the cache, and obtaining the first type of page data from them;
S306: calculating the data reclamation rate of each data block in the solid-state drive;
S308: determining the data reclamation block;
S310: selecting the predetermined relocation location;
S312: copying the second type of page data in the data reclamation block directly to the predetermined relocation location;
S314: marking the page data in the data reclamation block that corresponds to the first type of page data as invalid pages, and copying the latest first-type page data in the cache to the predetermined relocation location;
S316: erasing the data reclamation block.
With the embodiments provided in this application, when a reclaim request for reclaiming page data in the solid-state drive is received, the first type of page data, that is, the valid page data about to be evicted from the cache and written to the solid-state drive, is relocated in a single step directly to the predetermined relocation location in the solid-state drive. It is not first evicted to the solid-state drive and then relocated a second time. This overcomes the low data processing efficiency that the related art suffers because of secondary relocation of data, and thereby improves data processing efficiency. It also reduces the number of data relocations performed during data reclamation and cache replacement, along with the extra overhead they cause, which improves the performance of the solid-state drive.
As an optional solution, the second obtaining unit includes:
1) a first obtaining module, configured to obtain the access frequency and the modification flag of the valid page data in the cache;
2) a second obtaining module, configured to obtain the page type of the valid page data according to the access frequency and the modification flag, where the page types of the valid page data include the first type of page data and the second type of page data, and the second type of page data indicates page data that is not evicted from the cache to the solid-state drive;
3) a separation module, configured to separate the valid page data according to its page type to obtain the first type of page data.
Optionally, in this embodiment, the second type of page data may include first page data and second page data, and the second obtaining module obtains the page type of the valid page data as follows: taking page data whose modification flag indicates that it has not been modified as the first page data; taking page data whose modification flag indicates that it has been modified and whose access frequency is greater than or equal to a first predetermined threshold as the second page data; and taking page data whose modification flag indicates that it has been modified and whose access frequency is less than the first predetermined threshold as the first type of page data.
That is, in this embodiment, the FTL in the solid-state drive monitors the free space of the drive in real time. After it detects that free space is insufficient and triggers a reclaim request, the system starts traversing the LRU queue in the cache and marks a page type for each page in the queue.
For example, assuming the predetermined threshold used as the boundary is 10%, the cache layer marks the pages in the last 10% of the LRU queue (the tail) as cold pages and the remaining pages as hot pages. It then traverses the cold pages and the hot pages separately, marks the dirty pages among the cold pages as cold dirty (CD) pages and the dirty pages among the hot pages as hot dirty (HD) pages, and notifies the FTL in the solid-state drive of the marked page types.
That is, once the page types marked in the cache have been obtained, the first type of page data (the cold dirty pages) can be separated out, so that it is relocated only once, directly to the storage location that the valid page data will occupy after data recovery. This avoids the two relocations otherwise incurred by cache replacement followed by data recovery, and so reduces overhead.
As an optional solution, the apparatus further includes:
1) A first determining unit, configured to determine, before the first type of page data is relocated from the cache to the predetermined relocation location in the solid state drive, a data recovery block of the solid state drive according to at least the first type of page data, where the page types of the page data in each data block of the solid state drive include unwritten page data, invalid page data, and valid page data, and the data blocks include the data recovery block;
2) A recovery unit, configured to perform data recovery on the data recovery block.
Optionally, in this embodiment, the first determining unit includes:
(1) A third obtaining module, configured to obtain the data recovery rate of each data block according to the first type of page data in the cache and the page data in each data block of the solid state drive;
(2) A determining module, configured to determine the data recovery block according to the data recovery rate.
Optionally, in this embodiment, the data recovery block is determined by comparing the obtained data recovery rates in at least one of the following ways:
1) The data block with the highest data recovery rate is taken as the data recovery block;
2) If several data blocks share the highest data recovery rate, the numbers of first-type page data in those data blocks are compared, and the data block with the larger number is taken as the data recovery block;
3) If several data blocks share the highest data recovery rate and also contain the same number of first-type page data, the numbers of second-type page data in those data blocks are compared, and the data block with the smaller number is taken as the data recovery block;
4) If several data blocks share the highest data recovery rate, contain the same number of first-type page data, and also contain the same number of second-type page data, the data block with the largest block identifier is taken as the data recovery block.
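One way to read rules 1) to 4) is as a single lexicographic comparison. The sketch below applies all four rules in order, which is an assumption, since the text only requires at least one of them. `BlockStats` and `select_recovery_block` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class BlockStats:
    block_id: int
    recovery_rate: float     # data recovery rate of the block
    first_type_pages: int    # first-type (cold dirty) pages associated with the block
    second_type_pages: int   # second-type (clean and hot dirty) pages in the block


def select_recovery_block(blocks):
    """Pick the data recovery block using rules 1) to 4) as successive tie-breakers."""
    return max(
        blocks,
        key=lambda b: (
            b.recovery_rate,       # rule 1: highest recovery rate
            b.first_type_pages,    # rule 2: then most first-type pages
            -b.second_type_pages,  # rule 3: then fewest second-type pages
            b.block_id,            # rule 4: then largest block identifier
        ),
    )
```

Because block identifiers are unique, the key is a total order and the choice is deterministic.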
It should be noted that, in this embodiment, the mapping table of the cache layer stores not only the page type marked on each piece of page data, but also the corresponding preset location information (such as the block identifier of a data block) at which the page data will be placed once it is evicted to the solid state drive.
For example, assuming the first type of page data is cold dirty pages, the FTL can count the cold dirty pages reported by the cache as first-type page data and the invalid page data in each data block of the solid state drive. Taking the data block as the unit, it obtains the number of invalid pages and the number of cold dirty pages for each data block, and uses these two numbers to calculate the data recovery rate of each data block.
In the embodiment provided by this application, the data recovery rate is used to accurately locate the data recovery block in the solid state drive, which makes data recovery on the solid state drive accurate and efficient and thereby maintains the data processing efficiency of the solid state drive.
As an optional solution, the third obtaining module includes:
1) A processing submodule, configured to repeat the following steps until every data block in the solid state drive has been traversed:
S1: obtain the block identifier of the current data block;
S2: obtain the first type of page data and the invalid page data identified by the block identifier;
S3: obtain the data recovery rate of the current data block as follows:
Figure PCTCN2017074290-appb-000004
where r is the data recovery rate of the current data block, a is the number of pages of invalid page data in the current data block of the solid state drive, b is the number of pages of first-type page data in the cache, P is the page size, and B is the block size.
It should be noted that, in this embodiment, reads and writes on the solid state drive are performed in units of pages; the page size is typically 2 KB and the access latency is generally 15 µs to 200 µs. Erases are performed in units of blocks; the block size is typically 128 KB, and erasing one block costs about 2 ms.
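The formula itself survives only as a figure reference, so the sketch below assumes the form r = (a + b) × P / B, which is the natural reading of the symbol definitions but is not confirmed by the text. It uses the typical sizes quoted above (2 KB pages, 128 KB blocks, hence 64 pages per block). The function names, and the idea of passing one target block identifier per cold dirty page, are hypothetical.

```python
from collections import Counter

PAGE_SIZE_KB = 2     # typical page size quoted in the text
BLOCK_SIZE_KB = 128  # typical block size quoted in the text


def recovery_rate(invalid_pages, cold_dirty_pages,
                  page_size=PAGE_SIZE_KB, block_size=BLOCK_SIZE_KB):
    """Assumed formula: r = (a + b) * P / B."""
    return (invalid_pages + cold_dirty_pages) * page_size / block_size


def recovery_rates(invalid_by_block, cold_dirty_block_ids):
    """Steps S1 to S3 for every block.

    invalid_by_block: dict block_id -> number of invalid pages (a).
    cold_dirty_block_ids: one block identifier per cold dirty page in the
        cache, taken from the cache mapping table (counted to give b).
    """
    cold_by_block = Counter(cold_dirty_block_ids)
    return {
        block_id: recovery_rate(invalid, cold_by_block[block_id])
        for block_id, invalid in invalid_by_block.items()
    }
```

Under this assumed formula, a block with 20 invalid pages and 12 associated cold dirty pages has r = 32 × 2 / 128 = 0.5, meaning half of the block can be freed without copying.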
In the embodiment provided by this application, the data recovery rate of each data block in the solid state drive is calculated in turn in the above manner, which ensures that the data recovery block is determined accurately and in turn makes data recovery on the solid state drive accurate and efficient.
As an optional solution, the recovery unit includes:
1) A relocation module, configured to relocate the valid page data in the data recovery block to the predetermined relocation location and to mark that valid page data as invalid page data;
2) An erase module, configured to erase the invalid page data in the data recovery block.
For example, taking cold dirty pages as the first type of page data and clean pages and hot dirty pages as the second type of page data, during data recovery the clean pages and hot dirty pages in the data recovery block can be copied directly to the predetermined relocation location, and the location information for that page data in the FTL is updated. The corresponding valid page data in the data recovery block is then marked as invalid page data (invalid pages).
Further, the cold dirty pages in the data recovery block are also marked as invalid page data, and the latest data of each such cold dirty page in the cache is copied to the predetermined relocation location. The latest data of the cold dirty page is then deleted from the cache, and the location information for that page data in the FTL is updated.
Finally, the page data in the data recovery block is erased and the block is marked as "erased", which completes data recovery on the solid state drive and frees space.
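The recovery procedure described in this example can be modelled with plain dictionaries. This is a simplified in-memory sketch rather than firmware: `flash`, `cache`, `page_types`, and `ftl` are hypothetical stand-ins, and per-page invalid marking is collapsed into the final erase.

```python
def recover_block(block_id, flash, cache, page_types, ftl, target_id):
    """Relocate the valid pages of one data recovery block, then erase it.

    flash:      dict block_id -> {logical_page: data}, valid pages only
    cache:      dict logical_page -> latest data held in the cache
    page_types: dict logical_page -> "clean", "HD" or "CD" as marked by the cache
    ftl:        dict logical_page -> block_id (location information)
    target_id:  block chosen as the predetermined relocation location
    Returns the number of pages written to the relocation location.
    """
    victim = flash[block_id]
    target = flash.setdefault(target_id, {})
    written = 0
    for page, data in list(victim.items()):
        if page_types.get(page) == "CD":
            # Cold dirty: write the cached (latest) copy once and evict it.
            # A page marked "CD" is assumed to be present in the cache.
            target[page] = cache.pop(page)
        else:
            # Clean or hot dirty: copy the on-flash data as it is.
            target[page] = data
        ftl[page] = target_id  # update the location information
        written += 1
    victim.clear()  # erase the data recovery block
    return written
```

Each cold dirty page is written exactly once, straight from the cache to the relocation location, which is the saving the embodiment aims for.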
In the embodiment provided by this application, performing data recovery on the solid state drive in the above manner ensures that both the first type and the second type of page data among the valid page data are relocated to the predetermined relocation location in a single move. This avoids a second relocation of the first type of page data and reduces the overhead of the solid state drive.
As an optional solution, the apparatus further includes:
1) A second determining unit, configured to determine, before the first type of page data is relocated from the cache to the predetermined relocation location in the solid state drive, the predetermined relocation location according to the size of the unwritten page data in the data blocks of the solid state drive other than the data recovery block.
For example, the size of the valid page data in the data recovery block is counted, the FTL (which is updated in real time) is consulted to find unwritten pages in the other data blocks of the solid state drive that can hold valid page data of that size, and the data block found is taken as the predetermined relocation location for the valid page data of the data recovery block.
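A minimal sketch of this lookup is shown below. The text only requires a block with enough unwritten pages; choosing the smallest block that still fits (best fit) is an added assumption, and `choose_relocation_block` is a hypothetical name.

```python
def choose_relocation_block(unwritten_pages, recovery_block_id, pages_needed):
    """Pick a block, other than the data recovery block, with enough unwritten pages.

    unwritten_pages: dict block_id -> number of unwritten pages, as read from the FTL.
    Returns the chosen block_id, or None if no single block is large enough.
    """
    candidates = [
        block_id
        for block_id, free in unwritten_pages.items()
        if block_id != recovery_block_id and free >= pages_needed
    ]
    if not candidates:
        return None
    # Assumption: best fit, with ties broken by the smaller block identifier.
    return min(candidates, key=lambda block_id: (unwritten_pages[block_id], block_id))
```

Returning `None` leaves it to the caller to decide what to do when no single block can take all the valid pages, a case the text does not discuss.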
In the embodiment provided by this application, determining the predetermined relocation location according to the size of the unwritten page data in the data blocks other than the data recovery block ensures that all the valid page data in the data recovery block can be relocated.
It should be noted that each of the above modules may be implemented in software or hardware. In the latter case, this may be done in, but is not limited to, the following ways: the modules are all located in the same processor; or the modules are located in multiple processors.
Embodiment 3
An embodiment of the present invention further provides a storage medium. Optionally, in this embodiment, the storage medium may be configured to store program code for performing the following steps:
S1: obtain a recovery request, where the recovery request is used to request data recovery on page data in a solid state drive;
S2: in response to the recovery request, obtain the first type of page data from the valid page data in the cache, where the first type of page data indicates page data that is about to be evicted from the cache and stored to the solid state drive;
S3: relocate the first type of page data from the cache to a predetermined relocation location in the solid state drive, where the predetermined relocation location is the storage location of the valid page data after the data recovery is performed.
Optionally, in this embodiment, the storage medium may include, but is not limited to, a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disc, or any other medium that can store program code.
Optionally, for specific examples in this embodiment, refer to the examples described in the foregoing embodiments and optional implementations; they are not repeated here.
Obviously, those skilled in the art should understand that the modules or steps of the present invention described above may be implemented with general-purpose computing devices. They may be concentrated on a single computing device or distributed over a network of multiple computing devices. Optionally, they may be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by a computing device. In some cases, the steps shown or described may be performed in an order different from the one given here, or the modules or steps may be fabricated as individual integrated circuit modules, or several of them may be fabricated as a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above are only preferred embodiments of the present invention and are not intended to limit it. For those skilled in the art, the present invention may have various modifications and changes. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Industrial Applicability
In the embodiments of the present invention, when a recovery request for performing data recovery on page data in a solid state drive is obtained, the first type of page data among the valid page data, that is, the page data about to be evicted from the cache and stored to the solid state drive, is relocated in a single move directly to the predetermined relocation location in the solid state drive, without first being evicted to the solid state drive and then relocated again. This overcomes the low data processing efficiency caused in the related art by relocating data twice, and thereby improves data processing efficiency. It also reduces the number of data relocations performed by the solid state drive during data recovery and cache replacement, along with the extra overhead they cause, and improves the performance of the solid state drive.

Claims (16)

  1. A data processing method, comprising:
    obtaining a recovery request, wherein the recovery request is used to request data recovery on page data in a solid state drive;
    in response to the recovery request, obtaining a first type of page data from valid page data in a cache, wherein the first type of page data indicates page data that is about to be evicted from the cache and stored to the solid state drive;
    relocating the first type of page data from the cache to a predetermined relocation location in the solid state drive, wherein the predetermined relocation location is a storage location of the valid page data after the data recovery is performed.
  2. The method according to claim 1, wherein obtaining the first type of page data from the valid page data in the cache in response to the recovery request comprises:
    obtaining an access frequency and a modification identifier of the valid page data in the cache;
    obtaining a page type of the valid page data according to the access frequency and the modification identifier, wherein the page type of the valid page data comprises the first type of page data and a second type of page data, and the second type of page data indicates page data that is not evicted from the cache and stored to the solid state drive;
    separating the valid page data according to the page type of the valid page data, to obtain the first type of page data.
  3. The method according to claim 2, wherein the second type of page data comprises first page data and second page data, and obtaining the page type of the valid page data according to the access frequency and the modification identifier comprises:
    taking page data whose modification identifier indicates that it has not been modified as the first page data, taking page data whose modification identifier indicates that it has been modified and whose access frequency is greater than or equal to a first predetermined threshold as the second page data, and taking page data whose modification identifier indicates that it has been modified and whose access frequency is less than the first predetermined threshold as the first type of page data.
  4. The method according to claim 2, wherein, before relocating the first type of page data from the cache to the predetermined relocation location in the solid state drive, the method further comprises:
    determining a data recovery block of the solid state drive according to at least the first type of page data, wherein page types of page data in each data block of the solid state drive comprise unwritten page data, invalid page data, and the valid page data, and the data blocks comprise the data recovery block;
    performing the data recovery on the data recovery block.
  5. The method according to claim 4, wherein determining the data recovery block of the solid state drive according to at least the first type of page data comprises:
    obtaining a data recovery rate of each data block according to the first type of page data in the cache and the page data in each data block of the solid state drive;
    determining the data recovery block according to the data recovery rate.
  6. The method according to claim 5, wherein obtaining the data recovery rate of each data block according to the first type of page data in the cache and the page data in each data block of the solid state drive comprises:
    repeating the following steps until every data block in the solid state drive has been traversed:
    obtaining a block identifier of a current data block;
    obtaining the first type of page data and the invalid page data identified by the block identifier;
    obtaining the data recovery rate of the current data block as follows:
    Figure PCTCN2017074290-appb-100001
    wherein r is the data recovery rate of the current data block, a is the number of pages of the invalid page data in the current data block of the solid state drive, b is the number of pages of the first type of page data in the cache, P is the page size, and B is the block size.
  7. The method according to claim 4, wherein performing the data recovery on the data recovery block comprises:
    relocating the valid page data in the data recovery block to the predetermined relocation location, and marking the valid page data as the invalid page data;
    erasing the invalid page data in the data recovery block.
  8. The method according to claim 7, wherein, before relocating the first type of page data from the cache to the predetermined relocation location in the solid state drive, the method further comprises:
    determining the predetermined relocation location according to the size of the unwritten page data in the data blocks of the solid state drive other than the data recovery block.
  9. A data processing apparatus, comprising:
    a first obtaining unit, configured to obtain a recovery request, wherein the recovery request is used to request data recovery on page data in a solid state drive;
    a second obtaining unit, configured to obtain, in response to the recovery request, a first type of page data from valid page data in a cache, wherein the first type of page data indicates page data that is about to be evicted from the cache and stored to the solid state drive;
    a relocation unit, configured to relocate the first type of page data from the cache to a predetermined relocation location in the solid state drive, wherein the predetermined relocation location is a storage location of the valid page data after the data recovery is performed.
  10. The apparatus according to claim 9, wherein the second obtaining unit comprises:
    a first obtaining module, configured to obtain an access frequency and a modification identifier of the valid page data in the cache;
    a second obtaining module, configured to obtain a page type of the valid page data according to the access frequency and the modification identifier, wherein the page type of the valid page data comprises the first type of page data and a second type of page data, and the second type of page data indicates page data that is not evicted from the cache and stored to the solid state drive;
    a separation module, configured to separate the valid page data according to the page type of the valid page data, to obtain the first type of page data.
  11. The apparatus according to claim 10, wherein the second type of page data comprises first page data and second page data, and the second obtaining module obtains the page type of the valid page data by:
    taking page data whose modification identifier indicates that it has not been modified as the first page data, taking page data whose modification identifier indicates that it has been modified and whose access frequency is greater than or equal to a first predetermined threshold as the second page data, and taking page data whose modification identifier indicates that it has been modified and whose access frequency is less than the first predetermined threshold as the first type of page data.
  12. The apparatus according to claim 10, further comprising:
    a first determining unit, configured to determine, before the first type of page data is relocated from the cache to the predetermined relocation location in the solid state drive, a data recovery block of the solid state drive according to at least the first type of page data, wherein page types of page data in each data block of the solid state drive comprise unwritten page data, invalid page data, and the valid page data, and the data blocks comprise the data recovery block;
    a recovery unit, configured to perform the data recovery on the data recovery block.
  13. The apparatus according to claim 12, wherein the first determining unit comprises:
    a third obtaining module, configured to obtain a data recovery rate of each data block according to the first type of page data in the cache and the page data in each data block of the solid state drive;
    a determining module, configured to determine the data recovery block according to the data recovery rate.
  14. The apparatus according to claim 13, wherein the third obtaining module comprises:
    a processing submodule, configured to repeat the following steps until every data block in the solid state drive has been traversed:
    obtaining a block identifier of a current data block;
    obtaining the first type of page data and the invalid page data identified by the block identifier;
    obtaining the data recovery rate of the current data block as follows:
    Figure PCTCN2017074290-appb-100002
    wherein r is the data recovery rate of the current data block, a is the number of pages of the invalid page data in the current data block of the solid state drive, b is the number of pages of the first type of page data in the cache, P is the page size, and B is the block size.
  15. The apparatus according to claim 12, wherein the recovery unit comprises:
    a relocation module, configured to relocate the valid page data in the data recovery block to the predetermined relocation location, and to mark the valid page data as the invalid page data;
    an erase module, configured to erase the invalid page data in the data recovery block.
  16. The apparatus according to claim 15, further comprising:
    a second determining unit, configured to determine, before the first type of page data is relocated from the cache to the predetermined relocation location in the solid state drive, the predetermined relocation location according to the size of the unwritten page data in the data blocks of the solid state drive other than the data recovery block.
PCT/CN2017/074290 2016-02-25 2017-02-21 Data processing method and apparatus WO2017143972A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610103929.6A CN107122124B (en) 2016-02-25 2016-02-25 Data processing method and device
CN201610103929.6 2016-02-25

Publications (1)

Publication Number Publication Date
WO2017143972A1 true WO2017143972A1 (en) 2017-08-31

Family

ID=59684803

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/074290 WO2017143972A1 (en) 2016-02-25 2017-02-21 Data processing method and apparatus

Country Status (2)

Country Link
CN (1) CN107122124B (en)
WO (1) WO2017143972A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109710541A (en) * 2018-12-06 2019-05-03 天津津航计算技术研究所 For the optimization method of NAND Flash main control chip Greedy garbage reclamation
CN109739776A (en) * 2018-12-06 2019-05-10 天津津航计算技术研究所 Greedy garbage retrieving system for NAND Flash main control chip

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112905129B (en) * 2021-05-06 2021-08-13 蚂蚁金服(杭州)网络技术有限公司 Method and device for eliminating cache memory block and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102279809A (en) * 2011-08-10 2011-12-14 郏惠忠 Method for redirecting write in and garbage recycling in solid hard disk
CN102508788A (en) * 2011-09-28 2012-06-20 成都市华为赛门铁克科技有限公司 SSD (solid state drive) and SSD garbage collection method and device
CN104424103A (en) * 2013-08-21 2015-03-18 光宝科技股份有限公司 Management method for cache in solid state storage device
US20160041903A1 (en) * 2009-12-11 2016-02-11 Nimble Storage, Inc. Garbage collection based on temperature

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8219776B2 (en) * 2009-09-23 2012-07-10 Lsi Corporation Logical-to-physical address translation for solid state disks
US20120159098A1 (en) * 2010-12-17 2012-06-21 Microsoft Corporation Garbage collection and hotspots relief for a data deduplication chunk store
CN102841850B (en) * 2012-06-19 2016-04-20 记忆科技(深圳)有限公司 Reduce the method and system that solid state disk write is amplified
CN103136121B (en) * 2013-03-25 2014-04-16 中国人民解放军国防科学技术大学 Cache management method for solid-state disc
CN103455435A (en) * 2013-08-29 2013-12-18 华为技术有限公司 Data writing method and device

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160041903A1 (en) * 2009-12-11 2016-02-11 Nimble Storage, Inc. Garbage collection based on temperature
CN102279809A (en) * 2011-08-10 2011-12-14 郏惠忠 Method for redirecting write in and garbage recycling in solid hard disk
CN102508788A (en) * 2011-09-28 2012-06-20 成都市华为赛门铁克科技有限公司 SSD (solid state drive) and SSD garbage collection method and device
CN104424103A (en) * 2013-08-21 2015-03-18 光宝科技股份有限公司 Management method for cache in solid state storage device

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109710541A (en) * 2018-12-06 2019-05-03 天津津航计算技术研究所 For the optimization method of NAND Flash main control chip Greedy garbage reclamation
CN109739776A (en) * 2018-12-06 2019-05-10 天津津航计算技术研究所 Greedy garbage retrieving system for NAND Flash main control chip
CN109710541B (en) * 2018-12-06 2023-06-09 天津津航计算技术研究所 Optimization method for Greedy garbage collection of NAND Flash main control chip
CN109739776B (en) * 2018-12-06 2023-06-30 天津津航计算技术研究所 Greedy garbage collection system for NAND Flash main control chip

Also Published As

Publication number Publication date
CN107122124B (en) 2021-06-15
CN107122124A (en) 2017-09-01

Similar Documents

Publication Publication Date Title
US10838859B2 (en) Recency based victim block selection for garbage collection in a solid state device (SSD)
KR100843543B1 (en) System comprising flash memory device and data recovery method thereof
US8838875B2 (en) Systems, methods and computer program products for operating a data processing system in which a file delete command is sent to an external storage device for invalidating data thereon
US10176190B2 (en) Data integrity and loss resistance in high performance and high capacity storage deduplication
US8745310B2 (en) Storage apparatus, computer system, and method for managing storage apparatus
US9690694B2 (en) Apparatus, system, and method for an address translation layer
US20170139825A1 (en) Method of improving garbage collection efficiency of flash-oriented file systems using a journaling approach
US8719501B2 (en) Apparatus, system, and method for caching data on a solid-state storage device
US8417878B2 (en) Selection of units for garbage collection in flash memory
US9779027B2 (en) Apparatus, system and method for managing a level-two cache of a storage appliance
CN109656486B (en) Configuration method of solid state disk, data storage method, solid state disk and storage controller
US10877898B2 (en) Method and system for enhancing flash translation layer mapping flexibility for performance and lifespan improvements
US10025669B2 (en) Maintaining data-set coherency in non-volatile memory across power interruptions
CN107391774B (en) Garbage collection method for a log file system based on data deduplication
US20170060448A1 (en) Systems, solid-state mass storage devices, and methods for host-assisted garbage collection
US9122586B2 (en) Physical-to-logical address map to speed up a recycle operation in a solid state drive
US10114576B2 (en) Storage device metadata synchronization
CN111880723B (en) Data storage device and data processing method
CN110674056B (en) Garbage recovery method and device
CN112596667A (en) High throughput method and system for organizing NAND blocks and placing data for random writing in a solid state drive
US20180189144A1 (en) Apparatus and method for memory storage to protect data-loss after power loss
WO2017143972A1 (en) Data processing method and apparatus
CN115269451B (en) Flash memory garbage collection method, device and readable storage medium
US20140258591A1 (en) Data storage and retrieval in a hybrid drive
JP2007220107A (en) Apparatus and method for managing mapping information of nonvolatile memory

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17755801

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 17755801

Country of ref document: EP

Kind code of ref document: A1