WO2015065312A1 - Method and apparatus of data de-duplication for solid state memory - Google Patents

Method and apparatus of data de-duplication for solid state memory Download PDF

Info

Publication number
WO2015065312A1
WO2015065312A1 (PCT/US2013/067034)
Authority
WO
WIPO (PCT)
Prior art keywords
data
hash value
written
hash
logical
Prior art date
Application number
PCT/US2013/067034
Other languages
French (fr)
Inventor
Akio Nakajima
Original Assignee
Hitachi, Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hitachi, Ltd. filed Critical Hitachi, Ltd.
Priority to PCT/US2013/067034 priority Critical patent/WO2015065312A1/en
Publication of WO2015065312A1 publication Critical patent/WO2015065312A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/0223User address space allocation, e.g. contiguous or non contiguous base addressing
    • G06F12/023Free address space management
    • G06F12/0238Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory
    • G06F12/0246Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory in block erasable memory, e.g. flash memory
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614Improving the reliability of storage systems
    • G06F3/0616Improving the reliability of storage systems in relation to life time, e.g. increasing Mean Time Between Failures [MTBF]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • G06F3/064Management of blocks
    • G06F3/0641De-duplication techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0683Plurality of storage devices
    • G06F3/0688Non-volatile semiconductor memory arrays
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/72Details relating to flash memory management
    • G06F2212/7201Logical to physical mapping or translation of blocks or pages

Definitions

  • FIG. 8 shows an example of a flow diagram 800 illustrating a host read miss operation.
  • the flowchart shows a read operation on a cache miss.
  • the storage program searches the hash table 30 and location table 40 and reads data from the physical media.
  • the storage program receives a read command from the host 2.
  • the storage program allocates cache memory and divides the read command request size into the de-duplication segment size.
  • In step S802, the storage program checks the unmap flag 33 of a segment specified in the read command. If the unmap flag is set to 1 (YES), then the next step is S806. If the unmap flag is set to 0 (NO), then the next step is S803.
  • In step S803, the storage program refers to the hash value 34 of the segment in the hash table 30.
  • In step S804, the storage program obtains the physical address 44 from the location table 40 based on the hash value 34.
  • In step S805, the storage program reads the segment related to the physical address 44 and stores it in the cache memory (and proceeds to step S807).
  • In step S806, the storage program returns the segment related to the specific pattern data and stores it in the cache memory (and proceeds to step S807).
  • In step S807, the storage program processes the next segment(s) of the read command. If a next segment exists (YES), the next step is S802. If all segments have been processed (NO), the next step is S808. In step S808, the storage program returns the read data and a response to the host. This read path is sketched below.
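For illustration only, here is a minimal sketch of the read-miss path of flow 800 in Python. It assumes dict-based tables and 4 kB segments; the function and variable names, and the exact shapes of the table entries, are illustrative assumptions rather than part of the disclosure.

```python
# Sketch of flow 800 (S801-S808). Assumptions: hash_table maps (lun, lba) to
# {"unmap_flag": ..., "hash": ...}; location_table maps a hash to {"pba": ...};
# media maps a PBA to bytes. Names and shapes are illustrative.
PATTERN_DATA = b"\x00" * 4096  # specific format data for unmapped segments

def host_read(lun, lba, n_segments, hash_table, location_table, media):
    segments = []
    for i in range(n_segments):                 # S801: divide into segments
        entry = hash_table[(lun, lba + i)]
        if entry["unmap_flag"] == 1:            # S802 YES -> S806
            segments.append(PATTERN_DATA)       # S806: return pattern data
        else:                                   # S802 NO -> S803
            h = entry["hash"]                   # S803: hash value 34
            pba = location_table[h]["pba"]      # S804: PBA 44 from table 40
            segments.append(media[pba])         # S805: read into cache
    return b"".join(segments)                   # S807 done -> S808: respond
```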
  • FIG. 9 shows an example of a flow diagram 900 illustrating a host write and data de-duplication operation.
  • the storage program receives a write command from the host 2.
  • the storage program allocates cache memory and stores the received write data, and then returns a write response to the host.
  • the storage program divides the write command request size into de-duplication segment size. This example is a write back operation.
  • the flow diagram for a write through operation is similar to that shown in FIG. 9 except for the timing of returning a write response.
  • the storage program calculates a hash value and sets the hash value 34 in the hash table 30.
  • The storage program then searches for the hash value calculated at S902 in the hash value field 42 of the location table 40. If the calculated hash value exists in the location table 40 (YES), then the next step is S906. If the calculated hash value does not exist in the location table 40 (NO), then the next step is S904.
  • In step S904, the storage program finds an entry whose valid bit 41 is set to 0 in the location table 40 to allocate free physical space. The storage program then assigns the physical space, sets the valid bit 41 to 1, and stores the hash value calculated at S902 in the hash value field 42 of the allocated PBA 44 entry. In step S905, the storage program writes the segment related to the PBA 44 from the cache memory and sets the unmap flag to 0 (and proceeds to step S910).
  • In step S906, the storage program checks the unmap flag 33 of the target LBA for storing data of the write command. If the unmap flag is set to 1 (YES), then the next step is S909. If the unmap flag is set to 0 (NO), then the next step is S907. In step S907, the storage program decrements the reference count 43 of the existing hash value entry 42 in the location table 40.
  • In step S908, the storage program inserts the un-referenced hash value into the LRU list 50.
  • In step S909, the storage program changes the hash value from the existing hash value of the existing data to the hash value calculated at S902: the storage program updates the hash value 34 of the LBA entry 32 in the hash table 30 and sets the unmap flag to 0.
  • In step S910, the storage program processes the next segment(s) of the write command. If the next segment exists (YES), then the next step is S902. If all segments have been processed (NO), the write operation ends. This write path is sketched below.
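For illustration only, the write and de-duplication path for one segment can be sketched as follows. The placement of the reference count increment for the new hash is inferred from the FIG. 7b description; the SHA-1 fingerprint, dict-based tables, and all names are illustrative assumptions.

```python
# Sketch of flow 900 (write-back path) for one de-duplication segment.
# Assumptions: dict-based tables as in the read sketch; names are illustrative.
import hashlib

def write_segment(lba, data, hash_table, location_table, lru, media):
    h = hashlib.sha1(data).hexdigest()                 # S902: calculate hash
    old = hash_table.get(lba)
    if old is not None and old["unmap_flag"] == 0 and old["hash"] == h:
        return                                         # same data rewritten: no-op
    if h not in location_table:                        # hash not found -> S904
        pba = len(media)                               # S904: allocate free space
        location_table[h] = {"refs": 0, "pba": pba, "valid": 1}
        media[pba] = data                              # S905: one physical write
    if old is not None and old["unmap_flag"] == 0:     # S906 NO -> S907
        location_table[old["hash"]]["refs"] -= 1       # S907: decrement old hash
        if location_table[old["hash"]]["refs"] == 0:
            lru[old["hash"]] = None                    # S908: into the LRU list
    location_table[h]["refs"] += 1                     # new hash gains a reference
    lru.pop(h, None)                                   # reused data leaves the LRU
    hash_table[lba] = {"hash": h, "unmap_flag": 0}     # S909: update hash, flag 0
```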
  • FIG. 10 shows an example of a flow diagram 1000 illustrating a host unmap operation.
  • the storage program receives an unmap command from the host 2.
  • the storage program divides the unmap command request size into de-duplication segment size.
  • In step S1002, the storage program sets the unmap flag 33 to 1 for the segment of the unmap command in the hash table 30.
  • The storage program processes the next segment(s) of the unmap command. If the next segment exists (YES), then the next step is S1002. If all segments have been processed (NO), then the next step is S1004.
  • In step S1004, the storage program returns an unmap response to the host. This host unmap path is sketched below.
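A short sketch of the host unmap path, for illustration only. Note that the flow as described only sets the unmap flag; reference count bookkeeping for the unmapped segments is not shown here. Names are illustrative.

```python
# Sketch of flow 1000: mark each de-duplication segment de-allocated.
def host_unmap(lun, lba, n_segments, hash_table):
    for i in range(n_segments):                       # S1002/S1003: per segment
        hash_table[(lun, lba + i)]["unmap_flag"] = 1  # de-allocate the segment
    return "unmap response"                           # S1004: respond to host
```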
  • FIG. 11 shows an example of a flow diagram 1100 illustrating an internal unmap process.
  • In step S1101, the storage program checks the used capacity of the physical space. If the used physical capacity exceeds a threshold (YES), then the next step is S1102. If not (NO), the process ends.
  • In step S1102, the storage program refers to the multiple PBA entries 44 in the location table 40 using the hash values 51 at the head of the LRU list 50.
  • In step S1103, the storage program issues an unmap or trim command to the physical media 4 for the candidate PBAs identified at S1102.
  • In step S1104, the storage program deletes the hash value entries from the LRU list 50 and invalidates the corresponding valid bits 41 in the location table 40.
  • the storage system keeps un-referenced data for future reuse.
  • the storage system avoids the re-write operation of the same data. As a result, the number of write or erase operations is reduced. This internal unmap process is sketched below.
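For illustration only, the internal unmap process can be sketched as a loop that frees capacity from the head (oldest entry) of the LRU list. The capacity accounting in segments and the `unmap_media` callback standing in for the unmap/trim command to the physical media 4 are assumptions; names are illustrative.

```python
# Sketch of flow 1100 (S1101-S1104); `used` and `threshold` are in segments.
from collections import OrderedDict

def internal_unmap(location_table, lru: OrderedDict, used: int, threshold: int,
                   unmap_media) -> int:
    while used > threshold and lru:          # S1101: used capacity vs threshold
        h, _ = lru.popitem(last=False)       # head of hash value list 51
        entry = location_table[h]            # S1102: candidate PBA 44
        unmap_media(entry["pba"])            # S1103: unmap/trim to the media
        entry["valid"] = 0                   # S1104: hash entry now invalid
        used -= 1                            # one segment of capacity freed
    return used
```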
  • FIGS. 12a and 12b illustrate an example of a hardware configuration of a computer system in which the method and apparatus of the invention may be applied according to the second embodiment of the invention.
  • the computer system includes a host 2, a storage system 120, and SSD 121.
  • FIG. 12a is similar to FIG. 1 except that the data de-duplication engine is provided in the SSD 121 rather than in the storage system 120, so the memory storing the storage program 21 does not contain the data de-duplication program 22.
  • the data de-duplication capable SSD 121 contains a plurality of drive interfaces which connect to the storage disk interface, a CPU, a data de-duplication engine, an SSD memory 130, a cache memory, a flash interface, and flash memory media, which are connected to each other by a bus interface such as PCI, DDR, SCSI, or a flash interface.
  • a logical unit (LU) 123 is a logical data store which is created from multiple flash media.
  • FIG. 13 shows an example of the SSD memory 130.
  • the content of the SSD memory 130 of FIG. 13 is similar to that of the memory 20 of FIG. 2 except that the thin provisioning program 23 is not in FIG. 13 and a reclamation program 133 is provided in FIG. 13.
  • the SSD memory 130 contains an SSD program 131, a data de-duplication program 22, a reclamation program 133, a hash table 30, a location table 40, and a least recently used (LRU) list 50.
  • the reclamation program 133 performs management of free physical space based on block address.
  • the block address is the minimum unit of the erase operation for flash memory media.
  • the block address size is different from the LBA size, PBA size, or data de- duplication segment size.
  • the reclamation program gathers the existing erased physical segments and un-referenced physical segments so that a whole block can be erased.
  • FIGS. 14a and 14b show an example illustrating an overview of the reclamation process.
  • the SSD program gathers erased physical segments (141 in FIG. 14a) and invalid/un-referenced physical segments for which the reference count is set to zero.
  • the SSD program performs the reclamation process: the SSD program moves an actual data block which is referred to by one or multiple LBAs, and for which the reference count is not zero, to an erased block.
  • the SSD program changes the PBA field of the corresponding hash value in the location table 40 (145 in FIG. 14b).
  • the SSD program then erases the block (146 in FIG. 14b).
  • FIG. 15 shows an example of a flow diagram 1500 illustrating the reclamation process.
  • In step S1501, the SSD program checks the used physical capacity. If the used physical capacity exceeds the threshold (YES), then the next step is S1502. If the used physical capacity does not exceed the threshold (NO), then the process ends.
  • In step S1502, the SSD program refers to a candidate erase physical address from the top of the LRU list 50. The SSD program also gathers erased physical segments.
  • In step S1503, the SSD program checks the erase address boundary for any contiguous space which includes only erased physical segments and/or un-referenced physical segments. If there is no such contiguous space, the SSD program moves the actual data block which is referred to by one or more LBAs to an erased block and updates the corresponding PBA in the location table 40.
  • In step S1504, the SSD program erases the block, since the block now includes only invalid or un-referenced physical segments.
  • In step S1505, the SSD program deletes the hash value entries from the LRU list 50 and invalidates the valid bits 41 corresponding to the hash values in the location table 40. This reclamation process is sketched below.
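For illustration only, here is a sketch of steps S1502 to S1505 for one candidate erase block. Representing a flash block as a list of (hash, data) segments and the pre-erased destination block are simplifying assumptions; names are illustrative.

```python
# Sketch of flow 1500 for one candidate flash block. Assumptions: a block is
# a list of (hash, data) segments; location_table maps a hash to
# {"refs": ..., "pba": ..., "valid": ...}; lru is a dict of un-referenced hashes.
def reclaim_block(candidate_block, erased_block, location_table, lru):
    for h, data in candidate_block:              # S1503: move live data out
        if location_table[h]["refs"] > 0:
            erased_block.append((h, data))       # copy to an erased block
            location_table[h]["pba"] = ("erased", len(erased_block) - 1)
        else:                                    # un-referenced segment
            lru.pop(h, None)                     # S1505: drop from the LRU list
            location_table[h]["valid"] = 0       # invalidate the hash entry
    candidate_block.clear()                      # S1504: erase the whole block
```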
  • I/O devices (e.g., CD and DVD drives, floppy disk drives, hard drives, etc.) may be used, and the instructions used to implement the invention can be stored on computer-readable media independently of one or more computer-readable media on which reside the programs used in the invention.
  • the components of the system can be interconnected by any form or medium of digital data communication, e.g., a communication network.
  • examples of communication networks include local area networks, wide area networks, e.g., the Internet, wireless networks, storage area networks, and the like.
  • the various functions described can be performed in a single unit, or can be spread across a number of components in any number of ways.
  • the methods may be executed by a processor, such as a general purpose computer, based on instructions stored on a computer-readable medium. If desired, the instructions can be stored on the medium in a compressed and/or encrypted format.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A storage computer comprises a storage media; and a controller being operable to manage a plurality of logical areas, each of the logical areas being an area to be read/written data from/to by an external computer. The controller is operable to maintain (i) a hash value of the existing data to be over-written and (ii) the existing data to be over-written even when the controller receives a request from the external computer for writing new data to a logical area storing the existing data to be over-written, so that the controller can use the maintained hash value and the maintained existing data when another new data which is same as the maintained existing data is to be written to a logical area of the plurality of logical areas.

Description

METHOD AND APPARATUS OF DATA DE-DUPLICATION FOR SOLID STATE MEMORY
BACKGROUND OF THE INVENTION
[0001] The present invention relates generally to storage systems and, more particularly, to life management for a solid state memory system which is applied to data de-duplication.
[0002] Currently, write and erase operations of a solid state memory system such as a flash memory should be managed, because the solid state memory has a limited lifetime due to the erase page operation. US2013/0151756, directed to data de-duplication for a solid state memory device, discloses methods and apparatus for identifying placement and/or erasure of data for a flash memory based solid state device that supports de-duplication.
BRIEF SUMMARY OF THE INVENTION
[0003] Exemplary embodiments of the invention provide a way to extend the life duration of a solid state memory system by managing the system to reduce erase page operations. Current solid state memory systems do not implement a reduction of the number of erase or write operations.
[0004] In one embodiment, when a storage system performs a write operation, a storage program calculates a fingerprint of the data for de-duplication, such as a hash value; the storage program then updates the mapping between the hash value and the referencing logical block address and manages a reference count. When the actual data is no longer referred to by any logical block address because its reference count is set to zero, the storage program stores the un-referenced hash value in a reuse list and keeps the un-referenced data as a de-duplication candidate for future host write data. In this way, when the host writes the same data as the un-referenced data, the storage system avoids a re-write operation of the same data. As a result, the number of writes is reduced, the number of flash block erases is reduced, and the lifetime of the flash media is increased.
[0005] In another embodiment, when a solid state disk (SSD) performs a write operation, an SSD program calculates a fingerprint of the data for de-duplication, such as a hash value; the SSD program then updates the mapping between the hash value and the referencing logical block address and manages a reference count. When the SSD program performs reclamation using an erased list and an un-referenced hash value list and moves referenced data to an erased physical area, the SSD program erases the block which stores the un-referenced de-duplication data, and then the SSD program deletes the un-referenced hash value from the reuse list.
[0006] In accordance with an aspect of the present invention, a storage computer comprises a storage media; and a controller being operable to manage a plurality of logical areas, each of the logical areas being an area to be read/written data from/to by an external computer. The controller is operable to maintain (i) a hash value of the existing data to be over-written and (ii) the existing data to be over-written even when the controller receives a request from the external computer for writing new data to a logical area storing the existing data to be over-written, so that the controller can use the maintained hash value and the maintained existing data when another new data which is same as the maintained existing data is to be written to a logical area of the plurality of logical areas.
[0007] In some embodiments, the controller manages hash mapping information between logical block addresses (LBAs) of the logical areas and the hash values of data to be written, and maintains a reference count for each of the hash values representing a number of times the hash value is referenced in the hash mapping information. The controller, upon receiving a command to write said another new data which is the same as the maintained existing data to a logical area of the plurality of logical areas by finding that said another new data has a calculated hash value which is same as the maintained hash value, decrements the reference count of the maintained hash value of the maintained existing data.
[0008] In specific embodiments, the controller is operable to unmap the logical area, which is mapped to a storage area of the storage media storing existing data to be over-written, from the storage area storing data to be overwritten, based on the request from the external computer for writing new data to the logical area storing the existing data to be over-written.
[0009] In some embodiments, the controller updates the hash value from the maintained hash value to the calculated hash value of said another new data in the hash mapping information; and the controller sets an unmap flag to a value indicating that the updated hash value is not unmapped.
[0010] In specific embodiments, the controller manages hash mapping information between logical block addresses (LBAs) of the logical areas and the hash values of data to be written, and maintains a reference count for each of the hash values representing a number of times the hash value is referenced in the hash mapping information. The controller manages location mapping information between hash values of data to be written and physical block addresses (PBAs) of the storage media storing the data to be written. The controller manages a least recently used list containing a list of hash value entries for which the reference counters are set to zero and which are arranged chronologically from a head to a bottom of the least recently used list. The controller is operable to check a used capacity of the storage media to determine whether the used capacity exceeds a preset threshold, and if the used capacity exceeds the preset threshold, then: refer to the PBAs in the location mapping information using the hash value entries starting from the head of the least recently used list so as to identify candidate PBAs; issue an unmap command to the candidate PBAs corresponding to the hash value entries starting from the head of the least recently used list so as to create free capacity of the storage media until the used capacity no longer exceeds the preset threshold; delete each of the hash value entries from the least recently used list for which the unmap command has been issued to the corresponding candidate PBAs; and indicate in the location information that the hash value corresponding to the deleted hash value entry from the least recently used list is invalid.
[0011] In some embodiments, the controller manages hash mapping information between logical block addresses (LBAs) of the logical areas and the hash values of data to be written, and maintains a reference count for each of the hash values representing a number of times the hash value is referenced in the hash mapping information. The controller manages location mapping information between hash values of data to be written and physical block addresses (PBAs) of the storage media storing the data to be written. The controller manages a least recently used list containing a list of hash value entries for which the reference counters are set to zero and which are arranged chronologically from a head to a bottom of the least recently used list. The storage media stores data that is erased by overwriting the data. The storage media each includes a storage media controller being operable to determine whether a used capacity of the storage media exceeds a preset threshold and, if yes, then: refer to a candidate erase physical address corresponding to the hash value entry starting from the head of the least recently used list; check candidate erase physical address boundary of the candidate erase physical address and determine whether there is any contiguous physical space which only includes erased physical segment or un-referenced physical segment or both erased physical segment and unreferenced physical segment; and if there is no contiguous physical space, move an actual data block which is referred to by one or more LBAs and for which the reference count is not zero from the candidate erase physical address to an erased block, change a PBA corresponding to the hash value of the moved data in the location mapping information to the PBA of the erased block, erase a block containing the candidate erase physical address and any un-referenced physical segment, delete the hash value entry of the candidate erase physical address from the least recently used list, and indicate in the location information that the hash value corresponding to the deleted hash value entry from the least recently used list is invalid.
[0012] Another aspect of the invention is directed to a method of managing a plurality of logical areas in a storage computer which includes a storage media, each of the logical areas being an area to be read/written data from/to by an external computer. The method comprises maintaining (i) a hash value of the existing data to be over-written and (ii) the existing data to be over-written even when the storage computer receives a request from the external computer for writing new data to a logical area storing the existing data to be over-written, so that the controller can use the maintained hash value and the maintained existing data when another new data which is same as the maintained existing data is to be written to a logical area of the plurality of logical areas.
[0013] Another aspect of this invention is directed to a computer program for managing a plurality of logical areas in a storage computer which includes a storage media, each of the logical areas being an area to be read/written data from/to by an external computer. The computer program comprises code for maintaining (i) a hash value of the existing data to be over-written and (ii) the existing data to be over-written even when the storage computer receives a request from the external computer for writing new data to a logical area storing the existing data to be over-written, so that the controller can use the maintained hash value and the maintained existing data when another new data which is same as the maintained existing data is to be written to a logical area of the plurality of logical areas.
[0014] Another aspect of this invention is directed to a non-transitory computer-readable storage medium storing a plurality of instructions for controlling a data processor to manage a plurality of logical areas in a storage computer which includes a storage media, each of the logical areas being an area to be read/written data from/to by an external computer. The plurality of instructions comprise instructions that cause the data processor to maintain (i) a hash value of the existing data to be over-written and (ii) the existing data to be over-written even when the storage computer receives a request from the external computer for writing new data to a logical area storing the existing data to be over-written, so that the controller can use the maintained hash value and the maintained existing data when another new data which is same as the maintained existing data is to be written to a logical area of the plurality of logical areas.
[0015] These and other features and advantages of the present invention will become apparent to those of ordinary skill in the art in view of the following detailed description of the specific embodiments.
BRIEF DESCRIPTION OF THE DRAWINGS
[0016] FIG. 1 illustrates an example of a hardware configuration of a computer system in which the method and apparatus of the invention may be applied.
[0017] FIG. 2 shows an example of the storage memory.
[0018] FIG. 3 shows an example of a hash table.
[0019] FIG. 4 shows an example of a location table.
[0020] FIG. 5 shows an example of the least recently used (LRU) list.
[0021] FIGS. 6a and 6b show examples of prior write operations.
[0022] FIGS. 7a and 7b show examples illustrating an overview of the write operation.
[0023] FIG. 8 shows an example of a flow diagram illustrating a host read miss operation.
[0024] FIG. 9 shows an example of a flow diagram illustrating a host write and data de-duplication operation.
[0025] FIG. 10 shows an example of a flow diagram illustrating a host unmap operation.
[0026] FIG. 11 shows an example of a flow diagram illustrating an internal unmap process.
[0027] FIGS. 12a and 12b illustrate an example of a hardware configuration of a computer system in which the method and apparatus of the invention may be applied according to the second embodiment of the invention.
[0028] FIG. 13 shows an example of the SSD memory.
[0029] FIGS. 14a and 14b show an example illustrating an overview of the reclamation process.
[0030] FIG. 15 shows an example of a flow diagram illustrating the reclamation process.
DETAILED DESCRIPTION OF THE INVENTION
[0031] In the following detailed description of the invention, reference is made to the accompanying drawings which form a part of the disclosure, and in which are shown by way of illustration, and not of limitation, exemplary embodiments by which the invention may be practiced. In the drawings, like numerals describe substantially similar components throughout the several views. Further, it should be noted that while the detailed description provides various exemplary embodiments, as described below and as illustrated in the drawings, the present invention is not limited to the embodiments described and illustrated herein, but can extend to other embodiments, as would be known or as would become known to those skilled in the art. Reference in the specification to "one embodiment," "this embodiment," or "these
embodiments" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the invention, and the appearances of these phrases in various places in the specification are not necessarily all referring to the same embodiment. Additionally, in the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, it will be apparent to one of ordinary skill in the art that these specific details may not all be needed to practice the present invention. In other circumstances, well-known structures, materials, circuits, processes and interfaces have not been described in detail, and/or may be illustrated in block diagram form, so as to not unnecessarily obscure the present invention.
[0032] Furthermore, some portions of the detailed description that follow are presented in terms of algorithms and symbolic representations of operations within a computer. These algorithmic descriptions and symbolic representations are the means used by those skilled in the data processing arts to most effectively convey the essence of their innovations to others skilled in the art. An algorithm is a series of defined steps leading to a desired end state or result. In the present invention, the steps carried out require physical manipulations of tangible quantities for achieving a tangible result.
Usually, though not necessarily, these quantities take the form of electrical or magnetic signals or instructions capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, instructions, or the like. It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise, as apparent from the following discussion, it is appreciated that throughout the description, discussions utilizing terms such as
"processing," "computing," "calculating," "determining," "displaying," or the like, can include the actions and processes of a computer system or other information processing device that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system's memories or registers or other information storage, transmission or display devices.
[0033] The present invention also relates to an apparatus for performing the operations herein. This apparatus may be specially
constructed for the required purposes, or it may include one or more general- purpose computers selectively activated or reconfigured by one or more computer programs. Such computer programs may be stored in a computer- readable storage medium including non-transitory medium, such as, but not limited to optical disks, magnetic disks, read-only memories, random access memories, solid state devices and drives, or any other types of media suitable for storing electronic information. The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Various general-purpose systems may be used with programs and modules in accordance with the teachings herein, or it may prove convenient to construct a more specialized apparatus to perform desired method steps. In addition, the present invention is not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the invention as described herein. The instructions of the programming language(s) may be executed by one or more processing devices, e.g., central processing units (CPUs), processors, or controllers.
[0034] Exemplary embodiments of the invention, as will be described in greater detail below, provide apparatuses, methods and computer programs for life management of a solid state memory system which is applied to data de-duplication.
[0035] Embodiment 1
[0036] FIG. 1 illustrates an example of a hardware configuration of a computer system in which the method and apparatus of the invention may be applied according to the first embodiment of the invention. The computer system includes a storage system 1 and a host computer 2. The physical storage system 1 contains a host interface which connects to the host 2, a CPU, a data de-duplication engine, a storage memory 20, a cache memory, a disk interface, and disk drives 4, which are connected to each other by a bus interface such as PCI, DDR, SCSI, or the like. A logical unit (LU) 3 is a logical data store which is created by multiple disk drives 4. For example, the disk drive 4 is a solid state disk (SSD) that contains flash memory media. In alternative embodiments, the disk drive 4 may be a hard disk drive (HDD).
[0037] FIG. 2 shows an example of the storage memory 20. The storage memory 20 contains a storage program 21, a data de-duplication program 22, a thin provisioning program 23, a hash table 30, a location table 40, and a least recently used (LRU) list 50. The storage program 21 performs host read and write operations, management of LUs, management of cache memory operations, and so on. The storage program 21 contains the data de-duplication program 22 and the thin provisioning program 23. The data de-duplication program 22 manages the mapping information between LBA (Logical Block Addressing) and hash value and between hash value and physical block address (PBA), and calculates a hash value based on input data. The thin provisioning program 23 allocates an LBA when the storage system receives new write data from the host. The thin provisioning program also de-allocates an LBA which is deleted by the host using an unmap/trim command or which is written with pattern data such as format data. The hash table 30 contains the mapping information between LBA and hash value to provision a data segment and locate the actual data segment. The location table 40 contains the mapping information between hash value and Physical Block Address (PBA) to manage the location of stored actual data segments and free space, and to reuse actual data segments which are already un-referenced. The LRU list 50 is a sorted list of hash values that are un-referenced from any LBA, used to manage candidates for block erase.
[0038] FIG. 3 shows an example of a hash table 30. The hash table 30 contains a Logical Unit Number (LUN) field 31, a Logical Block Address (LBA) field 32, an unmap flag 33, and a hash value field 34. The LUN field 31 identifies the internal Logical Unit 3 in the storage system 1. The LBA field 32 identifies the address within the LUN. The data segment size is a fixed size defined by the data de-duplication size; for example, the data segment size is 4kB. The unmap flag 33 is an allocation or de-allocation physical block flag for thin provisioning. When the unmap flag 33 is set to the value 1, the hash value field 34 is invalid and the LBA segment does not have an allocated actual data segment. If the host 2 reads an LBA which has the unmap flag 33 set to the value 1, then the storage system returns read data with specific format data such as all zeros. The hash value field 34 is a fingerprint of the actual data segment. The storage program calculates a hash value based on the actual data segment in order to identify identical data segments.
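For illustration only, a minimal sketch of this hash table in Python follows, assuming an in-memory dict keyed by (LUN, LBA). The field numbers follow FIG. 3; the function and variable names are illustrative rather than taken from the disclosure.

```python
# Sketch of the hash table of FIG. 3. Assumption: a dict keyed by
# (LUN 31, LBA 32); entry fields mirror the unmap flag 33 and hash value 34.
SEGMENT_SIZE = 4 * 1024  # fixed de-duplication segment size (4 kB example)

hash_table: dict = {}  # (lun, lba) -> {"unmap_flag": 0 or 1, "hash": fingerprint}

def lookup_segment(lun: int, lba: int):
    """Return the hash of a mapped segment, or None for an unmapped one.

    An unmapped segment (unmap flag 33 == 1) has no allocated data segment;
    the read path returns specific pattern data such as all zeros instead.
    """
    entry = hash_table.get((lun, lba))
    if entry is None or entry["unmap_flag"] == 1:
        return None
    return entry["hash"]
```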
[0039] FIG. 4 shows an example of a location table 40. The location table 40 contains a valid bit field 41, a hash value field 42, a reference count field 43, and a Physical Block Address (PBA) field 44. The valid bit field 41 is a validation or invalidation flag for the entry. When the valid bit 41 is set to 1, the hash value field 42 and reference count field 43 are valid. When the valid bit 41 is set to 0, the hash value field 42 and reference count field 43 are invalid and the PBA segment is unmapped. When the storage program issues an unmap or trim command to the SSD, the segment is unmapped and the valid bit 41 is set to 0. When the storage program allocates a physical data segment, the segment is anchored and the valid bit 41 is set to 1. The hash value field 42 has the same value as the hash value field 34 of the hash table 30. The reference count field 43 counts the references to the entry by the hash table 30. When 10 LBA segments refer to one hash value, i.e., the 10 LBA segments hold the same duplicated data, the reference count 43 is set to 10. When no LBA segment refers to the hash value, the reference count 43 is set to 0. When the host writes the same data as the un-referenced data at a PBA, the storage program changes the hash table and the reference count is incremented. The PBA field 44 identifies the location on the SSD. The PBA field 44 contains the disk identifier and the LBA of the disk.
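The reference count bookkeeping of the location table can be sketched the same way; the dict shape and names are again assumptions for illustration.

```python
# Sketch of the location table of FIG. 4. Assumption: a dict keyed by hash
# value; entry fields mirror valid bit 41, reference count 43, and PBA 44.
location_table: dict = {}  # hash -> {"valid": ..., "refs": ..., "pba": ...}

def allocate(hash_value: str, pba) -> None:
    """Anchor a physical data segment: valid bit 41 set to 1, one reference."""
    location_table[hash_value] = {"valid": 1, "refs": 1, "pba": pba}

def add_reference(hash_value: str) -> None:
    """Another LBA segment refers to the same hash: increment field 43."""
    location_table[hash_value]["refs"] += 1

def drop_reference(hash_value: str) -> bool:
    """An LBA segment no longer refers to this hash.

    Returns True when the count reaches 0, i.e. the segment is un-referenced
    and becomes a reuse candidate (it is inserted into the LRU list, FIG. 5).
    """
    entry = location_table[hash_value]
    entry["refs"] -= 1
    return entry["refs"] == 0
```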
[0040] FIG. 5 shows an example of the least recently used (LRU) list 50. The LRU list 50 contains a hash value list 51. The hash value list 51 contains stored hash values 42 for which the reference counter 43 is set to 0, so that un-referenced data can be reused. When the storage program detects insufficient write space, the storage program issues an unmap or trim command to the SSD, choosing the head of the hash value list 51 as the candidate physical segment for the unmap command. When the host writes the same data as the un-referenced data at a PBA, the storage program changes the hash table, the reference count is incremented, and the corresponding hash value is deleted from the LRU list 50. When no LBA segment refers to a hash value any longer and its reference count 43 is set to 0, the storage program inserts that hash value at the bottom of the LRU list 50.
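A minimal sketch of this LRU list behavior follows, assuming an OrderedDict used as an ordered set (head = oldest un-referenced hash, bottom = most recently un-referenced); names are illustrative.

```python
# Sketch of the LRU list of FIG. 5. Assumption: OrderedDict as an ordered set.
from collections import OrderedDict

lru_list: "OrderedDict[str, None]" = OrderedDict()

def insert_unreferenced(hash_value: str) -> None:
    """Reference count 43 just reached 0: insert at the bottom of the list."""
    lru_list[hash_value] = None

def remove_reused(hash_value: str) -> None:
    """The host wrote the same data again: the hash is referenced once more."""
    lru_list.pop(hash_value, None)

def head_unmap_candidate() -> str:
    """On insufficient write space, the head is the unmap/trim candidate."""
    return next(iter(lru_list))
```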
[0041] FIGS. 6a and 6b show examples of prior art write operations. FIG. 6a shows three mapping layers: the logical block address space 61, the hash value space 62, and the physical block address space 63. The logical block address space 61 is the address space of LU 3. The hash value space 62 holds the fingerprint keys of the actual physical data locations for data de-duplication. The physical block address space 63 is the SSD physical location space. The hash table 30 (FIG. 3) is used for mapping between the logical block address space 61 and the hash value space 62. The location table 40 (FIG. 4) is used for mapping between the hash value space 62 and the physical block address space 63.
[0042] As seen in FIG. 6a, an LBA segment stores Data C 601, and Data C 601 points to hash value 602. The actual data segment of hash value 602 is stored in PBA segment 603. The PBA segment 604 is erased, so the storage program may write to PBA segment 604 as free space.
[0043] As seen in FIG. 6b, when the host writes Data D 610 to LBA 601, the existing Data C 601 is overwritten by Data D 610. The storage program calculates a hash value 611 corresponding to Data D 610. When the hash value 611 of Data D 610 is a new hash value, the storage program searches for free space using the valid bit 41 of the location table 40 and allocates the erased data segment 604 of FIG. 6a. The storage program writes the actual data segment of Data D 612 to the allocated segment 604 of FIG. 6a. The Data C segment 603 now corresponds to the un-referenced hash value 602 of FIG. 6a, so the storage program performs an unmap command and Data C is erased from segment 613. When the host later writes the same data as Data C 603, the storage program must write to physical address space 613 again, so the number of write and erase operations increases.
[0044] FIGS. 7a and 7b show examples illustrating an overview of the write operation. In FIG. 7a, when the host writes Data D 610 to LBA 701, the existing Data C 601 (of FIG. 6a) is overwritten by Data D 610. The storage program calculates the hash value 611 corresponding to Data D 610. The storage program writes the actual data segment of Data D 612. The Data C segment 603 now corresponds to the un-referenced hash value 602, so the storage program decrements the reference count 43 of the location table 40, and the reference count becomes 0 for that un-referenced hash value 602. Then the storage program inserts the corresponding hash value 602 into the LRU list 50. Therefore, the thin provisioning program does not perform an unmap or erase operation for Data C 603.
[0045] In FIG. 7b, when the host writes Data C 740 to LBA 741, the existing Data B 742 (of FIG. 7a) is overwritten by Data C 740. The storage program calculates the hash value corresponding to Data C 740. The hash value of Data C 740 matches the existing hash value 602 in the location table 40. Hash value 602 is an un-referenced hash value, since its reference count 43 is set to zero. The storage program therefore modifies the LBA 741 entry in the hash table 30 to hold hash value 602, and the unmap flag 33 is set to 0. The storage program increments the reference counter of hash value 602, so that the reference counter is set to 1 (702 in FIG. 7b). The Data B segment 751 now corresponds to the un-referenced hash value 752, so the storage program decrements the reference count 43 of the location table 40, and the reference count becomes 0 for that un-referenced hash value 752. Then the storage program inserts the corresponding hash value 752 into the LRU list 50. Thus, the storage program does not perform an unmap or erase operation for Data B 751. When the host writes the same data as Data C 603, the storage program does not need to rewrite the physical address space 63, so the number of write and erase operations is reduced.
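To make the FIG. 7 behavior concrete, the following sketch (plain dictionaries stand in for the tables of FIGS. 3-5; all names are hypothetical) overwrites a logical segment without erasing the old data, then revives the un-referenced segment when identical data is written again.

    from collections import OrderedDict

    hash_table = {}           # (LUN, LBA) -> hash value
    location_table = {}       # hash value -> {"refcount": n, "pba": ...}
    lru_list = OrderedDict()  # un-referenced hash values, oldest first

    def overwrite(lun_lba, new_hash):
        # decrement the old hash; park it on the LRU list instead of erasing
        old_hash = hash_table.get(lun_lba)
        if old_hash is not None:
            old = location_table[old_hash]
            old["refcount"] -= 1
            if old["refcount"] == 0:
                lru_list[old_hash] = True  # no unmap or erase is performed
        if new_hash not in location_table:  # truly new data: media write needed
            location_table[new_hash] = {"refcount": 0,
                                        "pba": ("ssd0", len(location_table))}
        location_table[new_hash]["refcount"] += 1
        lru_list.pop(new_hash, None)  # revived data leaves the LRU list
        hash_table[lun_lba] = new_hash

    # FIG. 7a: Data C is overwritten by Data D but kept on the media
    hash_table[(0, 701)] = "h-C"
    location_table["h-C"] = {"refcount": 1, "pba": ("ssd0", 603)}
    overwrite((0, 701), "h-D")
    # FIG. 7b: writing Data C elsewhere reuses its PBA with no physical write
    overwrite((0, 741), "h-C")
    assert location_table["h-C"] == {"refcount": 1, "pba": ("ssd0", 603)}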
[0046] FIG. 8 shows an example of a flow diagram 800 illustrating a host read miss operation. The flowchart shows a read operation on a cache miss. The storage program searches the hash table 30 and the location table 40 and reads data from the physical media. In step S801, the storage program receives a read command from the host 2. The storage program allocates cache memory and divides the read command request size into de-duplication segment sizes. In step S802, the storage program checks the unmap flag 33 of a segment specified in the read command. If the unmap flag is set to 1 (YES), the next step is S806. If the unmap flag is set to 0 (NO), the next step is S803. In step S803, the storage program refers to the hash value 34 from the hash table 30. In step S804, the storage program obtains the physical address 44 from the location table 40 based on the hash value 34. In step S805, the storage program reads the segment at the physical address 44 and stores it in the cache memory (and proceeds to step S807). In step S806, the storage program generates the segment containing the specific pattern data and stores it in the cache memory (and proceeds to step S807). In step S807, the storage program processes the next segment(s) of the read command. If a next segment exists (YES), the next step is S802. If all segments have been processed (NO), the next step is S808. In step S808, the storage program returns the read data and a response to the host.
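A sketch of the per-segment translation of FIG. 8; read_pba is a hypothetical callable standing in for the physical media access.

    SEGMENT_SIZE = 4 * 1024
    ZERO_SEGMENT = b"\x00" * SEGMENT_SIZE  # specific pattern data (all zero)

    hash_table = {}      # (LUN, LBA) -> {"unmap": flag, "hash": value}
    location_table = {}  # hash value -> {"pba": (disk, lba)}

    def read_segment(lun, lba, read_pba):
        # steps S802-S806 for one de-duplication segment on a cache miss
        entry = hash_table[(lun, lba)]
        if entry["unmap"] == 1:                  # S802 YES -> S806
            return ZERO_SEGMENT
        hash_value = entry["hash"]               # S803: hash table 30
        pba = location_table[hash_value]["pba"]  # S804: location table 40
        return read_pba(pba)                     # S805: read physical media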
[0047] FIG. 9 shows an example of a flow diagram 900 illustrating a host write and data de-duplication operation. In step S901, the storage program receives a write command from the host 2. The storage program allocates cache memory, stores the received write data, and then returns a write response to the host. The storage program divides the write command request size into de-duplication segment sizes. This example is a write-back operation. The flow diagram for a write-through operation is similar to that shown in FIG. 9 except for the timing of returning the write response. In step S902, the storage program calculates a hash value and sets the hash value 34 in the hash table 30. In step S903, the storage program searches for the hash value calculated at S902 in the hash value field 42 of the location table 40. If the hash value calculated at S902 exists in the location table 40 (YES), the next step is S906. If it does not exist in the location table 40 (NO), the next step is S904.
[0048] In step S904, the storage program finds an entry whose valid bit 41 is set to 0 in the location table 40 in order to allocate free physical space. The storage program then assigns the physical space, sets the valid bit 41 to 1, and stores the hash value calculated at S902 in the hash value field 42 of the allocated PBA 44 entry. In step S905, the storage program writes the segment at the PBA 44 from the cache memory and sets the unmap flag to 0 (and proceeds to step S910).
[0049] In step S906, the storage program checks the unmap flag 33 of the target LBA for storing the data of the write command. If the unmap flag is set to 1 (YES), the next step is S909. If the unmap flag is set to 0 (NO), the next step is S907. In step S907, the storage program decrements the reference count 43 of the existing hash value entry 42 in the location table 40. If the reference count reaches 0, the storage program inserts the un-referenced hash value into the LRU list 50. In step S908, the storage program changes the hash value from the existing hash value of the existing data to the hash value calculated at S902. In step S909, the storage program updates the hash value 34 of the LBA entry 32 in the hash table 30 and sets the unmap flag to 0. In step S910, the storage program processes the next segment(s) of the write command. If a next segment exists (YES), the next step is S902. If all segments have been processed (NO), the process ends.
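Pulling steps S902-S909 together, a minimal per-segment sketch follows. It consolidates the two branches (the old reference is decremented on both paths), and the hash function, table layouts, and free-space pool are assumptions.

    import hashlib
    from collections import OrderedDict

    hash_table = {}      # (LUN, LBA) -> {"unmap": flag, "hash": value}
    location_table = {}  # hash value -> {"valid": 1, "refcount": n, "pba": ...}
    lru_list = OrderedDict()
    free_pbas = [("ssd0", i) for i in range(1024)]  # hypothetical free pool

    def write_segment(lun, lba, data, write_pba):
        new_hash = hashlib.sha256(data).hexdigest()    # S902
        if new_hash not in location_table:             # S903 NO -> S904
            pba = free_pbas.pop()
            location_table[new_hash] = {"valid": 1, "refcount": 0, "pba": pba}
            write_pba(pba, data)                       # S905: media write
        old = hash_table.get((lun, lba))
        if old is not None and old["unmap"] == 0:      # S906 NO -> S907
            old_entry = location_table[old["hash"]]
            old_entry["refcount"] -= 1
            if old_entry["refcount"] == 0:
                lru_list[old["hash"]] = True           # keep data for reuse
        location_table[new_hash]["refcount"] += 1      # S908
        lru_list.pop(new_hash, None)  # a reused hash leaves the LRU list
        hash_table[(lun, lba)] = {"unmap": 0, "hash": new_hash}  # S909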
[0050] FIG. 10 shows an example of a flow diagram 1000 illustrating a host unmap operation. In step S1001, the storage program receives an unmap command from the host 2. The storage program divides the unmap command request size into de-duplication segment sizes. In step S1002, the storage program sets the unmap flag 33 to 1 for the segment of the unmap command in the hash table 30. In step S1003, the storage program processes the next segment(s) of the unmap command. If a next segment exists (YES), the next step is S1002. If all segments have been processed (NO), the next step is S1004. In step S1004, the storage program returns an unmap response to the host.
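The host unmap path of FIG. 10 touches only metadata; a minimal sketch:

    hash_table = {}  # (LUN, LBA) -> {"unmap": flag, "hash": value}

    def unmap_segment(lun, lba):
        # step S1002: mark the segment de-allocated; no media erase occurs
        # here, physical space is reclaimed later by the internal unmap
        # process of FIG. 11
        hash_table[(lun, lba)] = {"unmap": 1, "hash": None}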
[0051] FIG. 11 shows an example of a flow diagram 1100 illustrating an internal unmap process. In step S1101, the storage program checks the used capacity of the physical space. If the used physical capacity exceeds a threshold (YES), the next step is S1102. If it does not exceed the threshold (NO), the process ends. In step S1102, the storage program refers to the multiple PBA entries 44 in the location table 40 using the hash values 51 from the head of the LRU list 50. In step S1103, the storage program issues an unmap or trim command to the physical media 4 for the candidate PBA list from S1102. In step S1104, the storage program deletes the hash value entries from the LRU list 50 and invalidates the valid bits 41 of the corresponding hash value entries in the location table 40.
[0052] According to specific embodiments of this invention, the storage system keeps un-referenced data for future reuse. When the host writes the same data as the un-referenced data, the storage system avoids re-writing that data. As a result, the number of write and erase operations is reduced.
[0053] Embodiment 2
[0054] FIGS. 12a and 12b illustrate an example of a hardware configuration of a computer system in which the method and apparatus of the invention may be applied according to the second embodiment of the invention. The computer system includes a host 2, a storage system 120, and an SSD 121.

[0055] FIG. 12a is similar to FIG. 1 except that the storage system 120 does not have the data de-duplication engine, and the storage program 21 stored in the memory does not contain the data de-duplication program 22. In FIG. 12b, the data de-duplication capable SSD 121 contains a plurality of drive interfaces which connect to the storage disk interface, a CPU, a data de-duplication engine, an SSD memory 130, a cache memory, a flash interface, and flash memory media, which are connected to each other by a bus interface such as PCI, DDR, SCSI, or a flash interface. A logical unit (LU) 123 is a logical data store created from multiple flash media.
[0056] FIG. 13 shows an example of the SSD memory 130. The content of the SSD memory 130 of FIG. 13 is similar to that of the memory 20 of FIG. 2 except that the thin provisioning program 23 is absent and a reclamation program 133 is provided. As seen in FIG. 13, the SSD memory 130 contains an SSD program 131, a data de-duplication program 22, a reclamation program 133, a hash table 30, a location table 40, and a least recently used (LRU) list 50. The reclamation program 133 manages free physical space based on block addresses. The block address is the minimum unit of the erase operation for flash memory media. The block address size differs from the LBA size, the PBA size, and the data de-duplication segment size. The reclamation program gathers existing fragmented valid data into an erase block.
[0057] FIGS. 14a and 14b show an example illustrating an overview of the reclamation process. In FIG. 14a, when the SSD program performs the reclamation process, the SSD program gathers the erased physical segments (141 in FIG. 14a) and the invalid/un-referenced physical segments for which the reference count is zero (142 in FIG. 14a). The SSD program checks the erase address boundaries for any contiguous space which only includes erased physical segments and/or un-referenced physical segments. If it does not find any such contiguous physical space, the SSD program performs the reclamation process: it moves the actual data blocks which are referred to by one or more LBAs and for which the reference count is not zero (143 in FIG. 14a) to the erased block (141 in FIG. 14a). In FIG. 14b, the SSD program changes the PBA field of the corresponding hash value in the location table 40 (145 in FIG. 14b). The SSD program then erases the block (146 in FIG. 14b), since the block includes only invalid physical segments (146 in FIG. 14b) or un-referenced physical segments (147 in FIG. 14b). The SSD program invalidates the valid bit 41 corresponding to the hash value 142 (148 in FIG. 14b).
[0058] FIG. 15 shows an example of a flow diagram 1500 illustrating the reclamation process. In step S1501, the SSD program checks the capacity of free physical space. If the used capacity of the physical media exceeds a threshold (YES), the next step is S1502. If the used physical capacity does not exceed the threshold (NO), the process ends. In step S1502, the SSD program refers to a candidate erase physical address from the top of the LRU list 50. The SSD program also gathers the erased physical address(es) (valid bit = 0) from the location table 40. In step S1503, the SSD program checks the erase address boundaries for any contiguous space which only includes erased physical segments and/or un-referenced physical segments. If it does not find any such contiguous physical space, the SSD program moves the actual data blocks which are referred to by one or more LBAs and for which the reference count is not zero to the erased block. The SSD program then changes the PBA field of the corresponding hash value in the location table 40. In step S1504, the SSD program erases the block, since the block includes only invalid physical segments or un-referenced physical segments. In step S1505, the SSD program deletes the hash value entries from the LRU list 50 and invalidates the valid bits 41 corresponding to those hash values in the location table.
[0059] Of course, the system configurations illustrated in FIGS. 1 and
12 are purely exemplary of information systems in which the present invention may be implemented, and the invention is not limited to a particular hardware configuration. The computers and storage systems implementing the
invention can also have known I/O devices (e.g., CD and DVD drives, floppy disk drives, hard drives, etc.) which can store and read the modules,
programs and data structures used to implement the above-described
invention. These modules, programs and data structures can be encoded on such computer-readable media. For example, the data structures of the
invention can be stored on computer-readable media independently of one or more computer-readable media on which reside the programs used in the invention. The components of the system can be interconnected by any form or medium of digital data communication, e.g., a communication network. Examples of communication networks include local area networks, wide area networks, e.g., the Internet, wireless networks, storage area networks, and the like.
[0060] In the description, numerous details are set forth for purposes of explanation in order to provide a thorough understanding of the present invention. However, it will be apparent to one skilled in the art that not all of these specific details are required in order to practice the present invention. It is also noted that the invention may be described as a process, which is usually depicted as a flowchart, a flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be re-arranged.
[0061] As is known in the art, the operations described above can be performed by hardware, software, or some combination of software and hardware. Various aspects of embodiments of the invention may be implemented using circuits and logic devices (hardware), while other aspects may be implemented using instructions stored on a machine-readable medium (software), which if executed by a processor, would cause the processor to perform a method to carry out embodiments of the invention.
Furthermore, some embodiments of the invention may be performed solely in hardware, whereas other embodiments may be performed solely in software.
Moreover, the various functions described can be performed in a single unit, or can be spread across a number of components in any number of ways. When performed by software, the methods may be executed by a processor, such as a general purpose computer, based on instructions stored on a computer-readable medium. If desired, the instructions can be stored on the medium in a compressed and/or encrypted format.
[0062] From the foregoing, it will be apparent that the invention provides methods, apparatuses and programs stored on computer readable media for life management of a solid state memory system to which data de-duplication is applied. Additionally, while specific embodiments have been illustrated and described in this specification, those of ordinary skill in the art appreciate that any arrangement that is calculated to achieve the same purpose may be substituted for the specific embodiments disclosed. This disclosure is intended to cover any and all adaptations or variations of the present invention, and it is to be understood that the terms used in the following claims should not be construed to limit the invention to the specific embodiments disclosed in the specification. Rather, the scope of the invention is to be determined entirely by the following claims, which are to be construed in accordance with the established doctrines of claim interpretation, along with the full range of equivalents to which such claims are entitled.

Claims

WHAT IS CLAIMED IS:
1. A storage computer comprising:
a storage media; and
a controller being operable to manage a plurality of logical areas, each of the logical areas being an area to be read/written data from/to by an external computer,
wherein the controller is operable to maintain (i) a hash value of the existing data to be over-written and (ii) the existing data to be over-written even when the controller receives a request from the external computer for writing new data to a logical area storing the existing data to be over-written, so that the controller can use the maintained hash value and the maintained existing data when another new data which is same as the maintained existing data is to be written to a logical area of the plurality of logical areas.
2. The storage computer according to claim 1,
wherein the controller manages hash mapping information between logical block addresses (LBAs) of the logical areas and the hash values of data to be written, and maintains a reference count for each of the hash values representing a number of times the hash value is referenced in the hash mapping information; and
wherein the controller, upon receiving a command to write said another new data which is the same as the maintained existing data to a logical area of the plurality of logical areas by finding that said another new data has a calculated hash value which is same as the maintained hash value, decrements the reference count of the maintained hash value of the maintained existing data.
3. The storage computer according to claim 2,
wherein the controller is operable to unmap the logical area, which is mapped to a storage area of the storage media storing existing data to be over-written, from the storage area storing data to be over-written, based on the request from the external computer for writing new data to the logical area storing the existing data to be over-written;
wherein the controller updates the hash value from the maintained hash value to the calculated hash value of said another new data in the hash mapping information; and
wherein the controller sets an unmap flag to a value indicating that the updated hash value is not unmapped.
4. The storage computer according to claim 1,
wherein the controller is operable to unmap the logical area, which is mapped to a storage area of the storage media storing existing data to be over-written, from the storage area storing data to be over-written, based on the request from the external computer for writing new data to the logical area storing the existing data to be over-written;
wherein the controller manages hash mapping information between logical block addresses (LBAs) of the logical areas and the hash values of data to be written, and maintains a reference count for each of the hash values representing a number of times the hash value is referenced in the hash mapping information;
wherein the controller manages location mapping information between hash values of data to be written and physical block addresses (PBAs) of the storage media storing the data to be written;
wherein the controller manages a least recently used list containing a list of hash value entries for which the reference counters are set to zero and which are arranged chronologically from a head to a bottom of the least recently used list; and
wherein the controller is operable to check a used capacity of the storage media to determine whether the used capacity exceeds a preset threshold, and if the used capacity exceeds the preset threshold, then:
refer to the PBAs in the location mapping information using the hash value entries starting from the head of the least recently used list so as to identify candidate PBAs;
issue an unmap command to the candidate PBAs corresponding to the hash value entries starting from the head of the least recently used list so as to create free capacity of the storage media until the used capacity no longer exceeds the preset threshold;
delete each of the hash value entries from the least recently used list for which the unmap command has been issued to the corresponding candidate PBAs; and
indicate in the location information that the hash value corresponding to the deleted hash value entry from the least recently used list is invalid.
5. The storage computer according to claim 1,
wherein the controller is operable to unmap the logical area, which is mapped to a storage area of the storage media storing existing data to be over-written, from the storage area storing data to be over-written, based on the request from the external computer for writing new data to the logical area storing the existing data to be over-written;
wherein the controller manages hash mapping information between logical block addresses (LBAs) of the logical areas and the hash values of data to be written, and maintains a reference count for each of the hash values representing a number of times the hash value is referenced in the hash mapping information;
wherein the controller manages location mapping information between hash values of data to be written and physical block addresses (PBAs) of the storage media storing the data to be written;
wherein the controller manages a least recently used list containing a list of hash value entries for which the reference counters are set to zero and which are arranged chronologically from a head to a bottom of the least recently used list; and
wherein the storage media stores data that is erased by overwriting the data; and
wherein the storage media each includes a storage media controller being operable to determine whether a used capacity of the storage media exceeds a preset threshold and, if yes, then:
refer to a candidate erase physical address corresponding to the hash value entry starting from the head of the least recently used list; check candidate erase physical address boundary of the candidate erase physical address and determine whether there is any contiguous physical space which only includes erased physical segment or un-referenced physical segment or both erased physical segment and un-referenced physical segment; and
if there is no contiguous physical space, move an actual data block which is referred to by one or more LBAs and for which the reference count is not zero from the candidate erase physical address to an erased block, change a PBA corresponding to the hash value of the moved data in the location mapping information to the PBA of the erased block, erase a block containing the candidate erase physical address and any un-referenced physical segment, delete the hash value entry of the candidate erase physical address from the least recently used list, and indicate in the location information that the hash value corresponding to the deleted hash value entry from the least recently used list is invalid.
6. A method of managing a plurality of logical areas in a storage computer which includes a storage media, each of the logical areas being an area to be read/written data from/to by an external computer, the method comprising: maintaining (i) a hash value of the existing data to be over-written and (ii) the existing data to be over-written even when the storage computer receives a request from the external computer for writing new data to a logical area storing the existing data to be over-written, so that the controller can use the maintained hash value and the maintained existing data when another new data which is same as the maintained existing data is to be written to a logical area of the plurality of logical areas.
7. The method according to claim 6, further comprising:
managing hash mapping information between logical block addresses (LBAs) of the logical areas and the hash values of data to be written;
maintaining a reference count for each of the hash values representing a number of times the hash value is referenced in the hash mapping information; and
upon receiving a command to write said another new data which is the same as the maintained existing data to a logical area of the plurality of logical areas by finding that said another new data has a calculated hash value which is same as the maintained hash value, decrementing the reference count of the maintained hash value of the maintained existing data.
8. The method according to claim 7, further comprising:
unmapping the logical area, which is mapped to a storage area of the storage media storing existing data to be over-written, from the storage area storing data to be over-written, based on the request from the external computer for writing new data to the logical area storing the existing data to be over-written;
updating the hash value from the maintained hash value to the calculated hash value of said another new data in the hash mapping information; and setting an unmap flag to a value indicating that the updated hash value is not unmapped.
9. The method according to claim 6, further comprising:
unmapping the logical area, which is mapped to a storage area of the storage media storing existing data to be over-written, from the storage area storing data to be over-written, based on the request from the external computer for writing new data to the logical area storing the existing data to be over-written;
managing hash mapping information between logical block addresses (LBAs) of the logical areas and the hash values of data to be written;
maintaining a reference count for each of the hash values representing a number of times the hash value is referenced in the hash mapping information;
managing location mapping information between hash values of data to be written and physical block addresses (PBAs) of the storage media storing the data to be written;
managing a least recently used list containing a list of hash value entries for which the reference counters are set to zero and which are arranged chronologically from a head to a bottom of the least recently used list; and
checking a used capacity of the storage media to determine whether the used capacity exceeds a preset threshold, and if the used capacity exceeds the preset threshold, then: referring to the PBAs in the location mapping information using the hash value entries starting from the head of the least recently used list so as to identify candidate PBAs;
issuing an unmap command to the candidate PBAs corresponding to the hash value entries starting from the head of the least recently used list so as to create free capacity of the storage media until the used capacity no longer exceeds the preset threshold;
deleting each of the hash value entries from the least recently used list for which the unmap command has been issued to the corresponding candidate PBAs; and
indicating in the location information that the hash value corresponding to the deleted hash value entry from the least recently used list is invalid.
10. The method according to claim 6, wherein the storage media stores data that is erased by overwriting the data, the method further comprising: unmapping the logical area, which is mapped to a storage area of the storage media storing existing data to be over-written, from the storage area storing data to be over-written, based on the request from the external computer for writing new data to the logical area storing the existing data to be over-written;
managing hash mapping information between logical block addresses (LBAs) of the logical areas and the hash values of data to be written;
maintaining a reference count for each of the hash values representing a number of times the hash value is referenced in the hash mapping information; managing location mapping information between hash values of data to be written and physical block addresses (PBAs) of the storage media storing the data to be written;
managing a least recently used list containing a list of hash value entries for which the reference counters are set to zero and which are arranged chronologically from a head to a bottom of the least recently used list; and
determining whether a used capacity of the storage media exceeds a preset threshold and, if yes, then:
referring to a candidate erase physical address corresponding to the hash value entry starting from the head of the least recently used list;
checking candidate erase physical address boundary of the candidate erase physical address and determining whether there is any contiguous physical space which only includes erased physical segment or un-referenced physical segment or both erased physical segment and un-referenced physical segment; and
if there is no contiguous physical space, moving an actual data block which is referred to by one or more LBAs and for which the reference count is not zero from the candidate erase physical address to an erased block, changing a PBA corresponding to the hash value of the moved data in the location mapping information to the PBA of the erased block, erasing a block containing the candidate erase physical address and any un-referenced physical segment, deleting the hash value entry of the candidate erase physical address from the least recently used list, and indicating in the location information that the hash value corresponding to the deleted hash value entry from the least recently used list is invalid.
11. A computer program for managing a plurality of logical areas in a storage computer which includes a storage media, each of the logical areas being an area to be read/written data from/to by an external computer, the computer program comprising:
code for maintaining (i) a hash value of the existing data to be overwritten and (ii) the existing data to be over-written even when the storage computer receives a request from the external computer for writing new data to a logical area storing the existing data to be over-written, so that the controller can use the maintained hash value and the maintained existing data when another new data which is same as the maintained existing data is to be written to a logical area of the plurality of logical areas.
12. The computer program according to claim 11, further comprising: code for managing hash mapping information between logical block addresses (LBAs) of the logical areas and the hash values of data to be written;
code for maintaining a reference count for each of the hash values representing a number of times the hash value is referenced in the hash mapping information; and
code for, upon receiving a command to write said another new data which is the same as the maintained existing data to a logical area of the plurality of logical areas by finding that said another new data has a calculated hash value which is same as the maintained hash value, decrementing the reference count of the maintained hash value of the maintained existing data.
13. The computer program according to claim 12, further comprising: code for unmapping the logical area, which is mapped to a storage area of the storage media storing existing data to be over-written, from the storage area storing data to be over-written, based on the request from the external computer for writing new data to the logical area storing the existing data to be over-written; and
code for updating the hash value from the maintained hash value to the calculated hash value of said another new data in the hash mapping information; and
code for setting an unmap flag to a value indicating that the updated hash value is not unmapped.
14. The computer program according to claim 11, further comprising: code for unmapping the logical area, which is mapped to a storage area of the storage media storing existing data to be over-written, from the storage area storing data to be over-written, based on the request from the external computer for writing new data to the logical area storing the existing data to be over-written; and
code for managing hash mapping information between logical block addresses (LBAs) of the logical areas and the hash values of data to be written; code for maintaining a reference count for each of the hash values representing a number of times the hash value is referenced in the hash mapping information;
code for managing location mapping information between hash values of data to be written and physical block addresses (PBAs) of the storage media storing the data to be written;
code for managing a least recently used list containing a list of hash value entries for which the reference counters are set to zero and which are arranged chronologically from a head to a bottom of the least recently used list; and
code for checking a used capacity of the storage media to determine whether the used capacity exceeds a preset threshold, and if the used capacity exceeds the preset threshold, then:
code for referring to the PBAs in the location mapping information using the hash value entries starting from the head of the least recently used list so as to identify candidate PBAs;
code for issuing an unmap command to the candidate PBAs corresponding to the hash value entries starting from the head of the least recently used list so as to create free capacity of the storage media until the used capacity no longer exceeds the preset threshold;
code for deleting each of the hash value entries from the least recently used list for which the unmap command has been issued to the corresponding candidate PBAs; and code for indicating in the location information that the hash value corresponding to the deleted hash value entry from the least recently used list is invalid.
15. The computer program according to claim 11, wherein the storage media stores data that is erased by overwriting the data, the computer program further comprising:
code for unmapping the logical area, which is mapped to a storage area of the storage media storing existing data to be over-written, from the storage area storing data to be over-written, based on the request from the external computer for writing new data to the logical area storing the existing data to be over-written; and
code for managing hash mapping information between logical block addresses (LBAs) of the logical areas and the hash values of data to be written;
code for maintaining a reference count for each of the hash values representing a number of times the hash value is referenced in the hash mapping information;
code for managing location mapping information between hash values of data to be written and physical block addresses (PBAs) of the storage media storing the data to be written;
code for managing a least recently used list containing a list of hash value entries for which the reference counters are set to zero and which are arranged chronologically from a head to a bottom of the least recently used list; and code for determining whether a used capacity of the storage media exceeds a preset threshold and, if yes, then:
code for referring to a candidate erase physical address corresponding to the hash value entry starting from the head of the least recently used list; code for checking candidate erase physical address boundary of the candidate erase physical address and determining whether there is any contiguous physical space which only includes erased physical segment or un-referenced physical segment or both erased physical segment and unreferenced physical segment; and
code for, if there is no contiguous physical space, moving an actual data block which is referred to by one or more LBAs and for which the reference count is not zero from the candidate erase physical address to an erased block, changing a PBA corresponding to the hash value of the moved data in the location mapping information to the PBA of the erased block, erasing a block containing the candidate erase physical address and any unreferenced physical segment, deleting the hash value entry of the candidate erase physical address from the least recently used list, and indicating in the location information that the hash value corresponding to the deleted hash value entry from the least recently used list is invalid.
PCT/US2013/067034 2013-10-28 2013-10-28 Method and apparatus of data de-duplication for solid state memory WO2015065312A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/US2013/067034 WO2015065312A1 (en) 2013-10-28 2013-10-28 Method and apparatus of data de-duplication for solid state memory

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/US2013/067034 WO2015065312A1 (en) 2013-10-28 2013-10-28 Method and apparatus of data de-duplication for solid state memory

Publications (1)

Publication Number Publication Date
WO2015065312A1 true WO2015065312A1 (en) 2015-05-07

Family

ID=53004734

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2013/067034 WO2015065312A1 (en) 2013-10-28 2013-10-28 Method and apparatus of data de-duplication for solid state memory

Country Status (1)

Country Link
WO (1) WO2015065312A1 (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120198139A1 (en) * 2007-09-28 2012-08-02 Hitachi, Ltd. Storage device and deduplication method
US20100223495A1 (en) * 2009-02-27 2010-09-02 Leppard Andrew Minimize damage caused by corruption of de-duplicated data
US20110238635A1 (en) * 2010-03-25 2011-09-29 Quantum Corporation Combining Hash-Based Duplication with Sub-Block Differencing to Deduplicate Data
US20120166401A1 (en) * 2010-12-28 2012-06-28 Microsoft Corporation Using Index Partitioning and Reconciliation for Data Deduplication

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111427855A (en) * 2016-09-28 2020-07-17 华为技术有限公司 Method for deleting repeated data in storage system, storage system and controller
CN111427855B (en) * 2016-09-28 2024-04-12 华为技术有限公司 Method for deleting repeated data in storage system, storage system and controller
US11455279B2 (en) 2018-11-05 2022-09-27 International Business Machines Corporation Distributed data deduplication reference counting
CN113243013A (en) * 2018-12-21 2021-08-10 美光科技公司 Data integrity protection for relocating data in a memory system
CN111435289A (en) * 2019-01-15 2020-07-21 爱思开海力士有限公司 Memory controller with improved mapped data access performance and method of operating the same
CN111435289B (en) * 2019-01-15 2023-07-21 爱思开海力士有限公司 Memory controller with improved mapped data access performance and method of operating the same
CN112148220A (en) * 2020-09-17 2020-12-29 合肥大唐存储科技有限公司 Method and device for realizing data processing, computer storage medium and terminal
CN112148220B (en) * 2020-09-17 2024-03-19 合肥大唐存储科技有限公司 Method, device, computer storage medium and terminal for realizing data processing
CN112527693A (en) * 2020-12-11 2021-03-19 苏州浪潮智能科技有限公司 Wear leveling method, system, equipment and medium for solid state disk
CN112527693B (en) * 2020-12-11 2023-01-06 苏州浪潮智能科技有限公司 Wear leveling method, system, equipment and medium for solid state disk
CN113806071A (en) * 2021-08-10 2021-12-17 中标慧安信息技术股份有限公司 Data synchronization method and system for edge computing application
CN113806071B (en) * 2021-08-10 2022-08-19 中标慧安信息技术股份有限公司 Data synchronization method and system for edge computing application

Similar Documents

Publication Publication Date Title
US11216185B2 (en) Memory system and method of controlling memory system
CN110678836B (en) Persistent memory for key value storage
KR102421075B1 (en) Memory device comprising stream detector and operating method thereof
US9229876B2 (en) Method and system for dynamic compression of address tables in a memory
WO2015065312A1 (en) Method and apparatus of data de-duplication for solid state memory
JP5606938B2 (en) Data storage device, method for accessing recording medium, and recording medium therefor
KR101257691B1 (en) Memory controller and data management method
US20170139825A1 (en) Method of improving garbage collection efficiency of flash-oriented file systems using a journaling approach
US20140082310A1 (en) Method and apparatus of storage tier and cache management
WO2017000658A1 (en) Storage system, storage management device, storage device, hybrid storage device, and storage management method
US9104327B2 (en) Fast translation indicator to reduce secondary address table checks in a memory device
CN108604165B (en) Storage device
US9489239B2 (en) Systems and methods to manage tiered cache data storage
US20150347310A1 (en) Storage Controller and Method for Managing Metadata in a Cache Store
US8635399B2 (en) Reducing a number of close operations on open blocks in a flash memory
US9910798B2 (en) Storage controller cache memory operations that forego region locking
KR20100021868A (en) Buffer cache management method for flash memory device
US20120297140A1 (en) Expandable data cache
KR101017067B1 (en) Locality-Aware Garbage Collection Technique for NAND Flash Memory-Based Storage Systems
CN106326132B (en) Storage system, storage management device, memory, hybrid storage device, and storage management method
KR101155542B1 (en) Method for managing mapping table of ssd device
KR20120039166A (en) Nand flash memory system and method for providing invalidation chance to data pages
KR101379161B1 (en) Using bloom-filter of reverse mapping method and system for enhancing performance of garbage collection in storage systems
WO2015156758A1 (en) Method and apparatus of cache promotion between server and storage system
KR20240111560A (en) Method and System for updating file in a Filesystem

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13896567

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 13896567

Country of ref document: EP

Kind code of ref document: A1