CN111290883B - Simplified replication method based on deduplication - Google Patents

Info

Publication number
CN111290883B
Authority
CN
China
Prior art keywords
logical volume
data
hash
node
merkle tree
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010094567.5A
Other languages
Chinese (zh)
Other versions
CN111290883A (en
Inventor
周耀辉
刘洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Orca Data Technology Xian Co Ltd
Original Assignee
Orca Data Technology Xian Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Orca Data Technology Xian Co Ltd
Priority to CN202010094567.5A
Publication of CN111290883A
Application granted
Publication of CN111290883B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G06F11/1453 Management of the data involved in backup or backup restore using de-duplication of the data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1415 Saving, restoring, recovering or retrying at system level
    • G06F11/1435 Saving, restoring, recovering or retrying at system level using file system or storage system metadata
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G06F16/137 Hash-based
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/174 Redundancy elimination performed by the file system
    • G06F16/1748 De-duplication implemented within the file system, e.g. based on file segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/178 Techniques for file synchronisation in file systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Quality & Reliability (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a simplified replication method based on deduplication, comprising a write operation on a logical volume and a read operation on the logical volume. The write operation on the logical volume comprises the following steps: s1: dividing the address in the logical volume by 4KB to obtain the VBN, namely the logical block number; s2: computing a hash value from the 4KB of data content through a hash function; s3: updating the leaf node corresponding to the VBN in the Merkle Tree; s4: using the hash value of the 4KB data to determine, through the DHT (distributed hash table), the target node in the distributed cluster to which the 4KB data should be sent; s5: flushing the data to a physical disk on that node and recording the PBN (physical block number) of the 4KB data in an Objectrecord, which holds information such as the hash value, reference count and PBN of the object. The invention achieves deduplication and management of the data of a logical volume, as well as efficient replication, snapshot and clone operations on logical volumes in a distributed system.

Description

Simplified replication method based on deduplication
Technical Field
The invention relates to the technical field of distributed storage systems, in particular to a simplified replication method based on deduplication.
Background
With the development of cloud computing, conventional storage products increasingly show their limitations. Distributed storage systems address problems such as horizontal scaling, performance bottlenecks and single points of failure, and greatly improve the reliability, availability and storage efficiency of the system. To use storage space more efficiently, deduplication is applied in distributed storage systems. Wikipedia defines deduplication as a specialized data compression technique that removes redundant data at coarse granularity, usually by file-level or block-level matching; its goal is to strike a balance between performance and deduplication ratio.
At present, deduplication in the industry is generally used in backup systems. In the main approach, a user uploads data from a source end to a target end over a network; the target end uses a one-way function to generate a fingerprint of the uploaded data, compares fingerprints to judge whether the same data already exists, and decides whether to store the data. This approach has limitations and is only suitable for backup systems.
Therefore, a simplified replication method based on deduplication is provided.
Summary of the invention
1. Technical problem to be solved
Aiming at the problems in the prior art, the invention aims to provide a simplified replication method based on deduplication, which realizes deletion and management of repeated data of a logical volume and efficient operation of replication, snapshot and cloning of the logical volume in a distributed system.
2. Technical scheme
In order to solve the above problems, the present invention adopts the following technical solutions.
A simplified replication method based on deduplication comprises a write operation on a logical volume and a read operation on the logical volume;
the write operation on the logical volume comprises the following steps:
s1: dividing the address in the logical volume by 4KB to obtain the VBN, namely the logical block number;
s2: computing a hash value from the 4KB of data content through a hash function;
s3: updating the leaf node corresponding to the VBN in the Merkle Tree with this hash value;
s4: using the hash value of the 4KB data to determine, through the DHT (distributed hash table), the target node in the distributed cluster to which the 4KB data should be sent;
s5: flushing the data to a physical disk on that node and recording the PBN (physical block number) of the 4KB data in an Objectrecord, which holds information such as the hash value, reference count and PBN of the object;
the read operation on the logical volume comprises the following steps:
s1: dividing the address in the logical volume by 4KB to obtain the VBN (logical block number);
s2: finding the hash value in the leaf node of the Merkle Tree through the VBN;
s3: determining, through the DHT (distributed hash table), the target node in the cluster that holds the data block content corresponding to the hash value;
s4: reading the data from the corresponding physical disk through the PBN in the Objectrecord.
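The write and read steps above can be sketched in a few lines of Python. This is only an illustrative in-memory model, not the patented implementation: the class names `StorageNode` and `Volume`, the use of SHA-1 as the hash function, and the modulo-based stand-in for the DHT are all assumptions made for the example. Reference-count decrements on overwrite are omitted.

```python
import hashlib
from dataclasses import dataclass

BLOCK_SIZE = 4096  # 4KB object size, as in the text


@dataclass
class ObjectRecord:
    """Per-object metadata: hash value, reference count and PBN."""
    hash_value: bytes
    ref_count: int
    pbn: int


class StorageNode:
    """Hypothetical cluster node; a Python list plays the physical disk."""

    def __init__(self):
        self.disk = []     # list index stands in for the PBN
        self.records = {}  # hash value -> ObjectRecord

    def put(self, digest, data):
        rec = self.records.get(digest)
        if rec is None:
            self.disk.append(data)  # flush the block to "disk"
            rec = ObjectRecord(digest, 0, len(self.disk) - 1)
            self.records[digest] = rec
        rec.ref_count += 1          # duplicates only bump the count

    def get(self, digest):
        return self.disk[self.records[digest].pbn]


class Volume:
    """Logical volume; a dict stands in for the Merkle Tree leaves."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.leaves = {}  # VBN -> hash value

    def _target(self, digest):
        # Assumed stand-in for the DHT: derive the node from the hash.
        return self.nodes[int.from_bytes(digest, "big") % len(self.nodes)]

    def write(self, address, data):
        vbn = address // BLOCK_SIZE              # s1
        digest = hashlib.sha1(data).digest()     # s2 (SHA-1 is assumed)
        self.leaves[vbn] = digest                # s3
        self._target(digest).put(digest, data)   # s4 and s5

    def read(self, address):
        vbn = address // BLOCK_SIZE              # s1
        digest = self.leaves[vbn]                # s2
        return self._target(digest).get(digest)  # s3 and s4
```

Writing the same 4KB content at two different addresses places only one block on the toy disk, while both addresses remain readable.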
Further, in the distributed storage system, the logical volume is divided into 4KB units, each of which is an object. A hash function computes a 20-byte hash value from the content of each object; if several objects in the logical volume have the same hash value, they have the same content and can be judged to be duplicate data.
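As a minimal sketch of this duplicate check, the following Python splits a volume image into 4KB objects and groups logical block numbers by hash value. SHA-1 is assumed here only because it produces a 20-byte digest, which matches the 4096/20 node-capacity arithmetic; the source does not name the hash function. The helper names are hypothetical, and hash collisions are ignored.

```python
import hashlib

BLOCK_SIZE = 4096  # 4KB object size


def split_objects(volume_bytes):
    """Cut the volume image into consecutive 4KB objects."""
    return [volume_bytes[i:i + BLOCK_SIZE]
            for i in range(0, len(volume_bytes), BLOCK_SIZE)]


def find_duplicates(volume_bytes):
    """Return {hash value: [VBNs]} for every hash seen more than once."""
    seen = {}
    for vbn, obj in enumerate(split_objects(volume_bytes)):
        digest = hashlib.sha1(obj).digest()  # 20-byte hash value (assumed SHA-1)
        seen.setdefault(digest, []).append(vbn)
    return {h: vbns for h, vbns in seen.items() if len(vbns) > 1}
```

For a volume laid out as blocks A, B, A, C, B, the result groups VBNs 0 and 2 together and VBNs 1 and 4 together.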
Further, the logical volume manages its objects in a Merkle Tree.
Further, the Merkle Tree is a binary Tree composed of a group of leaf nodes, a group of intermediate nodes and a root node. Each node stores the hash values of its child nodes, and each node is 4KB in size, so one node can store ⌊4096/20⌋ = 204 hash values.
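The node-capacity arithmetic can be worked through for a concrete volume size. The sketch below assumes 20-byte hash values and 4KB nodes as stated, and counts how many tree nodes each layer would need; the function name `nodes_per_level` and the 1 GiB example are illustrative choices, not taken from the source.

```python
NODE_SIZE = 4096  # bytes per tree node
HASH_SIZE = 20    # bytes per hash value

FANOUT = NODE_SIZE // HASH_SIZE  # hash values that fit in one node


def nodes_per_level(volume_size, block_size=4096):
    """Count tree nodes per layer, from the leaf layer up to the root."""
    count = -(-volume_size // block_size)  # number of 4KB objects (ceiling)
    levels = []
    while True:
        count = max(1, -(-count // FANOUT))  # nodes needed to hold the layer below
        levels.append(count)
        if count == 1:
            return levels


if __name__ == "__main__":
    levels = nodes_per_level(1 << 30)  # 1 GiB volume
    print(FANOUT, levels, sum(levels) * NODE_SIZE)
```

Under these assumptions a 1 GiB volume has 262144 objects and needs 1286 leaf nodes, 7 intermediate nodes and 1 root node, so the tree occupies roughly 5 MiB of metadata.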
Furthermore, the leaf nodes store the hash values of the objects of the logical volume. The hash value of each leaf node's content is computed by the hash function and stored in its parent node, so one parent node holds the hash values of 204 leaf nodes; the hash values of 204 parent nodes are likewise stored in a node of the next layer up, and the hash value of the root node is the hash value of the logical volume.
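A bottom-up computation of the volume hash might look like the following sketch. It packs up to 204 child hash values into each node and hashes the packed content to obtain the value stored one layer up. SHA-1 and the function name `merkle_root` are assumptions for illustration; note that a capacity of 204 hashes per node gives a fan-out of 204 in this sketch, even though the text describes the tree as binary.

```python
import hashlib

FANOUT = 4096 // 20  # 204 hash values per 4KB node


def sha1(data):
    return hashlib.sha1(data).digest()


def merkle_root(object_hashes):
    """Hash nodes layer by layer until a single root hash remains."""
    level = list(object_hashes)
    if not level:
        raise ValueError("a volume needs at least one object")
    while True:
        # Each node is the concatenation of up to FANOUT child hashes;
        # the hash of that content is what the parent layer stores.
        level = [sha1(b"".join(level[i:i + FANOUT]))
                 for i in range(0, len(level), FANOUT)]
        if len(level) == 1:
            return level[0]
```

Changing a single object changes its hash, the hash of the leaf node that holds it, and every hash on the path to the root, so two volumes with equal root hashes can be treated as having equal content.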
3. Advantageous effects
Compared with the prior art, the invention has the advantages that:
The invention provides a simplified replication method based on deduplication. In a logical volume, duplicate data yields objects with the same hash value, so the values computed through the DHT and the Objectrecord are also the same, and only one copy of the duplicate data is stored in the bottom layer. Deduplication is thereby achieved in a distributed system.
For replication, snapshot and clone of a logical volume, a replica Merkle Tree, a snapshot Merkle Tree and a clone Merkle Tree of the same size can be constructed based on the Merkle Tree of the logical volume, and the content of each node in the Merkle Tree of the logical volume is then synchronized to the same node of the target Merkle Tree. Because the replicated, snapshot and cloned data has the same content as the logical volume, the hash values of the corresponding objects are the same and the data stored in the bottom layer is the same. All replication, snapshot and clone operations on the logical volume can therefore be completed quickly by migrating the Merkle Tree, without moving the back-end data, which improves efficiency.
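The metadata-only nature of a snapshot or clone can be illustrated with a small sketch. Here a dict of VBN to hash value stands in for the Merkle Tree, and `ObjectStore` is a hypothetical content-addressed store; bumping reference counts when the tree is copied is an assumption of this example, since the text only says that the Objectrecord holds a reference count.

```python
import hashlib


class ObjectStore:
    """Hypothetical content-addressed back end shared by all volumes."""

    def __init__(self):
        self.blocks = {}        # hash value -> 4KB data
        self.ref_count = {}     # hash value -> number of references
        self.bytes_written = 0  # back-end data actually stored

    def put(self, data):
        h = hashlib.sha1(data).digest()  # SHA-1 is assumed
        if h not in self.blocks:
            self.blocks[h] = data
            self.bytes_written += len(data)
        self.ref_count[h] = self.ref_count.get(h, 0) + 1
        return h

    def add_ref(self, h):
        self.ref_count[h] += 1


def snapshot(leaves, store):
    """Copy the tree metadata only; no back-end block is read or written."""
    for h in leaves.values():
        store.add_ref(h)
    return dict(leaves)
```

After `snapshot`, later writes to the source volume replace entries in its own tree, while the snapshot keeps pointing at the original hash values and therefore at the original blocks.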
Drawings
FIG. 1 is a flow chart illustrating a logical volume write operation of the present invention;
FIG. 2 is a flow chart illustrating a logical volume read operation according to the present invention;
FIG. 3 is a schematic diagram of the Merkle Tree of the present invention.
Detailed Description
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention; it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments, and all other embodiments obtained by those skilled in the art without any inventive work are within the scope of the present invention.
Example 1:
Referring to FIG. 1, a simplified replication method based on deduplication comprises a write operation on a logical volume and a read operation on the logical volume;
the write operation on the logical volume comprises the following steps:
s1: dividing the address in the logical volume by 4KB to obtain the VBN, namely the logical block number;
s2: computing a hash value from the 4KB of data content through a hash function;
s3: updating the leaf node corresponding to the VBN in the Merkle Tree with this hash value;
s4: using the hash value of the 4KB data to determine, through the DHT (distributed hash table), the target node in the distributed cluster to which the 4KB data should be sent;
s5: flushing the data to a physical disk on that node and recording the PBN (physical block number) of the 4KB data in an Objectrecord, which holds information such as the hash value, reference count and PBN of the object;
the read operation on the logical volume comprises the following steps:
s1: dividing the address in the logical volume by 4KB to obtain the VBN (logical block number);
s2: finding the hash value in the leaf node of the Merkle Tree through the VBN;
s3: determining, through the DHT (distributed hash table), the target node in the cluster that holds the data block content corresponding to the hash value;
s4: reading the data from the corresponding physical disk through the PBN in the Objectrecord.
In the distributed storage system, the logical volume is divided into 4KB units, each of which is an object. A hash function computes a 20-byte hash value from the content of each object; if several objects in the logical volume have the same hash value, they can be judged to be duplicate data.
The logical volume manages its objects in a Merkle Tree. The Merkle Tree is a binary Tree composed of a group of leaf nodes, a group of intermediate nodes and a root node. Each node stores the hash values of its child nodes, and each node is 4KB in size, so one node can store ⌊4096/20⌋ = 204 hash values.
The leaf nodes store the hash values of the objects of the logical volume. The hash value of each leaf node's content is computed by the hash function and stored in its parent node, so one parent node holds the hash values of 204 leaf nodes; the hash values of 204 parent nodes are likewise stored in a node of the next layer up. Proceeding in the same way layer by layer yields the hash value of the root node, which is the hash value of the logical volume.
According to the technical scheme, in a logical volume, duplicate data yields objects with the same hash value, so the values computed through the DHT and the Objectrecord are also the same, and only one copy of the duplicate data is stored in the bottom layer. Deduplication is thereby achieved in a distributed system.
For replication, snapshot and clone of a logical volume, a replica Merkle Tree, a snapshot Merkle Tree and a clone Merkle Tree of the same size can be constructed based on the Merkle Tree of the logical volume, and the content of each node in the Merkle Tree of the logical volume is then synchronized to the same node of the target Merkle Tree. Because the replicated, snapshot and cloned data has the same content as the logical volume, the hash values of the corresponding objects are the same and the data stored in the bottom layer is the same. All replication, snapshot and clone operations on the logical volume can therefore be completed quickly by migrating the Merkle Tree, without moving the back-end data, which improves efficiency.
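The statement that duplicate objects yield the same DHT result can be illustrated with a consistent-hashing ring. The source does not say which DHT scheme is used, so the ring, the virtual-node count and the class name `HashRing` below are assumptions chosen purely for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Hypothetical consistent-hashing ring mapping hash values to nodes."""

    def __init__(self, node_names, vnodes=64):
        # Each node owns several points on the ring to even out the load.
        self._ring = sorted(
            (self._point(f"{name}#{i}".encode()), name)
            for name in node_names
            for i in range(vnodes)
        )
        self._points = [p for p, _ in self._ring]

    @staticmethod
    def _point(key):
        return int.from_bytes(hashlib.sha1(key).digest(), "big")

    def locate(self, object_hash):
        """Return the node owning the first ring point after the hash."""
        point = int.from_bytes(object_hash, "big")
        idx = bisect.bisect_right(self._points, point) % len(self._points)
        return self._ring[idx][1]
```

Because placement depends only on the object's hash value, two writes of identical 4KB content are routed to the same node, where the existing record can be reused instead of storing a second copy.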
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (1)

1. A method for simplifying copy based on deduplication is characterized by comprising write operation of a logical volume and read operation of the logical volume;
the write operation of the logical volume comprises the following steps:
s1: dividing the address of the logical volume by 4KB to calculate VBN, namely a logical block number;
s2: calculating a hash value according to the data content of 4KB through a hash function;
s3: updating to a leaf node corresponding to VBN in the Merkle Tree;
s4: then, the hash value of the 4KB data is used for calculating that the 4KB data is to be sent to a target node in the distributed cluster through DHT (distributed hash table);
s5: recording PBN (physical block number) of 4KB data falling on the node through Objectrecord, and flushing the data onto a physical disk, wherein the Objectrecord records the hash value, the reference count and the PBN of the object;
the read operation of the logical volume comprises the following steps:
s1: dividing the address of the logical volume by 4KB to calculate VBN, namely a logical block number;
s2: finding a hash value on a leaf node of the Merkle Tree through the VBN;
s3: then, calculating a target node of the data block content corresponding to the hash value in the cluster through DHT (distributed hash table);
s4: reading data from the corresponding physical disk through the PBN of Objectrecord;
in the distributed storage system, the logical volume is divided by taking 4KB as a unit, each 4KB is an object, each object calculates a 20-byte hash value according to the content of the object through a hash function, if the hash values of some objects in the logical volume are the same, the objects can be judged to belong to repeated data, and the values calculated through DHT and objectrecord are the same;
for the replication of the logical volume, constructing a replicated Merkle Tree, a snapshot Merkle Tree and a clone Merkle Tree with the same size on the basis of the Merkle Tree of the logical volume, synchronizing the content of each node in the Merkle Tree of the logical volume to the same node of a target Merkle Tree, wherein the hash values of corresponding objects are the same and the data stored at the bottom layer are the same;
the Merkle Tree is a binary Tree and comprises a group of leaf nodes, a group of intermediate nodes and a root node, wherein each node is used for storing the hash values of the child nodes, and the size of each node is 4KB, so that 4096/20=204 hash values can be stored by one node;
the leaf nodes are the hash values of the objects of the stored logical volume, the hash values calculated by the hash function of the hash values of 204 leaf nodes are stored in the parent nodes of the leaf nodes, the hash values calculated by the hash function of the hash values of 204 parent nodes are stored in the parent nodes of the upper layer of the leaf nodes, and the hash values of the root nodes are the hash values of the logical volume.
CN202010094567.5A 2020-02-16 2020-02-16 Simplified replication method based on deduplication Active CN111290883B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010094567.5A CN111290883B (en) 2020-02-16 2020-02-16 Simplified replication method based on deduplication

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010094567.5A CN111290883B (en) 2020-02-16 2020-02-16 Simplified replication method based on deduplication

Publications (2)

Publication Number Publication Date
CN111290883A (en) 2020-06-16
CN111290883B (en) 2021-03-26

Family

ID=71022370

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010094567.5A Active CN111290883B (en) 2020-02-16 2020-02-16 Simplified replication method based on deduplication

Country Status (1)

Country Link
CN (1) CN111290883B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108664223A (en) * 2018-05-18 2018-10-16 百度在线网络技术(北京)有限公司 A kind of distributed storage method, device, computer equipment and storage medium
CN109522283A (en) * 2018-10-30 2019-03-26 深圳先进技术研究院 A kind of data de-duplication method and system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8706701B1 (en) * 2010-11-18 2014-04-22 Emc Corporation Scalable cloud file system with efficient integrity checks
CN104077423B (en) * 2014-07-23 2017-05-03 山东大学(威海) Consistent hash based structural data storage, inquiry and migration method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
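"Research on a Data Disaster Recovery *** Based on Data Deduplication Technology"; 廖海生; China Master's Theses Full-text Database, Information Science and Technology; 20111215; pp. I138-153 *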

Also Published As

Publication number Publication date
CN111290883A (en) 2020-06-16

Similar Documents

Publication Publication Date Title
US11797510B2 (en) Key-value store and file system integration
US10459632B1 (en) Method and system for automatic replication data verification and recovery
US9043540B2 (en) Systems and methods for tracking block ownership
US10810161B1 (en) System and method for determining physical storage space of a deduplicated storage system
CN106201771B (en) Data-storage system and data read-write method
CN109313538B (en) Inline deduplication
US11861169B2 (en) Layout format for compressed data
US11748208B2 (en) Persistent memory architecture
US11768807B2 (en) Destination namespace and file copying
US11994998B2 (en) Low-overhead atomic writes for persistent memory
CN113535670B (en) Virtual resource mirror image storage system and implementation method thereof
US11822520B2 (en) Freeing pages within persistent memory
US11960448B2 (en) Unified object format for retaining compression and performing additional compression for reduced storage consumption in an object store
US11544007B2 (en) Forwarding operations to bypass persistent memory
US20240128984A1 (en) Additional compression for existing compressed data
CN111290883B (en) Simplified replication method based on deduplication
US11599506B1 (en) Source namespace and file copying
US11593218B1 (en) Source file copying and error handling
US11977521B2 (en) Source file copying
US12001724B2 (en) Forwarding operations to bypass persistent memory
US20230105587A1 (en) Destination file copying
Phyu et al. Efficient data deduplication scheme for scale-out distributed storage
JP2022091062A (en) Information processing device, duplication removing method, and duplication removing program
Phyu et al. Using Efficient Deduplication Method in Large-scale Distributed Storage System

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: A thin replication method based on deduplication

Effective date of registration: 20210928

Granted publication date: 20210326

Pledgee: Xi'an investment and financing Company limited by guarantee

Pledgor: Xi'an Okayun Data Technology Co.,Ltd.

Registration number: Y2021980010139

PC01 Cancellation of the registration of the contract for pledge of patent right

Date of cancellation: 20221009

Granted publication date: 20210326

Pledgee: Xi'an investment and financing Company limited by guarantee

Pledgor: Xi'an Okayun Data Technology Co.,Ltd.

Registration number: Y2021980010139

PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: A compact replication method based on deduplication

Effective date of registration: 20221017

Granted publication date: 20210326

Pledgee: Xi'an investment and financing Company limited by guarantee

Pledgor: Xi'an Okayun Data Technology Co.,Ltd.

Registration number: Y2022610000660

PC01 Cancellation of the registration of the contract for pledge of patent right

Date of cancellation: 20231101

Granted publication date: 20210326

Pledgee: Xi'an investment and financing Company limited by guarantee

Pledgor: Xi'an Okayun Data Technology Co.,Ltd.

Registration number: Y2022610000660