CN102982180A - Method and device for storing data - Google Patents

Publication number
CN102982180A
Authority
CN
China
Prior art keywords
backup data
data piece
fingerprint
stored
file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012105520997A
Other languages
Chinese (zh)
Other versions
CN102982180B (en)
Inventor
付旭东
段雨梅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan and Magnetic Technology Co., Ltd.
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN201210552099.7A
Publication of CN102982180A
Application granted
Publication of CN102982180B
Legal status: Expired - Fee Related

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the invention provide a method and a device for storing data. The method comprises the following steps: matching the fingerprint of each data block of a to-be-stored file against the fingerprints in a fingerprint database to obtain a corresponding backup data block; deduplicating the to-be-stored file according to the backup data block, and setting a status identifier for the backup data block; and reclaiming the backup data block according to its status identifier.

Description

Data storage method and device
Technical field
Embodiments of the invention relate to data processing technology, and in particular to a data storage method and device.
Background
As enterprise data volumes keep growing, large amounts of duplicate data pose a severe challenge to storage. Data deduplication (De-Dupe for short) lowers storage cost by effectively reducing the amount of data, and is therefore receiving more and more attention.
In a data storage task, the to-be-stored file is usually divided into data blocks. Data deduplication automatically searches for duplicate data blocks, keeps only one unique copy of identical blocks, and replaces the other duplicates with pointers to that unique copy, increasing the reference count of the copy by 1 each time. It is a storage technique that eliminates redundant data and so reduces the required storage capacity. When a file that uses the unique copy kept after deduplication is modified or deleted, the reference count of the copy changes. When the reference count drops to 0, the copy satisfies the condition for garbage collection and is reclaimed as garbage, which frees more storage space.
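The reference-counted scheme described above can be sketched as follows. This is a minimal in-memory illustration, not the prior-art system itself: the class name `DedupStore`, its methods, and the use of SHA-1 as the fingerprint are all assumptions made for the example.

```python
import hashlib


class DedupStore:
    """Toy in-memory block store with reference counting (hypothetical)."""

    def __init__(self):
        self.blocks = {}    # fingerprint -> bytes of the unique copy
        self.refcount = {}  # fingerprint -> number of pointers to that copy

    def put(self, block: bytes) -> str:
        """Store a block, keeping one copy per distinct content."""
        fp = hashlib.sha1(block).hexdigest()
        if fp not in self.blocks:
            self.blocks[fp] = block
            self.refcount[fp] = 0
        self.refcount[fp] += 1  # one more pointer to the unique copy
        return fp               # the fingerprint acts as the pointer

    def release(self, fp: str) -> None:
        """Drop one pointer; reclaim the copy when the count reaches 0."""
        self.refcount[fp] -= 1
        if self.refcount[fp] == 0:
            del self.blocks[fp]
            del self.refcount[fp]
```

Storing the same content twice returns the same pointer and keeps a single copy; the copy disappears only after both pointers are released.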
In the prior art, however, when deduplication and reclamation run concurrently, a pointer handed to a duplicate copy may point to data that has just been reclaimed, which causes data loss.
Summary of the invention
Embodiments of the invention provide a data processing method and device that improve the flow for running deduplication and reclamation concurrently.
In a first aspect, an embodiment of the invention provides a data storage method, comprising:
matching the fingerprint of each data block of a to-be-stored file against the fingerprints in a fingerprint database to obtain a corresponding backup data block;
deduplicating the to-be-stored file according to the backup data block, and setting a status identifier for the backup data block;
reclaiming the backup data block according to the status identifier of the backup data block.
In a first possible implementation of the first aspect, matching the fingerprint of each data block of the to-be-stored file against the fingerprints in the fingerprint database to obtain the corresponding backup data block comprises:
dividing the to-be-stored file into data blocks and calculating the fingerprint of each data block;
sampling the fingerprints of the data blocks, and generating a fingerprint sampling table for the to-be-stored file from the sampled fingerprints;
determining, according to the fingerprint sampling table and a group sampling library, the similar group to which the to-be-stored file belongs in the group sampling library, and using the stored data blocks corresponding to the similar group as the backup data block. The group sampling library is obtained by sampling the fingerprint database, and the similar group is the sampling group in the group sampling library that matches the sampled fingerprints in the fingerprint sampling table of the to-be-stored file.
In a second possible implementation of the first aspect, deduplicating the to-be-stored file according to the backup data block and setting a status identifier for the backup data block comprises:
incrementing the group count of the backup data block by one before deduplicating the to-be-stored file according to the backup data block;
decrementing the group count of the backup data block by one after the deduplication of the to-be-stored file according to the backup data block is complete.
In a third possible implementation, based on the second possible implementation of the first aspect, reclaiming the backup data block according to its status identifier comprises:
suspending reclamation of the backup data block when the group count in its status identifier is found to be non-zero;
triggering reclamation of the backup data block when the group count in its status identifier is found to be zero.
In a fourth possible implementation, based on the first aspect or on its first or second possible implementation, reclaiming the backup data block according to its status identifier comprises:
identifying the status identifier of the corresponding backup data block when the reference count of the backup data block is observed to change;
identifying the value of the reference count of the backup data block when the status identifier shows that the backup data block is not in use;
triggering reclamation of the backup data block when the value of its reference count is found to be zero.
In a second aspect, an embodiment of the invention provides a data storage device, comprising:
a backup data block acquisition module, configured to match the fingerprint of each data block of a to-be-stored file against the fingerprints in a fingerprint database to obtain a corresponding backup data block;
a deduplication module, configured to deduplicate the to-be-stored file according to the backup data block and to set a status identifier for the backup data block;
a reclamation module, configured to reclaim the backup data block according to the status identifier of the backup data block.
In a first possible implementation of the second aspect, the backup data block acquisition module comprises:
a fingerprint calculation unit, configured to divide the to-be-stored file into data blocks and to calculate the fingerprint of each data block;
a fingerprint sampling unit, configured to sample the fingerprints of the data blocks and to generate a fingerprint sampling table for the to-be-stored file from the sampled fingerprints;
a group determination unit, configured to determine, according to the fingerprint sampling table and a group sampling library, the similar group to which the to-be-stored file belongs in the group sampling library, and to use the stored data blocks corresponding to the similar group as the backup data block. The group sampling library is obtained by sampling the fingerprint database, and the similar group is the sampling group in the group sampling library that matches the sampled fingerprints in the fingerprint sampling table of the to-be-stored file.
In a second possible implementation of the second aspect, the deduplication module comprises:
a first counting unit, configured to increment the group count of the backup data block by one before the to-be-stored file is deduplicated according to the backup data block;
a second counting unit, configured to decrement the group count of the backup data block by one after the deduplication of the to-be-stored file according to the backup data block is complete.
In a third possible implementation, based on the second possible implementation of the second aspect, the reclamation module comprises:
a reclamation suspension unit, configured to suspend reclamation of the backup data block when the group count in its status identifier is found to be non-zero;
a first reclamation trigger unit, configured to trigger reclamation of the backup data block when the group count in its status identifier is found to be zero.
In a fourth possible implementation, based on the second aspect or on its first or second possible implementation, the reclamation module comprises:
a reference count monitoring unit, configured to identify the status identifier of the corresponding backup data block when the reference count of the backup data block is observed to change;
a reference count identification unit, configured to identify the value of the reference count of the backup data block when the status identifier shows that the backup data block is not in use;
a second reclamation trigger unit, configured to trigger reclamation of the backup data block when the value of its reference count is found to be zero.
Embodiments of the invention provide a data storage method and device. The method matches the fingerprint of each data block of a to-be-stored file against the fingerprints in a fingerprint database to obtain a corresponding backup data block, deduplicates the to-be-stored file according to the backup data block while setting a status identifier for the backup data block, and reclaims the backup data block according to that status identifier. Deduplication is thereby given priority, which solves the data loss caused by running deduplication and reclamation concurrently, and ensures both that deduplication and reclamation proceed in an orderly way and that stored data remains safe.
Description of drawings
To describe the technical solutions of the embodiments of the invention or of the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below show only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of Embodiment 1 of the data storage method of the present invention;
Fig. 2 is a flowchart of Embodiment 2 of the data storage method of the present invention;
Fig. 3 is a flowchart of Embodiment 3 of the data storage method of the present invention;
Fig. 4 is a schematic diagram of Embodiment 1 of the data storage logical architecture of the present invention;
Fig. 5 is a schematic diagram of Embodiment 1 of the data storage cluster architecture of the present invention;
Fig. 6 is a structural diagram of Embodiment 1 of the data storage device of the present invention;
Fig. 7 is a structural diagram of Embodiment 2 of the data storage device of the present invention;
Fig. 8 is a structural diagram of Embodiment 3 of the data storage device of the present invention.
Detailed description
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments that a person of ordinary skill in the art obtains from these embodiments without creative effort fall within the scope of protection of the invention.
Fig. 1 is a flowchart of Embodiment 1 of the data storage method of the present invention. As shown in Fig. 1, this embodiment provides a data storage method that can be performed by any device that carries out data storage operations, and that may specifically include the following steps:
Step 101: Match the fingerprint of each data block of the to-be-stored file against the fingerprints in the fingerprint database to obtain the corresponding backup data block.
In this embodiment the same data storage method is performed for every file that is stored; before it is stored, a file is a to-be-stored file. The fingerprints in the fingerprint database are the fingerprints of the data blocks of files that have already been stored. The fingerprints of the data blocks of the to-be-stored file are matched one by one against the fingerprints of the data blocks of the stored files, and the backup data block to which the to-be-stored file corresponds in the fingerprint database is determined from the similarity between the two sets of fingerprints. Specifically, when the similarity between the fingerprints of the data blocks of the to-be-stored file and the fingerprints of the data blocks of a stored file is greater than or equal to a preset similarity threshold, the data blocks of that stored file are taken as the backup data block corresponding to the data blocks of the to-be-stored file. The similarity may be the number of fingerprints of the to-be-stored file that are identical or similar to fingerprints of the stored file's data blocks, divided by the number of fingerprints of the data blocks of the to-be-stored file.
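The similarity ratio and threshold test of Step 101 might look like the sketch below. It counts only identical fingerprints (the text also allows "similar" ones, which is not modelled here), and the function names, the dictionary layout of the fingerprint database, and the threshold value 0.5 are assumptions for illustration.

```python
def similarity(pending_fps, stored_fps):
    """Share of the pending file's fingerprints that also occur in stored_fps."""
    pending = set(pending_fps)
    if not pending:
        return 0.0
    return len(pending & set(stored_fps)) / len(pending)


def find_backup_blocks(pending_fps, fingerprint_db, threshold=0.5):
    """Return the ids of stored entries whose similarity meets the threshold.

    fingerprint_db is assumed to map an id to the fingerprints stored under it.
    """
    return [
        block_id
        for block_id, stored_fps in fingerprint_db.items()
        if similarity(pending_fps, stored_fps) >= threshold
    ]
```

With four pending fingerprints of which two are already stored under one id, the similarity is 0.5 and that id is returned at the default threshold.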
Step 102: Deduplicate the to-be-stored file according to the backup data block, and set a status identifier for the backup data block.
Once the backup data block has been determined, the to-be-stored file is deduplicated against it. The deduplication itself can be similar to the prior art: the calculated fingerprint of each block of the to-be-stored file is matched against the fingerprints held in the backup data block. If the backup data block already holds a fingerprint identical or similar to that of a data block of the to-be-stored file, the data of that data block is deleted. If the backup data block holds no such fingerprint, the data of that data block is stored.
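A minimal sketch of this per-block decision is shown below, assuming exact fingerprint matches only and SHA-1 fingerprints. The function name `deduplicate`, the `recipe` list of fingerprints, and the returned dictionary of new blocks are hypothetical names introduced for the example.

```python
import hashlib


def deduplicate(blocks, backup_fps):
    """Split a file's blocks into references and data that must be stored.

    blocks     -- list of bytes objects, in file order
    backup_fps -- set of fingerprints already held by the backup data block
    Returns (recipe, new_blocks): the fingerprints needed to rebuild the
    file, and the blocks whose data is not yet stored.
    """
    recipe = []
    new_blocks = {}
    for block in blocks:
        fp = hashlib.sha1(block).hexdigest()
        recipe.append(fp)
        # Duplicate data is dropped; only unseen data is kept for storage.
        if fp not in backup_fps and fp not in new_blocks:
            new_blocks[fp] = block
    return recipe, new_blocks
```

A file made of blocks `a`, `b`, `b` deduplicated against a backup that already holds `a` yields a three-entry recipe and a single new block `b`.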
In this step, while the to-be-stored file is deduplicated according to the backup data block, a status identifier is also set for the backup data block. The status identifier indicates whether the backup data block is currently being used by a deduplication operation.
The status identifier can take several forms. It preferably comprises the group number of the backup data block and the group count of the backup data block. The group count is the number of deduplication operations currently being performed against the backup data block, which covers the case where one backup data block is used by several deduplication operations running in parallel. Accordingly, the status identifier is preferably maintained as follows: before the to-be-stored file is deduplicated according to the backup data block, the group count of the backup data block is incremented by one; after that deduplication is complete, the group count is decremented by one.
A person skilled in the art will understand that a non-zero group count means a deduplication operation is in progress. This embodiment also provides a deduplication list, which contains the status identifier of every backup data block that is needed for deduplication. When the group count of a backup data block is zero, its status identifier can be deleted from the deduplication list.
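One way to keep the increment and decrement paired is a context manager, sketched below. The class `DedupList`, its lock, and the method names are assumptions; the text specifies only the group number, the group count, and removal from the list when the count returns to zero.

```python
import threading
from contextlib import contextmanager


class DedupList:
    """Hypothetical deduplication list: group number -> group count."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counts = {}

    @contextmanager
    def in_use(self, group_no):
        """Mark a backup data block as in use for one deduplication run."""
        with self._lock:
            self._counts[group_no] = self._counts.get(group_no, 0) + 1
        try:
            yield
        finally:
            with self._lock:
                self._counts[group_no] -= 1
                if self._counts[group_no] == 0:
                    # Count is back to zero: drop the entry from the list.
                    del self._counts[group_no]

    def group_count(self, group_no):
        with self._lock:
            return self._counts.get(group_no, 0)
```

Each deduplication run would wrap its work in `with dedup_list.in_use(group_no):`, so the count is decremented even if the run raises an exception.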
Step 103: Reclaim the backup data block according to its status identifier.
Before a backup data block is reclaimed, its status identifier is used to decide whether reclamation may proceed. In this embodiment, the status identifier of the backup data block in the deduplication list can be queried to determine whether the block is in use, and hence whether to reclaim it. When the group count in the status identifier is zero, the backup data block is not being used for deduplication and can therefore be reclaimed. When the group count is non-zero, the backup data block is being used for deduplication, so its reclamation is suspended. This embodiment may also provide a reclamation list, which contains the status identifier of every backup data block that needs to be reclaimed. After a backup data block has been reclaimed, its status identifier is deleted from the reclamation list.
A person skilled in the art will understand that the reclamation list can also be queried when a backup data block is used for deduplication. If the reclamation list contains the status identifier of that backup data block, reclamation of the block can likewise be suspended, so that deduplication takes priority.
A person skilled in the art will also understand that when the group count of a backup data block is zero, deduplication and reclamation can run at the same time without affecting each other, which keeps the two as concurrent as possible.
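The reclamation decision can be sketched as a single function. This version is single-threaded for clarity: in a concurrent system the check and the deletion would have to happen under the same lock that protects the group count. The function name and the plain `dict` and `set` containers are assumptions.

```python
def try_reclaim(block_id, group_counts, reclaim_list, storage):
    """Reclaim block_id unless a deduplication run is still using it.

    group_counts -- dict: block id -> group count (missing means 0)
    reclaim_list -- set of block ids waiting to be reclaimed
    storage      -- dict: block id -> stored data
    Returns True if the block was reclaimed, False if reclamation is suspended.
    """
    if group_counts.get(block_id, 0) != 0:
        # Deduplication has priority: leave the block on the reclaim list.
        return False
    storage.pop(block_id, None)
    reclaim_list.discard(block_id)
    return True
```

A suspended block stays on the reclamation list, so a later call made after its group count returns to zero reclaims it.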
This embodiment of the invention provides a data storage method that matches the fingerprint of each data block of a to-be-stored file against the fingerprints in a fingerprint database to obtain a corresponding backup data block, deduplicates the to-be-stored file according to the backup data block while setting a status identifier for the backup data block, and reclaims the backup data block according to that status identifier. Deduplication is thereby given priority, which solves the data loss caused by running deduplication and reclamation concurrently, and ensures both that deduplication and reclamation proceed in an orderly way and that stored data remains safe.
Fig. 2 is a flowchart of Embodiment 2 of the data storage method of the present invention. As shown in Fig. 2, this embodiment provides a data storage method that can be performed by any device that carries out data storage operations, and that may specifically include the following steps:
Step 201: Divide the to-be-stored file into data blocks and calculate the fingerprint of each data block.
This step first divides the to-be-stored file into blocks. Any prior-art chunking method can be used, for example a variable-length chunking algorithm. The fingerprint of each resulting block is then calculated, again with any prior-art method, for example a double hash that uses the Secure Hash Algorithm (SHA) and Message Digest Algorithm 5 (MD5).
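As a rough sketch of Step 201, the code below cuts blocks where a simple running hash hits a boundary pattern and fingerprints each block with two digests. The running hash is a toy stand-in for a real content-defined chunking algorithm, SHA-1 is assumed as the SHA variant, and the parameters `mask`, `min_size` and `max_size` are illustrative values, none of which come from the text.

```python
import hashlib


def chunk(data: bytes, mask=0x3F, min_size=16, max_size=256):
    """Split data into variable-length blocks at content-defined boundaries."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # Toy running hash; older bytes shift out of the 32-bit window.
        h = ((h << 1) + byte) & 0xFFFFFFFF
        size = i - start + 1
        if (size >= min_size and (h & mask) == 0) or size >= max_size:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # trailing block, may be short
    return chunks


def fingerprint(block: bytes) -> str:
    """Double-hash fingerprint: SHA-1 digest followed by MD5 digest (hex)."""
    return hashlib.sha1(block).hexdigest() + hashlib.md5(block).hexdigest()
```

Concatenating the returned blocks reproduces the input, and no block exceeds `max_size`; combining two digests makes an accidental fingerprint collision less likely than with either digest alone.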
Step 202: Sample the fingerprints of the data blocks, and generate the fingerprint sampling table of the to-be-stored file from the sampled fingerprints.
To reduce the amount of computation during deduplication, the fingerprints of the blocks of the to-be-stored file are sampled once they have been obtained. The basic requirements are that every fingerprint in the sampling result is one of the block fingerprints of the to-be-stored file, and that the sampling result contains no more fingerprints than the file has block fingerprints. The block fingerprints may be sampled in any of the following ways: take the fingerprints whose last byte is 0 as the sampled fingerprints; or take the blocks at fixed positions, for example the blocks at positions that are integer multiples of 9; or sample at a predetermined ratio, for example randomly select 5% of the blocks. The sampling screens the block fingerprints, and the fingerprint sampling table of the to-be-stored file is generated from the sampled fingerprints. A person skilled in the art will understand that the to-be-stored file may contain no block that satisfies the sampling condition of this embodiment, in which case the resulting fingerprint sampling table is empty.
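The three sampling options can be written as small functions. The sketch assumes fingerprints are `bytes` digests held in a list and that positions are counted from zero; the function names and the optional `seed` parameter are introduced only for the example.

```python
import random


def sample_by_last_byte(fps):
    """Keep the fingerprints whose last byte is 0."""
    return [fp for fp in fps if fp[-1] == 0]


def sample_by_position(fps, step=9):
    """Keep the fingerprints at positions that are integer multiples of step."""
    return [fp for i, fp in enumerate(fps) if i % step == 0]


def sample_by_ratio(fps, ratio=0.05, seed=None):
    """Randomly keep the given share of fingerprints, rounded down."""
    k = int(len(fps) * ratio)
    return random.Random(seed).sample(fps, k)
```

Only the first option depends on the fingerprint value rather than on the block's position, so it keeps selecting the same fingerprints when blocks shift within a file. Any of the three can return an empty list, which corresponds to the empty fingerprint sampling table mentioned above.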
Step 203: According to the fingerprint sampling table and the group sampling library, determine the similar group to which the to-be-stored file belongs in the group sampling library, and use the stored data blocks corresponding to the similar group as the backup data block.
The group sampling library is obtained by sampling the fingerprint database, and the similar group is the sampling group in the group sampling library that matches the sampled fingerprints in the fingerprint sampling table of the to-be-stored file.
The fingerprint database holds all fingerprints of the stored files after deduplication. If the to-be-stored file handled in this step is the first file, the fingerprint database is empty. In that case, if the fingerprint sampling table is not empty, a new group is created in the group sampling library, the new group is taken as the similar group of the to-be-stored file, and the fingerprints in the file's fingerprint sampling table are saved into the new group. When neither the fingerprint sampling table nor the fingerprint database is empty, the fingerprints in the fingerprint database are sampled to obtain the group sampling library. The sampling method is similar to the one applied to the data blocks of the to-be-stored file in Step 202 and is not repeated here. A person skilled in the art will understand that the fingerprint database should be sampled with the same method as the data blocks of the to-be-stored file, so that the similar group obtained has a higher similarity.
Each fingerprint in the fingerprint sampling table is matched one by one against each sampling group in the current group sampling library, and the similar group of the to-be-stored file is determined from the matching result. Specifically, when the similarity between the fingerprints in the fingerprint sampling table and the fingerprints of a sampling group in the current group sampling library is greater than or equal to a preset similarity threshold, the to-be-stored file is considered to belong to that sampling group. That sampling group is the similar group, and the stored data blocks corresponding to the fingerprints in it are used as the backup data block. When the similarity with every group in the current group sampling library is below the preset similarity threshold, a new group is created in the group sampling library, the new group is taken as the similar group of the to-be-stored file, and the fingerprints in the file's fingerprint sampling table are saved into the new group.
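The group lookup can be sketched as below. Several details are assumptions rather than statements from the text: integer group numbers, group 0 as the preset default group used for an empty sampling table, choosing the best-scoring group when several meet the threshold, and the threshold value itself.

```python
DEFAULT_GROUP = 0  # hypothetical number for the preset default group


def find_similar_group(sample_table, group_library, threshold=0.5):
    """Return the group number of the similar group for a pending file.

    sample_table  -- sampled fingerprints of the pending file
    group_library -- dict: group number -> set of sampled fingerprints
    A new group is created when no existing group meets the threshold.
    """
    sample = set(sample_table)
    if not sample:
        # Nothing was sampled: fall back to the preset default group.
        return DEFAULT_GROUP
    best_no, best_sim = None, 0.0
    for group_no, group_fps in group_library.items():
        sim = len(sample & group_fps) / len(sample)
        if sim >= threshold and sim > best_sim:
            best_no, best_sim = group_no, sim
    if best_no is None:
        best_no = max(group_library, default=DEFAULT_GROUP) + 1
        group_library[best_no] = set(sample)  # save the sample as a new group
    return best_no
```

Deduplication then only has to consult the fingerprints of the returned group instead of the whole fingerprint database.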
When none of the blocks in Step 202 satisfies the sampling condition, that is, the to-be-stored file contains no block that satisfies the sampling condition, the similar group of the to-be-stored file in the current group sampling library is set to the default group of that library, and the similarity analysis of this embodiment ends. The to-be-stored file is then deduplicated against the fingerprint group in the fingerprint database that corresponds to the default group. The default group is a group preset by this embodiment and has no special meaning. It may be empty, and it corresponds to a specific fingerprint group in the fingerprint database, which holds the fingerprints of those to-be-stored files whose fingerprint sampling tables were empty after sampling. An empty fingerprint sampling table is a special case that can occur in practice; it is described here only so that the overall flow is not interrupted when it occurs.
Further, when the to-be-stored file is deduplicated according to the backup data block, the fingerprints of the similar group can be divided into several intervals, with one database set up per interval to hold the fingerprints of that interval. Queries for duplicate data blocks can then be issued separately: with multiple threads or multiple nodes, the intervals can be queried concurrently, which improves concurrent query capacity and speeds up lookups.
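Interval partitioning might be sketched as follows, with in-memory sets standing in for the per-interval databases and a thread pool standing in for multiple threads or nodes. Partitioning by the leading byte of a hex fingerprint, the interval count of 4, and all function names are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def interval_of(fp, n_intervals):
    """Map a hex fingerprint to an interval by its leading byte."""
    return int(fp[:2], 16) * n_intervals // 256


def build_intervals(fingerprints, n_intervals=4):
    """Distribute fingerprints over n_intervals sets (one per interval)."""
    intervals = [set() for _ in range(n_intervals)]
    for fp in fingerprints:
        intervals[interval_of(fp, n_intervals)].add(fp)
    return intervals


def find_duplicates(query_fps, intervals):
    """Look up query fingerprints, one worker per interval."""
    n = len(intervals)
    buckets = [[] for _ in range(n)]
    for fp in query_fps:
        buckets[interval_of(fp, n)].append(fp)

    def lookup(i):
        return [fp for fp in buckets[i] if fp in intervals[i]]

    with ThreadPoolExecutor(max_workers=n) as pool:
        results = pool.map(lookup, range(n))
    return {fp for part in results for fp in part}
```

Because each worker touches only its own interval, the lookups share no mutable state and need no locking.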
Step 204: Deduplicate the to-be-stored file according to the backup data block, and set a status identifier for the backup data block. This step can be similar to Step 102 above and is not repeated here.
Step 205: Reclaim the backup data block according to its status identifier. This step can be similar to Step 103 above and is not repeated here.
This embodiment of the invention provides a data storage method that divides the to-be-stored file into data blocks, calculates the fingerprint of each data block, samples those fingerprints, generates the fingerprint sampling table of the to-be-stored file from the sampled fingerprints, and determines from the fingerprint sampling table and the group sampling library the similar group to which the file belongs, whose stored data blocks serve as the backup data block. By sampling both the data blocks of the to-be-stored file and the fingerprint database, the embodiment first determines the similar group through similarity analysis and then deduplicates only against the fingerprint group corresponding to that similar group. This narrows the deduplication queries, addresses the heavy computation and resource consumption that deduplicating massive block data incurs in the prior art, and improves deduplication performance.
Fig. 3 is the process flow diagram of date storage method embodiment three of the present invention, and as shown in Figure 3, present embodiment provides a kind of date storage method, and the method can be carried out by the equipment of any executing data storage operation, can specifically comprise the steps:
Step 301: Match the fingerprint of each data block of the file to be stored against the fingerprints in the fingerprint database to obtain the corresponding backup data block.
Step 301 of the Fig. 3 embodiment can be similar to step 101 of the Fig. 1 embodiment, or can use the method of obtaining the corresponding backup data block shown in the Fig. 2 embodiment; it is not described again here.
Step 302: Perform a deduplication operation on the file to be stored according to the backup data block, and set a status identifier for the backup data block.
Step 302 of the Fig. 3 embodiment can be similar to step 102 of the Fig. 1 embodiment; it is not described again here.
Step 303: When a change in the value of the reference count of a backup data block is detected, identify the status identifier of that backup data block.
Within a preset time, the reference counts of the backup data blocks are monitored. When a stored file is modified or deleted, the references to the backup data blocks corresponding to the modified or deleted positions change. When the reference count of a backup data block changes, the status identifier of that block is identified.
Step 304: When the status identifier shows that the backup data block is not in use, identify the value of the block's reference count.
When the status identifier of the backup data block shows that the block is not in use, that is, no deduplication operation is being performed on a file to be stored according to this block, the value of the block's reference count is identified.
Step 305: When the value of the block's reference count is zero, trigger reclamation of the backup data block.
When the reference count of the backup data block is zero, the block is garbage and can be reclaimed. Reference counting is the only garbage collection method that does not use a root set; it distinguishes live objects from objects that are no longer used by means of a reference counter. In general, when an object (here, a backup data block) is discarded or no longer used, its reference counter is decremented by 1; once the counter reaches 0, the block meets the condition for garbage collection. Those skilled in the art will appreciate that steps 303 to 305 of the Fig. 3 embodiment can also be applied to the reclamation of all stored data blocks: when a change in the reference count of a stored data block is detected, a reclamation task is executed, and when the reference count of the data block is 0, the block is reclaimed.
This embodiment of the invention provides a data storage method that monitors changes in the reference counts of backup data blocks. When the status identifier of a backup data block whose reference count has changed shows that the block is not in use, the value of its reference count is identified; when that value is zero, reclamation of the block is triggered. Because this embodiment scans for reclamation only those backup data blocks whose reference count has changed, reclamation is faster and storage space is returned to the user more promptly.
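Steps 303 to 305 can be illustrated with the following sketch. It assumes an in-memory model in which the status identifier is reduced to a single `in_use` flag; the names `BackupBlock`, `Reclaimer`, `update_ref`, and `scan` are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class BackupBlock:
    block_id: str
    ref_count: int = 0    # number of stored files that reference the block
    in_use: bool = False  # simplified status identifier: True while deduplication uses the block


class Reclaimer:
    def __init__(self):
        self.blocks = {}
        self.changed = set()  # ids whose reference count changed since the last scan
        self.reclaimed = []

    def add_block(self, block):
        self.blocks[block.block_id] = block

    def update_ref(self, block_id, delta):
        # Step 303: a reference-count change marks the block for inspection.
        self.blocks[block_id].ref_count += delta
        self.changed.add(block_id)

    def scan(self):
        # Only blocks whose count changed are examined, not the whole store.
        still_pending = set()
        for block_id in sorted(self.changed):
            block = self.blocks[block_id]
            # Step 304: check the status identifier before looking at the count.
            if block.in_use:
                still_pending.add(block_id)  # re-examine on a later scan
                continue
            # Step 305: reclaim when nothing references the block any more.
            if block.ref_count == 0:
                del self.blocks[block_id]
                self.reclaimed.append(block_id)
        self.changed = still_pending
```

In this sketch a block that is skipped because it is in use stays in the pending set, so that a later scan can still reclaim it; the patent text does not say how such blocks are revisited, so that behavior is an assumption.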
Fig. 4 is a schematic diagram of embodiment one of the data storage logical architecture of the present invention. The logical architecture provided by this embodiment can carry out the embodiments of the data storage method described above. As shown in Fig. 4, it comprises a cluster management module 40, a deduplication engine module 41, a metadata server 42, a single-instance repository 43, and a forwarding module 44.
The cluster management module 40 manages the reclaim list and the deduplication list.
The deduplication engine module 41 processes and manages tasks such as deduplication, space reclamation, and reference counting of data blocks. Accordingly, it comprises a task processing module 411, a task management module 412, and a distributor 413. The task processing module 411 comprises a deduplication module 4111, which executes deduplication tasks; a reference count module 4112, which executes reference counting tasks on data blocks; and a space reclamation module 4113, which executes space reclamation tasks. The task management module 412 manages the thread pool, including monitoring the task queue and the running state of the threads in the pool. The distributor 413 maintains the data blocks managed by each metadata server 42.
The metadata server 42 determines the similar group to which the file to be stored belongs in the group sampling database.
The single-instance repository 43 stores each sampling group of the group sampling database, together with the sampling group intervals within each sampling group.
The forwarding module 44 handles data transfer among the cluster management module 40, the deduplication engine module 41, and the metadata server 42.
In a specific embodiment, when executing a deduplication task, the deduplication engine module 41 divides the file to be stored into blocks, calculates the fingerprint of each block, samples the fingerprints, generates the fingerprint sampling table of the file from the sampled fingerprints, and sends a grouping request message to the metadata server 42. The metadata server 42 determines the similar group corresponding to the fingerprint sampling table and the backup data blocks corresponding to that similar group. Before performing deduplication, the deduplication engine module 41 sends the cluster management module 40 a deduplication request message carrying the status identifier of the backup data block. The cluster management module 40 determines whether that status identifier is present in the reclaim list; if it is, the cluster management module 40 makes the deduplication engine module 41 cancel reclamation of the backup data block, and the deduplication task proceeds.
In a specific embodiment, when executing a reclamation task, the deduplication engine module 41 collects the backup data blocks whose reference count has changed and reclaims them. Before reclaiming, it sends the cluster management module 40 a reclamation request message carrying the status identifier of the backup data block. The cluster management module 40 determines whether that status identifier is present in the deduplication list; if it is, the cluster management module 40 makes the deduplication engine module 41 cancel reclamation of the backup data block.
In the data storage logical architecture provided by this embodiment, the deduplication task has higher priority than the space reclamation task during data storage. This solves the problem of data loss caused by concurrent execution of deduplication and reclamation, and ensures that deduplication and reclamation are executed in an orderly manner and that stored data remains safe.
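The arbitration between the deduplication list and the reclaim list can be sketched as below. This is a simplified single-process model: the lists are modeled as sets of block ids guarded by one lock, and the class `ClusterManager` and its method names are hypothetical stand-ins for the request messages described above.

```python
import threading


class ClusterManager:
    """Keeps the deduplication list and the reclaim list (simplified, in-memory)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.dedup_list = set()
        self.reclaim_list = set()

    def request_dedup(self, block_id):
        """Always granted; returns True if a pending reclamation was cancelled."""
        with self._lock:
            cancelled = block_id in self.reclaim_list
            self.reclaim_list.discard(block_id)
            self.dedup_list.add(block_id)
            return cancelled

    def finish_dedup(self, block_id):
        with self._lock:
            self.dedup_list.discard(block_id)

    def request_reclaim(self, block_id):
        """Granted only if no deduplication task currently uses the block."""
        with self._lock:
            if block_id in self.dedup_list:
                return False
            self.reclaim_list.add(block_id)
            return True

    def finish_reclaim(self, block_id):
        with self._lock:
            self.reclaim_list.discard(block_id)
```

Deduplication never waits for reclamation in this model, which mirrors the stated priority order. A plain set cannot track several deduplication tasks that use the same block at the same time; a per-block counter would be needed for that case.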
Fig. 5 is a schematic diagram of embodiment one of the data storage cluster architecture of the present invention. The cluster architecture provided by this embodiment can be implemented by the data storage method embodiments shown in Figs. 1 to 3 and the data storage logical architecture embodiment shown in Fig. 4. As shown in Fig. 5, it comprises a slave node 501, a master node 502, and a standby node 503.
The slave node 501, the master node 502, and the standby node 503 share data storage. Each of the three contains a cluster management module, a deduplication engine module, and a metadata server, and each can carry out the deduplication and space reclamation processes of the data storage method described above. The master node 502 may specifically be a host in a local area network, and the slave node 501 may specifically be an extension machine in the local area network. Those skilled in the art will understand that in practice there can be multiple slave nodes 501. The master node 502 is mainly responsible for issuing commands to the slave node 501 to start deduplication or space reclamation, so that the slave node 501 executes the corresponding task. After the slave node 501 finishes a task, it reports the execution result to the master node 502. When the slave node 501 or the master node 502 fails and cannot work, the standby node 503 takes over from the failed node, ensuring that the data storage process can continue.
In the data storage cluster architecture provided by this embodiment, the slave node, the master node, and the standby node can all execute the data storage method, and the master node can direct multiple slave nodes to store data at the same time, which improves storage efficiency. When a slave node or the master node fails, the standby node takes over from the failed node, which avoids interrupting the data storage process and ensures its continuity.
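The dispatch and takeover behavior described for the cluster can be sketched in a few lines. The sketch models only the case in which a slave node fails while the master is dispatching; node health is a plain flag, and `Node` and `dispatch` are hypothetical names, since the patent text does not describe how failures are detected or how the standby node is activated.

```python
class Node:
    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

    def run(self, task):
        # A failed node cannot execute deduplication or space reclamation.
        if not self.healthy:
            raise RuntimeError(f"{self.name} is down")
        return f"{self.name}:{task}:done"


def dispatch(task, slaves, standby):
    """Master-side dispatch: send the task to each slave, falling back to the standby."""
    results = []
    for node in slaves:
        try:
            results.append(node.run(task))
        except RuntimeError:
            # The standby node takes over the work of the failed slave.
            results.append(standby.run(task))
    return results
```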
Fig. 6 is a structural diagram of embodiment one of the data storage device of the present invention. As shown in Fig. 6, the data storage device provided by this embodiment comprises a backup data block acquisition module 61, a deduplication module 62, and a reclamation module 63. The backup data block acquisition module 61 matches the fingerprint of each data block of the file to be stored against the fingerprints in the fingerprint database to obtain the corresponding backup data block. The deduplication module 62 performs a deduplication operation on the file to be stored according to the backup data block and sets a status identifier for the backup data block. The reclamation module 63 reclaims the backup data block according to its status identifier.
The data storage device of this embodiment can be used to carry out the technical solutions of the method embodiments above; its implementation principle and technical effect are similar and are not described again here.
Fig. 7 is a structural diagram of embodiment two of the data storage device of the present invention. As shown in Fig. 7, in this embodiment, on the basis of the embodiment shown in Fig. 6, the deduplication module 62 comprises a first counting unit 621 and a second counting unit 622.
The first counting unit 621 increments the group count of the backup data block by one before the deduplication operation is performed on the file to be stored according to the backup data block. The second counting unit 622 decrements the group count of the backup data block by one after the deduplication operation on the file to be stored according to the backup data block is finished.
On the basis of the embodiment shown in Fig. 6, the reclamation module 63 comprises a reclamation suspension unit 631 and a first reclamation trigger unit 632.
The reclamation suspension unit 631 suspends reclamation of the backup data block when the group count in the status identifier of the backup data block is identified as non-zero. The first reclamation trigger unit 632 triggers reclamation of the backup data block when the group count in the status identifier of the backup data block is identified as zero.
The data storage device of this embodiment can be used to carry out the technical solutions of the method embodiments above; its implementation principle and technical effect are similar and are not described again here.
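The interplay of the two counting units and the two reclamation units can be illustrated with a small sketch. It assumes the count lives in memory and is protected by a lock; `BlockState`, `deduplicating`, and `try_reclaim` are hypothetical names, and the `reclaim` callback stands in for whatever actually frees the block.

```python
import threading
from contextlib import contextmanager


class BlockState:
    """Status identifier of one backup data block, reduced to its group count."""

    def __init__(self):
        self._lock = threading.Lock()
        self.group_count = 0

    @contextmanager
    def deduplicating(self):
        # First counting unit: add one before deduplication starts.
        with self._lock:
            self.group_count += 1
        try:
            yield
        finally:
            # Second counting unit: subtract one once deduplication is finished.
            with self._lock:
                self.group_count -= 1

    def try_reclaim(self, reclaim):
        # Suspend reclamation while the count is non-zero, trigger it when the count is zero.
        with self._lock:
            if self.group_count != 0:
                return False
            reclaim()
            return True
```

A short usage example, in which the first attempt is suspended and the second one succeeds:

```python
state = BlockState()
with state.deduplicating():
    state.try_reclaim(lambda: None)   # returns False: a deduplication task holds the block
state.try_reclaim(lambda: None)       # returns True: the count is back to zero
```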
Fig. 8 is a structural diagram of embodiment three of the data storage device of the present invention. As shown in Fig. 8, in this embodiment, on the basis of the embodiment shown in Fig. 6, the backup data block acquisition module 61 comprises a fingerprint calculation unit 611, a fingerprint sampling unit 612, and a group determination unit 613.
The fingerprint calculation unit 611 divides the file to be stored into data blocks and calculates the fingerprint of each data block. The fingerprint sampling unit 612 samples the fingerprints of the data blocks and generates the fingerprint sampling table of the file to be stored from the sampled fingerprints. The group determination unit 613 determines, according to the fingerprint sampling table and the group sampling database, the similar group to which the file to be stored belongs in the group sampling database, and takes the stored data blocks corresponding to the similar group as the backup data blocks. The group sampling database is obtained by sampling the fingerprint database, and the similar group is the sampling group in the group sampling database that matches the sampled fingerprints in the fingerprint sampling table of the file to be stored.
On the basis of the embodiment shown in Fig. 6, the reclamation module 63 comprises:
a reference count monitoring unit 633, which identifies the status identifier of a backup data block when a change in the value of that block's reference count is detected; a reference count identification unit 634, which identifies the value of the reference count of the backup data block when the status identifier shows that the block is not in use; and a second reclamation trigger unit 635, which triggers reclamation of the backup data block when the value of its reference count is identified as zero.
The data storage device of this embodiment can be used to carry out the technical solutions of the method embodiments above; its implementation principle and technical effect are similar and are not described again here.
Those of ordinary skill in the art will understand that all or part of the steps of the method embodiments above can be carried out by hardware under the control of program instructions. The program can be stored in a computer-readable storage medium and, when executed, performs the steps of the method embodiments above. The storage medium includes any medium that can store program code, such as ROM, RAM, a magnetic disk, or an optical disc.
Finally, it should be noted that the embodiments above are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in those embodiments can still be modified, or some or all of their technical features can be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the invention.

Claims (10)

1. A data storage method, characterized in that it comprises:
matching the fingerprint of each data block of a file to be stored against the fingerprints in a fingerprint database to obtain a corresponding backup data block;
performing a deduplication operation on the file to be stored according to the backup data block, and setting a status identifier for the backup data block; and
reclaiming the backup data block according to the status identifier of the backup data block.
2. The method according to claim 1, characterized in that matching the fingerprint of each data block of the file to be stored against the fingerprints in the fingerprint database to obtain the corresponding backup data block comprises:
dividing the file to be stored into data blocks and calculating the fingerprint of each data block;
sampling the fingerprints of the data blocks, and generating a fingerprint sampling table of the file to be stored from the sampled fingerprints; and
determining, according to the fingerprint sampling table and a group sampling database, the similar group to which the file to be stored belongs in the group sampling database, and taking the stored data blocks corresponding to the similar group as the backup data block, wherein the group sampling database is obtained by sampling the fingerprint database, and the similar group is the sampling group in the group sampling database that matches the sampled fingerprints in the fingerprint sampling table of the file to be stored.
3. The method according to claim 1, characterized in that performing the deduplication operation on the file to be stored according to the backup data block, and setting the status identifier for the backup data block, comprises:
before performing the deduplication operation on the file to be stored according to the backup data block, incrementing the group count of the backup data block by one; and
after finishing the deduplication operation on the file to be stored according to the backup data block, decrementing the group count of the backup data block by one.
4. The method according to claim 3, characterized in that reclaiming the backup data block according to the status identifier of the backup data block comprises:
when the group count in the status identifier of the backup data block is identified as non-zero, suspending reclamation of the backup data block; and
when the group count in the status identifier of the backup data block is identified as zero, triggering reclamation of the backup data block.
5. The method according to claim 1, 2, or 3, characterized in that reclaiming the backup data block according to the status identifier of the backup data block comprises:
when a change in the value of the reference count of the backup data block is detected, identifying the status identifier of the corresponding backup data block;
when the status identifier of the corresponding backup data block shows that the backup data block is not in use, identifying the value of the reference count of the backup data block; and
when the value of the reference count of the backup data block is identified as zero, triggering reclamation of the backup data block.
6. A data storage device, characterized in that it comprises:
a backup data block acquisition module, configured to match the fingerprint of each data block of a file to be stored against the fingerprints in a fingerprint database to obtain a corresponding backup data block;
a deduplication module, configured to perform a deduplication operation on the file to be stored according to the backup data block, and to set a status identifier for the backup data block; and
a reclamation module, configured to reclaim the backup data block according to the status identifier of the backup data block.
7. The device according to claim 6, characterized in that the backup data block acquisition module comprises:
a fingerprint calculation unit, configured to divide the file to be stored into data blocks and calculate the fingerprint of each data block;
a fingerprint sampling unit, configured to sample the fingerprints of the data blocks and generate a fingerprint sampling table of the file to be stored from the sampled fingerprints; and
a group determination unit, configured to determine, according to the fingerprint sampling table and a group sampling database, the similar group to which the file to be stored belongs in the group sampling database, and to take the stored data blocks corresponding to the similar group as the backup data block, wherein the group sampling database is obtained by sampling the fingerprint database, and the similar group is the sampling group in the group sampling database that matches the sampled fingerprints in the fingerprint sampling table of the file to be stored.
8. The device according to claim 6, characterized in that the deduplication module comprises:
a first counting unit, configured to increment the group count of the backup data block by one before the deduplication operation is performed on the file to be stored according to the backup data block; and
a second counting unit, configured to decrement the group count of the backup data block by one after the deduplication operation on the file to be stored according to the backup data block is finished.
9. The device according to claim 8, characterized in that the reclamation module comprises:
a reclamation suspension unit, configured to suspend reclamation of the backup data block when the group count in the status identifier of the backup data block is identified as non-zero; and
a first reclamation trigger unit, configured to trigger reclamation of the backup data block when the group count in the status identifier of the backup data block is identified as zero.
10. The device according to claim 6, 7, or 8, characterized in that the reclamation module comprises:
a reference count monitoring unit, configured to identify the status identifier of the corresponding backup data block when a change in the value of the reference count of the backup data block is detected;
a reference count identification unit, configured to identify the value of the reference count of the backup data block when the status identifier of the corresponding backup data block shows that the backup data block is not in use; and
a second reclamation trigger unit, configured to trigger reclamation of the backup data block when the value of the reference count of the backup data block is identified as zero.
CN201210552099.7A 2012-12-18 2012-12-18 Data storage method and equipment Expired - Fee Related CN102982180B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210552099.7A CN102982180B (en) 2012-12-18 2012-12-18 Date storage method and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210552099.7A CN102982180B (en) 2012-12-18 2012-12-18 Date storage method and equipment

Publications (2)

Publication Number Publication Date
CN102982180A true CN102982180A (en) 2013-03-20
CN102982180B CN102982180B (en) 2016-08-03

Family

ID=47856196

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210552099.7A Expired - Fee Related CN102982180B (en) 2012-12-18 2012-12-18 Date storage method and equipment

Country Status (1)

Country Link
CN (1) CN102982180B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101582076A (en) * 2009-06-24 2009-11-18 浪潮电子信息产业股份有限公司 Data de-duplication method based on data base
CN101599079A (en) * 2009-07-22 2009-12-09 中国科学院计算技术研究所 A kind of Backup Data is concentrated the management method of storage
CN101706825A (en) * 2009-12-10 2010-05-12 华中科技大学 Replicated data deleting method based on file content types
CN102222085A (en) * 2011-05-17 2011-10-19 华中科技大学 Data de-duplication method based on combination of similarity and locality

Cited By (38)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103886070A (en) * 2014-03-21 2014-06-25 华为技术有限公司 Method and device for recycling data of file system
CN103973708A (en) * 2014-05-26 2014-08-06 中电长城网际***应用有限公司 Determination method and system for data breach event
CN105468533B (en) * 2014-09-10 2019-02-19 华为技术有限公司 Method for writing data, device and memory
WO2016037560A1 (en) * 2014-09-10 2016-03-17 华为技术有限公司 Data writing method and apparatus and memory
CN105468533A (en) * 2014-09-10 2016-04-06 华为技术有限公司 Data writing method and apparatus, and memory
CN104598927A (en) * 2015-01-29 2015-05-06 中国科学院深圳先进技术研究院 Large-scale graph partitioning method and system
CN104881475A (en) * 2015-06-02 2015-09-02 北京京东尚科信息技术有限公司 Method and system for randomly sampling big data
CN106959888A (en) * 2016-01-11 2017-07-18 杭州海康威视数字技术股份有限公司 Task processing method and device in cloud storage system
CN109416681B (en) * 2016-08-29 2022-03-18 国际商业机器公司 Deduplication for workload optimization using ghost fingerprints
CN109416681A (en) * 2016-08-29 2019-03-01 国际商业机器公司 The data de-duplication of workload optimization is carried out using ghost fingerprint
CN106708927A (en) * 2016-11-18 2017-05-24 北京二六三企业通信有限公司 Duplicate removal processing method and duplicate removal processing device for files
CN106775501A (en) * 2017-02-14 2017-05-31 华南师范大学 Elimination of Data Redundancy method and system based on nonvolatile memory equipment
CN106775501B (en) * 2017-02-14 2019-06-11 华南师范大学 Elimination of Data Redundancy system based on nonvolatile memory equipment
CN107193503B (en) * 2017-05-27 2020-05-29 杭州宏杉科技股份有限公司 Data deduplication method and storage device
CN107193503A (en) * 2017-05-27 2017-09-22 杭州宏杉科技股份有限公司 A kind of data delete method and storage device again
CN110945483A (en) * 2017-08-25 2020-03-31 华为技术有限公司 Network system and method for data de-duplication
CN109753228B (en) * 2017-11-08 2022-08-02 阿里巴巴集团控股有限公司 Snapshot deleting method, device and system
CN109753228A (en) * 2017-11-08 2019-05-14 阿里巴巴集团控股有限公司 Snapshot delet method, apparatus and system
CN108021828B (en) * 2017-12-06 2020-01-24 湖南文理学院 Computer information data multistage protection system
CN108021828A (en) * 2017-12-06 2018-05-11 湖南文理学院 A kind of computer information data multi-stage protection system
CN111125033A (en) * 2018-10-31 2020-05-08 深信服科技股份有限公司 Space recovery method and system based on full flash memory array
CN111125033B (en) * 2018-10-31 2024-04-09 深信服科技股份有限公司 Space recycling method and system based on full flash memory array
CN111522502B (en) * 2019-02-01 2022-04-29 阿里巴巴集团控股有限公司 Data deduplication method and device, electronic equipment and computer-readable storage medium
CN111522502A (en) * 2019-02-01 2020-08-11 阿里巴巴集团控股有限公司 Data deduplication method and device, electronic equipment and computer-readable storage medium
CN111581955A (en) * 2019-02-15 2020-08-25 阿里巴巴集团控股有限公司 Text fingerprint extraction and verification method and device
CN110647294B (en) * 2019-09-09 2022-03-25 Oppo广东移动通信有限公司 Storage block recovery method and device, storage medium and electronic equipment
CN110647294A (en) * 2019-09-09 2020-01-03 Oppo(重庆)智能科技有限公司 Storage block recovery method and device, storage medium and electronic equipment
CN111124750A (en) * 2019-11-05 2020-05-08 国家电网有限公司 Data rapid deleting method based on source-end deduplication
CN111124750B (en) * 2019-11-05 2024-04-30 国家电网有限公司 Data rapid deleting method based on source-end deduplication
CN111143343A (en) * 2019-12-27 2020-05-12 南京壹进制信息科技有限公司 Data efficient deleting method and system based on source-end deduplication
CN111143343B (en) * 2019-12-27 2023-12-15 航天壹进制(江苏)信息科技有限公司 Efficient data deleting method and system based on source-end deduplication
CN113568877A (en) * 2020-04-28 2021-10-29 杭州海康威视数字技术股份有限公司 File merging method and device, electronic equipment and storage medium
CN111897845A (en) * 2020-07-29 2020-11-06 徐州金蝶软件有限公司 Method and system for processing mass credit information based on process
CN111897845B (en) * 2020-07-29 2023-10-31 江苏新蝶数字科技有限公司 Method and system for processing massive credit information based on flow
CN115543979A (en) * 2022-09-29 2022-12-30 广州鼎甲计算机科技有限公司 Method, device, equipment, storage medium and program product for deleting repeated data
CN115543979B (en) * 2022-09-29 2023-08-08 广州鼎甲计算机科技有限公司 Method, apparatus, device, storage medium and program product for deleting duplicate data
CN117369731A (en) * 2023-12-07 2024-01-09 苏州元脑智能科技有限公司 Data reduction processing method, device, equipment and medium
CN117369731B (en) * 2023-12-07 2024-02-27 苏州元脑智能科技有限公司 Data reduction processing method, device, equipment and medium

Also Published As

Publication number Publication date
CN102982180B (en) 2016-08-03

Similar Documents

Publication Publication Date Title
CN102982180A (en) Method and device for storing data
US11474972B2 (en) Metadata query method and apparatus
US8782011B2 (en) System and method for scalable reference management in a deduplication based storage system
EP2521966B1 (en) Systems and methods for removing unreferenced data segments from deduplicated data systems
CN111913909A (en) Re-fragmentation method and system in distributed storage system
Tan et al. CloST: a Hadoop-based storage system for big spatio-temporal data analytics
CN109445702B (en) Block-level data deduplication storage system
US20120185447A1 (en) Systems and Methods for Providing Increased Scalability in Deduplication Storage Systems
KR20200004357A (en) Packing objects by predicted lifespan in cloud storage
CN103577336B (en) Stored data processing method and device
CN105787037B (en) Method and device for deleting duplicate data
CN102521269A (en) Index-based computer continuous data protection method
CN103797470A (en) Storage system
US10235286B1 (en) Data storage system dynamically re-marking slices for reclamation from internal file system to pool storage
CN103858125B (en) Duplicate data processing method, device, storage controller and storage node
CN113836084A (en) Data storage method, device and system
CN102495894A (en) Method, device and system for searching repeated data
CN102591864B (en) Data updating method and device in comparison system
CN105183399A (en) Data writing and reading method and device based on elastic block storage
CN104462389A (en) Method for implementing distributed file systems on basis of hierarchical storage
CN102591789A (en) Storage space recovery method and storage space recovery device
CN104735110A (en) Metadata management method and system
US20200320040A1 (en) Container index persistent item tags
CN105095495A (en) Distributed file system cache management method and system
CN111913925A (en) Data processing method and system in distributed storage system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20170508

Address after: 510640 Guangdong City, Tianhe District Province, No. five, road, public education building, unit 371-1, unit 2401

Patentee after: Guangdong Gaohang Intellectual Property Operation Co., Ltd.

Address before: 518129 Huawei headquarters office building, Bantian, Longgang District, Shenzhen, Guangdong

Patentee before: Huawei Technologies Co., Ltd.

CB03 Change of inventor or designer information

Inventor after: Xiao Wenchang

Inventor before: Fu Xudong

Inventor before: Duan Yumei

TR01 Transfer of patent right

Effective date of registration: 20170519

Address after: 414000 Zhongke Industrial Park, Yueyang Road, Yueyang Economic and Technological Development Zone, Hunan

Patentee after: Hunan and Magnetic Technology Co., Ltd.

Address before: 510640 Guangdong City, Tianhe District Province, No. five, road, public education building, unit 371-1, unit 2401

Patentee before: Guangdong Gaohang Intellectual Property Operation Co., Ltd.

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160803

Termination date: 20171218