CN108134775A - Data processing method and device - Google Patents
- Publication number
- CN108134775A (CN201711167866.1A)
- Authority
- CN
- China
- Prior art keywords
- data block
- data
- equipment
- fingerprint
- block
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/04—Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
- H04L63/0428—Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/56—Provisioning of proxy services
- H04L67/565—Conversion or adaptation of application format or content
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Computer Security & Cryptography (AREA)
- Computer Hardware Design (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
This application discloses a data processing method and device in the field of computer technology that help save bandwidth resources. The method includes: a first device calculates the similarity fingerprints of data to be transmitted, which include the similarity fingerprint of a first data block; the first device sends the similarity fingerprints of the data to be transmitted to a second device, where they are used to look up whether the second device stores a reference data block similar to the data to be transmitted; the first device receives the fingerprint of the reference data block sent by the second device, which includes the fingerprint of a first reference data block whose similarity fingerprint is identical to that of the first data block; the first device finds the first reference data block in the first device according to the fingerprint of the first reference data block; and the first device sends data to the second device based on the fingerprint of the reference data block, where the data includes the difference data between the first reference data block and the first data block.
Description
Technical field
This application relates to the field of computer technology, and in particular to a data processing method and device.
Background
With the continued growth of the cloud computing industry, mainstream vendors have begun large-scale deployment of cloud infrastructure such as cloud computing centers, cloud disaster recovery centers, and edge clouds. This infrastructure forms a complex wide area network (WAN) topology, and transferring data between these sites consumes a large amount of WAN bandwidth. When bandwidth is limited, WAN acceleration techniques are commonly used to save it.
WAN acceleration installs a dedicated device, a WAN accelerator (ACC), at each end of a WAN link. By caching some or all of the data that has been transmitted and applying data deduplication, the WAN accelerators reduce the amount of data sent over the WAN link, which saves bandwidth, shortens total transmission time, and thus accelerates information transfer. Specifically, if the WAN accelerator installed on the sending-end device side determines that the data the sending-end device is sending to the receiving-end device is already stored in the WAN accelerator installed on the receiving-end device side, it need not send that data to the receiving-end device, which saves bandwidth.
In practice, however, the data sent by the sending-end device is seldom identical to the data cached by the WAN accelerator on the receiving-end device side, so the bandwidth savings that this WAN acceleration technique can achieve are limited.
Summary of the invention
To this end, this application provides a data processing method and device that help save bandwidth resources.
In a first aspect, this application provides a data processing method, which may include the following. A first device calculates the similarity fingerprints of data to be transmitted, where the similarity fingerprints of the data to be transmitted include the similarity fingerprint of a first data block, and the first data block is one data block in the data to be transmitted. The first device sends the similarity fingerprints of the data to be transmitted to a second device, where they are used to look up whether the second device stores a reference data block similar to the data to be transmitted. The first device receives the fingerprint of the reference data block sent by the second device, where the fingerprint of the reference data block includes the fingerprint of a first reference data block, and the similarity fingerprint of the first reference data block is identical to that of the first data block. The first device finds the first reference data block in the first device according to the fingerprint of the first reference data block. The first device sends data to the second device based on the fingerprint of the reference data block, where the data includes the difference data between the first reference data block and the first data block. In this technical solution, the first device exchanges information with the second device and, upon determining that the second device stores a first reference data block similar to the first data block, sends the second device only the difference data between the first data block and the first reference data block. Compared with transmitting the first data block itself, this saves bandwidth, which shortens transmission time and thus speeds up information transfer.
Here, the fingerprint of a data block is identification information that marks the data block and is derived from all of the block's characteristic information. Different data blocks have different fingerprints. The similarity fingerprint of a data block is identification information that marks the data block and is derived from specific characteristic information of the block. The similarity fingerprints of different data blocks may be identical or different.
The similarity fingerprints of the data to be transmitted may include the similarity fingerprint of each data block in the data to be transmitted. The fingerprint of the reference data block may include the fingerprints of the reference data blocks similar to each data block in the data to be transmitted. A reference data block similar to the data to be transmitted is, specifically, a reference data block similar to a data block in the data to be transmitted.
In a possible design, the first device calculating the similarity fingerprints of the data to be transmitted may include: the first device hashes the data blocks of the data to be transmitted using a locality-sensitive hashing algorithm to obtain the similarity fingerprints of the data blocks. The locality-sensitive hashing algorithm may be, for example but not limited to, minhash or simhash.
In a possible design, the first device hashing the data blocks of the data to be transmitted using a locality-sensitive hashing algorithm to obtain the similarity fingerprints may include: the first device splits the data to be transmitted into data blocks and, for each data block, performs the following operations. It extracts at least one sub-block from the data block. It hashes the at least one sub-block with each of m hash algorithms to obtain m hash sequences, where hashing the at least one sub-block with one hash algorithm yields one hash sequence, and m is an integer greater than or equal to 2. It then merges the maximum value of each of the m hash sequences and uses the merged hash sequence as the similarity fingerprint of the data block; alternatively, it merges the minimum value of each of the m hash sequences and uses the merged hash sequence as the similarity fingerprint of the data block. This possible design provides a specific implementation of minhash in which the m hash sequences can be computed in parallel, shortening the time spent calculating similarity fingerprints.
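The design above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the 8-byte overlapping sub-blocks, and the use of salted SHA-256 to stand in for the m hash algorithms are all hypothetical choices, and "merging" is modeled as collecting the m minimum values into a tuple.

```python
import hashlib


def extract_sub_blocks(block: bytes, size: int = 8) -> list[bytes]:
    """Extract overlapping sub-blocks of `size` bytes (assumed extraction rule)."""
    if len(block) <= size:
        return [block]
    return [block[i:i + size] for i in range(len(block) - size + 1)]


def similarity_fingerprint(block: bytes, m: int = 4, size: int = 8) -> tuple[int, ...]:
    """Hash all sub-blocks with m hash functions and merge the m minima."""
    if m < 2:
        raise ValueError("m must be an integer greater than or equal to 2")
    subs = extract_sub_blocks(block, size)
    minima = []
    for k in range(m):
        # A different salt per k emulates m distinct hash algorithms (assumption).
        salt = k.to_bytes(4, "big")
        hash_sequence = [
            int.from_bytes(hashlib.sha256(salt + s).digest()[:8], "big")
            for s in subs
        ]
        # Keep the minimum of this hash sequence; max() would be the alternative.
        minima.append(min(hash_sequence))
    # "Merging" is modeled here as concatenating the m minima into one tuple.
    return tuple(minima)
```

Each iteration of the outer loop is independent of the others, which is what would allow the m hash sequences to be computed in parallel; the sketch runs them sequentially for simplicity.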
In a possible design, the method may further include: the first device performs differential compression on the first reference data block and the first data block using a differential compression algorithm. This further saves bandwidth, which shortens transmission time and speeds up information transfer.
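To make the idea of difference data concrete, the following sketch encodes a data block as copy and insert operations against a reference block and reconstructs it on the other side. The source does not name a differential compression algorithm; Python's standard `difflib.SequenceMatcher` is used here purely as a stand-in, and `make_delta` and `apply_delta` are hypothetical names.

```python
from difflib import SequenceMatcher


def make_delta(reference: bytes, block: bytes) -> list[tuple]:
    """Describe `block` as copies from `reference` plus literal inserts."""
    ops = []
    matcher = SequenceMatcher(None, reference, block, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Bytes shared with the reference are sent as an offset range only.
            ops.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            # Bytes absent from the reference must be sent literally.
            ops.append(("insert", block[j1:j2]))
        # "delete" needs no operation: those reference bytes are simply skipped.
    return ops


def apply_delta(reference: bytes, delta: list[tuple]) -> bytes:
    """Rebuild the original block from the reference block and the delta."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            out += reference[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)
```

The receiver can run `apply_delta` only if it holds the same reference block, which is why the two devices first agree on the reference block through its fingerprint.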
In a possible design, the similarity fingerprints of the data to be transmitted further include the similarity fingerprint of a second data block, which is another data block in the data to be transmitted; the fingerprint of the reference data block does not include the fingerprint of a second reference data block, where the similarity fingerprint of the second reference data block is identical to that of the second data block; and the data further includes the second data block. In other words, when no reference data block similar to the second data block is returned, the second data block itself is sent.
In a possible design, the first device includes a first-level cache and a second-level cache. The first-level cache is a non-persistent medium and the second-level cache is a persistent medium. The first-level cache is used to cache some or all of the data blocks stored in the second-level cache, together with the fingerprints and similarity fingerprints of those data blocks. The method may further include: the first device looks up the fingerprint of the first reference data block in the first-level cache; if the fingerprint is not found there, the first device looks it up in the second-level cache. Because data blocks in the first-level cache have a high hit probability, the first reference data block can usually be found in the first-level cache, which improves lookup efficiency.
In a possible design, the second-level cache includes one or more containers. Each container is a set consisting of at least two data blocks and the fingerprint and similarity fingerprint of each of those data blocks, and the contents of the at least two data blocks in a container are correlated. The method may further include: if the first device finds a data block in the second-level cache, it caches the container holding that data block into the first-level cache. Because data blocks in the first-level cache have a high hit probability, the data block can usually be found in the first-level cache, which improves lookup efficiency.
In a second aspect, this application provides a data processing method, which may include the following. A second device receives the similarity fingerprints of data to be transmitted sent by a first device, where the similarity fingerprints of the data to be transmitted include the similarity fingerprint of a first data block, and the first data block is one data block in the data to be transmitted. According to the similarity fingerprints of the data to be transmitted, the second device finds that it stores a reference data block similar to the data to be transmitted, where the reference data block includes a first reference data block whose similarity fingerprint is identical to that of the first data block. The second device sends the fingerprint of the reference data block to the first device, where the fingerprint of the reference data block includes the fingerprint of the first reference data block. The fingerprint of the reference data block is used by the first device to send data to the second device, and the data includes the difference data between the first reference data block and the first data block. The second device receives the data sent by the first device. In this technical solution, the second device exchanges information with the first device and, upon determining that it stores a reference data block similar to the first data block, sends the fingerprint of that reference data block to the first device, so that the first device sends the second device the difference data between the first data block and the reference data block according to that fingerprint. Compared with transmitting the first data block itself, this saves bandwidth, which shortens transmission time and thus speeds up information transfer.
In a possible design, the method may further include: the second device receives the fingerprints of the data to be transmitted sent by the first device, where they include the fingerprint of the first data block; when the second device finds, according to the fingerprint of the first data block, that the first data block is not stored in the second device, it looks up, according to the similarity fingerprint of the first data block, whether the first reference data block is stored in the second device. If the first data block is stored in the second device, the first device need not send the first data block to the second device at all. This technical solution therefore saves even more bandwidth than one that transmits the difference data between the first data block and the first reference data block.
In a possible design, the similarity fingerprints of the data to be transmitted include the similarity fingerprint of a second data block, which is another data block in the data to be transmitted; the fingerprint of the reference data block does not include the fingerprint of a second reference data block, where the similarity fingerprint of the second reference data block is identical to that of the second data block; and the data further includes the second data block.
In a possible design, the second device includes a first-level cache and a second-level cache. The first-level cache is a non-persistent medium and the second-level cache is a persistent medium. The first-level cache is used to cache some or all of the data blocks stored in the second-level cache, together with the fingerprints and similarity fingerprints of those data blocks. The method may further include: the second device looks up the fingerprint (or similarity fingerprint) of the first data block in the first-level cache; if it is not found there, the second device looks it up in the second-level cache. Because data blocks in the first-level cache have a high hit probability, the first data block (or the first reference data block) can usually be found in the first-level cache, which improves lookup efficiency.
In a possible design, the second-level cache includes one or more containers. Each container is a set consisting of at least two data blocks and the fingerprint and similarity fingerprint of each of those data blocks, and the contents of the at least two data blocks in a container are correlated. The method may further include: if the second device finds a data block in the second-level cache, it caches the container holding that data block into the first-level cache. The data block may be any data block in the data to be transmitted, or a reference data block similar to any data block in the data to be transmitted. Because data blocks in the first-level cache have a high hit probability, the data block can usually be found in the first-level cache, which improves lookup efficiency.
In a third aspect, this application provides a data processing device for performing any method provided in the first aspect. The data processing device may specifically be the first device described above.
In a possible design, the data processing device may be divided into functional modules according to the method provided in the first aspect. For example, each functional module may correspond to one function, or two or more functions may be integrated into one processing module.
In another possible design, the data processing device may include a memory and a processor. The memory is used to store a computer program, and when the program is executed by the processor, any method provided in the first aspect is performed.
In a fourth aspect, this application provides a data processing device for performing any method provided in the second aspect. The data processing device may specifically be the second device described above.
In a possible design, the data processing device may be divided into functional modules according to the method provided in the second aspect. For example, each functional module may correspond to one function, or two or more functions may be integrated into one processing module.
In another possible design, the data processing device may include a memory and a processor. The memory is used to store a computer program, and when the program is executed by the processor, any method provided in the second aspect is performed.
The embodiments of this application further provide a processing apparatus for implementing the functions of the first device or the second device, including a processor and an interface. The processing apparatus may be a chip, and the processor may be implemented in hardware or in software. When implemented in hardware, the processor may be a logic circuit, an integrated circuit, or the like. When implemented in software, the processor may be a general-purpose processor that reads software code stored in a memory; the memory may be integrated in the processor, or may be located outside the processor and exist independently.
This application further provides a computer-readable storage medium storing a computer program that, when run on a computer, causes the computer to perform any possible method of the first aspect or the second aspect.
This application further provides a computer program product that, when run on a computer, causes any method provided in the first aspect or the second aspect to be performed.
It should be understood that any data processing device, computer storage medium, or computer program product provided above is used to perform the corresponding method provided above. For the beneficial effects it can achieve, refer to the beneficial effects of the corresponding method; they are not repeated here.
Brief description of the drawings
Fig. 1 is an architecture diagram of a system to which the data processing method provided by the embodiments of this application applies;
Fig. 2 is an interaction diagram of a data processing method provided by the embodiments of this application;
Fig. 3 is a schematic diagram of the size relationship among data to be transmitted, unit lengths, and data blocks provided by the embodiments of this application;
Fig. 4 is a schematic diagram of a process for calculating the similarity fingerprint of a data block provided by the embodiments of this application;
Fig. 5 is a schematic diagram of a first transmission list provided by the embodiments of this application;
Fig. 6 is a flow chart for determining the category of a data block provided by the embodiments of this application;
Fig. 7 is a schematic diagram of a second transmission list provided by the embodiments of this application;
Fig. 8 is a schematic diagram of a third transmission list provided by the embodiments of this application;
Fig. 9 is an architecture diagram of another system to which the data processing method provided by the embodiments of this application applies;
Fig. 10 is a schematic diagram of a process for updating the information stored in the first-level cache provided by the embodiments of this application;
Fig. 11 is a structural diagram of a data processing device provided by the embodiments of this application;
Fig. 12 is a structural diagram of another data processing device provided by the embodiments of this application.
Detailed description of embodiments
A data block is a set formed by a part of the data to be transmitted. Different data blocks may have the same size or different sizes.
The fingerprint of a data block is identification information that marks the data block and is derived from all of the block's characteristic information. Different data blocks have different fingerprints.
The similarity fingerprint of a data block is identification information that marks the data block and is derived from specific characteristic information of the block. For example, suppose a data block is the character string "78905". If the specific characteristic information is the characteristic information of the character at the 2nd position, the similarity fingerprint of the data block is the characteristic information of "8"; if it is the characteristic information of the character at the 5th position, the similarity fingerprint is the characteristic information of "5". The similarity fingerprints of different data blocks may be identical or different. For example, suppose two data blocks are the character strings "78905" and "12345". If the specific characteristic information is that of the character at the 2nd position, the two data blocks have different similarity fingerprints, namely the characteristic information of "8" and of "2" respectively; if it is that of the character at the 5th position, the two data blocks have the same similarity fingerprint, namely the characteristic information of "5". Here, the characteristic information of a character may be the character itself, or information about the character calculated by a specific algorithm.
A reference data block similar to a given data block is a data block that has the same similarity fingerprint as that data block. For example, if two data blocks are the character strings "78905" and "12345" and the specific characteristic information is that of the character at the 5th position, the two data blocks have the same similarity fingerprint, namely the characteristic information of "5". In this case, "78905" can serve as the reference data block of "12345", and likewise "12345" can serve as the reference data block of "78905".
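The character-string example can be reproduced directly in code. In this toy sketch the characteristic information of a character is taken to be the character itself (one of the two options the text allows), and the function names are hypothetical.

```python
def toy_similarity_fingerprint(block: str, position: int) -> str:
    """Return the character at the given 1-based position as the fingerprint."""
    return block[position - 1]


def is_reference_for(candidate: str, block: str, position: int) -> bool:
    """Two blocks are similar when their similarity fingerprints are equal."""
    return (
        toy_similarity_fingerprint(candidate, position)
        == toy_similarity_fingerprint(block, position)
    )
```

With position 2 the strings "78905" and "12345" have different fingerprints, while with position 5 they share one, so each can serve as the reference data block of the other.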
A container is a set of information. One container may include multiple data blocks together with the fingerprint and similarity fingerprint of each of those data blocks. Each container has a container identifier that marks the container. An accelerator can schedule the information in a container as a unit, for example by writing all of the information in a container to a cache.
The term "and/or" in this application merely describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" can mean: A alone, both A and B, or B alone. The symbol "/" indicates an "or" relationship between the associated objects; for example, A/B means A or B. The terms "first", "second", and so on are used to distinguish different objects, not to describe a particular order of the objects. "Multiple" means two or more.
Fig. 1 is an architecture diagram of a system to which the data processing method provided by the embodiments of this application applies. The system shown in Fig. 1 includes a sending-end device 1, a sending-end accelerator 2, a receiving-end accelerator 3, and a receiving-end device 4. The sending-end device 1 transmits information to the receiving-end device 4 through the sending-end accelerator 2 and the receiving-end accelerator 3. The sending-end accelerator 2 is installed on the sending-end device 1 side, the receiving-end accelerator 3 is installed on the receiving-end device 4 side, and the two accelerators communicate over a WAN. The sending-end device 1 and the receiving-end device 4 may each be a data center, for example a cloud computing center, a cloud disaster recovery center, or an edge cloud. The sending-end accelerator 2 and the receiving-end accelerator 3 may be collectively referred to as WAN accelerators. The data processing method provided by this application can be applied in scenarios such as data backup and data recovery.
It should be understood that the sending-end device in one data transfer may act as the receiving-end device in another; correspondingly, the sending-end accelerator installed on its side acts as the receiving-end accelerator in that other transfer. Similarly, the receiving-end device in one data transfer may act as the sending-end device in another; correspondingly, the receiving-end accelerator installed on its side acts as the sending-end accelerator in that other transfer.
It should be understood that the sending-end accelerator 2 may be the first device described in this application and the receiving-end accelerator 3 may be the second device. Alternatively, the first device may be the sending-end device 1 and the second device may be the receiving-end device 4. In another implementation, the first device may refer to the sending-end device 1 together with the sending-end accelerator 2, and the second device to the receiving-end accelerator 3 together with the receiving-end device 4.
Fig. 2 is an interaction diagram of a data processing method provided by the embodiments of this application. The method shown in Fig. 2 can be applied in the system architecture shown in Fig. 1 and includes the following steps S101 to S110:
S101: The sending-end device sends data to be transmitted to the sending-end accelerator.
In the cloud computing field, large amounts of data typically need to be transferred, periodically or aperiodically, from the sending-end device to the receiving-end device through the sending-end accelerator and the receiving-end accelerator. The size of the data to be transferred each time (i.e., the data to be transmitted) may be the same or different. For example, the data to be transferred on one occasion may be 10GB in size.
S102: After receiving the data to be transmitted from the sending-end device, the sending-end accelerator splits it into several unit lengths and then performs variable-length chunking on the data of each unit length to obtain several data blocks.
Because the size of the data to be transferred may differ from one transfer to the next, the concept of a "unit length" is introduced for ease of management. The size of a unit length may be, for example but not limited to, 4MB. In general, the sending-end accelerator and the receiving-end accelerator process the data of one unit length as a whole (for example, calculating its fingerprints and similarity fingerprints together, or transmitting it together).
Variable-length chunking is a chunking algorithm that divides data into blocks according to the data content. It may be implemented, for example but not limited to, using a sliding window and Rabin fingerprints. Any two data blocks obtained by variable-length chunking may or may not be equal in size, and the number of data blocks obtained from different unit lengths may or may not be the same. For example, assuming a unit length of 4MB and an average block size of about 8KB, some unit lengths yield 511 data blocks after variable-length chunking, some yield 512, some yield 514, and so on. Because the block boundaries in variable-length chunking depend on the data content, data whose position has shifted is still split into the same data blocks as before the shift. This benefits the subsequent deduplication service, i.e., the service described below in which first-category data blocks are no longer transmitted.
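A minimal sketch of content-defined (variable-length) chunking follows. It is an illustration under stated assumptions rather than the method of the source: a simple polynomial rolling hash over a sliding window stands in for a true Rabin fingerprint, and the window size, boundary mask, and minimum and maximum block sizes are hypothetical values chosen to keep the example small.

```python
def chunk(data: bytes, window: int = 16, mask: int = 0x3F,
          min_size: int = 32, max_size: int = 512) -> list[bytes]:
    """Split data where a rolling hash of the last `window` bytes matches a mask."""
    base, mod = 257, (1 << 31) - 1
    top = pow(base, window - 1, mod)  # weight of the oldest byte in the window
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        n = i - start  # bytes already consumed in the current chunk
        if n >= window:
            # Slide the window: drop the contribution of the oldest byte.
            h = (h - data[i - window] * top) % mod
        h = (h * base + byte) % mod
        size = n + 1
        # Cut at a content-defined boundary, or force a cut at max_size.
        if (size >= min_size and (h & mask) == mask) or size >= max_size:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks
```

Because a boundary is declared from the hash of the most recent `window` bytes only, inserting or deleting bytes early in the data tends to change only nearby boundaries, and later blocks are usually cut at the same content as before.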
Fig. 3 is a schematic diagram of the size relationship among the data to be transmitted, unit lengths, and data blocks provided by the embodiments of this application. Fig. 3 takes as an example data to be transmitted of 10GB, a unit length of 4MB, and an average data block size of about 8KB. In this example, after receiving the 10GB of data to be transmitted from the sending-end device, the sending-end accelerator may first split it, in order and at a granularity of 4MB, into 2560 unit lengths (labeled 1 to 2560 in Fig. 3). It then performs variable-length chunking on the 4MB of data in each unit length, dividing it into 512 data blocks (labeled 1 to 512 in Fig. 3).
It should be noted that the chunking algorithm is not limited to a variable-length chunking algorithm; for example, it may also be a fixed-length chunking algorithm. With fixed-length chunking, all data blocks obtained from the data of each unit length have the same size.
S103: For any unit length, the transmitting-end accelerator calculates the fingerprint and the similar fingerprint of each data block in the unit length.
The transmitting-end accelerator may calculate the fingerprint of a data block using a hash algorithm. The hash algorithm may be, for example but not limited to, any of the following: secure hash algorithm 1 (SHA-1), message digest algorithm 5 (MD5), a modulo algorithm, an algorithm that takes a subset of the bytes, and so on.
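As a minimal sketch of the fingerprint calculation, the following uses SHA-1 from the Python standard library. The function name `fingerprint` is an assumption introduced for illustration; any of the hash algorithms listed above could be substituted.

```python
import hashlib

def fingerprint(block: bytes) -> bytes:
    """Return the 20-byte SHA-1 digest of a data block as its fingerprint."""
    return hashlib.sha1(block).digest()

same_a = fingerprint(b"data block content")
same_b = fingerprint(b"data block content")
other = fingerprint(b"data block content!")
```

Identical data blocks always produce identical fingerprints, which is what allows the receiving end to recognise a block it already stores from the fingerprint alone.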
The transmitting-end accelerator may calculate the similar fingerprint of a data block using a locality-sensitive hashing (LSH) algorithm. Locality-sensitive hashing is a method that improves the efficiency of similarity search by designing hash functions that satisfy a special property, namely locality sensitivity. The similar fingerprints obtained for the same data block using different locality-sensitive hashing algorithms may or may not be the same. The locality-sensitive hashing algorithm may be, for example but not limited to, any of the following: MinHash, SimHash, and so on.
Optionally, in order to shorten the time consumed in calculating similar fingerprints, some embodiments of the present application provide a method for calculating the similar fingerprint of a data block, which may specifically include the following steps 1) to 4):
1) For any data block, split the data block into multiple sub-blocks.
The algorithm used to split the data block into multiple sub-blocks may be a fixed-length chunking algorithm or a variable-length chunking algorithm. With a variable-length chunking algorithm, the size of each sub-block may be, for example but not limited to, 8 to 16 bytes. The sizes of different sub-blocks may or may not be the same.
2) Extract n target sub-blocks from the multiple sub-blocks, where n is an integer greater than or equal to 1. Each target sub-block can be regarded as one piece of characteristic information of the data block.
For example, assume that an 8 KB data block is divided into 1000 sub-blocks in step 1) and that the target sub-blocks are the 4k-th sub-blocks, where k is an integer greater than or equal to 1. The extracted target sub-blocks may then be sub-blocks 4, 8, 12, 16, 20, ..., 1000.
Steps 1) and 2) are one specific implementation of extracting characteristic information from a data block. The present application is not limited thereto.
3) Using m different hash algorithms, perform a hash operation on each target sub-block of the data block to obtain m hash sequences, where m is an integer greater than or equal to 2.
Performing a hash operation on one target sub-block yields one hash value. Performing the hash operation on the n target sub-blocks with one hash algorithm yields n hash values, and these n hash values form one hash sequence. Therefore, performing the hash operation on each target sub-block of the data block with m different hash algorithms yields m hash sequences, each of which contains n hash values.
4) Obtain the maximum value in each hash sequence, and use the sequence obtained by merging these m maximum values as the similar fingerprint of the data block. Alternatively, obtain the minimum value in each hash sequence, and use the sequence obtained by merging these m minimum values as the similar fingerprint of the data block. Merging the m values (maximum values or minimum values) means arranging the m values in the order of the m hash algorithms to obtain one sequence. The order of the m hash algorithms may be arbitrary; however, once the order is determined, the same fixed order is used to merge the m values whenever the similar fingerprint of any data block is calculated.
Fig. 4 is a schematic diagram of the process of this optional implementation. Fig. 4 is described using an example in which the m different hash algorithms are hash algorithms 1, 2 and 3, and hash operations are performed on the multiple target sub-blocks using hash algorithms 1, 2 and 3 respectively to obtain hash sequences 1, 2 and 3. In this optional implementation, the transmitting-end accelerator can obtain the m hash sequences in parallel, which shortens the time spent calculating the similar fingerprint of a data block.
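Steps 1) to 4) can be sketched as follows. This is an illustration under stated assumptions rather than the method of the present application: the sub-blocks are split at a fixed length of 8 bytes, every 4th sub-block is taken as a target sub-block (as in the example above), m is 3, and the m "different hash algorithms" are modelled by one keyed hash (BLAKE2b from the Python standard library) with three hypothetical keys.

```python
import hashlib

SUB_BLOCK = 8    # sub-block size in bytes, fixed-length split (assumed)
STRIDE = 4       # every 4th sub-block is a target sub-block (from the example)
# m = 3 hash algorithms, modelled as one keyed hash with three keys (assumed).
KEYS = (b"hash-1", b"hash-2", b"hash-3")

def similar_fingerprint(block: bytes) -> tuple:
    # Step 1: split the data block into sub-blocks.
    subs = [block[i:i + SUB_BLOCK] for i in range(0, len(block), SUB_BLOCK)]
    # Step 2: extract target sub-blocks 4, 8, 12, ... (fall back for tiny blocks).
    targets = subs[STRIDE - 1::STRIDE] or subs or [b""]
    maxima = []
    for key in KEYS:
        # Step 3: one hash sequence per hash algorithm.
        sequence = [
            int.from_bytes(
                hashlib.blake2b(t, key=key, digest_size=8).digest(), "big")
            for t in targets
        ]
        # Step 4: keep the maximum of each sequence, in the fixed order of KEYS.
        maxima.append(max(sequence))
    return tuple(maxima)

block = bytes(range(256)) * 32          # an 8 KB data block
edited = bytearray(block)
edited[0] ^= 0xFF                       # change one byte in sub-block 1
fp_block = similar_fingerprint(block)
fp_edited = similar_fingerprint(bytes(edited))
```

Because sub-block 1 is not a target sub-block, the edited block yields the same similar fingerprint as the original even though its ordinary fingerprint would differ. The three loop iterations are independent of one another, which is what allows the m hash sequences to be computed in parallel.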
It should be noted that, because the transmitting-end accelerator processes the data of one unit length as a whole, in an actual implementation the transmitting-end accelerator does not need to wait until the fingerprints and similar fingerprints of all data blocks of all the data to be transmitted in this transmission have been calculated before performing S104. Instead, once the fingerprints and similar fingerprints of the data blocks in one unit length have been calculated, it can continue with S104 for that unit length.
It should be understood that the fingerprints of the data to be transmitted may include the fingerprint of each data block in the data to be transmitted, and the similar fingerprints of the data to be transmitted may include the similar fingerprint of each data block in the data to be transmitted.
S104: The transmitting-end accelerator sends the fingerprint and the similar fingerprint of each data block in the unit length to the receiving-end accelerator.
For example, the transmitting-end accelerator builds a first transmission list from the fingerprint and the similar fingerprint of each data block in the unit length, and then sends the first transmission list to the receiving-end accelerator. The first transmission list indicates the fingerprint and the similar fingerprint of each data block in the unit length. This embodiment is described using an example in which the transmitting-end accelerator sends the fingerprints and similar fingerprints of one unit length to the receiving-end accelerator in the form of a list; of course, the present application is not limited thereto.
Fig. 5 is a schematic diagram of a first transmission list provided by an embodiment of the present application. The first transmission list may include header information and the fingerprint information of each data block in the unit length, arranged in a certain order (hereinafter referred to as the first sequence). The header information records summary information about the content carried in the first transmission list and may include, for example but not limited to, the number of data blocks in the unit length and the start position of the fingerprint information of the first data block. The start position of the fingerprint information of the first data block may be information indicating which bit of the first transmission list is the first bit occupied by the fingerprint information of the first data block. The first sequence may be, for example but not limited to, the sequence formed by the data blocks obtained when variable-length chunking is performed in S101. For example, if the unit length is divided in turn into data block 1, data block 2, data block 3, ... when variable-length chunking is performed, the fingerprint information of the data blocks in the first transmission list may be, in order: the fingerprint information of data block 1, the fingerprint information of data block 2, the fingerprint information of data block 3, and so on. The fingerprint information of a data block includes the fingerprint of the data block and the similar fingerprint of the data block. Fig. 5 is described using an example in which one unit length includes data block 1 to data block 512. The arrangement of the fingerprint information in the first transmission list is not limited to the example shown in Fig. 5. For example, the fingerprint information in the first transmission list may also be, in order: the fingerprint of data block 1, the fingerprint of data block 2, the fingerprint of data block 3, ..., the fingerprint of data block 512, the similar fingerprint of data block 1, the similar fingerprint of data block 2, the similar fingerprint of data block 3, ..., the similar fingerprint of data block 512.
S105: After receiving the fingerprint and the similar fingerprint of each data block in the unit length sent by the transmitting-end accelerator, the receiving-end accelerator determines the category of each data block in the unit length.
The receiving-end accelerator may determine the category of each data block in turn according to the first sequence, or may determine the categories of at least two data blocks at the same time; the present application does not limit this.
The categories of data blocks include first-class data blocks, second-class data blocks and third-class data blocks. If a data block is a first-class data block, the receiving-end accelerator has already stored the data block. If a data block is a second-class data block, the receiving-end accelerator has not stored the data block but has stored a reference data block similar to it. If a data block is a third-class data block, the receiving-end accelerator has stored neither the data block nor a reference data block similar to it.
For example, for any data block, as shown in Fig. 6, the category of the data block may be determined through the following steps T1 to T6:
T1: The receiving-end accelerator obtains the fingerprint and the similar fingerprint of the data block.
T2: The receiving-end accelerator determines whether the fingerprint of the data block can be found locally.
If so, the receiving-end accelerator has already stored the data block, and T3 is performed.
If not, the receiving-end accelerator has not stored the data block, and T4 is performed.
It should be noted that, in general, when a data block is stored in the receiving-end accelerator, the fingerprint of the data block is stored at the same time. Therefore, determining whether the fingerprint of the data block is stored locally determines whether the data block is stored locally.
T3: The receiving-end accelerator determines that the data block is a first-class data block.
After T3 is performed, the procedure ends.
T4: The receiving-end accelerator determines whether the similar fingerprint of the data block can be found locally.
If so, the receiving-end accelerator has stored a reference data block similar to the data block, and T5 is performed.
If not, the receiving-end accelerator has not stored a reference data block similar to the data block, and T6 is performed.
It should be noted that, in general, when a data block is stored in the receiving-end accelerator, the similar fingerprint of the data block is stored at the same time. Therefore, determining whether the similar fingerprint of the data block is stored locally determines whether a reference data block similar to the data block is stored locally.
T5: The receiving-end accelerator determines that the data block is a second-class data block.
After T5 is performed, the procedure ends.
It should be noted that, because different data blocks may have the same similar fingerprint, multiple data blocks with the same similar fingerprint may be cached in the receiving-end accelerator. In that case, in T5, the receiving-end accelerator may use one of these data blocks as the reference data block. Alternatively, when caching data blocks, the receiving-end accelerator may cache only one of the multiple data blocks that have the same similar fingerprint, which avoids having multiple data blocks with the same similar fingerprint cached in the receiving-end accelerator.
It should further be noted that, after determining that a data block is a second-class data block, the receiving-end accelerator may also obtain the fingerprint of the reference data block similar to the data block, in preparation for performing S106.
T6: The receiving-end accelerator determines that the data block is a third-class data block.
After T6 is performed, the procedure ends.
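The decision flow of T1 to T6 can be sketched as below. The two dictionaries stand in for whatever local index the receiving-end accelerator keeps; their names and layout (`fp_index`, `sfp_index`) and the `Category` enumeration are assumptions made for illustration only.

```python
from enum import Enum

class Category(Enum):
    FIRST = 1    # the data block itself is stored locally
    SECOND = 2   # only a similar reference data block is stored locally
    THIRD = 3    # neither is stored locally

def classify(fp, similar_fp, fp_index, sfp_index):
    """Return (category, fingerprint of the reference data block or None).

    fp_index:  fingerprint -> stored data block (hypothetical local index)
    sfp_index: similar fingerprint -> fingerprint of one stored data block
    """
    if fp in fp_index:                    # T2 -> T3
        return Category.FIRST, None
    ref_fp = sfp_index.get(similar_fp)    # T4
    if ref_fp is not None:                # T5
        return Category.SECOND, ref_fp
    return Category.THIRD, None           # T6

fp_index = {"fp-a": b"block a"}
sfp_index = {"sfp-a": "fp-a"}
result_first = classify("fp-a", "sfp-a", fp_index, sfp_index)
result_second = classify("fp-b", "sfp-a", fp_index, sfp_index)
result_third = classify("fp-c", "sfp-c", fp_index, sfp_index)
```

Keeping a single entry per similar fingerprint in `sfp_index` corresponds to the option of caching only one data block for each similar fingerprint, and returning the reference fingerprint alongside the category prepares the feedback sent in S106.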
S106: The receiving-end accelerator feeds back to the transmitting-end accelerator the category of each data block in the unit length, and the fingerprint of the reference data block similar to each second-class data block in the unit length.
For example, the receiving-end accelerator builds a second transmission list from the category of each data block in the unit length and sends the second transmission list to the transmitting-end accelerator. The second transmission list indicates the category of each data block in the unit length and the fingerprint of each reference data block.
Fig. 7 is a schematic diagram of a second transmission list provided by an embodiment of the present application. The second transmission list may include header information, the category identifier of each data block in the first sequence, and the fingerprint of the reference data block similar to each second-class data block in the first sequence. The header information records summary information about the content carried in the second transmission list and may include, for example but not limited to, the total length of the category identifiers of the data blocks in the unit length, and the number and start position of the fingerprints of the reference data blocks of the second-class data blocks. The start position of the fingerprints of the reference data blocks may be information indicating which bit of the second transmission list is the first bit occupied by the fingerprint of the reference data block of the first second-class data block. Fig. 7 is described using an example in which the unit length includes 100 second-class data blocks and the reference data blocks similar to these 100 second-class data blocks are labelled reference data block 1 to reference data block 100. In addition, the category identifier of a first-class data block may be the binary number "11", the category identifier of a second-class data block may be the binary number "10", and the category identifier of a third-class data block may be the binary number "00"; of course, the present application is not limited thereto.
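As a small illustration of how 2-bit category identifiers could be carried compactly, the sketch below packs four identifiers into each byte, most significant bits first. The packing layout and the function names are assumptions; the text above only specifies the identifier values "11", "10" and "00".

```python
FIRST, SECOND, THIRD = 0b11, 0b10, 0b00   # identifier values from the text

def pack_categories(categories):
    """Pack 2-bit category identifiers, four per byte, first identifier in the high bits."""
    out = bytearray((len(categories) + 3) // 4)
    for i, cat in enumerate(categories):
        out[i // 4] |= cat << (6 - 2 * (i % 4))
    return bytes(out)

def unpack_categories(buf, count):
    """Recover `count` identifiers from a buffer produced by pack_categories."""
    return [(buf[i // 4] >> (6 - 2 * (i % 4))) & 0b11 for i in range(count)]

cats = [FIRST, SECOND, THIRD, FIRST, SECOND]
packed = pack_categories(cats)
```

With this layout, the 512 identifiers of one unit length occupy 128 bytes. The count must travel separately (for example in the header information), because a trailing "00" padding pair is indistinguishable from a third-class identifier.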
S107: After receiving the category of each data block in the unit length and the fingerprint of the reference data block similar to each second-class data block in the unit length, both fed back by the receiving-end accelerator, the transmitting-end accelerator transmits data to the receiving-end accelerator according to the following strategies 1 to 3:
Strategy 1: For a first-class data block, no data is transmitted.
Strategy 2: For a second-class data block, the transmitting-end accelerator determines whether the reference data block similar to the second-class data block can be found locally. If it can be found locally, the transmitting-end accelerator performs differential compression on the second-class data block and the reference data block similar to it, and sends the data obtained after differential compression to the receiving-end accelerator. If it cannot be found locally, the transmitting-end accelerator compresses the second-class data block and sends the information obtained after compression to the receiving-end accelerator. Differential compression can be understood as first calculating the difference data between the second-class data block and the reference data block similar to it, and then compressing the difference data.
Strategy 3: For a third-class data block, the transmitting-end accelerator compresses the third-class data block and sends the information obtained after compression.
Regarding strategy 1, because the first-class data block is already cached in the receiving-end accelerator, the transmitting-end accelerator need not send the data block to the receiving-end accelerator.
Regarding strategy 2, because a reference data block similar to the second-class data block is cached in the receiving-end accelerator, the transmitting-end accelerator only needs to send the difference data between the second-class data block and the reference data block to the receiving-end accelerator, and the receiving-end accelerator can recover the second-class data block from the difference data and the reference data block. If the reference data block is not stored in the transmitting-end accelerator, the step of calculating the difference data cannot be performed; in this case, the transmitting-end accelerator needs to send the second-class data block itself to the receiving-end accelerator.
Regarding strategy 3, because the receiving-end accelerator has stored neither the third-class data block nor a reference data block similar to it, the transmitting-end accelerator needs to send the third-class data block to the receiving-end accelerator.
In addition, the differential compression or compression step performed by the transmitting-end accelerator in strategies 2 and 3 can further save bandwidth resources, thereby reducing the information transmission time, that is, increasing the information transmission rate. The present application does not limit the algorithms used for differential compression and compression. The differential compression algorithm may be, for example but not limited to, any of the following: X-delta, LZ-delta, and so on. The compression algorithm may be, for example but not limited to, any of the following: Gzip, LZ4, bzip, 7zip, and so on.
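Strategies 1 to 3 can be sketched as follows. As a stand-in for a dedicated differential compression algorithm such as those named above, this sketch uses DEFLATE from the Python standard library with the reference data block supplied as a preset dictionary (`zdict`); this is an assumed substitute, not the algorithm of the present application. The function names, the category strings and the `local_blocks` mapping are likewise hypothetical.

```python
import random
import zlib

def diff_compress(block: bytes, reference: bytes) -> bytes:
    """Compress `block` relative to `reference` (preset-dictionary stand-in)."""
    comp = zlib.compressobj(zdict=reference)
    return comp.compress(block) + comp.flush()

def diff_decompress(payload: bytes, reference: bytes) -> bytes:
    decomp = zlib.decompressobj(zdict=reference)
    return decomp.decompress(payload) + decomp.flush()

def encode_block(category, block, ref_fp, local_blocks):
    """Return (kind, payload) for one data block.

    local_blocks: fingerprint -> data block held by the transmitting end.
    """
    if category == "first":                      # strategy 1: send nothing
        return "skip", b""
    if category == "second" and ref_fp in local_blocks:
        # strategy 2, reference data block found locally
        return "delta", diff_compress(block, local_blocks[ref_fp])
    # strategy 2 without a local reference, or strategy 3
    return "plain", zlib.compress(block)

rng = random.Random(1)
reference = bytes(rng.getrandbits(8) for _ in range(8192))
block = reference[:4000] + b"changed bytes" + reference[4013:]
kind, payload = encode_block("second", block, "ref-fp", {"ref-fp": reference})
fallback_kind, fallback_payload = encode_block("second", block, "ref-fp", {})
```

For a data block that differs from its reference data block in only a few bytes, the `delta` payload is far smaller than the `plain` payload, which is the bandwidth saving that strategy 2 aims for. A preset dictionary only helps within the DEFLATE window of 32 KB, which is sufficient for data blocks of about 8 KB.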
For example, when performing S107, after receiving the category of each data block in the unit length and the fingerprint of the reference data block similar to each second-class data block in the unit length, both fed back by the receiving-end accelerator, the transmitting-end accelerator may build a third transmission list according to the above strategies and send the third transmission list to the receiving-end accelerator. The third transmission list indicates each piece of information obtained after differential compression (hereinafter referred to as differential compression data) and each piece of information obtained after compression (hereinafter referred to as compressed data).
Fig. 8 is a schematic diagram of a third transmission list provided by an embodiment of the present application. The third transmission list may include header information, the category identifier of each data block in the first sequence, each piece of differential compression data and each piece of compressed data. The header information records summary information about the content carried in the third transmission list and includes, for example but not limited to, the total length of the category identifiers of the data blocks in the unit length, the number and start position of the pieces of differential compression data, and the number and start position of the pieces of compressed data. The start position of the differential compression data (or the compressed data) may be information indicating which bit of the third transmission list is the first bit occupied by the first piece of differential compression data (or compressed data). Fig. 8 is described using an example in which 90 second-class data blocks have reference data blocks that can be found in the transmitting-end accelerator, and differential compression data 1 to differential compression data 90 are obtained after differential compression. In addition, the category identifier of a first-class data block may be the binary number "11"; the category identifier of a second-class data block whose reference data block can be found in the transmitting-end accelerator may be the binary number "10"; the category identifier of a second-class data block whose reference data block cannot be found in the transmitting-end accelerator may be the binary number "01"; and the category identifier of a third-class data block may be the binary number "00". It should be understood that, in this example, a second-class data block whose reference data block can be found in the transmitting-end accelerator and a second-class data block determined by the receiving-end accelerator in S105 are both marked with the same binary number "10"; in an actual implementation, the two may also be marked differently.
S108: After receiving the above information, the receiving-end accelerator performs differential decompression and/or decompression according to the following strategies 4 and 5:
Strategy 4: For a piece of differential compression data, determine the fingerprint of the second-class data block corresponding to the differential compression data. For example, the receiving-end accelerator may identify a piece of differential compression data according to the header information of the third transmission list and determine the fingerprint of the second-class data block corresponding to it. Then, the receiving-end accelerator obtains the reference data block similar to that second-class data block and performs differential decompression on the reference data block and the differential compression data to obtain the second-class data block.
Strategy 5: For a piece of compressed data, decompress the compressed data to obtain a data block. It should be understood that, based on strategies 2 and 3 above, the data block may be a second-class data block or a third-class data block.
S109: The receiving-end accelerator assembles the data blocks obtained in S108 and the first-class data blocks in the unit length to recover the data of the unit length. It should be understood that, after assembling the data of multiple unit lengths, the receiving-end accelerator can recover the data to be transmitted.
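The receiving side of S108 and S109 can be sketched as below. The entry format `(kind, payload, key)` is a hypothetical stand-in for what the receiving end derives from the third transmission list together with its own classification in S105, and DEFLATE with a preset dictionary again stands in for the differential decompression algorithm.

```python
import zlib

def restore_unit(entries, store):
    """Rebuild the data of one unit length.

    entries: (kind, payload, key) tuples in first-sequence order, where
      kind "first": key is the fingerprint of a locally stored data block,
      kind "delta": key is the fingerprint of the reference data block,
      kind "plain": payload is an ordinary compressed data block.
    store: fingerprint -> data block held by the receiving end (hypothetical).
    """
    blocks = []
    for kind, payload, key in entries:
        if kind == "first":
            blocks.append(store[key])                    # already stored locally
        elif kind == "delta":                            # strategy 4
            decomp = zlib.decompressobj(zdict=store[key])
            blocks.append(decomp.decompress(payload) + decomp.flush())
        else:                                            # strategy 5
            blocks.append(zlib.decompress(payload))
    return b"".join(blocks)                              # S109: assemble in order

store = {"fp-1": b"A" * 100, "fp-ref": b"the quick brown fox jumps over the dog"}
second = b"the quick brown cat jumps over the dog"
comp = zlib.compressobj(zdict=store["fp-ref"])
delta = comp.compress(second) + comp.flush()
third = b"completely new content"
entries = [
    ("first", b"", "fp-1"),
    ("delta", delta, "fp-ref"),
    ("plain", zlib.compress(third), None),
]
unit = restore_unit(entries, store)
```

Assembly relies on the entries being processed in the first sequence, so that each recovered data block is placed at the position it occupied before chunking.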
S110: The receiving-end accelerator sends the data to be transmitted to the receiving-end device.
In the data processing method provided by the embodiments of the present application, the transmitting-end accelerator exchanges information with the receiving-end accelerator to determine that the receiving-end accelerator stores a reference data block similar to one or more data blocks in the data to be transmitted, and then sends the difference data between the data block and the reference data block to the receiving-end accelerator. Compared with transmitting the entire data block, this saves bandwidth resources, thereby reducing the information transmission time, that is, increasing the information transmission rate.
With reference to Fig. 1, the transmitting-end accelerator 2 may include an accelerator agent 21, a first-level cache 22, a second-level cache 23 and an interface 24; the receiving-end accelerator 3 may include an accelerator agent 31, a first-level cache 32, a second-level cache 33 and an interface 34, as shown in Fig. 9. For the connections between these components, refer to Fig. 9.
For any accelerator, the accelerator agent it includes is the control centre of the accelerator. For example, with reference to Fig. 2, the splitting step and the variable-length chunking step in S102 and the calculation step in S103, which are performed by the transmitting-end accelerator, may specifically be performed by the accelerator agent 21 in the transmitting-end accelerator. The determination step in S105 and the differential decompression and decompression steps in S108, which are performed by the receiving-end accelerator, may specifically be performed by the accelerator agent 31 in the receiving-end accelerator.
For any accelerator, the interface is the interface for communicating with the WAN. The interface may be based on a proxy protocol and may therefore be referred to as a proxy interface. For example, when one accelerator sends information to another accelerator (as in S104, S106, S107 and so on above), the accelerator agent of the one accelerator may specifically send the information to the other accelerator through the interface of the one accelerator. When one accelerator receives information sent by another accelerator, the accelerator agent of the one accelerator may specifically receive the information through the interface of the one accelerator.
For any accelerator, the first-level cache is a non-persistent medium, such as a cache memory (cache), and the second-level cache is a persistent medium, such as a disk. The first-level cache is used to cache some or all of the data blocks stored in the second-level cache, together with the fingerprint and the similar fingerprint of each of those data blocks. In an optional implementation, for ease of management, the first-level cache may include a data cache, a fingerprint cache and a similar fingerprint cache, as shown in Fig. 9. The data cache is used to cache some or all of the data blocks stored in the second-level cache, the fingerprint cache is used to cache the fingerprint of each of those data blocks, and the similar fingerprint cache is used to cache the similar fingerprint of each of those data blocks.
For any accelerator, the capacity of the second-level cache is generally greater than the capacity of the first-level cache. For example, the capacity of the second-level cache is 20 GB and the capacity of the first-level cache is 30 KB. Providing a first-level cache in an accelerator improves the efficiency of information lookup and thus improves cache performance. Providing a second-level cache in an accelerator increases the cache capacity, which raises the hit rate of information lookup and in turn saves bandwidth resources and reduces the information transmission time.
In some embodiments of the present application, when an accelerator (which may be the transmitting-end accelerator or the receiving-end accelerator) looks up whether a data block is stored in the accelerator (for example, in T2 or S107 above), it first searches the first-level cache of the accelerator for the fingerprint of the data block; if the fingerprint of the data block is not found in the first-level cache, it searches the second-level cache of the accelerator for the data block. The data block may be any data block in the data to be transmitted, or a reference data block similar to any second-class data block in the data to be transmitted. Because the data blocks in the first-level cache have a higher probability of being hit, that is, the data block can usually be found in the first-level cache, information lookup efficiency can be improved.
In some embodiments of the present application, when the receiving-end accelerator looks up, according to the similar fingerprint of a data block, whether a reference data block similar to the data block is stored in the receiving-end accelerator, it may first check whether the similar fingerprint is stored in the first-level cache; if it is found, the data block is determined to be a second-class data block. If the similar fingerprint is not found in the first-level cache, one of, for example but not limited to, the following two implementations may be performed:
One implementation may be: if the similar fingerprint is not found in the first-level cache, continue to check whether the similar fingerprint is stored in the second-level cache. If it is found, the data block is determined to be a second-class data block; if it is not found, the receiving-end accelerator feeds back to the transmitting-end accelerator information indicating that no reference data block similar to the data block is stored in the receiving-end accelerator. Because the data blocks in the first-level cache have a higher probability of being hit, that is, the reference data block can usually be found in the first-level cache, information lookup efficiency can be improved.
Another implementation may be: if the similar fingerprint is not found in the first-level cache, directly feed back to the transmitting-end accelerator information indicating that no reference data block similar to the data block is stored in the receiving-end accelerator. It should be noted that, on the one hand, when many similar fingerprints are stored in the second-level cache, the receiving-end accelerator usually takes a long time to look up one similar fingerprint in the second-level cache; on the other hand, according to the analysis above, if the similar fingerprint cannot be found in the first-level cache, the probability of finding it in the second-level cache can be considered small. Therefore, in this case, the receiving-end accelerator may directly feed back to the transmitting-end accelerator information indicating that no reference data block similar to the data block is stored in the receiving-end accelerator, thereby saving the time spent looking up the similar fingerprint.
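The two implementations differ only in whether the second-level cache is consulted after a miss in the first-level cache. A minimal sketch, assuming both cache levels can be modelled as dictionaries keyed by similar fingerprint and using a hypothetical `search_level2` flag to select the implementation:

```python
def find_reference(similar_fp, level1, level2, search_level2=True):
    """Return the fingerprint of a similar reference data block, or None.

    level1, level2: similar fingerprint -> reference fingerprint (assumed layout).
    search_level2=True models the first implementation; False models the second,
    which reports "not stored" directly after a first-level miss.
    """
    if similar_fp in level1:
        return level1[similar_fp]
    if search_level2:
        return level2.get(similar_fp)
    return None

level1 = {"sfp-hot": "fp-1"}
level2 = {"sfp-hot": "fp-1", "sfp-cold": "fp-2"}
hit_l1 = find_reference("sfp-hot", level1, level2)
hit_l2 = find_reference("sfp-cold", level1, level2)
skipped = find_reference("sfp-cold", level1, level2, search_level2=False)
```

The second implementation trades a possible missed match (`sfp-cold` is reported as not stored even though the second-level cache holds it) for a shorter lookup time, which matches the reasoning given above.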
The information stored in the first-level cache and the second-level cache can be updated. For the first-level cache, the cached data blocks may be recently accessed data blocks and/or the data blocks before and after recently accessed data blocks. For the second-level cache, the cached data blocks may be data blocks whose access count is greater than or equal to a threshold and/or recently accessed data blocks. For any cache level, when a data block is cached in that cache, the fingerprint and the similar fingerprint of the data block are cached with it; when a data block is deleted from that cache, the fingerprint and the similar fingerprint of the data block are deleted with it.
For example, the accelerator agent in the sending-end accelerator receives the data to be transmitted from the sending-end device and calculates the fingerprint and similar fingerprint of each data block in one unit length of data. It then caches the data blocks into which that unit length of data is divided, together with the fingerprint and similar fingerprint of each data block, in the first-level cache and the second-level cache. If the free space of a cache level (for example, the first-level cache or the second-level cache) is not enough to hold these data blocks and their fingerprints and similar fingerprints, the data blocks that were stored into that cache level earliest and/or accessed earliest are deleted from it, together with their fingerprints and similar fingerprints. If the free space of that cache level is sufficient, the data blocks and their fingerprints and similar fingerprints are added to that cache level directly.
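The update rule for a single cache level can be illustrated with a minimal Python sketch. It is not taken from the application: the names CachedBlock and CacheLevel, the choice to count capacity in bytes of block data, and the single ordering that treats "stored earliest" and "accessed earliest" alike are all assumptions made for illustration.

```python
from collections import OrderedDict
from dataclasses import dataclass


@dataclass
class CachedBlock:
    data: bytes
    fingerprint: str          # exact fingerprint, used here as the cache key
    similar_fingerprint: str  # cached and evicted together with the block


class CacheLevel:
    """One cache level; capacity is counted in bytes of block data (an assumption)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.used = 0
        # Oldest entry first: insertion order, refreshed on every access.
        self.entries: "OrderedDict[str, CachedBlock]" = OrderedDict()

    def get(self, fingerprint: str):
        """Return the cached block or None, marking a hit as recently accessed."""
        block = self.entries.get(fingerprint)
        if block is not None:
            self.entries.move_to_end(fingerprint)
        return block

    def add_unit(self, unit: list) -> None:
        """Cache every block of one unit length, evicting the oldest entries if needed."""
        # Skip blocks that are already cached and collapse duplicates inside the unit.
        new_blocks = {b.fingerprint: b for b in unit if b.fingerprint not in self.entries}
        needed = sum(len(b.data) for b in new_blocks.values())
        if needed > self.capacity:
            raise ValueError("unit does not fit in this cache level")
        while self.capacity - self.used < needed:
            # Remove the entry that was stored or accessed earliest.
            _, evicted = self.entries.popitem(last=False)
            self.used -= len(evicted.data)
        for fingerprint, block in new_blocks.items():
            self.entries[fingerprint] = block
            self.used += len(block.data)
```

Because each entry carries the block and both fingerprints, removing an entry removes all three at once, which matches the rule that fingerprints are cached and deleted together with their data block.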
As another example, in some embodiments of the present application, for either accelerator, the second-level cache includes one or more containers. Each container includes at least two data blocks and the fingerprint and similar fingerprint of each of those data blocks. If the accelerator finds a data block in the second-level cache, it caches the container holding that data block into the first-level cache. The data block may be any data block in the data to be transmitted, or a reference data block similar to any second-class data block in the data to be transmitted. As shown in Figure 10, assume that the second-level cache includes multiple containers (containers 1 to 3 are shown in Figure 10). If, during a data block lookup (for example, T2 or S107 above), the data block is found in container 1 in the second-level cache, container 1 is cached into the first-level cache. Figure 10 is drawn based on Figure 9; the first-level cache and second-level cache in Figure 10 may belong to the sending-end accelerator or to the receiving-end accelerator. Note that the contents of several consecutive data blocks are correlated, so when one data block is hit, the data blocks before and after it are considered more likely to be hit in the future. For this reason, the embodiment introduces the concept of a "container": at least two data blocks are treated as a set, and if one data block in the set is hit, the other data blocks in the set are considered more likely to be hit in the future. This can improve the subsequent hit rate.
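The container behaviour described above can be sketched in Python as follows. This is only an illustration under stated assumptions: Container, TwoLevelCache, and the in-memory dictionaries standing in for the persistent and non-persistent media are hypothetical, and similar fingerprints are omitted to keep the sketch short.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Container:
    """A set of at least two correlated blocks, keyed by block fingerprint."""
    container_id: int
    blocks: dict = field(default_factory=dict)  # fingerprint -> block data


class TwoLevelCache:
    def __init__(self, containers: list) -> None:
        # Second level: every container (stands in for the persistent medium).
        self.level2 = {c.container_id: c for c in containers}
        # Index from block fingerprint to the container holding that block.
        self.index = {fp: c.container_id for c in containers for fp in c.blocks}
        # First level: fingerprint -> block data (stands in for the non-persistent medium).
        self.level1: dict = {}

    def find(self, fingerprint: str) -> Optional[bytes]:
        """Look in the first level, then the second; promote the container on a hit."""
        if fingerprint in self.level1:
            return self.level1[fingerprint]
        container_id = self.index.get(fingerprint)
        if container_id is None:
            return None
        container = self.level2[container_id]
        # A hit on one block pulls its whole container into the first level.
        self.level1.update(container.blocks)
        return container.blocks[fingerprint]
```

After a second-level hit on one block, a later lookup for a neighbouring block in the same container is served from the first level, which is the hit-rate benefit the paragraph describes.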
The above mainly describes the solutions provided by the embodiments of the present application from the perspective of the method. To implement the above functions, the device includes corresponding hardware structures and/or software modules for performing each function. Those skilled in the art should readily appreciate that, in combination with the units and algorithm steps of the examples described in the embodiments disclosed herein, the present application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of the present application.
In the embodiments of the present application, the data processing device (which may be the sending-end accelerator or the receiving-end accelerator described above) may be divided into functional modules according to the above method examples. For example, each functional module may correspond to one function, or two or more functions may be integrated into one processing module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. Note that the division of modules in the embodiments of the present application is schematic and is only a division by logical function; other divisions are possible in actual implementation.
As shown in Figure 11, a data processing device 11 is provided by an embodiment of the present application. The data processing device 11 may be the sending-end accelerator described above, the sending-end device, or the sending-end device together with the sending-end accelerator. The data processing device 11 shown in Figure 11 may include a computing unit 1101, a sending unit 1102, a receiving unit 1103, and a searching unit 1104. The computing unit 1101 is configured to calculate the similar fingerprints of the data to be transmitted, where the similar fingerprints of the data to be transmitted include the similar fingerprint of a first data block, and the first data block is one data block in the data to be transmitted. The sending unit 1102 is configured to send the similar fingerprints of the data to be transmitted to a second device, where these similar fingerprints are used to look up whether the second device stores a reference data block similar to the data to be transmitted. The receiving unit 1103 is configured to receive the fingerprint of the reference data block sent by the second device, where the fingerprint of the reference data block includes the fingerprint of a first reference data block, and the similar fingerprint of the first reference data block is the same as that of the first data block. The searching unit 1104 is configured to find the first reference data block in the data processing device 11 according to the fingerprint of the first reference data block. The sending unit 1102 is further configured to send data to the second device based on the fingerprint of the reference data block, where the data includes the difference data between the first reference data block and the first data block. For example, with reference to Figure 2, the data processing device 11 may specifically be the sending-end accelerator, and the second device may specifically be the receiving-end accelerator. The first data block may be the second-class data block described above. The computing unit 1101 may be used to perform the step of calculating similar fingerprints in S103. The sending unit 1102 may be used to perform the step of sending similar fingerprints in S104 and the step of sending data in S107. The receiving unit 1103 may be used to perform the step of receiving the fingerprint of the reference data block in S106.
In one possible design, the computing unit 1101 may be specifically configured to perform a hash operation on the data blocks of the data to be transmitted using a locality-sensitive hashing algorithm, to obtain the similar fingerprints of the data blocks.
In one possible design, the computing unit 1101 may be specifically configured to split the data to be transmitted into data blocks and, for each data block, perform the following operations: extract at least one sub-block from the data block; perform a hash operation on the at least one sub-block with each of m hash algorithms to obtain m hash sequences, where performing the hash operation on the at least one sub-block with one hash algorithm yields one hash sequence, and m is an integer greater than or equal to 2; and merge the maximum values of the m hash sequences and use the hash sequence obtained after merging as the similar fingerprint of the data block, or alternatively merge the minimum values of the m hash sequences and use the hash sequence obtained after merging as the similar fingerprint of the data block. For example, the computing unit 1101 may be used to perform each step of the process shown in Figure 4.
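The per-block operations above can be sketched in Python as follows. This is a hedged illustration, not the application's implementation: the sliding-window sub-block extraction, the window size of 8 bytes, the default m = 4, and the use of keyed BLAKE2b as a stand-in for m distinct hash algorithms are all assumptions.

```python
import hashlib


def sub_blocks(block: bytes, size: int = 8) -> list:
    """Extract overlapping sub-blocks with a sliding window (window size is an assumption)."""
    if len(block) <= size:
        return [block]
    return [block[i:i + size] for i in range(len(block) - size + 1)]


def similar_fingerprint(block: bytes, m: int = 4, use_max: bool = True) -> bytes:
    """Merge the maximum (or minimum) of each of m hash sequences into one fingerprint."""
    if m < 2:
        raise ValueError("m must be an integer greater than or equal to 2")
    pick = max if use_max else min
    parts = []
    for k in range(m):
        # Hash algorithm number k: BLAKE2b with a different key per k.
        key = k.to_bytes(2, "big")
        sequence = [
            hashlib.blake2b(s, digest_size=8, key=key).digest()
            for s in sub_blocks(block)
        ]
        parts.append(pick(sequence))
    # Concatenating the m selected values is one way to "merge" them.
    return b"".join(parts)
```

Two blocks that contain the same set of sub-blocks yield the same fingerprint, and a small edit only changes the fingerprint if it touches a sub-block that happened to be a maximum (or minimum), which is why such a fingerprint can serve as a similarity key.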
In one possible design, the data processing device 11 may further include a differential compression unit 1105, configured to perform differential compression on the first reference data block and the first data block using a differential compression algorithm. For example, the differential compression unit 1105 may be used to perform the differential compression step in S107.
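The application does not name a particular differential compression algorithm. As one possible stand-in, the following Python sketch uses zlib with the reference data block supplied as a preset dictionary, so that content shared with the reference is encoded as back-references. The function names and the demo data are hypothetical.

```python
import hashlib
import zlib


def delta_compress(reference: bytes, block: bytes) -> bytes:
    """Compress block with the reference block as a preset dictionary."""
    compressor = zlib.compressobj(level=9, zdict=reference)
    return compressor.compress(block) + compressor.flush()


def delta_decompress(reference: bytes, delta: bytes) -> bytes:
    """Rebuild the block from the delta; the receiver must hold the same reference."""
    decompressor = zlib.decompressobj(zdict=reference)
    return decompressor.decompress(delta) + decompressor.flush()


def pseudo_random_bytes(n: int) -> bytes:
    """Deterministic, hard-to-compress filler used only for the demo below."""
    chunks = (hashlib.sha256(i.to_bytes(4, "big")).digest() for i in range(n // 32 + 1))
    return b"".join(chunks)[:n]


# Demo: a block that differs from the reference by a small insertion.
reference = pseudo_random_bytes(4096)
block = reference[:2000] + b"CHANGED" + reference[2000:]
delta = delta_compress(reference, block)
restored = delta_decompress(reference, delta)
```

Only the delta needs to cross the network, because the receiving end already stores the reference block and can rebuild the original block from it. Note that the deflate window limits how much of the reference can be matched (32 KiB), which is acceptable for a sketch but may not suit large blocks.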
In one possible design, the similar fingerprints of the data to be transmitted further include the similar fingerprint of a second data block, which is another data block in the data to be transmitted. The fingerprint of the reference data block does not include the fingerprint of a second reference data block, where the similar fingerprint of the second reference data block is the same as that of the second data block. The data further includes the second data block. The second data block may be the third-class data block described above.
In one possible design, the data processing device 11 further includes a first-level cache and a second-level cache. The first-level cache is a non-persistent medium and the second-level cache is a persistent medium. The first-level cache is used to cache some or all of the data blocks stored in the second-level cache, together with the fingerprints and similar fingerprints of those data blocks. In this case, the searching unit 1104 may be specifically configured to look up the fingerprint of the first reference data block in the first-level cache and, if it is not found there, look it up in the second-level cache.
In one possible design, the second-level cache includes one or more containers. Each container is a set formed by at least two data blocks and the fingerprint and similar fingerprint of each of those data blocks, and the contents of the at least two data blocks in each container are correlated. In this case, the searching unit 1104 may be further configured to, if a data block is found in the second-level cache, cache the container holding that data block into the first-level cache. For example, the searching unit 1104 may be used to perform each step in the scenario shown in Figure 10.
In one possible design, the sending unit 1102 and the receiving unit 1103 may correspond to the interface 24 in Figure 9. Some or all of the computing unit 1101, the searching unit 1104, and the differential compression unit 1105 may correspond to the accelerator agent 21 in Figure 9.
As shown in Figure 12, a data processing device 12 is provided by an embodiment of the present application. The data processing device 12 may be the receiving-end accelerator described above, the receiving-end device, or the receiving-end accelerator together with the receiving-end device. The data processing device 12 shown in Figure 12 may include a receiving unit 1201, a searching unit 1202, and a sending unit 1203. The receiving unit 1201 is configured to receive the similar fingerprints of the data to be transmitted sent by a first device, where the similar fingerprints of the data to be transmitted include the similar fingerprint of a first data block, and the first data block is one data block in the data to be transmitted. The searching unit 1202 is configured to find, according to the similar fingerprints of the data to be transmitted, a reference data block that is stored in the data processing device 12 and is similar to the data to be transmitted, where the reference data block includes a first reference data block whose similar fingerprint is the same as that of the first data block. The sending unit 1203 is configured to send the fingerprint of the reference data block to the first device, where the fingerprint of the reference data block includes the fingerprint of the first reference data block; the fingerprint of the reference data block is used by the first device to send data to the data processing device 12, and the data includes the difference data between the first reference data block and the first data block. The receiving unit 1201 is further configured to receive the data sent by the first device. For example, with reference to Figure 2, the data processing device 12 may specifically be the receiving-end accelerator, and the first device may specifically be the sending-end accelerator. The first data block may be the second-class data block described above. The receiving unit 1201 may be used to perform the step of receiving similar fingerprints in S104. The sending unit 1203 may be used to perform the step of sending the fingerprint of the reference data block in S106.
In one possible design, the receiving unit 1201 may be further configured to receive the fingerprints of the data to be transmitted sent by the first device, where the fingerprints of the data to be transmitted include the fingerprint of the first data block. In this case, the searching unit 1202 may be further configured to, when it determines from the fingerprint of the first data block that the first data block is not stored in the data processing device 12, look up according to the similar fingerprint of the first data block whether the first reference data block is stored in the data processing device 12. With reference to Figure 6, the receiving unit 1201 may be used to perform T2, T4, and so on.
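The two-step lookup on the receiving end (exact fingerprint first, similar fingerprint second) can be sketched in Python as follows. ReceiverStore and the result labels "duplicate", "similar", and "new" are hypothetical names chosen for illustration. "similar" corresponds to the second-class data block and "new" to the third-class data block mentioned in this section; treating an exact match as a separate "duplicate" case is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass
class ReceiverStore:
    # Exact fingerprint -> stored block data.
    by_fingerprint: dict = field(default_factory=dict)
    # Similar fingerprint -> exact fingerprint of one stored block that has it.
    by_similar: dict = field(default_factory=dict)

    def add(self, fingerprint: str, similar_fingerprint: str, data: bytes) -> None:
        """Store a block together with both of its fingerprints."""
        self.by_fingerprint[fingerprint] = data
        self.by_similar.setdefault(similar_fingerprint, fingerprint)

    def lookup(self, fingerprint: str, similar_fingerprint: str) -> Tuple[str, Optional[str]]:
        """Classify an incoming block from its fingerprints alone."""
        if fingerprint in self.by_fingerprint:
            # The block itself is already stored; no block data needs to be sent.
            return ("duplicate", fingerprint)
        reference = self.by_similar.get(similar_fingerprint)
        if reference is not None:
            # A similar block is stored; return its fingerprint so the sender
            # can transmit only the difference data against it.
            return ("similar", reference)
        # Nothing usable is stored; the sender has to transmit the block itself.
        return ("new", None)
```

The similar-fingerprint lookup only runs when the exact lookup misses, which mirrors the order described in the paragraph above.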
In one possible design, the similar fingerprints of the data to be transmitted include the similar fingerprint of a second data block, which is another data block of the data to be transmitted. The fingerprint of the reference data block does not include the fingerprint of a second reference data block, where the similar fingerprint of the second reference data block is the same as that of the second data block. The data further includes the second data block. The second data block may be the third-class data block described above.
In one possible design, the receiving unit 1201 and the sending unit 1203 may correspond to the interface 34 in Figure 9. The searching unit 1202 may correspond to the accelerator agent 31 in Figure 9.
Because the data processing device provided by the embodiments of the present application can be used to perform the above data processing method, for the technical effects it can achieve, refer to the above method embodiments; details are not repeated here.
The steps of the methods or algorithms described in connection with the present disclosure may be implemented in hardware, or may be implemented by a processing module executing software instructions. The software instructions may consist of corresponding software modules, and the software modules may be stored in random access memory (RAM), flash memory, read-only memory (ROM), erasable programmable read-only memory (erasable programmable ROM, EPROM), electrically erasable programmable read-only memory (electrically EPROM, EEPROM), a register, a hard disk, a removable hard disk, a compact disc read-only memory (CD-ROM), or any other form of storage medium well known in the art. An exemplary storage medium is coupled to the processor so that the processor can read information from, and write information to, the storage medium. The storage medium may also be an integral part of the processor. The processor and the storage medium may be located in an ASIC.
Those skilled in the art will appreciate that, in one or more of the above examples, the functions described in the present application may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, these functions may be stored in a computer-readable medium or transmitted as one or more instructions or code on a computer-readable medium. Computer-readable media include computer storage media and communication media, where communication media include any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.
The specific embodiments above further describe the purpose, technical solutions, and beneficial effects of the present application in detail. It should be understood that the foregoing is merely specific embodiments of the present application and is not intended to limit the protection scope of the present application.
Claims (22)
1. A data processing method, characterized in that the method comprises:
a first device calculates the similar fingerprints of data to be transmitted, wherein the similar fingerprints of the data to be transmitted include the similar fingerprint of a first data block, and the first data block is one data block in the data to be transmitted;
the first device sends the similar fingerprints of the data to be transmitted to a second device, wherein the similar fingerprints of the data to be transmitted are used to look up whether the second device stores a reference data block similar to the data to be transmitted;
the first device receives the fingerprint of the reference data block sent by the second device, wherein the fingerprint of the reference data block includes the fingerprint of a first reference data block, and the similar fingerprint of the first reference data block is the same as the similar fingerprint of the first data block;
the first device finds the first reference data block in the first device according to the fingerprint of the first reference data block;
the first device sends data to the second device based on the fingerprint of the reference data block, wherein the data includes the difference data between the first reference data block and the first data block.
2. The method according to claim 1, characterized in that the first device calculating the similar fingerprints of the data to be transmitted comprises:
the first device performs a hash operation on the data blocks of the data to be transmitted using a locality-sensitive hashing algorithm, to obtain the similar fingerprints of the data blocks.
3. The method according to claim 2, characterized in that the first device performing a hash operation on the data blocks of the data to be transmitted using a locality-sensitive hashing algorithm, to obtain the similar fingerprints of the data blocks, comprises:
the first device splits the data to be transmitted into data blocks; for each data block, the first device performs the following operations:
extracting at least one sub-block from the data block;
performing a hash operation on the at least one sub-block with each of m hash algorithms to obtain m hash sequences, wherein performing the hash operation on the at least one sub-block with one hash algorithm yields one hash sequence, and m is an integer greater than or equal to 2;
merging the maximum values of the m hash sequences and using the hash sequence obtained after merging as the similar fingerprint of the data block; or merging the minimum values of the m hash sequences and using the hash sequence obtained after merging as the similar fingerprint of the data block.
4. The method according to any one of claims 1 to 3, characterized in that the method further comprises:
the first device performs differential compression on the first reference data block and the first data block using a differential compression algorithm.
5. The method according to any one of claims 1 to 4, characterized in that the similar fingerprints of the data to be transmitted further include the similar fingerprint of a second data block, and the second data block is another data block in the data to be transmitted; the fingerprint of the reference data block does not include the fingerprint of a second reference data block; the similar fingerprint of the second reference data block is the same as the similar fingerprint of the second data block; and the data further includes the second data block.
6. The method according to any one of claims 1 to 5, characterized in that the first device includes a first-level cache and a second-level cache, the first-level cache is a non-persistent medium, the second-level cache is a persistent medium, and the first-level cache is used to cache some or all of the data blocks stored in the second-level cache and the fingerprints and similar fingerprints of those data blocks; the method further comprises:
the first device looks up the fingerprint of the first reference data block in the first-level cache; if the fingerprint of the first reference data block is not found in the first-level cache, the first device looks up the fingerprint of the first reference data block in the second-level cache.
7. The method according to claim 6, characterized in that the second-level cache includes one or more containers, each container is a set formed by at least two data blocks and the fingerprint and similar fingerprint of each of the at least two data blocks, and the contents of the at least two data blocks in each container are correlated; the method further comprises:
if the first device finds a data block in the second-level cache, the first device caches the container holding the data block into the first-level cache.
8. A data processing method, characterized in that the method comprises:
a second device receives the similar fingerprints of data to be transmitted sent by a first device, wherein the similar fingerprints of the data to be transmitted include the similar fingerprint of a first data block, and the first data block is one data block in the data to be transmitted;
the second device finds, according to the similar fingerprints of the data to be transmitted, a reference data block that is stored in the second device and is similar to the data to be transmitted, wherein the reference data block includes a first reference data block, and the similar fingerprint of the first reference data block is the same as the similar fingerprint of the first data block;
the second device sends the fingerprint of the reference data block to the first device, wherein the fingerprint of the reference data block includes the fingerprint of the first reference data block; the fingerprint of the reference data block is used by the first device to send data to the second device, and the data includes the difference data between the first reference data block and the first data block;
the second device receives the data sent by the first device.
9. The method according to claim 8, characterized in that the method further comprises:
the second device receives the fingerprints of the data to be transmitted sent by the first device, wherein the fingerprints of the data to be transmitted include the fingerprint of the first data block;
when the second device determines, according to the fingerprint of the first data block, that the first data block is not stored in the second device, the second device looks up, according to the similar fingerprint of the first data block, whether the first reference data block is stored in the second device.
10. The method according to claim 8 or 9, characterized in that the similar fingerprints of the data to be transmitted include the similar fingerprint of a second data block, and the second data block is another data block of the data to be transmitted; the fingerprint of the reference data block does not include the fingerprint of a second reference data block; the similar fingerprint of the second reference data block is the same as the similar fingerprint of the second data block; and the data further includes the second data block.
11. A data processing device, characterized in that the device comprises:
a computing unit, configured to calculate the similar fingerprints of data to be transmitted, wherein the similar fingerprints of the data to be transmitted include the similar fingerprint of a first data block, and the first data block is one data block in the data to be transmitted;
a sending unit, configured to send the similar fingerprints of the data to be transmitted to a second device, wherein the similar fingerprints of the data to be transmitted are used to look up whether the second device stores a reference data block similar to the data to be transmitted;
a receiving unit, configured to receive the fingerprint of the reference data block sent by the second device, wherein the fingerprint of the reference data block includes the fingerprint of a first reference data block, and the similar fingerprint of the first reference data block is the same as the similar fingerprint of the first data block;
a searching unit, configured to find the first reference data block in the device according to the fingerprint of the first reference data block;
the sending unit is further configured to send data to the second device based on the fingerprint of the reference data block, wherein the data includes the difference data between the first reference data block and the first data block.
12. The device according to claim 11, characterized in that the computing unit is specifically configured to perform a hash operation on the data blocks of the data to be transmitted using a locality-sensitive hashing algorithm, to obtain the similar fingerprints of the data blocks.
13. The device according to claim 12, characterized in that the computing unit is specifically configured to:
split the data to be transmitted into data blocks; for each data block, the computing unit performs the following operations:
extracting at least one sub-block from the data block;
performing a hash operation on the at least one sub-block with each of m hash algorithms to obtain m hash sequences, wherein performing the hash operation on the at least one sub-block with one hash algorithm yields one hash sequence, and m is an integer greater than or equal to 2;
merging the maximum values of the m hash sequences and using the hash sequence obtained after merging as the similar fingerprint of the data block; or merging the minimum values of the m hash sequences and using the hash sequence obtained after merging as the similar fingerprint of the data block.
14. The device according to any one of claims 11 to 13, characterized in that the device further comprises:
a differential compression unit, configured to perform differential compression on the first reference data block and the first data block using a differential compression algorithm.
15. The device according to any one of claims 11 to 14, characterized in that the similar fingerprints of the data to be transmitted further include the similar fingerprint of a second data block, and the second data block is another data block in the data to be transmitted; the fingerprint of the reference data block does not include the fingerprint of a second reference data block; the similar fingerprint of the second reference data block is the same as the similar fingerprint of the second data block; and the data further includes the second data block.
16. The device according to any one of claims 11 to 15, characterized in that the device further includes a first-level cache and a second-level cache, the first-level cache is a non-persistent medium, the second-level cache is a persistent medium, and the first-level cache is used to cache some or all of the data blocks stored in the second-level cache and the fingerprints and similar fingerprints of those data blocks;
the searching unit is specifically configured to look up the fingerprint of the first reference data block in the first-level cache and, if the fingerprint of the first reference data block is not found in the first-level cache, look up the fingerprint of the first reference data block in the second-level cache.
17. The device according to claim 16, characterized in that the second-level cache includes one or more containers, each container is a set formed by at least two data blocks and the fingerprint and similar fingerprint of each of the at least two data blocks, and the contents of the at least two data blocks in each container are correlated;
the searching unit is further configured to, if a data block is found in the second-level cache, cache the container holding the data block into the first-level cache.
18. A data processing device, characterized in that the device comprises:
a receiving unit, configured to receive the similar fingerprints of data to be transmitted sent by a first device, wherein the similar fingerprints of the data to be transmitted include the similar fingerprint of a first data block, and the first data block is one data block in the data to be transmitted;
a searching unit, configured to find, according to the similar fingerprints of the data to be transmitted, a reference data block that is stored in the device and is similar to the data to be transmitted, wherein the reference data block includes a first reference data block, and the similar fingerprint of the first reference data block is the same as the similar fingerprint of the first data block;
a sending unit, configured to send the fingerprint of the reference data block to the first device, wherein the fingerprint of the reference data block includes the fingerprint of the first reference data block; the fingerprint of the reference data block is used by the first device to send data to the device, and the data includes the difference data between the first reference data block and the first data block;
the receiving unit is further configured to receive the data sent by the first device.
19. The device according to claim 18, characterized in that
the receiving unit is further configured to receive the fingerprints of the data to be transmitted sent by the first device, wherein the fingerprints of the data to be transmitted include the fingerprint of the first data block;
the searching unit is further configured to, when it determines according to the fingerprint of the first data block that the first data block is not stored in the device, look up according to the similar fingerprint of the first data block whether the first reference data block is stored in the device.
20. The device according to claim 18 or 19, characterized in that the similar fingerprints of the data to be transmitted include the similar fingerprint of a second data block, and the second data block is another data block of the data to be transmitted; the fingerprint of the reference data block does not include the fingerprint of a second reference data block; the similar fingerprint of the second reference data block is the same as the similar fingerprint of the second data block; and the data further includes the second data block.
21. A data processing device, characterized by comprising a memory and a processor, wherein the memory is configured to store a computer program, and when the computer program is executed by the processor, the method according to any one of claims 1 to 10 is performed.
22. A computer-readable storage medium storing a computer program, characterized in that when the computer program runs on a computer, the method according to any one of claims 1 to 10 is performed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711167866.1A CN108134775B (en) | 2017-11-21 | 2017-11-21 | Data processing method and equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711167866.1A CN108134775B (en) | 2017-11-21 | 2017-11-21 | Data processing method and equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108134775A (en) | 2018-06-08 |
CN108134775B (en) | 2020-10-09 |
Family
ID=62388793
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711167866.1A Active CN108134775B (en) | 2017-11-21 | 2017-11-21 | Data processing method and equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108134775B (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109309670A (en) * | 2018-09-07 | 2019-02-05 | 深圳市网心科技有限公司 | Data stream method and system, electronic device and computer readable storage medium |
CN109710502A (en) * | 2018-12-19 | 2019-05-03 | 苏州科达科技股份有限公司 | Log transmission method, apparatus and storage medium |
CN111064471A (en) * | 2018-10-16 | 2020-04-24 | 阿里巴巴集团控股有限公司 | Data processing method and device and electronic equipment |
WO2021012162A1 (en) * | 2019-07-22 | 2021-01-28 | 华为技术有限公司 | Method and apparatus for data compression in storage system, device, and readable storage medium |
CN112416694A (en) * | 2019-08-20 | 2021-02-26 | 中国电信股份有限公司 | Information processing method, system, client and computer readable storage medium |
WO2021121042A1 (en) * | 2019-12-18 | 2021-06-24 | 华为技术有限公司 | Data storage method in storage system and related device |
WO2022001548A1 (en) * | 2020-06-30 | 2022-01-06 | 华为技术有限公司 | Data transmission method, system, apparatus, device, and medium |
CN114662160A (en) * | 2022-05-25 | 2022-06-24 | 成都易我科技开发有限责任公司 | Digital summarization method, system and digital summarization method in network transmission |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101833486A (en) * | 2010-04-07 | 2010-09-15 | 山东高效能服务器和存储研究院 | Method for designing remote backup and recovery system |
CN102185889A (en) * | 2011-03-28 | 2011-09-14 | 北京邮电大学 | Data deduplication method based on internet small computer system interface (iSCSI) |
CN102495894A (en) * | 2011-12-12 | 2012-06-13 | 成都市华为赛门铁克科技有限公司 | Method, device and system for searching repeated data |
CN103020174A (en) * | 2012-11-28 | 2013-04-03 | 华为技术有限公司 | Similarity analysis method, device and system |
CN103858125A (en) * | 2013-12-17 | 2014-06-11 | 华为技术有限公司 | Repeating data processing methods, devices, storage controller and storage node |
Non-Patent Citations (1)
Title |
---|
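Liao Haisheng (廖海生): "Research on a data disaster recovery *** based on data deduplication technology" (基于重复数据删除技术的数据容灾***的研究), China Master's Theses Full-text Database, Information Science and Technology series (《中国优秀硕士学位论文全文数据库 信息科技辑》) * |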
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109309670B (en) * | 2018-09-07 | 2021-02-12 | 深圳市网心科技有限公司 | Data stream decoding method and system, electronic device and computer readable storage medium |
CN109309670A (en) * | 2018-09-07 | 2019-02-05 | 深圳市网心科技有限公司 | Data stream method and system, electronic device and computer readable storage medium |
CN111064471A (en) * | 2018-10-16 | 2020-04-24 | 阿里巴巴集团控股有限公司 | Data processing method and device and electronic equipment |
CN111064471B (en) * | 2018-10-16 | 2023-04-11 | 阿里巴巴集团控股有限公司 | Data processing method and device and electronic equipment |
CN109710502B (en) * | 2018-12-19 | 2022-06-14 | 苏州科达科技股份有限公司 | Log transmission method, device and storage medium |
CN109710502A (en) * | 2018-12-19 | 2019-05-03 | 苏州科达科技股份有限公司 | Log transmission method, apparatus and storage medium |
WO2021012162A1 (en) * | 2019-07-22 | 2021-01-28 | 华为技术有限公司 | Method and apparatus for data compression in storage system, device, and readable storage medium |
CN112416694A (en) * | 2019-08-20 | 2021-02-26 | 中国电信股份有限公司 | Information processing method, system, client and computer readable storage medium |
WO2021121042A1 (en) * | 2019-12-18 | 2021-06-24 | 华为技术有限公司 | Data storage method in storage system and related device |
EP4068071A4 (en) * | 2019-12-18 | 2023-01-25 | Huawei Technologies Co., Ltd. | Data storage method in storage system and related device |
US11755207B2 (en) | 2019-12-18 | 2023-09-12 | Huawei Technologies Co., Ltd. | Data storage method in storage system and related device |
WO2022001548A1 (en) * | 2020-06-30 | 2022-01-06 | 华为技术有限公司 | Data transmission method, system, apparatus, device, and medium |
CN114662160A (en) * | 2022-05-25 | 2022-06-24 | 成都易我科技开发有限责任公司 | Digital summarization method, system and digital summarization method in network transmission |
Also Published As
Publication number | Publication date |
---|---|
CN108134775B (en) | 2020-10-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108134775A (en) | A kind of data processing method and equipment | |
US8924687B1 (en) | Scalable hash tables | |
US6754799B2 (en) | System and method for indexing and retrieving cached objects | |
US8344916B2 (en) | System and method for simplifying transmission in parallel computing system | |
CN110191428B (en) | Data distribution method based on intelligent cloud platform | |
CN108255647B (en) | High-speed data backup method under samba server cluster | |
WO2019024780A1 (en) | Light-weight processing method for blockchain, and blockchain node and storage medium | |
CN103116615B (en) | A kind of data index method and server based on version vector | |
CN109508334B (en) | For the data compression method of block chain database, access method and system | |
CN106407224A (en) | Method and device for file compaction in KV (Key-Value)-Store system | |
CN107046812A (en) | A kind of data save method and device | |
US20120246125A1 (en) | Duplicate file detection device, duplicate file detection method, and computer-readable storage medium | |
CN104584524A (en) | Aggregating data in a mediation system | |
CN109445702A (en) | A kind of piece of grade data deduplication storage | |
US9667737B2 (en) | Publisher-assisted, broker-based caching in a publish-subscription environment | |
CN104618304A (en) | Data processing method and data processing system | |
CN105407096A (en) | Message data detection method based on stream management | |
CN109189759A (en) | Method for reading data, data query method, device and equipment in KV storage system | |
US8868584B2 (en) | Compression pattern matching | |
CN110245129A (en) | Distributed global data deduplication method and device | |
CN103399943A (en) | Communication method and communication device for parallel query of clustered databases | |
CN103609091B (en) | Method and device for data transmission | |
CN114625805B (en) | Return test configuration method, device, equipment and medium | |
CN110489380A (en) | A kind of data processing method, device and equipment | |
CN106202303B (en) | A kind of Chord routing table compression method and optimization file search method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||