CN104182409A - Method and device for optimizing multi-order hash - Google Patents

Publication number
CN104182409A
Authority
CN
China
Prior art keywords
data
colliding
colliding data
successful
addressed location
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310196974.7A
Other languages
Chinese (zh)
Other versions
CN104182409B (en
Inventor
万林佳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Tencent Cloud Computing Beijing Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201310196974.7A
Publication of CN104182409A
Application granted
Publication of CN104182409B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2255 Hash tables

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method and a device for optimizing a multi-order hash. In the method, when new data is inserted into the multi-order hash and conflicting data is stored at every addressing location of the data to be inserted, the conflicting data at all of those addressing locations is probed toward higher orders. If any conflicting data is probed successfully, the conflicting data whose successfully probed location has the lowest order is moved to that location, and the data to be inserted is stored at the original location of that conflicting data. The method and the device improve the fill rate of the multi-order hash.

Description

A method and device for optimizing a multi-order hash
Technical field
The present invention relates to the field of multi-order hashing, and in particular to a method and device for optimizing a multi-order hash.
Background art
A multi-order hash is an excellent data structure: it offers high performance, is reentrant, stable, and robust, is safe across multiple processes, and can evict entries by itself. It is implemented as follows:
N large prime numbers are preset, and each prime serves as the length of the hash bucket of one order;
When new data needs to be stored, the key of the new data is taken modulo the length of the hash bucket of the current order to obtain an addressing location. If conflicting data is already stored at that addressing location, the hash calculation proceeds to the next order; otherwise, the new data is stored at that addressing location;
When data is stored into a multi-order hash, the maximum number of conflicts that can occur equals the number of orders of the multi-order hash.
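The basic scheme described above can be sketched as follows. This is a minimal illustration rather than the patent's implementation: it assumes integer keys, no deletion, and tiny primes chosen for readability, and the class name `MultiOrderHash` and its methods are hypothetical.

```python
class MultiOrderHash:
    """Illustrative multi-order hash: one bucket array per order, each of prime length."""

    def __init__(self, primes):
        # primes[i] is the bucket length of order i + 1 (orders are 1-based in the text).
        self.primes = list(primes)
        self.orders = [[None] * p for p in self.primes]

    def slot(self, order, key):
        # Addressing location of `key` in the given order (0-based index).
        return key % self.primes[order]

    def insert(self, key, value):
        """Store (key, value) in the first order whose addressing location is free.

        Returns the 1-based order used, or None if every addressing location
        already holds conflicting data (the failure case discussed in the text).
        """
        for order in range(len(self.primes)):
            idx = self.slot(order, key)
            entry = self.orders[order][idx]
            if entry is None or entry[0] == key:
                self.orders[order][idx] = (key, value)
                return order + 1
        return None

    def get(self, key):
        """Check the addressing location of `key` in every order."""
        for order in range(len(self.primes)):
            entry = self.orders[order][self.slot(order, key)]
            if entry is not None and entry[0] == key:
                return entry[1]
        return None


table = MultiOrderHash([7, 5, 3])
for k in (1, 106, 211, 316):   # these keys share the same residue in every order
    print(k, table.insert(k, str(k)))
```

The four keys differ by multiples of 7 × 5 × 3 = 105, so they collide in every order; the last insertion fails even though most of the table is still empty, which is the low fill rate problem the background section describes.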
Fig. 1 is a schematic diagram of the storage structure of a multi-order hash in the prior art. This multi-order hash has 6 orders, and the order number increases from top to bottom. The hash bucket of each order contains multiple storage locations; hatched cells denote locations that already hold conflicting data (used locations), and blank cells denote locations that hold no conflicting data (available locations).
When new data A is inserted into this multi-order hash, the addressing location of A in order 1 is calculated from the key of A (denoted KeyA); this location is found to be used, so addressing continues to order 2. The addressing location of A in order 2 is calculated from KeyA; this location is also used, so addressing continues to order 3. The addressing location of A in order 3 is calculated from KeyA; this location is available, so A is stored there. If A is addressed all the way to the highest order and the addressing location in the highest order is still used, the insertion of A is judged to have failed.
As the above process shows, when the existing hash algorithm addresses new data to be inserted and the addressing location in every order is unavailable, it directly judges the insertion to have failed. At that point, however, the multi-order hash may still contain free locations that go unused, which is why the fill rate of the existing multi-order hash is low.
Summary of the invention
The present invention proposes a method and device for optimizing a multi-order hash, which can improve the fill rate of the multi-order hash.
The technical solution of the present invention is implemented as follows:
A method for optimizing a multi-order hash, comprising:
when new data is inserted into the multi-order hash, if conflicting data is stored at every addressing location of the data to be inserted, probing the conflicting data at all of those addressing locations toward higher orders;
if any conflicting data is probed successfully, moving the conflicting data whose successfully probed location has the lowest order to that location, and storing the data to be inserted at the original location of that conflicting data.
A device for optimizing a multi-order hash, comprising:
a probing module, configured to, when new data is inserted into the multi-order hash and conflicting data is stored at every addressing location of the data to be inserted, probe the conflicting data at all of those addressing locations toward higher orders;
a storage module, configured to, when any conflicting data is probed successfully, move the conflicting data whose successfully probed location has the lowest order to that location, and store the new data at the original location of that conflicting data.
It can be seen that, with the method and device proposed by the present invention, when new data to be inserted is addressed and every addressing location is unavailable, it is determined whether the conflicting data at those addressing locations can be moved to a higher order. If so, the conflicting data is moved to the higher order, and the new data to be inserted is stored at the original location of that conflicting data. In this way, the fill rate of the multi-order hash can be improved.
Brief description of the drawings
Fig. 1 is a schematic diagram of the storage structure of a multi-order hash in the prior art;
Fig. 2 is a flowchart of the method for optimizing a multi-order hash proposed by the present invention;
Fig. 3 is the first schematic diagram of Embodiment 1 of the present invention;
Fig. 4 is the second schematic diagram of Embodiment 1 of the present invention;
Fig. 5 is the third schematic diagram of Embodiment 1 of the present invention;
Fig. 6 is the fourth schematic diagram of Embodiment 1 of the present invention;
Fig. 7 is the first schematic diagram of Embodiment 2 of the present invention;
Fig. 8 is the second schematic diagram of Embodiment 2 of the present invention;
Fig. 9 is the third schematic diagram of Embodiment 2 of the present invention;
Fig. 10 compares the fill rates of the existing multi-order hash, the probing-based multi-order hash proposed by the present invention, and the further improved probing-based Cuckoo-like hash;
Fig. 11 compares the write efficiency (writes per second) of the existing multi-order hash, the probing-based multi-order hash proposed by the present invention, and the further improved probing-based Cuckoo-like hash;
Fig. 12 compares the read efficiency (reads per second) of the existing multi-order hash, the probing-based multi-order hash proposed by the present invention, and the further improved probing-based Cuckoo-like hash;
Fig. 13 compares the fill rates of an ordinary 20-order multi-order hash with a data scale of 2700W (W = 10,000, i.e. 27 million), a 5-order probing-based Cuckoo-like hash with a data scale of 700W, and a 5-order probing-based Cuckoo-like hash with a data scale of 2700W;
Fig. 14 compares the write efficiency (writes per second) of an ordinary 20-order multi-order hash with a data scale of 2700W, a 5-order probing-based Cuckoo-like hash with a data scale of 700W, and a 5-order probing-based Cuckoo-like hash with a data scale of 2700W;
Fig. 15 compares the read efficiency (reads per second) of an ordinary 20-order multi-order hash with a data scale of 2700W, a 5-order probing-based Cuckoo-like hash with a data scale of 700W, and a 5-order probing-based Cuckoo-like hash with a data scale of 2700W.
Detailed description of the embodiments
The present invention proposes a method for optimizing a multi-order hash. Fig. 2 is a flowchart of the method, which comprises:
Step 201: when new data is inserted into the multi-order hash, if conflicting data is stored at every addressing location of the data to be inserted, the conflicting data at all of those addressing locations is probed toward higher orders;
Step 202: if any conflicting data is probed successfully, the conflicting data whose successfully probed location has the lowest order is moved to that location, and the data to be inserted is stored at the original location of that conflicting data.
In the above process, conflicting data may be probed toward higher orders as follows: the addressing locations of the conflicting data in higher orders are calculated from its key; when a free addressing location is found, the probe of this conflicting data is judged successful, and that free addressing location is taken as its successfully probed location.
In a specific implementation, a variable may be maintained while the conflicting data at all addressing locations is probed toward higher orders; its initial value is the number of orders of the multi-order hash plus 1. If a conflicting datum is probed successfully, the order of its successfully probed location is compared with the variable; if it is smaller, the variable is set to that order, and probing continues with the conflicting data of the next order. After all conflicting data has been probed, if the value of the variable is not greater than the number of orders of the multi-order hash, it is judged that successfully probed conflicting data exists.
In other words, for new data to be inserted, its addressing location in every order is calculated from its key. If no order has a free slot for it, each value that conflicts with the key is tried at its own higher-order locations to see whether a free slot exists. If so, the candidate whose free slot has the lowest order is chosen: the displaced value is moved to the probed location, and the new data takes its place.
The method is described in detail below with reference to a specific embodiment.
Embodiment 1:
In this embodiment, new data A needs to be inserted into a 6-order hash. Figs. 3 to 6 show the process of inserting data A.
In Fig. 3, the addressing location in every order is calculated from the key KeyA of data A, and all addressing locations are found to hold conflicting data, namely B, C, D, E, F, and G.
Because all addressing locations are unavailable, the conflicting data at all addressing locations is probed toward higher orders, as shown in Figs. 4 and 5.
In Fig. 4, data B is probed toward higher orders: its addressing locations in higher orders are calculated from its key KeyB. The addressing location in order 5 is found to be free, so the probe of data B is considered successful, and the order 5 of the successfully probed location is recorded.
In Fig. 5, data C is then probed toward higher orders: its addressing locations in higher orders are calculated from its key KeyC. The addressing location in order 3 is found to be free, so the probe of data C is considered successful, and the order 3 of the successfully probed location is recorded.
Afterwards, D, E, F, and G are probed toward higher orders in turn. The higher-order addressing locations of these data are all occupied, so their probes fail.
After probing is complete, data A is inserted. As shown in Fig. 6, because the successfully probed location of data C has the lowest order, data C is moved to that location, and data A is stored at the original location of data C.
If the probes of all data fail, the insertion of data A is considered to have failed.
In a concrete implementation, if no free slot is found after addressing up to the highest order of an N-order hash, a variable minRow (initial value N+1) can be maintained. Each datum on the addressing path whose row number is less than minRow is probed toward higher orders. (A key located in order i is there because the locations before order i on its path were necessarily occupied, so it only needs to be probed toward higher orders.) If a free location is found whose order is less than minRow, minRow is updated to that order. This continues until probing ends. After probing ends, if minRow<=N, the replacement is performed; otherwise the insertion is considered to have failed.
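The probing step with the minRow variable can be sketched as follows. This is only an illustration under stated assumptions: the table stores bare integer keys, there is no deletion, and the function names `make_table` and `probe_insert` as well as the variables `min_row`, `source`, and `dest` are hypothetical names, not taken from the patent.

```python
def make_table(primes):
    # One bucket array per order; primes[i] is the length of order i + 1.
    return [[None] * p for p in primes]


def probe_insert(table, primes, key):
    """Insert `key`; if every addressing location is used, try to move one
    conflicting key to a free higher-order location. Returns True on success."""
    n = len(primes)
    # Ordinary pass: take the first free addressing location.
    for order in range(n):
        idx = key % primes[order]
        if table[order][idx] is None:
            table[order][idx] = key
            return True
    # All addressing locations hold conflicting data: probe each one upward.
    min_row = n + 1      # lowest 1-based order of a free location found so far
    source = None        # 0-based order of the conflicting key that found it
    for order in range(n):
        if order + 1 >= min_row:
            break        # keys at or beyond min_row cannot improve the result
        victim = table[order][key % primes[order]]
        for higher in range(order + 1, n):
            if higher + 1 >= min_row:
                break    # only a strictly lower order is an improvement
            if table[higher][victim % primes[higher]] is None:
                min_row = higher + 1
                source = order
                break
    if min_row > n:
        return False     # no conflicting key could be moved: insertion fails
    # Move the chosen conflicting key up, then store the new key in its place.
    dest = min_row - 1
    victim = table[source][key % primes[source]]
    table[dest][victim % primes[dest]] = victim
    table[source][key % primes[source]] = key
    return True
```

Stopping the scan once `order + 1 >= min_row` mirrors the remark that only data whose row number is less than minRow needs to be probed; whether a real implementation would prune in exactly this way is an assumption of the sketch.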
The above is the method for optimizing a multi-order hash proposed by the present invention, called the probing-based multi-order hash. By changing how new data is inserted, it raises the fill rate of the whole multi-order hash. The present invention can further improve this method by borrowing the idea from the existing Cuckoo hash algorithm of "kicking out" conflicting keys: when newly inserted data finds no replaceable location after one round of probing, it tries to replace each of its conflicting data in turn, and the same scheme is applied recursively to the replaced conflicting data (the limit on the number of replacement levels is a preset value, for example 3; too many levels cause obvious problems). The improved method is called the probing-based Cuckoo-like hash.
Specifically, after step 202 above, the method may further comprise:
If no conflicting data is probed successfully, performing the following steps for the conflicting data at the first addressing location of the data to be inserted:
A. replacing the conflicting data with the data to be inserted;
B. taking the conflicting data as the new data to be inserted, and probing the conflicting data at all addressing locations of this new data to be inserted toward higher orders; if any conflicting data is probed successfully, moving the conflicting data whose successfully probed location has the lowest order to that location, storing the new data to be inserted at the original location of that conflicting data, and ending the current flow; if no conflicting data is probed successfully, performing step A in turn for the conflicting data at each addressing location of this new data to be inserted, until the number of replacement levels reaches the preset threshold; if there is still no successfully probed conflicting data, returning all replaced conflicting data to their original locations, and performing step A for the conflicting data at the next addressing location of the data to be inserted;
when the above has been carried out for the conflicting data at all addressing locations of the data to be inserted and there is still no successfully probed conflicting data, judging that the insertion of the data to be inserted has failed, and returning the conflicting data of the data to be inserted to its original location.
Here, the conflicting data at the addressing locations of a replaced conflicting datum is probed toward higher orders in the same way as the existing data at the addressing locations of newly inserted data, as described above.
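Steps A and B can be read as a bounded recursion with rollback, sketched below. This is a hedged illustration, not the patented implementation: it assumes integer keys that are not already in the table, and it lets a displaced key go through the ordinary pass as well as the probing pass, which is a simplification of the text. The names `try_place`, `cuckoo_insert`, and `layers_left` are hypothetical.

```python
def try_place(table, primes, key):
    """Ordinary pass plus one round of upward probing; True on success.
    On failure the table is left unchanged."""
    n = len(primes)
    for order in range(n):
        idx = key % primes[order]
        if table[order][idx] is None:
            table[order][idx] = key
            return True
    best = None  # (source order, destination order) with the lowest destination
    for src in range(n):
        victim = table[src][key % primes[src]]
        for dst in range(src + 1, n):
            if best is not None and dst >= best[1]:
                break
            if table[dst][victim % primes[dst]] is None:
                best = (src, dst)
                break
    if best is None:
        return False
    src, dst = best
    victim = table[src][key % primes[src]]
    table[dst][victim % primes[dst]] = victim
    table[src][key % primes[src]] = key
    return True


def cuckoo_insert(table, primes, key, layers_left=3):
    """Probing-based Cuckoo-like insertion (sketch).

    layers_left is the number of replacement levels still allowed,
    i.e. the preset threshold at the top-level call."""
    if try_place(table, primes, key):
        return True
    if layers_left == 0:
        return False
    for order in range(len(primes)):
        idx = key % primes[order]
        victim = table[order][idx]
        table[order][idx] = key            # step A: replace the conflicting key
        if cuckoo_insert(table, primes, victim, layers_left - 1):  # step B
            return True
        table[order][idx] = victim         # rollback: restore the original location
    return False
```

Because every failed branch restores the slot it overwrote, a failed insertion leaves the table exactly as it was. The sketch does not mark the key being inserted, so it may be displaced and re-probed once before the branch rolls back.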
The improved method is described in detail below with reference to a specific embodiment.
Embodiment 2:
In this embodiment, new data A needs to be inserted into a 6-order hash. Figs. 7 to 9 show the process of inserting data A. In this embodiment, the threshold on the number of replacement levels is taken to be 2 as an example.
In Fig. 7, the addressing location in every order is calculated from the key KeyA of data A, and all addressing locations are found to hold conflicting data, namely B, C, D, E, F, and G. Moreover, the probes of the conflicting data B, C, D, E, F, and G all fail, that is, the higher-order addressing locations of these conflicting data are all occupied (Fig. 7 shows only the probing process of data B).
Afterwards, A replaces the conflicting data B, C, D, E, F, and G on its addressing path one at a time, and the Cuckoo-like insertion operation is performed on the replaced conflicting data.
As shown in Fig. 8, data A replaces data B; data B now becomes the new data to be inserted, and the same Cuckoo-like insertion operation is performed on data B. The number of replacement levels is now 1 (A replaces B). The conflicting data at the addressing locations of B is, in order, A → H → I → J → K → L, so A, H, I, J, K, and L are probed in turn. If any conflicting data is probed successfully, the conflicting data whose successfully probed location has the lowest order is moved to that location, and B is stored at the original location of that conflicting data. As shown in Fig. 9, the probe of B's conflicting data H succeeds and the successfully probed location of H has the lowest order, so B is stored at the original location of H, and H is stored at its successfully probed location. At this point the insertion of new data A has succeeded, and the current flow ends.
If, after all the conflicting data of B has been probed, there is still no successfully probed conflicting data, B replaces the conflicting data on its addressing path one at a time; each replaced conflicting datum becomes the new data to be inserted and undergoes the Cuckoo-like insertion operation. The number of replacement levels is now 2 (A replaces B, and B replaces its conflicting data), which reaches the preset threshold.
For B's conflicting data H, suppose the conflicting data at the addressing locations of H is, in order, W → B → M → P → Q → V; then W, B, M, P, Q, and V are probed toward higher orders in turn. If no conflicting data is probed successfully, H cannot replace its conflicting data, because the number of replacement levels has already reached the preset threshold; B then goes on to replace the conflicting data at its next addressing location. When B has replaced the conflicting data at all of its addressing locations and there is still no successfully probed conflicting data, the probe for B has failed, and all replaced conflicting data is returned to its original location.
Afterwards, A goes on to replace the next conflicting datum, namely data C; data C now becomes the new data to be inserted, and the same operation as for data B is performed on data C. If the probe still fails, A goes on to replace the next conflicting datum. Once all conflicting data has been tried without a successful probe, the insertion of data A is judged to have failed.
In the above process, when the data to be inserted A appears as the conflicting data of some other datum, a mark can be added so that A is not replaced. Omitting the mark does not add much extra cost either, because everything on A's probing path is certain to conflict, so only one probe followed by a rollback takes place.
The probing-based Cuckoo-like hash further raises the fill rate of the multi-order hash, and read/write performance improves qualitatively.
The existing multi-order hash, the probing-based multi-order hash proposed by the present invention, and the further improved probing-based Cuckoo-like hash are compared by testing below. The test environment is single-threaded + B6; the data scale is about 2700W (W = 10,000, i.e. about 27 million), with about 136W per order and 20 orders in total. Figs. 10 to 12 compare the three methods in fill rate, write efficiency (writes per second), and read efficiency (reads per second), respectively.
The charts clearly show that the probing-based Cuckoo-like hash performs very well in fill rate, reaching a fill rate above 96% with 5 orders. We compare the 5-order probing-based Cuckoo-like hash with the ordinary 20-order hash:
Utilization is higher by (96.5-83.3)/83.3=15.8%
Write performance improves by (2668997-1391161)/1391161=91.9%
Read performance improves by (2462777-614965)/614965=300.5%
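The three percentages are relative gains over the ordinary 20-order hash and can be reproduced from the figures quoted above. The helper name `relative_gain` is hypothetical and used only for this check.

```python
def relative_gain(new, old):
    """Relative improvement of `new` over `old`, in percent."""
    return (new - old) / old * 100


# Figures quoted in the text: 5-order Cuckoo-like hash vs. ordinary 20-order hash.
fill = relative_gain(96.5, 83.3)             # fill rate, in percent
write = relative_gain(2668997, 1391161)      # writes per second
read = relative_gain(2462777, 614965)        # reads per second

print(round(fill, 1), round(write, 1), round(read, 1))  # prints 15.8 91.9 300.5
```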
Why does write performance also improve? This is actually not surprising: the number of orders is much lower, so even when conflicts occur there are few of them.
The total data scale of the 5-order probing-based Cuckoo-like hash is therefore 136*5 ≈ 700W, while that of the ordinary 20-order hash is 2700W. Is such a comparison meaningful? In fact, the fill rate and read/write performance of a multi-order hash have little to do with the size of each order. We increase the per-order size of the Cuckoo-like hash to 540W for comparison. Figs. 13 to 15 compare an ordinary 20-order multi-order hash with a data scale of 2700W, a 5-order probing-based Cuckoo-like hash with a data scale of 700W, and a 5-order probing-based Cuckoo-like hash with a data scale of 2700W in fill rate, write efficiency (writes per second), and read efficiency (reads per second), respectively. These three charts show that the size of each order does not have a particularly large effect on overall efficiency, and that compared with the ordinary multi-order hash structure, the probing-based Cuckoo-like hash performs excellently in fill rate and read/write performance.
In actual mailbox batch-processing services, storing sparse data with a bitmap, as was done in the past, wastes a large amount of memory, whereas the method for optimizing a multi-order hash proposed by the present invention raises the fill rate of the multi-order hash and thus relieves memory pressure very effectively. In terms of read/write efficiency, the method proposed by the present invention also fully meets business requirements.
The present invention also proposes a device for optimizing a multi-order hash, comprising:
a probing module, configured to, when new data is inserted into the multi-order hash and conflicting data is stored at every addressing location of the data to be inserted, probe the conflicting data at all of those addressing locations toward higher orders;
a storage module, configured to, when any conflicting data is probed successfully, move the conflicting data whose successfully probed location has the lowest order to that location, and store the new data at the original location of that conflicting data.
In the above device, the probing module is further configured to, if none of the conflicting data at the addressing locations of the data to be inserted is probed successfully, perform the following steps for the conflicting data at the first addressing location of the data to be inserted:
A. replacing the conflicting data with the data to be inserted;
B. taking the conflicting data as the new data to be inserted, and probing the conflicting data at all addressing locations of this new data to be inserted toward higher orders; if any conflicting data is probed successfully, notifying the storage module to move the conflicting data whose successfully probed location has the lowest order to that location and to store the new data to be inserted at the original location of that conflicting data, and ending the current flow; if no conflicting data is probed successfully, performing step A in turn for the conflicting data at each addressing location of this new data to be inserted, until the number of replacement levels reaches the preset threshold; if there is still no successfully probed conflicting data, notifying the storage module to return all replaced conflicting data to their original locations, and performing step A for the conflicting data at the next addressing location of the data to be inserted;
when the above has been carried out for the conflicting data at all addressing locations of the data to be inserted and there is still no successfully probed conflicting data, judging that the insertion of the data to be inserted has failed, and notifying the storage module to return the conflicting data of the data to be inserted to its original location.
In the above device, the probing module probes conflicting data toward higher orders as follows: the addressing locations of the conflicting data in higher orders are calculated from its key; when a free addressing location is found, the probe of this conflicting data is judged successful, and that free addressing location is taken as its successfully probed location.
For the conflicting data of a datum to be inserted, the probing module may maintain a variable while probing the conflicting data at all addressing locations toward higher orders; the initial value of the variable is the number of orders of the multi-order hash plus 1. If a conflicting datum is probed successfully, the order of its successfully probed location is compared with the variable; if it is smaller, the variable is set to that order, and probing continues with the conflicting data of the next order. After all conflicting data has been probed, if the value of the variable is not greater than the number of orders of the multi-order hash, it is judged that successfully probed conflicting data exists.
In summary, with the method and device for optimizing a multi-order hash proposed by the present invention, when new data to be inserted is addressed and every addressing location is unavailable, it is determined whether the conflicting data at those addressing locations can be moved to a higher order. If so, the conflicting data is moved to the higher order, and the data to be inserted is stored at the original location of that conflicting data. Further, if probing fails, the conflicting data on the probing path can be replaced in turn, with each replaced conflicting datum treated as the new data to be inserted and the same scheme applied recursively; the permitted number of replacement levels can be set in advance. In this way the fill rate of the multi-order hash is improved while high read/write performance is maintained.
The above are only preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (8)

1. the method multistage Hash being optimized, is characterized in that, described method comprises:
While inserting new data to multistage Hash, if this is inserted in all addressed location of data all save confliction data, the colliding data in described all addressed location is surveyed to high-order more;
Survey successful colliding data if existed, be transferred to the successful position of this detection and preserve surveying the minimum colliding data of exponent number of successful position, and by the described initial position that is inserted into data and is kept at the colliding data that the exponent number of this detection success position is minimum.
2. The method according to claim 1, characterized in that the method further comprises:
if no probe-successful conflicting data exists, performing the following steps for the conflicting data in the first addressing position of the data to be inserted:
A. replacing the conflicting data with the data to be inserted;
B. taking the conflicting data as new data to be inserted, and probing the conflicting data in all addressing positions of the new data to be inserted toward higher orders; if probe-successful conflicting data exists, moving the conflicting data whose probe-success position has the lowest order to that probe-success position and storing it there, storing the new data to be inserted in the original position of the conflicting data whose probe-success position has the lowest order, and ending the current flow; if no probe-successful conflicting data exists, performing step A in turn for the conflicting data in all addressing positions of the new data to be inserted, until the number of replacement levels reaches a preset threshold; if probe-successful conflicting data still does not exist, returning all replaced conflicting data to their original positions and storing them there, and performing step A for the conflicting data in the next addressing position of the data to be inserted;
until the conflicting data in all addressing positions of the data to be inserted have been processed; if probe-successful conflicting data still does not exist, determining that insertion of the data to be inserted has failed, and returning the conflicting data of the data to be inserted to its original position and storing it there.
3. method according to claim 1 and 2, it is characterized in that, describedly by colliding data to the mode that more high-order is surveyed be: calculate the addressed location of this colliding data on high-order more according to the key assignments of described colliding data, in the time finding idle addressed location, judge that this colliding data surveys successfully, this idle addressed location is surveyed to successful position as this colliding data.
4. method according to claim 1 and 2, it is characterized in that, for a described colliding data that is inserted into data, by the colliding data in all addressed location when more high-order is surveyed, safeguard a variable, the exponent number that the initial value of this variable is described multistage Hash adds 1; If colliding data is surveyed successfully, judge that this colliding data surveys the exponent number of successful position and whether be less than this variable, if be less than, the value of this variable is revised as to this colliding data and surveys the exponent number of successful position, continue the more detection of high-order colliding data; After all colliding datas detections are complete, if the value of described variable is not more than the exponent number of described multistage Hash, judges to exist and survey successful colliding data.
5. A device for optimizing a multi-order hash, characterized in that the device comprises:
a probing module, configured to, when new data is inserted into the multi-order hash and conflicting data is stored in all addressing positions of the data to be inserted, probe the conflicting data in all of the addressing positions toward higher orders;
a storage module, configured to, when probe-successful conflicting data exists, move the conflicting data whose probe-success position has the lowest order to that probe-success position and store it there, and store the new data in the original position of the conflicting data whose probe-success position has the lowest order.
6. The device according to claim 5, characterized in that the probing module is further configured to, if no probe-successful conflicting data exists among the conflicting data in all addressing positions of the data to be inserted, perform the following steps for the conflicting data in the first addressing position of the data to be inserted:
A. replacing the conflicting data with the data to be inserted;
B. taking the conflicting data as new data to be inserted, and probing the conflicting data in all addressing positions of the new data to be inserted toward higher orders; if probe-successful conflicting data exists, notifying the storage module to move the conflicting data whose probe-success position has the lowest order to that probe-success position and store it there, and to store the new data to be inserted in the original position of the conflicting data whose probe-success position has the lowest order, and ending the current flow; if no probe-successful conflicting data exists, performing step A in turn for the conflicting data in all addressing positions of the new data to be inserted, until the number of replacement levels reaches a preset threshold; if probe-successful conflicting data still does not exist, notifying the storage module to return all replaced conflicting data to their original positions and store them there, and performing step A for the conflicting data in the next addressing position of the data to be inserted;
until the conflicting data in all addressing positions of the data to be inserted have been processed; if probe-successful conflicting data still does not exist, determining that insertion of the data to be inserted has failed, and notifying the storage module to return the conflicting data of the data to be inserted to its original position and store it there.
7. The device according to claim 5 or 6, characterized in that the probing module probes the conflicting data toward higher orders as follows: the addressing positions of the conflicting data at higher orders are calculated from the key value of the conflicting data; when a free addressing position is found, the conflicting data is determined to be probed successfully, and the free addressing position is taken as the probe-success position of the conflicting data.
8. The device according to claim 5 or 6, characterized in that the probing module, for the conflicting data of the data to be inserted, maintains a variable while probing the conflicting data in all addressing positions toward higher orders, the initial value of the variable being the number of orders of the multi-order hash plus 1; if a piece of conflicting data is probed successfully, the probing module determines whether the order of its probe-success position is less than the variable, and if so, sets the value of the variable to the order of that probe-success position, and continues probing with the conflicting data at the next higher order; after all conflicting data have been probed, if the value of the variable is not greater than the number of orders of the multi-order hash, the probing module determines that probe-successful conflicting data exists.
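The replacement procedure of claims 2 and 6 (steps A and B with a preset replacement-level threshold) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the claimed implementation: `MultiOrderHash`, `max_depth`, the helper names, and the `key % table_size` addressing function are hypothetical, and each level of the recursion restores a replaced entry as soon as its own attempt fails.

```python
from typing import List, Optional


class MultiOrderHash:
    """Toy multi-order hash with bounded recursive replacement (hypothetical)."""

    def __init__(self, sizes: List[int], max_depth: int = 2) -> None:
        self.sizes = list(sizes)
        self.tables: List[List[Optional[int]]] = [[None] * s for s in self.sizes]
        self.max_depth = max_depth  # preset threshold on replacement levels

    def _pos(self, key: int, order: int) -> int:
        # Assumed addressing function: key modulo the size of that order.
        return key % self.sizes[order]

    def _relocate_one(self, key: int) -> bool:
        """Move one occupant of key's slots to a higher order, then store key."""
        n = len(self.sizes)
        best, source = n + 1, -1  # best starts at "number of orders plus 1"
        for order in range(n):
            occupant = self.tables[order][self._pos(key, order)]
            for higher in range(order + 1, n):
                if self.tables[higher][self._pos(occupant, higher)] is None:
                    if higher + 1 < best:
                        best, source = higher + 1, order
                    break
        if best > n:
            return False
        occupant = self.tables[source][self._pos(key, source)]
        self.tables[best - 1][self._pos(occupant, best - 1)] = occupant
        self.tables[source][self._pos(key, source)] = key
        return True

    def _place(self, key: int, depth: int) -> bool:
        n = len(self.sizes)
        for order in range(n):  # ordinary insertion into a free slot
            pos = self._pos(key, order)
            if self.tables[order][pos] is None:
                self.tables[order][pos] = key
                return True
        if self._relocate_one(key):  # probe conflicting data to higher orders
            return True
        if depth >= self.max_depth:  # replacement-level threshold reached
            return False
        for order in range(n):
            pos = self._pos(key, order)
            victim = self.tables[order][pos]
            self.tables[order][pos] = key        # step A: replace the occupant
            if self._place(victim, depth + 1):   # step B: reinsert the occupant
                return True
            self.tables[order][pos] = victim     # restore on failure
        return False

    def insert(self, key: int) -> bool:
        return self._place(key, 0)

    def contains(self, key: int) -> bool:
        return any(self.tables[o][self._pos(key, o)] == key
                   for o in range(len(self.sizes)))
```

With table sizes 7, 5, and 3 and the keys 21, 70, 56, 105, and 175 already stored, inserting 210 fails when `max_depth` is 0 but succeeds when one level of replacement is allowed: 210 replaces 21 at the first order, 56 moves from the second order to the third, and 21 takes the slot that 56 vacated.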
CN201310196974.7A 2013-05-24 2013-05-24 Method and device for optimizing multi-order hash Active CN104182409B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310196974.7A CN104182409B (en) 2013-05-24 2013-05-24 Method and device for optimizing multi-order hash

Publications (2)

Publication Number Publication Date
CN104182409A true CN104182409A (en) 2014-12-03
CN104182409B CN104182409B (en) 2018-01-19

Family

ID=51963460

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310196974.7A Active CN104182409B (en) 2013-05-24 2013-05-24 Method and device for optimizing multi-order hash

Country Status (1)

Country Link
CN (1) CN104182409B (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101782922A (en) * 2009-12-29 2010-07-21 山东山大鸥玛软件有限公司 Multi-level bucket hashing index method for searching mass data
CN103064906A (en) * 2012-12-18 2013-04-24 华为技术有限公司 File management method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
PAGH R et al.: "Cuckoo hashing", 《SPRINGER》 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108804234A (en) * 2017-04-28 2018-11-13 腾讯科技(深圳)有限公司 Data-storage system and its operating method
CN113297209A (en) * 2021-02-10 2021-08-24 阿里巴巴集团控股有限公司 Method and device for performing hash connection on database
CN113297209B (en) * 2021-02-10 2024-03-08 阿里巴巴集团控股有限公司 Method and device for database to execute hash connection

Also Published As

Publication number Publication date
CN104182409B (en) 2018-01-19

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20231229

Address after: 518057, 35th Floor, Tencent Building, Keji Middle Road, High tech Zone, Shenzhen, Guangdong Province

Patentee after: TENCENT TECHNOLOGY (SHENZHEN) Co.,Ltd.

Patentee after: TENCENT CLOUD COMPUTING (BEIJING) Co.,Ltd.

Address before: Room 403, East, Building 2, SEG Science and Technology Park, Zhenxing Road, Futian District, Shenzhen, Guangdong 518044

Patentee before: TENCENT TECHNOLOGY (SHENZHEN) Co.,Ltd.