CN102722531B - Query method based on regional bitmap indexes in cloud environment - Google Patents

Info

Publication number
CN102722531B
Authority
CN
China
Prior art keywords
bitmap
condition
tuple
node
cloud environment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201210155253.7A
Other languages
Chinese (zh)
Other versions
CN102722531A (en)
Inventor
孟必平
王腾蛟
李红燕
高军
杨冬青
唐世渭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
Original Assignee
Peking University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University
Priority to CN201210155253.7A
Publication of CN102722531A
Application granted
Publication of CN102722531B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a query method based on regional bitmap indexes in a cloud environment. The method comprises the following steps: 1) establishing the regional bitmap index: 1.1) partitioning the value ranges of the index attributes of a data table in the cloud environment to generate a global ordering table of attribute values, the global ordering table ordering tuples by a set rule; 1.2) establishing an indication bitmap on each data node according to the partitioning result, the indication bitmap recording which attribute values are stored locally; 1.3) establishing a local bitmap index on each data node according to the architecture of the cloud environment, which completes the regional bitmap index; and 2) inputting a query condition, whereupon a master node builds a condition bitmap from the query condition and distributes it to each data node, the condition bitmap covering all possibilities included in the query condition; the data nodes execute the retrieval task concurrently, and the master node collects the query result of each data node and returns the union of these results to the user. By establishing regional bitmap indexes, the configurable parallel computing resources of the cloud environment can be fully used, and data query requests whose conditions are value comparisons can be answered quickly.

Description

Query method based on regional bitmap indexes in a cloud environment
Technical field
The invention belongs to the field of information technology and relates to a distributed bitmap indexing method for cloud environments and to the use of this method to query data.
Background
Cloud computing environments and data management
The rapid development of cloud computing technology has made the storage and management of massive data possible. Compared with a traditional single-machine computing environment, a cloud environment can effectively use the large computing resources of a distributed cluster to meet the computing and storage demands of massive data management, and it is easy to maintain, easy to scale and easy to manage. Faced with rapidly growing data volumes, cloud computing technology can quickly adjust and allocate the required resources to keep pace with explosive data growth; it can also provide unstructured data with a flexible, loosely structured storage model and with distributed parallel computing resources built on that storage model. With the rapid expansion of data in the network era, managing large-scale data has become an urgent need. The advantages of the cloud environment in computing and storage resources give it the ability to store and manage large-scale data.
Bitmap index
A bitmap index is a special database indexing technique that uses bitmaps. In a bitmap index, each bit of a bitmap is 0 or 1 and indicates whether the corresponding tuple takes a given value on the indexed attribute. The length of a bitmap therefore equals the total number of tuples. A bitmap index on attribute A builds one bitmap for every possible value of the attribute, indicating which tuples take that value. If attribute A has many possible values, many bitmaps are produced; in that case a B+ tree can be used to organize the bitmaps so that the bitmap being sought can be located quickly. The advantage of a bitmap index is that the efficient bitwise logical operations of conventional computers can be used to process compound query conditions quickly. For example, a bitwise AND of the bitmap retrieved on attribute $A_1$ and the bitmap retrieved on attribute $A_2$ yields the tuples that satisfy the query conditions on both attributes at once.
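The bitwise AND described above can be illustrated with a minimal sketch. It assumes a toy table held in two Python lists and uses Python integers as bitmaps; the function names and the sample data are hypothetical and do not come from the patent.

```python
from collections import defaultdict

def build_bitmap_index(values):
    """Map each distinct value to an int used as a bitmap; bit k is set when tuple k has that value."""
    index = defaultdict(int)
    for position, value in enumerate(values):
        index[value] |= 1 << position
    return dict(index)

def matching_positions(bitmap):
    """Return the tuple positions whose bit is set in the bitmap."""
    return [k for k in range(bitmap.bit_length()) if (bitmap >> k) & 1]

# Hypothetical toy table: one list per attribute, one entry per tuple.
sex = ["male", "female", "male", "female", "male"]
band = ["low", "mid", "mid", "mid", "high"]

sex_index = build_bitmap_index(sex)
band_index = build_bitmap_index(band)

# Compound condition "sex = female AND band = mid" is a single bitwise AND.
both = sex_index["female"] & band_index["mid"]
print(matching_positions(both))  # [1, 3]
```

Each distinct value costs one bitmap of table length, which is why the text suggests organizing the bitmaps in a B+ tree when an attribute has many values.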
Centralized index management in a cloud environment
In the centralized scheme, all values of the indexed field are globally sorted and managed centrally. Specifically, the index entry of each record contains the value of the indexed field and the primary key of the record. In the index structure, these index entries are globally sorted by the value of the indexed field. The system processes a query on the indexed field in two steps. First, it finds the qualifying index entries in the globally sorted index structure and thereby learns the primary keys of the corresponding records. Then it accesses the clustered index by primary key to locate the complete records.
The greatest challenge of the centralized scheme is managing the globally sorted index entries. The most direct approach is to store these index entries in the data management system in a distributed manner, like any other data. Taking the primary-key retrieval mechanism of the open-source project HBase as an example, Fig. 1 shows how the index structure is composed and accessed in the centralized scheme: the root table records which metadata table the target tuple corresponds to, and the metadata table records the actual position of the target tuple in the data table. During retrieval, steps 1, 2 and 3 first obtain the primary key of the target tuple, and steps 5, 6 and 7 then find the target tuple by its primary key. In this way even a huge index structure enjoys the reliability, scalability and manageability offered by the massive data management system. However, only a single data node takes part in each step of the access process, so the advantages of distributed computing resources are not exploited, and the query response time is therefore long.
Distributed index management in a cloud environment
In the distributed scheme, each data node builds an independent index over the local data it manages. The distributed scheme does not maintain a global ordering of index values; instead, the ordering is local to each data node. Data nodes therefore do not depend on one another, which makes it easy to execute query requests concurrently. When a query on an indexed key arrives, the retrieval task is distributed to all data nodes and executed concurrently, and the final query result is the union of the results returned by all data nodes. Because the index on each data node is maintained independently, the local index structure is very flexible: the index technique used by the nodes can be homogeneous, for example B+ tree indexes everywhere, or heterogeneous, for example B+ tree indexes on some nodes and bitmap indexes on others. Heterogeneous local indexes let each data node choose its index technique according to its own computing resources. For example, a node short of main memory can use a B+ tree index and keep only the two levels of the B+ tree nearest the root in main memory, while a node with weak CPU capacity can use a bitmap index and reduce computation through logical operations on bitmaps. Fig. 2 shows how the index structure is composed and accessed in the distributed scheme.
In this way the index structure follows the data and is spread across the nodes, and the nodes are independent of one another. Retrieval tasks are likewise assigned to the nodes and executed independently, so the parallel computing resources are well used. However, especially for equality conditions, the most commonly used kind, most retrieval tasks have very few target records. Executing such a task in parallel across the distributed cluster often makes many data nodes that store no target record run the retrieval anyway and eventually return the empty set. When retrieval tasks are frequent, this parallel execution wastes a large amount of computing resources and ultimately reduces system throughput.
Summary of the invention
The object of the present invention is to provide a distributed bitmap indexing method for cloud environments, the regional bitmap index (Regional Bitmap Index, RBI), and to use it to query large-scale data efficiently. The invention draws on the respective advantages of the centralized and distributed index schemes. It proposes the regional bitmap index structure and, through a global ordering mechanism over attribute values, builds indication bitmaps so that each data node knows how its local data is distributed within the global data. By localizing the index structure, it keeps data nodes as independent of one another as possible, which facilitates parallel retrieval. The method makes full use of the parallel computing resources of the cloud environment to significantly improve query response time. At the same time, each data node uses the distribution of values over the value domain to avoid the unnecessary computation and overhead of running retrieval on the many data nodes that hold no matching data, so query throughput is improved as well.
The present invention proposes a query method based on regional bitmap indexes in a cloud environment, comprising the steps of:
1) establishing the regional bitmap index:
1.1) partitioning the value domains of the attribute values of the tuples in the cloud environment and generating a global ordering table of attribute values, the global ordering table ordering the tuples by a set rule;
1.2) establishing an indication bitmap on each distributed data node according to the partitioning result, the indication bitmap recording which attribute values are stored locally;
1.3) establishing a local bitmap index on each distributed data node according to the architecture of the cloud environment, which completes the regional bitmap index;
2) inputting a query condition, whereupon the master node builds a condition bitmap from the query condition and distributes it to each data node, the condition bitmap covering all possibilities included in the query condition; the data nodes execute the retrieval task concurrently, and the master node collects the query result of each data node and returns the union of the results on all data nodes to the user.
Each tuple is represented by a bit string of length $\sum_{i=1}^{f} c_i$, where the value domain of attribute i of the tuple is divided into $c_i$ subdomains, f is the number of attributes being partitioned, and $1 \le i \le f$.
The $c_i$ subdomains form a set $C_i$, and the combinations of subdomains are represented by the Cartesian product $Des_{1 \ldots f} = C_1 \times C_2 \times C_3 \times \cdots \times C_f$, whose size is $B = \prod_{i=1}^{f} c_i$.
The bit strings are globally ordered: all possible values of the bit string of a tuple are sorted in ascending order, so that the values of any tuple on the queried fields correspond to a unique rank value.
The length of the indication bitmap equals the number of subdomain combinations into which the tuple attribute value domains are divided, that is, the size B of the Cartesian product.
In step 1.3), the local bitmap index is built as follows: a B+ tree is built over the global rank values of the bit strings of the tuples present on this node, each key in a leaf node of the tree corresponding to one rank value; to each key in the leaf nodes of the B+ tree is attached a bitmap whose length is the total number of tuples managed by this data node, which serves as the tuple bitmap of the corresponding rank value.
When the query condition is a single query condition:
A) each computing node splits the condition bitmap into the bit strings of the corresponding elements of the attribute Cartesian product, converts each resulting bit string into its rank value, and builds a set of target rank values;
B) an all-zero bit string cb of length B is generated, and the bits whose positions belong to the set of target rank values are set to 1;
C) the result of the bitwise AND eb & cb is checked for being 0, where eb is the indication bitmap on this computing node and cb is the bit string generated in step B);
D) if it is 0, the empty set is returned directly as the result on this computing node;
E) otherwise, the local B+ tree of this computing node is searched for the corresponding leaf nodes and the tuple bitmaps attached to them, and the tuples corresponding to bits set to 1 in the tuple bitmaps are checked one by one against the condition.
When the query condition consists of multiple query conditions:
I) the query is executed for each condition as in the single-condition case;
II) the query results of the individual conditions from step I) are combined with the bitwise logical operations that correspond to how the conditions are combined in the original query expression, and the tuples corresponding to bits set to 1 in the result are checked one by one against the condition;
III) finally, all results that satisfy the condition are returned to the master node as the query result.
The condition bitmap is the bitwise logical combination of the bit strings of the qualifying elements of the Cartesian product of tuple attribute bit strings.
When a query request reaches a distributed data node, the node compares the request against its indication bitmap to determine whether it holds any target tuples; if not, it directly returns an empty result as the query result of this data node, without executing the retrieval task.
The beneficial effects of the present invention are:
Based on an analysis and comparison of the differences between the centralized and distributed schemes, the invention designs an indexing mechanism that combines the advantages of both. The method makes full use of the configurable parallel computing resources of the cloud environment and responds quickly to data query requests whose conditions are value comparisons. Thanks to the indication bitmap, the invention effectively avoids unnecessary consumption of computing resources. The invention is deployed in a cloud environment, and the retrieval method it provides scales well for large-scale data.
Brief description of the drawings
Fig. 1 is a schematic diagram of the prior-art centralized scheme;
Fig. 2 is a schematic diagram of the prior-art distributed scheme;
Fig. 3 is a schematic comparison of response time between the query method based on regional bitmap indexes in a cloud environment of the present invention and prior-art retrieval methods;
Fig. 4 is a schematic comparison of query throughput between the query method based on regional bitmap indexes in a cloud environment of the present invention and prior-art retrieval methods;
Fig. 5 shows the tuples on a data node together with their bit string values and rank values in one embodiment of the query method of the present invention;
Fig. 6 is a schematic diagram of the local B+ tree structure and the tuple bitmaps in one embodiment of the query method of the present invention;
Fig. 7 is a flow chart of the query steps of the query method based on regional bitmap indexes in a cloud environment of the present invention.
Embodiments
Experiments on a 52 GB data set of 450 million tuples recording Twitter microblog forwarding relationships show that the indexing method proposed in the present invention performs well in both response time and query throughput. Fig. 3 and Fig. 4 show the experimental comparison between a basic implementation of the centralized index scheme (Global Approach, GA), a basic implementation of the distributed index scheme (Distributed Approach, DA), and the regional bitmap index (Regional Bitmap Index, RBI) proposed by this method. The experiments used a distributed cluster of 4 machines as the experimental environment.
With reference to Fig. 7, each step in building the regional bitmap index proposed by the present invention is described in detail below:
First, a global ordering table is generated for the values of the data on the indexed attributes. The global ordering table is generated as follows:
1. Suppose a bitmap index is to be built on f data columns in total, whose attributes are $A_1, A_2, A_3, \ldots, A_f$. First, the value domain of each attribute $A_i$ is divided into $c_i$ subdomains (if the attribute takes discrete values, each value can form a subdomain of its own). Let $C_i$ be the set formed by these subdomains; the size of the Cartesian product of subdomains $Des_{1 \ldots f} = C_1 \times C_2 \times C_3 \times \cdots \times C_f$ is then:
$$B = \prod_{i=1}^{f} c_i$$
2. Each tuple is represented by a bit string of length
$$\sum_{i=1}^{f} c_i$$
where $c_i$ is the size of the subdomain set $C_i$. If the bit corresponding to the j-th subdomain of attribute $A_i$ of tuple t is $b_{i,j}$, the bit string of the tuple is:
$$b_{1,1}\, b_{1,2} \cdots b_{1,c_1}\; b_{2,1}\, b_{2,2} \cdots b_{2,c_2}\; \cdots\; b_{f,1}\, b_{f,2} \cdots b_{f,c_f}$$
3. Given the bit string of any tuple t, for each attribute $A_i$ with $1 \le i \le f$ there is exactly one $j \in [1, c_i]$ such that $b_{i,j} = 1$. All possible values of this bit string therefore correspond one to one to the elements of the subdomain Cartesian product $Des_{1 \ldots f}$, so the bit string of any tuple has
$$B = \prod_{i=1}^{f} c_i$$
possible values. Different tuples in the same data table share the same table schema, and the table schema determines the attributes of the table. Given the attribute columns and the subdomain division on them, all possible values of the bit string of a tuple are sorted in ascending order. Each possible value of the bit string then corresponds one to one to a rank value $r \in [1, B]$. In this way the values of any tuple on the indexed data fields correspond to a unique rank value r. This completes the global ordering of the index values of all tuples.
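The ordering can also be written in closed form. The formula below is a derivation added for illustration and is not stated in the patent text. It assumes that bit strings are compared as binary numbers and that subdomain 1 of each attribute occupies the leftmost bit of that attribute's segment, as in the employee example of this description. The symbol $j_i$ is introduced here for the index of the subdomain of $A_i$ that contains the tuple's value.

```latex
r \;=\; 1 + \sum_{i=1}^{f} \left(c_i - j_i\right) \prod_{k=i+1}^{f} c_k ,
\qquad 1 \le j_i \le c_i .

% Worked check with c_1 = 2, c_2 = 3 (so B = 6):
% (j_1, j_2) = (1, 2):  r = 1 + (2-1)\cdot 3 + (3-2)\cdot 1 = 5
% (j_1, j_2) = (2, 3):  r = 1 + (2-2)\cdot 3 + (3-3)\cdot 1 = 1
```

Under these assumptions a larger subdomain index places the set bit further to the right, which makes the binary value smaller, so the rank is a mixed-radix number over the digits $c_i - j_i$.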
The following example builds a composite index on two attributes of a company's employee information table and shows how the bit string is generated from a tuple and how the corresponding rank value is then computed.
Example 1: Consider the employee information table of a company. Attribute $A_1$ indicates the employee's sex and takes two values, male and female; attribute $A_2$ records the employee's salary, an integer in the range [0, 3000]. First, the value domain of $A_1$ is divided into two subdomains, containing only the value male and only the value female respectively; the value domain of $A_2$ is divided into three subdomains: [0, 1000], (1000, 2000] and (2000, 3000]. Consider employee 1, who is male with a salary of 1300: the bit string of employee 1 is '10010'. The first two bits '10' indicate that the value on attribute $A_1$ is male, and the last three bits '010' indicate that the salary lies in (1000, 2000]. Next consider employee 2, who is female with a salary of 2600: the bit string of this employee is '01001'. The first two bits '01' indicate that the value on attribute $A_1$ is female, and the last three bits '001' indicate that the salary lies in (2000, 3000]. From this division of the attribute value domains, the size of the Cartesian product of attribute subdomains is B = 2 × 3 = 6. The bit string of any tuple has 6 possible values; sorted in ascending order they are: '01001', '01010', '01100', '10001', '10010' and '10100'. By comparison, the bit string of employee 1 is 5th and the bit string of employee 2 is 1st. The rank values of the bit strings of employee 1 and employee 2 are therefore 5 and 1 respectively.
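Example 1 can be reproduced with a short sketch. The helper names (`one_hot`, `bit_string`, `global_ordering`) are hypothetical; the subdomain boundaries are those of the example, and the ordering is obtained by plain enumeration and sorting rather than by any method the patent prescribes.

```python
from bisect import bisect_left
from itertools import product

# Subdomains from Example 1. Salary bands are (low, high]; the first band also contains 0.
SEX_SUBDOMAINS = ["male", "female"]
SALARY_UPPER_BOUNDS = [1000, 2000, 3000]

def one_hot(index, width):
    """Return a width-character string with a single '1' at position index (0-based, from the left)."""
    return "0" * index + "1" + "0" * (width - index - 1)

def bit_string(sex, salary):
    """Concatenate the one-hot segments of both attributes."""
    sex_part = one_hot(SEX_SUBDOMAINS.index(sex), len(SEX_SUBDOMAINS))
    band = bisect_left(SALARY_UPPER_BOUNDS, salary)  # index of the first upper bound >= salary
    return sex_part + one_hot(band, len(SALARY_UPPER_BOUNDS))

def global_ordering():
    """Enumerate every possible bit string, sort ascending, and number the strings from 1."""
    segments = [
        [one_hot(j, len(SEX_SUBDOMAINS)) for j in range(len(SEX_SUBDOMAINS))],
        [one_hot(j, len(SALARY_UPPER_BOUNDS)) for j in range(len(SALARY_UPPER_BOUNDS))],
    ]
    # Equal-length strings of 0/1 sort lexicographically in the same order as binary numbers.
    strings = sorted("".join(parts) for parts in product(*segments))
    return {bits: rank for rank, bits in enumerate(strings, start=1)}

RANK = global_ordering()
print(bit_string("male", 1300), RANK[bit_string("male", 1300)])      # 10010 5
print(bit_string("female", 2600), RANK[bit_string("female", 2600)])  # 01001 1
```

Enumerating all B strings is acceptable here because B is tiny; for many attributes a direct computation from the subdomain indices would avoid materializing the table.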
Next, an indication bitmap is generated on each data node. The indication bitmap is a bit string whose length is the size B of the subdomain Cartesian product. If the data node holds a tuple with rank value r, bit r of the indication bitmap on that node is 1; if the data node holds no tuple with rank value r (only tuples whose rank value is greater or less than r), bit r of the indication bitmap is 0. The indication bitmap reflects how the local data is distributed over the whole value domain of the attributes. It is kept in the main memory of the data node.
Example 2: The assumptions about the employee information table are the same as in Example 1. As shown in Fig. 5, suppose the table has 7 records on a certain data node. Since B = 6, the indication bitmap on this data node has length 6. Since this data node holds only tuples with rank values 1, 3, 4 and 5, its indication bitmap is 101110.
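A minimal sketch of how such an indication bitmap could be derived from the rank values present on a node follows. The function name is hypothetical, and the per-record rank list is invented for illustration: the text only states that the 7 records cover the rank values 1, 3, 4 and 5.

```python
B = 6  # size of the subdomain Cartesian product in Example 1

def indication_bitmap(local_ranks, size):
    """Return a string of `size` bits; bit r (1-based, counted from the left) is '1' if rank r occurs locally."""
    present = set(local_ranks)
    return "".join("1" if r in present else "0" for r in range(1, size + 1))

# Hypothetical rank column for the 7 local records; only the set {1, 3, 4, 5} is given in the text.
local_ranks = [5, 1, 3, 4, 5, 1, 3]
print(indication_bitmap(local_ranks, B))  # 101110
```

The bitmap depends only on which rank values occur, not on how many tuples share them, so it stays B bits long however large the node's data grows.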
The indication bitmap records which attribute values exist locally. When a query request reaches a distributed data node, the node first compares the request against the indication bitmap to determine whether it holds any target tuples; if not, it directly returns an empty result and does not execute the retrieval task. The comparison against the indication bitmap is performed as a bitwise logical AND of bit strings.
Example 3: The assumptions about the employee information table are the same as in Example 1. As shown in Fig. 5, suppose the table has 7 records on a certain data node. From Example 2, its indication bitmap is 101110. Suppose a user issues a query for employees who are female and whose salary lies in (1000, 2000]. The bit string of the target records is then 01010, whose rank value is 2. A bit string 010000, in which only bit 2 is set, is therefore constructed and combined with the indication bitmap 101110 by bitwise AND: 010000 & 101110 = 000000. The result is all zeros, so no target tuple exists on this node, and the empty set can be returned directly as the query result of this node.
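The pruning test of Example 3 can be sketched as follows. Bitmaps are held as Python integers with rank 1 as the most significant of B bits, which is an encoding chosen here for illustration; the function names are hypothetical.

```python
def bitmap_from_ranks(ranks, size):
    """Encode a set of 1-based rank values as an int; rank 1 is the most significant of `size` bits."""
    value = 0
    for r in ranks:
        value |= 1 << (size - r)
    return value

def node_may_hold_targets(eb, cb):
    """A node needs to run retrieval only if the bitwise AND of eb and cb is non-zero."""
    return (eb & cb) != 0

B = 6
eb = int("101110", 2)           # indication bitmap from Example 2
cb = bitmap_from_ranks({2}, B)  # query for rank 2 (female, salary in (1000, 2000])

print(format(cb, "06b"))               # 010000
print(node_may_hold_targets(eb, cb))   # False
print(node_may_hold_targets(eb, bitmap_from_ranks({5}, B)))  # True
```

The check costs one AND on B bits per node, regardless of how many tuples the node stores.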
Finally, each data node builds a bitmap index over the data it manages locally.
Like the distributed scheme described above, this method has each data node manage the index on its local data independently, in order to increase parallelism during retrieval. The following steps are executed in parallel on each data node:
1. Build a B+ tree over the global rank values of the bit strings of the tuples present on this node; each leaf node of the tree corresponds to one rank value. The B+ tree is constructed as described in Goetz Graefe and Harumi A. Kuno, Modern B-Tree Techniques (in ICDE, pages 1370-1373, 2011).
2. Attach to each leaf node of the B+ tree, as the data of that leaf node, a bitmap whose length is the total number of tuples managed by this data node. In this bitmap, the bits corresponding to tuples with this rank value are set to 1 and all other bits are set to 0. This bitmap is called a tuple bitmap. Clearly, the B+ tree on a single data node has at most B leaf nodes and therefore at most B tuple bitmaps.
Continuing with the assumptions of Example 1, the following example shows how the local bitmap index is generated on a data node.
Example 4: The assumptions about the employee information table are the same as in Example 1. As shown in Fig. 5, suppose the table has 7 records on a certain data node. The bit string (Bit String) and rank value (Rank) are not attributes of the original table; they are the bit string generated from the values of the two attributes sex and salary, and the rank value corresponding to it. The B+ tree built from the rank values of the records is shown in Fig. 6 (each node of this B+ tree has at most 3 child nodes).
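The shape of the local index can be sketched without a full B+ tree. The class below is an illustration only: a sorted key list with binary search stands in for the B+ tree of the text, the class and method names are hypothetical, and the rank column is invented, since the actual records of Fig. 5 are not reproduced in this text.

```python
from bisect import bisect_left

class LocalBitmapIndex:
    """Sorted rank keys with one tuple bitmap per key (a sorted list stands in for the B+ tree)."""

    def __init__(self, ranks_by_position):
        self.tuple_count = len(ranks_by_position)
        bitmaps = {}
        for position, rank in enumerate(ranks_by_position):
            # Bit `position` of the tuple bitmap for `rank` marks the tuple stored at that position.
            bitmaps[rank] = bitmaps.get(rank, 0) | (1 << position)
        self.keys = sorted(bitmaps)
        self.bitmaps = [bitmaps[k] for k in self.keys]

    def lookup(self, rank):
        """Return the tuple bitmap for `rank`, or 0 if no local tuple has that rank."""
        i = bisect_left(self.keys, rank)
        if i < len(self.keys) and self.keys[i] == rank:
            return self.bitmaps[i]
        return 0

    def positions(self, bitmap):
        """List the local tuple positions whose bit is set."""
        return [p for p in range(self.tuple_count) if (bitmap >> p) & 1]

# Hypothetical rank values of 7 local records, covering ranks 1, 3, 4 and 5.
index = LocalBitmapIndex([5, 1, 3, 4, 5, 1, 3])
print(index.keys)                        # [1, 3, 4, 5]
print(index.positions(index.lookup(5)))  # [0, 4]
print(index.lookup(2))                   # 0
```

With at most B keys per node, the choice of search structure matters less than the fact that every key leads to a bitmap over the node's own tuples.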
With reference to Fig. 7, the steps for executing a retrieval task in a distributed manner for a given query condition are described in detail below:
First, the master node generates a condition bitmap from the query condition and distributes it to each data node. The condition bitmap is the bitwise combination of the bit strings of the elements of the attribute Cartesian product that satisfy the condition; in the examples here this amounts to a bitwise OR of those bit strings. For example, a query for female employees with a salary between 1500 and 1800 yields the condition bitmap 01010; similarly, the query condition for male employees with a salary below 1300 is converted into the condition bitmap 10110. Note that the generated condition bit string must cover every possibility included in the query condition. In addition, if the query is a compound of multiple query conditions on attributes associated with several index structures, an independent condition bitmap is generated for each condition.
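The two condition bitmaps above can be reproduced with a small sketch. It assumes that a bit is set for every subdomain the condition can touch, which yields the same strings as a bitwise OR over the qualifying element bit strings; that reading is inferred from the examples. The function names and the representation of a condition as a set of sexes plus a closed integer salary range are hypothetical.

```python
SEX_SUBDOMAINS = ["male", "female"]
# Salary bands as (low, high, low_inclusive): [0, 1000], (1000, 2000], (2000, 3000]
SALARY_BANDS = [(0, 1000, True), (1000, 2000, False), (2000, 3000, False)]

def overlaps(band, lo, hi):
    """True if the closed query range [lo, hi] intersects the band."""
    band_lo, band_hi, lo_inclusive = band
    if hi < band_lo or lo > band_hi:
        return False
    if not lo_inclusive and hi <= band_lo:
        return False  # the query range ends exactly on an excluded lower bound
    return True

def condition_bitmap(sexes, salary_lo, salary_hi):
    """Set one bit per subdomain that the condition can touch, attribute by attribute."""
    sex_bits = "".join("1" if s in sexes else "0" for s in SEX_SUBDOMAINS)
    salary_bits = "".join(
        "1" if overlaps(band, salary_lo, salary_hi) else "0" for band in SALARY_BANDS
    )
    return sex_bits + salary_bits

print(condition_bitmap({"female"}, 1500, 1800))  # 01010
print(condition_bitmap({"male"}, 0, 1299))       # 10110 (salary below 1300, integer salaries)
```

The second bitmap covers two salary subdomains although only part of (1000, 2000] qualifies, which is why the retrieved tuples still have to be checked against the original condition.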
Then, each data node executes the retrieval task in parallel. The case of a single query condition and the case of a compound query condition composed of multiple query conditions are described separately below:
1. The case of a single query condition:
A) Split the condition bitmap into the bit strings of the corresponding elements of the attribute Cartesian product, and convert each resulting bit string into its rank value. These rank values form the set of target rank values.
B) Generate an all-zero bit string cb of length B, and set to 1 the bits whose positions belong to the set of target rank values.
C) Check whether the result of the bitwise AND eb & cb is 0, where eb is the indication bitmap on this data node.
D) If it is 0, directly return the empty set as the result on this data node.
E) Otherwise, search the local B+ tree of this data node for the corresponding leaf nodes and the tuple bitmaps attached to them, and check one by one whether the tuples corresponding to bits set to 1 in the tuple bitmaps satisfy the condition. Finally, return all results that satisfy the condition to the master node as the query result.
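Steps A) to E) can be sketched end to end for one data node. Everything named here is an assumption made for illustration: the rows and their rank values are invented to be consistent with Examples 1 and 2, a dictionary replaces the B+ tree, and the condition is passed both as a condition bitmap and as a Python predicate for the final check.

```python
from itertools import product

WIDTHS = [2, 3]  # c_1 = 2 sex subdomains, c_2 = 3 salary subdomains (Example 1)

def one_hot_strings(width):
    return ["0" * j + "1" + "0" * (width - j - 1) for j in range(width)]

ORDERED = sorted("".join(p) for p in product(*(one_hot_strings(w) for w in WIDTHS)))
RANK = {bits: r for r, bits in enumerate(ORDERED, start=1)}
B = len(ORDERED)

def target_ranks(condition_bitmap):
    """Step A: keep the element bit strings whose set bits are all covered by the condition bitmap."""
    cond = int(condition_bitmap, 2)
    return {RANK[s] for s in ORDERED if (int(s, 2) & cond) == int(s, 2)}

def rank_bit(rank):
    """Bit for a rank value inside a B-bit bitmap; rank 1 is the leftmost bit."""
    return 1 << (B - rank)

def run_on_node(rows, ranks, condition_bitmap, predicate):
    tuple_bitmaps = {}
    for position, rank in enumerate(ranks):
        tuple_bitmaps[rank] = tuple_bitmaps.get(rank, 0) | (1 << position)
    eb = sum(rank_bit(r) for r in tuple_bitmaps)   # indication bitmap of this node
    targets = target_ranks(condition_bitmap)       # step A
    cb = sum(rank_bit(r) for r in targets)         # step B
    if (eb & cb) == 0:                             # steps C and D
        return []
    result = []
    for rank in sorted(targets):                   # step E
        bitmap = tuple_bitmaps.get(rank, 0)
        for position, row in enumerate(rows):
            if (bitmap >> position) & 1 and predicate(row):
                result.append(row)
    return result

# Hypothetical local rows (name, sex, salary) with their rank values.
rows = [("e1", "male", 1300), ("e2", "female", 2600), ("e3", "female", 500),
        ("e4", "male", 2500), ("e5", "male", 1100), ("e6", "female", 2100),
        ("e7", "female", 900)]
ranks = [5, 1, 3, 4, 5, 1, 3]

print(run_on_node(rows, ranks, "10110", lambda r: r[1] == "male" and r[2] < 1300))
# [('e5', 'male', 1100)]
print(run_on_node(rows, ranks, "01010", lambda r: r[1] == "female" and 1000 < r[2] <= 2000))
# []
```

In the first query, e1 shares rank 5 with e5 but has a salary of exactly 1300, so only the per-tuple check in step E excludes it. The second query is rejected by the indication bitmap before any tuple is examined.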
2. The case of a compound query condition composed of multiple query conditions:
A) For each individual query condition, perform the following steps in order:
I. Each data node converts the condition bit string into the range of corresponding rank values.
II. It then generates an all-zero bit string cb of length B and sets to 1 the bits whose positions lie within the range of rank values.
III. It checks whether the result of the bitwise AND eb & cb is 0.
IV. If it is 0, the data node directly generates an all-zero bitmap, whose length equals the number of local tuples it manages, as the query result of this query condition.
V. Otherwise, it searches its local B+ tree for the corresponding leaf nodes and takes the tuple bitmaps attached to them as the query result of this query condition.
B) Combine the query results of the individual conditions from step A) with the bitwise logical operations that correspond to how the conditions are combined in the original query expression, and check one by one whether the tuples corresponding to bits set to 1 in the result satisfy the condition.
C) Finally, return all results that satisfy the condition to the master node as the query result on this data node.
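The compound case can be sketched for a conjunction of two conditions on one data node. The sketch makes several assumptions that go beyond the text: a second index on a hypothetical department attribute with two subdomains ('10' for sales with rank 2, '01' for ops with rank 1), invented rows, dictionaries in place of B+ trees, and target rank sets supplied directly instead of being derived from condition bitmaps.

```python
def build_tuple_bitmaps(ranks):
    """Map each rank value to an int bitmap over local tuple positions."""
    bitmaps = {}
    for position, rank in enumerate(ranks):
        bitmaps[rank] = bitmaps.get(rank, 0) | (1 << position)
    return bitmaps

def condition_result(tuple_bitmaps, size, targets):
    """Steps I to V for one condition: a tuple bitmap, or 0 (all zeros) when the node is pruned."""
    eb = sum(1 << (size - r) for r in tuple_bitmaps)  # indication bitmap for this index
    cb = sum(1 << (size - r) for r in targets)
    if (eb & cb) == 0:
        return 0
    merged = 0
    for r in targets:
        merged |= tuple_bitmaps.get(r, 0)
    return merged

# Hypothetical local rows: (name, sex, salary, department).
rows = [("e1", "male", 1300, "sales"), ("e2", "female", 2600, "ops"),
        ("e3", "female", 500, "sales"), ("e4", "male", 2500, "ops"),
        ("e5", "male", 1100, "sales")]

sex_salary_index = build_tuple_bitmaps([5, 1, 3, 4, 5])  # ranks under the (sex, salary) index, B = 6
department_index = build_tuple_bitmaps([2, 1, 2, 1, 2])  # ranks under the department index, B = 2

# "male with salary below 1300" covers ranks {5, 6}; "department = sales" covers rank {2}.
first = condition_result(sex_salary_index, 6, {5, 6})
second = condition_result(department_index, 2, {2})
combined = first & second  # the conditions are joined by AND in the query

matches = [row for p, row in enumerate(rows)
           if (combined >> p) & 1
           and row[1] == "male" and row[2] < 1300 and row[3] == "sales"]
print(format(combined, "05b"))  # 10001
print(matches)                  # [('e5', 'male', 1100, 'sales')]
```

Returning an all-zero bitmap for a pruned condition, instead of an empty set, keeps every per-condition result the same length so that the bitwise combination in step B) still applies.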
Finally, the master node collects the query results returned by the data nodes and takes their union as the final result.
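The fan-out and union performed by the master node can be sketched with threads standing in for remote data nodes. This is a simplification under stated assumptions: the nodes are in-process functions rather than machines, the master passes the target rank set along with cb instead of only a condition bitmap, and all names and rows are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def make_node(rows, ranks, size):
    """Build a simulated data node holding `rows` with the given rank values."""
    eb = 0
    for r in ranks:
        eb |= 1 << (size - r)

    def search(cb, targets, predicate):
        if (eb & cb) == 0:  # indication bitmap says no target tuple can be here
            return set()
        return {row for row, r in zip(rows, ranks) if r in targets and predicate(row)}

    return search

def master_query(nodes, targets, size, predicate):
    """Send the query to every node concurrently and return the union of the partial results."""
    cb = 0
    for r in targets:
        cb |= 1 << (size - r)
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(lambda node: node(cb, targets, predicate), nodes))
    return set().union(*partials)

B = 6
nodes = [
    make_node([("e1", "male", 1300), ("e2", "female", 2600), ("e3", "female", 500)], [5, 1, 3], B),
    make_node([("e4", "female", 1500), ("e5", "female", 1900)], [2, 2], B),
    make_node([("e6", "male", 1100), ("e7", "male", 400)], [5, 6], B),
]

# Male employees with salary below 1300 correspond to rank values 5 and 6 in Example 1.
result = master_query(nodes, {5, 6}, B, lambda row: row[1] == "male" and row[2] < 1300)
print(sorted(result))  # [('e6', 'male', 1100), ('e7', 'male', 400)]
```

The second node is skipped by its indication bitmap, the first node runs retrieval but finds no qualifying tuple, and the third node contributes both matches.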

Claims (6)

1. the querying method based on burst bitmap index in cloud environment, its step comprises:
1) set up burst bitmap index,
1.1) property value of each tuple in cloud environment is carried out to codomain division, generate the overall sequencing table of property value, described overall sequencing table is to the rule compositor of setting for tuple; With long, be
Figure FDA0000441407450000011
each tuple of the bits string representation of ci, wherein, the codomain of the attribute i of tuple is cut into c iindividual subdomain, f is the number that participates in the attribute of cutting, 1≤i≤f; Described c iindividual subdomain forms set c i, and use cartesian product Des 1...f=C 1* C 2* C 3* ... * C frepresent, the size of described cartesian product is:
Figure FDA0000441407450000012
1.2) according to codomain division result, on each distributed data node, set up indication bitmap, described indicating bit figure records local attribute's value storage condition;
1.3) according to cloud environment framework, on each distributed data node, set up local bitmap index, complete the establishment of burst bitmap index;
2) input inquiry condition, host node is according to querying condition set up the condition bitmap, described condition bitmap be the corresponding bit string of qualified element in tuple attributes bit string cartesian product logic step-by-step with; And being distributed to each back end, the institute that described condition bitmap covering querying condition comprises is likely; The concurrent execution retrieval tasks of each back end, host node is collected the Query Result of each back end, and to user, returns to the union of Query Result on each back end;
When querying condition is single query condition,
2.1) each computing node is split as condition bitmap respectively the corresponding bit string of element in attribute cartesian product, and the bit string that fractionation is obtained is converted to corresponding ranking value and sets up a goal ordering value set;
2.2) generate length and equal the full 0 bit string cb of B, and be 1 by the position belonging in goal ordering value set;
2.3) whether the result of calculation of inspection logic step-by-step and eb & cb is 0, and wherein eb represents the indication bitmap on this computing node;
2.4) if 0 on this computing node, directly return to empty set as result of calculation;
2.5) otherwise, searching the local B+ tree of this computing node to find the corresponding leaf nodes and the tuple bitmaps attached to them, and checking one by one whether the tuples corresponding to the positions set to 1 in the tuple bitmaps satisfy the condition.
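The node-side steps 2.2) to 2.5) can be illustrated with a minimal Python sketch. It is not the patented implementation: bitmaps are modelled as Python integers, a plain dictionary stands in for the local B+ tree, and the names `node_query`, `local_index`, `target_ranks` and `predicate` are hypothetical. Step 2.1) is assumed to have already produced the target ranking value set.

```python
def node_query(eb, target_ranks, local_index, tuples, predicate, B):
    """Run steps 2.2) to 2.5) on one computing node (illustrative sketch)."""
    # 2.2) start from an all-zero bit string cb of length B and set the
    # positions that belong to the target ranking value set to 1.
    cb = 0
    for rank in target_ranks:
        if not 0 <= rank < B:
            raise ValueError(f"ranking value {rank} is outside 0..{B - 1}")
        cb |= 1 << rank
    # 2.3) and 2.4) if eb & cb is 0, this node stores no candidate tuples.
    if (eb & cb) == 0:
        return []
    # 2.5) visit the tuple bitmap attached to each target ranking value and
    # check every tuple whose position is set to 1 against the condition.
    results = []
    for rank in sorted(target_ranks):
        tuple_bitmap = local_index.get(rank, 0)  # dict used in place of a B+ tree
        position = 0
        while tuple_bitmap:
            if tuple_bitmap & 1 and predicate(tuples[position]):
                results.append(tuples[position])
            tuple_bitmap >>= 1
            position += 1
    return results


# Hypothetical data: four attribute values stored on one node, with the
# value range cut into three subdomains of width 10 (so B = 3).
tuples = [5, 17, 23, 8]
local_index = {0: 0b1001, 1: 0b0010, 2: 0b0100}  # ranking value -> tuple bitmap
eb = 0b111                                        # all three subdomains present
hits = node_query(eb, {1, 2}, local_index, tuples, lambda v: 15 <= v < 30, B=3)
```

The final per-tuple check is kept because a subdomain is coarser than the query condition, so a set bit only marks a candidate.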
2. The query method based on regional bitmap indexes in a cloud environment as claimed in claim 1, characterized in that the bit strings are globally sorted, each resulting ranking value corresponds uniquely to the value of any tuple in the queried field, and all possible values of the bit string corresponding to a tuple are sorted in ascending order.
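One way to picture such a global ranking is the following Python sketch. It assumes that each element of the Cartesian product is written as a tuple of subdomain indices (one index per attribute) and that "ascending order" means lexicographic order over those indices; both are assumptions for illustration, and the function names are hypothetical.

```python
from itertools import product
from math import prod


def global_ranking(subdomain_counts):
    """Assign ranking values 0..B-1 to all elements of C_1 x ... x C_f."""
    elements = product(*(range(c) for c in subdomain_counts))
    return {element: rank for rank, element in enumerate(elements)}


def rank_of(element, subdomain_counts):
    """Compute the same ranking value directly, as a mixed-radix number."""
    rank = 0
    for index, count in zip(element, subdomain_counts):
        rank = rank * count + index
    return rank


counts = (2, 3)               # hypothetical: c_1 = 2 and c_2 = 3 subdomains
table = global_ranking(counts)
B = prod(counts)              # size of the Cartesian product
```

Because every element receives a distinct ranking value in `0..B-1`, a ranking value can be used directly as a bit position in a bitmap of length B.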
3. The query method based on regional bitmap indexes in a cloud environment as claimed in claim 1, characterized in that the length of the indicating bitmap equals the number of subdomains into which the tuple attribute value range is divided, which is the same as the size B of the Cartesian product.
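As a hedged illustration of an indicating bitmap of length B, the sketch below sets one bit per ranking value that occurs on a node. Representing the bitmap as a Python integer and printing it with position 0 on the left are choices made for this example only; `build_indicating_bitmap` and `to_bit_string` are hypothetical helper names.

```python
def build_indicating_bitmap(local_ranks, B):
    """Set bit r for every ranking value r stored on this data node."""
    eb = 0
    for rank in local_ranks:
        if not 0 <= rank < B:
            raise ValueError(f"ranking value {rank} is outside 0..{B - 1}")
        eb |= 1 << rank
    return eb


def to_bit_string(bitmap, B):
    """Render the bitmap as a string of exactly B characters, position 0 first."""
    return format(bitmap, f"0{B}b")[::-1]


# Hypothetical node whose tuples fall into subdomains 0, 2 and 5 out of B = 6.
eb = build_indicating_bitmap([0, 2, 2, 5], B=6)
text = to_bit_string(eb, B=6)
```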
4. The query method based on regional bitmap indexes in a cloud environment as claimed in claim 1, characterized in that the method of establishing the local bitmap index in step 1.3) is: building a B+ tree over the global ranking values of the bit strings corresponding to the tuples present on this node, each key in a leaf node of the tree corresponding to one ranking value; and attaching to each key in the leaf nodes of the B+ tree a bitmap whose length equals the total number of tuples managed by this data node, as the tuple bitmap of the corresponding ranking value.
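The structure of such a local index can be sketched as follows. The Python standard library has no B+ tree, so this example substitutes a sorted key list searched with `bisect`; it only mirrors the key-to-tuple-bitmap mapping of the leaf level, not the tree itself. The class name `LocalBitmapIndex` and its methods are hypothetical.

```python
from bisect import bisect_left


class LocalBitmapIndex:
    """Sorted ranking values, each with a tuple bitmap (B+ tree stand-in)."""

    def __init__(self, tuple_ranks):
        # tuple_ranks[p] is the global ranking value of the tuple at position p.
        self.tuple_count = len(tuple_ranks)
        bitmaps = {}
        for position, rank in enumerate(tuple_ranks):
            bitmaps[rank] = bitmaps.get(rank, 0) | (1 << position)
        self.keys = sorted(bitmaps)                    # leaf-level keys
        self.bitmaps = [bitmaps[k] for k in self.keys]

    def lookup(self, rank):
        """Return the tuple bitmap for a ranking value, or 0 if absent."""
        i = bisect_left(self.keys, rank)
        if i < len(self.keys) and self.keys[i] == rank:
            return self.bitmaps[i]
        return 0

    def positions(self, rank):
        """List the tuple positions whose bit is set for this ranking value."""
        bitmap = self.lookup(rank)
        return [p for p in range(self.tuple_count) if (bitmap >> p) & 1]


# Hypothetical node with four tuples whose ranking values are 2, 0, 2 and 5.
index = LocalBitmapIndex([2, 0, 2, 5])
```

Each tuple bitmap conceptually has one bit per tuple managed by the node, which is why `positions` scans exactly `tuple_count` bit positions.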
5. The query method based on distributed bitmap indexes in a cloud environment as claimed in claim 1, characterized in that, when the query condition comprises multiple query conditions:
1) executing the query for each condition as in the single-query-condition case;
2) performing the corresponding bitwise logical operations on the tuple bitmaps retrieved in step 1) for each query condition, according to the way the conditions are combined in the original query expression, and checking one by one whether the tuples corresponding to the positions set to 1 in the computed result satisfy the condition;
3) finally, returning to the main node all results that satisfy the condition as the query result.
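The combination step for multiple query conditions can be sketched in Python as below. It assumes that each condition has already produced one tuple bitmap (as an integer) on the node, and that the caller supplies the bitwise operation matching the original query expression; `combine_and_check` and the sample data are hypothetical.

```python
from operator import and_, or_


def combine_and_check(tuples, condition_bitmaps, combine, predicate):
    """Combine per-condition tuple bitmaps, then verify each candidate tuple."""
    candidate = combine(*condition_bitmaps)
    return [
        t for position, t in enumerate(tuples)
        if (candidate >> position) & 1 and predicate(t)
    ]


# Hypothetical tuples of (value, label) stored on one node.
tuples = [(5, "a"), (17, "b"), (23, "a"), (8, "b")]
value_bitmap = 0b0110   # positions 1 and 2 are candidates for value >= 10
label_bitmap = 0b0101   # positions 0 and 2 are candidates for label == "a"

# value >= 10 AND label == "a" maps to a bitwise AND of the two bitmaps.
both = combine_and_check(
    tuples, [value_bitmap, label_bitmap], and_,
    lambda t: t[0] >= 10 and t[1] == "a",
)
# value >= 10 OR label == "a" maps to a bitwise OR of the two bitmaps.
either = combine_and_check(
    tuples, [value_bitmap, label_bitmap], or_,
    lambda t: t[0] >= 10 or t[1] == "a",
)
```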
6. The query method based on regional bitmap indexes in a cloud environment as claimed in claim 1, characterized in that, when a query request arrives at a distributed data node, the indicating bitmap is compared to determine whether this data node contains target tuples, and if it does not, a null value is returned directly as the query result of this data node without executing the retrieval task.
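A rough sketch of how a main node might dispatch a condition bitmap and merge the results is shown below, with threads standing in for data nodes. The dictionary layout of a node, the function names, and the plain scan used in place of the local index lookup are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def query_node(node, cb, predicate):
    """Return matching tuples of one data node, skipping it when eb & cb is 0."""
    if (node["eb"] & cb) == 0:
        return set()          # node holds no target tuples, no retrieval needed
    # Plain scan used here as a stand-in for the local bitmap index lookup.
    return {t for t in node["tuples"] if predicate(t)}


def main_node_query(nodes, cb, predicate):
    """Run the retrieval on all data nodes concurrently and return the union."""
    with ThreadPoolExecutor(max_workers=len(nodes) or 1) as pool:
        partial_results = pool.map(lambda n: query_node(n, cb, predicate), nodes)
        return set().union(*partial_results)


# Hypothetical cluster of two data nodes with B = 3 subdomains of width 10.
nodes = [
    {"eb": 0b001, "tuples": [5, 8]},     # only subdomain 0 is stored here
    {"eb": 0b110, "tuples": [17, 23]},   # subdomains 1 and 2 are stored here
]
answer = main_node_query(nodes, cb=0b110, predicate=lambda v: 15 <= v < 30)
```

In this example the first node is pruned by the indicating bitmap comparison alone, so only the second node scans its tuples.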
CN201210155253.7A 2012-05-17 2012-05-17 Query method based on regional bitmap indexes in cloud environment Expired - Fee Related CN102722531B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210155253.7A CN102722531B (en) 2012-05-17 2012-05-17 Query method based on regional bitmap indexes in cloud environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210155253.7A CN102722531B (en) 2012-05-17 2012-05-17 Query method based on regional bitmap indexes in cloud environment

Publications (2)

Publication Number Publication Date
CN102722531A (en) 2012-10-10
CN102722531B (en) 2014-04-16

Family

ID=46948292

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210155253.7A Expired - Fee Related CN102722531B (en) 2012-05-17 2012-05-17 Query method based on regional bitmap indexes in cloud environment

Country Status (1)

Country Link
CN (1) CN102722531B (en)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102968309B (en) * 2012-11-30 2016-01-20 亚信科技(中国)有限公司 Rule matching method and device for implementing a rule-based engine
WO2014086019A1 (en) * 2012-12-06 2014-06-12 Empire Technology Development Llc Decentralizing a hadoop cluster
CN103310023A (en) * 2013-07-05 2013-09-18 深圳中兴网信科技有限公司 Distributed searching system and method
CN105573824B (en) * 2014-10-10 2020-04-03 腾讯科技(深圳)有限公司 Monitoring method and system for distributed computing system
CN106250565B (en) * 2016-08-30 2019-05-07 福建天晴数码有限公司 Query method and system based on a sharded relational database
CN107704527B (en) * 2017-09-18 2020-05-08 华为技术有限公司 Data storage method, device and storage medium
CN110019204A (en) * 2017-10-27 2019-07-16 航天信息股份有限公司 HDFS-oriented intra-split indexing method and apparatus
CN109960944A (en) * 2017-12-14 2019-07-02 中兴通讯股份有限公司 Data desensitization method, server, terminal and computer readable storage medium
CN109086344A (en) * 2018-07-12 2018-12-25 广州市闲愉凡生信息科技有限公司 Full-text retrieval method for cloud computing platform
CN109165262B (en) * 2018-10-16 2022-05-10 成都索贝数码科技股份有限公司 Fragmentation clustering system and fragmentation method of relational large table
CN109960695B (en) * 2019-04-09 2020-03-13 苏州浪潮智能科技有限公司 Management method and device for database in cloud computing system
CN110968762B (en) * 2019-12-05 2023-07-18 北京天融信网络安全技术有限公司 Adjustment method and device for retrieval
CN111737264A (en) * 2020-07-20 2020-10-02 智者四海(北京)技术有限公司 Information processing method and system
CN112765171B (en) * 2021-01-12 2023-05-23 湖北宸威玺链信息技术有限公司 Optimization algorithm for multi-field combined index fetch of block chain data uplink
CN112783835B (en) * 2021-03-11 2024-06-04 百果园技术(新加坡)有限公司 Index management method and device and electronic equipment
CN116701386A (en) * 2022-02-28 2023-09-05 华为技术有限公司 Key value pair retrieval method, device and storage medium
CN117555906B (en) * 2024-01-12 2024-04-05 腾讯科技(深圳)有限公司 Data processing method, device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN102722531A (en) 2012-10-10

Similar Documents

Publication Publication Date Title
CN102722531B (en) Query method based on regional bitmap indexes in cloud environment
CN104881424B (en) A kind of acquisition of electric power big data, storage and analysis method based on regular expression
CN106095862B (en) Storage method of centralized extensible fusion type multi-dimensional complex structure relation data
Liang et al. Express supervision system based on NodeJS and MongoDB
CN102867066B (en) Data Transform Device and data summarization method
US11093473B2 (en) Hierarchical tree data structures and uses thereof
CN105117442B (en) A kind of big data querying method based on probability
EP3678032A1 (en) Computer implemented methods and systems for improved data retrieval
CN102495834A (en) Incremental data cleaning method based on memory mapping
CN106484815B (en) A kind of automatic identification optimization method based on mass data class SQL retrieval scene
CN113535788A (en) Retrieval method, system, equipment and medium for marine environment data
CN103049555A (en) Dynamic hierarchical integrated data accessing method capable of guaranteeing semantic correctness
Ji et al. Scalable nearest neighbor query processing based on inverted grid index
CN104199924B (en) The method and device of network form of the selection with snapshot relation
CN107273443B (en) Mixed indexing method based on metadata of big data model
CN104794237A (en) Web page information processing method and device
CN106055690A (en) Method for carrying out rapid retrieval and acquiring data features on basis of attribute matching
Min et al. Data mining and economic forecasting in DW-based economical decision support system
US20230070159A1 (en) Database modification using a script component
Yadav et al. Wavelet tree based hybrid geo-textual indexing technique for geographical search
EP3364314B1 (en) Methods and systems for indexing using indexlets
Xu et al. What-if query processing policy for big data in OLAP system
Zeng et al. Efficient xml keyword search: from graph model to tree model
Lee MDH*: Multidimensional histograms for Linked Data queries
CN116303392B (en) Multi-source data table management method for real estate registration data

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20140416

Termination date: 20170517
