CN106503243B - Electric power big data querying method based on HBase secondary index - Google Patents

Info

Publication number
CN106503243B
Authority
CN
China
Prior art keywords
secondary index
data
index table
column
tables
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610980816.4A
Other languages
Chinese (zh)
Other versions
CN106503243A (en
Inventor
马艳
苏建军
张方正
李红梅
郭志红
陈玉峰
祝永新
盛戈皞
杨祎
许乃媛
沈宇蓝
王畅
刘斌
孙占睿
李程启
林颖
耿玉杰
白德盟
李华东
王勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
State Grid Corp of China SGCC
Electric Power Research Institute of State Grid Shandong Electric Power Co Ltd
Original Assignee
Shanghai Jiaotong University
State Grid Corp of China SGCC
Electric Power Research Institute of State Grid Shandong Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University, State Grid Corp of China SGCC, Electric Power Research Institute of State Grid Shandong Electric Power Co Ltd filed Critical Shanghai Jiaotong University
Priority to CN201610980816.4A priority Critical patent/CN106503243B/en
Publication of CN106503243A publication Critical patent/CN106503243A/en
Application granted granted Critical
Publication of CN106503243B publication Critical patent/CN106503243B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2272 Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24553 Query execution of query operations

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an electric power big data query method based on HBase secondary indexes. The method comprises: step (1): establishing secondary index tables; step (2): determining whether a data table has been updated and, if so, updating the corresponding secondary index table, otherwise leaving the secondary index table unchanged; step (3): querying data using the secondary index tables. The invention supports basic update operations and, for each specific business query, efficiently implements join queries and selection queries across data tables, thereby supporting complex business requirements.

Description

Electric power big data query method based on HBase secondary index
Technical field
The present invention relates to an electric power big data query method based on HBase secondary indexes.
Background
The safety of power transmission and transformation equipment is the basis of safe power grid operation. State data for this equipment originates from inspection, testing, live-line detection, online monitoring, grid operation, environmental and meteorological records, and equipment ledgers. It is scattered across different systems, large in volume, and complex in type. Designing an effective distributed storage model for power transmission and transformation equipment big data is the basis for evaluating equipment state comprehensively and accurately, and an important support for fully coupled analysis of power grid big data.
The HBase database running on the Hadoop platform is a highly reliable, high-performance, column-oriented, scalable distributed storage system. With HBase, a large-scale storage cluster can be built on low-cost servers, which meets the storage requirements of power grid big data. However, an HBase-based storage scheme does not by itself solve the problem of efficient retrieval. Faced with the complex and flexible query requirements of electric power big data, a single row key cannot satisfy business queries, so a big data retrieval method that meets these requirements is urgently needed.
[1] Power grid time-series big data storage method, CN 104239447 A, proposes a storage method for power grid time-series big data. It selects the open-source distributed columnar database HBase as the storage layer and uses the SG-CIM model to re-describe a batch of measuring points that are positionally correlated in business logic. By designing a suitable index organization for the measuring-point data storage table, and by using the partitioning and load-balancing functions of HBase, the historical data of a batch of measuring points that are correlated in business logic is stored at physically adjacent positions. This reduces disk seek time when the historical data of that batch is queried, improves query efficiency, and provides immediate query service for business applications.
[2] HBase secondary index method and device, CN 104112013 A, proposes building a secondary index for a user table on HBase. The index entries of the secondary index are sorted by the values associated with the row keys of the user table, so that the user table can be searched by value. Each user table corresponds to one secondary index table, and the user table is stored on the same region server as its secondary index table, which avoids cross-region index lookups.
Patents [1] and [2] differ from the present invention. [1] proposes a secondary index organization for data that is correlated in business logic; its core idea is to make logically related data physically adjacent in storage in order to improve query efficiency. [2] proposes an index generation scheme built on HBase; its core idea is that one data table corresponds to one index table, and that the data table and its index table are stored on the same server in order to improve query efficiency. The secondary index scheme proposed by the present invention is an electric power big data storage model based on HBase. Secondary index tables are first built on the relevant data columns according to the query business: a basic query business corresponds to one secondary index table, and a complex query business may correspond to several secondary index tables. At query time, the index table is queried first to obtain the row keys of the matching data, and the data table is then queried by those row keys to obtain the data. When an indexed column of a data table is updated, the corresponding secondary index table must be updated at the same time.
Summary of the invention
The purpose of the present invention is to solve the above problems by providing an electric power big data query method and system based on HBase secondary indexes. The invention supports basic update operations and, for each specific business query, efficiently implements join queries and selection queries across data tables, thereby supporting complex business requirements.
To achieve this purpose, the present invention adopts the following technical scheme:
An electric power big data query method based on HBase secondary indexes comprises the following steps:
Step (1): establish secondary index tables;
Step (2): determine whether a data table has been updated; if so, update the secondary index table; if not, leave the secondary index table unchanged;
Step (3): query data using the secondary index tables.
Establishing the secondary index tables in step (1) comprises the following steps:
Step (11): generate the secondary index tables according to the operation type;
Step (12): generate secondary index entries from the data columns and insert them into the secondary index tables.
Step (11) proceeds as follows:
Step (111): for selection queries, the M data columns involved in the selection query are stored in M secondary index tables, one per column, where M is greater than or equal to 1. The row key R of each secondary index table consists of three parts, in order: QUALIFIER, VALUE and ROWKEY, where QUALIFIER is the identifier of the data column in the data table, VALUE is the value of the data column in the data table, and ROWKEY is the row key of the data table;
Step (112): for join queries, the N data columns involved in the join query are stored in a single secondary index table, where N is greater than or equal to 2. The row key R of the secondary index table consists of three parts, in order: PREFIX, VALUE and QUALIFIER, where PREFIX is generated by a hash function and distinguishes the join query group, VALUE is the value of the data column in the data table, and QUALIFIER is the identifier of the data column in the data table;
Step (113): for both step (111) and step (112), the value stored in the data column of the secondary index table is the ROWKEY of the corresponding data table row. This value and the row key R of the secondary index table together form one entry of the secondary index table.
The secondary index table is created with HBase (specifying the table name of the secondary index table), and the association between each data column and its secondary index table is stored in a metadata table. The row key of the metadata table consists of, in order:
the table name of the data table, the column family name, the column name, the operation type of the secondary index table, and a timestamp.
The value corresponding to the row key of the metadata table is: the operation type of the secondary index table and the name of the secondary index table.
The operation types of a secondary index table are: selection query and join query.
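As an illustration of the three key layouts described in step (111), step (112) and the metadata table, the following Python sketch composes each key as a delimited byte string. The delimiter, the choice of SHA-256 truncated to 8 hex characters for PREFIX, the zero-padded value encoding and all function names are assumptions made for this example; the patent does not specify them.

```python
import hashlib

SEP = "\x00"  # assumed field delimiter; the source does not specify one


def selection_index_key(qualifier, value, rowkey):
    """Row key of a selection-query index entry: QUALIFIER, VALUE, ROWKEY."""
    return SEP.join((qualifier, value, rowkey)).encode("utf-8")


def join_prefix(group_name):
    """PREFIX for a join query group (hash choice and length are assumptions)."""
    return hashlib.sha256(group_name.encode("utf-8")).hexdigest()[:8]


def join_index_key(group_name, value, qualifier):
    """Row key of a join-query index entry: PREFIX, VALUE, QUALIFIER."""
    return SEP.join((join_prefix(group_name), value, qualifier)).encode("utf-8")


def metadata_key(table, family, column, op_type, timestamp):
    """Row key of the metadata table, in the field order given in the text."""
    return SEP.join((table, family, column, op_type, str(timestamp))).encode("utf-8")


# Values are zero-padded here so that byte order matches numeric order (an assumption).
example = selection_index_key("T1.a", "0042", "row-17")
```

Because HBase orders rows by the bytes of the row key, entries that share a QUALIFIER (or a PREFIX and VALUE) end up adjacent, which is what the scans in step (3) rely on.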
Step (12) proceeds as follows:
Step (121): for selection queries, scan each of the M data columns, generate secondary index entries in the entry format described in step (113), and insert each entry into the secondary index table of its column.
Step (122): for join queries, scan each of the N data columns, generate secondary index entries in the entry format described in step (113), and insert all entries into the same secondary index table.
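The two population steps can be sketched in Python with in-memory structures standing in for HBase tables. This is a minimal sketch: the data table is modeled as a dict from row key to a dict of column values, index row keys are modeled as tuples, and the function names are hypothetical.

```python
def build_selection_index(table_name, rows, column):
    """Scan one data column and return sorted (index_row_key, data_rowkey) entries.

    Index row key layout: (QUALIFIER, VALUE, ROWKEY).
    """
    qualifier = f"{table_name}.{column}"
    entries = [
        ((qualifier, row[column], rowkey), rowkey)
        for rowkey, row in rows.items()
        if column in row
    ]
    return sorted(entries)


def build_join_index(prefix, columns):
    """Scan N data columns and put all entries into one sorted index.

    columns: list of (table_name, rows, column).
    Index row key layout: (PREFIX, VALUE, QUALIFIER).
    """
    entries = []
    for table_name, rows, column in columns:
        qualifier = f"{table_name}.{column}"
        for rowkey, row in rows.items():
            if column in row:
                entries.append(((prefix, row[column], qualifier), rowkey))
    return sorted(entries)


t1 = {"r1": {"a": 5}, "r2": {"a": 3}}
t2 = {"s1": {"b": 5}, "s2": {"b": 9}}
selection_index = build_selection_index("T1", t1, "a")
join_index = build_join_index("p1", [("T1", t1, "a"), ("T2", t2, "b")])
```

Sorting the entries mimics the row key ordering that HBase maintains. Note that the join layout has no ROWKEY component, so the sketch assumes each value occurs at most once per column.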
Updating the secondary index table in step (2) comprises the following steps:
Step (21): update the data table: through the Put method interface provided by HBase on the Hadoop platform, submit the value, row key, column family and column identifier of the data column to complete the update of the data table;
Step (22): generate the secondary index entry: for the data column being updated, query the metadata table to obtain the secondary index tables that need updating and their operation types; select the corresponding secondary index table format according to the operation type, and use the updated data from the data table to generate an entry conforming to that secondary index table format;
Step (23): update the secondary index table: through the interface methods provided by the HBase Coprocessor on the Hadoop platform, and following the format of the secondary index entry generated in step (22), submit the value, row key, column family and column identifier for the secondary index table to complete the update of the secondary index table.
Step (22) comprises the following steps:
Step (221): if the operation type of the secondary index table is selection query, use the updated data from the data table to generate an entry conforming to the secondary index table format of step (111);
Step (222): if the operation type of the secondary index table is join query, use the updated data from the data table to generate an entry conforming to the secondary index table format of step (112).
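The update flow of steps (21) to (23) can be illustrated with an in-memory Python sketch. The class and method names are hypothetical, dicts stand in for HBase tables, and the removal of the stale index entry when a value is overwritten is an assumption of this sketch; the text above only describes generating and submitting the new entry.

```python
class IndexedStore:
    """In-memory stand-in for a data table plus its secondary index tables."""

    def __init__(self):
        self.data = {}      # (table, rowkey) -> {column: value}
        self.metadata = {}  # (table, column) -> [(op_type, index_name, prefix)]
        self.indexes = {}   # index_name -> {index_row_key: data_rowkey}

    def register_index(self, table, column, op_type, index_name, prefix=None):
        self.metadata.setdefault((table, column), []).append((op_type, index_name, prefix))
        self.indexes.setdefault(index_name, {})

    @staticmethod
    def _index_key(op_type, prefix, qualifier, value, rowkey):
        if op_type == "select":
            return (qualifier, value, rowkey)  # layout of step (111)
        return (prefix, value, qualifier)      # layout of step (112)

    def put(self, table, rowkey, column, value):
        # Step (21): update the data table.
        row = self.data.setdefault((table, rowkey), {})
        old = row.get(column)
        row[column] = value
        qualifier = f"{table}.{column}"
        # Step (22): look up the affected index tables and their operation types.
        for op_type, index_name, prefix in self.metadata.get((table, column), []):
            index = self.indexes[index_name]
            if old is not None:
                # Assumption: drop the entry that pointed at the overwritten value.
                stale = self._index_key(op_type, prefix, qualifier, old, rowkey)
                if index.get(stale) == rowkey:
                    del index[stale]
            # Step (23): write the new entry into the index table.
            index[self._index_key(op_type, prefix, qualifier, value, rowkey)] = rowkey


store = IndexedStore()
store.register_index("T1", "a", "select", "idx_T1_a")
store.register_index("T1", "a", "join", "idx_Z1", prefix="p1")
store.put("T1", "r1", "a", 5)
store.put("T1", "r1", "a", 7)
```

In a real deployment the index write would run inside a coprocessor hook on the region server rather than in client code; the sketch only shows the order of operations.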
Querying data with the secondary index tables in step (3) comprises the following steps:
Step (31): scan the secondary index table to obtain the row keys of the data to be queried;
Step (32): query the data table using the set of ROWKEYs of the data to be queried.
Step (31) proceeds as follows:
Step (311): query method for the secondary index tables of a selection query:
For each of the M data columns involved in the selection query business, query the metadata table by operation type to obtain the name of the corresponding secondary index table. Then query that secondary index table as follows:
Given the row key format of the secondary index table in step (111), locate the first matching entry directly from the condition value of the selection query and continue scanning until an entry that does not match is found. The matching entries scanned form the set of ROWKEYs that satisfy the query condition on the current data column.
If M equals 1, this ROWKEY set is the set of ROWKEYs of the data to be queried;
If M is greater than 1, apply to the ROWKEY sets of the different columns the set operations that correspond to the logical relations among the M data columns in the query business: logical AND corresponds to set intersection and logical OR corresponds to set union. The result of these operations is the set of ROWKEYs of the data to be queried.
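The scan-then-combine procedure of step (311) can be sketched in Python as follows. A sorted list of (QUALIFIER, VALUE, ROWKEY) tuples stands in for each secondary index table, and `bisect` stands in for seeking to a start row. The function name and the half-open range convention are assumptions of this example.

```python
from bisect import bisect_left


def scan_range(index, qualifier, low, high):
    """Return the data row keys whose indexed value v satisfies low <= v < high.

    index: sorted list of (qualifier, value, rowkey) tuples.
    """
    start = bisect_left(index, (qualifier, low))  # jump to the first candidate entry
    matches = set()
    for q, value, rowkey in index[start:]:
        if q != qualifier or value >= high:       # first non-matching entry ends the scan
            break
        matches.add(rowkey)
    return matches


# One index table per data column (M = 2 here).
index_a = sorted([("T1.a", 3, "r1"), ("T1.a", 7, "r2"), ("T1.a", 9, "r3")])
index_c = sorted([("T1.c", 1, "r2"), ("T1.c", 5, "r3"), ("T1.c", 8, "r1")])

rows_a = scan_range(index_a, "T1.a", 0, 8)  # condition on column a: value < 8
rows_c = scan_range(index_c, "T1.c", 0, 6)  # condition on column c: value < 6

rows_and = rows_a & rows_c  # logical AND corresponds to intersection
rows_or = rows_a | rows_c   # logical OR corresponds to union
```

The scan touches only the matching entries plus the one that terminates it, which is the property the performance discussion in this document relies on.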
Step (312): query method for the secondary index table of a join query:
For the N data columns involved in the join query business, query the metadata table by operation type to obtain the name of the corresponding secondary index table (the N columns correspond to the same secondary index table). Then query that secondary index table as follows:
Given the row key format of the secondary index table in step (112), the entries of N data columns that have the same value are arranged consecutively in the secondary index table;
If the number of consecutive index entries with the same data column value is N, the ROWKEYs of these N entries form one N-tuple <R1, R2, ..., RN> that satisfies the query condition;
Scanning the entire secondary index table yields the set {<R1, R2, ..., RN>} of all N-tuples that satisfy the condition; this set is the set of ROWKEYs of the data to be queried.
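The run-counting scan of step (312) can be sketched in Python. A sorted list of ((PREFIX, VALUE, QUALIFIER), ROWKEY) pairs stands in for the join index table. The function name is hypothetical, and the sketch assumes each value occurs at most once per column, since the key layout has no ROWKEY component.

```python
from itertools import groupby


def join_scan(index, prefix, n):
    """Return the N-tuples of data row keys whose N indexed columns share a value.

    index: sorted list of ((prefix, value, qualifier), rowkey).
    """
    group_entries = [entry for entry in index if entry[0][0] == prefix]
    tuples = []
    # Entries with equal VALUE are adjacent because the index is sorted.
    for _, run in groupby(group_entries, key=lambda entry: entry[0][1]):
        run = list(run)
        if len(run) == n:  # all N columns contributed an entry for this value
            tuples.append(tuple(rowkey for _, rowkey in run))
    return tuples


index = sorted([
    (("p1", 10, "T1.a"), "r1"),
    (("p1", 10, "T2.b"), "s1"),
    (("p1", 20, "T1.a"), "r2"),
    (("p1", 30, "T2.b"), "s3"),
])
result = join_scan(index, "p1", 2)
```

Within a run the entries are ordered by QUALIFIER, so each tuple lists the row keys in a fixed column order.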
Step (32) proceeds as follows:
Using the set of ROWKEYs of the data to be queried obtained in step (311) and step (312), fetch the corresponding data values from the data table through the Get interface method provided by HBase.
An electric power big data query system based on HBase secondary indexes comprises:
A secondary index table establishing module: establishes the secondary index tables;
An update judging module: determines whether a data table has been updated; if so, updates the secondary index table; if not, leaves the secondary index table unchanged;
A data query module: queries data using the secondary index tables.
Beneficial effects of the present invention:
This patent proposes a secondary index design based on HBase. The design effectively supports the most basic operations of a relational database, join queries and selection queries, and thereby supports complex query business on power grid big data. Because the secondary index tables are built per business, the design can balance query performance against business flexibility.
The invention proposes a secondary index design based on the HBase database that implements the basic selection query and join query functions of a relational database and supports complex query business requirements in power grid systems.
Selection query performance of the invention: for any table T1 and a query for the records satisfying condition <T1.a, a'>, the number of records the invention needs to scan equals the number of records that satisfy the condition, which is less than the total number of records |T1|. This is comparable to the number of records a traditional relational database scans when querying an indexed column.
Join query performance of the invention: for a join between any two tables T1 and T2, a traditional relational database needs to scan |T1| * |T2| records, whereas the invention needs to scan |T1| + |T2| records. Even after the subsequent join operations between result sets are taken into account, the invention can greatly improve join query performance.
Brief description of the drawings
Fig. 1 is the data query flow chart of the invention;
Fig. 2 is the flow chart of the electric power big data selection query method of the invention;
Fig. 3 is the flow chart of the electric power big data join query method of the invention.
Detailed description of the embodiments
The invention is further described below with reference to the drawings and embodiments.
The scheme of the invention covers two aspects: the update scheme for the data tables and secondary indexes, and the query scheme for the data tables. The query scheme comprises the secondary index organization for basic selection queries and the secondary index organization for basic join queries, as shown in Figs. 1-3.
5.1 Establishing the secondary index tables
In the present invention, a basic query business corresponds to one secondary index table, and a complex query business may correspond to several secondary index tables. The values, row keys, column families and column identifiers of a secondary index table are obtained by integrating and rearranging the original data.
A) For selection queries, the invention stores the secondary index of the corresponding columns of the tables involved in the selection query into one table. The row key R of the index table consists of three parts, in order: QUALIFIER, VALUE, ROWKEY, where QUALIFIER is the identifier of the column in the data table, VALUE is the value of the data column in the data table, and ROWKEY is the row key of the data table.
B) For join queries, the invention stores the secondary index of the corresponding columns of the tables involved in the join query into one table. The row key R of the index table consists of three parts, in order: PREFIX, VALUE, QUALIFIER. PREFIX is generated by a hash function and distinguishes the join query group, VALUE is the value of the data column in the data table, and QUALIFIER is the identifier of the column in the data table.
The column value of the index table is the row key of the corresponding data; together with the row key of the index table, it forms one entry of the index table.
5.2 Selecting the secondary index table for an operation request
In the present invention, the correspondence between a business and its index tables is stored in metadata. When the data table of a business is updated or queried, the corresponding secondary index table is obtained from the metadata.
5.3 Data update
5.3.1 Updating the data table
The HBase Coprocessor on the Hadoop platform used by the invention provides basic support for add and delete operations on data tables. A data table can be updated by submitting the value, row key, column family and column identifier of the data through the interface provided by the HBase Coprocessor.
5.3.2 Generating the secondary index entry
Following the secondary index table format, the data known to require updating is used to generate an entry conforming to the corresponding secondary index table format.
5.3.3 Updating the index table
The index table is updated in the same way as the data table: the value, row key, column family and column identifier for the index table are submitted through the interface provided by the HBase Coprocessor.
5.4 Data query
5.4.1 Querying the secondary index table
To determine the row keys of the data that meets the conditions, the secondary index table is pre-scanned before the data itself is queried.
A) Query method for the index table of a selection query:
For a compound selection query business, the compound selection condition is first split into single conditions. The set of entries satisfying each single condition is then obtained through the row keys of the index table. Finally, set operations are applied to these entry sets to obtain all secondary index entries that satisfy the compound condition, and the matching data table row keys are extracted from those entries. To obtain the set of secondary index entries satisfying a single condition, the row key of the index table is used to locate the first matching entry directly; the scan then proceeds downward until an entry that does not match is found, and the scanned entries are merged into the set of secondary index entries satisfying that condition.
As shown in Fig. 2, given data tables T1 and T2, consider the compound selection query business (Y1): <T1.a, a'>||<T1.c, c'>||<T2.b, b'> (the value of data column a in table T1 is less than a', or the value of data column c in table T1 is less than c', or the value of data column b in table T2 is less than b'). The secondary index table stores the secondary index entries for the relevant data of T1 and T2 in one table. For Y1, the row keys in the corresponding secondary index table that begin with the same QUALIFIER form a contiguous segment of stored records. For the condition <T1.a, a'>, the scan can seek directly by T1.a to the first record that satisfies the condition and continue until it meets the first record that does not, that is, a record whose value is not less than a'. The scan is then complete, and merging the scanned entries yields a set of row keys S1: {R1}, which contains the data table row keys of all records satisfying <T1.a, a'>. In the same way, scanning the other parts of the index table in order yields the set S2: {R2} satisfying <T1.c, c'> and the set S3: {R3} satisfying <T2.b, b'>. Taking S1 ∪ S2 ∪ S3 then gives the data table row keys of all records that satisfy Y1.
B) Query method for the index table of a join query:
For a compound join query business, the query can be divided into two join query groups. When the data columns of the same join query group are inserted into the index table, the hash function generates the same PREFIX value for them. The value corresponding to row key R is the row key of that column's record in the data table. At query time, the whole secondary index table is scanned and the sets of matching tuples are recorded; set operations on these tuple sets then give the row keys of the data that meets the conditions. While the matching tuple sets are recorded, a tuple is added to a tuple set only when it is formed by consecutive entries that satisfy the condition of the join query group.
As shown in Fig. 3, given data tables T1, T2, T3 and T4, consider the compound join query business (Y2): T1.a=T2.b=T4.d&&T1.e=T3.c (where a, b, c, d and e are columns of tables T1, T2, T3, T4 and T1 respectively). The query can be divided into two join query groups, group one (Z1): T1.a=T2.b=T4.d and group two (Z2): T1.e=T3.c. For Y2, all columns in Z1 begin with the same PREFIX and therefore form a contiguous segment of stored records in the secondary index table. When this segment is scanned from its beginning, records with the same VALUE are encountered consecutively. Counting during the scan, three consecutive records with the same VALUE (three because Z1 involves three tables) form one result record of the join query. The result is a set of triples S1: {<R1, R2, R4>}, where R1, R2 and R4 are the row keys in data tables T1, T2 and T4 of the three data columns that share the same VALUE. S1 is the join query result of Z1. Likewise, scanning the stored records formed by Z2 yields a similar join query result S2: {<R1, R3>}. Because Z1 and Z2 are related by intersection (&&), joining S1 and S2 on R1 gives the final query result of business Y2, S: {<R1, R2, R3, R4>}.
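The final step of the Y2 example, combining S1 and S2 on the shared T1 row key R1, can be illustrated with a small hash join in Python. The function name and the sample row keys are hypothetical; the tuple shapes follow the example above.

```python
def join_on_r1(s1, s2):
    """Combine Z1 results (r1, r2, r4) with Z2 results (r1, r3) on the T1 row key.

    Returns the set of (r1, r2, r3, r4) tuples.
    """
    r3_by_r1 = {}
    for r1, r3 in s2:
        r3_by_r1.setdefault(r1, []).append(r3)
    return {
        (r1, r2, r3, r4)
        for r1, r2, r4 in s1
        for r3 in r3_by_r1.get(r1, [])
    }


s1 = {("a1", "b1", "d1"), ("a2", "b2", "d2")}  # triples from group Z1
s2 = {("a1", "c1"), ("a3", "c3")}              # pairs from group Z2
final = join_on_r1(s1, s2)
```

Only T1 rows that appear in both groups survive, which is the intersection semantics of the && in Y2.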
5.4.2 Fetching content from the data table by row key
Once the row keys of the data satisfying the query condition have been obtained, they can be used to fetch the corresponding data values from the data table through the Get interface provided by the HBase Coprocessor.
Example implementation:
Install the Hadoop distributed file system;
Install the HBase database, version 0.92 or later;
Override the prePut and postPut methods of the region observer in the HBase Coprocessor so that the corresponding secondary index table is updated according to the newly inserted data;
Implement the preGet method of the region observer in the HBase Coprocessor so that the corresponding secondary index table is accessed first, according to the query parameters, to obtain the row keys of the queried data, and the required data is then queried by those row keys.
Although specific embodiments of the invention have been described above with reference to the drawings, they do not limit the scope of protection of the invention. Those skilled in the art should understand that modifications or variations that can be made on the basis of the technical scheme of the invention without creative effort remain within the scope of protection of the invention.

Claims (7)

1. An electric power big data query method based on HBase secondary indexes, characterized by comprising the following steps:
Step (1): establish secondary index tables;
Step (2): determine whether a data table has been updated; if so, update the secondary index table; if not, leave the secondary index table unchanged;
Step (3): query data using the secondary index tables;
Establishing the secondary index tables in step (1) comprises the following steps:
Step (11): generate the secondary index tables according to the operation type;
Step (12): generate secondary index entries from the data columns and insert them into the secondary index tables;
Step (11) proceeds as follows:
Step (111): for selection queries, the M data columns involved in the selection query are stored in M secondary index tables, one per column, where M is greater than or equal to 1; the row key R of each secondary index table consists of three parts, in order: QUALIFIER, VALUE and ROWKEY, where QUALIFIER is the identifier of the data column in the data table, VALUE is the value of the data column in the data table, and ROWKEY is the row key of the data table;
Step (112): for join queries, the N data columns involved in the join query are stored in a single secondary index table, where N is greater than or equal to 2; the row key R of the secondary index table consists of three parts, in order: PREFIX, VALUE and QUALIFIER, where PREFIX is generated by a hash function and distinguishes the join query group, VALUE is the value of the data column in the data table, and QUALIFIER is the identifier of the data column in the data table;
Step (113): for both step (111) and step (112), the value stored in the data column of the secondary index table is the ROWKEY of the corresponding data table row; this value and the row key R of the secondary index table together form one entry of the secondary index table;
The secondary index table is created with HBase, and the association between each data column and its secondary index table is stored in a metadata table; the row key of the metadata table consists of, in order:
the table name of the data table, the column family name, the column name, the operation type of the secondary index table, and a timestamp;
The value corresponding to the row key of the metadata table is: the operation type of the secondary index table and the name of the secondary index table.
2. The electric power big data querying method based on HBase secondary index according to claim 1, characterized in that
step (12) comprises:
Step (121): For selection queries, the M data columns are scanned respectively, secondary index entries are generated according to the entry format described in step (113), and each secondary index entry is inserted into its corresponding secondary index table;
Step (122): For join queries, the N data columns are scanned respectively, secondary index entries are generated according to the entry format described in step (113), and the secondary index entries are inserted into the same secondary index table.
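Steps (121) and (122) can be sketched as a scan over an in-memory stand-in for the data table. The table contents, column names, separator and the fixed prefix `a1b2` are illustrative assumptions; a real deployment would scan HBase tables instead of a Python dict.

```python
SEP = "\x00"  # assumed field separator

# Stand-in for the data table: ROWKEY -> {column qualifier: value}.
data_table = {
    "r1": {"meter_dev": "D1", "voltage": "220"},
    "r2": {"asset_dev": "D1"},
    "r3": {"meter_dev": "D2", "voltage": "110"},
}


def build_selection_indexes(table, columns):
    """Step (121): one index table per column, keyed QUALIFIER + VALUE + ROWKEY."""
    indexes = {col: {} for col in columns}
    for rowkey, cells in table.items():
        for col in columns:
            if col in cells:
                indexes[col][SEP.join((col, cells[col], rowkey))] = rowkey
    return indexes


def build_join_index(table, columns, prefix):
    """Step (122): a single index table, keyed PREFIX + VALUE + QUALIFIER."""
    index = {}
    for rowkey, cells in table.items():
        for col in columns:
            if col in cells:
                index[SEP.join((prefix, cells[col], col))] = rowkey
    return index


selection_indexes = build_selection_indexes(data_table, ["voltage"])
join_index = build_join_index(data_table, ["meter_dev", "asset_dev"], "a1b2")
```

In both cases the stored cell value is the data-table ROWKEY, as step (113) requires.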
3. The electric power big data querying method based on HBase secondary index according to claim 1, characterized in that the method of updating the secondary index table in step (2) comprises the following steps:
Step (21): Update the data table: through the Put method interface provided by HBase on the Hadoop platform, submit the value, row key, column family and column identifier of the data column to complete the update of the data table;
Step (22): Generate secondary index entries: for the data column currently being updated, query the metadata table to obtain the secondary index tables that need to be updated and their corresponding operation types; select the corresponding secondary index table format according to the operation type, and use the data updated in the data table to generate entries that conform to the format of the corresponding secondary index table;
Step (23): Update the secondary index table: through the interface methods provided by the HBase Coprocessor on the Hadoop platform, and following the format of the secondary index entries generated in step (22), submit the value, row key, column family and column identifier of the secondary index table to complete the update of the secondary index table.
4. The electric power big data querying method based on HBase secondary index according to claim 3, characterized in that step (22) comprises the following steps:
Step (221): If the operation type of the secondary index table is a selection query, use the data updated in the data table to generate an entry that conforms to the secondary index table format of step (111);
Step (222): If the operation type of the secondary index table is a join query, use the data updated in the data table to generate an entry that conforms to the secondary index table format of step (112).
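The update path of steps (21) to (23), including the dispatch on operation type in steps (221) and (222), could look roughly like the sketch below. It uses plain Python dicts in place of HBase tables and a function call in place of a Coprocessor hook; the table names, the `select`/`join` labels, the prefix `a1b2` and the helper names are assumptions.

```python
SEP = "\x00"  # assumed field separator

data_table = {}
index_tables = {"idx_voltage": {}, "idx_join": {}}

# Simplified metadata: column -> [(operation type, index table, join PREFIX or None)].
metadata = {
    "voltage": [("select", "idx_voltage", None)],
    "device_id": [("join", "idx_join", "a1b2")],
}


def make_entry(op_type, qualifier, value, rowkey, prefix=None):
    """Steps (221)/(222): choose the row key layout by operation type."""
    if op_type == "select":
        return SEP.join((qualifier, value, rowkey)), rowkey
    if op_type == "join":
        return SEP.join((prefix, value, qualifier)), rowkey
    raise ValueError(f"unknown operation type: {op_type}")


def put_with_index(rowkey, qualifier, value):
    # Step (21): update the data table.
    data_table.setdefault(rowkey, {})[qualifier] = value
    # Step (22): look up the affected index tables and build their entries.
    for op_type, index_name, prefix in metadata.get(qualifier, []):
        key, cell = make_entry(op_type, qualifier, value, rowkey, prefix)
        # Step (23): write the entry into the secondary index table.
        index_tables[index_name][key] = cell


put_with_index("row-001", "voltage", "220")
put_with_index("row-001", "device_id", "D42")
```

The sketch only adds entries. Removing the stale index entry when a column value is overwritten is not described in the claims and is left out here.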
5. The electric power big data querying method based on HBase secondary index according to claim 1, characterized in that querying data using the secondary index table in step (3) comprises the following steps:
Step (31): Scan the secondary index table to obtain the row keys of the data to be queried;
Step (32): Query the data table using the set of ROWKEYs of the data to be queried.
6. The electric power big data querying method based on HBase secondary index according to claim 5, characterized in that step (31) comprises:
Step (311): Querying the secondary index tables for a selection query:
For each of the M data columns involved in the selection query, query the metadata table according to the operation type to obtain the name of the corresponding secondary index table, then query that secondary index table as follows:
Given the secondary index table row key format of step (111), the condition value in the selection query is used to seek directly to the first entry that satisfies the condition; scanning then continues until an entry that does not satisfy the condition is found. The ROWKEYs of the scanned entries that satisfy the condition form the set of ROWKEYs meeting the query condition on the current data column;
If M is equal to 1, this ROWKEY set is the set of ROWKEYs of the data to be queried;
If M is greater than 1, the corresponding set operations are applied to the ROWKEY sets of the different columns according to the logical relationship among the M data columns in the query: logical AND corresponds to set intersection and logical OR corresponds to set union. The result of these operations is the set of ROWKEYs of the data to be queried;
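The seek-then-scan lookup of step (311) and the set operations for M greater than 1 can be sketched as below for an equality condition. Each index is modeled as a sorted list of (index row key, data ROWKEY) pairs, and `bisect_left` plays the role of seeking to the first matching row; the separator, column names and data are assumptions.

```python
from bisect import bisect_left

SEP = "\x00"  # assumed field separator


def make_index(entries):
    """entries: iterable of (qualifier, value, rowkey); returns a sorted index."""
    return sorted((SEP.join(e), e[2]) for e in entries)


def scan_equal(index, qualifier, value):
    """Seek to the first key with prefix QUALIFIER + VALUE, scan until it no longer matches."""
    start = SEP.join((qualifier, value)) + SEP
    i = bisect_left(index, (start,))
    matches = set()
    while i < len(index) and index[i][0].startswith(start):
        matches.add(index[i][1])
        i += 1
    return matches


voltage_index = make_index([
    ("voltage", "220", "r1"), ("voltage", "110", "r2"), ("voltage", "220", "r3"),
])
current_index = make_index([
    ("current", "5", "r1"), ("current", "5", "r2"), ("current", "7", "r3"),
])

volt_220 = scan_equal(voltage_index, "voltage", "220")
curr_5 = scan_equal(current_index, "current", "5")

both = volt_220 & curr_5     # logical AND -> set intersection
either = volt_220 | curr_5   # logical OR  -> set union
```

Because ROWKEY is the last component of the index row key, all entries for one column value are contiguous and the scan touches only matching rows plus one terminating row.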
Step (312): Querying the secondary index table for a join query:
For the N data columns involved in the join query, query the metadata table according to the operation type to obtain the name of the corresponding secondary index table, then query that secondary index table as follows:
Given the secondary index table row key format of step (112), the entries of N data columns that have the same value are arranged consecutively in the secondary index table;
If the number of consecutive index entries with the same data column value is N, the ROWKEYs of these N entries form an N-tuple <R1, R2, ..., RN> that satisfies the query condition;
Scanning the entire secondary index table yields the set {<R1, R2, ..., RN>} of all N-tuples that satisfy the condition, and this set {<R1, R2, ..., RN>} is the set of ROWKEYs of the data to be queried.
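The grouping of consecutive entries in step (312) can be illustrated with `itertools.groupby` over a sorted index. This is a sketch under assumptions: the separator, the prefix `a1b2` and the sample entries are invented, and qualifiers are assumed not to contain the separator.

```python
from itertools import groupby

SEP = "\x00"  # assumed field separator


def scan_join(index, n):
    """index: sorted list of (PREFIX + VALUE + QUALIFIER key, data ROWKEY).

    Consecutive entries that share PREFIX and VALUE form one group;
    a group of exactly n entries yields one N-tuple of ROWKEYs.
    """
    tuples = []
    for _, group in groupby(index, key=lambda e: e[0].rsplit(SEP, 1)[0]):
        rowkeys = [rowkey for _, rowkey in group]
        if len(rowkeys) == n:
            tuples.append(tuple(rowkeys))
    return tuples


join_index = sorted([
    (SEP.join(("a1b2", "D42", "meter_dev")), "row-007"),
    (SEP.join(("a1b2", "D17", "meter_dev")), "row-009"),
    (SEP.join(("a1b2", "D42", "asset_dev")), "row-101"),
])

pairs = scan_join(join_index, 2)
```

Only value `D42` appears under both columns, so it is the only group that produces a tuple; the order inside the tuple follows the sort order of the qualifiers.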
7. The electric power big data querying method based on HBase secondary index according to claim 6, characterized in that step (32) comprises:
Using the set of ROWKEYs of the data to be queried obtained in step (311) and step (312), the corresponding data values are obtained from the data table through the Get interface method provided by HBase.
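The final fetch of step (32) amounts to point lookups by ROWKEY. The sketch below uses a Python dict in place of the data table, so each `table[rowkey]` access stands in for an HBase Get; the helper names `get_rows` and `get_joined` and the sample rows are assumptions.

```python
# Stand-in for the data table: ROWKEY -> {column qualifier: value}.
data_table = {
    "row-001": {"voltage": "220", "current": "5"},
    "row-002": {"voltage": "110", "current": "5"},
    "row-003": {"voltage": "220", "current": "7"},
}


def get_rows(table, rowkeys):
    """Selection result: fetch each ROWKEY with a point lookup; missing rows are skipped."""
    return {rowkey: table[rowkey] for rowkey in sorted(rowkeys) if rowkey in table}


def get_joined(table, tuples):
    """Join result: fetch the rows of every N-tuple, keeping the tuple order."""
    return [tuple(table.get(rowkey) for rowkey in t) for t in tuples]


selected = get_rows(data_table, {"row-001", "row-003", "row-999"})
joined = get_joined(data_table, [("row-001", "row-002")])
```

With the HBase Java client, the same step would typically batch the lookups, for example by passing a list of `Get` objects to `Table.get`.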
CN201610980816.4A 2016-11-08 2016-11-08 Electric power big data querying method based on HBase secondary index Active CN106503243B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610980816.4A CN106503243B (en) 2016-11-08 2016-11-08 Electric power big data querying method based on HBase secondary index

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610980816.4A CN106503243B (en) 2016-11-08 2016-11-08 Electric power big data querying method based on HBase secondary index

Publications (2)

Publication Number Publication Date
CN106503243A CN106503243A (en) 2017-03-15
CN106503243B true CN106503243B (en) 2019-08-06

Family

ID=58323974

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610980816.4A Active CN106503243B (en) 2016-11-08 2016-11-08 Electric power big data querying method based on HBase secondary index

Country Status (1)

Country Link
CN (1) CN106503243B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108241724A (en) * 2017-05-11 2018-07-03 新华三大数据技术有限公司 A kind of metadata management method and device
CN107341198B (en) * 2017-06-16 2020-05-12 云南电网有限责任公司信息中心 Electric power mass data storage and query method based on theme instance
CN107506464A (en) * 2017-08-30 2017-12-22 武汉烽火众智数字技术有限责任公司 A kind of method that HBase secondary indexs are realized based on ES
CN108398641B (en) * 2017-11-30 2021-03-09 深圳市科列技术股份有限公司 Battery data processing method and battery data server
CN108319665B (en) * 2018-01-18 2022-04-19 努比亚技术有限公司 Hbase column value searching method, terminal and storage medium
CN109063186A (en) * 2018-08-27 2018-12-21 郑州云海信息技术有限公司 A kind of General query method and relevant apparatus
CN109299102B (en) * 2018-10-23 2020-11-13 中国电子科技集团公司第二十八研究所 HBase secondary index system and method based on Elastcissearch
CN109800222B (en) * 2018-12-11 2021-06-01 中国科学院信息工程研究所 HBase secondary index self-adaptive optimization method and system
CN110502524B (en) * 2019-08-15 2022-06-10 济南浪潮数据技术有限公司 Phoenix index data asynchronous updating method and device
CN114372064B (en) * 2022-03-22 2022-07-12 飞狐信息技术(天津)有限公司 Data processing apparatus, method, computer readable medium and processor

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104112013A (en) * 2014-07-17 2014-10-22 浪潮(北京)电子信息产业有限公司 HBase secondary indexing method and device
CN104217011A (en) * 2014-09-19 2014-12-17 浪潮(北京)电子信息产业有限公司 Method and device for inquiring HBase secondary index table

Also Published As

Publication number Publication date
CN106503243A (en) 2017-03-15

Similar Documents

Publication Publication Date Title
CN106503243B (en) Electric power big data querying method based on HBase secondary index
CN106528773B (en) Map computing system and method based on Spark platform supporting spatial data management
CN104881424B (en) A kind of acquisition of electric power big data, storage and analysis method based on regular expression
CN109582667A (en) A kind of multiple database mixing storage method and system based on power regulation big data
CN103714134B (en) Network flow data index method and system
CN110633186A (en) Log monitoring system for electric power metering micro-service architecture and implementation method
Tran et al. Managing structured and semistructured RDF data using structure indexes
CN107506464A (en) A kind of method that HBase secondary indexs are realized based on ES
CN102332030A (en) Data storing, managing and inquiring method and system for distributed key-value storage system
Zhu et al. Distributed skyline retrieval with low bandwidth consumption
CN104484472A (en) Database cluster for mixing various heterogeneous data sources and implementation method
De Virgilio et al. A similarity measure for approximate querying over RDF data
CN104700190A (en) Method and device for matching item and professionals
Yun et al. Research on intelligent fault diagnosis of power acquisition based on knowledge graph
CN106599190A (en) Dynamic Skyline query method based on cloud computing
CN113706333A (en) Method and system for automatically generating topology island of power distribution network
CN107491463A (en) The optimization method and system of data query
Shangguan et al. Big spatial data processing with Apache Spark
CN103377236B (en) A kind of Connection inquiring method and system for distributed data base
Chen et al. Multi-source and heterogeneous data integration model for big data analytics in power DCS
Chen et al. An optimized distributed OLAP system for big data
CN109189873A (en) A kind of Meteorological Services big data monitoring analysis system platform
CN110399337B (en) File automation service method and system based on data driving
Wang et al. Smart grid time series big data processing system
Ma et al. Multi-sourced data storage and index construction for equipment condition assessment

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant