CN105824957A - Query engine system and query method of distributive memory column-oriented database - Google Patents


Info

Publication number
CN105824957A
Authority
CN
China
Prior art keywords
query engine
subtask
state
query
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610193220.XA
Other languages
Chinese (zh)
Other versions
CN105824957B (en)
Inventor
段翰聪
王瑾
闵革勇
聂晓文
郑松
张博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN201610193220.XA
Publication of CN105824957A
Application granted
Publication of CN105824957B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G06F16/24534 Query rewriting; Transformation
    • G06F16/24542 Plan optimisation
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/221 Column-oriented storage; Management thereof
    • G06F16/2455 Query execution
    • G06F16/24553 Query execution of query operations
    • G06F16/24554 Unary operations; Data partitioning operations
    • G06F16/24556 Aggregation; Duplicate elimination
    • G06F16/24558 Binary matching operations
    • G06F16/2456 Join operations
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/5017 Task decomposition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Operations Research (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a query engine system and a query method for a distributed in-memory column-oriented database. The query method comprises the following steps: a resource management module designates a main query engine to be responsible for the session with a user; the main query engine converts the SQL (Structured Query Language) statement sent by the user into a query plan; the resource management module allocates slave query engines to the main query engine; the main query engine divides the query plan into at least two subtasks and assigns a slave query engine to each subtask; the current subtask is executed once all of its predecessor subtasks have finished executing; the intermediate data produced by the current subtask is transmitted to the slave query engine hosting its successor subtasks; the completion state of the current subtask is sent to the main query engine; and the main query engine notifies the client to fetch the final result data from the slave query engine. The query engine system and query method provided by the invention achieve good query efficiency.

Description

Query engine system and query method of a distributed in-memory columnar database
Technical field
The present invention relates to the technical field of databases, and in particular to a query engine system and a query method for a distributed in-memory columnar database.
Background
NewSQL is a collective term for a variety of new, scalable, high-performance databases. Such databases not only offer the storage and processing capability of NoSQL for massive data, but also retain the ACID and SQL support of traditional databases. In general, NewSQL systems fall roughly into three categories: new architectures, which use a brand-new database platform and take a different design approach, such as Google Spanner, Clustrix, VoltDB and MemSQL; SQL engines, which are highly optimized SQL storage engines that provide the same programming interface as MySQL but scale better than the built-in InnoDB engine; and transparent sharding, which provides a sharding middleware layer so that the database is automatically split across multiple nodes. Over time, these three types of NewSQL database have gradually merged, giving rise to large-scale distributed in-memory columnar databases oriented toward online analytical processing (OLAP).
The query engine is the core of a database system and is responsible for scheduling the execution of all query computation tasks in the system. An SQL statement entered by a user first undergoes lexical and syntactic analysis in the database system to produce a syntax tree; the query optimizer then transforms the syntax tree, which is finally converted into a query plan that the query engine can recognize. The query plan tells the query engine how to execute the query: how to extract data from the underlying storage engine, how to transform the data, and how to convert it into the result the user wants.
HIVE is a Hadoop-based data warehouse tool that provides a simple SQL query facility and converts SQL statements into MapReduce tasks for execution. For the SQL statement SELECT c_custkey FROM customer JOIN nation ON customer.C_NATIONKEY = nation.N_NATIONKEY JOIN lineitem ON lineitem.L_PARTKEY = customer.C_CUSTKEY, the query plan and task execution flow of HIVE are shown in Fig. 1. What HIVE actually executes are MapReduce tasks, so the query plan is converted into a set of MapReduce tasks; here the original query plan is converted into two MapReduce jobs. JOB1 computes Join1, the join of the lineitem table and the customer table; JOB2 computes Join2, the join of the Join1 result and the nation table, and outputs the final result. JOB2 can start only after JOB1 has finished and written its intermediate result data to an external storage system; JOB2 then reads the intermediate results produced by JOB1 back from the external storage system to do its computation. The drawback of HIVE is obvious: because its lower layer uses the MapReduce computation model, data can be shared between any two MapReduce tasks only by having one task write its result to an external storage system (a distributed file system or the local file system) and having the next task read it back. This causes a large amount of disk I/O, so the whole query has high latency.
Spark-SQL is another data warehouse tool with functionality similar to HIVE, but its lower layer uses the Spark computation model instead of MapReduce. For the SQL statement SELECT c_custkey FROM customer JOIN nation ON customer.C_NATIONKEY = nation.N_NATIONKEY JOIN lineitem ON lineitem.L_PARTKEY = customer.C_CUSTKEY, the query plan and task execution flow of Spark-SQL are shown in Fig. 2. Stage1 handles ScanTable(lineitem) and ScanTable(customer) in the query plan, which correspond to RDD1 and RDD2 respectively. Because an RDD is a resilient distributed dataset spanning multiple physical nodes, each of which can execute the corresponding task, an RDD is produced by multiple tasks executed in parallel; for example, RDD1 is computed by Task1-1 and Task1-2. After the contents of the lineitem and customer tables have been read, Stage2 handles the Join1 operation and the ScanTable(nation) operation, generating RDD3 and RDD4 respectively. Finally, Stage3 completes the Join2 operation. Spark-SQL has much lower computation latency than HIVE, but it still has the following shortcomings.
First, Spark-SQL is implemented in Scala and runs entirely on a Java virtual machine (JVM), so its memory management depends on the JVM. JVM memory management is a general-purpose mechanism that is not tailored to database query engines, which causes Spark-SQL to consume a large amount of memory during computation.
Second, Spark-SQL executes tasks stage by stage: Stage2 can start only after Stage1 has finished, and Stage3 only after Stage2 has finished. Each stage contains several tasks that can run in parallel, and the latency of a stage is determined by its longest-running task. This creates a problem: a task that finishes quickly must wait for the unfinished tasks in the same stage, and tasks in the next stage can start only after every task in the current stage has completed. For example, suppose Task1-1, Task1-2, Task2-1 and Task2-2 are in Stage1, Task3-1 and Task3-2 are in Stage2, and Task3-2 depends on the results of Task1-1, Task1-2 and Task2-1. If Task1-1, Task1-2 and Task2-1 have completed but Task2-2 has not, then even if Task3-1 satisfies its execution condition, under the constraints of the Spark framework Task3-1 still cannot start and must wait until Task2-2 has finished. If Task2-2 runs for too long, it increases the latency of the whole computation.
Summary of the invention
The problem to be solved by the present invention is the low computational efficiency of existing database query engines.
The present invention is achieved through the following technical solutions:
A query engine system of a distributed in-memory columnar database comprises a resource management module, at least one main query engine and at least one slave query engine. The main query engine converts SQL statements into a query plan, divides the query plan into at least two subtasks, and monitors and schedules the execution of the query plan. The slave query engine executes the subtasks assigned by the main query engine. The resource management module manages and allocates system resources.
Optionally, the system resources include CPU computing resources and memory resources.
Based on the above query engine system of a distributed in-memory columnar database, the present invention also provides a query method for a distributed in-memory columnar database, comprising: a resource management module designates a main query engine to be responsible for the session with the user; the main query engine converts the SQL statement sent by the user into a query plan; the resource management module allocates slave query engines to the main query engine and establishes communication between the slave query engines and the main query engine; the main query engine divides the query plan into at least two subtasks and assigns a slave query engine to each subtask; the slave query engine adds the subtask to its task queue, executes the current subtask after all predecessor subtasks of the current subtask have finished, transmits the intermediate data produced by the current subtask to the slave query engine hosting the successor subtask, and sends the completion state of the current subtask to the main query engine; after the whole query plan has completed, the main query engine notifies the client to fetch the final result data from the slave query engine.
The present invention divides the query plan into several interdependent subtasks and distributes them to the task queues of the corresponding slave query engines, each of which executes the subtasks in its queue in turn. This avoids the shortcoming of Spark-SQL, in which a task in a later stage that already satisfies its execution condition still cannot start because of the restrictions of the Spark-SQL execution framework. The query method of the distributed in-memory columnar database provided by the present invention can therefore achieve good query efficiency.
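The difference between stage-by-stage execution and dependency-driven execution can be illustrated with a small timing sketch. The task names follow the example in the background section; the durations, the dependencies of Task3-1 and the assumption of unlimited parallelism are hypothetical and chosen only for illustration.

```python
# Hypothetical durations (in seconds); not taken from the patent.
durations = {"Task1-1": 2, "Task1-2": 3, "Task2-1": 2, "Task2-2": 10,
             "Task3-1": 4, "Task3-2": 4}
deps = {
    "Task3-1": ["Task1-1", "Task1-2"],             # assumed for illustration
    "Task3-2": ["Task1-1", "Task1-2", "Task2-1"],  # as stated in the text
}
stages = [["Task1-1", "Task1-2", "Task2-1", "Task2-2"], ["Task3-1", "Task3-2"]]


def finish_times_with_barrier(stages, durations):
    """Stage-by-stage: a stage starts only when the whole previous stage is done."""
    finish, stage_start = {}, 0
    for stage in stages:
        for task in stage:
            finish[task] = stage_start + durations[task]
        stage_start = max(finish[task] for task in stage)
    return finish


def finish_times_by_dependency(stages, durations, deps):
    """Dependency-driven: a task starts when its own predecessors are done."""
    finish = {}
    for stage in stages:  # stages are listed in topological order
        for task in stage:
            start = max((finish[p] for p in deps.get(task, [])), default=0)
            finish[task] = start + durations[task]
    return finish


barrier = finish_times_with_barrier(stages, durations)
dependency = finish_times_by_dependency(stages, durations, deps)
print(barrier["Task3-2"], dependency["Task3-2"])  # 14 7
```

With these assumed numbers, the slow Task2-2 delays Task3-2 under the stage barrier even though Task3-2 does not depend on it.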
Optionally, a subtask is represented by a physical operator, and the physical operators include at least one of: a column data extraction operation, a join operation, a conditional filter operation, a grouping operation, an aggregate function operation, a sort operation, and an operation that restores the final result data to a row table.
Optionally, the main query engine assigns a slave query engine to each subtask according to a cost model. Using a cost model makes it possible to assign each subtask to the slave query engine with the lowest execution cost, further improving query efficiency.
Optionally, assigning a slave query engine to each subtask according to the cost model comprises: obtaining, from the metadata of the slave query engines, the IP of the node where each slave query engine resides and the database table and column information stored on that node; assigning the execution node IP of each column data extraction operation in the query plan according to the data locality principle; and using a greedy algorithm to choose the execution nodes of the operations other than column data extraction.
Optionally, the states of each subtask include a waiting-for-execution state, a computing state, a distributing-data state, an execution-completed state and an execution-failed state.
Optionally, the initial state of the current subtask is the waiting-for-execution state. After the slave query engine hosting the current subtask has received the intermediate data produced by all predecessor subtasks of the current subtask, the state of the current subtask changes to the computing state. After the current subtask has finished computing, its state changes to the distributing-data state, and the intermediate data produced by the computation is sent to the slave query engine hosting the successor subtask. If the intermediate data is sent successfully, the state of the current subtask changes to the execution-completed state. If a fault occurs between the waiting-for-execution state and the computing state, between the computing state and the distributing-data state, or between the distributing-data state and the execution-completed state, the state of the current subtask changes to the execution-failed state. Whenever the state of the current subtask changes, the main query engine is notified asynchronously.
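The five states and their transitions can be written down as a small state machine. The following is a minimal sketch, not the patented implementation: the class names, the transition table layout and the `notify` callback (standing in for the asynchronous message to the main query engine) are all hypothetical.

```python
from enum import Enum


class SubtaskState(Enum):
    WAITING = "waiting for execution"
    COMPUTING = "computing"
    DISTRIBUTING = "distributing data"
    COMPLETED = "execution completed"
    FAILED = "execution failed"


# Transitions described in the text; a fault in any step leads to FAILED.
TRANSITIONS = {
    SubtaskState.WAITING: {SubtaskState.COMPUTING, SubtaskState.FAILED},
    SubtaskState.COMPUTING: {SubtaskState.DISTRIBUTING, SubtaskState.FAILED},
    SubtaskState.DISTRIBUTING: {SubtaskState.COMPLETED, SubtaskState.FAILED},
    SubtaskState.COMPLETED: set(),
    SubtaskState.FAILED: set(),
}


class Subtask:
    def __init__(self, name, notify):
        self.name = name
        self.state = SubtaskState.WAITING
        # Hypothetical callback standing in for the asynchronous
        # notification sent to the main query engine.
        self._notify = notify

    def transition(self, new_state):
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"illegal transition {self.state.name} -> {new_state.name}")
        self.state = new_state
        self._notify(self.name, new_state)  # report every state change


events = []
task = Subtask("Join1-1", lambda name, state: events.append((name, state.name)))
task.transition(SubtaskState.COMPUTING)
task.transition(SubtaskState.DISTRIBUTING)
task.transition(SubtaskState.COMPLETED)
print(events)
```

Keeping the legal transitions in one table makes it easy to reject state changes that the description does not allow, such as moving from the execution-completed state back to computing.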
Optionally, the intermediate data transmitted between slave query engines is column data that has been compressed. In a traditional database execution engine, intermediate data takes the form of tables and is stored by row. In most analytical workloads, however, the user cares about only a few attributes of a relation table, and row storage additionally loads attribute data the user does not care about during computation, wasting memory. Column storage solves this problem well.
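A toy comparison shows why column storage loads less data for an analytical query. The table contents and the "fields touched" counter below are hypothetical and serve only to illustrate the argument above.

```python
# Row layout: every attribute of every row is loaded, even if only age is needed.
rows = [(1, "Ann", 34, "Math"), (2, "Bob", 45, "Physics"), (3, "Eve", 29, "Math")]

# Column layout: each attribute is stored, and can be loaded, separately.
columns = {
    "id": [1, 2, 3],
    "name": ["Ann", "Bob", "Eve"],
    "age": [34, 45, 29],
    "dept": ["Math", "Physics", "Math"],
}


def average_age_from_rows(rows):
    touched = sum(len(row) for row in rows)  # fields brought into memory
    return sum(row[2] for row in rows) / len(rows), touched


def average_age_from_columns(columns):
    ages = columns["age"]  # only the age column is read
    return sum(ages) / len(ages), len(ages)


print(average_age_from_rows(rows))        # (36.0, 12)
print(average_age_from_columns(columns))  # (36.0, 3)
```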
Optionally, the compression includes bit compression and dictionary compression. Using dictionary compression and bit compression further reduces memory overhead and improves memory utilization.
Compared with the prior art, the present invention has the following advantages and beneficial effects:
The query engine system and query method of the distributed in-memory columnar database provided by the present invention improve overall computation efficiency by scheduling the execution of the subtasks asynchronously: the query plan is divided into several interdependent subtasks, which are distributed to the task queues of the corresponding slave query engines, and each slave query engine executes the subtasks in its queue in turn. Furthermore, the data transmitted between slave query engines is compressed column data, which solves the problem of wasted memory caused by row storage loading attribute data the user does not care about during computation.
Brief description of the drawings
The accompanying drawings described here provide a further understanding of the embodiments of the present invention and form part of this application; they do not limit the embodiments of the present invention. In the drawings:
Fig. 1 is a schematic diagram of the query plan and task execution flow of HIVE for an SQL statement;
Fig. 2 is a schematic diagram of the query plan and task execution flow of Spark-SQL for an SQL statement;
Fig. 3 is a partial structural diagram of the query engine system of the distributed in-memory columnar database according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of the query plan for an SQL statement according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of the task execution flow according to an embodiment of the present invention;
Fig. 6 is the execution state transition diagram of a subtask according to an embodiment of the present invention;
Fig. 7 is a schematic diagram of the data transmitted between slave query engines according to an embodiment of the present invention.
Detailed description
To make the objects, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to an embodiment and the accompanying drawings. The exemplary embodiment and its description are used only to explain the present invention and do not limit it.
Embodiment
This embodiment provides a query engine system of a distributed in-memory columnar database, which comprises a resource management module, at least one main query engine and at least one slave query engine.
Specifically, the main query engine converts SQL statements into a query plan by parsing them, divides the query plan into at least two subtasks and distributes them to the slave query engines for execution, and is responsible for monitoring and scheduling the execution of the query plan and for fault tolerance. As in the prior art, the query plan is represented as a tree. The slave query engines execute the subtasks assigned by the main query engine, and the resource management module manages and allocates system resources. Further, the system resources include CPU computing resources and memory resources. Fig. 3 is a partial structural diagram of the query engine system of this embodiment, in which main query engine 31 corresponds to three slave query engines: slave query engine 32, slave query engine 33 and slave query engine 34.
This embodiment also provides a query method of a distributed in-memory columnar database based on the above query engine system, comprising:
Step S1: the resource management module designates a main query engine to be responsible for the session with the user. Specifically, when a user has a query request, the resource management module creates a main query engine in the resource pool to be responsible for the session with the user.
Step S2: the main query engine converts the SQL statement sent by the user into a query plan. The main query engine converts the SQL statement into a query plan through lexical analysis, syntactic analysis and rule-based query optimization. As in the prior art, the query plan is represented as a tree.
Step S3: the resource management module allocates slave query engines to the main query engine and establishes communication between them. After the SQL statement has been converted into a query plan, the main query engine requests computing resources from the resource management module, which allocates slave query engines to the main query engine and establishes network connections between the slave query engines and the main query engine.
Step S4: the main query engine divides the query plan into at least two subtasks and assigns a slave query engine to each subtask. Because the query engine in this embodiment targets a distributed in-memory columnar database, data tables are stored by column, and each column is split into several shards according to value range. To suit this characteristic, this embodiment abstracts a number of physical operators, each representing a specific subtask in the query plan. The physical operators include at least one of: a column data extraction operation, a join operation, a conditional filter operation, a grouping operation, an aggregate function operation, a sort operation, and an operation that restores the final result data to a row table.
Column data extraction operation: the GetColumn operator, responsible for extracting the data of a given column in the columnar database. The GetColumn operator can carry additional restrictions; for example, GetColumn(Teacher.age, Teacher.age > 1) extracts the age column of the Teacher table, keeping only values of age greater than 1.
Join operation: the Join operator, responsible for performing join computations, including LeftJoin, RightJoin, FullJoin and so on.
Conditional filter operation: the Filter operator, responsible for performing conditional filtering, mainly including logical operations such as AND and OR.
Grouping operation: the GroupBy operator, responsible for performing grouping, implementing the GROUP BY keyword in SQL statements.
Aggregate function operation: the AGG operator, covering common database operations such as Max (maximum) and Avg (average).
Sort operation: the Order operator, used to sort the columns that need sorting.
Operation restoring the final result data to a row table: the BuildRow operator, used to restore the final result data of the columnar database to a row table that users can understand, presenting the final result to the user in the form of a relation table.
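Two of these operators, GetColumn with a restriction and BuildRow, can be sketched as follows. This is only an illustration under assumptions: the in-memory `storage` layout, the class and function signatures and the sample Teacher data are hypothetical and not specified in the patent.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class GetColumn:
    """Extract one column, optionally keeping only values that pass a predicate."""
    table: str
    column: str
    predicate: Optional[Callable[[int], bool]] = None

    def run(self, storage: Dict[str, Dict[str, List[int]]]):
        values = storage[self.table][self.column]
        # Keep (row position, value) pairs so later operators can align columns.
        return [(pos, value) for pos, value in enumerate(values)
                if self.predicate is None or self.predicate(value)]


def build_row(columns: Dict[str, List[int]], positions: List[int]):
    """BuildRow: turn columnar data back into row tuples for the given positions."""
    names = list(columns)
    return [tuple(columns[name][pos] for name in names) for pos in positions]


# Hypothetical columnar storage: table -> column -> values.
storage = {"Teacher": {"id": [10, 11, 12], "age": [0, 35, 42]}}

ages = GetColumn("Teacher", "age", lambda age: age > 1).run(storage)
rows = build_row(storage["Teacher"], [pos for pos, _ in ages])
print(ages)  # [(1, 35), (2, 42)]
print(rows)  # [(11, 35), (12, 42)]
```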
For example, for the SQL statement SELECT c_custkey FROM customer JOIN nation ON customer.C_NATIONKEY = nation.N_NATIONKEY JOIN lineitem ON lineitem.L_PARTKEY = customer.C_CUSTKEY, the query plan generated by the main query engine is shown in Fig. 4, and the subtasks it is divided into are shown in Fig. 5, which involves six slave query engines: Slave-QE1, Slave-QE2, Slave-QE3, Slave-QE4, Slave-QE5 and Slave-QE6.
Assume that each column has two shards. There is then one GetColumn operator for each shard of each column. Because each column shard has a value range, a Join operator is also generated for each shard based on that range. Referring to Fig. 5, the Join1 node represents the equi-join of columns L_PARTKEY and C_CUSTKEY. In the actual subtasks, Join1 is split into two concrete physical operators, Join1-1 and Join1-2, responsible for the equi-join over value range 1-100 and value range 101-150 respectively. Likewise, Join2 in the query plan is also split into two concrete Join operators.
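The splitting of a logical join into per-shard physical joins might look like the sketch below. It assumes, hypothetically, that both columns are sharded on the same value ranges and that a physical join is created for every pair of shards whose ranges overlap; the `Shard` type and `split_join` function are illustrative names only.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Shard:
    column: str
    index: int
    low: int   # inclusive lower bound of the value range
    high: int  # inclusive upper bound of the value range


def split_join(name: str, left: List[Shard],
               right: List[Shard]) -> List[Tuple[str, Shard, Shard]]:
    """Create one physical join per pair of shards whose value ranges overlap."""
    operators = []
    for left_shard in left:
        for right_shard in right:
            # Equal keys can only exist where the two ranges intersect.
            if left_shard.low <= right_shard.high and right_shard.low <= left_shard.high:
                operators.append((f"{name}-{len(operators) + 1}", left_shard, right_shard))
    return operators


# Assumed shard ranges, matching the 1-100 and 101-150 ranges in the text.
l_partkey = [Shard("L_PARTKEY", 1, 1, 100), Shard("L_PARTKEY", 2, 101, 150)]
c_custkey = [Shard("C_CUSTKEY", 1, 1, 100), Shard("C_CUSTKEY", 2, 101, 150)]

join1 = split_join("Join1", l_partkey, c_custkey)
for op_name, left_shard, right_shard in join1:
    print(op_name, (left_shard.low, left_shard.high), (right_shard.low, right_shard.high))
```

Under these assumptions the sketch yields Join1-1 for range 1-100 and Join1-2 for range 101-150; shard pairs with disjoint ranges produce no operator because they cannot contain matching keys.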
Further, in this embodiment the main query engine assigns a slave query engine to each subtask according to a cost model. Specifically, the main query engine obtains, from the metadata of the slave query engines, the IP of the node where each slave query engine resides and the database table and column information stored on that node. The execution node IP of each column data extraction operation in the query plan is assigned according to the data locality principle. For example, in Fig. 5 the physical node hosting Slave-QE1 stores a shard of the L_PARTKEY column, so the GetColumn operator for that shard is assigned to the physical node hosting Slave-QE1. Likewise, the GetColumn operator of every shard is executed on the node where the corresponding data resides. Execution nodes for non-GetColumn operators are chosen with a greedy algorithm: the execution node of a non-GetColumn operator is chosen from among the execution nodes of its child operators; the execution cost on the physical node of each child operator is computed, and the physical node with the lowest execution cost is selected. The basic cost formula is: execution cost = network cost + computation cost = inter-node network load × transmitted data volume + node task load × computed data volume. In Fig. 5, the Join1-1 operator could execute on either Slave-QE1 or Slave-QE3. The choice is made by computing the execution cost of Join1-1 on the Slave-QE1 node and on the Slave-QE3 node; the computation shows that the execution cost on Slave-QE1 is lower, so Slave-QE1 is finally chosen as the executing physical node.
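The cost formula and the greedy choice among child-operator nodes can be sketched as follows. The field names, units and sample numbers are hypothetical; only the shape of the formula comes from the description.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    node: str                 # slave query engine hosting one of the child operators
    network_load: float       # inter-node network load factor (hypothetical unit)
    transferred_bytes: float  # data that would have to be shipped to this node
    task_load: float          # current task load factor of the node (hypothetical unit)
    computed_bytes: float     # data the operator has to process


def execution_cost(c: Candidate) -> float:
    """Execution cost = network cost + computation cost."""
    return c.network_load * c.transferred_bytes + c.task_load * c.computed_bytes


def choose_node(candidates: List[Candidate]) -> str:
    """Greedy choice: the child-operator node with the lowest execution cost."""
    return min(candidates, key=execution_cost).node


# Made-up figures for placing Join1-1 on Slave-QE1 or Slave-QE3.
candidates = [
    Candidate("Slave-QE1", network_load=0.2, transferred_bytes=400,
              task_load=0.5, computed_bytes=1000),
    Candidate("Slave-QE3", network_load=0.2, transferred_bytes=600,
              task_load=0.7, computed_bytes=1000),
]
print(choose_node(candidates))  # Slave-QE1
```

The choice is greedy because each operator is placed using only the costs of its own child nodes, without searching over placements of the whole plan.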
Step S5: the slave query engine adds the subtask to its task queue, executes the current subtask after all its predecessor subtasks have finished, transmits the intermediate data produced by the current subtask to the slave query engine hosting the successor subtask, and sends the completion state of the current subtask to the main query engine. Specifically, each subtask has five states (waiting for execution, computing, distributing data, execution completed and execution failed), and each subtask maintains a list of its predecessor subtasks and a list of its successor subtasks. The execution state transition diagram of a subtask is shown in Fig. 6.
Take the Join1-1 operator shown in Fig. 4 as an example. Its predecessor operator list is GetColumn(L_PARTKEY Slice1 [1-100]) and GetColumn(C_CUSTKEY Slice1 [1-150]), and its successor operator list is the Join2-1 operator. The initial state of Join1-1 is waiting for execution. After the physical node hosting Join1-1 has received the data sent by all its predecessor operators, the state of Join1-1 changes to computing. After Join1-1 has finished computing, its state changes to distributing data, and the result data is sent over the network to the slave query engine hosting the successor operator. Once the data has been sent successfully, the current operator's task is complete. If a fault occurs in any of these steps, that is, between the waiting-for-execution state and the computing state, between the computing state and the distributing-data state, or between the distributing-data state and the execution-completed state, the operator state is set to execution failed. Each time the state of Join1-1 changes, the current state is reported to the main query engine in real time. Operators execute independently of one another: whenever an operator's state changes during execution, the main query engine is notified asynchronously, and result data is pushed to the physical node hosting the successor operator. In this way, whether a subtask executes depends entirely on whether its predecessor subtasks have completed, and tasks need not be executed stage by stage as in Spark or MapReduce.
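The push-based execution described here can be simulated in a single process. The sketch below is a simplification under stated assumptions: it runs operators sequentially rather than on separate nodes, omits failure handling, and uses hypothetical names (`Operator`, `run_plan`, `notify`) and made-up key values.

```python
from collections import deque


class Operator:
    def __init__(self, name, predecessors, successors, compute):
        self.name = name
        self.predecessors = list(predecessors)
        self.successors = list(successors)
        self.compute = compute  # function: {predecessor name: data} -> result
        self.inputs = {}        # intermediate data received so far


def run_plan(operators, notify):
    """Run each operator as soon as all predecessor data has arrived."""
    by_name = {op.name: op for op in operators}
    ready = deque(op for op in operators if not op.predecessors)
    results = {}
    while ready:
        op = ready.popleft()
        notify(op.name, "computing")
        results[op.name] = op.compute(op.inputs)
        notify(op.name, "distributing data")
        for successor_name in op.successors:
            successor = by_name[successor_name]
            successor.inputs[op.name] = results[op.name]  # push to the successor
            if len(successor.inputs) == len(successor.predecessors):
                ready.append(successor)  # all predecessor data has arrived
        notify(op.name, "execution completed")
    return results


def join_keys(inputs):
    """Toy equi-join: keys present in both predecessor outputs."""
    left, right = inputs["GetColumn(L_PARTKEY)"], inputs["GetColumn(C_CUSTKEY)"]
    return sorted(set(left) & set(right))


log = []  # stands in for the state reports received by the main query engine
plan = [
    Operator("GetColumn(L_PARTKEY)", [], ["Join1-1"], lambda _: [1, 2, 3, 5]),
    Operator("GetColumn(C_CUSTKEY)", [], ["Join1-1"], lambda _: [2, 3, 4]),
    Operator("Join1-1", ["GetColumn(L_PARTKEY)", "GetColumn(C_CUSTKEY)"], [], join_keys),
]
results = run_plan(plan, lambda name, state: log.append((name, state)))
print(results["Join1-1"])  # [2, 3]
```

Readiness is decided per operator from its own predecessor list, so no global stage boundary appears anywhere in the loop.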
In step S5, the intermediate data transmitted between slave query engines is compressed column data. The compression methods used are bit compression and dictionary compression. Taking the data structure shown in Fig. 7 as an example, the intermediate data comprises three vectors: a dictionary vector, an offset vector, and a position vector. The dictionary vector is built by sorting the original data and then removing duplicates, so redundant data is discarded and memory is saved. The offset vector and the position vector both store integers, so a bit-compression strategy is applied to them. On a computer, an INT occupies four bytes, i.e. 32 bits, and can represent values from -2147483648 to 2147483647. For the offset vector and position vector shown in Fig. 7, the maximum integer in each vector can be determined, so in most cases storing one number does not require 32 bits. Suppose the maximum integer in the offset vector or position vector is A; then the number of bits used to store one number is log2 A rounded up, which saves memory compared with the conventional approach of storing integers in INT or LONG variables.
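The combination of dictionary compression and bit compression can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the layout of Fig. 7: it uses one dictionary plus a single vector of integer codes rather than separate offset and position vectors, and the function names `encode_column`, `bit_width`, `pack`, and `unpack` are hypothetical. The sketch also uses `int.bit_length()`, i.e. ⌈log2(A + 1)⌉ bits, instead of ⌈log2 A⌉; the two agree except when A is a power of two, where one extra bit is needed to represent A itself.

```python
def encode_column(values):
    """Dictionary-encode a column: sorted, de-duplicated dictionary plus codes."""
    dictionary = sorted(set(values))
    index = {v: i for i, v in enumerate(dictionary)}
    codes = [index[v] for v in values]
    return dictionary, codes


def bit_width(max_value):
    """Bits needed to store any integer in the range 0..max_value."""
    return max(1, max_value.bit_length())


def pack(codes, width):
    """Pack non-negative integers into one big integer, `width` bits each."""
    packed = 0
    for i, code in enumerate(codes):
        packed |= code << (i * width)
    return packed


def unpack(packed, width, count):
    """Recover `count` integers of `width` bits each from a packed integer."""
    mask = (1 << width) - 1
    return [(packed >> (i * width)) & mask for i in range(count)]


# Illustrative column (values are made up).
column = ["CN", "US", "CN", "DE", "US", "CN"]
dictionary, codes = encode_column(column)
width = bit_width(max(codes))
packed = pack(codes, width)
restored = [dictionary[c] for c in unpack(packed, width, len(codes))]
```

For this six-value column the codes need 2 bits each, so the code vector occupies 12 bits instead of the 192 bits that six 32-bit INT values would take.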
Step S6: after the entire query plan has completed, the main query engine notifies the client to obtain the final result data from the slave query engine. At this point, the whole query is complete.
The specific embodiments described above further explain the objectives, technical solutions, and beneficial effects of the present invention. It should be understood that the foregoing are merely specific embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (10)

1. A query engine system for a distributed in-memory columnar database, characterized by comprising a resource management module, at least one main query engine, and at least one slave query engine;
the main query engine is configured to convert SQL into a query plan, divide the query plan into at least two subtasks, and monitor and schedule the execution of the query plan;
the slave query engine is configured to execute the subtasks assigned by the main query engine;
the resource management module is responsible for managing and allocating system resources.
2. The query engine system for a distributed in-memory columnar database according to claim 1, characterized in that the system resources include CPU computing resources and memory resources.
3. A query method for a distributed in-memory columnar database, based on the query engine system for a distributed in-memory columnar database according to claim 1 or 2, characterized by comprising:
the resource management module designates a main query engine to handle the session with the user;
the main query engine converts the SQL sent by the user into a query plan;
the resource management module allocates slave query engines to the main query engine and establishes communication between the slave query engines and the main query engine;
the main query engine divides the query plan into at least two subtasks and assigns a slave query engine to each subtask;
the slave query engine adds the subtasks to its task queue, executes the current subtask once all predecessor subtasks of the current subtask have finished, sends the intermediate data produced by the current subtask to the slave query engine hosting the successor subtask, and sends the completion status of the current subtask to the main query engine;
after the entire query plan has completed, the main query engine notifies the client to obtain the final result data from the slave query engine.
4. The query method for a distributed in-memory columnar database according to claim 3, characterized in that subtasks are represented by physical operators, and the physical operators include at least one of: a column data extraction operation, a join operation, a conditional filter operation, a grouping operation, an aggregate function operation, a sort operation, and an operation that converts the final result data into a row-oriented table.
5. The query method for a distributed in-memory columnar database according to claim 4, characterized in that the main query engine assigns a slave query engine to each subtask according to a cost model.
6. The query method for a distributed in-memory columnar database according to claim 5, characterized in that the main query engine assigning a slave query engine to each subtask according to the cost model includes:
obtaining, from the metadata information of the slave query engines, the IP of the node hosting each slave query engine and the database table information and column information stored on that node;
assigning, according to the data locality principle, the execution node IP of each column data extraction operation in the query plan;
using a greedy algorithm to choose the execution nodes of the operations other than column data extraction.
7. The query method for a distributed in-memory columnar database according to claim 3, characterized in that the states of each subtask include waiting for execution, computing, distributing data, execution complete, and execution failed.
8. The query method for a distributed in-memory columnar database according to claim 7, characterized in that the initial state of the current subtask is waiting for execution; after the slave query engine hosting the current subtask has received the intermediate data produced by all predecessor subtasks of the current subtask, the state of the current subtask is changed to computing; after the current subtask finishes computing, its state is changed to distributing data, and the intermediate data produced by the computation is sent to the slave query engine hosting the successor subtask; if the intermediate data is sent successfully, the state of the current subtask is changed to execution complete; if a failure occurs between waiting for execution and computing, between computing and distributing data, or between distributing data and execution complete, the state of the current subtask is changed to execution failed; whenever the state of the current subtask changes, the main query engine is notified asynchronously.
9. The query method for a distributed in-memory columnar database according to claim 3, characterized in that the intermediate data transmitted between slave query engines is compressed column data.
10. The query method for a distributed in-memory columnar database according to claim 9, characterized in that the compression includes bit compression and dictionary compression.
CN201610193220.XA 2016-03-30 2016-03-30 The query engine system and querying method of distributed memory columnar database Active CN105824957B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610193220.XA CN105824957B (en) 2016-03-30 2016-03-30 The query engine system and querying method of distributed memory columnar database

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610193220.XA CN105824957B (en) 2016-03-30 2016-03-30 The query engine system and querying method of distributed memory columnar database

Publications (2)

Publication Number Publication Date
CN105824957A true CN105824957A (en) 2016-08-03
CN105824957B CN105824957B (en) 2019-09-03

Family

ID=56524572

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610193220.XA Active CN105824957B (en) 2016-03-30 2016-03-30 The query engine system and querying method of distributed memory columnar database

Country Status (1)

Country Link
CN (1) CN105824957B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103123652A (en) * 2013-03-14 2013-05-29 曙光信息产业(北京)有限公司 Data query method and cluster database system
CN103324765A (en) * 2013-07-19 2013-09-25 西安电子科技大学 Multi-core synchronization data query optimization method based on column storage

Cited By (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106326387B (en) * 2016-08-17 2019-06-04 电子科技大学 A kind of Distributed Storage structure and date storage method and data query method
CN106326387A (en) * 2016-08-17 2017-01-11 电子科技大学 Distributive data storage architecture, data storage method and data inquiry method
CN106445645A (en) * 2016-09-06 2017-02-22 北京百度网讯科技有限公司 Method and device for executing distributed computation tasks
CN107818100A (en) * 2016-09-12 2018-03-20 杭州海康威视数字技术股份有限公司 A kind of SQL statement performs method and device
CN107818100B (en) * 2016-09-12 2019-12-20 杭州海康威视数字技术股份有限公司 SQL statement execution method and device
US11709894B2 (en) 2016-09-30 2023-07-25 Beijing Baidu Netcom Science And Technology Co., Ltd. Task processing method and distributed computing framework
WO2018058707A1 (en) * 2016-09-30 2018-04-05 北京百度网讯科技有限公司 Task processing method and distributed computing framework
CN106649503A (en) * 2016-10-11 2017-05-10 北京集奥聚合科技有限公司 Query method and system based on sql
CN107329814B (en) * 2017-06-16 2020-05-26 电子科技大学 RDMA (remote direct memory Access) -based distributed memory database query engine system
CN107329814A (en) * 2017-06-16 2017-11-07 电子科技大学 A kind of distributed memory database query engine system based on RDMA
CN107450972B (en) * 2017-07-04 2020-10-16 创新先进技术有限公司 Scheduling method and device and electronic equipment
CN107450972A (en) * 2017-07-04 2017-12-08 阿里巴巴集团控股有限公司 A kind of dispatching method, device and electronic equipment
CN110020006A (en) * 2017-07-27 2019-07-16 北京国双科技有限公司 The generation method and relevant device of query statement
CN109547512A (en) * 2017-09-22 2019-03-29 ***通信集团浙江有限公司 A kind of method and device of the distributed Session management based on NoSQL
CN110083441B (en) * 2018-01-26 2021-06-04 中兴飞流信息科技有限公司 Distributed computing system and distributed computing method
CN110083441A (en) * 2018-01-26 2019-08-02 中兴飞流信息科技有限公司 A kind of distributed computing system and distributed computing method
CN108520011A (en) * 2018-03-21 2018-09-11 哈工大大数据(哈尔滨)智能科技有限公司 A kind of method and device of determining task to carry into execution a plan
CN110968579A (en) * 2018-09-30 2020-04-07 阿里巴巴集团控股有限公司 Execution plan generation and execution method, database engine and storage medium
CN110968579B (en) * 2018-09-30 2023-04-11 阿里巴巴集团控股有限公司 Execution plan generation and execution method, database engine and storage medium
CN109471893A (en) * 2018-10-24 2019-03-15 上海连尚网络科技有限公司 Querying method, equipment and the computer readable storage medium of network data
CN110119275B (en) * 2019-05-13 2021-04-02 电子科技大学 Distributed memory column type database compiling executor architecture
CN110119275A (en) * 2019-05-13 2019-08-13 电子科技大学 A kind of distributed memory columnar database Complied executing device framework
CN110263105A (en) * 2019-05-21 2019-09-20 北京百度网讯科技有限公司 Inquiry processing method, query processing system, server and computer-readable medium
CN110263105B (en) * 2019-05-21 2021-09-10 北京百度网讯科技有限公司 Query processing method, query processing system, server, and computer-readable medium
US11194807B2 (en) 2019-05-21 2021-12-07 Beijing Baidu Netcom Science And Technology Co., Ltd. Query processing method, query processing system, server and computer readable medium
CN110300332A (en) * 2019-06-18 2019-10-01 南京科源信息技术有限公司 A kind of game loading method and system based on IPTV
CN112650561A (en) * 2019-10-11 2021-04-13 中兴通讯股份有限公司 Transaction management method, system, network device and readable storage medium
CN110990430A (en) * 2019-11-29 2020-04-10 广西电网有限责任公司 Large-scale data parallel processing system
CN110851452A (en) * 2020-01-16 2020-02-28 医渡云(北京)技术有限公司 Data table connection processing method and device, electronic equipment and storage medium
CN111382156A (en) * 2020-02-14 2020-07-07 石化盈科信息技术有限责任公司 Data acquisition method, system, device, electronic equipment and storage medium
CN111552689A (en) * 2020-03-30 2020-08-18 平安医疗健康管理股份有限公司 Method, device and equipment for calculating deduplication index of fund audit
CN111552689B (en) * 2020-03-30 2022-05-03 平安医疗健康管理股份有限公司 Method, device and equipment for calculating deduplication index of fund audit
CN111723112B (en) * 2020-06-11 2023-07-07 咪咕文化科技有限公司 Data task execution method and device, electronic equipment and storage medium
CN111723112A (en) * 2020-06-11 2020-09-29 咪咕文化科技有限公司 Data task execution method and device, electronic equipment and storage medium
CN112000688A (en) * 2020-08-14 2020-11-27 杭州数云信息技术有限公司 Query method and query system based on universal query language
CN112416926A (en) * 2020-11-02 2021-02-26 浙商银行股份有限公司 Design method of distributed database high-performance actuator supporting domestic CPU SIMD instruction
CN112269835A (en) * 2020-11-10 2021-01-26 浪潮云信息技术股份公司 Method for asynchronously reading and processing batch data by distributed database
CN113946600A (en) * 2021-10-21 2022-01-18 北京人大金仓信息技术股份有限公司 Data query method, data query device, computer equipment and medium
CN113792079B (en) * 2021-11-17 2022-02-08 腾讯科技(深圳)有限公司 Data query method and device, computer equipment and storage medium
CN113792079A (en) * 2021-11-17 2021-12-14 腾讯科技(深圳)有限公司 Data query method and device, computer equipment and storage medium
CN113934763A (en) * 2021-12-17 2022-01-14 北京奥星贝斯科技有限公司 SQL query method and device for distributed database

Also Published As

Publication number Publication date
CN105824957B (en) 2019-09-03


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant