CN113127509A - Method and device for adapting SQL execution engine in PaaS platform - Google Patents

Publication number
CN113127509A
Authority
CN
China
Prior art keywords
sql
execution engine
metadata
information
decision
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911413753.4A
Other languages
Chinese (zh)
Other versions
CN113127509B (en
Inventor
颜涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Group Chongqing Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Group Chongqing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Group Chongqing Co Ltd
Priority to CN201911413753.4A (CN113127509B)
Publication of CN113127509A
Application granted
Publication of CN113127509B
Status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and a device for adapting an SQL execution engine in a PaaS platform, wherein the method comprises the following steps: receiving an SQL statement input by a user; parsing the SQL statement to obtain SQL metadata information; extracting key information in the SQL metadata information from the Hadoop platform to form table metadata; inputting the SQL metadata information and the table metadata into an SQL selector for processing to obtain an SQL execution engine selection result; and providing the SQL statement to the SQL execution engine corresponding to the SQL execution engine selection result for execution. In this way, the SQL execution engine is adapted automatically: when a user accesses data of the PaaS platform using SQL, the most suitable SQL engine is identified and adapted automatically according to the SQL submitted by the user, the uncertainty of human experience is avoided, and the corresponding SQL statement is executed in the PaaS platform by the most appropriate SQL execution engine.

Description

Method and device for adapting SQL execution engine in PaaS platform
Technical Field
The invention relates to the technical field of computers, in particular to an adaptation method and an adaptation device of an SQL execution engine in a PaaS platform.
Background
The open data service layer of most PaaS (Platform as a Service) platforms is built on Hadoop. In the general execution flow, a user accesses Hadoop data through the PaaS platform data service, and both the computation and the data storage are completed on Hadoop. Many SQL processing engines are available on Hadoop; Hive, Spark SQL, Flink and Impala are the mainstream ones. Each SQL processing engine suits different scenarios: it performs best in the scenarios it is good at, while in unsuitable scenarios its efficiency and stability can drop sharply or the job can fail to run at all. Running a program on the Hadoop platform with an unsuitable SQL engine easily leads to unstable efficiency or failure, which increases the operation and maintenance difficulty of the service. Moreover, a Hadoop platform generally runs many programs every day, and a large number of inefficient programs causes an obvious waste of the computing resources of the whole platform.
In the prior art, when developing and submitting SQL jobs, technicians mostly determine the SQL execution engine based on their own experience or on the results of repeated test runs.
On the one hand, selecting the execution engine by manual judgment depends heavily on the personal ability of the technician, and there is no guarantee that the selected SQL engine is the most suitable one for the business scenario. On the other hand, selecting the optimal engine by running efficiency tests on different SQL engines is inefficient, and repeatedly running the same program on each SQL engine to test its efficiency wastes platform computing resources. Therefore, an efficient method for adapting the SQL execution engine is needed.
Disclosure of Invention
In view of the above, the present invention is proposed to provide an adaptation method and apparatus for an SQL execution engine in a PaaS platform, which overcome or at least partially solve the above problems.
According to one aspect of the invention, an adaptation method of an SQL execution engine in a PaaS platform is provided, which comprises the following steps:
receiving an SQL statement input by a user;
analyzing the SQL statement to obtain SQL metadata information;
extracting key information in the SQL metadata information from the Hadoop platform to form table metadata;
inputting the SQL metadata information and the table metadata into an SQL selector for processing to obtain an SQL execution engine selection result;
and providing the SQL statement to the SQL execution engine corresponding to the SQL execution engine selection result for execution.
Optionally, the method further comprises:
monitoring the execution process of the SQL execution engine corresponding to the SQL execution engine selection result to obtain an operation performance index;
and storing the operation performance index, the SQL metadata information and the table metadata into a historical index library.
Optionally, the SQL selector comprises at least one decision unit with a prediction model;
inputting the SQL metadata information and the table metadata into the SQL selector for processing to obtain an SQL execution engine selection result specifically includes:
integrating the SQL metadata information and the table metadata into a decision request, and providing the decision request to the at least one decision unit, so that the at least one decision unit makes a decision and outputs an SQL execution engine decision result;
and selecting the SQL execution engine selection result from the at least one SQL execution engine decision result by voting.
Optionally, in a self-learning manner, the prediction model of the SQL selector is trained according to data in the historical index library, or the prediction model of the decision unit included in the SQL selector is trained according to data in the historical index library.
Optionally, the SQL metadata information comprises at least one of the following information: table, field type, field length, result field, sorting information, association information, operator type and temporary table insertion times;
the key information in the SQL metadata information includes at least one of the following information: table information in the SQL metadata information, field information in the SQL metadata information, storage information in the SQL metadata information.
According to another aspect of the present invention, an adapting apparatus for an SQL execution engine in a PaaS platform is provided, which includes:
the SQL receiver is used for receiving an SQL statement input by a user;
the SQL parser is used for parsing the SQL statement to obtain SQL metadata information;
the table metadata acquirer is used for extracting key information in the SQL metadata information from the Hadoop platform to form table metadata;
the SQL selector receives and processes the SQL metadata information and the table metadata and outputs an SQL execution engine selection result;
and the SQL executor is used for providing the SQL statements to the SQL execution engine corresponding to the SQL execution engine selection result for execution.
Optionally, the apparatus further comprises:
the detection module is used for monitoring the execution process of the SQL execution engine corresponding to the SQL execution engine selection result to obtain an operation performance index;
and the storage module is used for storing the operation performance index, the SQL metadata information and the table metadata into a historical index library.
Optionally, the SQL selector comprises at least one decision unit with a prediction model, a data receiving unit, and a decision outputter;
the data receiving unit is used for receiving the SQL metadata and the table metadata, integrating the SQL metadata information and the table metadata into a decision request, and providing the decision request to the at least one decision unit so that the at least one decision unit can make a decision and output an SQL execution engine decision result;
and the decision output device is used for selecting the SQL execution engine selection result from the at least one SQL execution engine decision result according to a voting mode.
Optionally, the apparatus further comprises:
and the self-learning module is used for training the prediction model of the SQL selector according to the data in the historical index library or training the prediction model of the decision unit contained in the SQL selector according to the data in the historical index library in a self-learning mode.
Optionally, the SQL metadata information comprises at least one of the following information: table, field type, field length, result field, sorting information, association information, operator type and temporary table insertion times;
the key information in the SQL metadata information includes at least one of the following information: table information in the SQL metadata information, field information in the SQL metadata information, storage information in the SQL metadata information.
According to yet another aspect of the present invention, there is provided a computing device comprising: the system comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface complete mutual communication through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction enables the processor to execute the operation corresponding to the adaptation method of the SQL execution engine in the PaaS platform.
According to another aspect of the present invention, a computer storage medium is provided, where at least one executable instruction is stored in the storage medium, and the executable instruction causes a processor to execute an operation corresponding to the adaptation method of the SQL execution engine in the PaaS platform.
According to the method and the device for adapting the SQL execution engine in the PaaS platform provided by the invention, the method comprises the following steps: receiving an SQL statement input by a user; parsing the SQL statement to obtain SQL metadata information; extracting key information in the SQL metadata information from the Hadoop platform to form table metadata; inputting the SQL metadata information and the table metadata into an SQL selector for processing to obtain an SQL execution engine selection result; and providing the SQL statement to the SQL execution engine corresponding to the SQL execution engine selection result for execution. In this way, the SQL execution engine is adapted automatically: when a user accesses data of the PaaS platform using SQL, the most suitable SQL engine is identified and adapted automatically according to the SQL submitted by the user, the uncertainty of human experience is avoided, and the corresponding SQL statement is executed in the PaaS platform by the most appropriate SQL execution engine.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
Fig. 1 shows a flowchart of an adaptation method of an SQL execution engine in a PaaS platform according to an embodiment of the present invention;
Fig. 2 shows a flowchart of an adaptation method of an SQL execution engine in a PaaS platform according to another embodiment of the present invention;
Fig. 3 shows a schematic structural diagram of an adaptation apparatus of an SQL execution engine in a PaaS platform according to an embodiment of the present invention;
Fig. 4 shows a flowchart of the SQL selector in an embodiment of the present invention;
Fig. 5 shows a schematic structural diagram of an adaptation apparatus of an SQL execution engine in a PaaS platform in an embodiment of the present invention;
Fig. 6 shows a schematic structural diagram of a computing device provided by an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention can be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
Fig. 1 shows a flowchart of an adaptation method for an SQL execution engine in a PaaS platform according to an embodiment of the present invention. In this embodiment the PaaS platform is implemented on a Hadoop platform. As shown in Fig. 1, the method includes the following steps:
Step S101, receiving an SQL statement input by a user.
When a platform user accesses PaaS platform data using SQL, an SQL statement needs to be submitted. In practice, platform users may include program developers, business analysts, data modelers, and so on.
Step S102, parsing the SQL statement to obtain SQL metadata information.
After the SQL statement input by the user is received, it is parsed to obtain the SQL metadata information.
Step S103, extracting key information in the SQL metadata information from the Hadoop platform to form table metadata.
The SQL metadata information is then matched against the metadata of each table on the Hadoop platform, and the key information in the SQL metadata information is extracted to form the Hadoop table metadata.
Step S104, inputting the SQL metadata information and the table metadata into the SQL selector for processing to obtain an SQL execution engine selection result.
The SQL metadata information and the table metadata are input into the SQL selector, which makes a selection judgment based on these data and selects the SQL execution engine most suitable for processing the SQL statement, yielding the SQL execution engine selection result.
For example, the SQL selector is a prediction model, which performs a prediction judgment on the SQL metadata information and the table metadata and outputs the SQL execution engine selection result.
Step S105, providing the SQL statement to the SQL execution engine corresponding to the SQL execution engine selection result for execution.
Finally, the SQL statement is provided to the selected, most appropriate SQL execution engine for execution.
According to the adaptation method of the SQL execution engine in the PaaS platform provided by this embodiment of the invention, the SQL statement input by the user is parsed, the corresponding SQL metadata and Hadoop table metadata are extracted, and the most suitable SQL execution engine is selected according to these data. The embodiment thus provides a method for automatically adapting the SQL execution engine: when a user accesses PaaS platform data using SQL, the most suitable SQL engine is identified and adapted automatically according to the SQL submitted by the user. This avoids the uncertainty of manual experience, executes the SQL statement in the PaaS platform on the most appropriate SQL execution engine, further avoids unstable efficiency or run failures, reduces the operation and maintenance difficulty of the service, and avoids wasting computing resources.
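Steps S101 to S105 can be pictured as a small pipeline. The following is only a minimal sketch under stated assumptions: the function name `adapt_and_execute`, the callable parameters, and the stand-in engines are all hypothetical and are not taken from the patent, which does not specify an implementation.

```python
from typing import Callable, Dict, Tuple


def adapt_and_execute(
    sql: str,
    parse_sql: Callable[[str], dict],
    fetch_table_metadata: Callable[[dict], dict],
    select_engine: Callable[[dict, dict], str],
    engines: Dict[str, Callable[[str], object]],
) -> Tuple[str, object]:
    """Run steps S101-S105 for one SQL statement; return (engine name, result)."""
    sql_meta = parse_sql(sql)                     # S102: SQL metadata information
    table_meta = fetch_table_metadata(sql_meta)   # S103: table metadata
    engine = select_engine(sql_meta, table_meta)  # S104: selection result
    if engine not in engines:
        raise KeyError(f"no executor registered for engine {engine!r}")
    return engine, engines[engine](sql)           # S105: execute on that engine


# Stand-in components (hypothetical), for illustration only.
demo_engines = {
    "hive": lambda sql: f"hive ran: {sql}",
    "impala": lambda sql: f"impala ran: {sql}",
}
chosen, output = adapt_and_execute(
    "SELECT id FROM orders",                      # S101: statement from the user
    parse_sql=lambda sql: {"tables": ["orders"]},
    fetch_table_metadata=lambda meta: {"orders": {"size_bytes": 10_000}},
    select_engine=lambda sm, tm: (
        "impala" if tm["orders"]["size_bytes"] < 1_000_000 else "hive"
    ),
    engines=demo_engines,
)
print(chosen, "->", output)
```

Passing each stage in as a callable keeps the sketch independent of any particular parser, metastore client, or prediction model.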
Fig. 2 shows a flowchart of an adaptation method for an SQL execution engine in a PaaS platform according to another embodiment of the present invention. In this embodiment the PaaS platform is implemented on a Hadoop platform. As shown in Fig. 2, the method includes the following steps:
Step S201, receiving an SQL statement input by a user.
When a platform user accesses PaaS platform data using SQL, an SQL statement needs to be submitted. In practice, platform users may include program developers, business analysts, data modelers, and so on.
Step S202, parsing the SQL statement to obtain SQL metadata information.
The SQL statement is parsed to obtain the SQL metadata information. Optionally, the SQL metadata information includes at least one of the following: table, field type, field length, result field, sorting information, association information, operator type, and the number of temporary table insertions.
An example of the detailed metadata information to be obtained by parsing the SQL statement is given below. Table 1 is an example of the input-table metadata information of the input SQL. Table 2 is an example of the association metadata information of the input SQL. Table 3 is an example of the key-operator metadata information of the input SQL.
Table 1
[Table image not reproduced in this text: Figure BDA0002350648250000071]
Table 2
[Table image not reproduced in this text: Figure BDA0002350648250000072]
Table 3
[Table image not reproduced in this text: Figure BDA0002350648250000081]
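To make the parsing step concrete, the sketch below pulls a few of the listed metadata items (tables, association types, sorting fields, operator types, temporary table insertions) out of a statement with regular expressions. It is a toy illustration, not the parser of the patent: a real system would use a proper SQL parser, the dictionary keys are hypothetical, and treating tables prefixed with `tmp_` as temporary tables is purely an assumption made for this example.

```python
import re

AGGREGATES = ("COUNT", "SUM", "AVG", "MIN", "MAX")


def parse_sql_metadata(sql: str) -> dict:
    """Extract a few SQL metadata items from one statement (regex-based toy)."""
    text = " ".join(sql.split())  # collapse newlines and repeated spaces
    # Table names that follow FROM or JOIN (subqueries are ignored).
    tables = re.findall(r"\b(?:FROM|JOIN)\s+([A-Za-z_][\w.]*)", text, flags=re.I)
    # Association (join) types; a bare JOIN is reported as INNER.
    joins = re.findall(
        r"\b(?:(INNER|LEFT|RIGHT|FULL|CROSS)\s+)?(?:OUTER\s+)?JOIN\b",
        text,
        flags=re.I,
    )
    join_types = [(j or "INNER").upper() for j in joins]
    # Sorting fields from the ORDER BY clause, if any.
    order = re.search(r"\bORDER\s+BY\s+(.+?)(?:\s+LIMIT\b|;|$)", text, flags=re.I)
    sort_fields = [f.strip() for f in order.group(1).split(",")] if order else []
    # Operator types: aggregate functions plus GROUP BY.
    operators = [a for a in AGGREGATES if re.search(rf"\b{a}\s*\(", text, flags=re.I)]
    if re.search(r"\bGROUP\s+BY\b", text, flags=re.I):
        operators.append("GROUP BY")
    # Assumption: temporary tables are recognisable by a "tmp_" name prefix.
    temp_inserts = len(
        re.findall(
            r"\bINSERT\s+(?:INTO|OVERWRITE)\s+(?:TABLE\s+)?tmp_\w+", text, flags=re.I
        )
    )
    return {
        "tables": tables,
        "join_types": join_types,
        "sort_fields": sort_fields,
        "operators": operators,
        "temp_table_inserts": temp_inserts,
    }


meta = parse_sql_metadata(
    """
    SELECT o.region, SUM(o.amount)
    FROM orders o LEFT JOIN customers c ON o.cid = c.id
    GROUP BY o.region
    ORDER BY o.region
    """
)
print(meta)
```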
Step S203, extracting the key information in the SQL metadata information from the Hadoop platform to form table metadata.
The SQL metadata information is matched against the metadata of each table on the Hadoop platform, and the key information in the SQL metadata information is extracted to form the Hadoop table metadata.
Optionally, the key information in the SQL metadata information includes at least one of the following: table information in the SQL metadata information, field information in the SQL metadata information, storage information in the SQL metadata information. The constructed table metadata includes at least one of: table size, storage format, field type, field length.
Table 4 is an example of the field metadata information, in the Hadoop platform, of the input tables of the input SQL, and Table 5 is an example of the storage metadata information, in the Hadoop platform, of the input tables of the input SQL.
Table 4
[Table image not reproduced in this text: Figure BDA0002350648250000082]
Table 5
[Table image not reproduced in this text: Figure BDA0002350648250000091]
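The matching in step S203 amounts to looking up each table named in the SQL metadata and collecting its size, storage format and field types. The sketch below illustrates this with an in-memory dictionary standing in for the Hadoop platform's table metadata; the `TableInfo` type, the `CATALOG` contents and the function name are hypothetical, and a real deployment would query the platform's metadata service instead.

```python
from dataclasses import asdict, dataclass
from typing import Dict


@dataclass
class TableInfo:
    size_bytes: int
    storage_format: str
    fields: Dict[str, str]  # field name -> field type


# Hypothetical stand-in for the table metadata held by the Hadoop platform.
CATALOG = {
    "orders": TableInfo(
        5_000_000_000,
        "parquet",
        {"cid": "bigint", "amount": "decimal(18,2)", "region": "string"},
    ),
    "customers": TableInfo(20_000_000, "orc", {"id": "bigint", "name": "string"}),
}


def build_table_metadata(sql_meta: dict, catalog: Dict[str, TableInfo]) -> dict:
    """Match the tables named in the SQL metadata against the catalog."""
    table_meta = {}
    for name in sql_meta.get("tables", []):
        info = catalog.get(name)
        if info is None:
            raise LookupError(f"table {name!r} not found on the platform")
        table_meta[name] = asdict(info)
    return table_meta


table_meta = build_table_metadata({"tables": ["orders", "customers"]}, CATALOG)
print(table_meta["customers"]["storage_format"], table_meta["orders"]["size_bytes"])
```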
Step S204, integrating the SQL metadata information and the table metadata into a decision request, and providing the decision request to at least one decision unit, so that the at least one decision unit makes a decision and outputs an SQL execution engine decision result.
In this embodiment of the invention, the SQL selector comprises at least one decision unit with a prediction model; the total number of decision units is odd, and the decision units are independent of one another. The SQL metadata information and the table metadata are integrated into a decision request, which is provided to each decision unit. Each decision unit makes a decision judgment on the received decision request through its own prediction model and outputs an SQL execution engine decision result.
Each decision unit trains its prediction model in a self-learning manner according to data in the historical index library. The historical index library contains the SQL metadata information of historical runs, the historical table metadata, and the historical SQL operation performance indexes. Specifically, the prediction model of each decision unit is trained at a preset period through an independent learning model.
Step S205, selecting an SQL execution engine selection result from the at least one SQL execution engine decision result by voting.
Then, a final result, namely the SQL execution engine selection result, is selected by voting from the SQL execution engine decision results output by the decision units. The SQL execution engine selection result is the execution engine that, according to the SQL metadata and the Hadoop table metadata, is most suitable for running the SQL statement.
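Steps S204 and S205 can be illustrated as a fan-out to an odd number of independent decision units followed by a majority vote. In this sketch the three rule-based units are hypothetical stand-ins for trained prediction models, and the thresholds and engine names they return are assumptions chosen only to make the example run.

```python
from collections import Counter
from typing import Callable, List

DecisionUnit = Callable[[dict], str]  # decision request -> engine name


def build_decision_request(sql_meta: dict, table_meta: dict) -> dict:
    """Integrate SQL metadata and table metadata into one decision request."""
    return {"sql": sql_meta, "tables": table_meta}


def vote(request: dict, units: List[DecisionUnit]) -> str:
    """Ask every decision unit independently and return the majority choice."""
    if len(units) % 2 == 0:
        raise ValueError("an odd, non-zero number of decision units is required")
    results = [unit(request) for unit in units]
    winner, _count = Counter(results).most_common(1)[0]
    return winner


# Hypothetical rule-based units standing in for trained prediction models.
def by_size(req: dict) -> str:
    total = sum(t["size_bytes"] for t in req["tables"].values())
    return "impala" if total < 1_000_000_000 else "hive"


def by_joins(req: dict) -> str:
    return "spark" if len(req["sql"]["join_types"]) >= 3 else "impala"


def by_sort(req: dict) -> str:
    return "spark" if req["sql"]["sort_fields"] else "impala"


request = build_decision_request(
    {"join_types": ["LEFT"], "sort_fields": []},
    {"orders": {"size_bytes": 5_000_000_000}, "customers": {"size_bytes": 20_000_000}},
)
selection = vote(request, [by_size, by_joins, by_sort])
print(selection)
```

An odd number of units rules out ties only when two engines are in play. With more candidate engines every unit can name a different one, in which case `Counter.most_common` returns the first result encountered; the text does not say how such a tie should be resolved, so this behaviour is an arbitrary choice of the sketch.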
Step S206, providing the SQL statement to the SQL execution engine corresponding to the SQL execution engine selection result for execution.
The SQL statement is provided to the selected, most appropriate SQL execution engine for execution.
Step S207, monitoring the execution process of the SQL execution engine corresponding to the SQL execution engine selection result to obtain an operation performance index.
The running of the SQL statement on the selected SQL execution engine is monitored to obtain the operation performance indexes, which specifically include run duration, CPU consumption, memory consumption, and the like.
Step S208, storing the operation performance index, the SQL metadata information and the table metadata into a historical index library.
The operation performance index, the SQL metadata information and the table metadata are merged and recorded in the historical index library. As described above, the data in the historical index library can be used to train the prediction model of each decision unit.
Table 6 is an example of the performance indexes of a run on the SQL execution engine.
Table 6
[Table image not reproduced in this text: Figure BDA0002350648250000101]
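Steps S207 and S208 can be sketched as a wrapper that times the execution and appends one merged record to the history. This is an illustration under assumptions: the record layout and the function name are hypothetical, a Python list of JSON strings stands in for the historical index library, and only wall-clock duration is measured here, since CPU and memory consumption would have to come from the engine's or cluster's own metrics.

```python
import json
import time
from typing import Callable, List


def run_and_record(
    sql: str,
    engine: str,
    execute: Callable[[str], object],
    sql_meta: dict,
    table_meta: dict,
    history: List[str],
) -> dict:
    """Execute the statement, measure its duration, and store one merged record."""
    start = time.perf_counter()
    status = "success"
    try:
        execute(sql)
    except Exception:
        status = "failed"  # failed runs are recorded too
    duration = time.perf_counter() - start
    record = {
        "engine": engine,
        "status": status,
        "duration_s": duration,
        "sql_meta": sql_meta,      # merged with the performance index, as in S208
        "table_meta": table_meta,
    }
    history.append(json.dumps(record))  # one JSON line per run
    return record


history_library: List[str] = []
rec = run_and_record(
    "SELECT 1",
    "impala",
    execute=lambda sql: None,  # stand-in for submitting to the real engine
    sql_meta={"tables": []},
    table_meta={},
    history=history_library,
)
print(rec["status"], len(history_library))
```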
According to the adaptation method of the SQL execution engine in the PaaS platform provided by this embodiment of the invention, the SQL statement input by the user is parsed, the corresponding SQL metadata and Hadoop table metadata are extracted, a plurality of decision units make decisions on these data, and the most suitable SQL execution engine is selected according to the decision results. The embodiment thus provides a method for automatically adapting the SQL execution engine: when a user accesses PaaS platform data using SQL, the most suitable SQL engine is identified and adapted automatically according to the SQL submitted by the user. This avoids the uncertainty of manual experience, executes the SQL statement in the PaaS platform on the most appropriate SQL execution engine, further avoids unstable efficiency or run failures, reduces the operation and maintenance difficulty of the service, and avoids wasting platform computing resources. Meanwhile, the decision units are capable of self-learning and self-optimization based on the data in the historical index library, which improves the accuracy of the selection and avoids the workload of manual model tuning.
Fig. 3 shows a schematic structural diagram of an adaptation apparatus of an SQL execution engine in a PaaS platform according to an embodiment of the present invention. As shown in Fig. 3, the apparatus includes:
an SQL receiver 31, configured to receive an SQL statement input by a user;
an SQL parser 32, configured to parse the SQL statement to obtain SQL metadata information.
The apparatus first receives the SQL statement submitted by the user and passes it to the SQL parser for parsing. Parsing extracts the required metadata information from the SQL statement submitted by the user, including the tables and fields to be queried, association types, association fields, sorting types, sorting fields, the number of temporary table insertions, and the like. This metadata information serves as input data for the table metadata acquirer and the SQL selector.
A table metadata acquirer 33, configured to extract key information in the SQL metadata information from the Hadoop platform to form table metadata.
The table metadata acquirer receives the SQL metadata information and extracts the key information in it from the Hadoop platform to form the table metadata, which serves as input data for the SQL selector.
An SQL selector 34, configured to receive and process the SQL metadata information and the table metadata and to output an SQL execution engine selection result.
The metadata parsed by the SQL parser and the table metadata acquired by the table metadata acquirer are submitted as input data to the SQL selector, which selects the optimal SQL engine.
An SQL executor 35, configured to provide the SQL statement to the SQL execution engine corresponding to the SQL execution engine selection result for execution.
In an optional manner, the SQL selector 34 includes a decision maker, a data receiving unit, and a decision outputter; the decision maker comprises at least one decision unit with a prediction model.
The data receiving unit is configured to receive the SQL metadata and the table metadata, integrate the SQL metadata information and the table metadata into a decision request, and provide the decision request to the at least one decision unit, so that the at least one decision unit makes a decision and outputs an SQL execution engine decision result.
The decision outputter is configured to select the SQL execution engine selection result from the at least one SQL execution engine decision result by voting.
The data receiving unit is responsible for receiving the submitted SQL metadata and the table metadata of the Hadoop platform, integrating the two into a complete decision request, and submitting it to the decision maker. The decision maker consists of an odd number of mutually independent decision units, each of which makes a decision judgment on the decision request submitted by the data receiving unit and outputs its optimal SQL execution engine to the decision outputter. After receiving the output of every decision unit, the decision outputter selects the final result, namely the optimal SQL execution engine, by voting.
Fig. 4 shows a schematic flow diagram of the SQL selector in an embodiment of the present invention. As shown in Fig. 4, the SQL metadata and the Hadoop table metadata are received first and integrated into a decision request. The decision request is sent to decision unit 1, decision unit 2, decision unit 3, decision unit 4, and so on; the output of each decision unit is provided to the decision outputter, which outputs the finally selected optimal SQL execution engine selection result.
In an optional manner, the apparatus further includes: the detection module is used for monitoring the execution process of the SQL execution engine corresponding to the SQL execution engine selection result to obtain an operation performance index; and the storage module is used for storing the operation performance index, the SQL metadata information and the table metadata into a historical index library.
Fig. 5 shows a schematic structural diagram of an adaptation apparatus of an SQL execution engine in a PaaS platform in an embodiment of the present invention, as shown in fig. 5, the apparatus includes: the system comprises an SQL receiver, an SQL parser, a Hadoop table metadata acquirer, an SQL execution engine selector, an SQL submission executor, an SQL selection self-learning device and a historical SQL operation index library.
The SQL receiver receives SQL sentences submitted by a user, submits the SQL sentences to the SQL parser to parse SQL, and extracts SQL metadata information which is used as input data of the Hadoop table metadata acquirer and the SQL execution engine selector. The Hadoop table metadata acquirer matches SQL metadata information with each table metadata in the Hadoop platform, acquires key information in the SQL metadata to form Hadoop table metadata, and the Hadoop table metadata is also used as input data of the SQL execution engine selector.
Then, the SQL metadata produced by the SQL parser and the table metadata obtained by the Hadoop table metadata acquirer are submitted as input data to the SQL execution engine selector, which selects the optimal SQL execution engine. The SQL execution engine selector comprises a data receiving unit, a decision maker and a decision outputter. The specific implementation process is as follows:
The data receiving unit is responsible for receiving the submitted SQL metadata and the table metadata of the Hadoop platform, integrating the two into a complete decision request, and submitting the decision request to the decision maker.
The decision maker consists of an odd number of independent decision units. Each decision unit judges the decision request submitted by the data receiving unit on its own and outputs its decision result for the optimal SQL execution engine to the decision outputter.
After receiving the output results of all the decision units, the decision outputter selects the final result, namely the optimal SQL execution engine, by voting.
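The selector described above (data receiving unit, odd number of decision units, decision outputter) can be sketched as a simple majority vote. The three rule-based decision units and their thresholds below are hypothetical placeholders for the trained models, and the tie-breaking behaviour (the engine that received the earliest vote wins) is an assumption, since the text only says that the result is chosen by voting.

```python
from collections import Counter
from typing import Any, Callable, Dict, Sequence

DecisionRequest = Dict[str, Any]
DecisionUnit = Callable[[DecisionRequest], str]


def build_decision_request(sql_metadata: Dict[str, Any],
                           table_metadata: Dict[str, Any]) -> DecisionRequest:
    """Data receiving unit: merge both metadata sets into one request."""
    return {"sql": sql_metadata, "tables": table_metadata}


def select_engine(request: DecisionRequest, units: Sequence[DecisionUnit]) -> str:
    """Decision outputter: return the engine with the most votes."""
    if len(units) % 2 == 0:
        raise ValueError("expected an odd number of decision units")
    votes = Counter(unit(request) for unit in units)
    # With more than two candidate engines a tie is still possible;
    # most_common then keeps the engine that was voted for first.
    return votes.most_common(1)[0][0]


# Hypothetical rule-based units standing in for trained prediction models.
def by_size(req: DecisionRequest) -> str:
    total = sum(t.get("size_bytes", 0) for t in req["tables"].values())
    return "impala" if total < 1_000_000_000 else "spark"


def by_joins(req: DecisionRequest) -> str:
    return "spark" if req["sql"].get("join_count", 0) >= 2 else "impala"


def by_sort(req: DecisionRequest) -> str:
    return "hive" if req["sql"].get("has_order_by") else "impala"
```

Using an odd number of units guarantees a strict majority only when the units choose between two engines, which is why the sketch documents its tie-breaking rule explicitly.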
The SQL engine selection self-learning device provides decision support for each decision unit in the SQL execution engine selector and acts as the brain of the decision units. Each decision unit is an independent algorithm model that learns automatically every day, so that its decision accuracy accumulates and improves continuously.
Self-learning of a decision unit specifically means acquiring the SQL metadata, table metadata and historical SQL operation performance indexes of past runs from the historical SQL operation index library, and performing decision training every day through an independent learning model.
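The text does not name the learning algorithm used by the decision units, so the following is only a minimal sketch of one possible self-learning unit: it groups historical runs into coarse feature buckets and remembers, per bucket, the engine with the lowest mean run time. The record fields (`join_count`, `input_bytes`, `engine`, `elapsed_seconds`), the bucketing rule, and the class name are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean
from typing import Any, Dict, List, Tuple

Bucket = Tuple[int, bool]


def bucket(features: Dict[str, Any]) -> Bucket:
    """Hypothetical bucketing: join count capped at 3, and input larger than 1 GB."""
    return (min(features["join_count"], 3), features["input_bytes"] > 1_000_000_000)


class BucketedDecisionUnit:
    """Toy decision unit retrained from the historical index library."""

    def __init__(self, default_engine: str = "hive") -> None:
        self.default_engine = default_engine
        self.best: Dict[Bucket, str] = {}

    def train(self, history: List[Dict[str, Any]]) -> None:
        # Collect run times per (bucket, engine) from the historical records.
        timings: Dict[Bucket, Dict[str, List[float]]] = defaultdict(lambda: defaultdict(list))
        for rec in history:
            timings[bucket(rec)][rec["engine"]].append(rec["elapsed_seconds"])
        # For each bucket keep the engine with the lowest mean run time.
        self.best = {
            key: min(per_engine, key=lambda e: mean(per_engine[e]))
            for key, per_engine in timings.items()
        }

    def decide(self, request: Dict[str, Any]) -> str:
        # Fall back to the default engine for buckets never seen in training.
        return self.best.get(bucket(request), self.default_engine)
```

Calling `train` once a day on the latest contents of the library would correspond to the daily retraining described above; any real model (decision tree, gradient boosting, and so on) could replace the bucket table behind the same `train`/`decide` interface.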
Finally, the SQL submission executor submits the SQL statement to the selected SQL execution engine, such as Hive, Spark, Flink, Impala, and so on.
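One way to picture the SQL submission executor is as a registry that maps an engine name to a submit function. The sketch below assumes exactly that; the class `SqlSubmissionExecutor` and the submitter callables are hypothetical, and no real Hive, Spark, Flink or Impala client API is used here.

```python
from typing import Callable, Dict

Submitter = Callable[[str], str]


class SqlSubmissionExecutor:
    """Dispatch an SQL statement to the submitter registered for an engine."""

    def __init__(self) -> None:
        self._submitters: Dict[str, Submitter] = {}

    def register(self, engine: str, submitter: Submitter) -> None:
        # Engine names are matched case-insensitively.
        self._submitters[engine.lower()] = submitter

    def submit(self, engine: str, sql: str) -> str:
        try:
            submitter = self._submitters[engine.lower()]
        except KeyError:
            raise ValueError(f"no submitter registered for engine {engine!r}") from None
        return submitter(sql)
```

In practice each registered submitter would wrap the client of the corresponding engine; keeping them behind one interface means the selector only has to return an engine name.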
An embodiment of the invention provides a non-volatile computer storage medium. The computer storage medium stores at least one executable instruction, and the executable instruction can execute the adaptation method of the SQL execution engine in the PaaS platform in any of the above method embodiments.
The executable instructions may be specifically configured to cause the processor to:
receiving an SQL statement input by a user;
analyzing the SQL statement to obtain SQL metadata information;
extracting key information in the SQL metadata information from the Hadoop platform to form table metadata;
inputting the SQL metadata information and the table metadata into an SQL selector for processing to obtain an SQL execution engine selection result;
and providing the SQL statement to the SQL execution engine corresponding to the SQL execution engine selection result for execution.
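The five operations listed above form a linear pipeline. A minimal sketch, assuming each stage is supplied as a callable (the function `adapt_and_execute` and its parameter names are hypothetical), might look like this:

```python
from typing import Any, Callable, Dict, Tuple

Metadata = Dict[str, Any]


def adapt_and_execute(
    sql: str,                                      # step 1: the received SQL statement
    parse: Callable[[str], Metadata],
    acquire: Callable[[Metadata], Metadata],
    select: Callable[[Metadata, Metadata], str],
    submit: Callable[[str, str], Any],
) -> Tuple[str, Any]:
    """Run one SQL statement through the adaptation pipeline."""
    sql_metadata = parse(sql)                      # step 2: parse into SQL metadata
    table_metadata = acquire(sql_metadata)         # step 3: form table metadata
    engine = select(sql_metadata, table_metadata)  # step 4: engine selection result
    result = submit(engine, sql)                   # step 5: execute on that engine
    return engine, result
```

Passing the stages in as parameters keeps the sketch independent of any particular parser, metadata store, selector model, or engine client.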
In an alternative, the executable instructions cause the processor to:
monitoring the execution process of the SQL execution engine corresponding to the SQL execution engine selection result to obtain an operation performance index;
and storing the operation performance index, the SQL metadata information and the table metadata into a historical index library.
In an alternative, the SQL selector comprises at least one decision unit with a predictive model, the executable instructions causing the processor to:
integrating the SQL metadata information and the table metadata into a decision request, and providing the decision request to the at least one decision unit so that the at least one decision unit makes a decision and outputs an SQL execution engine decision result;
and selecting the SQL execution engine selection result from the at least one SQL execution engine decision result by voting.
In an alternative, the executable instructions cause the processor to:
training the prediction model of the SQL selector in a self-learning manner according to the data in the historical index library, or training the prediction model of the decision unit contained in the SQL selector according to the data in the historical index library.
In an alternative approach, the SQL metadata information comprises at least one of the following information: table, field type, field length, result field, sorting information, association information, operator type and number of temporary table insertions;
the key information in the SQL metadata information includes at least one of the following information: table information in the SQL metadata information, field information in the SQL metadata information, storage information in the SQL metadata information.
Fig. 6 is a schematic structural diagram of an embodiment of a computing device according to the present invention. The specific embodiments of the present invention do not limit the specific implementation of the computing device.
As shown in Fig. 6, the computing device may include: a processor 602, a communication interface 604, a memory 606, and a communication bus 608.
The processor 602, the communication interface 604, and the memory 606 communicate with one another via the communication bus 608. The communication interface 604 is used for communicating with network elements of other devices, such as clients or other servers. The processor 602 is configured to execute the program 610, and may specifically perform the relevant steps in the foregoing embodiments of the adaptation method for the SQL execution engine in the PaaS platform.
In particular, the program 610 may include program code comprising computer operating instructions.
The processor 602 may be a central processing unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention. The computing device includes one or more processors, which may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.
The memory 606 is used for storing the program 610. The memory 606 may comprise high-speed RAM, and may also include non-volatile memory, such as at least one disk memory.
The program 610 may specifically be configured to cause the processor 602 to perform the following operations:
receiving an SQL statement input by a user;
analyzing the SQL statement to obtain SQL metadata information;
extracting key information in the SQL metadata information from the Hadoop platform to form table metadata;
inputting the SQL metadata information and the table metadata into an SQL selector for processing to obtain an SQL execution engine selection result;
and providing the SQL statement to the SQL execution engine corresponding to the SQL execution engine selection result for execution.
In an alternative, the program 610 causes the processor 602 to:
monitoring the execution process of the SQL execution engine corresponding to the SQL execution engine selection result to obtain an operation performance index;
and storing the operation performance index, the SQL metadata information and the table metadata into a historical index library.
In an alternative approach, the SQL selector comprises at least one decision unit with a prediction model; the program 610 causes the processor 602 to perform the following operations:
integrating the SQL metadata information and the table metadata into a decision request, and providing the decision request to the at least one decision unit so that the at least one decision unit makes a decision and outputs an SQL execution engine decision result;
and selecting the SQL execution engine selection result from the at least one SQL execution engine decision result by voting.
In an alternative, the program 610 causes the processor 602 to: train the prediction model of the SQL selector in a self-learning manner according to the data in the historical index library, or train the prediction model of the decision unit contained in the SQL selector according to the data in the historical index library.
In an alternative approach, the SQL metadata information comprises at least one of the following information: table, field type, field length, result field, sorting information, association information, operator type and number of temporary table insertions;
the key information in the SQL metadata information includes at least one of the following information: table information in the SQL metadata information, field information in the SQL metadata information, storage information in the SQL metadata information.
The algorithms or displays presented herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general purpose systems may also be used with the teachings herein. The required structure for constructing such a system will be apparent from the description above. In addition, embodiments of the present invention are not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the embodiments of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments herein include some features included in other embodiments, rather than other features, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functionality of some or all of the components according to embodiments of the present invention. The present invention may also be embodied as apparatus or device programs (e.g., computer programs and computer program products) for performing a portion or all of the methods described herein. Such programs implementing the present invention may be stored on computer-readable media or may be in the form of one or more signals. Such a signal may be downloaded from an internet website or provided on a carrier signal or in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any ordering. These words may be interpreted as names. The steps in the above embodiments should not be construed as limiting the order of execution unless specified otherwise.

Claims (10)

1. An adaptation method of an SQL execution engine in a PaaS platform comprises the following steps:
receiving an SQL statement input by a user;
analyzing the SQL statement to obtain SQL metadata information;
extracting key information in the SQL metadata information from a Hadoop platform to form table metadata;
inputting the SQL metadata information and the table metadata into an SQL selector for processing to obtain an SQL execution engine selection result;
and providing the SQL statement to the SQL execution engine corresponding to the SQL execution engine selection result for execution.
2. The method of claim 1, wherein the SQL selector comprises at least one decision unit with a predictive model;
inputting the SQL metadata information and the table metadata into an SQL selector for processing, and obtaining an SQL execution engine selection result specifically includes:
integrating the SQL metadata information and the table metadata into a decision request, and providing the decision request to the at least one decision unit so that the at least one decision unit can make a decision and output an SQL execution engine decision result;
and selecting an SQL execution engine selection result from the at least one SQL execution engine decision result according to a voting mode.
3. The method of claim 1, wherein the method further comprises:
monitoring the execution process of the SQL execution engine corresponding to the SQL execution engine selection result to obtain an operation performance index;
and storing the operation performance index, the SQL metadata information and the table metadata into a historical index library.
4. The method according to any one of claims 1-3, wherein the prediction model of the SQL selector is trained in a self-learning manner according to data in a historical index library, or the prediction model of at least one decision unit included in the SQL selector is trained according to data in a historical index library.
5. The method of claim 1, wherein the SQL metadata information comprises at least one of: table, field type, field length, result field, sorting information, association information, operator type and number of temporary table insertions;
the key information in the SQL metadata information comprises at least one of the following information: table information in the SQL metadata information, field information in the SQL metadata information, storage information in the SQL metadata information.
6. An adaptation device of an SQL execution engine in a PaaS platform, comprising:
an SQL receiver, used for receiving an SQL statement input by a user;
an SQL parser, used for parsing the SQL statement to obtain SQL metadata information;
a table metadata acquirer, used for extracting key information in the SQL metadata information from a Hadoop platform to form table metadata;
an SQL selector, used for receiving and processing the SQL metadata information and the table metadata and outputting an SQL execution engine selection result;
and an SQL executor, used for providing the SQL statement to the SQL execution engine corresponding to the SQL execution engine selection result for execution.
7. The apparatus of claim 6, wherein the SQL selector comprises a data receiving unit, at least one decision unit with a prediction model, and a decision outputter;
the data receiving unit is used for receiving the SQL metadata information and the table metadata, integrating them into a decision request, and providing the decision request to the at least one decision unit so that the at least one decision unit can make a decision and output an SQL execution engine decision result;
and the decision outputter is used for selecting an SQL execution engine selection result from the at least one SQL execution engine decision result according to a voting mode.
8. The apparatus of claim 6, wherein the apparatus further comprises:
a detection module, used for monitoring the execution process of the SQL execution engine corresponding to the SQL execution engine selection result to obtain an operation performance index;
and a storage module, used for storing the operation performance index, the SQL metadata information and the table metadata into a historical index library.
9. A computing device, comprising: a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with one another through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction causes the processor to execute the operation corresponding to the adaptation method of the SQL execution engine in the PaaS platform according to any one of claims 1-5.
10. A computer storage medium having at least one executable instruction stored therein, the executable instruction causing a processor to perform operations corresponding to the method for adapting an SQL execution engine in a PaaS platform according to any of claims 1 to 5.
CN201911413753.4A 2019-12-31 2019-12-31 Method and device for adapting SQL execution engine in PaaS platform Active CN113127509B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911413753.4A CN113127509B (en) 2019-12-31 2019-12-31 Method and device for adapting SQL execution engine in PaaS platform

Publications (2)

Publication Number Publication Date
CN113127509A true CN113127509A (en) 2021-07-16
CN113127509B CN113127509B (en) 2023-08-15

Family

ID=76770385

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911413753.4A Active CN113127509B (en) 2019-12-31 2019-12-31 Method and device for adapting SQL execution engine in PaaS platform

Country Status (1)

Country Link
CN (1) CN113127509B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090240711A1 (en) * 2008-03-20 2009-09-24 Dbsophic Ltd. Method and apparatus for enhancing performance of database and environment thereof
US20170139956A1 (en) * 2015-11-18 2017-05-18 Linkedin Corporation Dynamic data-ingestion pipeline
US20180060341A1 (en) * 2016-09-01 2018-03-01 Paypal, Inc. Querying Data Records Stored On A Distributed File System
CN107818112A (en) * 2016-09-13 2018-03-20 腾讯科技(深圳)有限公司 A kind of big data analysis operating system and task submit method
CN108549683A (en) * 2018-04-03 2018-09-18 联想(北京)有限公司 data query method and system
CN108763573A (en) * 2018-06-06 2018-11-06 众安信息技术服务有限公司 A kind of OLAP engines method for routing and system based on machine learning
CN109670653A (en) * 2018-12-29 2019-04-23 北京航天数据股份有限公司 A kind of method and device predicted based on industrial model predictive engine
US20190325292A1 (en) * 2019-06-28 2019-10-24 Intel Corporation Methods, apparatus, systems and articles of manufacture for providing query selection systems
CN110427992A (en) * 2019-07-23 2019-11-08 杭州城市大数据运营有限公司 Data matching method, device, computer equipment and storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
XINGCHENG HUA等: "Hadoop configuration tuning with ensemble modeling and metaheuristic optimization", 《IEEE ACCESS》, vol. 6, pages 44161 - 44174, XP011689398, DOI: 10.1109/ACCESS.2018.2857852 *
XUANXUFENG: "Hadoop生态***流行数据格式和存储引擎性能测试比较", pages 1, Retrieved from the Internet <URL:《https://www.aboutyun.com/thread-21449-1-1.html》> *
顾荣: "大数据处理技术与***研究", 《中国博士学位论文全文数据库信息科技辑》, no. 3, pages 138 - 17 *

Also Published As

Publication number Publication date
CN113127509B (en) 2023-08-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant