CA3148489A1 - Method of and device for assessing data query time consumption, computer equipment and storage medium

Method of and device for assessing data query time consumption, computer equipment and storage medium

Info

Publication number
CA3148489A1
Authority
CA
Canada
Prior art keywords
data
table information
query
obtaining
total table
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CA3148489A
Other languages
French (fr)
Other versions
CA3148489C (en)
Inventor
Fuping Wang
Xiaoqing ZHAI
Sheng Yang
Naishuai CHEN
Qian Sun
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
10353744 Canada Ltd
Original Assignee
10353744 Canada Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 10353744 Canada Ltd
Publication of CA3148489A1
Application granted
Publication of CA3148489C
Legal status: Active (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/906 Clustering; Classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present application relates to a method and apparatus for evaluating data query time consumption, and a computer device and a storage medium. The method comprises: receiving a data query request, and parsing the data query request into an execution plan; performing data feature conversion on the execution plan to obtain a first queried data feature; acquiring first full-table information, and performing data feature conversion on the first full-table information to obtain a first full-table information data feature, wherein the first full-table information is full-table information obtained by means of query in the current database; and according to the first queried data feature, the first full-table information data feature, and a preset data query time consumption evaluation model, obtaining a data query time consumption evaluation result. By using the method, the accuracy of data query time consumption evaluation can be improved.

Description

METHOD OF AND DEVICE FOR ASSESSING DATA QUERY TIME CONSUMPTION, COMPUTER EQUIPMENT AND STORAGE MEDIUM
BACKGROUND OF THE INVENTION
Technical Field
[0001] The present application relates to the field of data processing technology, and more particularly to a method of and a device for assessing data query time consumption, computer equipment and a storage medium.
Description of Related Art
[0002] With the development of the Internet and related technologies, large-scale database technology is widely applied in various fields, and the operating efficiency of data queries determines the response speed of service requests, so it is necessary to assess data query time consumption. Current big-data OLAP (Online Analytical Processing) query engines invariably employ preset rules to assess time consumption, and this method is relatively low in precision because it is restricted by the preset assessing rules.
SUMMARY OF THE INVENTION
[0003] In view of the aforementioned technical problem, there is an urgent need to provide a method, a device, computer equipment and a storage medium capable of enhancing precision in assessing data query time consumption.
[0004] There is provided a method of assessing data query time consumption, and the method comprises:

Date Recue/Date Received 2022-01-24
[0005] receiving a data query request, and parsing the data query request into an execution plan;
[0006] performing data feature transformation on the execution plan, and obtaining a first query data feature;
[0007] obtaining first total table information, performing data feature transformation on the first total table information, and obtaining a first total table information data feature, wherein the first total table information is total table information queried from a current database; and
[0008] obtaining a data query time consumption assessing result according to the first query data feature, the first total table information data feature and a preset data query time consumption assessing model.
[0009] In one of the embodiments, the data query time consumption assessing result can include time consumption assessing results of a plurality of execution plans, and the time consumption assessing results of the plurality of execution plans can be utilized to select an optimal execution plan.
[0010] In one of the embodiments, the method further comprises:
[0011] obtaining, from a database log, a data query history record and the data query time consumption to which the data query history record corresponds;
[0012] performing a time discretization process on the data query time consumption, and obtaining classification label data;
[0013] performing data feature transformation on the data query history record, and obtaining a second query data feature;
[0014] obtaining second total table information, performing data feature transformation on the second total table information, and obtaining a second total table information data feature, wherein the second total table information is total table information to which each data query history record corresponds; and
[0015] employing the classification label data, the second query data feature, and the second total table information data feature to perform model training, and obtaining the data query time consumption assessing model.
[0016] In one of the embodiments, the step of performing data feature transformation on the data query history record, and obtaining a second query data feature includes:
[0017] employing a preset coding mode to transform the data query history record to a data feature of a preset format, and obtaining the second query data feature, wherein the second query data feature includes a table information data feature, a filter field data feature, an analysis field data feature, and a time partition data feature.
[0018] In one of the embodiments, the method further comprises:
[0019] calculating an average value and a standard deviation of the second total table information data, and removing from the second total table information data any data whose difference from the average value is greater than a set multiple of the standard deviation.
[0020] In one of the embodiments, the method further comprises:
[0021] after completion of the current data query, inputting the current data query time consumption, the first query data feature, and the first total table information data feature into the data query time consumption assessing model to perform training, and obtaining a new data query time consumption assessing model.
[0022] There is provided a device for assessing data query time consumption, and the device comprises:
[0023] a data collecting module, for receiving a data query request, and parsing the data query request into an execution plan;
[0024] a first data processing module, for performing data feature transformation on the execution plan, and obtaining a first query data feature;
[0025] a second data processing module, for obtaining first total table information, performing data feature transformation on the first total table information, and obtaining a first total table information data feature, wherein the first total table information is total table information queried from a current database; and
[0026] an assessing module, for obtaining a data query time consumption assessing result according to the first query data feature, the first total table information data feature and a preset data query time consumption assessing model.
[0027] There is provided computer equipment that comprises a memory, a processor and a computer program stored on the memory and operable on the processor, and the following steps are realized when the processor executes the computer program:
[0028] receiving a data query request, and parsing the data query request into an execution plan;
[0029] performing data feature transformation on the execution plan, and obtaining a first query data feature;
[0030] obtaining first total table information, performing data feature transformation on the first total table information, and obtaining a first total table information data feature, wherein the first total table information is total table information queried from a current database; and
[0031] obtaining a data query time consumption assessing result according to the first query data feature, the first total table information data feature and a preset data query time consumption assessing model.
[0032] There is provided a computer-readable storage medium storing a computer program thereon, and the following steps are realized when the computer program is executed by a processor:
[0033] receiving a data query request, and parsing the data query request into an execution plan;
[0034] performing data feature transformation on the execution plan, and obtaining a first query data feature;
[0035] obtaining first total table information, performing data feature transformation on the first total table information, and obtaining a first total table information data feature, wherein the first total table information is total table information queried from a current database; and
[0036] obtaining a data query time consumption assessing result according to the first query data feature, the first total table information data feature and a preset data query time consumption assessing model.
[0037] In the aforementioned method of and device for assessing data query time consumption, computer equipment and storage medium, data query time consumption is estimated through the data query time consumption assessing model. Since the data volume of tables and the quantities of fields involved in a data query considerably affect data query performance, the present application introduces first total table information, which contains the data volume of tables and the quantities of fields, into the assessing of data query time consumption. The data query time consumption assessing result is obtained according to a first total table information data feature obtained through data feature transformation of the first total table information, a first query data feature and a preset data query time consumption assessing model, so that the precision in assessing data query time consumption is enhanced.
BRIEF DESCRIPTION OF THE DRAWINGS
[0038] Fig. 1 is a view illustrating the application environment of the method of assessing data query time consumption in an embodiment;
[0039] Fig. 2 is a flowchart schematically illustrating the method of assessing data query time consumption in an embodiment;
[0040] Fig. 3 is a flowchart schematically illustrating the steps of training the data query time consumption assessing model in an embodiment;
[0041] Fig. 4 is a block diagram illustrating the structure of the device for assessing data query time consumption in an embodiment; and
[0042] Fig. 5 is a view illustrating the internal structure of the computer equipment in an embodiment.
DETAILED DESCRIPTION OF THE INVENTION
[0043] To make the objectives, technical solutions and advantages of the present application clearer, the present application is described in greater detail below in conjunction with accompanying drawings and embodiments. As should be understood, the specific embodiments described in this context are merely meant to explain the present application, rather than to restrict the present application.
[0044] The method of assessing data query time consumption provided by the present application is applicable to the application environment shown in Fig. 1, in which terminal 102 communicates with server 104 via a network. Terminal 102 sends a data query request to server 104; server 104 receives the data query request, performs data processing, employs a data query time consumption assessing model to assess the time consumption of the current data query request, and finally selects a preferred execution plan according to the assessing result. Terminal 102 can be, but is not limited to, any of a personal computer, a notebook computer and a smart mobile phone, and server 104 can be embodied as an independent server or a server cluster consisting of a plurality of servers.
[0045] In one embodiment, as shown in Fig. 2, there is provided a method of assessing data query time consumption. The method is explained using the example of its application to the server in Fig. 1, and comprises the following steps.
[0046] Step 201 - receiving a data query request, and parsing the data query request into an execution plan.
[0047] The execution plan in Step 201 is generated after the database has parsed the query request, and the execution plan can include datasheet information, filter condition information, grouping statistics condition information, etc. involved in the query request.
[0048] Specifically, when it is required to perform a data query, the terminal can send a corresponding query request to the server. The data query request contains attribute information of the data to be queried and information on the datasheets to be queried. After receiving the query request, the server parses it to generate an execution plan for the query request.
[0049] Step 202 - performing data feature extraction and transformation on the execution plan, and obtaining a first query data feature.
[0050] The first query data feature in Step 202 is the data feature that the server inputs into the data query time consumption assessing model to assess time consumption.
[0051] The server first performs data feature extraction on the execution plan. The extracted data features include datasheet-related information involved in the data query request, namely the number of relevant datasheets, the filter conditions and the statistical analysis method used by the query. The server then performs feature transformation on these data features to obtain the first query data feature usable to assess the current data query time consumption.
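By way of non-limiting illustration, the extraction in Step 202 can be sketched as follows. The dictionary layout of the execution plan and the names `extract_plan_features`, `tables`, `filters`, `group_by` and `aggregate` are assumptions made for this sketch only; the present application does not prescribe a concrete plan format.

```python
def extract_plan_features(plan):
    """Pull simple query features out of a (hypothetical) execution-plan dict."""
    tables = plan.get("tables", [])
    filters = plan.get("filters", [])
    group_by = plan.get("group_by", [])
    return {
        "table_count": len(tables),        # number of datasheets involved
        "filter_count": len(filters),      # number of filter conditions
        "group_by_count": len(group_by),   # number of grouping fields
        "aggregate": plan.get("aggregate", "none"),  # statistical analysis method
    }


# Hypothetical plan for a query that sums paid orders per customer region.
plan = {
    "tables": ["orders", "customers"],
    "filters": ["orders.status = 'paid'"],
    "group_by": ["customers.region"],
    "aggregate": "sum",
}
features = extract_plan_features(plan)
```

The categorical entry (`aggregate`) would still need to be encoded numerically before it can be input into a model.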
[0052] Step 203 - obtaining first total table information, performing data feature transformation on the first total table information, and obtaining a first total table information data feature, wherein the first total table information is total table information queried from a current database.
[0053] The first total table information in Step 203 includes globally related information data of database tables and datasheet fields.
[0054] The server queries the current database, obtains globally related information data of the database tables and datasheet fields involved in the current query (for instance, the data volume in the current database of every datasheet involved in the current query), and performs feature transformation on the globally related information data to obtain the first total table information data feature usable to assess time consumption.
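By way of non-limiting illustration, Step 203 can be sketched as a lookup of per-table statistics followed by a simple aggregation. The statistics tables `TABLE_ROWS` and `FIELD_COUNTS`, the function name `total_table_features` and the chosen aggregates are all assumptions for this sketch; an actual server would obtain these values by querying the current database.

```python
# Hypothetical catalog statistics; a real server would query these from the database.
TABLE_ROWS = {"orders": 1_000_000, "customers": 50_000, "products": 2_000}
FIELD_COUNTS = {"orders": 12, "customers": 8, "products": 6}


def total_table_features(tables, table_rows=TABLE_ROWS, field_counts=FIELD_COUNTS):
    """Summarize data volume and field quantities for the tables a query touches."""
    rows = [table_rows[name] for name in tables]
    fields = [field_counts[name] for name in tables]
    return {
        "total_rows": sum(rows),
        "max_rows": max(rows, default=0),  # 0 when the query touches no table
        "total_fields": sum(fields),
    }


table_features = total_table_features(["orders", "customers"])
```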
[0055] Step 204 - obtaining a data query time consumption assessing result according to the first query data feature, the first total table information data feature and a preset data query time consumption assessing model.
[0056] The data query time consumption assessing model in Step 204 is a machine learning model with dynamic learning capabilities that is constructed and employed by the present application, and it performs the time consumption assessing function for the data query request.
[0057] The first query data feature and the first total table information data feature obtained in Steps 202 and 203 are input into the preset data query time consumption assessing model to obtain time consumption data of the current data query.
[0058] In the aforementioned method of assessing data query time consumption, the first total table information, which considerably affects data query performance, is introduced into the assessing. Since the first total table information contains data reflecting the global database table quantities and field quantities, the precision in assessing data query time consumption is enhanced.
[0059] In one of the embodiments, the data query time consumption assessing result can include time consumption assessing results of a plurality of execution plans, and the time consumption assessing results of the plurality of execution plans can be utilized to select an optimal execution plan.
[0060] According to the assessing results, the server selects the execution plan with the least time consumption as the optimal execution plan to execute the data query task, whereby server system resources are saved and system efficiency is enhanced.
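By way of non-limiting illustration, selecting the plan with the least assessed time consumption reduces to a minimum search over the assessing results. The plan identifiers and the name `select_fastest_plan` are assumptions for this sketch, and the values stand for whatever ordered result the assessing model returns (for example, a time-segment label).

```python
def select_fastest_plan(estimates):
    """Return the plan id whose assessed time consumption is smallest."""
    if not estimates:
        raise ValueError("no execution plans to choose from")
    return min(estimates, key=estimates.get)


# Hypothetical assessing results: plan id -> assessed time-segment label.
estimates = {"plan_a": 3, "plan_b": 1, "plan_c": 2}
best = select_fastest_plan(estimates)
```

When two plans share the smallest value, `min` keeps the first one encountered, so a tie-breaking rule would have to be added if the order of candidates matters.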
[0061] In another embodiment, as shown in Fig. 3, the method further comprises training the data query time consumption assessing model, and the training includes the following steps.
[0062] Step 301 - obtaining, from a database log, a data query history record and the data query time consumption to which the data query history record corresponds.
[0063] Specifically, after the completion of each data query, the data query request and its time consumption information are stored in the database log, and the server can obtain a plurality of historical data query requests and their time consumption information from the database log.
[0064] Step 302 - performing a time discretization process on the data query time consumption, and obtaining classification label data.
[0065] The discretization process in Step 302 divides the time consumption values of historical data query requests into corresponding time segments according to rules. Each time segment corresponds to one time consumption label, and these time consumption labels together serve as the classification label data of the model.
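By way of non-limiting illustration, the discretization of Step 302 can be sketched as a lookup against sorted segment boundaries. The boundary values in `BOUNDARIES` and the name `time_to_label` are assumptions for this sketch; the present application does not state the actual segmentation rules.

```python
from bisect import bisect_right

# Hypothetical segment boundaries in seconds: [0, 1), [1, 5), [5, 30), [30, 120), [120, inf).
BOUNDARIES = [1.0, 5.0, 30.0, 120.0]


def time_to_label(seconds, boundaries=BOUNDARIES):
    """Map a measured time consumption to the index of its time segment."""
    return bisect_right(boundaries, seconds)


labels = [time_to_label(t) for t in [0.4, 1.0, 12.5, 300.0]]
```

Here a value equal to a boundary falls into the segment that starts at that boundary, so `1.0` receives label 1 rather than label 0.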
[0066] Step 303 - performing data feature transformation on the data query history record, and obtaining a second query data feature.
[0067] Specifically, the data query history record includes table information fields, filter fields, analysis fields, and time partition fields. The table information fields are the plural datasheets involved in the data query request, the filter fields are the filter condition information involved in the query, the analysis fields are the fields by which the query groups data so that statistical analysis can be performed by group, and the time partition fields indicate the time segments to be queried by the data query request.
[0068] The server performs data feature transformation respectively on the aforementioned table information field data, filter field data, analysis field data, and time partition field data to obtain the second query data feature. The second query data feature includes a table information data feature, a filter field data feature, an analysis field data feature, and a time partition data feature.
[0069] Step 304 - obtaining second total table information, performing data feature transformation on the second total table information, and obtaining a second total table information data feature, wherein the second total table information is total table information to which each data query history record corresponds.
[0070] The second total table information in Step 304 includes globally related information data of database table data volumes and analysis field bases, which are respectively extracted from various data query history records. The server performs data feature transformation on the database table data volumes and analysis field bases to obtain the second total table information data feature, and the second total table information data feature includes a table data volume data feature and an analysis field base data feature.
[0071] Step 305 - employing the classification label data, the second query data feature, and the second total table information data feature to perform model training, and obtaining the data query time consumption assessing model.
[0072] Specifically, the server takes the classification label data, the second query data feature, and the second total table information data feature as samples to be input into the machine learning model, and employs an XGBoost (eXtreme Gradient Boosting) classification algorithm for training to obtain the data query time consumption assessing model. The XGBoost classification algorithm makes it possible to integrate a plurality of weak classifiers together to form a strong classifier, while the tree model used is a CART (Classification And Regression Tree) model.
[0073] In this embodiment, since the training data of the data query time consumption assessing model originates from the data query history record and the database global table information, the data features conform to the requirements of the data query time consumption assessing model, so the precision of the model is guaranteed; in addition, the model training data includes database table data volume data feature and analysis field base data feature that relatively greatly affect the data query performance, so the precision in assessing data query time consumption is also enhanced thereby.
[0074] In one of the embodiments, the method further comprises the following step:
[0075] employing a preset coding mode to transform the data query history record to a data feature of a preset format, and obtaining the second query data feature, wherein the second query data feature includes a table information data feature, a filter field data feature, an analysis field data feature, and a time partition data feature.
[0076] Specifically, a one-hot coding mode can be employed in this embodiment to transform table information fields, filter fields, analysis fields and time partition fields respectively to the second query data feature usable in model training.
[0077] One-hot coding is one-bit-valid coding: an N-bit status register is used to code N statuses, each status has its own independent register bit, and only one bit is valid at any given time.
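By way of non-limiting illustration, one-hot coding of a single categorical value can be sketched as follows. The vocabulary `AGGREGATES` and the function name `one_hot` are assumptions for this sketch; in practice the vocabulary would be the set of values observed for the field being coded.

```python
def one_hot(value, vocabulary):
    """Return a list with a single 1 at the position of `value` in `vocabulary`."""
    return [1 if item == value else 0 for item in vocabulary]


# Hypothetical vocabulary of statistical analysis methods.
AGGREGATES = ["count", "sum", "avg", "max"]
encoded = one_hot("sum", AGGREGATES)
```

In this sketch a value that is absent from the vocabulary yields an all-zero vector, so unseen values would need explicit handling in a real encoder.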
[0078] In one of the embodiments, the method further comprises the following step:
[0079] calculating a standard deviation and an average value of the second total table information data, and removing from the second total table information data any data whose difference from the average value is greater than a set multiple of the standard deviation.
[0080] Specifically, three times the standard deviation can be employed in this embodiment as a reference numerical value, so that data whose difference from the average value is greater than three times the standard deviation are removed from the second total table information data, and a precise table data volume data feature and analysis field base data feature are obtained for training the machine learning model.
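By way of non-limiting illustration, the removal of outlying values can be sketched with the mean and standard deviation of the data. The function name `remove_outliers`, the default multiple `k=3.0` as a parameter, and the use of the population (rather than sample) standard deviation are assumptions for this sketch.

```python
from statistics import mean, pstdev


def remove_outliers(values, k=3.0):
    """Keep values whose distance from the mean is at most k standard deviations."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    return [v for v in values if abs(v - mu) <= k * sigma]


# Hypothetical table data volumes: twenty ordinary values and one extreme value.
volumes = [100] * 20 + [10_000]
cleaned = remove_outliers(volumes)
```

With very few samples a single extreme value cannot exceed three standard deviations, because it inflates the standard deviation itself, so the rule is only meaningful on reasonably large samples.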
[0081] In one of the embodiments, the method further comprises the following step:
[0082] after completion of the current data query, inputting the current data query time consumption, the first query data feature, and the first total table information data feature into the data query time consumption assessing model to perform training, and obtaining a new data query time consumption assessing model.
[0083] Specifically, after completion of the current data query, the server records the data query and its time consumption information in the database log. The data query time consumption assessing model then uses the current data query time consumption, together with the first query data feature and the first total table information data feature to which the current data query corresponds, as a new sample to perform model training, so that the model is updated in real time.
[0084] In this embodiment, after completion of each data query, the data query time consumption assessing model learns the query record in real time and is updated. With the passing of time, learning samples become more and more abundant, and precision is continually enhanced.
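By way of non-limiting illustration, the update after each completed query can be sketched as appending the new sample and retraining. The class `OnlineAssessor` and the toy trainer `majority_label_trainer` are hypothetical stand-ins introduced for this sketch; they are not the assessing model of the present application, and retraining from scratch on every sample is only the simplest possible update strategy.

```python
from collections import Counter


class OnlineAssessor:
    """Keeps all observed samples and rebuilds the model after each new one."""

    def __init__(self, train):
        self._train = train   # callable: list of (features, label) -> model
        self._samples = []
        self.model = None

    def record(self, features, label):
        """Add the sample for a completed query and retrain the model."""
        self._samples.append((features, label))
        self.model = self._train(self._samples)


def majority_label_trainer(samples):
    """Toy trainer: the resulting model always predicts the most frequent label."""
    most_common = Counter(label for _, label in samples).most_common(1)[0][0]
    return lambda features: most_common


assessor = OnlineAssessor(majority_label_trainer)
assessor.record([1, 0], 2)
assessor.record([0, 1], 2)
assessor.record([1, 1], 0)
prediction = assessor.model([0, 0])
```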
[0085] As should be understood, although the various steps in the flowcharts of Figs. 2-3 are sequentially displayed as indicated by arrows, these steps are not necessarily executed in the sequences indicated by arrows. Unless otherwise explicitly noted herein, execution of these steps is not restricted to any sequence, as these steps can also be executed in sequences other than those indicated in the drawings. Moreover, at least some of the steps in Figs. 2-3 may include plural sub-steps or multiple phases. These sub-steps or phases are not necessarily completed at the same time, but can be executed at different times, and they are also not necessarily performed sequentially, but can be performed in turns or alternately with other steps or with at least some of the sub-steps or phases of other steps.
[0086] In one embodiment, as shown in Fig. 4, there is provided a device for assessing data query time consumption, and the device comprises a data collecting module 401, a first data processing module 402, a second data processing module 403, and an assessing module 404, of which:
[0087] the data collecting module 401 is employed for receiving a data query request, and parsing the data query request into an execution plan;
[0088] the first data processing module 402 is employed for performing data feature transformation on the execution plan, and obtaining a first query data feature;
[0089] the second data processing module 403 is employed for obtaining first total table information, performing data feature transformation on the first total table information, and obtaining a first total table information data feature, wherein the first total table information is total table information queried from a current database; and
[0090] the assessing module 404 is employed for obtaining a data query time consumption assessing result according to the first query data feature, the first total table information data feature and a preset data query time consumption assessing model.
[0091] In one of the embodiments, the data query time consumption assessing result can include time consumption assessing results of a plurality of execution plans, and the time consumption assessing results of the plurality of execution plans can be utilized to select an optimal execution plan.
[0092] In one of the embodiments, the data collecting module 401 can be further employed for obtaining, from a database log, a data query history record and the data query time consumption to which the data query history record corresponds;
[0093] the first data processing module 402 can be further employed for performing a time discretization process on the data query time consumption, and obtaining classification label data; performing data feature transformation on the data query history record, and obtaining a second query data feature;
[0094] the second data processing module 403 can be further employed for obtaining second total table information, performing data feature transformation on the second total table information, and obtaining a second total table information data feature, wherein the second total table information is total table information to which each data query history record corresponds; and
[0095] the assessing module 404 can be further employed for using the classification label data, the second query data feature, and the second total table information data feature to perform model training, and obtaining the data query time consumption assessing model.
[0096] In one of the embodiments, the first data processing module 402 can employ a preset coding mode to transform the data query history record to a data feature of a preset format, and obtain the second query data feature, wherein the second query data feature includes a table information data feature, a filter field data feature, an analysis field data feature, and a time partition data feature.
[0097] In one of the embodiments, the second data processing module 403 can calculate an average value and a standard deviation of the second total table information data, and remove from the second total table information data any data whose difference from the average value is greater than a set multiple of the standard deviation.
[0098] In one of the embodiments, after completion of the current data query, the assessing module 404 can further input the current data query time consumption, the first query data feature, and the first total table information data feature into the data query time consumption assessing model to perform training, and obtain a new data query time consumption assessing model.
[0099] Specific definitions relevant to the device for assessing data query time consumption may be inferred from the aforementioned definitions of the method of assessing data query time consumption, and are not repeated in this context. The various modules in the aforementioned device for assessing data query time consumption can be wholly or partly realized via software, hardware, or a combination of software with hardware. The various modules can be embedded in the form of hardware in a processor in computer equipment or be independent of any computer equipment, and can also be stored in the form of software in a memory in computer equipment, so that the processor can invoke them and perform the operations corresponding to the aforementioned various modules.
[0100] In one embodiment, a computer equipment is provided. The computer equipment can be a server, and its internal structure can be as shown in Fig. 5. The computer equipment comprises a processor, a memory, a network interface and a database connected to each other via a system bus. The processor of the computer equipment is employed to provide computing and controlling capabilities. The memory of the computer equipment includes a nonvolatile storage medium and an internal memory. The nonvolatile storage medium stores therein an operating system, a computer program and a database. The internal memory provides an environment for the running of the operating system and the computer program in the nonvolatile storage medium. The database of the computer equipment is employed to store relevant data involved in the method of assessing data query time consumption. The network interface of the computer equipment is employed to connect to an external terminal via network connection for communication. The computer program realizes a method of assessing data query time consumption when it is executed by the processor.
[0101] As understandable to persons skilled in the art, the structure illustrated in Fig. 5 is merely a block diagram of partial structure relevant to the solution of the present application, and does not constitute any restriction to the computer equipment on which the solution of the present application is applied, as the specific computer equipment may comprise component parts that are more than or less than those illustrated in Fig. 5, or may combine certain component parts, or may have different layout of component parts.
[0102] In one embodiment, there is provided a computer equipment that comprises a memory, a processor and a computer program stored on the memory and operable on the processor, and the following steps are realized when the processor executes the computer program:
[0103] receiving a data query request, and parsing the data query request into an execution plan;
[0104] performing data feature transformation on the execution plan, and obtaining a first query data feature;
[0105] obtaining first total table information, performing data feature transformation on the first total table information, and obtaining a first total table information data feature, wherein the first total table information is total table information enquired from a current database; and
[0106] obtaining a data query time consumption assessing result according to the first query data feature, the first total table information data feature and a preset data query time consumption assessing model.
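The four steps above can be sketched end to end as follows. Several things are assumed for the sake of the example: the execution plan has already been reduced to the list of tables it reads, "total table information" is taken to mean a row count and a byte size per table, the log scaling is an arbitrary choice, and the model is any callable that maps a feature vector to a time class. None of these details are stated in the application.

```python
import math
from typing import Callable, Dict, List, Sequence


def plan_features(plan_tables: Sequence[str], known_tables: Sequence[str]) -> List[float]:
    """First query data feature: one 0/1 flag per known table the plan reads."""
    touched = set(plan_tables)
    return [1.0 if t in touched else 0.0 for t in known_tables]


def table_info_features(
    plan_tables: Sequence[str], table_info: Dict[str, Dict[str, int]]
) -> List[float]:
    """First total table information data feature: log-scaled total rows and bytes."""
    rows = sum(table_info[t]["rows"] for t in plan_tables)
    size = sum(table_info[t]["bytes"] for t in plan_tables)
    return [math.log1p(rows), math.log1p(size)]


def assess(
    plan_tables: Sequence[str],
    known_tables: Sequence[str],
    table_info: Dict[str, Dict[str, int]],
    model: Callable[[List[float]], int],
) -> int:
    """Combine both feature groups and ask the preset model for a time class."""
    features = plan_features(plan_tables, known_tables) + table_info_features(
        plan_tables, table_info
    )
    return model(features)
```

Passing the model in as a callable keeps the sketch independent of how the preset assessing model was trained.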

[0107] In one embodiment, the data query time consumption assessing result can include time consumption assessing results of a plurality of execution plans, and the time consumption assessing results of the plurality of execution plans can be utilized to select an optimal execution plan.
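Selecting among several assessed execution plans, as described in [0107], might look like the sketch below. It assumes each plan's assessing result is an integer time class in which a lower value means a shorter expected run; the function name `select_plan` and the alphabetical tie-break are choices made for this example.

```python
from typing import Dict


def select_plan(assessed_classes: Dict[str, int]) -> str:
    """Return the plan id with the lowest assessed time class.

    Ties are broken by the smallest plan id so the choice is deterministic.
    """
    if not assessed_classes:
        raise ValueError("no execution plans to choose from")
    return min(sorted(assessed_classes), key=lambda plan_id: assessed_classes[plan_id])
```

Because the result is a class rather than an exact duration, two plans can tie; a real optimizer would likely fall back on another cost signal in that case.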
[0108] In one embodiment, the following steps are further realized when the processor executes the computer program:
[0109] obtaining, from a database log, a data query history record and the data query time consumption to which the data query history record corresponds; performing a time discretization process on the data query time consumption, and obtaining classification label data; performing data feature transformation on the data query history record, and obtaining a second query data feature; obtaining second total table information, performing data feature transformation on the second total table information, and obtaining a second total table information data feature, wherein the second total table information is total table information to which each data query history record corresponds; and employing the classification label data, the second query data feature, and the second total table information data feature to perform model training, and obtaining the data query time consumption assessing model.
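The time discretization step in [0109] turns a measured duration into a classification label, which is what makes the training problem a classification task. A minimal sketch follows; the bucket edges, the layout of a history entry, and the names `discretize` and `build_training_set` are all hypothetical, since the application does not give them.

```python
from bisect import bisect_right
from typing import Any, Dict, List, Sequence, Tuple

# Hypothetical bucket edges in seconds; the application does not specify them.
BUCKET_EDGES_SECONDS = [1.0, 10.0, 60.0, 600.0]


def discretize(seconds: float, edges: Sequence[float] = BUCKET_EDGES_SECONDS) -> int:
    """Map a duration to a class index: 0 for < 1 s, 1 for [1, 10) s, and so on."""
    return bisect_right(edges, seconds)


def build_training_set(
    history: Sequence[Dict[str, Any]]
) -> Tuple[List[List[float]], List[int]]:
    """Assemble feature rows and labels from already-encoded history entries.

    Each entry is assumed to carry "query_features", "table_features"
    and "elapsed_seconds".
    """
    rows: List[List[float]] = []
    labels: List[int] = []
    for entry in history:
        rows.append(list(entry["query_features"]) + list(entry["table_features"]))
        labels.append(discretize(entry["elapsed_seconds"]))
    return rows, labels
```

The resulting rows and labels can be handed to whichever classifier is used for model training.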
[0110] In one embodiment, when the processor executes the computer program to realize the step of performing data feature transformation on the data query history record, and obtaining a second query data feature, the following steps are further realized:
[0111] employing a preset coding mode to transform the data query history record to a data feature of a preset format, and obtaining the second query data feature, wherein the second query data feature includes a table information data feature, a filter field data feature, an analysis field data feature, and a time partition data feature.
[0112] In one embodiment, the following steps are further realized when the processor executes the computer program:
[0113] calculating an average value and a standard deviation of the second total table information data, and removing from the second total table information data any data whose difference from the average value is greater than a set multiple of the standard deviation.
[0114] In one embodiment, the following steps are further realized when the processor executes the computer program:
[0115] after completion of the current data query, inputting the current data query time consumption, the first query data feature, and the first total table information data feature into the data query time consumption assessing model to perform training, and obtaining a new data query time consumption assessing model.
[0116] In one embodiment, there is provided a computer-readable storage medium storing thereon a computer program, and the following steps are realized when the computer program is executed by a processor:
[0117] receiving a data query request, and parsing the data query request into an execution plan;
[0118] performing data feature transformation on the execution plan, and obtaining a first query data feature;
[0119] obtaining first total table information, performing data feature transformation on the first total table information, and obtaining a first total table information data feature, wherein the first total table information is total table information enquired from a current database; and
[0120] obtaining a data query time consumption assessing result according to the first query data feature, the first total table information data feature and a preset data query time consumption assessing model.
[0121] In one embodiment, the data query time consumption assessing result can include time consumption assessing results of a plurality of execution plans, and the time consumption assessing results of the plurality of execution plans can be utilized to select an optimal execution plan.
[0122] In one embodiment, the following steps are further realized when the computer program is executed by a processor:
[0123] obtaining, from a database log, a data query history record and the data query time consumption to which the data query history record corresponds; performing a time discretization process on the data query time consumption, and obtaining classification label data; performing data feature transformation on the data query history record, and obtaining a second query data feature; obtaining second total table information, performing data feature transformation on the second total table information, and obtaining a second total table information data feature, wherein the second total table information is total table information to which each data query history record corresponds; and employing the classification label data, the second query data feature, and the second total table information data feature to perform model training, and obtaining the data query time consumption assessing model.
[0124] In one embodiment, when the computer program is executed by a processor to realize the step of performing data feature transformation on the data query history record, and obtaining a second query data feature, the following steps are further realized:
[0125] employing a preset coding mode to transform the data query history record to a data feature of a preset format, and obtaining the second query data feature, wherein the second query data feature includes a table information data feature, a filter field data feature, an analysis field data feature, and a time partition data feature.
[0126] In one embodiment, the following steps are further realized when the computer program is executed by a processor:
[0127] calculating an average value and a standard deviation of the second total table information data, and removing from the second total table information data any data whose difference from the average value is greater than a set multiple of the standard deviation.
[0128] In one embodiment, the following steps are further realized when the computer program is executed by a processor:
[0129] after completion of the current data query, inputting the current data query time consumption, the first query data feature, and the first total table information data feature into the data query time consumption assessing model to perform training, and obtaining a new data query time consumption assessing model.
[0130] As comprehensible to persons ordinarily skilled in the art, all or part of the flows in the methods according to the aforementioned embodiments can be completed via a computer program instructing relevant hardware. The computer program can be stored in a nonvolatile computer-readable storage medium, and, when executed, can include the flows as embodied in the aforementioned various methods. Any reference to the memory, storage, database or other media used in the various embodiments provided by the present application can include nonvolatile and/or volatile memory. The nonvolatile memory can include a read-only memory (ROM), a programmable ROM (PROM), an electrically programmable ROM (EPROM), an electrically erasable and programmable ROM (EEPROM) or a flash memory. The volatile memory can include a random-access memory (RAM) or an external cache memory. By way of explanation rather than restriction, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link (Synchlink) DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), etc.
[0131] The technical features of the aforementioned embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of these technical features are described; however, as long as such combinations are not mutually contradictory, they should all be considered to fall within the scope recorded in the Description.
[0132] The foregoing embodiments are merely directed to several modes of execution of the present application, and their descriptions are relatively specific and detailed, but they should not therefore be construed as restrictions on the patent scope of the invention. It should be pointed out that persons with ordinary skill in the art may further make various modifications and improvements without departing from the conception of the present application, and all these should pertain to the protection scope of the present application. Accordingly, the patent protection scope of the present application shall be based on the attached Claims.


Claims (10)

What is claimed is:
1. A method of assessing data query time consumption, characterized in comprising:
receiving a data query request, and parsing the data query request into an execution plan;
performing data feature transformation on the execution plan, and obtaining a first query data feature;
obtaining first total table information, performing data feature transformation on the first total table information, and obtaining a first total table information data feature, wherein the first total table information is total table information enquired from a current database; and
obtaining a data query time consumption assessing result according to the first query data feature, the first total table information data feature and a preset data query time consumption assessing model.
2. The method according to Claim 1, characterized in that the data query time consumption assessing result includes time consumption assessing results of a plurality of execution plans, and the time consumption assessing results of the plurality of execution plans are utilized to select an optimal execution plan.
3. The method according to Claim 1, characterized in further comprising:
obtaining, from a database log, a data query history record and the data query time consumption to which the data query history record corresponds;
performing a time discretization process on the data query time consumption, and obtaining classification label data;
performing data feature transformation on the data query history record, and obtaining a second query data feature;
obtaining second total table information, performing data feature transformation on the second total table information, and obtaining a second total table information data feature, wherein the second total table information is total table information to which each data query history record corresponds; and
employing the classification label data, the second query data feature, and the second total table information data feature to perform model training, and obtaining the data query time consumption assessing model.
4. The method according to Claim 3, characterized in that the step of performing data feature transformation on the data query history record, and obtaining a second query data feature includes:
employing a preset coding mode to transform the data query history record to a data feature of a preset format, and obtaining the second query data feature, wherein the second query data feature includes a table information data feature, a filter field data feature, an analysis field data feature, and a time partition data feature.
5. The method according to Claim 3, characterized in further comprising:
calculating an average value and a standard deviation of the second total table information data, and removing from the second total table information data any data whose difference from the average value is greater than a set multiple of the standard deviation.
6. The method according to Claim 3, characterized in further comprising:
after completion of current data query, inputting current data query time consumption, the first query data feature, and the first total table information data feature in the data query assessing model to perform model training, and obtaining a new data query time consumption assessing model.
7. A device for assessing data query time consumption, characterized in comprising:
a data collecting module, for receiving a data query request, and parsing the data query request into an execution plan;
a first data processing module, for performing data feature transformation on the execution plan, and obtaining a first query data feature;
a second data processing module, for obtaining first total table information, performing data feature transformation on the first total table information, and obtaining a first total table information data feature, wherein the first total table information is total table information enquired from a current database; and
an assessing module, for obtaining a data query time consumption assessing result according to the first query data feature, the first total table information data feature and a preset data query time consumption assessing model.
8. The device according to Claim 7, characterized in that:
the data collecting module is further employed for obtaining a data query history record from a database log, and obtaining the data query history record and data query time consumption to which the data query history record corresponds;
the first data processing module is further employed for performing a time discretization process on the data query time consumption, obtaining classification label data, performing data feature transformation on the data query history record, and obtaining a second query data feature;
the second data processing module is further employed for obtaining second total table information, performing data feature transformation on the second total table information, and obtaining a second total table information data feature, wherein the second total table information is total table information to which each data query history record corresponds; and
the assessing module is further employed for employing the classification label data, the second query data feature, and the second total table information data feature to perform model training, and obtaining the data query time consumption assessing model.
9. A computer equipment, comprising a memory, a processor and a computer program stored on the memory and operable on the processor, characterized in that the method steps according to any of Claims 1 to 6 are realized when the processor executes the computer program.
10. A computer-readable storage medium, storing a computer program thereon, characterized in that the method steps according to any of Claims 1 to 6 are realized when the computer program is executed by a processor.
CA3148489A 2019-07-23 2020-06-24 Method of and device for assessing data query time consumption, computer equipment and storage medium Active CA3148489C (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN201910666596.1A CN110516123A (en) 2019-07-23 2019-07-23 Data query time-consuming appraisal procedure, device, computer equipment and storage medium
CN201910666596.1 2019-07-23
PCT/CN2020/097850 WO2021012861A1 (en) 2019-07-23 2020-06-24 Method and apparatus for evaluating data query time consumption, and computer device and storage medium

Publications (2)

Publication Number Publication Date
CA3148489A1 (en) 2021-01-28
CA3148489C CA3148489C (en) 2024-01-02

Family

ID=68623422

Family Applications (1)

Application Number Title Priority Date Filing Date
CA3148489A Active CA3148489C (en) 2019-07-23 2020-06-24 Method of and device for assessing data query time consumption, computer equipment and storage medium

Country Status (3)

Country Link
CN (1) CN110516123A (en)
CA (1) CA3148489C (en)
WO (1) WO2021012861A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110516123A (en) * 2019-07-23 2019-11-29 苏宁云计算有限公司 Data query time-consuming appraisal procedure, device, computer equipment and storage medium
CN112749191A (en) * 2021-01-19 2021-05-04 成都信息工程大学 Intelligent cost estimation method and system applied to database and electronic equipment
CN113505276A (en) * 2021-06-21 2021-10-15 跬云(上海)信息科技有限公司 Scoring method, device, equipment and storage medium of pre-calculation model

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7895192B2 (en) * 2007-07-19 2011-02-22 Hewlett-Packard Development Company, L.P. Estimating the loaded execution runtime of a database query
US20130151504A1 (en) * 2011-12-09 2013-06-13 Microsoft Corporation Query progress estimation
US9449249B2 (en) * 2012-01-31 2016-09-20 Nokia Corporation Method and apparatus for enhancing visual search
CN107133332B (en) * 2017-05-11 2020-10-16 广州视源电子科技股份有限公司 Query task allocation method and device
US11010362B2 (en) * 2017-08-25 2021-05-18 Vmware, Inc. Method and system for caching a generated query plan for time series data
CN109241101B (en) * 2018-08-31 2020-06-30 阿里巴巴集团控股有限公司 Database query optimization method and device and computer equipment
CN109635100A (en) * 2018-12-24 2019-04-16 上海仁静信息技术有限公司 A kind of recommended method, device, electronic equipment and the storage medium of similar topic
CN110516123A (en) * 2019-07-23 2019-11-29 苏宁云计算有限公司 Data query time-consuming appraisal procedure, device, computer equipment and storage medium

Also Published As

Publication number Publication date
CA3148489C (en) 2024-01-02
CN110516123A (en) 2019-11-29
WO2021012861A1 (en) 2021-01-28

Similar Documents

Publication Publication Date Title
CA3148489C (en) Method of and device for assessing data query time consumption, computer equipment and storage medium
JP5298117B2 (en) Data merging in distributed computing
US10642832B1 (en) Reducing the domain of a subquery by retrieving constraints from the outer query
CN109992601B (en) To-do information pushing method and device and computer equipment
CN110659282B (en) Data route construction method, device, computer equipment and storage medium
US20070143246A1 (en) Method and apparatus for analyzing the effect of different execution parameters on the performance of a database query
CN106293891B (en) Multidimensional investment index monitoring method
CN109783457B (en) CGI interface management method, device, computer equipment and storage medium
WO2019179408A1 (en) Construction of machine learning model
CN109656947B (en) Data query method and device, computer equipment and storage medium
CN104699788A (en) Database query method and device
WO2022205938A1 (en) Data acquisition method and apparatus, computer device, and storage medium
CN107220283B (en) Data processing method, device, storage medium and electronic equipment
CN116483831B (en) Recommendation index generation method for distributed database
CN109408532B (en) Data acquisition method, device, computer equipment and storage medium
US20190340540A1 (en) Adaptive continuous log model learning
CA3184895A1 (en) User behavior data writing method and device, computer equipment and storage medium
CN115795521A (en) Access control method, device, electronic equipment and storage medium
CN114116773A (en) Structured Query Language (SQL) text auditing method and device
CN112667682A (en) Data processing method, data processing device, computer equipment and storage medium
CN107679093B (en) Data query method and device
CN105653568A (en) Method and apparatus analyzing user behaviors
US11797578B2 (en) Technologies for unsupervised data classification with topological methods
CN116955470A (en) Method and device for constructing data partition strategy
CN117435492A (en) Database performance test method, device, computer equipment and storage medium

Legal Events

Date Code Title Description
EEER Examination request

Effective date: 20220407
