CN111475534B - Data query method and related equipment - Google Patents

Info

Publication number
CN111475534B
Authority
CN
China
Prior art keywords
information
query
data
target
statement
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010397694.2A
Other languages
Chinese (zh)
Other versions
CN111475534A (en)
Inventor
钟舒妍
邓范鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Aibee Technology Co Ltd
Original Assignee
Beijing Aibee Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Aibee Technology Co Ltd
Priority to CN202010397694.2A
Publication of CN111475534A
Application granted
Publication of CN111475534B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2453Query optimisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/242Query formulation
    • G06F16/2433Query languages
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25Integrating or interfacing systems involving database management systems
    • G06F16/252Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a data query method and related equipment. The method comprises the following steps: after a query statement input by a user is obtained, the query statement is first parsed to obtain first information and second information. The first information is information about the data source in which the query target is stored, and includes third information and fourth information; the third information represents the data type of the target data set, the fourth information represents the storage identifier of the target data set in the data source, and the target data set is the data set required for query processing of the query target; the second information represents the feature identifier of the query target. Then, a query action is determined according to the first information, and the target data set is determined from a data pool according to the first information; the data pool comprises N data sources; each data source includes at least one data set, and the data types of the data sets stored in different data sources are different. Finally, the query target is determined using the query action and the target data set, thereby improving query efficiency.

Description

Data query method and related equipment
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a data query method and related devices.
Background
With the proliferation of data, the data types of the data sets used to record data (e.g., tables, texts, documents, knowledge graphs, etc.) have also diversified. For example, the data types of a data set may include structured data (e.g., tables) as well as unstructured data (e.g., documents or knowledge graphs).
At present, when a technician needs to query data, the technician must first determine, according to the data type of the target data set (that is, the data set from which the data to be queried is determined), the query language needed to process the target data set (for example, an object-oriented programming language or Structured Query Language (SQL)) and take it as the target query language. Then, the technician uses the target query language to perform query processing on the target data set and determines the data to be queried. For example, when the target data set is structured data, the technician may first write an SQL query statement and then use it to query data from the database. The data query process is therefore cumbersome, because technicians need to use different query languages for data sets of different data types.
Disclosure of Invention
In order to solve the technical problems in the prior art, the present application provides a data query method and related devices, so that technicians do not need to use different query languages to query data sets of different data types, which simplifies the data query process and improves data query efficiency.
In order to achieve the above object, the embodiments of the present application provide the following technical solutions:
the embodiment of the application provides a data query method, which comprises the following steps:
acquiring a query statement input by a user; the query statement carries information required for querying a query target;
analyzing the query statement to obtain first information and second information; wherein the first information comprises third information and fourth information; the third information represents the data type of a target data set, the fourth information represents the storage identifier of the target data set in a data source, and the target data set is a data set required for query processing of the query target; the second information represents the characteristic identification of the query target;
determining a query action according to the third information, and determining the target data set from a data pool according to the first information; the data pool comprises N data sources, wherein N is a positive integer; the data sources comprise at least one data set, and the data types of the data sets stored in different data sources are different;
determining the query target using the query action and the target dataset.
Optionally, the determining a query action according to the third information specifically includes:
determining the query action according to the third information and the first mapping relation; and the first mapping relation is used for recording query actions corresponding to data sets of different data types.
Optionally, the analyzing the query statement to obtain the first information and the second information specifically includes:
analyzing the query statement to obtain first information, second information and data operation information;
the determining a query action according to the third information specifically includes:
and generating a query action according to the third information and the data operation information.
Optionally, the generating a query action according to the third information and the data operation information specifically includes:
determining an initial action according to the third information and a second mapping relation; the second mapping relation is used for recording query actions corresponding to data sets of different data types;
and generating a query action according to the initial action and the data operation information.
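The two steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation described in the application: the mapping `SECOND_MAPPING`, the function names, and the shape of the data operation information (a `field`/`equals` pair) are all hypothetical.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Action = Callable[[List[Row]], List[Row]]


def scan_table(rows: List[Row]) -> List[Row]:
    """Hypothetical initial action for table data: return every row."""
    return list(rows)


# Hypothetical "second mapping relation": data type -> initial action.
SECOND_MAPPING: Dict[str, Action] = {"table": scan_table}


def generate_query_action(data_type: str, operation: Dict[str, object]) -> Action:
    """Refine the initial action with a constraint from the data operation information."""
    initial = SECOND_MAPPING[data_type]  # step 1: look up the initial action
    field, expected = operation["field"], operation["equals"]

    def action(rows: List[Row]) -> List[Row]:
        # Step 2: apply the initial action, then keep rows that satisfy the constraint.
        return [row for row in initial(rows) if row.get(field) == expected]

    return action


rows = [{"name": "a", "count": 1}, {"name": "b", "count": 2}]
action = generate_query_action("table", {"field": "name", "equals": "b"})
result = action(rows)
```

Here the initial action depends only on the data type, while the constraint comes from the query statement, so the same mapping entry can serve many different queries.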
Optionally, the analyzing the query statement to obtain the first information and the second information specifically includes:
analyzing the query statement to obtain first information, second information and fifth information; wherein the fifth information is attribute description information of the query target in the target data set;
the determining a query action according to the third information specifically includes:
and generating a query action according to the third information and the fifth information.
Optionally, the determining the target data set from the data pool according to the first information specifically includes:
determining a target data source from the data pool according to the third information;
and determining a target data set from the target data source according to the fourth information.
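A minimal sketch of this two-step lookup is shown below. It assumes the data pool can be modeled as a dictionary keyed by data type, with each data source holding its data sets under a storage identifier; the names `DATA_POOL` and `resolve_target_dataset` and the sample identifiers are hypothetical.

```python
from typing import Dict, List

# Hypothetical data pool: one data source per data type,
# each data source holding named data sets.
DATA_POOL: Dict[str, Dict[str, List[dict]]] = {
    "table": {"sales.orders": [{"id": 1, "amount": 30}]},
    "graph": {"people.friends": [{"src": "ann", "dst": "bob"}]},
}


def resolve_target_dataset(
    pool: Dict[str, Dict[str, List[dict]]], data_type: str, storage_id: str
) -> List[dict]:
    """Find the target data set from the third and fourth information."""
    # Step 1: the third information (data type) selects the target data source.
    try:
        source = pool[data_type]
    except KeyError:
        raise LookupError(f"no data source for data type {data_type!r}") from None
    # Step 2: the fourth information (storage identifier) selects the data set.
    try:
        return source[storage_id]
    except KeyError:
        raise LookupError(f"no data set {storage_id!r} in {data_type!r} source") from None


orders = resolve_target_dataset(DATA_POOL, "table", "sales.orders")
```

Because each data source stores a single data type, the data type alone is enough to pick the source before the storage identifier is consulted.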
Optionally, the analyzing the query statement to obtain the first information and the second information specifically includes:
identifying the programming paradigm type used by the query statement, and determining the programming paradigm type as a target programming paradigm type;
and analyzing the query statement according to the target programming paradigm type to obtain first information and second information.
An embodiment of the present application further provides a data query device, including:
an acquisition unit, configured to acquire a query statement input by a user; the query statement carries information required for querying a query target;
an analysis unit, configured to analyze the query statement to obtain first information and second information; wherein the first information comprises third information and fourth information; the third information represents the data type of a target data set, the fourth information represents the storage identifier of the target data set in a data source, and the target data set is a data set required for query processing of the query target; the second information represents the characteristic identification of the query target;
a first determining unit, configured to determine a query action according to the third information, and determine the target data set from a data pool according to the first information; the data pool comprises N data sources, wherein N is a positive integer; the data sources comprise at least one data set, and the data types of the data sets stored in different data sources are different;
a second determining unit for determining the query target using the query action and the target dataset.
An embodiment of the present application further provides an apparatus, where the apparatus includes a processor and a memory:
the memory is used for storing a computer program;
the processor is configured to execute any implementation manner of the data query method provided by the embodiment of the application according to the computer program.
The embodiment of the present application further provides a computer-readable storage medium, where the computer-readable storage medium is used for storing a computer program, and the computer program is used for executing any implementation manner of the data query method provided in the embodiment of the present application.
Compared with the prior art, the embodiment of the application has at least the following advantages:
in the data query method provided by the embodiment of the application, after the query statement input by the user is obtained, the query statement is first analyzed to obtain the first information and the second information. The first information comprises third information and fourth information; the third information represents the data type of the target data set, the fourth information represents the storage identifier of the target data set in the data source, and the target data set is the data set required for query processing of the query target; the second information represents the feature identifier of the query target. Then, a query action is determined according to the first information, and the target data set is determined from the data pool according to the first information; the data pool comprises N data sources; each data source includes at least one data set, and the data types of the data sets stored in different data sources are different. Finally, the query target is determined using the query action and the target data set.
It can be seen that the query statement input by the user carries the information required for querying the query target (for example, the data type of the target data set, the storage identifier of the target data set, and the feature identifier of the query target). Therefore, after the first information and the second information are obtained by parsing the query statement, the query action and the target data set used for querying the query target can be determined directly from the first information and the second information, and the query target can be determined from the target data set by using the query action. In this way, data sets of different data types can be queried with one query statement input by the user, which overcomes the defect that technicians need to use different query languages for data sets of different data types, simplifies the data query process, and improves data query efficiency.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the description below are only some embodiments described in the present application, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a flowchart of a data query method provided in an embodiment of the present application;
FIG. 2 is a schematic diagram of a data pool provided in an embodiment of the present application;
FIG. 3 is a schematic diagram of a data query provided in an embodiment of the present application;
FIG. 4 is a schematic diagram of a syntax structure of an MQL statement provided in an embodiment of the present application;
FIG. 5 is a schematic diagram of a graph provided by an embodiment of the present application;
FIG. 6 is a schematic diagram of a data set for storing RDF data according to an embodiment of the present application;
FIG. 7 is a schematic diagram of a data source for storing document data according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of a data query device according to an embodiment of the present application;
FIG. 9 is a schematic structural diagram of an apparatus provided in an embodiment of the present application.
Detailed Description
In studying traditional data query, the inventor found the following defects: (1) In a traditional data query process, data sets of different data types must be queried with different query languages, so technicians need to know several query languages, which raises the technical threshold for technicians. (2) In a traditional data query process, a query language is suitable for querying data sets of only one data type (e.g., SQL is only suitable for querying structured data such as tables). However, many data query processes involve data sets of multiple data types; in that case a technician has to alternate between several query languages, which makes the query process cumbersome and the query efficiency low. (3) In some complex query scenarios, dependency relationships often exist between the query tasks on different data sets, so a single language cannot meet the requirements of such scenarios.
In order to solve the above technical problem, an embodiment of the present application provides a data query method, including: after the query statement input by the user is obtained, the query statement is first analyzed to obtain first information and second information. The first information comprises third information and fourth information; the third information represents the data type of the target data set, the fourth information represents the storage identifier of the target data set in the data source, and the target data set is the data set required for query processing of the query target; the second information represents the feature identifier of the query target. Then, a query action is determined according to the first information, and the target data set is determined from the data pool according to the first information; the data pool comprises N data sources; each data source includes at least one data set, and the data types of the data sets stored in different data sources are different. Finally, the query target is determined using the query action and the target data set.
It can be seen that the query statement input by the user carries the information required for querying the query target (for example, the data type of the target data set, the storage identifier of the target data set, and the feature identifier of the query target). Therefore, after the first information and the second information are obtained by parsing the query statement, the query action and the target data set used for querying the query target can be determined directly from the first information and the second information, and the query target can be determined from the target data set by using the query action. In this way, data sets of different data types can be queried based on one query statement input by the user, which overcomes the disadvantages of conventional data query, simplifies the data query process, and improves data query efficiency.
In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without making any creative effort belong to the protection scope of the present application.
Method embodiment
Referring to fig. 1, the figure is a flowchart of a data query method provided in an embodiment of the present application.
The data query method provided by the embodiment of the application comprises the following steps of S1-S5:
S1: Acquiring a query statement input by a user.
The query statement refers to an instruction statement input by a user for performing data query; and the query statement carries information required for querying the query target. The query target refers to a query result determined by using the query statement. For example, when a user queries "sum of 3 and 2" using a query statement, then the query target is "5" (i.e., the sum of 3 and 2).
It should be noted that the query target is not limited in the embodiments of the present application, for example, the query target may include at least one of data existing in a table, data calculated by using the data in the table, entities and/or relationships in a graph, characters recorded in a document, text information (e.g., semantic information, translation information, subject information, etc.) processed by using the characters recorded in the document, data existing in stream data, and data information mined from the stream data.
It should be further noted that the syntax structure of the query statement is not limited in the embodiments of the present application. For example, the query statement may adopt the syntax structure of the MQL statement provided in the following Grammar structure embodiment. That is, the query statement may be an MQL statement.
In addition, the information carried by the query statement is not limited in the embodiments of the present application. For ease of understanding and explanation, three cases are described below.
In the first case, the query statement carries the first information and the second information. The first information represents information of a data source storing the query target (namely, storage information of data required when the query target is determined); the second information represents the feature identifier (such as name, count, etc.) of the query target. It should be noted that the feature identifier of the query target is not limited in the embodiment of the present application, for example, the feature identifier of the query target may be a name identifier (e.g., an attribute name such as name, count, etc.).
In addition, the first information includes third information and fourth information. The third information represents the data type (e.g., table) of the target data set, the fourth information represents the storage identifier (e.g., web_data.Websites) of the target data set in the data source, and the target data set is the data set required for query processing of the query target. A data source comprises at least one data set, in particular a plurality of data sets of the same data type, and the data types of the data sets stored in different data sources are different. The data sources together form a data pool, so the data pool comprises N data sources, where N is a positive integer. For example, the data pool shown in FIG. 2 includes a first data source and a second data source. The first data source may comprise a relational database (e.g., an SQL database) as shown in FIG. 3 and is used to store structured data. The second data source may comprise a graph database and/or the Hadoop Distributed File System (HDFS) as shown in FIG. 3 and is used to store unstructured data.
It should be noted that the data type of the data set is not limited in the embodiments of the present application. For example, the data type of the data set may be structured data (e.g., table data) or unstructured data (e.g., graph data, stream data, Resource Description Framework (RDF) data, and document data).
Based on the first case, the query statement may carry the data type of the target data set, the storage identifier of the target data set in the data source, and the feature identifier of the query target, so that the query target can subsequently be queried based on the information carried in the query statement.
In the second case, the query statement carries data operation information in addition to the first information and the second information. The data operation information refers to related information of part or all of data operations required to be used when querying a query target. It should be noted that the content of the data operation information is not limited in the embodiments of the present application, and in one possible implementation, the data operation information includes a data operation and/or a constraint condition of the data operation. For example, the data operation information may be "query person. Where "query" is a data operation and "person.
In a third case, the query statement carries fifth information in addition to part or all of the above information. The fifth information is attribute description information of the query target in the target data set. For example, when the target data set is a graph and the query target is an entity in the graph, the fifth information is entity description information.
Based on the above, in the embodiment of the application, when a user (especially a technician) needs to determine a query target from a data pool, the user may input a query statement carrying information required for querying the query target, so that the query target can be determined by performing data query processing from the data pool based on the query statement in the following step.
S2: Parsing the query statement to obtain first information and second information.
The embodiment of the application does not limit the parsing process; it can be any process that extracts the information carried in the query statement and needed for querying the query target. For example, the parsing process of the query statement may specifically be: performing lexical and syntactic analysis on the query statement to generate a syntax tree, and identifying a prefix declaration, keywords (SELECT, FROM, and WHERE), an expression, a data source, and a query target from the syntax tree.
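As a rough illustration of extracting this information from a statement, the sketch below parses a simplified SELECT/FROM/WHERE statement. It rests on several assumptions: the statement syntax (`FROM <data type>:<storage identifier>`) is hypothetical and is not the MQL grammar, and a regular expression stands in for a full lexer and syntax tree.

```python
import re
from typing import NamedTuple, Optional


class ParsedQuery(NamedTuple):
    data_type: str             # third information
    storage_id: str            # fourth information
    target: str                # second information (feature identifier)
    condition: Optional[str]   # optional constraint from the WHERE clause


# Hypothetical simplified syntax: SELECT <target> FROM <type>:<storage id> [WHERE <condition>]
_PATTERN = re.compile(
    r"^\s*SELECT\s+(?P<target>\w+)"
    r"\s+FROM\s+(?P<data_type>\w+):(?P<storage_id>[\w.]+)"
    r"(?:\s+WHERE\s+(?P<condition>.+?))?\s*$",
    re.IGNORECASE,
)


def parse_query(statement: str) -> ParsedQuery:
    """Extract the information a later step needs from one query statement."""
    match = _PATTERN.match(statement)
    if match is None:
        raise ValueError(f"unsupported query statement: {statement!r}")
    return ParsedQuery(
        data_type=match["data_type"].lower(),
        storage_id=match["storage_id"],
        target=match["target"],
        condition=match["condition"],
    )


parsed = parse_query("SELECT name FROM table:web_data.websites WHERE count > 10")
```

A production parser would build a syntax tree so that nested expressions and prefix declarations can be handled; the regular expression only covers the flat case.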
In some cases, the query statement may be parsed according to information carried by the query statement, and based on this, some possible implementations of S2 are also provided, which are described in turn below.
In a first possible implementation manner, if the query statement carries the first information and the second information, S2 specifically is: parsing the query statement to obtain the first information and the second information. That is, in this case only the first information and the second information need to be parsed from the query statement.
In a second possible implementation manner, if the query statement carries the first information, the second information, and the data operation information, S2 specifically is: parsing the query statement to obtain the first information, the second information, and the data operation information. That is, in this case only the first information, the second information, and the data operation information need to be parsed from the query statement.
In a third possible implementation manner, if the query statement carries the first information, the second information, and the fifth information, S2 specifically is: parsing the query statement to obtain the first information, the second information, and the fifth information. That is, in this case only the first information, the second information, and the fifth information need to be parsed from the query statement.
In a fourth possible implementation manner, if the query statement carries the first information, the second information, the data operation information, and the fifth information, S2 specifically is: parsing the query statement to obtain the first information, the second information, the data operation information, and the fifth information. That is, in this case only these four kinds of information need to be parsed from the query statement.
Based on the four possible implementation manners of S2, in the embodiment of the present application, useful information carried in the query statement (that is, information required when the query is performed on the query target) may be correspondingly analyzed, so as to obtain various information used when the query is performed on the query target.
In addition, the embodiments of the present application do not limit the programming paradigm types supported by the query language; for example, an object-oriented programming paradigm, a functional programming paradigm, and an SQL-like programming paradigm may be supported. In the object-oriented programming paradigm, the value to be queried is regarded as an object whose data type and behavior are defined by a class; the class contains the corresponding data operation methods and the necessary attributes, and data query and processing are realized by calling the methods of the class. In the functional programming paradigm, an operation or query process is written as a function in the form of an expression; the function value is the result returned after the expression is evaluated, and it can be expressed on its own or embedded in a higher-order function. In the SQL-like programming paradigm, an SQL-like query statement is input, which contains a data source, keywords, and a query body; existing functions can be called in the statement to realize simple operations, and user-defined methods can realize somewhat more complex data processing. SQL-like programming is more friendly to structured data queries.
Because query statements written in different programming paradigms have different characteristics, they should be parsed with different methods. Therefore, after the query statement is obtained, the programming paradigm type it uses can be determined first, and the query statement can then be parsed based on the determined type. Based on this, the embodiment of the present application also provides another implementation manner of S2, in which S2 may specifically be: identifying the programming paradigm type used by the query statement and determining it as the target programming paradigm type; and parsing the query statement according to the target programming paradigm type to obtain the first information and the second information.
In the embodiment of the application, after a query statement input by a user is obtained, the programming paradigm type used by the query statement is identified according to the structural features of the statement and taken as the target programming paradigm type; the query statement is then parsed according to the target programming paradigm type to obtain at least two of the first information, the second information, the data operation information, and the fifth information. Therefore, a user can write the query statement in any supported programming paradigm according to personal habits or business requirements, and the query statement is parsed according to the programming paradigm type the user used, which improves the parsing accuracy. It can be seen that the user (especially a technician) only needs to understand one programming paradigm to query both structured data and unstructured data, which effectively lowers the technical threshold for technicians.
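One way to picture identification by structural features is the sketch below. The heuristics are assumptions made for illustration only (a leading SELECT keyword, a leading `object.method(` call, or a leading `function(` call); the application does not specify them), and the sample statements and the `parse` helper are hypothetical.

```python
import re
from typing import Callable, Dict


def identify_paradigm(statement: str) -> str:
    """Guess the programming paradigm type from the shape of the statement."""
    text = statement.strip()
    if re.match(r"(?i)select\b", text):
        return "sql-like"            # starts with the SELECT keyword
    if re.match(r"[A-Za-z_]\w*\s*\.\s*[A-Za-z_]\w*\s*\(", text):
        return "object-oriented"     # starts with object.method(
    if re.match(r"[A-Za-z_]\w*\s*\(", text):
        return "functional"          # starts with function(
    raise ValueError(f"cannot identify paradigm of {statement!r}")


def parse(statement: str, parsers: Dict[str, Callable[[str], dict]]) -> dict:
    """Dispatch the statement to the parser registered for its paradigm type."""
    return parsers[identify_paradigm(statement)](statement)


kinds = [
    identify_paradigm("SELECT name FROM websites"),
    identify_paradigm("websites.filter(count > 10)"),
    identify_paradigm("filter(websites, gt(count, 10))"),
]
```

The `parsers` argument would hold one parsing routine per paradigm type, so adding a paradigm means registering one more entry rather than changing the dispatch logic.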
Based on the related content of S2, in the embodiment of the present application, after the query statement input by the user is obtained, the query statement may be analyzed, so as to obtain various information required when querying the query target.
S3: Determine the query action according to the third information.
The query action refers to a data operation used when a query target is queried from a data pool. In addition, the query action is not limited in the embodiments of the present application, for example, if the query target is related to data in the table (for example, the query target is data existing in the table or data calculated by using data in the table), the query action may include a table query processing action; if the query target is related to data in the graph (e.g., entities and/or relationships in the graph), the query action may include a graph query processing action; if the query target is related to data in the document (such as characters recorded in the document or text information processed by using the characters recorded in the document), the query action may include a document query processing action; the query action may include a streaming data query processing action if the query target is related to data in the streaming data (e.g., data present in the streaming data, or data information mined from the streaming data).
The embodiment of the present application does not limit the query action. For example, the query action may include at least one calculation function, drawn from a conventional calculation function set and/or a preset function library. The conventional calculation functions provide arithmetic and logical operators to support simple logical and mathematical operations, and also support dictionary, list and set comprehensions and iterative expressions. The preset function library may be a standard library and/or a third-party library and provides mathematical function support for complex operations: the standard library can be called to implement functions such as database access, text processing, image processing and XML processing, while functions in a third-party library can be used for scientific computing, such as matrix calculation, linear algebra, data modeling and data visualization.
The query action may be determined based on the data type of the target dataset (i.e., the dataset needed for query processing for the query target).
In addition, the embodiment of the present application does not limit how the query action is determined; several implementations of S3 are described below.
In a first possible implementation, S3 may specifically be: determining the query action according to the third information and a first mapping relation, where the first mapping relation records the query actions for data sets of different data types.
Based on the first possible implementation manner, if the query action of the data sets with different data types is recorded by using the first mapping relationship in advance, after the third information (that is, the data type of the data set required for performing query processing on the query target) is analyzed from the query statement, the query action corresponding to the third information may be determined from the first mapping relationship.
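As a minimal sketch of the first implementation, the first mapping relation can be modelled as a lookup table keyed by data type. The action names below are hypothetical labels chosen for illustration; only the data type keywords (table, graph, doc, stream) come from the description.

```python
# Hypothetical first mapping relation: data type -> query action label.
FIRST_MAPPING = {
    "table": "table_query",
    "graph": "graph_query",
    "doc": "document_query",
    "stream": "stream_query",
}


def determine_query_action(third_info: str) -> str:
    """Look up the query action recorded for the data type of the target data set."""
    try:
        # MQL is described as case-insensitive, so normalise the key.
        return FIRST_MAPPING[third_info.lower()]
    except KeyError:
        raise ValueError(f"no query action recorded for data type {third_info!r}") from None
```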
In a second possible implementation, when the query statement carries the third information and the data operation information, S3 may specifically be: generating the query action according to the third information and the data operation information.
The embodiment of the present application does not limit the specific manner of generating the query action based on the third information and the data operation information. In a possible embodiment, S3 may specifically be: determining an initial action according to the third information and a second mapping relation; and generating the query action according to the initial action and the data operation information, where the second mapping relation records the query actions for data sets of different data types.
Based on the related content of the second possible implementation manner of the S3, when the query statement carries the third information and the data operation information, the query action may be generated based on the third information and the data operation information, so that the determined query action meets the data type of the target data set carried in the query statement and the query requirement specified by the data operation information.
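The second implementation can be sketched as composing an initial action with a constraint. In this illustration the initial action is a plain scan and the data operation information is assumed to have already been turned into a Python predicate; both choices, and every name below, are assumptions rather than details given in the description.

```python
from typing import Callable, Iterable, Optional

Row = dict
Action = Callable[[Iterable[Row]], list]


def scan_rows(rows: Iterable[Row]) -> list:
    """Initial action: return every record of the data set."""
    return list(rows)


# Hypothetical second mapping relation: data type -> initial action.
SECOND_MAPPING = {"table": scan_rows, "doc": scan_rows}


def generate_query_action(third_info: str,
                          predicate: Optional[Callable[[Row], bool]] = None) -> Action:
    """Combine the initial action for the data type with the data operation constraint."""
    initial_action = SECOND_MAPPING[third_info]
    if predicate is None:
        return initial_action

    def query_action(rows: Iterable[Row]) -> list:
        return [row for row in initial_action(rows) if predicate(row)]

    return query_action
```

For instance, `generate_query_action("doc", lambda r: r["score"] < 5.0)` yields an action that keeps only records whose score is below 5.0.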
In a third possible implementation, when the query statement carries the third information and the fifth information, S3 may specifically be: generating the query action according to the third information and the fifth information.
Based on the related content of the third possible implementation of S3, when the query statement carries the third information and the fifth information, the query action may be generated based on the third information and the fifth information, so that the determined query action meets both the data type of the target data set carried in the query statement and the query requirement specified by the fifth information.
In a fourth possible implementation, S3 may specifically be: determining at least one group of candidate actions according to the third information; and determining, from the at least one group of candidate actions, a group of candidate actions that meets a preset condition as the query action. The preset condition is set in advance and is not limited in the embodiment of the present application; for example, the preset condition may be to select the group of actions that takes the shortest time.
Based on the related content of the fourth possible implementation of S3, after multiple groups of candidate actions are determined according to the third information, the group of candidate actions that meets the preset condition may be selected from them as the query action, so that the finally determined query action is the preferred one.
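A sketch of the fourth implementation, taking "shortest time" as the preset condition. The candidate names and the estimated costs are invented for illustration; the description does not say how the time of a candidate group is obtained.

```python
from typing import List, Tuple

# Each candidate is (name of the action group, estimated time in seconds).
Candidate = Tuple[str, float]


def select_query_action(candidates: List[Candidate]) -> str:
    """Return the candidate action group with the shortest estimated time."""
    if not candidates:
        raise ValueError("at least one group of candidate actions is required")
    name, _cost = min(candidates, key=lambda candidate: candidate[1])
    return name
```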
Based on the above-mentioned related content of S3, in the embodiment of the present application, after the third information is extracted from the query statement, the query action may be determined by using the third information, so that the query of the query target can be performed in the data pool based on the query action in the following.
S4: Determine a target data set from the data pool according to the first information.
In this embodiment of the application, after the first information is obtained, the target data set may be determined from the data pool according to the first information, specifically by: determining a target data source from the data pool according to the third information; and determining the target data set from the target data source according to the fourth information. As can be seen, after the first information is parsed from the query statement, the data source in the data pool that corresponds to the data type of the target data set recorded in the first information (that is, the third information) may be determined as the target data source; and the data set in the target data source that corresponds to the storage identifier of the target data set recorded in the first information (that is, the fourth information) may be determined as the target data set.
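The two-step lookup in S4 can be sketched with a nested dictionary standing in for the data pool. The pool layout and the sample records are made up for illustration; only the identifiers web_data.Websites and doc_set are borrowed from the examples in this description.

```python
# Hypothetical data pool: data type -> data source, storage identifier -> data set.
DATA_POOL = {
    "table": {"web_data.Websites": [{"name": "Example A", "country": "CN"}]},
    "doc": {"doc_set": [{"name": "Forrest Gump", "score": 8.8}]},
}


def determine_target_dataset(pool: dict, third_info: str, fourth_info: str) -> list:
    """Pick the data source by data type, then the data set by storage identifier."""
    try:
        target_source = pool[third_info]
    except KeyError:
        raise LookupError(f"no data source for data type {third_info!r}") from None
    try:
        return target_source[fourth_info]
    except KeyError:
        raise LookupError(f"no data set {fourth_info!r} in the {third_info!r} source") from None
```

Because each data source in the pool stores a different data type, the data type alone is enough to select the source.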
It should be noted that the target data set is not limited by the embodiments of the present application; for example, the target data set may include at least one of a table, a graph, stream data, and a document.
It should be noted that the embodiment of the present application does not limit the execution order of S3 and S4. For example, S3 and S4 may be performed sequentially, S4 and S3 may be performed sequentially, and S3 and S4 may be performed simultaneously.
S5: Determine the query target using the query action and the target data set.
In the embodiment of the application, after the query action and the target data set are obtained, the query action can be used to query the target data set and thereby determine the query target.
Based on the relevant contents of S1 to S5, in the data query method provided in the embodiment of the present application, after the query statement input by the user is obtained, the query statement is first parsed to obtain the first information and the second information. The first information is information on the data source in which the query target is stored, and comprises third information and fourth information; the third information represents the data type of the target data set, the fourth information represents the storage identifier of the target data set in the data source, and the target data set is the data set required for query processing of the query target; the second information represents the feature identifier of the query target. Then, the query action is determined according to the third information, and the target data set is determined from the data pool according to the first information; the data pool comprises N data sources, each data source includes at least one data set, and the data types of the data sets stored in different data sources are different. Finally, the query target is determined using the query action and the target data set.
It can be seen that, because the query statement input by the user carries information (for example, a data type of the target data set, a storage identifier of the target data set, and feature identifier information of the query target) required for querying the query target, after the first information and the second information are obtained by parsing the query statement, the query action and the target data set used for querying the query target can be directly determined by using the first information and the second information, and the query target is determined from the target data set by using the query action, so that the purpose of performing data query (as shown in fig. 3) on data sets of different data types based on one query statement input by the user is achieved, the disadvantages of conventional data query are overcome, the data query process is simplified, and the data query efficiency is improved.
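Steps S3 to S5 can be put together in a short sketch that starts from already parsed information. The dictionary keys, the single "project" action and the sample pool are assumptions made for illustration only.

```python
def project(rows, fields):
    """Hypothetical table query processing action: keep only the requested fields."""
    return [{field: row[field] for field in fields} for row in rows]


# Hypothetical first mapping relation from data type to query action.
QUERY_ACTIONS = {"table": project}


def run_query(parsed: dict, pool: dict) -> list:
    action = QUERY_ACTIONS[parsed["third_info"]]                 # S3: query action
    dataset = pool[parsed["third_info"]][parsed["fourth_info"]]  # S4: target data set
    return action(dataset, parsed["second_info"])                # S5: query target


pool = {"table": {"web_data.Websites": [{"name": "Example A", "country": "CN", "alexa": 1}]}}
parsed = {
    "third_info": "table",               # data type of the target data set
    "fourth_info": "web_data.Websites",  # storage identifier in the data source
    "second_info": ["name", "country"],  # feature identifiers of the query target
}
result = run_query(parsed, pool)
```

S3 and S4 are independent of each other in this sketch, which matches the statement that they may be performed in either order or simultaneously.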
In addition, embodiments of the present application further provide a Multi-modal Query Language (MQL) statement (i.e., the above query statement) that can be applied to the above data query method. The MQL statement is explained below with reference to the syntax structure examples.
Example of syntax structure
Based on the above, the MQL statement provided in the embodiment of the present application can support various types of programming paradigms, so the embodiment of the present application does not limit the types of the programming paradigms supported by the MQL statement. For convenience of explaining the syntax structure of MQL, an MQL statement using a syntax similar to SQL will be described as an example.
Referring to fig. 4, this figure is a schematic diagram of a syntax structure of an MQL statement provided in the embodiment of the present application.
In a possible implementation manner, as shown in fig. 4, the MQL statement provided in this embodiment of the present application may be a multi-modal fused query statement. Its syntax structure is similar to that of an SQL statement and is case-insensitive; it supports update, query and command operations by the user, includes rich function packages and mllibs, and also supports user-defined parameters, files and functions.
In addition, as shown in fig. 4, the MQL statement includes a prefix declaration section, a query target information section, a data set storage information section, and a data manipulation information section. To facilitate understanding of the MQL statements, the above parts are described below in connection with table 1, respectively.
The prefix declaration part comprises a type declaration and an attribute declaration; wherein, the type declaration is used for declaring the data type of the target data set carried by the MQL statement. The attribute declaration is used to declare attribute description information that a query target carried by the MQL statement has in a target dataset (e.g., the attribute declaration may be an entity and/or a relationship in the graph). Based on this, when the query statement input by the user is an MQL statement, the third information above and the fifth information above may be parsed from the prefix declaration section of the query statement.
The query target information part is used for pointing out characteristic identification information (such as attribute identification in a table, entity identification in a graph, relation identification in the graph and the like) of a query target carried by the MQL statement. Based on this, when the query statement input by the user is an MQL statement, the above second information can be parsed from the query target information portion of the query statement.
The data set storage information part is used for pointing out the storage identification information of the target data set carried by the MQL statement in the data source. In this way, when the query statement input by the user is an MQL statement, the fourth information above can be parsed from the data set storage information part of the query statement.
The data operation information part is used for pointing out data operation related information carried by the MQL statement, and the data operation information part comprises data operation identification information and data operation constraint information. And the data operation identification information is used for uniquely identifying the data operation. The data operation constraint information refers to constraint condition information to which the data operation should comply. Based on this, when the query statement input by the user is an MQL statement, the above data operation information can be parsed from the data operation information part of the query statement.
It should be noted that the attribute declaration in the prefix declaration section is an optional parameter, that is, there may be no attribute declaration in some MQL statements, and there may be attribute declaration in other MQL statements. Similarly, the data operation information part is an optional part, that is, there may be no data operation information part in some MQL statements, and there may be a data operation information part in other MQL statements.
Table 1 (provided as images BDA0002488272310000141 and BDA0002488272310000151 in the original publication)
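The correspondence between the four parts of an MQL statement and the parsed information can be sketched with a small regular-expression parser. It is a simplified assumption, not the parser of the application: it handles only the SQL-like form with a single storage identifier, splits the query targets on commas, and uses hyphens in the relation pattern of the sample statement.

```python
import re

# "PREFIX name" is a type declaration; "PREFIX name:<expr>" is an attribute declaration.
PREFIX_RE = re.compile(r"^\s*PREFIX\s+([\w-]+)\s*(?::\s*(.+))?$", re.IGNORECASE)
QUERY_RE = re.compile(
    r"SELECT\s+(?P<targets>.+?)\s+FROM\s+(?P<storage>\S+)(?:\s+WHERE\s+(?P<where>.+))?$",
    re.IGNORECASE | re.DOTALL,
)


def parse_mql(statement: str) -> dict:
    type_decl = None
    attr_decls = {}
    query_lines = []
    for line in statement.strip().splitlines():
        match = PREFIX_RE.match(line)
        if match and not query_lines:
            name, expr = match.group(1), match.group(2)
            if expr is None:
                type_decl = name.lower()                 # prefix part: type declaration
            else:
                attr_decls[name.lower()] = expr.strip()  # prefix part: attribute declaration
        else:
            query_lines.append(line.strip())
    query = QUERY_RE.match(" ".join(query_lines))
    if type_decl is None or query is None:
        raise ValueError("not a well-formed MQL statement")
    return {
        "third_info": type_decl,
        "fifth_info": attr_decls,
        "second_info": [t.strip() for t in query.group("targets").split(",")],
        "fourth_info": query.group("storage"),
        "data_operation_info": query.group("where"),  # None when the optional part is absent
    }


parsed = parse_mql(
    "PREFIX graph\n"
    "PREFIX relation:{(person)-[r]->(movie)}\n"
    "SELECT r,type(r) FROM relation WHERE person.name='Tom Hanks'"
)
```

The optional parts show up naturally: a statement without an attribute declaration gives an empty `fifth_info`, and one without a WHERE clause gives `data_operation_info` equal to `None`.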
In addition, in order to facilitate understanding of the syntax structure of the MQL statement shown in fig. 4, the following description is given by taking the query syntax of data sets of different data types as an example.
(1) MQL statement introduction to structured data (e.g., table data).
The data characteristics of the structured data are: the database for storing structured data (i.e., the above relational database) may contain a plurality of tables, and each table is a data structure in a two-dimensional form, with one row of data representing one entity information in units of rows.
The query paradigm of the MQL statement for structured data is structured as follows: the data type table is declared in advance in the query statement; in the query statement, database represents the database name, tablename represents the table name, select_list is the query target, and expressions are optional constraints (i.e., data operation information). That is, the syntax structure of the MQL statement for structured data is specifically as follows:
# data type declaration
PREFIX table
# query statement
SELECT <select_list> FROM <database.tablename> WHERE <expressions>
The syntax structure of the MQL statement for structured data described above is explained below with reference to specific examples.
For example, when selecting the name and country columns from the Websites table of the web_data database (i.e., one of the data sources in the data pool) and storing the query result in a result table, the user may enter the following query statement:
PREFIX table
SELECT name,country FROM web_data.Websites
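Because the body of this statement is SQL-like, one way to picture the table query processing action is to strip the type declaration and forward the rest to a relational engine. The sketch below does this with SQLite purely as an assumption; the description does not say which engine is used, and the table contents are made-up rows.

```python
import sqlite3

MQL = """PREFIX table
SELECT name,country FROM web_data.Websites"""


def run_table_mql(conn: sqlite3.Connection, statement: str) -> list:
    """Check the type declaration, then execute the remaining lines as SQL."""
    lines = statement.strip().splitlines()
    if lines[0].strip().lower() != "prefix table":
        raise ValueError("expected the 'PREFIX table' type declaration")
    return conn.execute(" ".join(lines[1:])).fetchall()


conn = sqlite3.connect(":memory:")
# Attach a second in-memory database so that the name web_data.Websites resolves.
conn.execute("ATTACH DATABASE ':memory:' AS web_data")
conn.execute("CREATE TABLE web_data.Websites (name TEXT, country TEXT, alexa INTEGER)")
conn.executemany(
    "INSERT INTO web_data.Websites VALUES (?, ?, ?)",
    [("Example A", "CN", 1), ("Example B", "US", 2)],  # illustrative rows only
)
rows = run_table_mql(conn, MQL)
```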
(2) MQL statement introduction for unstructured data (such as graph data, streaming data, RDF data, document data, or mixed data).
(1) Graph data query
The data characteristics of graph data are as follows: there is only one graph in the database for storing the graph data (i.e., the graph database above), and each graph is composed of nodes and edges. The nodes comprise the variables, attributes and labels of entities, and the edges represent relationship types, relationship attributes and directions.
The query paradigm of the MQL statement for graph data is structured as follows: the data type graph is declared in advance in the query statement, and constraint statements on entity and relationship attributes are then defined as a supplementary definition of the data for the FROM clause. That is, the syntax structure of the MQL statement for graph data is specifically as follows:
# data type declaration
PREFIX graph
# Attribute declaration
[PREFIX entity:<expression>
PREFIX relation:<expression>]
# query statement
SELECT <select_list> [FROM entity|relation] WHERE <expressions>
The syntax structure of the MQL statement for graph data described above is explained below with reference to specific examples.
For example, when a data source in the data pool includes the graph shown in fig. 5, the data pool may be queried for the entities in fig. 5 (i.e., nodes in the graph) and the relationships between entities (i.e., edges in the graph). The queries are as follows:
As an example of a node query in the graph, when searching the graph shown in fig. 5 for a node that has a relationship with a node labeled movie and whose ID is 1, the user may declare the data type in the query statement, declare the relationship that the node satisfies, and call the id function in the WHERE clause. The user may therefore input the following query statement:
PREFIX graph
PREFIX relation:{(n)—(movie)}
SELECT n FROM relation WHERE id(n)=1
As an example of a relationship query in the graph, when finding the relationship between Tom Hanks and the movie Forrest Gump in the graph shown in fig. 5, the user may input the following query statement:
PREFIX graph
PREFIX relation:{(person)—[r]->(movie)}
SELECT r,type(r) FROM relation WHERE person.name='Tom Hanks' and movie.name='Forrest Gump'
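What the relationship query above asks for can be pictured with a tiny in-memory graph. The edge list below is a made-up stand-in for the graph of fig. 5, whose actual contents are not reproduced here, and the matching function is only an illustrative equivalent of the declared relation pattern plus the WHERE constraints.

```python
# Each edge is (person name, relationship type, movie name); illustrative data only.
EDGES = [
    ("Tom Hanks", "ACTED_IN", "Forrest Gump"),
    ("Robert Zemeckis", "DIRECTED", "Forrest Gump"),
    ("Tom Hanks", "ACTED_IN", "Cast Away"),
]


def match_relation(edges, person_name: str, movie_name: str) -> list:
    """Return the types of all edges leading from the person to the movie."""
    return [
        relation
        for person, relation, movie in edges
        if person == person_name and movie == movie_name
    ]
```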
(2) streaming data queries
The data characteristics of the stream data are similar to the relational data, and the stream data refers to real-time data in a rolling time window and can return the calculation result of the stream data at a certain moment. Additionally, the data source of the streaming data may be streaming data or other types of data.
The query paradigm of the MQL statement for streaming data is structured as follows: the data type stream is declared in advance in the query statement, and the configuration properties are defined. In addition, when querying streaming data, the SELECT statement cannot be used alone and can only be used together with the INSERT statement. That is, the syntax structure of the MQL statement for stream data is specifically as follows:
# data type declaration
PREFIX stream
# Attribute declaration
PREFIX properties:<expression>
# query statement
INSERT INTO STREAM streamname(select_list definition)properties
SELECT <select_list> FROM stream|datasource WHERE <expressions>
The syntax structure of the MQL statement for stream data described above is explained below with reference to specific examples.
For example, when importing data in the relational data table context_tb into a stream that has not yet been defined, the user may enter the following query statement:
PREFIX stream
PREFIX properties:(topic='mqlout',zookeepers='127.0.0.1:2181',brokers='127.0.0.1:9092')
INSERT INTO STREAM s1(context String,user_id String)properties
SELECT context,user_id FROM context_tb
(3) RDF data queries
The data characteristics of RDF data are: RDF is used to assist in querying dynamic web pages and is stored in the form of graph data; the data consists of subject-predicate-object triples, in which a subject node, a predicate node and an object node are connected in sequence; and a query target may be associated with multiple RDF data sets during the query process.
The query paradigm of the MQL statement for RDF data is structured as follows: the data type rdf is declared in advance in the query statement, together with the RDF data associated with the query. In addition, the triples to be queried can be defined if necessary. That is, the syntax structure of the MQL statement for RDF data is specifically as follows:
# data type declaration
PREFIX rdf
# Attribute declaration
PREFIX url_name:<url>
[PREFIX tri:<expression>]
# query statement
SELECT <select_list> FROM url_name WHERE <expression>
The syntax structure of the MQL statement for RDF data is explained below with reference to specific examples.
For example, assume that the data set shown in fig. 6 exists in the data source for storing RDF data in the data pool, and that this data set is RDF data describing apartments and their locations. Based on this assumption, when an apartment with fewer than 4 bedrooms needs to be found in the data set shown in fig. 6, the user can input the following query statement:
PREFIX rdf
PREFIX swp:<http://www.semanticwebprimer.org/ontology/apartments.ttl#>
PREFIX dbpedia:<http://www.dbpedia.org/resource/>
PREFIX dbpedia-owl:<http://dbpedia.org/ontology/>
PREFIX tri:{(apartment)-[swp:hasNumberOfBedrooms]-(num)}
SELECT apartment FROM tri
WHERE num<4
(4) document data query
The data characteristics of document data are as follows: document data is stored in JSON form, and multiple groups of document data are stored in a database (i.e., the above HDFS). It can be seen that a document set in the database is analogous to a table in a relational database, and each document is analogous to a row of data in a relational database.
The query paradigm of the MQL statement for document data is structured as follows: the data type doc is declared in advance in the query statement; docset is the storage identifier of a document set (i.e., a data source for storing document data) and is equivalent to a table name in relational data; the rest of the query process is similar to that for relational data. That is, the syntax structure of the MQL statement for document data is specifically as follows:
# data type declaration
PREFIX doc
# query statement
SELECT <select_list> FROM database.docset WHERE <expression>
The syntax structure of the MQL sentence for document data described above is explained below with reference to a specific example.
For example, assume that the data source for storing document data shown in fig. 7 exists in the data pool, that its storage identifier in the data pool is doc_set, and that two documents are stored in it. Based on this assumption, when scores lower than 5.0 need to be queried in the data source shown in fig. 7, the user can input the following query statement:
PREFIX doc
SELECT score FROM doc_set
WHERE score<5.0
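The effect of this document query can be sketched over a list of JSON documents. The two documents below are invented placeholders for the contents of fig. 7, which is not reproduced here.

```python
import json

# Illustrative stand-in for the document set identified as doc_set.
DOC_SET = [
    json.loads('{"name": "Movie A", "score": 4.2}'),
    json.loads('{"name": "Movie B", "score": 8.8}'),
]


def select_scores_below(documents: list, limit: float) -> list:
    """Equivalent of SELECT score ... WHERE score < limit over JSON documents."""
    return [doc["score"] for doc in documents if doc["score"] < limit]
```

Each document plays the role of a row and each JSON key the role of a column, which is why the statement reads like its relational counterpart.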
(5) hybrid data query
The data characteristics of mixed data are as follows: the fields of the query target come from different data sources, and the result is finally stored in a relational-like data format.
The query paradigm of the MQL statement for mixed data is structured as follows: all data types to which the data sets required for querying the query target belong, together with the necessary attribute declarations, are declared in the query statement; and each field in the query statement is qualified with its data type. That is, the syntax structure of the MQL statement for mixed data is specifically as follows:
PREFIX datatypeA
PREFIX datatypeB
[PREFIX……]
SELECT datatypeA.fieldA,datatypeB.fieldB FROM
datatypeA.database.table,datatypeB.database.table
WHERE <expression>
The syntax structure of the MQL statement for mixed data described above is explained below with reference to a specific example.
For example, assume that the data pool includes a graph database and a document database (i.e., the above HDFS), where the graph database includes a graph recording the relationships between movies and people, and the document database stores multiple documents recording movie scores. Based on this assumption, when the score of a movie starring Tom Hanks needs to be queried, the user may enter the following query statement:
PREFIX doc
PREFIX graph
PREFIX relation:{(person)—[r]->(movie)}
SELECT doc.doc_set.score,graph.relation.r FROM doc.doc_set,graph.relation
WHERE doc.doc_set.name=graph.relation.movie and graph.person.name='Tom Hanks'
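The mixed query above amounts to joining the document set with the graph relation on the movie name. The following sketch shows that join in memory; the documents, scores and edges are invented for illustration, and the function is an assumed equivalent rather than the processing described in the application.

```python
# Illustrative document set (scores are invented) and graph edges.
DOC_SET = [
    {"name": "Forrest Gump", "score": 8.8},
    {"name": "Cast Away", "score": 7.8},
    {"name": "Movie C", "score": 6.0},
]
EDGES = [
    ("Tom Hanks", "ACTED_IN", "Forrest Gump"),
    ("Tom Hanks", "ACTED_IN", "Cast Away"),
    ("Someone Else", "ACTED_IN", "Movie C"),
]


def scores_for_person(documents: list, edges: list, person_name: str) -> list:
    """Join documents and edges on the movie name, filtered by the person name."""
    score_by_movie = {doc["name"]: doc["score"] for doc in documents}
    return [
        (score_by_movie[movie], relation)
        for person, relation, movie in edges
        if person == person_name and movie in score_by_movie
    ]
```

The result rows pair a field from the document source with a field from the graph source, which corresponds to storing the mixed result in a relational-like format.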
Based on the related content of the MQL statement, the MQL statement provided by the embodiment of the present application breaks down the language barrier and makes it possible to query data sets of various data types with a single query language, so that a user can efficiently and accurately query various data in a data pool by using MQL statements.
Based on the data query method provided by the above method embodiment, the embodiment of the present application further provides a data query device, which is explained below with reference to the accompanying drawings.
Device embodiment
Please refer to the above method embodiment for technical details of the data query device provided by the device embodiment.
Referring to fig. 8, the figure is a schematic structural diagram of a data query device according to an embodiment of the present application.
The data query apparatus 800 provided in the embodiment of the present application includes:
an acquisition unit 801, configured to acquire a query statement input by a user; the query statement carries information required for querying a query target;
an analyzing unit 802, configured to analyze the query statement to obtain first information and second information; wherein the first information comprises third information and fourth information; the third information represents the data type of a target data set, the fourth information represents the storage identifier of the target data set in a data source, and the target data set is a data set required for query processing of the query target; the second information represents the characteristic identification of the query target;
a first determining unit 803, configured to determine a query action according to the third information, and determine the target data set from a data pool according to the first information; the data pool comprises N data sources, wherein N is a positive integer; the data sources comprise at least one data set, and the data types of the data sets stored in different data sources are different;
a second determining unit 804, configured to determine the query target by using the query action and the target data set.
In one possible implementation, the first determining unit 803 includes:
the first determining subunit is configured to determine the query action according to the third information and the first mapping relationship; wherein the first mapping relation is used for recording query actions of data sets of different data types.
In a possible implementation manner, the parsing unit 802 is specifically configured to parse the query statement to obtain first information, second information, and data operation information;
the first determining subunit is specifically configured to generate a query action according to the third information and the data operation information.
In a possible implementation manner, the first determining subunit is specifically configured to: determining an initial action according to the third information and a second mapping relation; the second mapping relation is used for recording query actions of data sets of different data types; and generating a query action according to the initial action and the data operation information.
In a possible implementation manner, the parsing unit 802 is specifically configured to parse the query statement to obtain first information, second information, and fifth information; wherein the fifth information is attribute description information of the query target in the target data set;
the first determining subunit is specifically configured to generate a query action according to the third information and the fifth information.
In a possible implementation, the first determining unit 803 includes:
a second determining subunit, configured to determine, according to the third information, a target data source from the data pool; and determining a target data set from the target data source according to the fourth information.
In a possible implementation manner, the parsing unit 802 is specifically configured to identify a programming paradigm type used by the query statement, and determine that the programming paradigm type is a target programming paradigm type; and analyzing the query statement according to the target programming paradigm type to obtain first information and second information.
As can be seen from the related contents of the data query apparatus 800 provided above, in the embodiment of the present application, after the query statement input by the user is acquired, the query statement is first parsed to obtain the first information and the second information. The first information is information on the data source in which the query target is stored, and comprises third information and fourth information; the third information represents the data type of the target data set, the fourth information represents the storage identifier of the target data set in the data source, and the target data set is the data set required for query processing of the query target; the second information represents the feature identifier of the query target. Then, the query action is determined according to the third information, and the target data set is determined from the data pool according to the first information; the data pool comprises N data sources, each data source includes at least one data set, and the data types of the data sets stored in different data sources are different. Finally, the query target is determined using the query action and the target data set.
It can be seen that, since the query statement input by the user carries information (for example, a data type of the target data set, a storage identifier of the target data set, and feature identifier information of the query target) required for querying the query target, after the first information and the second information are obtained by parsing the query statement, the query action and the target data set used for querying the query target can be directly determined by using the first information and the second information, and the query target is determined from the target data set by using the query action, so that the purpose of querying data of data sets of different data types by using one query statement input by the user is achieved, the defect that technicians need to input different query languages for the data sets of different data types for querying is overcome, the data query process is simplified, and the data query efficiency is improved.
Based on the data query method provided by the above method embodiment, the embodiment of the present application further provides a device, which is described below with reference to the accompanying drawings.
Apparatus embodiment
For technical details of the device provided by this device embodiment, please refer to the above method embodiment.
Refer to FIG. 9, which is a schematic structural diagram of a device provided in the embodiment of the present application.
The device 900 provided in the embodiment of the present application includes: a processor 901 and a memory 902;
the memory 902 is used for storing computer programs;
the processor 901 is configured to execute any implementation manner of the data query method provided by the above method embodiments according to the computer program. That is, the processor 901 is configured to perform the following steps:
acquiring a query statement input by a user; the query statement carries information required for querying a query target;
analyzing the query statement to obtain first information and second information; wherein the first information comprises third information and fourth information; the third information represents the data type of a target data set, the fourth information represents the storage identifier of the target data set in a data source, and the target data set is a data set required for query processing of the query target; the second information represents the characteristic identification of the query target;
determining a query action according to the third information, and determining the target data set from a data pool according to the first information; the data pool comprises N data sources, wherein N is a positive integer; the data sources comprise at least one data set, and the data types of the data sets stored in different data sources are different;
determining the query target using the query action and the target data set.
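The four steps above can be sketched end to end as follows. This is only an illustrative sketch: the statement layout `<data type>.<storage identifier>:<feature identifier>`, the names `ParsedQuery`, `parse_statement`, `DATA_POOL`, `QUERY_ACTIONS` and `run_query`, and the sample data are all assumptions made for the example and are not taken from the embodiment.

```python
from dataclasses import dataclass

# Hypothetical statement layout (an assumption for this sketch):
#   "<data type>.<storage identifier>:<feature identifier>"
# for example "kv.user_profiles:user_42".


@dataclass(frozen=True)
class ParsedQuery:
    data_type: str   # third information
    storage_id: str  # fourth information
    feature_id: str  # second information


def parse_statement(statement: str) -> ParsedQuery:
    """Analyze the statement into first information (type + storage id) and second information."""
    location, feature_id = statement.split(":", 1)
    data_type, storage_id = location.split(".", 1)
    return ParsedQuery(data_type, storage_id, feature_id)


# Data pool: one data source per data type; each source holds named data sets.
DATA_POOL = {
    "kv": {"user_profiles": {"user_42": {"name": "Ada"}}},
    "table": {"orders": [{"id": "order_7", "total": 30}]},
}

# Illustrative query actions, one per data type.
QUERY_ACTIONS = {
    "kv": lambda data_set, key: data_set.get(key),
    "table": lambda data_set, key: next(
        (row for row in data_set if row["id"] == key), None
    ),
}


def run_query(statement: str):
    parsed = parse_statement(statement)
    # Query action determined from the third information.
    action = QUERY_ACTIONS[parsed.data_type]
    # Target data set determined from the first information.
    data_set = DATA_POOL[parsed.data_type][parsed.storage_id]
    # Query target determined using the query action and the target data set.
    return action(data_set, parsed.feature_id)
```

With this layout, `run_query("kv.user_profiles:user_42")` and `run_query("table.orders:order_7")` use the same statement form even though the two data sets are of different data types.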
Optionally, the determining a query action according to the third information specifically includes:
determining the query action according to the third information and the first mapping relation; wherein the first mapping relation is used for recording query actions of data sets of different data types.
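As a minimal sketch, the first mapping relation could be held as a lookup table keyed by data type. The data type names and action names below are hypothetical placeholders; the embodiment does not fix them.

```python
# First mapping relation: data type -> query action.
# All entries are illustrative assumptions, not values from the embodiment.
FIRST_MAPPING = {
    "relational": "sql_select",
    "key_value": "get_by_key",
    "graph": "traverse_from_vertex",
}


def determine_query_action(third_information: str) -> str:
    """Look up the query action recorded for the given data type."""
    try:
        return FIRST_MAPPING[third_information]
    except KeyError:
        raise ValueError(
            f"no query action recorded for data type {third_information!r}"
        ) from None
```

Keeping the relation in a table means that supporting a new data type only requires adding one entry.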
Optionally, the analyzing the query statement to obtain the first information and the second information specifically includes:
analyzing the query statement to obtain first information, second information and data operation information;
the determining a query action according to the third information specifically includes:
and generating a query action according to the third information and the data operation information.
Optionally, the generating a query action according to the third information and the data operation information specifically includes:
determining an initial action according to the third information and a second mapping relation; wherein the second mapping relation is used for recording query actions of data sets of different data types;
and generating a query action according to the initial action and the data operation information.
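One possible reading of this option is that the initial action performs the basic lookup for the data type and the data operation information refines its result. The sketch below assumes that reading; the operation names (`sort_by_total`, `first`), the `owner` field, and the function names are hypothetical.

```python
def _filter_rows(rows, feature_id):
    """Initial action for tabular data: keep rows whose owner matches."""
    return [row for row in rows if row.get("owner") == feature_id]


# Second mapping relation: data type -> initial action (illustrative).
SECOND_MAPPING = {"table": _filter_rows}

# Hypothetical data operations that can be named in the query statement.
DATA_OPERATIONS = {
    "sort_by_total": lambda rows: sorted(rows, key=lambda row: row["total"]),
    "first": lambda rows: rows[:1],
}


def generate_query_action(third_information, data_operation_information):
    """Compose the initial action with the requested data operations."""
    initial_action = SECOND_MAPPING[third_information]

    def query_action(data_set, feature_id):
        result = initial_action(data_set, feature_id)
        for name in data_operation_information:
            result = DATA_OPERATIONS[name](result)
        return result

    return query_action
```

For example, `generate_query_action("table", ["sort_by_total", "first"])` yields an action that returns the matching row with the smallest `total`.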
Optionally, the analyzing the query statement to obtain the first information and the second information specifically includes:
analyzing the query statement to obtain first information, second information and fifth information; wherein the fifth information is attribute description information that the query target has in the target dataset;
the determining a query action according to the third information specifically includes:
and generating a query action according to the third information and the fifth information.
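As a hedged illustration, the fifth information may be read as naming the attribute of the target data set against which the feature identifier is matched. Under that assumption, a query action could be generated as below; the data type names `table` and `kv` and the record shapes are invented for the example.

```python
def generate_query_action(third_information: str, fifth_information: str):
    """Build a query action that matches the feature identifier on one attribute.

    third_information: data type of the target data set (assumed: "table" or "kv").
    fifth_information: attribute name the query target has in that data set.
    """
    if third_information == "table":
        # Tabular data set: a list of row dictionaries.
        return lambda data_set, feature_id: [
            row for row in data_set if row.get(fifth_information) == feature_id
        ]
    if third_information == "kv":
        # Key-value data set: a dictionary of key -> record dictionary.
        return lambda data_set, feature_id: [
            record
            for record in data_set.values()
            if record.get(fifth_information) == feature_id
        ]
    raise ValueError(f"unsupported data type {third_information!r}")
```

The same attribute name thus produces different concrete actions depending on the data type of the target data set.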
Optionally, the determining the target data set from the data pool according to the first information specifically includes:
determining a target data source from the data pool according to the third information;
and determining a target data set from the target data source according to the fourth information.
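Because different data sources store data sets of different data types, the data type is enough to select one data source, and the storage identifier then selects a data set inside it. A minimal sketch of this two-stage lookup follows; the pool contents and the names `EXAMPLE_POOL` and `determine_target_data_set` are assumptions for illustration.

```python
from typing import Any, Dict

# Hypothetical data pool: data type -> data source; data source -> named data sets.
EXAMPLE_POOL: Dict[str, Dict[str, Any]] = {
    "relational": {
        "orders": [("order_7", 30)],
        "customers": [("user_42", "Ada")],
    },
    "graph": {"social_net": {"user_42": ["user_7"]}},
}


def determine_target_data_set(pool, third_information, fourth_information):
    """Select the target data source by data type, then the data set by storage identifier."""
    if third_information not in pool:
        raise LookupError(f"no data source stores data of type {third_information!r}")
    target_data_source = pool[third_information]

    if fourth_information not in target_data_source:
        raise LookupError(
            f"data source {third_information!r} has no data set {fourth_information!r}"
        )
    return target_data_source[fourth_information]
```

Separating the two stages lets a missing data source and a missing data set be reported as distinct errors.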
Optionally, the analyzing the query statement to obtain the first information and the second information specifically includes:
identifying a programming paradigm type used by the query statement, and determining the programming paradigm type as a target programming paradigm type;
and analyzing the query statement according to the target programming paradigm type to obtain first information and second information.
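The following sketch illustrates one way this could work: the statement is matched against one pattern per paradigm, and the matching pattern is then used to extract the information. The two statement syntaxes (a declarative `FIND ... IN ...` form and a method-chain form) and the paradigm labels are hypothetical; the embodiment does not specify them here.

```python
import re

# Hypothetical syntaxes, one per programming paradigm type (assumptions):
#   declarative:      FIND user_42 IN kv.user_profiles
#   object_oriented:  kv.user_profiles.find("user_42")
PARSERS = {
    "declarative": re.compile(
        r"^FIND\s+(?P<feature>\S+)\s+IN\s+(?P<type>\w+)\.(?P<store>\w+)$",
        re.IGNORECASE,
    ),
    "object_oriented": re.compile(
        r'^(?P<type>\w+)\.(?P<store>\w+)\.find\("(?P<feature>[^"]+)"\)$'
    ),
}


def identify_paradigm(statement: str) -> str:
    """Return the target programming paradigm type used by the statement."""
    for paradigm, pattern in PARSERS.items():
        if pattern.match(statement.strip()):
            return paradigm
    raise ValueError(f"unrecognized query statement: {statement!r}")


def analyze(statement: str):
    """Analyze the statement with the parser selected for its paradigm type."""
    paradigm = identify_paradigm(statement)
    match = PARSERS[paradigm].match(statement.strip())
    first_information = {
        "data_type": match["type"],     # third information
        "storage_id": match["store"],   # fourth information
    }
    second_information = match["feature"]
    return first_information, second_information
```

Both example statements yield the same first and second information, so the later steps do not depend on which style the user wrote.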
The above describes the device 900 provided in the embodiment of the present application.
Based on the data query method provided by the method embodiment, the embodiment of the application also provides a computer readable storage medium.
Media embodiments
For technical details of a computer-readable storage medium provided in the media embodiment, please refer to the method embodiment.
The embodiment of the present application provides a computer-readable storage medium, which is used for storing a computer program, where the computer program is used for executing any implementation manner of the data query method provided by the above method embodiment. That is, the computer program is for performing the steps of:
acquiring a query statement input by a user; the query statement carries information required for querying a query target;
analyzing the query statement to obtain first information and second information; wherein the first information comprises third information and fourth information; the third information represents the data type of a target data set, the fourth information represents the storage identifier of the target data set in a data source, and the target data set is a data set required for query processing of the query target; the second information represents the characteristic identification of the query target;
determining a query action according to the third information, and determining the target data set from a data pool according to the first information; the data pool comprises N data sources, wherein N is a positive integer; the data sources comprise at least one data set, and the data types of the data sets stored in different data sources are different;
determining the query target using the query action and the target dataset.
Optionally, the determining a query action according to the third information specifically includes:
determining the query action according to the third information and the first mapping relation; wherein the first mapping relation is used for recording query actions of data sets of different data types.
Optionally, the analyzing the query statement to obtain the first information and the second information specifically includes:
analyzing the query statement to obtain first information, second information and data operation information;
the determining a query action according to the third information specifically includes:
and generating a query action according to the third information and the data operation information.
Optionally, the generating a query action according to the third information and the data operation information specifically includes:
determining an initial action according to the third information and a second mapping relation; wherein the second mapping relation is used for recording query actions of data sets of different data types;
and generating a query action according to the initial action and the data operation information.
Optionally, the analyzing the query statement to obtain the first information and the second information specifically includes:
analyzing the query statement to obtain first information, second information and fifth information; wherein the fifth information is attribute description information of the query target in the target data set;
the determining a query action according to the third information specifically includes:
and generating a query action according to the third information and the fifth information.
Optionally, the determining the target data set from the data pool according to the first information specifically includes:
determining a target data source from the data pool according to the third information;
and determining a target data set from the target data source according to the fourth information.
Optionally, the analyzing the query statement to obtain the first information and the second information specifically includes:
identifying a programming paradigm type used by the query statement, and determining the programming paradigm type as a target programming paradigm type;
and analyzing the query statement according to the target programming paradigm type to obtain first information and second information.
The above describes the computer-readable storage medium provided in the embodiments of the present application.
It should be understood that, in this application, "at least one" means one or more, and "a plurality" means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may indicate that only A exists, only B exists, or both A and B exist, where A and B may be singular or plural. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship. "At least one of the following" or similar expressions refer to any combination of these items, including any combination of a single item or plural items. For example, at least one of a, b, or c may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b and c may be single or plural.
The foregoing is merely a preferred embodiment of the invention and is not intended to limit the invention in any form. Although the present invention has been disclosed above with reference to the preferred embodiments, they are not intended to limit it. Those skilled in the art can, without departing from the scope of the technical solution of the present invention, use the methods and technical content disclosed above to make many possible variations and modifications to the technical solution, or modify it into equivalent embodiments. Therefore, any simple modification, equivalent change, or refinement made to the above embodiments according to the technical essence of the present invention, without departing from the content of the technical solution of the present invention, still falls within the protection scope of the technical solution of the present invention.

Claims (9)

1. A method for querying data, comprising:
acquiring a query statement input by a user; the query statement carries information required for querying a query target;
analyzing the query statement to obtain first information and second information; wherein the first information comprises third information and fourth information; the third information represents the data type of a target data set, the fourth information represents the storage identifier of the target data set in a data source, and the target data set is a data set required for query processing of the query target; the second information represents the characteristic identification of the query target;
determining a query action according to the third information, and determining the target data set from a data pool according to the first information; the data pool comprises N data sources, wherein N is a positive integer; the data sources comprise at least one data set, and the data types of the data sets stored in different data sources are different;
determining the query target using the query action and the target dataset;
the analyzing the query statement to obtain first information and second information specifically comprises:
identifying a programming paradigm type used by the query statement, and determining the programming paradigm type as a target programming paradigm type;
and analyzing the query statement according to the target programming paradigm type to obtain first information and second information.
2. The method according to claim 1, wherein the determining a query action according to the third information specifically comprises:
determining the query action according to the third information and the first mapping relation; and the first mapping relation is used for recording query actions corresponding to data sets of different data types.
3. The method according to claim 1, wherein the analyzing the query statement to obtain first information and second information specifically comprises:
analyzing the query statement to obtain first information, second information and data operation information;
the determining a query action according to the third information specifically includes:
and generating a query action according to the third information and the data operation information.
4. The method according to claim 3, wherein the generating a query action according to the third information and the data operation information specifically comprises:
determining an initial action according to the third information and a second mapping relation; the second mapping relation is used for recording query actions corresponding to data sets of different data types;
and generating a query action according to the initial action and the data operation information.
5. The method according to claim 1, wherein the analyzing the query statement to obtain first information and second information specifically comprises:
analyzing the query statement to obtain first information, second information and fifth information; wherein the fifth information is attribute description information that the query target has in the target dataset;
the determining a query action according to the third information specifically includes:
and generating a query action according to the third information and the fifth information.
6. The method according to claim 1, wherein the determining the target data set from a data pool according to the first information comprises:
determining a target data source from the data pool according to the third information;
and determining a target data set from the target data source according to the fourth information.
7. A data query apparatus, comprising:
an acquisition unit, configured to acquire a query statement input by a user; the query statement carries information required for querying a query target;
an analysis unit, configured to analyze the query statement to obtain first information and second information; wherein the first information comprises third information and fourth information; the third information represents the data type of a target data set, the fourth information represents the storage identifier of the target data set in a data source, and the target data set is a data set required for query processing of the query target; the second information represents the characteristic identification of the query target;
a first determining unit, configured to determine a query action according to the third information, and determine the target data set from a data pool according to the first information; the data pool comprises N data sources, wherein N is a positive integer; the data sources comprise at least one data set, and the data types of the data sets stored in different data sources are different;
a second determining unit for determining the query target using the query action and the target data set;
the analysis unit is specifically configured to identify a programming paradigm type used by the query statement, and determine the programming paradigm type as a target programming paradigm type; and analyze the query statement according to the target programming paradigm type to obtain the first information and the second information.
8. An apparatus, comprising a processor and a memory:
the memory is used for storing a computer program;
the processor is configured to perform the method of any one of claims 1-6 in accordance with the computer program.
9. A computer-readable storage medium, characterized in that the computer-readable storage medium is used for storing a computer program for performing the method of any of claims 1-6.
CN202010397694.2A 2020-05-12 2020-05-12 Data query method and related equipment Active CN111475534B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010397694.2A CN111475534B (en) 2020-05-12 2020-05-12 Data query method and related equipment

Publications (2)

Publication Number Publication Date
CN111475534A CN111475534A (en) 2020-07-31
CN111475534B true CN111475534B (en) 2023-04-14

Family

ID=71764513

Country Status (1)

Country Link
CN (1) CN111475534B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114238286B (en) * 2022-02-28 2022-08-05 连连(杭州)信息技术有限公司 Data warehouse data processing method and device, electronic equipment and storage medium

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101030224A (en) * 2006-03-03 2007-09-05 国际商业机器公司 System and method for building a unified query that spans heterogeneous environments
CN102591896A (en) * 2011-01-05 2012-07-18 北京大用科技有限责任公司 System, implementation, application, and query language for a tetrahedral data model for unstructured data
CN102968307A (en) * 2012-11-29 2013-03-13 中国传媒大学 Java-based web development middleware
CN103823815A (en) * 2012-11-19 2014-05-28 中国联合网络通信集团有限公司 Server and database access method
CN105338026A (en) * 2014-07-24 2016-02-17 阿里巴巴集团控股有限公司 Data resource acquisition method, device and system
US9348815B1 (en) * 2013-06-28 2016-05-24 Digital Reasoning Systems, Inc. Systems and methods for construction, maintenance, and improvement of knowledge representations
CN107515887A (en) * 2017-06-29 2017-12-26 中国科学院计算机网络信息中心 A kind of interactive query method suitable for a variety of big data management systems
CN107615277A (en) * 2015-03-26 2018-01-19 卡斯维尔公司 System and method for inquiring about data source
CN108090154A (en) * 2017-12-08 2018-05-29 广州市申迪计算机***有限公司 A kind of isomerous multi-source data fusion querying method and device
CN108363746A (en) * 2018-01-26 2018-08-03 福建星瑞格软件有限公司 A kind of unified SQL query system for supporting multi-source heterogeneous data
CN110399388A (en) * 2019-07-29 2019-11-01 中国工商银行股份有限公司 Data query method, system and equipment
CN110431545A (en) * 2017-03-31 2019-11-08 亚马逊科技公司 Inquiry is executed for structural data and unstructured data

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10255378B2 (en) * 2015-03-18 2019-04-09 Adp, Llc Database structure for distributed key-value pair, document and graph models

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
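周汉民; 徐汀荣. Implementation of a mail database based on XML and Oracle9i. 现代计算机 (Modern Computer). 2006, (No. 06), full text. *
曹忠升; 吴宗大; 王元珍. Multimedia query languages and their evaluation criteria. 计算机科学 (Computer Science). 2009, (No. 03), full text. *
毛佳飞; 叶霞; 李俊山. Research on query processing for heterogeneous data integration. 微电子学与计算机 (Microelectronics & Computer). 2018, (No. 05), full text. *
陈涛; 张永娟; 陈恒. Framework implementation for converting Web data to RDF data. 现代图书情报技术 (New Technology of Library and Information Service). 2015, (No. 02), full text. *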

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant