WO2018218504A1 - Method and device for data query - Google Patents

Method and device for data query

Publication number
WO2018218504A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
file
query
order
statement
Prior art date
Application number
PCT/CN2017/086600
Other languages
French (fr)
Chinese (zh)
Inventor
高紫娟
王铁英
Original Assignee
华为技术有限公司
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to CN201780091399.0A (patent CN110678854B)
Priority to PCT/CN2017/086600 (WO2018218504A1)
Publication of WO2018218504A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2453Query optimisation
    • G06F16/24534Query rewriting; Transformation
    • G06F16/24535Query rewriting; Transformation of sub-queries or views

Definitions

  • the present application relates to the field of computers and, more particularly, to methods and apparatus for data query.
  • Distributed database refers to the use of high-speed computer networks to connect physically dispersed multiple data storage units to form a logically unified database.
  • the basic idea of a distributed database is to distribute the data in the original centralized database to multiple data storage nodes connected through the network to obtain larger storage capacity and higher concurrent access.
  • distributed database technology has also developed rapidly.
  • Traditional relational databases have begun to evolve from centralized models to distributed architectures.
  • Relational distributed databases retain the data model and basic characteristics of traditional databases while moving from centralized storage to distributed storage and from centralized computing to distributed computing.
  • non-relational databases, represented by NoSQL
  • NoSQL database products such as key-value storage systems and document databases.
  • document storage as a database storage model in a document database can support storage of structured data and storage of unstructured data, and there is no mandatory limitation on the structure of data stored in the document.
  • the management system in the database queries the file according to a fixed query order, for example, the query order from the old file to the new file,
  • the file of the data collection in which the data is stored is iterated to query the data in the file.
  • querying files in a fixed query order as described above results in less efficient data queries.
  • the filter condition in the query statement applies only to an attribute of the bottom-level data in the data hierarchy in which the target application's data is stored
  • the file query order is the query order from the old file to the new file
  • when the filter condition in the query statement for the top-level data in the data hierarchy is empty, a large number of intermediate results that do not meet all the filter conditions in the query statement are generated, which reduces the efficiency of querying data.
  • the present application provides a method and apparatus for data query, which is beneficial to improving the efficiency of querying data.
  • the first aspect provides a data query method, including: acquiring a query statement, where the query statement is used to query N layers of data in a data hierarchy of a data set, the data hierarchy is the hierarchical structure of the data in the data set, and the data set is stored in K files in descending order of level in the data hierarchy, the K files including a first file and a second file, the first file being the file with the earliest creation time among the K files and the second file being the file with the latest creation time among the K files, where N and K are positive integers greater than 1;
  • the preset condition includes: the filter condition for querying the data set in the first sub-statement of the query statement is empty, and the filter condition for querying the data set in the second sub-statement of the query statement is not empty, where the first sub-statement is used to query the top-level data in the N layers of data and the second sub-statement is used to query the bottom-level data in the N layers of data.
  • the data query method in the embodiment of the present application may query the K files in order from the second file to the first file when the first sub-statement and the second sub-statement in the query statement satisfy the preset condition.
  • In this way, the filter condition in the second sub-statement is used to query data from the second file (i.e., the new file) first. This avoids the situation in the prior art in which a query statement satisfying the preset condition is still queried in order from the first file to the second file and produces a large number of intermediate results that do not meet the filter conditions of the query statement, and it is therefore beneficial to improving the efficiency of querying data.
  • the method further includes: dividing the query statement into a plurality of sub-statements according to the data hierarchy of the data set, based on the attributes of the data to be queried in the data set and the filter conditions for querying data in the data set that are included in the query statement.
  • the query statement includes at least three filter conditions, and different filter conditions of the at least three filter conditions are used to filter data located in different layers in the data hierarchy.
  • querying the K files according to a query order from the second file to the first file to obtain target data to be queried includes:
  • when the query statement satisfies the preset condition and the at least three filter conditions are conditions that the target data must meet at the same time, querying the K files in order from the second file to the first file to obtain the target data to be queried.
  • In that case, querying the K files in order from the second file to the first file helps to reduce the number of intermediate results that do not match all the filter conditions in the query statement.
  • querying the K files according to a query order from the second file to the first file to obtain target data to be queried includes: when the query statement satisfies the preset condition, querying the K files in order from the second file to the first file, and querying each of the K files in a first order, to obtain the target data, where the first order is the order from bottom to top in the data hierarchy of the file, and the hierarchical order in the data hierarchy of the file is the same as the hierarchical order of the data hierarchy of the data set.
  • Querying the data in each file in the first order helps to further reduce the number of intermediate results that do not meet the full filter criteria of the query.
  • the method further includes: when the query statement does not satisfy the preset condition, querying the K files in order from the first file to the second file to obtain the target data to be queried.
  • For a query statement that does not satisfy the preset condition, querying in order from the first file to the second file helps to speed up the data query.
  • A query statement that does not satisfy the preset condition may be one in which the filter condition in the first sub-statement is not empty.
  • In this case, the data hierarchy can be used while querying in order from the first file to the second file: during the query of the first file, if a piece of data does not satisfy the filter condition in the first sub-statement, the lower-level data beneath it in the data hierarchy need not be queried.
  • querying the K files according to a query order from the first file to the second file to obtain target data to be queried includes: when the query statement does not satisfy the preset condition, querying the K files in order from the first file to the second file, and querying each of the K files in a second order, to obtain the target data, where the second order is the order from top to bottom in the data hierarchy of the file, and the hierarchical order in the data hierarchy of the file is the same as the hierarchical order of the data hierarchy of the data set.
  • Querying the data in each file in the second order helps to further reduce the number of intermediate results that do not meet all the filter conditions of the query statement.
  • an apparatus for data query comprising means for performing the method of the first aspect.
  • an apparatus for data query comprising: a memory, a processor, an input/output interface, and a communication interface.
  • the memory is configured to store instructions, and the processor is configured to execute the instructions stored in the memory; when the instructions are executed, the processor performs the method of the first aspect through the communication interface and controls the input/output interface to receive input data and information and to output data such as an operation result.
  • a computer readable medium storing program code for execution by a terminal device, the program code comprising instructions for performing the method of the first aspect.
  • a computer program product comprising instructions that, when executed on a computer, cause the computer to perform the methods described in the various aspects above.
  • the technical solution provided by the present application is beneficial for reducing the intermediate result in the data query process that does not meet the filter condition of the query statement, so as to improve the efficiency of the data query.
  • FIG. 1 is a schematic flowchart of a method for generating a file according to an embodiment of the present application.
  • FIG. 2 is a schematic block diagram of a file generated by the file generating method of the embodiment of the present application.
  • FIG. 3 is a schematic flowchart of a method for data query according to an embodiment of the present application.
  • FIG. 4 is a schematic flowchart of a method for data query according to another embodiment of the present application.
  • FIG. 5 is a schematic block diagram of an apparatus for data query according to an embodiment of the present application.
  • FIG. 6 is a schematic block diagram of an apparatus for data query according to an embodiment of the present application.
  • the embodiment of the present application can be applied to a database, and specifically, can be applied to a document type database.
  • Document storage is used as a database storage model in document-based databases and can support the storage of structured data.
  • the user can send the query statement to the application module, and the application module forwards the query statement to the management system in the database. The management system queries the target data stored in the document database according to the query statement, and the target data (i.e., the query result) can finally be presented to the user on the screen.
  • the structured data stored in the above-mentioned document type database may be data stored in the document type database in a data hierarchical structure.
  • the data hierarchical structure of the stored data in the embodiment of the present application is briefly described below.
  • All files corresponding to an application can be stored in a schema in a distributed database.
  • the data hierarchy of the data stored in the files of each schema can contain multiple levels, and the first field of each layer can be used to uniquely identify the data stored in that layer, that is, the first field of each layer is used as the primary key. Each layer can also store data for different data attributes.
  • the data hierarchy of the data stored in the WeChat can be:
  • the above data hierarchy can be divided into four levels, from the top of the data hierarchy to the bottom of the data hierarchy: Layer 1, Layer 2, Layer 3, and Layer 4 (see Table 1).
  • the first layer (also referred to as the "top layer") may use a User ID as a primary key, and may store data with data attributes of User ID, Name, Sex, and Birthday;
  • the second layer may use a Topic ID as a primary key, and may store data with data attributes of Topic ID, Title, and the topic's release date (T_Date);
  • the third layer may use a Comment ID as a primary key, and may store data with data attributes of Comment ID, the comment content (C_Content), the comment date (C_Date), and the user ID (C_User ID) of the user who posted the comment;
  • the fourth layer (also referred to as the "bottom layer") may use a reply ID (Feed ID) as a primary key, and may store data with data attributes of Feed ID, F_Content, and F_Date (see Table 1).
  • Table 1 shows, in tabular form, the hierarchical structure of the data stored in the above WeChat.
  • Hierarchical serial number | Primary key | Data attribute
    1 | User ID | Name, Sex, Birthday
    2 | Topic ID | Title, T_Date
    3 | Comment ID | C_Content, C_Date, C_User ID
    4 | Feed ID | F_Content, F_Date
  • data storage may be sequentially performed from the top layer of the data hierarchy to the bottom of the data hierarchy according to the data hierarchy described above.
  • the database management system can save the data block output formed in the buffer area to the disk to form a file by a flush operation.
  • FIG. 1 is a schematic flowchart of a method for generating a file according to an embodiment of the present application.
  • the method shown in Figure 1 includes:
  • the data stored in the buffer of the database in steps 110 to 114 is stored in a disk of the database by a Flush operation to form a file 1.
  • the primary key (Comment ID) of the third layer must be carried together with the primary key (User ID) of the first layer and the primary key (Topic ID) of the second layer.
  • the data stored in the buffer of the database in steps 116 to 119 is stored in a disk of the database by a Flush operation to form a file 2.
  • a delete tag bitmap may also be generated at the same time, and the delete tag bitmap may be stored on the disk together with the corresponding file.
  • the delete tag bitmap is used to indicate whether the data stored in the file is valid.
  • If the value of a bit is 0 (the initial value), this may indicate that the data storage record corresponding to the bit has not been deleted; if the value of the bit is 1, this may indicate that the data storage record corresponding to the bit has been deleted.
  • Referring to FIG. 2, file 1 stores the data written to the database in steps 110 to 114, a total of 5 data storage records.
  • 5 bits can be used to indicate whether each of the 5 data storage records in file 1 has been deleted. The values of the 5 bits corresponding to the 5 data storage records in file 1 are all 0, that is, none of the 5 data storage records in file 1 has been deleted.
  • the values of the 4 bits used to indicate whether the 4 data storage records in file 2 have been deleted are all 0, indicating that the data stored in file 2 in steps 116 to 119 has not been deleted.
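The delete tag bitmap described above can be illustrated with a minimal Python sketch. The class name `DeleteTagBitmap` and its methods are hypothetical and are not taken from the application; the sketch only assumes what the text states, namely one bit per data storage record, with 0 (the initial value) meaning "not deleted" and 1 meaning "deleted".

```python
class DeleteTagBitmap:
    """Hypothetical helper: one bit per data storage record in a file.

    Bit value 0 (initial value) = record not deleted, 1 = record deleted.
    """

    def __init__(self, record_count: int) -> None:
        self.record_count = record_count
        self.bits = 0  # every bit starts at the initial value 0

    def _check(self, index: int) -> None:
        if not 0 <= index < self.record_count:
            raise IndexError(f"record index {index} out of range")

    def mark_deleted(self, index: int) -> None:
        """Set the bit of the record at `index` to 1."""
        self._check(index)
        self.bits |= 1 << index

    def is_deleted(self, index: int) -> bool:
        """Return True if the bit of the record at `index` is 1."""
        self._check(index)
        return bool((self.bits >> index) & 1)


# File 1 holds 5 data storage records and file 2 holds 4, as in FIG. 2.
file1_bitmap = DeleteTagBitmap(5)
file2_bitmap = DeleteTagBitmap(4)
```

Storing the bitmap as a single integer keeps the sketch short; a real system would more likely persist a fixed-size byte array next to the file.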
  • files can be numbered in ascending order of generation time, that is, ascending file number corresponds to the order of the files from old to new.
  • the larger the number of the file, the closer the time the file was generated is to the current time, that is, the "newer" the file is.
  • the smaller the file number, the farther the file generation time is from the current time, that is, the "older" the file is.
  • the smaller the file number, the farther the file generation time is from the current time, the "older" the file, and the more likely it is that the data stored in the file is located in the upper layers of the data hierarchy (for example, the first-layer data and the second-layer data shown in Table 1); the larger the file number, the closer the file generation time is to the current time, the "newer" the file, and the more likely it is that the data stored in the file is located in the lower layers of the data hierarchy (for example, the third-layer data and the fourth-layer data shown in Table 1).
  • the embodiment of the present application provides a data query method that exploits the above relationship between how new or old a file is and the level, in the data hierarchy, of the data stored in the file.
  • the data query method of the embodiment of the present application is described in detail below with reference to FIG. 3.
  • FIG. 3 is a schematic flowchart of a method for data query according to an embodiment of the present application. It should be understood that the method illustrated in FIG. 3 may be performed by a management system in a database, such as a distributed data management system. The method shown in Figure 3 includes:
  • a query statement where the query statement is used to query N-layer data in a data hierarchy, where the data hierarchy is a hierarchical structure of data in a data set, and the data set is in accordance with the data hierarchy.
  • the middle level is stored in K files in a high-to-low order, the K files including a first file and a second file, the first file being the file with the earliest creation time among the K files, the first Two files are in the K files The file with the latest creation time, where N and K are positive integers greater than one.
  • the foregoing data set is stored in K files in descending order of the level in the data hierarchy, and may refer to that the data in the data set stored in the K files may be substantially in accordance with the level in the data hierarchy. It is stored in K files in a low order, and it is not excluded that in the process of storing data into K files according to the above rules, there is a case where newly inserted data of a higher layer close to the data hierarchy is stored in a new file. . For example, in the generated file 2 shown in FIG. 2 (that is, a new file), data Comment(1, 2, 3, ...) is stored.
  • the first file described above is the file with the earliest creation time among the K files, that is, the first file is the old file mentioned above.
  • the second file above is the file with the latest creation time among the K files, that is, the second file is the new file mentioned above.
  • the above data set may be a collection of all data in an application stored in the database, for example, may be a data set of all data in the WeChat.
  • the above query statement may be a query statement for a range query for finding at least one data that meets the filter condition, for example, a Scan query statement or a range query statement with a key.
  • the attributes of the data contained in the query statement Q1 are: Topic ID and Title
  • Table 2 shows the attributes and filter conditions of the data of each sub-statement in the query statement Q1.
  • the query statement Q1 can contain two sub-statements: sub-statement 1 and sub-statement 2.
  • the attribute of the data in sub-statement 1 is empty, and its filter condition filters on the attribute "Birthday", which belongs to the first layer of the data hierarchy shown in Table 1; that is, sub-statement 1 is used to query the first-layer data in the data hierarchy shown in Table 1. The filter condition in sub-statement 2 is "Title like "***"", and the attribute of the data is "Topic ID, Title"; referring to the data hierarchy shown in Table 1, sub-statement 2 is used to query the second-layer data.
  • That the attribute of the data of a sub-statement in the query statement is empty may mean that the sub-statement only filters on an attribute of the data of a certain layer in the data hierarchy, and that the attributes of the data in that layer are not reflected in the query result. For example, sub-statement 1 in the query statement Q1 only filters the data whose attribute is "Birthday", and the query result corresponding to the query statement Q1 need not include the first-layer data queried by sub-statement 1.
  • If the query statement meets a preset condition, query the K files in order from the second file to the first file to obtain the target data to be queried, where
  • the preset condition includes that the filter condition for querying the data set in the first sub-statement of the query statement is empty, and the filter condition for querying the data set in the second sub-statement of the query statement is not empty,
  • the first sub-statement is used to query the top-level data in the N layers of data, and
  • the second sub-statement is used to query the bottom-level data in the N layers of data.
  • That the filter condition for querying the data set in the first sub-statement is empty may alternatively be stated as: no filter condition exists in the first sub-statement.
  • That the filter condition for querying the data set in the second sub-statement is not empty may alternatively be stated as: a filter condition exists in the second sub-statement.
  • A sub-statement may include a filter condition for querying the data set and an attribute of the data, where either the filter condition or the attribute may be empty, or neither of them is empty.
  • Querying the K files in order from the second file to the first file may mean querying the K files one at a time in that order. It may also mean dividing the K files into groups in order from the second file to the first file and querying the groups in that order, where multiple files within a group can be queried at once.
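The two ways of traversing the K files from the second (newest) file to the first (oldest) file can be sketched as follows. This is only an illustration: the function names and the `group_size` parameter are assumptions, and the sketch relies on the numbering convention stated earlier, in which a larger file number means a newer file.

```python
from typing import Iterator, List, Sequence


def newest_first(file_numbers: Sequence[int]) -> List[int]:
    """Order files from the newest (largest number) to the oldest."""
    return sorted(file_numbers, reverse=True)


def newest_first_groups(
    file_numbers: Sequence[int], group_size: int
) -> Iterator[List[int]]:
    """Yield groups of files, newest group first.

    Files inside one group could be queried together; `group_size`
    is a hypothetical tuning parameter, not taken from the application.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    ordered = newest_first(file_numbers)
    for start in range(0, len(ordered), group_size):
        yield ordered[start:start + group_size]
```

For example, `newest_first([1, 2, 3])` visits file 3 first and file 1 last, while `newest_first_groups([1, 2, 3, 4, 5], 2)` yields the groups `[5, 4]`, `[3, 2]`, and `[1]`.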
  • the embodiment of the present application does not specifically limit whether the filter conditions for querying the data set in sub-statements other than the first sub-statement and the second sub-statement in the query statement are empty.
  • the two sub-statements are used to query the data located in the first layer and the data located in the second layer of the data hierarchy shown in Table 1, where sub-statement 1 can be regarded as the "first sub-statement" in the query statement Q1, used to query the data in the first layer (top layer) of the 2 layers of data, and sub-statement 2 can be regarded as the "second sub-statement" in the query statement Q1, used to query the data in the second layer (bottom layer) of the 2 layers of data.
  • Since the filter condition in sub-statement 1 is not empty, the query statement Q1 does not satisfy the above-mentioned preset condition.
  • the query statement Q2 is: Select Name, Topic ID, Title from Table where Title like "***"
  • the attributes of the data included in the query statement Q2 include "Name" and "Topic ID, Title"
  • the filter conditions that the data queried by the query statement Q2 needs to meet include: "Title like "***"".
  • Table 3 shows the attributes and filter conditions of the data of each sub-statement in the query statement Q2.
  • the query statement Q2 can contain two sub-statements: sub-statement 3 and sub-statement 4. The attribute of the data in sub-statement 3 is "Name", and the filter condition in sub-statement 3 is empty; referring to the data hierarchy shown in Table 1, this attribute is located in the first layer of the data hierarchy.
  • the filter condition in sub-statement 4 is "Title like "***"", and the attribute of the data is "Topic ID, Title"; referring to the data hierarchy shown in Table 1, sub-statement 4 is used to query the data of the second layer in the data hierarchy.
  • Sub-statement | Data attribute | Filter condition
    Sub-statement 3 | Name | (empty)
    Sub-statement 4 | Topic ID, Title | Title like "***"
  • the two sub-statements are used to query the data located in the first layer and the data located in the second layer of the data hierarchy shown in Table 1, where sub-statement 3 can be regarded as the "first sub-statement" in the query statement Q2, used to query the data in the first layer (top layer) of the 2 layers of data, and sub-statement 4 can be regarded as the "second sub-statement" in the query statement Q2, used to query the data in the second layer (bottom layer) of the 2 layers of data. Since the filter condition in sub-statement 3 is empty and the filter condition in sub-statement 4 is not empty, the query statement Q2 satisfies the above-mentioned preset condition.
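The preset-condition check applied to Q1 and Q2 above can be sketched in Python as follows. The `SubStatement` class and the function name are hypothetical. The text does not give the actual filter condition of sub-statement 1 (only that it filters on "Birthday"), so a placeholder string stands in for it; an empty filter condition is modeled as `None`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SubStatement:
    level: int                       # layer of the data hierarchy (1 = top)
    attributes: List[str]            # data attributes; may be empty
    filter_condition: Optional[str]  # None models an empty filter condition


def satisfies_preset_condition(sub_statements: List[SubStatement]) -> bool:
    """True if the top-level sub-statement has no filter condition and the
    bottom-level sub-statement has one (the preset condition in the text)."""
    if len(sub_statements) < 2:      # the text requires N > 1 layers
        return False
    ordered = sorted(sub_statements, key=lambda s: s.level)
    first, second = ordered[0], ordered[-1]
    return first.filter_condition is None and second.filter_condition is not None


# Q1: sub-statement 1 filters on Birthday (placeholder text, not from the source).
q1 = [
    SubStatement(1, [], "<condition on Birthday>"),
    SubStatement(2, ["Topic ID", "Title"], 'Title like "***"'),
]
# Q2: sub-statement 3 has no filter condition, sub-statement 4 has one.
q2 = [
    SubStatement(1, ["Name"], None),
    SubStatement(2, ["Topic ID", "Title"], 'Title like "***"'),
]
```

Under these assumptions Q1 fails the check and would be queried from the old file to the new file, while Q2 passes it and would be queried from the new file to the old file.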
  • the data query method in the embodiment of the present application may query the K files in order from the second file to the first file when the first sub-statement and the second sub-statement in the query statement satisfy the preset condition.
  • This avoids the situation in which the query order from the first file to the second file is still used for such a query statement and a large number of intermediate results that do not meet the filter conditions of the query statement are generated, and it is therefore beneficial to improving the efficiency of querying data.
  • the query statement includes at least three filter conditions, and different filter conditions of the at least three filter conditions are used to filter data of different layers in the data hierarchy, and step 320 further includes: if the query statement satisfies the preset condition, and the at least three filter conditions are conditions that the data must satisfy at the same time, querying the K files in order from the second file to the first file.
  • Filter conditions that are used to filter data located in different layers of the data hierarchy may be referred to as "cross-layer" filter conditions. The "cross-layer" filter conditions may be connected by the logical operator AND; that is, the data needs to meet all the "cross-layer" filter conditions at the same time, or in other words the data needs to satisfy each of the at least three filter conditions.
  • the filtering condition of the data in the second layer that is, the above-mentioned "cross-layer” filtering condition.
  • the two filter conditions are connected by "and", that is, the data needs to satisfy the two "cross-layer” filter conditions at the same time.
  • In that case, querying the K files in order from the second file to the first file helps to reduce the number of intermediate results that do not match all the filter conditions in the query statement.
  • step 320 further includes: when the query statement meets the preset condition, querying the K files in order from the second file to the first file, and querying each of the K files in the first order, to obtain the target data, where the first order is the order from bottom to top in the data hierarchy of the file, and the hierarchical order in the data hierarchy of the file is the same as the hierarchical order of the data hierarchy of the data set.
  • the data hierarchy of a file refers to the hierarchy of the data stored in that file, and may be regarded as a sub-hierarchy of the data hierarchy of the data set. That is to say, the data hierarchy of the file contains some of the levels in the data hierarchy of the data set, and it represents the hierarchical relationship or hierarchical order between the data stored in the file.
  • the data hierarchy of file 1 shown in FIG. 2 includes the first-level data User ID, the second-level data Topic ID, and the third-level data Comment ID of the data hierarchy shown in Table 1. It can be seen that the data hierarchy of file 1 is a sub-hierarchy of the four-level data hierarchy shown in Table 1.
  • when the query statement satisfies the preset condition, each of the K files may be queried in the first order; alternatively, when the query statement satisfies the preset condition and the "cross-layer" filter conditions in the query statement are conditions that the data must satisfy at the same time, each of the K files is queried in the first order.
  • Querying the data in each file in the first order helps to further reduce the number of intermediate results that do not meet the full filter criteria of the query.
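The combination of the new-to-old file order with the bottom-to-top order inside each file can be sketched for a Q2-style query as follows. This is a simplified illustration, not the application's implementation: each file is modeled as a list of dictionaries with a hypothetical `level` field, second-layer records carry the User ID of their parent (as lower-layer records carry upper-layer primary keys), a user record is assumed never to sit in a newer file than its topics, and `like` is approximated by a substring test.

```python
from typing import Dict, List, Optional, Tuple

Record = Dict[str, object]


def query_new_to_old(
    files: List[List[Record]], title_substring: str
) -> List[Tuple[Optional[object], object, object]]:
    """Return (Name, Topic ID, Title) rows; `files` is ordered oldest first."""
    matched_topics: List[Record] = []
    needed_users = set()
    names: Dict[object, object] = {}
    for records in reversed(files):  # second (newest) file first
        # First order: bottom-up inside a file, so level 2 before level 1.
        for rec in sorted(records, key=lambda r: r["level"], reverse=True):
            if rec["level"] == 2:
                if title_substring in str(rec["Title"]):
                    matched_topics.append(rec)
                    needed_users.add(rec["User ID"])
            elif rec["level"] == 1 and rec["User ID"] in needed_users:
                # Only users that own a matching topic are materialized.
                names[rec["User ID"]] = rec["Name"]
    return [
        (names.get(t["User ID"]), t["Topic ID"], t["Title"])
        for t in matched_topics
    ]


# Hypothetical sample data: an old file with users, a new file with topics.
sample_files = [
    [
        {"level": 1, "User ID": 1, "Name": "Alice"},
        {"level": 1, "User ID": 2, "Name": "Bob"},
    ],
    [
        {"level": 2, "User ID": 1, "Topic ID": 10, "Title": "hello world"},
        {"level": 2, "User ID": 2, "Topic ID": 11, "Title": "other"},
    ],
]
rows = query_new_to_old(sample_files, "hello")
```

Because the filter is evaluated on the lower layer first, the user record for Bob is never turned into an intermediate result, which is the effect the text attributes to this query order.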
  • the method further includes: 330, dividing the query statement into a plurality of sub-statements according to the data hierarchy, based on the attributes of the data to be queried in the data set and the filter conditions for querying data in the data set that are included in the query statement.
  • Dividing the query statement into multiple sub-statements may refer to determining the level in the data hierarchy at which each attribute of the data to be queried is located and the level of the data that each filter condition restricts, and thereby determining the attributes and the filter conditions included in each sub-statement of the query statement.
  • When an attribute of the data to be queried and the data filtered by a filter condition are at the same level of the data hierarchy, they may be merged into one sub-statement; different sub-statements of the query statement are used to query data located at different levels in the data hierarchy.
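The division of a query statement into sub-statements by level can be sketched as follows. The mapping `LEVEL_OF_ATTRIBUTE` is transcribed from Table 1; the function name and the input format (a list of selected attributes plus a list of `(attribute, condition)` pairs) are assumptions made for illustration, since the application does not specify how the statement is parsed.

```python
from typing import Dict, List, Tuple

# Level of each attribute in the data hierarchy, transcribed from Table 1.
LEVEL_OF_ATTRIBUTE: Dict[str, int] = {
    "User ID": 1, "Name": 1, "Sex": 1, "Birthday": 1,
    "Topic ID": 2, "Title": 2, "T_Date": 2,
    "Comment ID": 3, "C_Content": 3, "C_Date": 3, "C_User ID": 3,
    "Feed ID": 4, "F_Content": 4, "F_Date": 4,
}


def split_into_sub_statements(
    selected: List[str], filters: List[Tuple[str, str]]
) -> Dict[int, Dict[str, List[str]]]:
    """Group selected attributes and filter conditions by hierarchy level.

    Attributes and filter conditions that fall on the same level are
    merged into one sub-statement, keyed by that level.
    """
    by_level: Dict[int, Dict[str, List[str]]] = {}

    def entry(level: int) -> Dict[str, List[str]]:
        return by_level.setdefault(level, {"attributes": [], "filters": []})

    for attribute in selected:
        entry(LEVEL_OF_ATTRIBUTE[attribute])["attributes"].append(attribute)
    for attribute, condition in filters:
        entry(LEVEL_OF_ATTRIBUTE[attribute])["filters"].append(condition)
    return dict(sorted(by_level.items()))


# Q2: Select Name, Topic ID, Title from Table where Title like "***"
q2_sub_statements = split_into_sub_statements(
    ["Name", "Topic ID", "Title"], [("Title", 'Title like "***"')]
)
```

For Q2 this yields one sub-statement on level 1 (attribute Name, no filter) and one on level 2 (attributes Topic ID and Title, filter on Title), matching Table 3.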
  • the method further includes: 340, when the query statement does not satisfy the preset condition, querying the K files in order from the first file to the second file to obtain the target data to be queried.
  • the K files may be queried according to the query order from the old file to the new file to obtain the target data to be queried.
  • For a query statement that does not satisfy the preset condition, querying in order from the first file to the second file helps to speed up the data query.
  • A query statement that does not satisfy the preset condition may be one in which the filter condition in the first sub-statement is not empty.
  • In this case, the data hierarchy can be used while querying in order from the first file to the second file: during the query of the first file, if a piece of data does not satisfy the filter condition in the first sub-statement, the lower-level data beneath it in the data hierarchy need not be queried.
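The pruning effect of a non-empty top-level filter can be sketched with a top-down traversal. This is a simplification: the data is modeled as an in-memory tree of nodes with hypothetical `data` and `children` keys rather than as records spread over files, and the filter on Birthday is an invented example, since the text does not give Q1's actual condition.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Node = Dict[str, object]
Predicate = Optional[Callable[[Dict[str, object]], bool]]


def query_top_down(
    nodes: Sequence[Node],
    filters: Sequence[Predicate],
    depth: int = 0,
    path: Tuple[Dict[str, object], ...] = (),
) -> List[Tuple[Dict[str, object], ...]]:
    """Walk the hierarchy from the top layer down.

    `filters[i]` is the predicate for layer i + 1 (None = empty filter).
    A node that fails its filter is skipped together with all data
    beneath it, so its lower-level data is never visited.
    """
    results: List[Tuple[Dict[str, object], ...]] = []
    for node in nodes:
        predicate = filters[depth]
        if predicate is not None and not predicate(node["data"]):
            continue  # prune: children of this node are not queried
        new_path = path + (node["data"],)
        if depth == len(filters) - 1:
            results.append(new_path)
        else:
            children = node.get("children", [])
            results.extend(query_top_down(children, filters, depth + 1, new_path))
    return results


# Hypothetical data: two users, each with one topic.
users = [
    {"data": {"User ID": 1, "Birthday": "1990-01-01"},
     "children": [{"data": {"Topic ID": 10, "Title": "a"}}]},
    {"data": {"User ID": 2, "Birthday": "1985-06-30"},
     "children": [{"data": {"Topic ID": 11, "Title": "b"}}]},
]
born_1990 = lambda user: str(user["Birthday"]).startswith("1990")
paths = query_top_down(users, [born_1990, None])
```

Only the topic of user 1 is reached; the topic of user 2 is skipped without being read, which is why the old-to-new, top-to-bottom order suits queries whose first sub-statement carries a filter.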
  • the query statement includes at least three filter conditions, and different filter conditions of the at least three filter conditions are used to filter data located in different layers in the data hierarchy. If the query statement satisfies the preset condition, but the at least three filter conditions are not conditions that the data must satisfy at the same time, the K files are queried in order from the first file to the second file to obtain the target data to be queried.
  • That the filter conditions are not conditions that the data must satisfy at the same time may mean that the data satisfies any one of the at least three filter conditions, or that the data satisfies any two of the at least three filter conditions.
  • For example, the at least three filter conditions are connected by a logical OR.
  • step 340 further includes: when the query statement does not satisfy the preset condition, querying the K files according to a query order from the first file to the second file to obtain the target data to be queried, and querying each of the K files in a second order to obtain the target data, wherein the second order is the order from the top level to the bottom level of the data hierarchy of the file, and the hierarchical order in the data hierarchy of the file is the same as the hierarchical order of the data hierarchy of the data set.
  • the data hierarchy of the above file refers to the data hierarchy of the data stored in each file, and may be regarded as a sub-hierarchy of the data hierarchy of the data set. That is to say, the data hierarchy of the file contains some of the levels of the data hierarchy of the data set, and is used to represent the hierarchical relationship or hierarchical order between the data stored in the file.
  • each of the K files is queried in the second order, and the second order is the top-to-bottom order in the data hierarchy of the file; that is, in the process of querying each file, the query proceeds from the top level of the data hierarchy of the file to the bottom level of the data hierarchy of the file.
  • each of the K files may be queried in the second order; alternatively, when the query statement satisfies the preset condition but the "cross-layer" filtering conditions in the query statement are not filtering conditions that the data needs to satisfy at the same time, each of the K files is queried in the second order.
  • querying the data in each file helps to further reduce the number of intermediate results that do not meet the full filter criteria of the query.
  • FIG. 4 is only intended to help those skilled in the art to understand the embodiments of the present application, and is not intended to limit the embodiments of the present application to the specific examples illustrated. A person skilled in the art can obviously make various equivalent changes or modifications according to the example shown in FIG. 4, and such changes or modifications are also within the scope of the embodiments of the present application. It should be noted that, to facilitate understanding, in the description of the data query method below, the second file is referred to as the new file and the first file is referred to as the old file.
  • FIG. 4 is a schematic flowchart of a method for data query according to an embodiment of the present application. The method shown in Figure 4 includes:
  • the query statement contains the attributes of the target data and the filter conditions that the target data needs to satisfy.
  • the foregoing preset rule may be that the filtering condition in the first sub-statement of the query statement is empty, the filtering condition in the second sub-statement of the query statement is not empty, and the "cross-layer" filtering conditions in the query statement are connected by an "and" operation.
  • if the query statement satisfies the preset rule, the target data is queried from the files that store the target data in WeChat according to the query order from the new file to the old file.
  • if the query statement does not satisfy the preset rule, the target data is queried from the files that store the target data in WeChat according to the query order from the old file to the new file.
  • if there is next target data in the files that store the target data in WeChat, step 470 is performed; if there is no next target data in the files that store the target data in WeChat, the target data query process ends.
  • step 490 is performed; if the value corresponding to the target data that meets the filtering condition is 1 in the deleted-label bitmap, this indicates that the target data has been deleted, so the target data is discarded and step 460 is performed.
  • step 460 is performed.
  • FIG. 5 is a schematic block diagram of an apparatus for data query according to an embodiment of the present application.
  • the apparatus 500 shown in FIG. 5 includes an obtaining unit 510 and a query unit 520.
  • the obtaining unit 510 is configured to obtain a query statement, where the query statement is used to query N-layer data in a data hierarchy, the data hierarchy is a hierarchical structure of data in the data set, and the data set is stored in K files in descending order of the levels in the data hierarchy, the K files including a first file and a second file, the first file being the file with the earliest creation time among the K files and the second file being the file with the latest creation time among the K files, where N and K are positive integers greater than 1;
  • the query unit 520 is configured to: when the query statement obtained by the obtaining unit satisfies a preset condition, query the K files according to a query order from the second file to the first file to obtain target data to be queried, where the preset condition includes that the filtering condition for querying the data set in the first sub-statement of the query statement is empty and the filtering condition for querying the data set in the second sub-statement of the query statement is not empty, the first sub-statement is used to query top-layer data in the N-layer data, and the second sub-statement is used to query bottom-layer data in the N-layer data.
  • the K files are queried in the order from the second file to the first file, and the filtering condition in the second sub-statement is used to begin querying data from the second file (i.e., the new file) first. To a certain extent, this reduces the large number of intermediate results not meeting the filtering conditions of the query statement that arise in the prior art, where the query order from the first file to the second file is still used for query statements that satisfy the preset condition, and it helps improve the efficiency of querying data.
  • the device further includes: a determining unit, configured to divide the query statement into multiple sub-statements according to the data hierarchy, based on the attributes of the data in the data set to be queried and the filtering conditions for querying data in the data set that are contained in the query statement.
  • the query statement includes at least three filtering conditions, and different filtering conditions of the at least three filtering conditions are used to filter data located in different layers in the data hierarchical structure.
  • the query unit is further configured to: when the query statement satisfies the preset condition, and the at least three filtering conditions are filtering conditions that the target data needs to satisfy at the same time, query the K files according to the query order from the second file to the first file.
  • the query unit is further configured to: when the query statement satisfies the preset condition, query the K files according to a query order from the second file to the first file, and query each of the K files in a first order to obtain the target data, wherein the first order is the order from the bottom level to the top level of the data hierarchy of the file, and the hierarchical order in the data hierarchy of the file is the same as the hierarchical order of the data hierarchy of the data set.
  • the query unit is further configured to: if the query statement does not satisfy the preset condition, query the K files according to the query order from the first file to the second file to obtain the target data to be queried.
  • the query unit is further configured to: if the query statement does not satisfy the preset condition, query the K files according to the query order from the first file to the second file to obtain the target data to be queried, and query each of the K files in a second order to obtain the target data, wherein the second order is the order from the top level to the bottom level of the data hierarchy of the file, and the hierarchical order in the data hierarchy of the file is the same as the hierarchical order of the data hierarchy of the data set.
  • the obtaining unit 510 and the query unit 520 may be processors.
  • FIG. 6 is a schematic block diagram of an apparatus for data query according to an embodiment of the present application.
  • the apparatus 600 shown in FIG. 6 includes a memory 610, a processor 620, an input/output interface 630, and a communication interface 640.
  • the memory 610, the processor 620, the input/output interface 630, and the communication interface 640 are communicatively connected; the memory 610 is configured to store instructions, and the processor 620 is configured to execute the instructions stored in the memory 610 to control the input/output interface 630 to receive input data and information and to output data such as an operation result, and to control the communication interface 640 to transmit signals.
  • the processor 620 is configured to obtain a query statement, where the query statement is used to query N-layer data in a data hierarchy, the data hierarchy is a hierarchical structure of data in a data set, and the data set is stored in K files in descending order of the levels in the data hierarchy, the K files including a first file and a second file, the first file being the file with the earliest creation time among the K files and the second file being the file with the latest creation time among the K files, where N and K are positive integers greater than 1; and is further configured to: when the obtained query statement satisfies the preset condition, query the K files according to the query order from the second file to the first file to obtain target data to be queried, wherein the preset condition includes that the filtering condition for querying the data set in the first sub-statement of the query statement is empty and the filtering condition for querying the data set in the second sub-statement of the query statement is not empty, and the first sub-statement is used for querying top-
  • the processor 620 may be a general-purpose central processing unit (CPU), a microprocessor, an application specific integrated circuit (ASIC), or one or more integrated circuits, and is configured to execute related programs to implement the technical solution provided by the embodiment of the present invention.
  • the communication interface 640 uses a transceiver apparatus such as, but not limited to, a transceiver to enable communication between the apparatus 600 and other devices or communication networks.
  • the memory 610 can include read only memory and random access memory and provides instructions and data to the processor 620.
  • a portion of the processor 620 can also include a non-volatile random access memory.
  • the processor 620 can also store information of the device type.
  • the bus system 650 may include a power bus, a control bus, a status signal bus, and the like in addition to the data bus. However, for clarity of description, various buses are labeled as bus system 650 in the figure.
  • each step of the above method may be completed by an integrated logic circuit of hardware in the processor 620 or an instruction in a form of software.
  • the method for data query disclosed in the embodiment of the present invention may be directly executed by a hardware processor, or may be executed by a combination of hardware and software modules in the processor.
  • the software module can be located in a conventional storage medium such as random access memory, flash memory, read only memory, programmable read only memory or electrically erasable programmable memory, registers, and the like.
  • the storage medium is located in the memory 610, and the processor 620 reads the information in the memory 610 and completes the steps of the above method in combination with its hardware. To avoid repetition, it will not be described in detail here.
  • B corresponding to A means that B is associated with A, and B can be determined according to A.
  • determining B according to A does not mean that B is determined only on the basis of A; B may also be determined based on A and/or other information.
  • the size of the sequence numbers of the foregoing processes does not imply an order of execution; the order of execution of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.
  • the disclosed systems, devices, and methods may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of the unit is only a logical function division.
  • there may be another division manner in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the computer program product includes one or more computer instructions.
  • the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
  • the computer instructions can be stored in a computer readable storage medium or transferred from one computer readable storage medium to another computer readable storage medium; for example, the computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wirelessly (e.g., infrared, radio, microwave, etc.).
  • the computer readable storage medium can be any available media that can be read by a computer or a data storage device such as a server, data center, or the like that includes one or more available media.
  • the usable medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a Digital Video Disc (DVD)), a semiconductor medium (e.g., a Solid State Disk (SSD)), or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A method and device for data query. The method comprises: acquiring a query statement, the query statement being used for querying N layers of data in a data hierarchical structure of a data set to acquire target data to be queried, the data hierarchical structure being a hierarchical structure in which the data in the data set is stored, the data set being stored in K files according to a descending order of hierarchy in the data hierarchical structure, the K files comprising a first file and a second file, the first file being the earliest created file among the K files, and the second file being the latest created file among the K files, where N and K are positive integers greater than 1 (310); and if the query statement satisfies preset criteria, then querying the K files according to the querying order from the second file to the first file, the preset criteria comprising a filtering criterion in the first sub-statement used for querying the data set being null, a filtering criterion in the second sub-statement used for querying the data set not being null, the first sub-statement being used for querying top layer data in the N layers of data, and the second sub-statement being used for querying bottom layer data in the N layers of data (320). The method favors increased efficiency of data query.

Description

Method and device for data query
Technical field
The present application relates to the field of computers and, more particularly, to methods and apparatus for data query.
Background
A distributed database uses a high-speed computer network to connect multiple physically dispersed data storage units into a logically unified database. The basic idea of a distributed database is to distribute the data of an originally centralized database across multiple data storage nodes connected through a network, in order to obtain larger storage capacity and support more concurrent access. In recent years, with the rapid growth of data volume, distributed database technology has developed rapidly. Traditional relational databases have begun to evolve from the centralized model toward distributed architectures: relational distributed databases retain the data model and basic characteristics of traditional databases while moving from centralized storage to distributed storage and from centralized computing to distributed computing.
On the other hand, as data volumes keep growing, relational databases have begun to expose some shortcomings that are difficult to overcome. Non-relational databases, represented by NoSQL, have developed rapidly thanks to advantages such as high scalability and high concurrency, and a large number of NoSQL database products, such as key-value storage systems and document databases, have appeared on the market. A document database uses document storage as its database storage model; it can store both structured and unstructured data, and imposes no mandatory restriction on the structure of the data stored in a document.
At present, during data query in a document database, after receiving a query statement input by a user, the management system in the database queries the files in a fixed query order. For example, it iterates over the files of the data set in which the data is stored, in the query order from the old file to the new file, to query the data in the files.
However, querying files in a fixed query order in this way makes data queries inefficient. For example, suppose the filtering condition in the query statement targets only an attribute of the bottom-layer data in the hierarchical structure that stores the data of the target application, and the files are queried from the old file to the new file. Because the data stored in old files is mostly data near the top of the data hierarchy of the data set, and the filtering condition for top-layer data in the query statement is empty, searching the old files produces a large number of intermediate results that do not meet all the filtering conditions in the query statement, which reduces the efficiency of querying data.
Summary of the invention
The present application provides a method and apparatus for data query, which help improve the efficiency of querying data.
In a first aspect, a data query method is provided, including: obtaining a query statement, where the query statement is used to query N layers of data in a data hierarchy of a data set, the data hierarchy is the hierarchical structure in which the data in the data set is stored, and the data set is stored in K files in descending order of the levels in the data hierarchy, the K files including a first file and a second file, the first file being the file with the earliest creation time among the K files and the second file being the file with the latest creation time among the K files, where N and K are positive integers greater than 1; and when the query statement satisfies a preset condition, querying the K files according to a query order from the second file to the first file to obtain target data to be queried, where the preset condition includes that the filtering condition for querying the data set in a first sub-statement of the query statement is empty and the filtering condition for querying the data set in a second sub-statement of the query statement is not empty, the first sub-statement is used to query the top-layer data in the N layers of data, and the second sub-statement is used to query the bottom-layer data in the N layers of data.
With the data query method in the embodiments of the present application, when the first sub-statement and the second sub-statement in the query statement satisfy the preset condition, the K files can be queried in the order from the second file to the first file, so that the filtering condition in the second sub-statement is applied first to the second file (that is, the new file). To a certain extent, this reduces the large number of intermediate results not meeting the filtering conditions of the query statement that the prior art produces by still using the query order from the first file to the second file for query statements that satisfy the preset condition, and it helps improve the efficiency of querying data.
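The choice of file order described in the first aspect can be sketched as follows. This is only an illustrative sketch: the `SubStatement` class, its fields, and the function names are hypothetical and do not come from the application, and a sub-statement's filters are modelled as a plain list in which an empty list stands for an empty filtering condition.

```python
from dataclasses import dataclass, field


@dataclass
class SubStatement:
    """Hypothetical model of one sub-statement of the query statement."""
    level: int                      # 0 is the top layer of the N queried layers
    filters: list = field(default_factory=list)  # empty list = empty filtering condition


def meets_preset_condition(sub_statements):
    """True when the top-layer filter is empty and the bottom-layer filter is not."""
    ordered = sorted(sub_statements, key=lambda s: s.level)
    top, bottom = ordered[0], ordered[-1]
    return not top.filters and bool(bottom.filters)


def file_query_order(files_oldest_first, sub_statements):
    """Return the K files in the order in which they should be queried."""
    if meets_preset_condition(sub_statements):
        # Second (newest) file first, first (oldest) file last.
        return list(reversed(files_oldest_first))
    # Otherwise keep the order from the first file to the second file.
    return list(files_oldest_first)
```

For example, a query with no filter on the top layer and a filter on the bottom layer would scan `["f1", "f2", "f3"]` as `["f3", "f2", "f1"]`.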
With reference to the first aspect, in some implementations, the method further includes: dividing the query statement into multiple sub-statements according to the data hierarchy of the data set, based on the attributes of the data in the data set to be queried and the filtering conditions for querying data in the data set that are contained in the query statement.
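One way to picture this division is to group the queried attributes and the filtering conditions by the hierarchy level they belong to. The sketch below assumes a hypothetical mapping `LEVEL_OF` from attribute name to level; the attribute names and the string form of the filters are invented purely for illustration.

```python
from collections import defaultdict

# Hypothetical mapping from attribute name to its level in the data hierarchy (0 = top).
LEVEL_OF = {"user_name": 0, "group_name": 1, "message_text": 2}


def split_into_sub_statements(attributes, filters, level_of=LEVEL_OF):
    """Group queried attributes and filtering conditions by hierarchy level.

    attributes: names of the attributes to be queried
    filters: dict mapping an attribute name to its filtering condition
    Returns a list of (level, attributes, filters) tuples, top level first.
    """
    by_level = defaultdict(lambda: {"attributes": [], "filters": {}})
    for name in attributes:
        by_level[level_of[name]]["attributes"].append(name)
    for name, condition in filters.items():
        # An attribute and a filter on the same level land in the same sub-statement.
        by_level[level_of[name]]["filters"][name] = condition
    return [
        (level, by_level[level]["attributes"], by_level[level]["filters"])
        for level in sorted(by_level)
    ]
```

In this sketch a level that carries neither an attribute nor a filter simply produces no sub-statement.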
With reference to the first aspect, in some implementations, the query statement contains at least three filtering conditions, and different filtering conditions among the at least three filtering conditions are used to filter data located in different layers of the data hierarchy. Querying the K files according to the query order from the second file to the first file to obtain the target data to be queried when the query statement satisfies the preset condition includes: when the query statement satisfies the preset condition and the at least three filtering conditions are filtering conditions that the target data needs to satisfy at the same time, querying the K files according to the query order from the second file to the first file to obtain the target data to be queried.
In the embodiments of the present application, if the "cross-layer" filtering conditions, which filter data located in different layers of the data hierarchy, satisfy the foregoing preset condition, the K files can be queried in the query order from the second file to the first file, which helps reduce the number of intermediate results that do not meet all the filtering conditions in the query statement.
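The additional check for "cross-layer" filtering conditions might be expressed as below. This is a sketch under stated assumptions: the function name, the `connective` parameter, and the representation of a missing filter as `None` are all hypothetical, and the rule that fewer than three filters fall back to the plain preset condition is one reading of the text rather than something the application spells out.

```python
def scan_newest_first(filters_by_level, connective):
    """Decide whether to query the files from the newest to the oldest.

    filters_by_level: list indexed by hierarchy level (0 = top layer); each entry
        is a filter predicate, or None when that layer has no filter.
    connective: "and" when the target data must satisfy every filter at once,
        "or" when satisfying only some of them is enough.
    """
    top_empty = filters_by_level[0] is None
    bottom_set = filters_by_level[-1] is not None
    # "Cross-layer" here means at least three filters sitting on different layers.
    cross_layer = sum(f is not None for f in filters_by_level) >= 3
    if top_empty and bottom_set and cross_layer:
        return connective == "and"
    return top_empty and bottom_set
```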
With reference to the first aspect, in some implementations, querying the K files according to the query order from the second file to the first file to obtain the target data to be queried includes: when the query statement satisfies the preset condition, querying the K files according to the query order from the second file to the first file to obtain the target data to be queried, and querying each of the K files in a first order to obtain the target data, where the first order is the order from the bottom layer to the top layer of the data hierarchy of the file, and the hierarchical order in the data hierarchy of the file is the same as the hierarchical order of the data hierarchy of the data set.
Querying the data in each file in the first order helps further reduce the number of intermediate results that do not meet all the filtering conditions of the query statement.
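A bottom-to-top pass over a single file could look like the following sketch. The in-memory layout is an assumption made for illustration only: each level of the file is a dictionary from record id to record, and every record carries a hypothetical `parent` field pointing one level up.

```python
def query_bottom_up(file_levels, bottom_filter):
    """Query one file from its bottom level up to its top level.

    file_levels: list of dicts (record id -> record); index 0 is the highest
        level present in this file, the last entry is the lowest level.
    bottom_filter: predicate applied to bottom-level records.
    Returns, for each matching bottom record, the chain [top, ..., bottom].
    """
    results = []
    for record in file_levels[-1].values():
        if not bottom_filter(record):
            continue  # rejected before any upper level is touched
        chain = [record]
        parent_id = record["parent"]
        for level in reversed(file_levels[:-1]):
            parent = level.get(parent_id)
            if parent is None:
                break  # the ancestor is not stored in this file
            chain.append(parent)
            parent_id = parent["parent"]
        results.append(list(reversed(chain)))
    return results
```

Because the bottom-level filter runs first, upper-level records are only looked up for data that can still become part of the result.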
With reference to the first aspect, in some implementations, the method further includes: when the query statement does not satisfy the preset condition, querying the K files according to a query order from the first file to the second file to obtain the target data to be queried.
For a query statement that does not satisfy the preset condition, querying in the order from the first file to the second file helps speed up the data query. For example, such a query statement may be one in which the filtering condition in the first sub-statement is not empty. For this kind of query statement, the data can be queried using the data hierarchy in the query order from the first file to the second file; that is, while querying data in the first file, if the data being queried does not satisfy the filtering condition in the first sub-statement, the lower-layer data beneath that data in the data hierarchy need not be queried.
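The pruning effect described here can be illustrated with a small top-down traversal. The nested `value`/`children` node layout and the per-level list of predicates are hypothetical simplifications chosen for the sketch, not the storage format of the application.

```python
def query_top_down(nodes, filters, depth=0, path=()):
    """Query hierarchical data from the top level down, pruning failed subtrees.

    nodes: list of {"value": ..., "children": [...]} dictionaries
    filters: one entry per level; a predicate, or None when the level has no filter
    Returns a list of tuples, each holding the values along one matching path.
    """
    results = []
    for node in nodes:
        predicate = filters[depth]
        if predicate is not None and not predicate(node["value"]):
            continue  # the lower-level data under this node is never visited
        new_path = path + (node["value"],)
        if depth == len(filters) - 1:
            results.append(new_path)
        else:
            results.extend(
                query_top_down(node["children"], filters, depth + 1, new_path)
            )
    return results
```

When the top-level filter is non-empty, every rejected top-level record removes its whole subtree from the search, which is why the old-to-new order suits such queries.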
With reference to the first aspect, in some implementations, querying the K files according to the query order from the first file to the second file to obtain the target data to be queried when the query statement does not satisfy the preset condition includes: when the query statement does not satisfy the preset condition, querying the K files according to the query order from the first file to the second file to obtain the target data to be queried, and querying each of the K files in a second order to obtain the target data, where the second order is the order from the top layer to the bottom layer of the data hierarchy of the file, and the hierarchical order in the data hierarchy of the file is the same as the hierarchical order of the data hierarchy of the data set.
Querying the data in each file in the second order helps further reduce the number of intermediate results that do not meet all the filtering conditions of the query statement.
In a second aspect, an apparatus for data query is provided, the apparatus including units for performing the method in the first aspect.
In a third aspect, an apparatus for data query is provided, the apparatus including a memory, a processor, an input/output interface, and a communication interface. The memory, the processor, the input/output interface, and the communication interface are communicatively connected; the memory is configured to store instructions, and the processor is configured to execute the instructions stored in the memory. When the instructions are executed, the processor performs the method of the first aspect through the communication interface, and controls the input/output interface to receive input data and information and to output data such as an operation result.
In a fourth aspect, a computer-readable medium is provided, the computer-readable medium storing program code for execution by a terminal device, the program code including instructions for performing the method in the first aspect.
In a fifth aspect, a computer program product containing instructions is provided which, when run on a computer, causes the computer to perform the methods described in the foregoing aspects.
The technical solutions provided in the present application help reduce the intermediate results that do not meet the filtering conditions of the query statement during data query, thereby improving the efficiency of data query.
Description of the drawings
FIG. 1 is a schematic flowchart of a file generation method according to an embodiment of the present application.
FIG. 2 is a schematic block diagram of files generated by the file generation method according to an embodiment of the present application.
FIG. 3 is a schematic flowchart of a data query method according to an embodiment of the present application.
FIG. 4 is a schematic flowchart of a data query method according to another embodiment of the present application.
FIG. 5 is a schematic block diagram of an apparatus for data query according to an embodiment of the present application.
FIG. 6 is a schematic block diagram of an apparatus for data query according to an embodiment of the present application.
Detailed description
The technical solutions in the present application are described below with reference to the accompanying drawings.
The embodiments of the present application can be applied to a database, and in particular to a document database. A document database uses document storage as its storage model and can support the storage of structured data. A user can send a query statement to an application module, which forwards the query statement to the management system of the database. The management system queries the target data stored in the document database according to the query statement, and the target data (that is, the query result) can finally be presented to the user on a screen.
The structured data stored in the document database may be data stored in the document database according to a data hierarchy. The data hierarchy used for storing data in the embodiments of the present application is briefly introduced below.
A schema (Schema) in a distributed database can store all files corresponding to an application. The data hierarchy used by the files stored in each schema may contain multiple levels, and the first field of each level can be used to uniquely identify the data stored in that level, that is, the first field of each level serves as the primary key. Each level can also store data with different data attributes.
Taking WeChat as an example, the following describes in detail the data hierarchy used by the data stored in a schema that stores WeChat data. The data hierarchy for storing WeChat data may specifically be:
Figure PCTCN2017086600-appb-000001
The above data hierarchy can be divided into four levels. From the top of the data hierarchy to the bottom, they are the first layer, the second layer, the third layer, and the fourth layer (see Table 1). The first layer (also called the "top layer") may use the user identifier (User ID) as its primary key and may store data whose attributes are User ID, name (Name), sex (Sex), and birthday (Birthday). The second layer may use the topic identifier (Topic ID) as its primary key and may store data whose attributes are Topic ID, title (Title), and topic publication date (T_Date). The third layer may use the comment identifier (Comment ID) as its primary key and may store data whose attributes are Comment ID, comment content (C_Content), comment date (C_Date), and the user identifier of the user who posted the comment (C_User ID). The fourth layer (also called the "bottom layer") may use the reply identifier (Feed ID) as its primary key and may store data whose attributes are Feed ID, reply (Feed) content (F_Content), and Feed publication date (F_Date). Table 1 shows the above hierarchy for storing WeChat data in tabular form.
Table 1
Level | Primary key | Data attributes
1 | User ID | Name, Sex, Birthday
2 | Topic ID | Title, T_Date
3 | Comment ID | C_Content, C_Date, C_User ID
4 | Feed ID | F_Content, F_Date
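The hierarchy in Table 1 can be written down as a small lookup structure. The following is only an illustrative sketch: the names `Level`, `HIERARCHY`, and `level_of` are assumptions introduced here and do not appear in the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    number: int        # 1 = top layer, 4 = bottom layer
    primary_key: str   # first field of the level
    attributes: tuple  # remaining data attributes of the level


# Table 1, transcribed level by level.
HIERARCHY = (
    Level(1, "User ID", ("Name", "Sex", "Birthday")),
    Level(2, "Topic ID", ("Title", "T_Date")),
    Level(3, "Comment ID", ("C_Content", "C_Date", "C_User ID")),
    Level(4, "Feed ID", ("F_Content", "F_Date")),
)


def level_of(attribute):
    """Return the number of the level that holds the given attribute."""
    for level in HIERARCHY:
        if attribute == level.primary_key or attribute in level.attributes:
            return level.number
    raise KeyError(attribute)
```

A lookup of this kind is what later allows a query attribute such as Birthday or Title to be assigned to the first or second layer.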
The data storage process based on the above data hierarchy is briefly introduced below. When data is stored in the database, it may be stored level by level according to the data hierarchy described above, from the top of the data hierarchy to the bottom. While data is being stored in the database, the database management system can use a flush (Flush) operation to write the data blocks already formed in the buffer out to disk, forming a file. The file generation process of the embodiments of the present application is described in detail below with reference to FIG. 1.
FIG. 1 is a schematic flowchart of a file generation method according to an embodiment of the present application. The method shown in FIG. 1 includes:
110. Store User(1,…) in the database.
111. Store Topic(1,1,…) in the database.
112. Store Topic(1,2,…) in the database.
113. Store Comment(1,1,1,…) in the database.
114. Store Comment(1,1,2,…) in the database.
115. Generate file 1.
Specifically, the data stored in the buffer of the database in steps 110 to 114 is written to the disk of the database by a Flush operation, forming file 1.
116. Store Feed(1,1,1,1,…) in the database.
117. Store Feed(1,1,2,2,…) in the database.
118. Store Comment(1,2,3,…) in the database.
Specifically, when Comment(1,2,3,…) is stored in the database, it needs to carry the third-layer primary key (Comment ID) as well as the first-layer primary key (User ID) and the second-layer primary key (Topic ID).
119. Store Feed(1,2,3,4…) in the database.
120. Generate file 2.
Specifically, the data stored in the buffer of the database in steps 116 to 119 is written to the disk of the database by a Flush operation, forming file 2.
It should be understood that, when data is stored into files as described above, data of the same layer may be stored in the same file, whereas data that shares the same upper layer but belongs to different lower layers may be stored in different files.
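Steps 110 to 120 can be replayed with a toy buffer-and-flush model. This is a minimal sketch under stated assumptions: the `Store` class, its `put` and `flush` methods, and the in-memory lists standing in for the buffer and the disk are all hypothetical and only mirror the order of operations in FIG. 1.

```python
class Store:
    """Hypothetical stand-in for the database buffer and its on-disk files."""

    def __init__(self):
        self.buffer = []  # records stored but not yet flushed
        self.files = []   # files[0] is file 1 (the oldest file)

    def put(self, kind, *keys):
        # A record carries its own primary key and those of all upper layers.
        self.buffer.append((kind, keys))

    def flush(self):
        # Write the buffered records out as a new file and empty the buffer.
        self.files.append(list(self.buffer))
        self.buffer.clear()


store = Store()
store.put("User", 1)            # step 110
store.put("Topic", 1, 1)        # step 111
store.put("Topic", 1, 2)        # step 112
store.put("Comment", 1, 1, 1)   # step 113
store.put("Comment", 1, 1, 2)   # step 114
store.flush()                   # step 115: file 1
store.put("Feed", 1, 1, 1, 1)   # step 116
store.put("Feed", 1, 1, 2, 2)   # step 117
store.put("Comment", 1, 2, 3)   # step 118
store.put("Feed", 1, 2, 3, 4)   # step 119
store.flush()                   # step 120: file 2
```

After the second flush, file 1 holds five records from the upper three layers, while file 2 holds four records, mostly from the bottom layer plus the late-arriving Comment(1,2,3,…).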
It should be noted that, while files are generated by the file generation method shown in FIG. 1, a delete tag bitmap (Delete Tag Bitmap) can be generated at the same time and stored on disk together with the corresponding file. The delete tag bitmap indicates whether the data stored in the file is valid. In the delete tag bitmap, a bit value of 0 (the initial value) may indicate that the data storage record corresponding to the bit has not been deleted, and a bit value of 1 may indicate that the record has been deleted. Referring to FIG. 2, file 1 stores the data stored in the database in steps 110 to 114, five data storage records in total. The delete tag bitmap corresponding to file 1 can therefore use five bits to indicate whether the data corresponding to the five records in file 1 has been deleted. All five bits are 0, that is, none of the five data storage records in file 1 has been deleted. Similarly, the four bits in the delete tag bitmap that indicate whether the four data storage records in file 2 have been deleted are all 0, which means that the data stored in file 2 in steps 116 to 119 has not been deleted.
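A delete tag bitmap of this kind can be sketched as one bit per data storage record. The class below is an illustration only: the name `DeleteTagBitmap`, its methods, and the byte-packed layout are assumptions, since the application does not specify how the bitmap is encoded.

```python
class DeleteTagBitmap:
    """One bit per record: 0 = not deleted (initial value), 1 = deleted."""

    def __init__(self, record_count):
        self.record_count = record_count
        self.bits = bytearray((record_count + 7) // 8)  # all bits start at 0

    def _check(self, index):
        if not 0 <= index < self.record_count:
            raise IndexError(index)

    def mark_deleted(self, index):
        self._check(index)
        self.bits[index // 8] |= 1 << (index % 8)

    def is_deleted(self, index):
        self._check(index)
        return bool(self.bits[index // 8] & (1 << (index % 8)))


file1_bitmap = DeleteTagBitmap(5)  # five records from steps 110 to 114
file2_bitmap = DeleteTagBitmap(4)  # four records from steps 116 to 119
```

Both bitmaps start with every bit at 0, matching the state described for FIG. 2; calling `mark_deleted` on a record index would flip only that record's bit to 1.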
As can be seen from the file generation method shown in FIG. 1, files can be numbered in ascending order of their generation, that is, ascending file numbers run from old files to new files. A larger file number indicates that the file was generated closer to the current time, that is, the file is "newer"; a smaller file number indicates that the file was generated further from the current time, that is, the file is "older". When data is stored in the database according to the data hierarchy and new files are generated, the smaller the file number (the older the file), the more likely it is that the data stored in the file is located in the upper layers of the data hierarchy (for example, the first-layer and second-layer data shown in Table 1); the larger the file number (the newer the file), the more likely it is that the data stored in the file is located in the lower layers of the data hierarchy (for example, the third-layer and fourth-layer data shown in Table 1).
Based on this relationship between new and old files and the levels, in the data hierarchy, of the data stored in those files, the embodiments of the present application propose a data query method, which is described in detail below with reference to FIG. 3.
FIG. 3 is a schematic flowchart of a data query method according to an embodiment of the present application. It should be understood that the method shown in FIG. 3 may be performed by the management system of a database, for example, a distributed data management system. The method shown in FIG. 3 includes:
310. Obtain a query statement. The query statement is used to query N layers of data in a data hierarchy, where the data hierarchy is the hierarchy in which the data of a data set is stored, and the data set is stored in K files in order of the levels of the data hierarchy from high to low. The K files include a first file and a second file, the first file being the file with the earliest creation time among the K files and the second file being the file with the latest creation time among the K files, where N and K are positive integers greater than 1.
Specifically, the statement that the data set is stored in K files in order of the levels of the data hierarchy from high to low may mean that the data in the data set is, broadly, stored in the K files in that order. It does not exclude the case in which, while data is being stored in the K files according to this rule, newly inserted data belonging to a higher layer of the data hierarchy is stored in a new file. For example, the generated file 2 shown in FIG. 2 (that is, a new file) stores the data Comment(1,2,3,…).
The first file is the file with the earliest creation time among the K files, that is, the first file is the old file mentioned above.
The second file is the file with the latest creation time among the K files, that is, the second file is the new file mentioned above.
It should be understood that the data set may be the set of all data of an application stored in the database, for example, the set of all WeChat data.
It should also be understood that the query statement may be a query statement for a range query, used to find at least one piece of data that satisfies the filter conditions, for example, a Scan query statement or a range query statement with a key.
For example, when query statement Q1 is: Select Topic ID, Title from WeiXin where Birthday = "1998" and Title like "***", the data attributes contained in Q1 are Topic ID and Title, and the filter conditions in Q1 are 'Birthday = "1998"' and 'Title like "***"'. Table 2 shows the data attributes and the filter condition of each sub-statement of Q1. Q1 can contain two sub-statements, sub-statement 1 and sub-statement 2. The filter condition of sub-statement 1 is 'Birthday = "1998"', and its data attributes are empty. With reference to the data hierarchy shown in Table 1, this filter condition filters the data whose attribute is "Birthday", which is located in the first layer of the data hierarchy; that is, sub-statement 1 is used to query data in the first layer of the data hierarchy shown in Table 1. The filter condition of sub-statement 2 is 'Title like "***"', and its data attributes are "Topic ID, Title"; with reference to the data hierarchy shown in Table 1, sub-statement 2 is used to query data in the second layer of that hierarchy.
Table 2
Sub-statement | Data attributes | Filter condition
Sub-statement 1 | (empty) | Birthday = "1998"
Sub-statement 2 | Topic ID, Title | Title like "***"
It should be noted that when the data attributes of a sub-statement are empty, the sub-statement merely filters on an attribute of the data in some layer of the data hierarchy, and the attributes of the data in that layer need not appear in the query result. For example, sub-statement 1 of Q1 merely filters on the data whose attribute is "Birthday", so the query result of Q1 need not include the attributes of the first-layer data queried by sub-statement 1.
320. When the query statement satisfies a preset condition, query the K files in the query order from the second file to the first file to obtain the target data to be queried. The preset condition includes: the filter condition used to query the data set in a first sub-statement of the query statement is empty, and the filter condition used to query the data set in a second sub-statement of the query statement is not empty, where the first sub-statement is used to query the top-layer data of the N layers of data and the second sub-statement is used to query the bottom-layer data of the N layers of data.
Specifically, the statement that the filter condition used to query the data set in the first sub-statement is empty can equivalently be read as: no filter condition exists in the first sub-statement. The statement that the filter condition used to query the data set in the second sub-statement is not empty can equivalently be read as: a filter condition exists in the second sub-statement.
It should be understood that a sub-statement may contain a filter condition used to query the data set and data attributes. Either one of the filter condition and the data attributes may be empty, or neither may be empty.
It should also be understood that querying the K files in the query order from the second file to the first file may mean querying the K files one after another in that order, one file at a time. It may also mean dividing the K files into groups in that order and then, following the same order, querying the multiple files of one group at a time.
It should also be understood that the embodiments of the present application do not specifically limit whether the filter conditions used to query the data set in sub-statements other than the first sub-statement and the second sub-statement are empty.
For example, query statement Q1 above contains two sub-statements, which are used to query the first-layer data and the second-layer data of the data hierarchy shown in Table 1. Sub-statement 1 can be regarded as the "first sub-statement" of Q1, which queries the data in the first (top) layer of the two layers of data; sub-statement 2 can be regarded as the "second sub-statement" of Q1, which queries the data in the second (bottom) layer of the two layers of data. However, because the filter condition of sub-statement 1 is not empty, Q1 does not satisfy the preset condition.
As another example, when query statement Q2 is: Select Name, Topic ID, Title from Table where Title like "***", the data attributes contained in Q2 include "Name" and "Topic ID, Title", and the filter condition that the data must satisfy is 'Title like "***"'. Table 3 shows the data attributes and the filter condition of each sub-statement of Q2. Q2 can contain two sub-statements, sub-statement 3 and sub-statement 4. The data attribute of sub-statement 3 is "Name", and its filter condition is empty; with reference to the data hierarchy shown in Table 1, this data attribute is located in the first layer of the data hierarchy. The filter condition of sub-statement 4 is 'Title like "***"', and its data attributes are "Topic ID, Title"; with reference to the data hierarchy shown in Table 1, sub-statement 4 is used to query data in the second layer of that hierarchy.
Table 3
Sub-statement | Data attributes | Filter condition
Sub-statement 3 | Name | (empty)
Sub-statement 4 | Topic ID, Title | Title like "***"
Query statement Q2 above contains two sub-statements, which are used to query the first-layer data and the second-layer data of the data hierarchy shown in Table 1. Sub-statement 3 can be regarded as the "first sub-statement" of Q2, which queries the data in the first (top) layer of the two layers of data; sub-statement 4 can be regarded as the "second sub-statement" of Q2, which queries the data in the second (bottom) layer of the two layers of data. Because the filter condition of sub-statement 3 is empty and the filter condition of sub-statement 4 is not empty, Q2 satisfies the preset condition.
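The preset-condition check applied to Q1 and Q2 can be sketched as follows. The `SubStatement` type and the `meets_preset_condition` function are assumptions introduced for illustration; the filter conditions are kept as opaque strings copied from Tables 2 and 3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubStatement:
    level: int              # layer of the data hierarchy this sub-statement queries
    attributes: tuple = ()  # data attributes to return (may be empty)
    filters: tuple = ()     # filter conditions (may be empty)


def meets_preset_condition(sub_statements):
    """True when the top-layer sub-statement has no filter and the bottom-layer one has."""
    if len(sub_statements) < 2:  # the method assumes N > 1 layers
        return False
    ordered = sorted(sub_statements, key=lambda s: s.level)
    first, second = ordered[0], ordered[-1]
    return not first.filters and bool(second.filters)


Q1 = [
    SubStatement(level=1, filters=('Birthday = "1998"',)),
    SubStatement(level=2, attributes=("Topic ID", "Title"), filters=('Title like "***"',)),
]
Q2 = [
    SubStatement(level=1, attributes=("Name",)),
    SubStatement(level=2, attributes=("Topic ID", "Title"), filters=('Title like "***"',)),
]

print(meets_preset_condition(Q1))  # False: the top-layer sub-statement has a filter
print(meets_preset_condition(Q2))  # True: only the bottom-layer sub-statement filters
```

Sub-statements for intermediate layers, if any, are ignored by the check, which matches the statement that their filter conditions are not limited.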
With the data query method of the embodiments of the present application, when the first sub-statement and the second sub-statement of the query statement satisfy the preset condition, the K files can be queried in the order from the second file to the first file, so that the filter condition of the second sub-statement is applied starting from the second file (that is, the new file). In the prior art, a query statement that satisfies the preset condition is still processed in the query order from the first file to the second file, which produces a large number of intermediate results that do not satisfy the filter conditions of the query statement. Starting from the new file can reduce such intermediate results to some extent and helps to improve query efficiency.
Optionally, the query statement contains at least three filter conditions, and different filter conditions among the at least three filter conditions are used to filter data in different layers of the data hierarchy. Step 320 further includes: if the query statement satisfies the preset condition and the at least three filter conditions are conditions that the data must satisfy simultaneously, querying the K files in the query order from the second file to the first file.
Specifically, different filter conditions among the at least three filter conditions that filter data in different layers of the data hierarchy may be called "cross-layer" filter conditions. Cross-layer filter conditions may be connected by the logical operator AND; that is, the data must satisfy the cross-layer filter conditions simultaneously, or in other words, the data must satisfy every one of the at least three filter conditions.
For example, the filter conditions 'Birthday = "1998"' and 'Title like "***"' contained in query statement Q1 are, respectively, a filter condition on the data in the first layer of the data hierarchy and a filter condition on the data in the second layer of the data hierarchy, that is, they are "cross-layer" filter conditions as described above. Moreover, Q1 shows that the two filter conditions are connected by "and", which means that the data must satisfy both cross-layer filter conditions simultaneously.
In the embodiments of the present application, if the "cross-layer" filter conditions used to filter data in different layers of the data hierarchy satisfy the preset condition, the K files can be queried in the query order from the second file to the first file, which helps to reduce the number of intermediate results that do not satisfy all of the filter conditions of the query statement.
Optionally, as an embodiment, step 320 further includes: when the query statement satisfies the preset condition, querying the K files in the query order from the second file to the first file, and querying each of the K files in a first order to obtain the target data, where the first order is the order from the bottom layer to the top layer of the data hierarchy of the file, and the order of the levels in the data hierarchy of the file is the same as the order of the levels in the data hierarchy of the data set.
Specifically, the data hierarchy of a file may refer to the data hierarchy of the data stored in that file, and can be regarded as a sub-hierarchy of the data hierarchy of the data set. That is, the data hierarchy of a file contains some of the levels of the data hierarchy of the data set, and it represents the hierarchical relationship, or level order, among the data stored in the file.
For example, the data structure of file 1 shown in FIG. 2 includes the first-level data User ID, the second-level data Topic ID, and the third-level data Comment ID of the data hierarchy shown in Table 1. It can be seen that the data hierarchy of file 1 may be a sub-hierarchy of the data hierarchy (four levels) shown in Table 1.
It should be noted that each of the K files may be queried in the first order when the query statement satisfies the preset condition, or when the query statement satisfies the preset condition and the "cross-layer" filter conditions of the query statement are conditions that the data must satisfy simultaneously.
Querying the data in each file in the first order helps to further reduce the number of intermediate results that do not satisfy all of the filter conditions of the query statement.
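The first order (bottom layer to top layer) and its counterpart, the second order (top layer to bottom layer), can be illustrated by ordering the records of one file by their level. This is a sketch only: the record layout `(kind, keys)` and the function `order_within_file` are assumptions, and the level of a record is taken to be the number of primary keys it carries.

```python
def order_within_file(records, bottom_up):
    """Order one file's records by level; bottom_up=True gives the first order."""
    # The level of a record equals the number of primary keys it carries.
    # sorted() is stable, so records of the same level keep their stored order.
    return sorted(records, key=lambda record: len(record[1]), reverse=bottom_up)


# Records of file 1 as stored in steps 110 to 114.
file1 = [
    ("User", (1,)),
    ("Topic", (1, 1)),
    ("Topic", (1, 2)),
    ("Comment", (1, 1, 1)),
    ("Comment", (1, 1, 2)),
]

first_order = order_within_file(file1, bottom_up=True)    # Comment, Topic, User
second_order = order_within_file(file1, bottom_up=False)  # User, Topic, Comment
```

With the first order, the bottom-layer records, which carry the only mandatory filter under the preset condition, are examined before any upper-layer record of the same file.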
Optionally, as an embodiment, the method further includes: 330. According to the attributes of the data in the data set queried by the query statement and the filter conditions used to query the data in the data set, divide the query statement into multiple sub-statements in accordance with the data hierarchy.
Specifically, dividing the query statement into multiple sub-statements in accordance with the data hierarchy may mean determining the level, in the data hierarchy, of each data attribute queried by the query statement and the layer of the data hierarchy whose data each filter condition restricts, and thereby determining the data attributes and filter conditions contained in each sub-statement. When a queried data attribute and the data filtered by a filter condition are located at the same level of the hierarchy, they can be merged into one sub-statement, and different sub-statements of the query statement are used to query data in different layers of the data hierarchy.
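Step 330 can be sketched as grouping the queried attributes and the filter conditions by the layer they belong to. The code below is illustrative and rests on assumptions: the query is assumed to be already parsed into a list of selected attributes and a list of (attribute, condition text) pairs, and `ATTRIBUTE_LEVEL` and `split_into_sub_statements` are hypothetical names.

```python
# Layer of each attribute, transcribed from Table 1.
ATTRIBUTE_LEVEL = {
    "User ID": 1, "Name": 1, "Sex": 1, "Birthday": 1,
    "Topic ID": 2, "Title": 2, "T_Date": 2,
    "Comment ID": 3, "C_Content": 3, "C_Date": 3, "C_User ID": 3,
    "Feed ID": 4, "F_Content": 4, "F_Date": 4,
}


def split_into_sub_statements(selected, filters):
    """Group attributes and filter conditions by layer, one sub-statement per layer.

    selected: attribute names to return.
    filters:  (attribute, condition_text) pairs.
    """
    subs = {}
    for attribute in selected:
        level = ATTRIBUTE_LEVEL[attribute]
        subs.setdefault(level, {"attributes": [], "filters": []})
        subs[level]["attributes"].append(attribute)
    for attribute, condition in filters:
        level = ATTRIBUTE_LEVEL[attribute]
        subs.setdefault(level, {"attributes": [], "filters": []})
        subs[level]["filters"].append(condition)
    return dict(sorted(subs.items()))  # top layer first


# Query statement Q1 from the text, already parsed by hand.
q1 = split_into_sub_statements(
    selected=["Topic ID", "Title"],
    filters=[("Birthday", 'Birthday = "1998"'), ("Title", 'Title like "***"')],
)
```

For Q1 this yields a layer-1 sub-statement with empty attributes and the Birthday filter, and a layer-2 sub-statement with the attributes Topic ID and Title and the Title filter, which matches Table 2.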
Optionally, as an embodiment, the method further includes: 340. When the query statement does not satisfy the preset condition, query the K files in the query order from the first file to the second file to obtain the target data to be queried.
Specifically, if the query statement does not satisfy the preset condition, the K files can be queried in the query order from the old file to the new file to obtain the target data to be queried.
For a query statement that does not satisfy the preset condition, querying in the order from the first file to the second file helps to increase the query speed. For example, a query statement that does not satisfy the preset condition may be one whose first sub-statement has a non-empty filter condition. For this kind of query statement, the data can be queried in the order from the first file to the second file by making use of the data hierarchy. That is, while data is being queried in the first file, if a piece of data does not satisfy the filter condition of the first sub-statement, the lower-layer data beneath that piece of data in the data hierarchy need not be queried.
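The pruning effect of the old-to-new order can be sketched as follows. Everything here is a hypothetical illustration: records are modeled as (keys, attributes) pairs, `scan_old_to_new` is an invented name, and the sample users, topics, and comment are made-up data that do not come from the application.

```python
def scan_old_to_new(files, predicates):
    """Scan files oldest first, skipping records whose ancestor failed a filter.

    files:      list of files, oldest first; each file is a list of (keys, attrs).
    predicates: maps a level number to a function attrs -> bool.
    """
    rejected = set()   # key tuples of records that failed their filter
    survivors = []
    for records in files:
        for keys, attrs in records:
            # If any ancestor (a proper prefix of the keys) was rejected, prune.
            if any(keys[:n] in rejected for n in range(1, len(keys))):
                continue
            check = predicates.get(len(keys))
            if check is not None and not check(attrs):
                rejected.add(keys)
                continue
            survivors.append(keys)
    return survivors


# Made-up sample data: two users, one topic each, one comment under user 2.
old_file = [
    ((1,), {"Birthday": "1998"}),
    ((2,), {"Birthday": "2001"}),
    ((1, 1), {"Title": "a"}),
    ((2, 1), {"Title": "b"}),
]
new_file = [((2, 1, 1), {"C_Content": "c"})]

result = scan_old_to_new(
    [old_file, new_file],
    {1: lambda attrs: attrs["Birthday"] == "1998"},
)
```

Because user 2 fails the top-layer filter while the old file is scanned, its topic and comment are skipped without being evaluated, so only user 1 and its topic survive.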
Optionally, when the query statement contains at least three filter conditions, and different ones of the at least three filter conditions are used to filter data located at different layers of the data hierarchy, if the query statement satisfies the preset condition but the at least three filter conditions are not conditions that the data must satisfy simultaneously, the K files are queried in the order from the first file to the second file to obtain the target data to be queried.
Specifically, that the at least three filter conditions are not conditions that the data must satisfy simultaneously may mean that the data satisfies any one of them, or any two of them. In other words, the at least three filter conditions are connected by logical OR.
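The choice of file order described so far can be summarized in a short sketch. It assumes each file is represented by a hypothetical `(creation_time, name)` pair and that the way the cross-layer filter conditions are joined is passed in as the string `"and"` or `"or"`; the function name `choose_file_order` is an assumption made for illustration.

```python
def choose_file_order(files, top_filters, bottom_filters, cross_layer_op="and"):
    """Return file names in scan order.

    files: list of (creation_time, name) pairs (hypothetical representation).
    top_filters / bottom_filters: filter conditions of the first and second sub-statement.
    """
    oldest_first = [name for _, name in sorted(files)]
    # Preset condition: top-layer filter is empty, bottom-layer filter is not.
    meets_preset = (not top_filters) and bool(bottom_filters)
    if meets_preset and cross_layer_op == "and":
        return oldest_first[::-1]  # start from the newest (second) file
    return oldest_first            # start from the oldest (first) file
```

The sketch falls back to the oldest-first order both when the preset condition fails and when the cross-layer conditions are joined by OR, which is how the two preceding paragraphs treat those cases.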
Optionally, in an embodiment, step 340 further includes: when the query statement does not satisfy the preset condition, query the K files in the query order from the first file to the second file to obtain the target data to be queried, and query each of the K files in a second order to obtain the target data. The second order is the order from the top layer to the bottom layer of the file's data hierarchy, and the level order in the file's data hierarchy is the same as the level order in the data hierarchy of the data set.
Specifically, the data hierarchy of a file may refer to the data hierarchy of the data stored in that file, and can be regarded as a sub-hierarchy of the data hierarchy of the data set. That is, the data hierarchy of a file contains some of the levels of the data hierarchy of the data set, and it represents the hierarchical relationship, or level order, among the data stored in the file.
Querying each of the K files in the second order, that is, the order from the top layer to the bottom layer of the file's data hierarchy, means that each file is queried from the top layer of its data hierarchy down to the bottom layer.
It should be noted that each of the K files may be queried in the second order when the query statement does not satisfy the preset condition, or when the query statement satisfies the preset condition but its "cross-layer" filter conditions are not conditions that the data must satisfy simultaneously.
Querying the data in each file in the second order helps further reduce the number of intermediate results that do not satisfy all the filter conditions of the query statement.
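A minimal sketch of a top-to-bottom query within one file follows. It assumes, purely for illustration, that the data in a file is held as a nested tree of dictionaries with the hypothetical keys `level`, `values`, and `children`, and that the filter conditions are supplied as one predicate per level; none of these names appear in the application.

```python
def query_top_down(node, predicates, path=()):
    """Yield root-to-leaf paths in which every node passes the predicate of its level."""
    check = predicates.get(node["level"])
    if check is not None and not check(node["values"]):
        return  # prune: the lower-layer data under this node is not visited
    path = path + (node["values"],)
    if not node["children"]:
        yield path
        return
    for child in node["children"]:
        yield from query_top_down(child, predicates, path)
```

Because a failed upper-layer check returns before any child is visited, no intermediate result is produced for the data beneath it, which is the effect the paragraph above attributes to the second order.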
The data query method of the embodiments of this application is described in more detail below with reference to FIG. 4. It should be understood that FIG. 4 is intended only to help a person skilled in the art understand the embodiments of this application, not to limit them to the specific scenario illustrated. A person skilled in the art can obviously make various equivalent changes or modifications based on the example shown in FIG. 4, and such changes or modifications also fall within the scope of the embodiments of this application. It should be noted that, for ease of understanding, the description below uses "new file" in place of the second file and "old file" in place of the first file.
FIG. 4 is a schematic flowchart of a data query method according to an embodiment of this application. The method shown in FIG. 4 includes:
410. Receive a query statement, where the query statement is used to query target data in WeChat.
Specifically, the query statement contains the attributes of the target data and the filter conditions that the target data must satisfy.
420. Divide the query statement into N sub-statements according to the target data hierarchy, based on the attributes of the target data and the filter conditions that the target data must satisfy, both of which are contained in the query statement.
Specifically, dividing query statement Q1 yields two sub-statements (see Table 2), and dividing query statement Q2 yields two sub-statements (see Table 3).
430. Determine whether the query statement satisfies a preset rule.
Specifically, the preset rule may be that the filter condition in the first sub-statement of the query statement is empty, the filter condition in the second sub-statement of the query statement is not empty, and the "cross-layer" filter conditions in the query statement are joined by "and" operations.
440. If the query statement satisfies the preset rule, query the target data in the files used to store the target data in WeChat, in the order from the new file to the old file.
450. If the query statement does not satisfy the preset rule, query the target data in the files used to store the target data in WeChat, in the order from the old file to the new file.
460. Determine whether there is a next piece of target data in the files used to store the target data in WeChat.
Specifically, if there is a next piece of target data in the files used to store the target data in WeChat, step 470 is performed; if there is none, the target data query process ends.
470. According to the filter conditions contained in the query statement, select the target data that meets the filter conditions from the files used to store the target data in WeChat.
480. Determine whether the target data that meets the filter conditions has been deleted.
Specifically, whether the target data that meets the filter conditions has been deleted can be determined from its delete tag in the delete-tag bitmap. If the delete tag is 0, the target data has not been deleted, and step 490 is performed. If the delete tag is 1, the target data has been deleted, so it is discarded and step 460 is performed.
490. Store the target data that meets the filter conditions in the intermediate result set.
Specifically, if the delete tag of the target data in the delete-tag bitmap is 0, the target data has not been deleted and satisfies the filter conditions in the query statement. The target data may then be placed in the intermediate result set, after which step 460 is performed.
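The loop formed by steps 460 to 490 can be sketched as follows. The sketch assumes that the stored target data is available as a list, that the delete-tag bitmap is a parallel list of 0/1 values, and that the filter conditions are supplied as a single predicate; the function name and parameters are hypothetical.

```python
def collect_intermediate_results(records, delete_bitmap, matches):
    """records: stored target data; delete_bitmap[i] == 1 means records[i] was deleted."""
    intermediate = []
    for i, rec in enumerate(records):   # step 460: is there a next piece of data?
        if not matches(rec):            # step 470: apply the filter conditions
            continue
        if delete_bitmap[i] == 1:       # step 480: deleted data is discarded
            continue
        intermediate.append(rec)        # step 490: keep in the intermediate result set
    return intermediate
```

The order in which `records` is populated would follow the file order chosen in step 440 or 450; the sketch leaves that choice to the caller.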
FIG. 5 is a schematic block diagram of a data query apparatus according to an embodiment of this application. The apparatus 500 shown in FIG. 5 includes an obtaining unit 510 and a query unit 520.
The obtaining unit 510 is configured to obtain a query statement, where the query statement is used to query N layers of data in a data hierarchy, the data hierarchy is the hierarchy in which the data of a data set is stored, and the data set is stored in K files in descending order of level in the data hierarchy. The K files include a first file and a second file, the first file being the file with the earliest creation time among the K files and the second file being the file with the latest creation time among the K files, where N and K are positive integers greater than 1.
The query unit 520 is configured to, when the query statement obtained by the obtaining unit satisfies a preset condition, query the K files in the query order from the second file to the first file to obtain the target data to be queried,
where the preset condition includes that the filter condition used to query the data set in a first sub-statement of the query statement is empty and the filter condition used to query the data set in a second sub-statement of the query statement is not empty, the first sub-statement being used to query the top-layer data of the N layers of data and the second sub-statement being used to query the bottom-layer data of the N layers of data.
In this embodiment of the application, when the first sub-statement and the second sub-statement of the query statement satisfy the preset condition, the K files can be queried in the order from the second file to the first file, so that the filter condition in the second sub-statement is applied starting from the second file (that is, the new file). In the prior art, a query statement that satisfies the preset condition is still processed in the order from the first file to the second file, which produces a large number of intermediate results that do not satisfy the filter conditions of the query statement. Querying from the second file reduces such intermediate results to some extent and helps improve query efficiency.
Optionally, in an embodiment, the apparatus further includes a determining unit, configured to divide the query statement into multiple sub-statements according to the data hierarchy, based on the attributes of the data queried in the data set and the filter conditions used to query the data in the data set, both of which are contained in the query statement.
Optionally, in an embodiment, the query statement contains at least three filter conditions, and different ones of the at least three filter conditions are used to filter data located at different layers of the data hierarchy. The query unit is specifically further configured to: when the query statement satisfies the preset condition and the at least three filter conditions are conditions that the target data must satisfy simultaneously, query the K files in the query order from the second file to the first file.
Optionally, in an embodiment, the query unit is further configured to: when the query statement satisfies the preset condition, query the K files in the query order from the second file to the first file, and query each of the K files in a first order to obtain the target data. The first order is the order from the bottom layer to the top layer of the file's data hierarchy, and the level order in the file's data hierarchy is the same as the level order in the data hierarchy of the data set.
Optionally, in an embodiment, the query unit is further configured to: if the query statement does not satisfy the preset condition, query the K files in the query order from the first file to the second file to obtain the target data to be queried.
Optionally, in an embodiment, the query unit is further configured to: if the query statement does not satisfy the preset condition, query the K files in the query order from the first file to the second file to obtain the target data to be queried, and query each of the K files in a second order to obtain the target data. The second order is the order from the top layer to the bottom layer of the file's data hierarchy, and the level order in the file's data hierarchy is the same as the level order in the data hierarchy of the data set.
Optionally, the obtaining unit 510 and the query unit 520 may be a processor.
Specifically, FIG. 6 is a schematic block diagram of a data query apparatus according to an embodiment of this application. The apparatus 600 shown in FIG. 6 includes a memory 610, a processor 620, an input/output interface 630, and a communication interface 640. The memory 610, the processor 620, the input/output interface 630, and the communication interface 640 are connected through a communication interface. The memory 610 is configured to store instructions, and the processor 620 is configured to execute the instructions stored in the memory 610, so as to control the input/output interface 630 to receive input data and information and to output data such as operation results, and to control the communication interface 640 to send signals.
The processor 620 is configured to obtain a query statement, where the query statement is used to query N layers of data in a data hierarchy, the data hierarchy is the hierarchy in which the data of a data set is stored, and the data set is stored in K files in descending order of level in the data hierarchy. The K files include a first file and a second file, the first file being the file with the earliest creation time among the K files and the second file being the file with the latest creation time among the K files, where N and K are positive integers greater than 1. The processor 620 is further configured to, when the query statement obtained by the obtaining unit satisfies a preset condition, query the K files in the query order from the second file to the first file to obtain the target data to be queried. The preset condition includes that the filter condition used to query the data set in a first sub-statement of the query statement is empty and the filter condition used to query the data set in a second sub-statement of the query statement is not empty, the first sub-statement being used to query the top-layer data of the N layers of data and the second sub-statement being used to query the bottom-layer data of the N layers of data.
It should be understood that, in the embodiments of the present invention, the processor 620 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits, and is configured to execute related programs to implement the technical solutions provided in the embodiments of the present invention.
It should also be understood that the communication interface 640 uses a transceiver apparatus, such as but not limited to a transceiver, to implement communication between the mobile terminal 600 and other devices or a communication network.
The memory 610 may include a read-only memory and a random access memory, and provides instructions and data to the processor 620. A portion of the processor 620 may further include a non-volatile random access memory. For example, the processor 620 may also store device type information.
In addition to a data bus, the bus system 650 may include a power bus, a control bus, a status signal bus, and the like. For clarity of description, however, the various buses are all labeled as the bus system 650 in the figure.
In an implementation process, the steps of the foregoing method may be completed by an integrated logic circuit of hardware in the processor 620 or by instructions in the form of software. The data query method disclosed with reference to the embodiments of the present invention may be directly embodied as being executed by a hardware processor, or as being executed by a combination of hardware and software modules in the processor. The software module may be located in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 610, and the processor 620 reads the information in the memory 610 and completes the steps of the foregoing method in combination with its hardware. To avoid repetition, details are not described again here.
In this embodiment of the application, when the first sub-statement and the second sub-statement of the query statement satisfy the preset condition, the K files can be queried in the order from the second file to the first file, so that the filter condition in the second sub-statement is applied starting from the second file (that is, the new file). In the prior art, a query statement that satisfies the preset condition is still processed in the order from the first file to the second file, which produces a large number of intermediate results that do not satisfy the filter conditions of the query statement. Querying from the second file reduces such intermediate results to some extent and helps improve query efficiency.
It should be understood that, in the embodiments of this application, "B corresponding to A" means that B is associated with A and that B can be determined according to A. It should also be understood, however, that determining B according to A does not mean that B is determined according to A alone; B may also be determined according to A and/or other information.
It should be understood that the term "and/or" in this document merely describes an association between associated objects and indicates that three relationships may exist. For example, A and/or B may indicate three cases: A exists alone, both A and B exist, and B exists alone. In addition, the character "/" in this document generally indicates an "or" relationship between the objects before and after it.
It should be understood that, in the various embodiments of this application, the sequence numbers of the foregoing processes do not imply an order of execution. The order in which the processes are executed should be determined by their functions and internal logic, and the sequence numbers should not constitute any limitation on the implementation of the embodiments of this application.
In the several embodiments provided in this application, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
The foregoing embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of this application are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a digital video disc (DVD)), a semiconductor medium (for example, a solid state disk (SSD)), or the like.
The foregoing is merely a description of specific implementations of this application, but the protection scope of this application is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (12)

  1. A method for querying data, comprising:
    obtaining a query statement, wherein the query statement is used to query N layers of data in a data hierarchy of a data set, the data hierarchy is the hierarchy in which the data of the data set is stored, the data set is stored in K files in descending order of level in the data hierarchy, the K files comprise a first file and a second file, the first file is the file with the earliest creation time among the K files, the second file is the file with the latest creation time among the K files, and N and K are positive integers greater than 1; and
    when the query statement satisfies a preset condition, querying the K files in a query order from the second file to the first file to obtain target data to be queried, wherein the preset condition comprises that a filter condition used to query the data set in a first sub-statement of the query statement is empty and a filter condition used to query the data set in a second sub-statement of the query statement is not empty, the first sub-statement is used to query top-layer data of the N layers of data, and the second sub-statement is used to query bottom-layer data of the N layers of data.
  2. The method according to claim 1, wherein the method further comprises:
    dividing the query statement into multiple sub-statements according to the data hierarchy of the data set, based on attributes of the data queried in the data set and the filter conditions used to query the data in the data set, both of which are contained in the query statement.
  3. The method according to claim 1 or 2, wherein the query statement contains at least three filter conditions, and different ones of the at least three filter conditions are used to filter data located at different layers of the data hierarchy; and
    the querying, when the query statement satisfies a preset condition, the K files in a query order from the second file to the first file to obtain target data to be queried comprises:
    when the query statement satisfies the preset condition and the at least three filter conditions are conditions that the target data must satisfy simultaneously, querying the K files in the query order from the second file to the first file to obtain the target data to be queried.
  4. The method according to any one of claims 1 to 3, wherein the querying the K files in a query order from the second file to the first file to obtain target data to be queried comprises:
    when the query statement satisfies the preset condition, querying the K files in the query order from the second file to the first file to obtain the target data to be queried, and querying each of the K files in a first order to obtain the target data, wherein the first order is an order from the bottom layer to the top layer of the data hierarchy of the file, and the level order in the data hierarchy of the file is the same as the level order in the data hierarchy of the data set.
  5. The method according to any one of claims 1 to 4, wherein the method further comprises:
    when the query statement does not satisfy the preset condition, querying the K files in a query order from the first file to the second file to obtain the target data to be queried.
  6. The method according to claim 5, wherein the querying, when the query statement does not satisfy the preset condition, the K files in a query order from the first file to the second file to obtain the target data to be queried comprises:
    when the query statement does not satisfy the preset condition, querying the K files in the query order from the first file to the second file to obtain the target data to be queried, and querying each of the K files in a second order to obtain the target data, wherein the second order is an order from the top layer to the bottom layer of the data hierarchy of the file, and the level order in the data hierarchy of the file is the same as the level order in the data hierarchy of the data set.
  7. An apparatus for querying data, comprising:
    an obtaining unit, configured to obtain a query statement, wherein the query statement is used to query N layers of data in a data hierarchy of a data set, the data hierarchy is the hierarchy in which data in the data set is stored, the data set is stored in K files in order from the highest layer to the lowest layer of the data hierarchy, the K files comprise a first file and a second file, the first file is the file with the earliest creation time among the K files, the second file is the file with the latest creation time among the K files, and N and K are positive integers greater than 1; and
    a query unit, configured to: when the query statement obtained by the obtaining unit satisfies a preset condition, query the K files in a query order from the second file to the first file to obtain target data to be queried, wherein the preset condition comprises: a filter condition for querying the data set in a first sub-statement of the query statement is empty, and a filter condition for querying the data set in a second sub-statement of the query statement is not empty, the first sub-statement being used to query top-layer data of the N layers of data, and the second sub-statement being used to query bottom-layer data of the N layers of data.
  8. The apparatus according to claim 7, wherein the apparatus further comprises:
    a determining unit, configured to divide the query statement into a plurality of sub-statements according to the data hierarchy of the data set, based on the attributes of the data in the data set to be queried and the filter conditions for querying data in the data set that are contained in the query statement.
  9. The apparatus according to claim 7 or 8, wherein the query statement contains at least three filter conditions, different ones of the at least three filter conditions are used to filter data located at different layers of the data hierarchy, and the query unit is further specifically configured to:
    when the query statement satisfies the preset condition and the at least three filter conditions are filter conditions that the target data must satisfy simultaneously, query the K files in a query order from the second file to the first file.
  10. The apparatus according to any one of claims 7 to 9, wherein the query unit is further configured to:
    when the query statement satisfies the preset condition, query the K files in a query order from the second file to the first file, and query each of the K files in a first order to obtain the target data, wherein the first order is the order from the bottom layer to the top layer of the file's data hierarchy, and the order of layers in the file's data hierarchy is the same as the order of layers in the data hierarchy of the data set.
  11. The apparatus according to any one of claims 7 to 10, wherein the query unit is further configured to:
    if the query statement does not satisfy the preset condition, query the K files in a query order from the first file to the second file to obtain the target data to be queried.
  12. The apparatus according to claim 11, wherein the query unit is further configured to:
    if the query statement does not satisfy the preset condition, query the K files in a query order from the first file to the second file to obtain the target data to be queried, and query each of the K files in a second order to obtain the target data, wherein the second order is the order from the top layer to the bottom layer of the file's data hierarchy, and the order of layers in the file's data hierarchy is the same as the order of layers in the data hierarchy of the data set.
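
The ordering logic recited in the method and apparatus claims can be illustrated with a minimal Python sketch. It is not taken from the application: the names SubStatement, DataFile, meets_preset_condition and plan_query are hypothetical, layer 0 is assumed to denote the top layer, and an empty filter condition is modelled as None. The sketch only plans the file order and the per-file layer order; it does not read any data.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SubStatement:
    layer: int                # assumed numbering: 0 = top layer, larger = nearer the bottom
    condition: Optional[str]  # None models an empty filter condition


@dataclass
class DataFile:
    name: str
    created_at: int           # creation time; a smaller value means an earlier file


def meets_preset_condition(subs: List[SubStatement]) -> bool:
    """True when the top-layer sub-statement has an empty filter condition
    and the bottom-layer sub-statement has a non-empty one."""
    ordered = sorted(subs, key=lambda s: s.layer)
    top, bottom = ordered[0], ordered[-1]
    return top.condition is None and bottom.condition is not None


def plan_query(files: List[DataFile],
               subs: List[SubStatement]) -> Tuple[List[str], List[int]]:
    """Return (file order, layer order within each file)."""
    oldest_first = [f.name for f in sorted(files, key=lambda f: f.created_at)]
    top_down = sorted(s.layer for s in subs)
    if meets_preset_condition(subs):
        # Latest-created file first; within each file, bottom layer to top layer.
        return oldest_first[::-1], top_down[::-1]
    # Otherwise: earliest-created file first; within each file, top layer to bottom layer.
    return oldest_first, top_down


files = [DataFile("f1", 1), DataFile("f2", 2), DataFile("f3", 3)]
subs = [SubStatement(0, None), SubStatement(1, None), SubStatement(2, "id = 7")]
print(plan_query(files, subs))  # (['f3', 'f2', 'f1'], [2, 1, 0])
```

In this sketch a query whose top-layer sub-statement carries a filter condition falls through to the second branch, which corresponds to the first-file-to-second-file, top-to-bottom order of claims 5, 6, 11 and 12.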
PCT/CN2017/086600 2017-05-31 2017-05-31 Method and device for data query WO2018218504A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201780091399.0A CN110678854B (en) 2017-05-31 2017-05-31 Data query method and device
PCT/CN2017/086600 WO2018218504A1 (en) 2017-05-31 2017-05-31 Method and device for data query

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/086600 WO2018218504A1 (en) 2017-05-31 2017-05-31 Method and device for data query

Publications (1)

Publication Number Publication Date
WO2018218504A1 (en) 2018-12-06

Family

ID=64454267

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/086600 WO2018218504A1 (en) 2017-05-31 2017-05-31 Method and device for data query

Country Status (2)

Country Link
CN (1) CN110678854B (en)
WO (1) WO2018218504A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111259062B (en) * 2020-01-15 2023-08-01 山东省电子口岸有限公司 Method and device capable of guaranteeing sequence of statement result set of full-table query of distributed database

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070239691A1 (en) * 2006-04-06 2007-10-11 Carlos Ordonez Optimization techniques for linear recursive queries in sql
CN103198141A (en) * 2013-04-18 2013-07-10 中国农业银行股份有限公司 Data record access control method and device in hierarchical relationship
CN103294807A (en) * 2013-05-31 2013-09-11 重庆大学 Distributed data management method on basis of multi-level relations
CN104123288A (en) * 2013-04-24 2014-10-29 阿里巴巴集团控股有限公司 Method and device for inquiring data
CN104216891A (en) * 2013-05-30 2014-12-17 国际商业机器公司 Method and equipment for optimizing query statement in relational database
CN105528367A (en) * 2014-09-30 2016-04-27 华东师范大学 A method for storage and near-real time query of time-sensitive data based on open source big data

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7185024B2 (en) * 2003-12-22 2007-02-27 International Business Machines Corporation Method, computer program product, and system of optimized data translation from relational data storage to hierarchical structure
US8285711B2 (en) * 2009-11-24 2012-10-09 International Business Machines Corporation Optimizing queries to hierarchically structured data
CN102375853A (en) * 2010-08-24 2012-03-14 ***通信集团公司 Distributed database system, method for building index therein and query method
CN103310011A (en) * 2013-07-02 2013-09-18 曙光信息产业(北京)有限公司 Analytical method for data query under cluster database system environment
CN104063425B (en) * 2014-06-04 2017-09-19 五八同城信息技术有限公司 The method and database middleware of data are inquired about by database middleware
CN105701098B (en) * 2014-11-25 2019-07-09 国际商业机器公司 The method and apparatus for generating index for the table in database

Also Published As

Publication number Publication date
CN110678854A (en) 2020-01-10
CN110678854B (en) 2021-10-15

Similar Documents

Publication Publication Date Title
CN106227800B (en) Storage method and management system for highly-associated big data
US9805079B2 (en) Executing constant time relational queries against structured and semi-structured data
US9411840B2 (en) Scalable data structures
US10055509B2 (en) Constructing an in-memory representation of a graph
US11169978B2 (en) Distributed pipeline optimization for data preparation
CN111046034B (en) Method and system for managing memory data and maintaining data in memory
US10678792B2 (en) Parallel execution of queries with a recursive clause
CN108369587B (en) Creating tables for exchange
US10311105B2 (en) Filtering queried data on data stores
US20200210399A1 (en) Signature-based cache optimization for data preparation
WO2017170459A1 (en) Method, program, and system for automatic discovery of relationship between fields in environment where different types of data sources coexist
WO2017096892A1 (en) Index construction method, search method, and corresponding device, apparatus, and computer storage medium
US20240126817A1 (en) Graph data query
US20120215752A1 (en) Index for hybrid database
US11468031B1 (en) Methods and apparatus for efficiently scaling real-time indexing
CN107526746B (en) Method and apparatus for managing document index
JP6159908B1 (en) Method, program, and system for automatic discovery of relationships between fields in a heterogeneous data source mixed environment
CN108959538A (en) Text retrieval system and method
US20170109389A1 (en) Step editor for data preparation
JPWO2017170459A6 (en) Method, program, and system for automatic discovery of relationships between fields in a heterogeneous data source mixed environment
US8548980B2 (en) Accelerating queries based on exact knowledge of specific rows satisfying local conditions
WO2018218504A1 (en) Method and device for data query
JP6006740B2 (en) Index management device
US10528538B2 (en) Leveraging SQL with user defined aggregation to efficiently merge inverted indexes stored as tables
US10762139B1 (en) Method and system for managing a document search index

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 17912140

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 17912140

Country of ref document: EP

Kind code of ref document: A1