CN111797114A - Multi-path cross-class query and optimization method in object proxy database - Google Patents

Multi-path cross-class query and optimization method in object proxy database

Publication number
CN111797114A
Authority
CN
China
Prior art keywords
class
path
expression
query
end point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010589900.XA
Other languages
Chinese (zh)
Other versions
CN111797114B (en)
Inventor
彭煜玮
郏紫宇
兰海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU filed Critical Wuhan University WHU
Priority to CN202010589900.XA priority Critical patent/CN111797114B/en
Publication of CN111797114A publication Critical patent/CN111797114A/en
Application granted granted Critical
Publication of CN111797114B publication Critical patent/CN111797114B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/242Query formulation
    • G06F16/2433Query languages
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2455Query execution
    • G06F16/24552Database cache management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a multi-path cross-class query and optimization method in an object proxy database. Queries that use different path expressions to obtain several target attributes on the same path are defined as multi-path cross-class queries with the same end point class. Queries whose paths contain common class nodes but end in different classes are defined as multi-path cross-class queries with different end point classes. Syntax support is then provided for multi-path cross-class queries of both kinds. Finally, an execution scheme for multi-path cross-class queries is provided: building on the pointer tracking algorithm used by the original cross-class query, a multi-path pointer tracking algorithm is proposed for calculating multi-path expressions. With the invention, a user can obtain different target attribute expressions by using one path expression, redundant representation of the common path expression is reduced, and query efficiency is improved.

Description

Multi-path cross-class query and optimization method in object proxy database
Technical Field
The invention belongs to the technical field of database query optimization, and particularly relates to a multi-path cross-class query and optimization method in an object proxy database.
Background
With the development of cloud computing and big data technology, database applications are continuously evolving; new application fields give database technology new directions and also impose new requirements. Massive data is no longer limited to structured form: semi-structured and unstructured data with complex structure has appeared. The traditional relational data model is mainly used to manage structured data and is therefore inadequate in this situation. Object-oriented databases offer a management scheme for complex data, but because they manage semi-structured and unstructured data through encapsulation, objects are difficult to split and recombine and the flexibility characteristic of the relational model is lost, which leads to inflexible object operations and low data processing efficiency. In view of these problems, the object proxy model ODM (Object Deputy Model) extends the object-oriented model; it has both the flexibility of the relational model and the modeling capability of the object-oriented model, and provides a new solution for storing and managing massive complex data.
The object proxy database ODDB (Object Deputy Database) is a database management system that uses the object proxy model as its data model. ODDB encapsulates attributes and methods into objects, groups objects with the same attributes and methods into classes, and represents different aspects of an object by creating proxy objects for it. Because inheritance relationships exist between objects, the objects are linked by bidirectional pointers according to those relationships, which logically forms a network of objects. The attributes and methods that a proxy object inherits from its source object are called virtual attributes. Virtual attributes are not stored in the system; on access, the source object is found through the bidirectional pointer link, and the virtual attribute value is computed in real time by applying the virtual attribute's switching expression to the source object. Real attributes can also be defined on proxy objects to reflect particular aspects of the object, and real attribute values are stored in the system. Based on the bidirectional pointer links in the object network, a cross-class query starts from an object of a starting point class, searches the classes given by a path for objects that have a direct or indirect proxy relationship with that object, and finally obtains the target attribute value of an end point class object. Existing cross-class queries are implemented on a path expression mechanism: during the query, the objects on the path must be fetched one after another from the starting point object along a path expression instance until the target attribute is extracted from the end point object. This process of navigating and checking along the object chain is also called path expression calculation.
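The relationship between real attributes, virtual attributes and bidirectional pointers described above can be illustrated with a minimal in-memory sketch. It is not the ODDB storage layer: the names `Obj`, `make_proxy`, `Song` and `SongView`, and the use of Python callables as switching expressions, are assumptions made only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass(eq=False)
class Obj:
    """Hypothetical object record; not the actual ODDB layout."""
    oid: int
    cls: str
    real: Dict[str, object] = field(default_factory=dict)   # stored (real) attributes
    source: Optional["Obj"] = None                           # proxy -> source pointer
    proxies: List["Obj"] = field(default_factory=list)       # source -> proxy pointers
    # virtual attribute name -> switching expression applied to the source object
    switch: Dict[str, Callable[["Obj"], object]] = field(default_factory=dict)

    def get(self, name: str) -> object:
        if name in self.real:                  # real attribute: read the stored value
            return self.real[name]
        if name in self.switch and self.source is not None:
            # virtual attribute: follow the pointer and compute the value on access
            return self.switch[name](self.source)
        raise AttributeError(name)


def make_proxy(oid, cls, source, switch, real=None):
    """Create a proxy object and link it to its source in both directions."""
    proxy = Obj(oid, cls, real=dict(real or {}), source=source, switch=dict(switch))
    source.proxies.append(proxy)
    return proxy


# Hypothetical data: a source object and one proxy object.
song = Obj(1, "Song", real={"title": "Song A", "seconds": 180})
view = make_proxy(
    2, "SongView", song,
    switch={"title": lambda s: s.get("title"),
            "minutes": lambda s: s.get("seconds") / 60},
    real={"lyrics": "la la"},
)
```

In this sketch `view.get("minutes")` is never stored; it is recomputed from the source object each time, while `view.get("lyrics")` reads a stored real attribute of the proxy.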
Cross-class query is one of the most important features of ODDB; it gives users an efficient, personalized query service and underpins the efficiency and flexibility of the object proxy database. Shortening the time needed to obtain target attributes and improving the efficiency of path expression calculation therefore further improves cross-class query efficiency, which is significant for the performance of the object proxy database.
In the existing query system of the object proxy database, when several target attributes are to be obtained on the end point class of a path, a user cannot obtain them through a single path expression but has to write one path expression per target attribute; repeatedly writing the same path makes the syntax cumbersome. In addition, in a query command with several paths, the paths often share common portions, and the user has to write the common portion again on each path, which is an unnecessary burden. The existing query execution strategy calculates expression by expression: in a cross-class query over several paths with a common portion, each target attribute is described by its own path expression, and the path expressions are calculated independently of one another. Because the paths share a common portion and each path must complete the full object instance search from the starting point class to the end point class, the objects on the common path are accessed repeatedly, which causes considerable unnecessary overhead.
In summary, the following problems exist in the multipath cross-class query optimization:
(1) When a query contains multiple path expressions with common portions, executing the path expressions independently causes a large number of repeated object accesses. For the requirement of acquiring attributes of different objects in the same object network, there is currently no query function that lets a user obtain different target attributes by writing only one path expression.
(2) In the conventional ODDB, multiple path expressions obtain their target attribute values independently of one another, and a "real-time computation" mode that handles one object at a time is adopted. When the queried proxy class has a deep proxy hierarchy, the paths share relatively many common class nodes, and a large number of objects in the common classes are accessed repeatedly while the target attributes are obtained. Repeated object access requires repeated scans of the bidirectional pointer table, and the frequent accesses result in a large number of I/O operations, which severely affects the efficiency of cross-class queries.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: to enable a user to obtain different target attribute expressions by using one path expression, to reduce redundant representation of the common path expression, and to improve query efficiency.
The technical scheme adopted by the invention for solving the technical problems is as follows: a multi-path cross-class query method in an object proxy database is characterized in that: the following syntax of query pattern is set:
SELECT (C1{Q1} → C2{Q2} → … → Cr{Qr} → [..., ...] → [Ct1{Qt1}, …, Ctn{Qtn}]).[Attr1, …, Attrm] FROM C1 (r ≥ 1, n ≥ 1, m ≥ 2)
wherein C_i represents a node in the directed graph, 1 ≤ i ≤ n, each node occurs at most once, and C_i and C_(i+1) are in a direct proxy relationship; Ct_j represents an end point class, 1 ≤ j ≤ n; Attr_k represents a target attribute to be acquired in an end point class, 1 ≤ k ≤ m; Q_1 to Q_r and Qt_1 to Qt_n all represent condition predicates acting on the corresponding path node classes; one or more consecutive common node classes in the multiple paths are called the common path; when n = 1 and m ≥ 2, the query is a multi-path cross-class query with the same end point class, and multiple target attributes are obtained in the same end point class; when m = n ≥ 2, the query is a multi-path cross-class query with different end point classes, different target attributes are obtained in different end point classes, the number of end point classes is the same as the number of target attribute expressions, and they correspond one to one;
according to the query mode, an algorithm suitable for multipath expression calculation is provided as follows:
Step 1.1, judging the type of the multi-path expression, and if it is a multi-path expression with different end point classes, jumping to step 1.7;
Step 1.2, sequentially selecting each object O_i in the starting point class C_i as a starting point, and treating each path expression as an attribute expression of the starting point class;
Step 1.3, obtaining from the bidirectional pointer directory the path expression instance PE_i^I that satisfies the predicate conditions on the path;
Step 1.4, for object instances with an object ratio of 1:N, setting a flag array and caching them;
Step 1.5, calculating each of the attribute expressions in the attribute list to obtain each Attr_k;
Step 1.6, if an object instance with a flag mark remains in the cache array, jumping to step 1.4 to take the next object instance; otherwise jumping to step 1.14;
Step 1.7, sequentially selecting each object O_i in the starting point class C_i as a starting point, and treating each path expression as an attribute expression of the starting point class;
Step 1.8, obtaining from the bidirectional pointer directory the path expression instance PE_i^I that satisfies the predicate conditions on the path;
Step 1.9, when the next class node of the current class node is not unique, setting a flag array and caching;
Step 1.10, for object instances with an object ratio of 1:N, setting a flag2 flag array and caching them;
Step 1.11, calculating each of the attribute expressions in the attribute list to obtain each Attr_k;
Step 1.12, if an object instance marked by flag2 remains in the cache array, jumping to step 1.10 to take the next object instance;
Step 1.13, if a class node with a flag mark remains in the cache array, jumping to step 1.9 to take the next class node;
Step 1.14, combining the target attributes into a result and returning the result to the user.
The optimization method according to the multi-path cross-class query method in the object proxy database is characterized in that: the method comprises the following steps:
2.1, the radial parallelization optimization method comprises the following sub-steps:
Step 2.11, selecting an intermediate class as the starting class node, wherein the intermediate class is the x-th class node and x is len/2 rounded down;
Step 2.12, sequentially selecting each object O_i in the intermediate class C_i as a starting point, treating each path expression as an attribute expression of the starting point class, and starting thread 1 and thread 2 simultaneously;
Step 2.13, thread 1 testing the connectivity of the path toward the starting point class in sequence according to the bidirectional pointer directory;
Step 2.14, thread 2 obtaining from the bidirectional pointer directory the path expression instance PE_i^I that satisfies the predicate conditions on the path;
Step 2.15, for object instances with an object ratio of 1:N, setting a flag array and caching them;
Step 2.16, if one thread detects an invalid path expression, stopping its search and notifying the other thread to stop;
Step 2.17, calculating each of the attribute expressions in the attribute list to obtain each Attr_k;
Step 2.18, if an object instance with a flag mark remains in the cache array, jumping to step 2.13 to take the next object instance; otherwise continuing;
Step 2.19, combining the target attributes into a result and returning the result to the user;
2.2, the parallelization scheme for the end point class comprises the following sub-steps:
Step 2.21, selecting the last class in the common prefix path of the multi-path expression as the starting class node;
Step 2.22, sequentially selecting each object O_i in the starting class C_i as a starting point, treating each path expression as an attribute expression of the starting point class, and starting N+1 threads simultaneously according to the number N of class branch nodes;
Step 2.23, thread 1 testing the connectivity of the path toward the starting point class in sequence according to the bidirectional pointer directory;
Step 2.24, the remaining N threads simultaneously obtaining from the bidirectional pointer directory the path expression instances PE_i^(I_i) of the N paths toward the end point classes;
Step 2.25, for object instances with an object ratio of 1:N, setting a flag array and caching them;
Step 2.26, if one thread detects an invalid path expression, stopping its search and notifying the other threads to stop;
Step 2.27, calculating each of the attribute expressions in the attribute list to obtain each Attr_k;
Step 2.28, if an object instance with a flag mark remains in the cache array, jumping to step 2.23 to take the next object instance; otherwise continuing;
Step 2.29, combining the target attributes into a result and returning the result to the user.
The invention has the beneficial effects that:
1. By defining a new query syntax and providing a corresponding algorithm, the invention reduces the number of times a user must enter repeated paths in cross-class queries with the same end point class, which fundamentally reduces redundant representation of the common path expression; obtaining the target attributes of the end point class in a single pass reduces the number of path expression calculations, which improves the efficiency of multi-path cross-class queries. In cross-class queries with different end point classes, reducing the number of times the user must enter the common path lightens the user's writing burden. In multi-path expression calculation, reusing the class nodes on the common path reduces redundant object traversal, which improves the efficiency of multi-path cross-class queries.
2. A parallel path search algorithm is provided as an optimization on top of the multi-path pointer tracking algorithm: a radial parallelization scheme is used for multi-path cross-class queries with the same end point class, and an end-point-class-oriented parallelization scheme is provided for multi-path cross-class queries with different end point classes. When searching valid paths, both schemes traverse more objects per unit time because they search in parallel, which speeds up the calculation of the whole path expression. When searching object instances on an invalid path, as soon as one of the threads recognizes the path as invalid, it notifies the other threads through thread communication to terminate their search, which reduces traversal of invalid objects.
Drawings
FIG. 1 is a database schema according to an embodiment of the present invention.
Detailed Description
The invention is further illustrated by the following specific examples and figures.
To let queries containing several paths efficiently obtain different target attributes in an object proxy database, the invention first defines the multi-path cross-class query and provides an execution mechanism for it. Then, based on the characteristics of multi-path expressions, the invention provides a parallel path search optimization scheme for multi-path expressions, which accelerates access to the objects in the nodes on the path, reduces access to objects on invalid paths, and follows the principle of reducing the number of path expression calculations as far as possible.
(1) Multi-path cross-class query
To let a user obtain different target attributes with one path expression in a query, the invention provides a multi-path cross-class query mode based on multi-path expressions. Queries that use different path expressions to obtain several target attributes on the same path are defined as multi-path cross-class queries with the same end point class. Queries whose paths contain common class nodes but end in different classes are defined as multi-path cross-class queries with different end point classes. Syntax support is then provided for multi-path cross-class queries of both kinds. Finally, an execution scheme for multi-path cross-class queries is provided: building on the pointer tracking algorithm used by the original cross-class query, a multi-path pointer tracking algorithm is proposed for calculating multi-path expressions.
(2) Parallel path search optimization scheme
In order to improve the efficiency of the calculation of the multi-path expression, the invention provides a parallel path searching optimization scheme, wherein the parallel path searching optimization scheme comprises a radial parallelization scheme and a parallelization scheme facing to an end point class.
For the multi-path cross-class query with the same end point class, the invention provides a radial parallelization scheme. In this scheme, the intermediate class node is selected as the starting class. Two threads are started and search toward the starting point class and toward the end point class respectively: the thread searching toward the starting point class tests the connectivity of the path, and the thread searching toward the end point class obtains object instances and calculates the target attribute expressions.
For the multi-path cross-class query with different end point classes, the invention provides an end-point-class-oriented parallelization scheme. In this scheme, the starting class is the last class node on the common path of the multiple paths. Several threads are started and search toward the starting point class and toward the different end point classes. The thread searching toward the starting point class is still used to test the connectivity of the path; in addition, as many threads as there are class branch nodes are started to obtain the target attributes from the different end point classes.
Specifically, the invention provides a multi-path cross-class query method in an object proxy database, which sets a query mode with the following syntax:
SELECT (C1{Q1} → C2{Q2} → … → Cr{Qr} → [..., ...] → [Ct1{Qt1}, …, Ctn{Qtn}]).[Attr1, …, Attrm] FROM C1 (r ≥ 1, n ≥ 1, m ≥ 2)
wherein C_i represents a node in the directed graph, 1 ≤ i ≤ n, each node occurs at most once, and C_i and C_(i+1) are in a direct proxy relationship; Ct_j represents an end point class, 1 ≤ j ≤ n; Attr_k represents a target attribute to be acquired in an end point class, 1 ≤ k ≤ m; Q_1 to Q_r and Qt_1 to Qt_n all represent condition predicates acting on the corresponding path node classes; one or more consecutive common node classes in the multiple paths are called the common path; when n = 1 and m ≥ 2, the query is a multi-path cross-class query with the same end point class, and multiple target attributes are obtained in the same end point class; when m = n ≥ 2, the query is a multi-path cross-class query with different end point classes, different target attributes are obtained in different end point classes, the number of end point classes is the same as the number of target attribute expressions, and the end point classes and target attribute expressions correspond one to one.
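The constraints on r, n and m in the definition above can be checked mechanically. The following is a minimal sketch of one possible in-memory form of the query; the class names `Node` and `MultiPathQuery` are hypothetical, the optional intermediate branch `[..., ...]` of the syntax is left out, and predicates are modeled as optional Python callables.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

Predicate = Optional[Callable[[dict], bool]]


@dataclass
class Node:
    cls: str                  # class name, e.g. C1 or Ct1
    pred: Predicate = None    # condition predicate Q acting on that class


@dataclass
class MultiPathQuery:
    """Hypothetical parsed form of SELECT (C1 -> ... -> Cr -> [Ct1..Ctn]).[Attr1..Attrm]."""
    common: List[Node]        # C1{Q1} ... Cr{Qr}
    end_points: List[Node]    # Ct1{Qt1} ... Ctn{Qtn}
    attrs: List[str]          # Attr1 ... Attrm

    def kind(self) -> str:
        r, n, m = len(self.common), len(self.end_points), len(self.attrs)
        if r < 1 or n < 1 or m < 2:
            raise ValueError("syntax requires r >= 1, n >= 1, m >= 2")
        if n == 1:
            return "same end point class"          # n = 1, m >= 2
        if m == n:
            return "different end point classes"   # m = n >= 2, one attribute per class
        raise ValueError("with n >= 2 end point classes, m must equal n")


prefix = [Node("Album_Pic"), Node("Album"), Node("Music_Lyr")]
same = MultiPathQuery(prefix, [Node("Music")], ["Music_url", "album"])
diff = MultiPathQuery(prefix, [Node("Music"), Node("Artist")], ["Music_url", "rank"])
```

Here `same.kind()` and `diff.kind()` distinguish the two query types, which is the decision an evaluator has to make before choosing an execution branch.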
According to the above definition, the common path in a multi-path expression falls into two cases: 1) the common path is identical to every path represented by the path expression, meaning that the multi-path expression contains several completely identical paths and the cross-class query needs to obtain several different target attributes on the same end point class; 2) the common path is a common prefix of the several paths represented by the expression, in which case the cross-class query needs to obtain target attributes on several different end point classes.
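The two cases can be told apart by computing the longest common prefix of the individual paths. The sketch below assumes each path is given as a plain list of class names, as a user would have written them one by one in the existing system; `common_prefix` and `classify` are hypothetical helper names.

```python
from typing import List, Sequence, Tuple


def common_prefix(paths: Sequence[Sequence[str]]) -> List[str]:
    """Longest run of class nodes shared by all paths, starting at the first node."""
    prefix: List[str] = []
    for column in zip(*paths):          # zip stops at the shortest path
        if all(c == column[0] for c in column):
            prefix.append(column[0])
        else:
            break
    return prefix


def classify(paths: Sequence[Sequence[str]]) -> Tuple[str, List[str]]:
    prefix = common_prefix(paths)
    if all(len(p) == len(prefix) for p in paths):
        # case 1: the common path is the whole path, so all paths are identical
        return "same end point class", prefix
    # case 2: the common path is only a prefix; the paths end in different classes
    return "different end point classes", prefix


p_url = ["Album_Pic", "Album", "Music_Lyr", "Music"]
p_album = ["Album_Pic", "Album", "Music_Lyr", "Music"]
p_rank = ["Album_Pic", "Album", "Music_Lyr", "Artist"]
```

With these inputs, `classify([p_url, p_album])` falls into case 1 and `classify([p_url, p_rank])` into case 2, with `Album_Pic → Album → Music_Lyr` as the common path.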
Because each path expression is calculated independently in the existing query system of the object proxy database, an algorithm suitable for calculating the multipath expression needs to be provided according to the definition of the multipath cross-class query as follows:
Step 1.1, judging the type of the multi-path expression, and if it is a multi-path expression with different end point classes, jumping to step 1.7;
Step 1.2, sequentially selecting each object O_i in the starting point class C_i as a starting point, and treating each path expression as an attribute expression of the starting point class;
Step 1.3, obtaining from the bidirectional pointer directory the path expression instance PE_i^I that satisfies the predicate conditions on the path;
Step 1.4, for object instances with an object ratio of 1:N, setting a flag array and caching them;
Step 1.5, calculating each of the attribute expressions in the attribute list to obtain each Attr_k;
Step 1.6, if an object instance with a flag mark remains in the cache array, jumping to step 1.4 to take the next object instance; otherwise jumping to step 1.14;
Step 1.7, sequentially selecting each object O_i in the starting point class C_i as a starting point, and treating each path expression as an attribute expression of the starting point class;
Step 1.8, obtaining from the bidirectional pointer directory the path expression instance PE_i^I that satisfies the predicate conditions on the path;
Step 1.9, when the next class node of the current class node is not unique, setting a flag array and caching;
Step 1.10, for object instances with an object ratio of 1:N, setting a flag2 flag array and caching them;
Step 1.11, calculating each of the attribute expressions in the attribute list to obtain each Attr_k;
Step 1.12, if an object instance marked by flag2 remains in the cache array, jumping to step 1.10 to take the next object instance;
Step 1.13, if a class node with a flag mark remains in the cache array, jumping to step 1.9 to take the next class node;
Step 1.14, combining the target attributes into a result and returning the result to the user.
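The core idea of steps 1.1 to 1.14, walking the common path once and only then branching to the end point classes, can be sketched sequentially as follows. This is a simplified illustration under stated assumptions rather than the patented procedure: the bidirectional pointer directory is modeled as a dictionary keyed by (object id, neighbouring class), a frontier list stands in for the flag and cache arrays that handle 1:N object ratios, each start object yields one result row with a list of values per attribute, and the sample data is invented.

```python
from typing import Callable, Dict, List, Optional, Tuple

Pred = Optional[Callable[[dict], bool]]
Objects = Dict[int, Tuple[str, dict]]          # oid -> (class name, attributes)
Pointers = Dict[Tuple[int, str], List[int]]    # (oid, neighbour class) -> linked oids


def follow(objects, pointers, frontier, cls, pred, stats):
    """Move every object in `frontier` one class further along the path."""
    reached = []
    for oid in frontier:
        for target in pointers.get((oid, cls), []):
            stats["accesses"] += 1             # count one object access
            if pred is None or pred(objects[target][1]):
                reached.append(target)
    return reached


def multipath_track(objects, pointers, common, end_points, attrs):
    """common / end_points: [(class, predicate)]; attrs: [(end point class, attribute)]."""
    start_cls, start_pred = common[0]
    stats, rows = {"accesses": 0}, []
    for oid, (cls, values) in objects.items():
        if cls != start_cls or (start_pred is not None and not start_pred(values)):
            continue
        frontier = [oid]                       # plays the role of the cache array
        for cls_i, pred_i in common[1:]:       # the common path is walked only once
            frontier = follow(objects, pointers, frontier, cls_i, pred_i, stats)
        reached = {c: follow(objects, pointers, frontier, c, p, stats)
                   for c, p in end_points}     # branch to each end point class
        if any(not objs for objs in reached.values()):
            continue                           # invalid path instance: no result row
        row = {"start": oid}
        for end_cls, attr in attrs:
            row[attr] = [objects[t][1][attr] for t in reached[end_cls]]
        rows.append(row)
    return rows, stats["accesses"]


# Invented sample data loosely following the class names used in this document.
objects = {
    1: ("Album_Pic", {}), 2: ("Album", {}),
    3: ("Music_Lyr", {}), 4: ("Music_Lyr", {}),
    5: ("Music", {"Music_url": "u1"}), 6: ("Music", {"Music_url": "u2"}),
    7: ("Artist", {"rank": 3}), 8: ("Artist", {"rank": 5}),
}
pointers = {
    (1, "Album"): [2], (2, "Music_Lyr"): [3, 4],
    (3, "Music"): [5], (4, "Music"): [6],
    (3, "Artist"): [7], (4, "Artist"): [8],
}
rows, accesses = multipath_track(
    objects, pointers,
    common=[("Album_Pic", None), ("Album", None), ("Music_Lyr", None)],
    end_points=[("Music", None), ("Artist", None)],
    attrs=[("Music", "Music_url"), ("Artist", "rank")],
)
```

On this data the shared walk touches 7 objects. Evaluating the two paths independently would touch 5 objects each, 10 in total, because the Album and Music_Lyr objects on the common path are visited twice.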
As the above steps show, the efficiency of the multi-path cross-class query algorithm is mainly determined by the number of expressions to be calculated, the number of objects in the starting point class, and the access time of the objects in the class nodes on the path. Taking the calculation characteristics of multi-path expressions into account, this technical scheme provides two optimization strategies based on parallel path search, aimed at shortening object access time in the class nodes on the path and at reducing the number of object accesses on invalid paths: a radial parallelization scheme and an end-point-class-oriented parallelization scheme.
According to the optimization method of the multi-path cross-class query method in the object proxy database, the method comprises the following steps:
2.1, the radial parallelization optimization method comprises the following sub-steps:
Step 2.11, selecting an intermediate class as the starting class node, wherein the intermediate class is the x-th class node and x is len/2 rounded down;
Step 2.12, sequentially selecting each object O_i in the intermediate class C_i as a starting point, treating each path expression as an attribute expression of the starting point class, and starting thread 1 and thread 2 simultaneously;
Step 2.13, thread 1 testing the connectivity of the path toward the starting point class in sequence according to the bidirectional pointer directory;
Step 2.14, thread 2 obtaining from the bidirectional pointer directory the path expression instance PE_i^I that satisfies the predicate conditions on the path;
Step 2.15, for object instances with an object ratio of 1:N, setting a flag array and caching them;
Step 2.16, if one thread detects an invalid path expression, stopping its search and notifying the other thread to stop;
Step 2.17, calculating each of the attribute expressions in the attribute list to obtain each Attr_k;
Step 2.18, if an object instance with a flag mark remains in the cache array, jumping to step 2.13 to take the next object instance; otherwise continuing;
Step 2.19, combining the target attributes into a result and returning the result to the user.
2.2, the parallelization scheme for the end point class comprises the following sub-steps:
Step 2.21, selecting the last class in the common prefix path of the multi-path expression as the starting class node;
Step 2.22, sequentially selecting each object O_i in the starting class C_i as a starting point, treating each path expression as an attribute expression of the starting point class, and starting N+1 threads simultaneously according to the number N of class branch nodes;
Step 2.23, thread 1 testing the connectivity of the path toward the starting point class in sequence according to the bidirectional pointer directory;
Step 2.24, the remaining N threads simultaneously obtaining from the bidirectional pointer directory the path expression instances PE_i^(I_i) of the N paths toward the end point classes;
Step 2.25, for object instances with an object ratio of 1:N, setting a flag array and caching them;
Step 2.26, if one thread detects an invalid path expression, stopping its search and notifying the other threads to stop;
Step 2.27, calculating each of the attribute expressions in the attribute list to obtain each Attr_k;
Step 2.28, if an object instance with a flag mark remains in the cache array, jumping to step 2.23 to take the next object instance; otherwise continuing;
Step 2.29, combining the target attributes into a result and returning the result to the user.
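The thread coordination of the radial scheme (steps 2.11 to 2.19) can be sketched with Python's `threading` module. Several things here are assumptions: `len` is taken to be the number of class nodes on the path and x is read as a 1-based position, the pointer directory is a dictionary holding links in both directions, predicates are omitted, and the data is invented. In CPython the global interpreter lock means this sketch shows the early-termination protocol rather than a real speedup.

```python
import threading


def walk(pointers, oid, classes, stop, out, key):
    """Follow pointers from `oid` through `classes`; signal `stop` on a dead end."""
    frontier = [oid]
    for cls in classes:
        if stop.is_set():          # the other thread already found the path invalid
            return
        frontier = [t for o in frontier for t in pointers.get((o, cls), [])]
        if not frontier:
            stop.set()             # invalid path: notify the other thread
            return
    out[key] = frontier


def radial_eval(objects, pointers, path, attrs):
    x = max(1, len(path) // 2)     # x = len/2 rounded down (1-based position)
    mid = x - 1
    backward = path[:mid][::-1]    # toward the starting point class
    forward = path[mid + 1:]       # toward the end point class
    rows = []
    for oid, (cls, _values) in objects.items():
        if cls != path[mid]:
            continue
        stop, out = threading.Event(), {}
        t1 = threading.Thread(target=walk, args=(pointers, oid, backward, stop, out, "back"))
        t2 = threading.Thread(target=walk, args=(pointers, oid, forward, stop, out, "fwd"))
        t1.start()
        t2.start()
        t1.join()
        t2.join()
        if stop.is_set():
            continue               # at least one direction was invalid
        for end in out["fwd"]:     # compute every target attribute on the end object
            rows.append({a: objects[end][1][a] for a in attrs})
    return rows


# Invented sample data: Album 9 has no link back to an Album_Pic object.
objects = {
    1: ("Album_Pic", {}), 2: ("Album", {}), 3: ("Music_Lyr", {}),
    5: ("Music", {"Music_url": "u1", "album": "A"}),
    9: ("Album", {}), 10: ("Music_Lyr", {}),
    11: ("Music", {"Music_url": "u2", "album": "B"}),
}
pointers = {
    (2, "Album_Pic"): [1], (2, "Music_Lyr"): [3], (3, "Music"): [5],
    (9, "Music_Lyr"): [10], (10, "Music"): [11],
}
result = radial_eval(objects, pointers,
                     ["Album_Pic", "Album", "Music_Lyr", "Music"],
                     ["Music_url", "album"])
```

For the four-node path the intermediate class is the second node, Album. Album 9 is discarded because the backward thread finds no Album_Pic object, even though its forward path reaches a Music object.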
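The end-point-class-oriented scheme (steps 2.21 to 2.29) differs from the radial scheme mainly in where the search starts and in how many threads are launched. The sketch below makes the same assumptions as before (dictionary-based pointer directory with links in both directions, no predicates, invented data, hypothetical function names) and starts one backward thread plus one thread per end point class, N+1 in total.

```python
import threading


def search(pointers, oid, classes, stop, out, key):
    """Follow pointers from `oid` through `classes`; signal `stop` on a dead end."""
    frontier = [oid]
    for cls in classes:
        if stop.is_set():
            return
        frontier = [t for o in frontier for t in pointers.get((o, cls), [])]
        if not frontier:
            stop.set()                      # invalid path: notify all other threads
            return
    out[key] = frontier


def endpoint_eval(objects, pointers, prefix, targets):
    """prefix: classes on the common path; targets: [(end point class, attribute)]."""
    pivot_cls = prefix[-1]                  # last class on the common prefix path
    backward = prefix[-2::-1]               # remaining prefix, toward the start class
    rows = []
    for oid, (cls, _values) in objects.items():
        if cls != pivot_cls:
            continue
        stop, out = threading.Event(), {}
        jobs = [("back", backward)]         # thread 1: connectivity test
        jobs += [(k, [end_cls]) for k, (end_cls, _attr) in enumerate(targets)]
        threads = [threading.Thread(target=search,
                                    args=(pointers, oid, classes, stop, out, key))
                   for key, classes in jobs]
        for th in threads:
            th.start()
        for th in threads:
            th.join()
        if stop.is_set():
            continue                        # some branch was invalid
        rows.append({attr: [objects[t][1][attr] for t in out[k]]
                     for k, (_end_cls, attr) in enumerate(targets)})
    return rows


# Invented sample data: Music_Lyr 4 has no linked Artist object.
objects = {
    1: ("Album_Pic", {}), 2: ("Album", {}),
    3: ("Music_Lyr", {}), 4: ("Music_Lyr", {}),
    5: ("Music", {"Music_url": "u1"}), 6: ("Music", {"Music_url": "u2"}),
    7: ("Artist", {"rank": 3}),
}
pointers = {
    (3, "Album"): [2], (4, "Album"): [2], (2, "Album_Pic"): [1],
    (3, "Music"): [5], (3, "Artist"): [7], (4, "Music"): [6],
}
result = endpoint_eval(objects, pointers,
                       ["Album_Pic", "Album", "Music_Lyr"],
                       [("Music", "Music_url"), ("Artist", "rank")])
```

Starting at the branching class means the common prefix is verified once per pivot object, while each end point class gets its own thread. Music_Lyr 4 produces no row because the Artist thread reports an invalid path.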
The invention thus provides a multi-path cross-class query optimization technique for object proxy databases.
In the Music database shown in fig. 1, for example, (Album _ Pic → Album → Music _ Lyr → Music) [ Music _ url, Album ] represents a multipath expression with the same end point class, taking Album _ Pic as a start point class, sequentially selecting each object in the class, obtaining a path expression instance satisfying a predicate condition on a path from a bidirectional pointer directory, setting a flag array and caching the flag array when an object ratio is 1: N, and then taking the next object instance from the cache array after the calculation is finished. And finally, selecting objects meeting the conditions in the Music end point class to respectively calculate the values of Music _ url and album attributes. If (Album _ Pic → Album → Music _ Lyr → Music _ Artist) → [ Music _ url, attribute ]) represents a multipath expression with different end point classes, with Album _ Pic as a starting point class, setting buffer arrays and marks for the case of the object ratio 1: N and the case of the class node 1: N, the calculation method is similar to the method, finally, the object meeting the condition is selected from the Music end point class to calculate the value of the attribute of the Music _ url, and the object meeting the condition is selected from the attribute end point class to calculate the value of the attribute of the rank.
For example, in the Music database shown in Fig. 1, (Album_Pic → Album → Music_Lyr → Music).[Music_url, album] is a multi-path expression with the same end point class. The radial parallelization scheme starts from the intermediate class Album and selects each object in that class in turn. It starts one thread that searches toward the Album_Pic start point class to test path connectivity, and at the same time starts another thread that obtains, from the bidirectional pointer directory, the path expression instances satisfying the predicate conditions on the path. If either thread finds the path expression invalid while the multi-path expression is being calculated, it terminates its search and notifies the other thread to terminate. Where the object ratio is 1:N, a flag array is set and the object instances are cached; once one calculation finishes, the next object instance is taken from the cache array. Finally, the objects in the Music end point class that satisfy the conditions are selected, and the values of the Music_url and album attributes are calculated for each of them.
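How the intermediate class splits the path into two search directions can be sketched as follows. The sketch assumes that len is the number of class nodes on the path and that x is a 1-based position, which matches the example above (four classes, x = 2, intermediate class Album); the function name `split_at_middle` is hypothetical.

```python
def split_at_middle(path):
    """Return (intermediate class, backward segment, forward segment)."""
    x = max(len(path) // 2, 1)   # x = len/2 rounded down, 1-based position
    middle = path[x - 1]
    backward = path[:x][::-1]    # from the intermediate class back to the start class
    forward = path[x - 1:]       # from the intermediate class on to the end class
    return middle, backward, forward

path = ["Album_Pic", "Album", "Music_Lyr", "Music"]
middle, backward, forward = split_at_middle(path)
```

Here `backward` is the segment the connectivity-testing thread would follow and `forward` is the segment the instance-collecting thread would follow.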
For example, in the Music database shown in Fig. 1, (Album_Pic → Album → Music_Lyr → [Music, Artist]).[Music_url, rank] is a multi-path expression with different end point classes. The parallelization scheme for end point classes starts from the Music_Lyr class and selects each object in that class in turn. It starts one thread that searches toward the Album_Pic start point class to test path connectivity, and at the same time starts two further threads that search toward the Music class and the Artist class respectively and obtain, from the bidirectional pointer directory, the path expression instances satisfying the predicate conditions on the paths. If any thread finds the path expression invalid while the multi-path expression is being calculated, it terminates its search and notifies the other threads to terminate. Cache arrays and marks are set both for the case of an object ratio of 1:N and for the case of a class node ratio of 1:N, and the calculation proceeds as described above. Finally, the objects in the Music end point class that satisfy the conditions are used to calculate the value of the Music_url attribute, and the objects in the Artist end point class that satisfy the conditions are used to calculate the value of the rank attribute.
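The choice of Music_Lyr as the starting class, and the count of three threads, can be illustrated by computing the common prefix of the individual paths. This is a small sketch under the assumption that each path is written out as a list of class names; `common_prefix` is a hypothetical helper.

```python
def common_prefix(paths):
    """Return the longest run of leading class nodes shared by all paths."""
    prefix = []
    for nodes in zip(*paths):
        if all(node == nodes[0] for node in nodes):
            prefix.append(nodes[0])
        else:
            break
    return prefix

paths = [
    ["Album_Pic", "Album", "Music_Lyr", "Music"],
    ["Album_Pic", "Album", "Music_Lyr", "Artist"],
]
prefix = common_prefix(paths)
start_class = prefix[-1]        # last class of the common prefix path
thread_count = len(paths) + 1   # one connectivity thread plus one per branch
```

For the example expression this gives Music_Lyr as the starting class and three threads, in line with the description above.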
The scheme provided by the invention addresses the low efficiency of multi-path cross-class queries in an object proxy database and thereby contributes to improving the performance of the object proxy database.
The above embodiments merely illustrate the design idea and features of the present invention so that those skilled in the art can understand and implement it; the protection scope of the present invention is not limited to these embodiments. All equivalent changes and modifications made in accordance with the principles and concepts disclosed herein therefore fall within the protection scope of the present invention.

Claims (2)

1. A multi-path cross-class query method in an object proxy database, characterized in that the following query pattern syntax is set:
SELECT (C1{Q1} → C2{Q2} → … → Cr{Qr} → […,…] → [Ct1{Qt1}, …, Ctn{Qtn}]).[Attr1, …, Attrm] FROM C1 (r ≥ 1, n ≥ 1, m ≥ 2)
wherein Ci represents a node in the directed graph, 1 ≤ i ≤ n, each node occurs at most once, and Ci and Ci+1 are in a direct proxy relationship; Ctj represents an end point class, 1 ≤ j ≤ n; Attrk represents a target attribute to be obtained in an end point class, 1 ≤ k ≤ m; Q1 to Qr and Qt1 to Qtn all represent condition predicates acting on the corresponding path node classes; one or more consecutive common node classes in the multiple paths are called a common path; when n = 1 and m ≥ 2, the query is a multi-path cross-class query with the same end point class, and multiple target attributes are obtained in that one end point class; when n = m ≥ 2, the query is a multi-path cross-class query with different end point classes, different target attributes are obtained in different end point classes, and the number of end point classes equals the number of target attribute expressions, the two being in one-to-one correspondence;
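The two query types distinguished above can be told apart from the counts n and m alone. The following Python sketch shows one hypothetical in-memory representation of the query pattern; the class name `MultiPathQuery`, its fields, and the returned labels are assumptions made for illustration and are not part of the claimed syntax.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class MultiPathQuery:
    common_path: List[str]   # C1 ... Cr
    end_classes: List[str]   # Ct1 ... Ctn
    attrs: List[str]         # Attr1 ... Attrm

    def kind(self) -> str:
        n, m = len(self.end_classes), len(self.attrs)
        if n < 1 or m < 2:
            raise ValueError("the pattern requires n >= 1 and m >= 2")
        if n == 1:
            return "same end point class"
        if n == m:
            # End point classes and attributes correspond one to one.
            return "different end point classes"
        raise ValueError("with several end point classes, n must equal m")

same = MultiPathQuery(["Album_Pic", "Album", "Music_Lyr"],
                      ["Music"], ["Music_url", "album"])
different = MultiPathQuery(["Album_Pic", "Album", "Music_Lyr"],
                           ["Music", "Artist"], ["Music_url", "rank"])
```

A dispatcher along these lines corresponds to the type check that the algorithm performs in its first step before choosing between the two evaluation branches.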
According to the query pattern, an algorithm suitable for multi-path expression calculation is provided as follows:
Step 1.1, determining the type of the multi-path expression, and if it is a multi-path expression with different end point classes, jumping to step 1.7;
Step 1.2, selecting in turn each object Oi in the start point class Ci as a starting point, and treating each path expression as an attribute expression of the start point class;
Step 1.3, obtaining, from the bidirectional pointer directory, the path expression instance PEiI that satisfies the predicate conditions on the path;
Step 1.4, for an object instance whose object ratio is 1:N, setting a flag array and caching the instance;
Step 1.5, evaluating each attribute expression in the attribute list to obtain each Attrk;
Step 1.6, if an object instance in the cache array carries a flag mark, jumping to step 1.4 to take the next object instance; otherwise, jumping to step 1.14;
Step 1.7, selecting in turn each object Oi in the start point class Ci as a starting point, and treating each path expression as an attribute expression of the start point class;
Step 1.8, obtaining, from the bidirectional pointer directory, the path expression instance PEiI that satisfies the predicate conditions on the path;
Step 1.9, where the next class node of the current class node is not unique, setting a flag array and caching the class nodes;
Step 1.10, for an object instance whose object ratio is 1:N, setting a flag2 array and caching the instance;
Step 1.11, evaluating each attribute expression in the attribute list to obtain each Attrk;
Step 1.12, if an object instance in the cache array carries a flag2 mark, jumping to step 1.10 to take the next object instance;
Step 1.13, if a class node in the cache array carries a flag mark, jumping to step 1.9 to take the next class node;
Step 1.14, combining the target attributes into a result and returning it to the user.
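The branch for different end point classes (steps 1.7 to 1.13) can be sketched as two nested iterations: one over the class nodes that follow the branching class, and one over the object instances reached in each of them. In this sketch the nested loops stand in for the flag and flag2 cache arrays, and the toy data, the `pointers` dictionary, the `values` dictionary and the way rows are combined (one row per combination of end objects reached from the same branching object) are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical stand-in for the bidirectional pointer directory.
pointers = {
    ("Album_Pic", 1, "Album"): [10],
    ("Album", 10, "Music_Lyr"): [20, 21],   # object ratio 1:N
    ("Music_Lyr", 20, "Music"): [30],
    ("Music_Lyr", 20, "Artist"): [40],
    ("Music_Lyr", 21, "Music"): [31],
    ("Music_Lyr", 21, "Artist"): [],        # no Artist: invalid instance
}
# Hypothetical attribute values of the end point objects.
values = {
    ("Music", 30): {"Music_url": "u/30"},
    ("Music", 31): {"Music_url": "u/31"},
    ("Artist", 40): {"rank": 3},
}

def walk(cls, oid, classes):
    """Follow the given class sequence from one object; return the reached ids."""
    frontier = [oid]
    for nxt in classes:
        frontier = [t for o in frontier for t in pointers.get((cls, o, nxt), [])]
        cls = nxt
    return frontier

def evaluate(start_cls, start_ids, common, ends):
    """common: classes after the start class up to the branching class.
    ends: (end point class, target attribute) pairs, one per branch."""
    branch_cls = common[-1] if common else start_cls
    rows = []
    for start in start_ids:
        for b in walk(start_cls, start, common):
            per_end = [
                [values[(end_cls, e)][attr]
                 for e in pointers.get((branch_cls, b, end_cls), [])]
                for end_cls, attr in ends    # class node ratio 1:N
            ]
            rows.extend(product(*per_end))   # empty if any branch is empty
    return rows

rows = evaluate("Album_Pic", [1], ["Album", "Music_Lyr"],
                [("Music", "Music_url"), ("Artist", "rank")])
```

In the toy data only the Music_Lyr object 20 reaches both a Music and an Artist object, so only that instance contributes a (Music_url, rank) row.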
2. A method for optimizing the multi-path cross-class query method in the object proxy database according to claim 1, characterized by comprising the following:
2.1, a radial parallelization optimization method comprising the following sub-steps:
Step 2.11, selecting an intermediate class as the starting class node, wherein the intermediate class is the x-th class node and x is len/2 rounded down;
Step 2.12, selecting in turn each object Oi in the intermediate class Ci as a starting point, treating each path expression as an attribute expression of the start point class, and starting thread 1 and thread 2 simultaneously;
Step 2.13, thread 1 tests path connectivity back to the start point class step by step according to the bidirectional pointer directory;
Step 2.14, thread 2 obtains, from the bidirectional pointer directory, the path expression instance PEiI that satisfies the predicate conditions on the path;
Step 2.15, for an object instance whose object ratio is 1:N, setting a flag array and caching the instance;
Step 2.16, if one thread detects an invalid path expression, stopping its search and notifying the other thread to stop;
Step 2.17, evaluating each attribute expression in the attribute list to obtain each Attrk;
Step 2.18, if an object instance in the cache array carries a flag mark, jumping to step 2.13 to take the next object instance; otherwise, continuing;
Step 2.19, combining the target attributes into a result and returning it to the user;
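The two-thread control flow of sub-steps 2.12 to 2.16 can be sketched with Python's `threading` module. This is a sketch of the coordination pattern only: the `directory` dictionary, the `link`, `reach` and `radial` helpers and the toy objects are hypothetical, and a `threading.Event` is used here as one possible way for a thread to tell the other to stop.

```python
import threading
from collections import defaultdict

directory = defaultdict(list)  # hypothetical bidirectional pointer directory

def link(src_cls, src, dst_cls, dst):
    """Record a pointer in both directions."""
    directory[(src_cls, src, dst_cls)].append(dst)
    directory[(dst_cls, dst, src_cls)].append(src)

link("Album_Pic", 1, "Album", 10)
link("Album", 10, "Music_Lyr", 20)
link("Music_Lyr", 20, "Music", 30)
link("Album", 11, "Music_Lyr", 21)   # Album 11 has no Album_Pic
link("Music_Lyr", 21, "Music", 31)

def reach(cls, oid, segment, stop):
    """Follow segment from one object; signal stop if the path is invalid."""
    frontier = [oid]
    for nxt in segment:
        if stop.is_set():
            return []                # the other thread found the path invalid
        frontier = [t for o in frontier for t in directory.get((cls, o, nxt), [])]
        if not frontier:
            stop.set()               # notify the other thread
            return []
        cls = nxt
    return frontier

def radial(path, mid_ids):
    x = max(len(path) // 2, 1)       # 1-based position of the intermediate class
    mid = path[x - 1]
    backward, forward = path[:x - 1][::-1], path[x:]
    results = {}
    for oid in mid_ids:
        stop, out = threading.Event(), {}
        def run(name, segment):
            out[name] = reach(mid, oid, segment, stop)
        t1 = threading.Thread(target=run, args=("back", backward))  # connectivity
        t2 = threading.Thread(target=run, args=("fwd", forward))    # instances
        t1.start(); t2.start()
        t1.join(); t2.join()
        if not stop.is_set():
            results[oid] = out["fwd"]
    return results

found = radial(["Album_Pic", "Album", "Music_Lyr", "Music"], [10, 11])
```

In the toy data the Album object 11 has no pointer back to Album_Pic, so the connectivity thread raises the stop signal and that object contributes nothing; only object 10 yields end point objects.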
2.2, a parallelization scheme for end point classes, implemented by the following sub-steps:
Step 2.21, selecting the last class in the common prefix path of the multi-path expression as the starting class node;
Step 2.22, selecting in turn each object Oi in the starting class Ci as a starting point, treating each path expression as an attribute expression of the start point class, and starting N+1 threads simultaneously, where N is the number of class branch nodes;
Step 2.23, thread 1 tests path connectivity back to the start point class step by step according to the bidirectional pointer directory;
Step 2.24, the remaining N threads simultaneously obtain, from the bidirectional pointer directory, the path expression instances PEiIi of the N paths to the end point classes;
Step 2.25, for an object instance whose object ratio is 1:N, setting a flag array and caching the instance;
Step 2.26, if one thread detects an invalid path expression, stopping its search and notifying the other threads to stop;
Step 2.27, evaluating each attribute expression in the attribute list to obtain each Attrk;
Step 2.28, if an object instance in the cache array carries a flag mark, jumping to step 2.23 to take the next object instance; otherwise, continuing;
Step 2.29, combining the target attributes into a result and returning it to the user.
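The N+1 thread arrangement of sub-steps 2.22 to 2.26 can be sketched with a thread pool. As before this is only an illustration: the `directory` dictionary, the `follow` and `endpoint_parallel` functions and the toy objects are hypothetical, and a shared `threading.Event` is assumed as the stop notification.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical bidirectional pointer directory, seen from Music_Lyr.
directory = {
    ("Music_Lyr", 20, "Album"): [10],
    ("Album", 10, "Album_Pic"): [1],
    ("Music_Lyr", 20, "Music"): [30],
    ("Music_Lyr", 20, "Artist"): [40],
    ("Music_Lyr", 21, "Music"): [31],   # 21 has no Album and no Artist
}

def follow(cls, oid, segment, stop):
    """Follow segment from one object; signal stop if the path is invalid."""
    frontier = [oid]
    for nxt in segment:
        if stop.is_set():
            return []
        frontier = [t for o in frontier for t in directory.get((cls, o, nxt), [])]
        if not frontier:
            stop.set()               # notify all other threads
            return []
        cls = nxt
    return frontier

def endpoint_parallel(branch_cls, oid, back_segment, end_segments):
    """Return the end objects per branch, or None if any path is invalid."""
    stop = threading.Event()
    segments = [back_segment] + end_segments        # N + 1 searches
    with ThreadPoolExecutor(max_workers=len(segments)) as pool:
        futures = [pool.submit(follow, branch_cls, oid, seg, stop)
                   for seg in segments]
        reached = [f.result() for f in futures]
    return None if stop.is_set() else reached[1:]

ok = endpoint_parallel("Music_Lyr", 20, ["Album", "Album_Pic"],
                       [["Music"], ["Artist"]])
bad = endpoint_parallel("Music_Lyr", 21, ["Album", "Album_Pic"],
                        [["Music"], ["Artist"]])
```

For Music_Lyr object 20 all three searches succeed and the Music and Artist objects are returned per branch; for object 21 at least one search fails, the stop signal is raised, and the instance is discarded.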
CN202010589900.XA 2020-06-24 2020-06-24 Multi-path cross-class query and optimization method in object proxy database Active CN111797114B (en)

Publications (2)

Publication Number Publication Date
CN111797114A (en) 2020-10-20
CN111797114B CN111797114B (en) 2021-08-31

Family

ID=72803396

Country Status (1)

Country Link
CN (1) CN111797114B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant