CN114647764A - Graph structure query method and device and storage medium - Google Patents

Publication number
CN114647764A
Authority
CN
China
Prior art keywords
vertex
target
coding
query
neighbor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210348471.6A
Other languages
Chinese (zh)
Other versions
CN114647764B (en)
Inventor
李友焕
郑航宇
秦拯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan University
Original Assignee
Hunan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan University filed Critical Hunan University
Priority to CN202210348471.6A priority Critical patent/CN114647764B/en
Publication of CN114647764A publication Critical patent/CN114647764A/en
Application granted granted Critical
Publication of CN114647764B publication Critical patent/CN114647764B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/288 Entity relationship models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a graph structure query method and related equipment, which can reduce the time consumption of graph structure queries. The method comprises the following steps: acquiring an input query set for a graph structure, wherein the input query set comprises at least one input query edge; querying the codes of a first vertex and a second vertex corresponding to a target query edge from a graph structure coding database, wherein the graph structure coding database comprises the codes corresponding to a plurality of vertices including the two vertices of the target query edge, the target query edge is any one query edge in the input query set, and the coding type of each vertex in the plurality of vertices is direct coding or combined coding; determining the coding type of the first vertex and the coding type of the second vertex according to the code of the first vertex and the code of the second vertex; and determining the query result of the target query edge according to the coding type of the first vertex and the coding type of the second vertex.

Description

Graph structure query method and device and storage medium
[ technical field ]
The present application relates to the field of graph structures, and in particular, to a graph structure query method, device and storage medium.
[ background of the invention ]
As a flexible data structure, the graph structure can express complex real-world entity relationships in a concise form and is widely applied in various fields. For example, a financial platform describes information such as transfer transactions among users through vertex-edge relationships, and a network service provider uses a graph structure to abstract the communication relationships among network nodes. A graph query extracts specific associated information by analyzing graph data, and mainly includes vertex-edge queries, path queries, subgraph queries and the like.
As data scale grows, the numbers of edges and nodes in graph data become huge, so that querying large graph data is inefficient. In particular, when processing complex query commands such as multi-hop queries and subgraph queries, the query needs to traverse the neighbor nodes of the starting point and then, in turn, the neighbor nodes of those neighbor nodes, so the cost of this level-by-level expansion is very high. In addition, in graph data abstracted from the real world, the number of neighbors of a node is far smaller than the total number of nodes, so with high probability a node has no edge to any given other node.
Therefore, most of the intermediate results in the query process are irrelevant. For example, when querying the common friends of any two people in a social network, the final result is far smaller than the total number of their friends, and the useless query cost increases the complexity of the graph query.
[ summary of the invention ]
The application provides a graph structure query method, a graph structure query device and a storage medium, which can reduce the time consumption of graph structure query.
A first aspect of the present application provides a method for querying a graph structure, including:
acquiring an input query set for a graph structure, wherein the input query set comprises at least one input query edge;
querying the codes of a first vertex and a second vertex corresponding to a target query edge from a graph structure coding database, wherein the graph structure coding database comprises the codes corresponding to a plurality of vertices including the two vertices of the target query edge, the target query edge is any one query edge in the input query set, and the coding type of each vertex in the plurality of vertices is direct coding or combined coding;
determining the coding type of the first vertex and the coding type of the second vertex according to the coding of the first vertex and the coding of the second vertex;
and determining the query result of the target query edge according to the coding type of the first vertex and the coding type of the second vertex.
A second aspect of the present application provides a graph structure query apparatus, including:
an acquisition unit, configured to acquire an input query set for a graph structure, wherein the input query set comprises at least one input query edge;
a query unit, configured to query the codes of a first vertex and a second vertex corresponding to a target query edge from a graph structure coding database, wherein the graph structure coding database comprises the codes corresponding to a plurality of vertices including the two vertices of the target query edge, the target query edge is any one query edge in the input query set, and the coding type of each vertex in the plurality of vertices is direct coding or combined coding;
a first determining unit, configured to determine a coding type of the first vertex and a coding type of the second vertex according to the coding of the first vertex and the coding of the second vertex;
and the second determining unit is used for determining the query result of the target query edge according to the coding type of the first vertex and the coding type of the second vertex.
A third aspect of embodiments of the present application provides a computer device, which includes at least one connected processor, a memory and a transceiver, where the memory is configured to store program codes, and the processor is configured to call the program codes in the memory to perform the steps of the graph structure query method according to the first aspect.
A fourth aspect of the embodiments of the present application provides a computer storage medium, which includes instructions that, when executed on a computer, cause the computer to perform the steps of the graph structure query method described in any one of the above aspects.
Compared with the related art, in the embodiments provided by the application, the graph data is pre-coded by direct coding and combined coding, and the resulting graph structure codes are stored in the graph structure coding database. When a query edge is input, the coding types of its two vertices can therefore be determined and the query carried out according to those coding types. This balances time and space efficiency while ensuring the correctness of the query result, and reduces the time consumption of graph structure queries.
[ description of the drawings ]
Fig. 1 is an architecture diagram of a graph structure query system according to an embodiment of the present application;
Fig. 2 is a schematic flowchart of a graph structure query method according to an embodiment of the present application;
Fig. 3 is a schematic diagram of the code composition of direct coding according to an embodiment of the present application;
Fig. 4 is a schematic diagram of the code composition of combined coding according to an embodiment of the present application;
Fig. 5 is a schematic virtual structure diagram of a graph structure query device according to an embodiment of the present application;
Fig. 6 is a schematic hardware structure diagram of a server according to an embodiment of the present application.
[ detailed description ]
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments.
The invention aims to provide a graph structure query method and related equipment that pre-code graph data with an efficient low-dimensional graph coding algorithm and, at query time, decode the codes to filter out the queries in the query set for which no edge exists. This balances time and space efficiency while ensuring the correctness of query results, and greatly accelerates graph queries.
To achieve this purpose, the technical scheme adopted by the application comprises three modules: an offline coding module, an online query module and an online update module.
The offline coding module pre-loads the graph data set and the configuration information and codes the graph nodes with a graph coding algorithm.
The online query module decodes the codes of the queried vertex set according to the query content, filters out the part of the query set for which no edge exists, and finally returns the query result.
The online update module maintains the graph coding data, updating it together with database operations such as update, delete and insert.
Referring to fig. 1, fig. 1 is an architecture diagram of a graph structure query system according to an embodiment of the present application, including:
a graph query acceleration device 101, a database device 102, and an input-output device 103;
The input-output device 103 is responsible for data interaction with the graph query acceleration device 101 and the database device 102. In the initial stage, the graph data set is loaded into the offline coding module 101A of the graph query acceleration device 101 for coding. In the query stage, the query set is first input to the online query module 101B; after partial results are returned, the database device 102 is queried, and finally the query result obtained from the database device 102 is returned to the input-output device 103. In the update stage, the update data is first input to the online update module 101C for coding and updating, and then the database device 102 is updated based on the coded update data. The database device 102 includes, but is not limited to, a graph database, a relational database, and the like.
The following describes a method for querying a graph structure from the perspective of a graph structure querying device, which may be a server or a service unit in the server, and is not particularly limited.
Referring to fig. 2, fig. 2 is a schematic flowchart of a query method of a graph structure according to an embodiment of the present application, including:
201. Obtain an input query set for a graph structure.
In this embodiment, the graph structure query device may obtain an input query set for the graph structure, where the input query set includes at least one input query edge. For example, the input query set includes multiple query edges (v1, v2), where v1 is one vertex of a query edge and v2 is the other vertex of that query edge. For multiple query edges in the input query set, it suffices to invoke the procedure for a single query edge (v1, v2) multiple times.
202. Query the codes of the first vertex and the second vertex corresponding to the target query edge from the graph structure coding database.
In this embodiment, the graph structure query device may obtain the codes of the first vertex and the second vertex corresponding to the target query edge from a graph structure coding database, where the graph structure coding database stores the codes corresponding to multiple vertices including two vertices of the target query edge, and the target query edge is any one query edge in the input query set.
In this method, the graph data is coded by means of k-core decomposition and the codes are added to the graph structure coding database. k-core is a commonly used algorithm for mining closely connected subgraph structures in a graph: given a graph G and a parameter k, k-core decomposition divides the graph to obtain the maximal subgraph in which every vertex has degree greater than or equal to k, that is, every vertex in the subgraph is connected by at least k edges to other vertices in the subgraph. The method mainly consists of the following two parts:
1. dividing the graph data with the k-core algorithm to obtain a division result, which reduces redundant information in the coding process;
2. according to the division result, dividing the vertices into removed vertices and vertices of the remaining k-core subgraph, and coding each vertex with a coding mode chosen by its type; the coding modes are mainly direct coding and combined coding.
The coding part first takes a coding length m as input, where one unit of coding length represents 32 bits (throughout this application, a "bit" means a binary digit), so the total code length of one vertex is 32 × m bits. The input parameter m is also used to calculate the parameter k of the k-core algorithm. The coding length m can be set manually or left at its default of 5, in which case the code of each node occupies 160 bits (20 bytes). Graph queries can thus be accelerated effectively with a small coding length, balancing time and space. For convenience of explanation, the following description of the coding process uses m = 5, but it applies equally to other values of m:
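As a quick check of the arithmetic above, the following minimal sketch computes the size of one vertex code from the coding length. The function name `code_size` is ours and does not appear in the application.

```python
def code_size(m: int = 5) -> tuple[int, int]:
    """Return (bits, bytes) occupied by one vertex code of coding length m."""
    bits = 32 * m            # one unit of coding length is 32 bits
    return bits, bits // 8   # 8 bits per byte


print(code_size())           # default m = 5, prints (160, 20)
```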
Step A1: perform vertex serial number mapping on the data point set to obtain a target data set corresponding to the data point set.
In this step, the graph structure query device may read the data point set and perform vertex serial number mapping on it, where ID denotes the original serial number of a vertex and id denotes its serial number after mapping. The first point read gets id = 1, the second point read gets id = 2, and so on until the point set has been read completely. If the size of the data point set is n, the vertex ids run from 1 to n.
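A minimal sketch of this mapping step, assuming the original serial numbers arrive as an iterable in read order. The function name `map_vertex_ids` and the handling of repeated IDs are our assumptions.

```python
def map_vertex_ids(original_ids):
    """Assign id 1, 2, ... to original IDs in the order they are first read."""
    mapping = {}
    for original in original_ids:
        if original not in mapping:               # assumption: a repeated ID keeps its first id
            mapping[original] = len(mapping) + 1  # ids start at 1, as in step A1
    return mapping


mapping = map_vertex_ids(["u42", "a7", "u42", "z"])
print(mapping)  # {'u42': 1, 'a7': 2, 'z': 3}
```

With n distinct original IDs, the mapped ids cover exactly 1 to n.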
Step A2: calculate the maximum number of bits of a vertex identifier and the decomposition parameter corresponding to the target data set.
In this step, the graph structure query device may calculate the maximum number of bits b of a vertex identifier and the k-core decomposition parameter k by the following formulas:
b = [log₂ n],
[Equation image BDA0003578143940000051: formula for the decomposition parameter k; not recoverable from the text]
where m is the coding length, which can be set manually or left at its default.
Step A3, determining the vertex degree corresponding to each data in the target data set.
Step A4, determining the vertex identification of the first target vertex and the vertex identification of the neighbor vertex of the first target vertex, wherein the first target vertex is the vertex with the minimum vertex degree in the target data set.
In this step, the degree of a vertex is the number of edges connected to that vertex. The graph structure query device may calculate the degrees of all vertices in the data point set, sort them in ascending order, and obtain the vertex identifier of the vertex with the smallest degree (that is, the first target vertex) and the vertex identifiers of its neighbor vertices, where a neighbor vertex of the first target vertex is a vertex connected to the first target vertex by an edge.
Step A5: directly code the vertex identifier of the first target vertex and the vertex identifiers of the neighbor vertices of the first target vertex to obtain the code corresponding to the first target vertex.
In this step, direct coding divides the code into three parts. The first part is a flag bit that occupies 1 bit; if the first bit of a vertex code is 0, the vertex is directly coded. The second part lies in the remaining 32 × m - 1 bits and is the bit string formed by the neighbor vertex ids of the first target vertex sorted in ascending order. Each neighbor vertex id occupies b bits and is aligned backwards (backward alignment here means that the length is fixed and, when a neighbor vertex id has fewer bits than that length, the id is filled in from the end toward the front), so that the 2nd bit through the (2 + b - 1)-th bit represent the id of the smallest neighbor vertex. The third part is the remaining bit string, which is filled with 0s.
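The layout just described can be sketched as follows. This is an illustration under our own assumptions, not the application's implementation: the code is modelled as a string of '0'/'1' characters, each id is zero-padded on the left to b bits, and the name `encode_direct` is hypothetical.

```python
def encode_direct(neighbor_ids, b, m=5):
    """Build a direct code as a '0'/'1' string of length 32 * m."""
    total = 32 * m
    ids = sorted(neighbor_ids)                    # neighbor ids in ascending order
    if any(i <= 0 or i >= (1 << b) for i in ids):
        raise ValueError("each id must be positive and fit in b bits")
    if 1 + len(ids) * b > total:
        raise ValueError("too many neighbors for a direct code")
    bits = "0"                                    # part 1: flag bit, 0 = direct coding
    bits += "".join(format(i, f"0{b}b") for i in ids)  # part 2: b bits per id
    return bits.ljust(total, "0")                 # part 3: remainder filled with 0s


code = encode_direct([5, 2], b=4, m=1)
print(code)  # 0 + 0010 + 0101 + 23 zeros
```

Because mapped ids start at 1, an all-zero slot can be told apart from a real neighbor id when the code is read back.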
Step A6: remove the first target vertex and the edges corresponding to the first target vertex from the target data set to obtain a first data set.
In this step, the graph structure query device may remove the vertex with the smallest degree together with the edges connected to it, and reduce by one the degree of each vertex that was connected to it, indicating that the remaining vertices are no longer connected to that vertex. The first data set is thereby obtained.
Step A7: based on the first data set, iteratively execute steps A3 to A6 until the vertex degree of every vertex in the target data set is greater than the decomposition parameter.
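Steps A3 to A7 amount to a peeling loop, which can be sketched as follows. The function name `peel`, the dict-of-sets graph representation, and the tie-breaking by smallest id are our assumptions. The graph is assumed undirected, so each edge appears in both endpoints' neighbor sets. The sketch rescans all degrees on every round for clarity rather than speed.

```python
def peel(adjacency, k):
    """Split vertices into removed ones (with the neighbors they still had
    when removed) and the remaining vertices, all of degree greater than k."""
    adj = {v: set(ns) for v, ns in adjacency.items()}   # work on a copy
    removed = {}
    while adj:
        # A3/A4: pick the vertex with the smallest current degree
        v = min(adj, key=lambda u: (len(adj[u]), u))
        if len(adj[v]) > k:
            break                                       # A7: every remaining degree > k
        removed[v] = sorted(adj[v])                     # A5: these ids go into the direct code
        for u in adj.pop(v):                            # A6: drop v and its edges
            adj[u].discard(v)
    return removed, adj


path = {1: {2}, 2: {1, 3}, 3: {2}}
print(peel(path, k=1))  # ({1: [2], 2: [3], 3: []}, {})
```

In this sketch each edge is recorded only in the code of whichever endpoint is removed first, since the other endpoint no longer sees it as a neighbor afterwards.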
Step A8: perform combined coding on the vertices in the target data set whose vertex degree is greater than the decomposition parameter.
In this step, the vertices in the target data set whose vertex degree is greater than the decomposition parameter are coded by combined coding. Because the degrees of the remaining vertices are all greater than k, not all neighbor ids can be written directly into the code. Combined coding therefore writes part of the neighbor ids directly into the code and writes the remaining neighbor ids into the code by hashing. How many neighbor ids are written directly is decided by a scoring method based on a sliding window and a greedy strategy.
First, the components of a combined code are explained. A code in the combined coding mode is divided into five parts:
the first part is a flag bit, which occupies 1 bit; 0 represents direct coding and 1 represents combined coding;
the second part occupies 2 bits and represents the neighbor id writing mode of the combined code. There are three modes, Left-most, Right-most and Middle, where Left-most means that the directly written ids contain the minimum neighbor id, Right-most means that the directly written ids contain the maximum neighbor id, and Middle means that the directly written ids contain neither the minimum nor the maximum neighbor id. If the two bits of this part are 0 and 0, the writing mode is Left-most; if they are 0 and 1, the writing mode is Middle; and if they are 1 and 1, the writing mode is Right-most;
the third part is the number of neighbors that are written directly, and occupies log k bits;
the fourth part is the directly written part, a bit string formed by the neighbor vertex ids of the vertex in ascending order, each id occupying b bits; if the value of the third part is 7, then 7 neighbor ids are written directly and the length of the fourth part is 7 × b bits;
the fifth part is the hash coding part, and its length is the number of all remaining code bits.
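The bit budget of the five parts can be tallied as in the sketch below. It follows the hash length used in the worked example later in the text (32 × m - 3 - log k - 4 × b for a window of 4). The function name `combined_layout`, the base-2 logarithm and the rounding up are our assumptions.

```python
import math


def combined_layout(m, k, b, w):
    """Return the bit lengths of the five parts of a combined code
    when w neighbor ids are written directly."""
    count_bits = math.ceil(math.log2(k))   # assumption: log is base 2, rounded up
    direct_bits = w * b                    # w directly written ids, b bits each
    hash_bits = 32 * m - 1 - 2 - count_bits - direct_bits
    if hash_bits <= 0:
        raise ValueError("no room left for the hash part")
    return {"flag": 1, "mode": 2, "count": count_bits,
            "direct": direct_bits, "hash": hash_bits}


layout = combined_layout(m=5, k=8, b=10, w=4)
print(layout)  # {'flag': 1, 'mode': 2, 'count': 3, 'direct': 40, 'hash': 114}
```

The five lengths always sum to 32 × m, so widening the directly written part shrinks the hash part bit for bit.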
The coding flow of combined coding is described in detail below:
Step B1: determine the identification sequence corresponding to a second target vertex and the identification sequence of the neighbor nodes corresponding to the second target vertex, where the second target vertex is any vertex in the target data set whose vertex degree is greater than the decomposition parameter;
in step B1, a vertex and its neighbor node id sequence are input, and the neighbor node id sequence is sorted in ascending order;
Step B2: determine the coding score of the code corresponding to the sliding window;
Step B3: if the coding score is greater than a preset optimal score, determine the window state of the sliding window, where the window state comprises the size and the position of the sliding window;
Step B4: move the sliding window according to a first moving rule based on the size and the position of the sliding window, and iteratively execute steps B2 to B3 until a preset termination condition is reached;
Step B5: adjust the size of the sliding window, and iteratively execute steps B2 to B4 based on the adjusted sliding window until the size of the sliding window is greater than a preset value;
Step B6: code based on the coding score and the window state corresponding to the target sliding window to obtain the combined code corresponding to the second target vertex, where the target sliding window is the sliding window with the highest coding score.
That is, the graph structure query device may take the second target vertex and its neighbor node id sequence as input and sort the neighbor node ids in ascending order. It then initializes a sliding window whose size is 0 neighbor nodes, with the leftmost end of the window's initial position pointing to the first position of the neighbor node sequence. Finally, it calculates the coding score of the code corresponding to the sliding window with a scoring function and, if that score is greater than the preset optimal score, records the window state of the current sliding window, where the window state comprises the size and the position of the sliding window.
The coding score of the code corresponding to the sliding window is described below:
A hash function is applied to each neighbor id outside the sliding window, and the bit of the fifth part of the code at the position given by the function result is set to 1. The hash function is id % h, where h is the total number of hash bits; for each such neighbor node id, bit id % h of the fifth part of the code is set to 1, and the scoring function is then called to calculate the score.
For example, if the length of the sliding window is 4 and the leftmost end of the sliding window points to the 5th position of the neighbor identification sequence, the hash function is applied to all neighbor ids except the 5th to 8th, and the corresponding bits of the fifth part of the code are set to 1. Since the fourth part of the code takes 4 × b bits, the number of hash bits is h = 32 × m - 3 - log k - 4 × b.
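The hash part for a given window can be sketched as follows, using the id % h function from the text. The function name `hash_part`, the 0-based bit indexing, and the 0-based `window_start` are our assumptions.

```python
def hash_part(neighbor_ids, window_start, w, h):
    """Set bit (id % h) for every neighbor id outside the sliding window.

    neighbor_ids must be sorted in ascending order; window_start is the
    0-based index of the leftmost position covered by the window.
    """
    bits = [0] * h
    inside = set(range(window_start, window_start + w))
    for pos, nid in enumerate(neighbor_ids):
        if pos not in inside:          # ids inside the window are written directly
            bits[nid % h] = 1
    return bits


ids = [3, 8, 11, 20, 21, 22, 23, 40]
# Window of length 4 covering the 5th to 8th neighbor ids (indices 4..7).
print(hash_part(ids, window_start=4, w=4, h=7))  # [0, 1, 0, 1, 1, 0, 1]
```

Different ids can land on the same bit, so a set bit only shows that an id may be a neighbor, while a clear bit rules it out.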
The specific score function is:
[Equation image BDA0003578143940000081: score function f; not recoverable from the text]
where f is the coding score corresponding to the sliding window; V_max is the largest neighbor identifier inside the sliding window among the neighbor nodes of the second target vertex, and V_min is the smallest neighbor identifier inside the sliding window among those neighbor nodes. If V_max is exactly the maximum of the neighbor id sequence, then V_max = n; if V_min is exactly the minimum, then V_min = 0. w is the length of the sliding window. h is the length of the code region that holds the hash function result (that is, the length of the fifth part of the combined code, which is the number of bits of the hash part). The hash function result is the result of applying the hash function to the identifiers of the neighbor nodes of the second target vertex outside the sliding window, and h_0 is the number of 0s in the bit string corresponding to the hash function result.
The sliding window is then moved one position to the right and the above steps are repeated until the rightmost end of the sliding window reaches the rightmost end of the neighbor identification sequence.
The size of the sliding window is then increased by 1 and the above steps are repeated until the size of the sliding window is greater than t, where
[Equation image BDA0003578143940000082: definition of the window size limit t; not recoverable from the text]
Finally, the graph structure query device codes according to the highest score and the window state of the target sliding window obtained in the above steps, and obtains the combined code corresponding to the second target vertex. The first part of the code is set to 1. If the target sliding window contains the minimum neighbor id, the writing mode is Left-most and the second part is set to 00; if the target sliding window contains the maximum neighbor id, the writing mode is Right-most and the second part is set to 11; otherwise the second part is set to 01. The third part is the size of the target sliding window. The fourth part is the continuous bit string formed by the vertex ids in the target sliding window, each id occupying b bits, spliced into a bit string of b × w bits. The fifth part is the bit string obtained by passing the remaining neighbor ids through the hash function.
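The search over window sizes and positions can be sketched as below. Since the score function and the limit t are given only as equation images, both are taken as caller-supplied inputs here; `best_window`, `write_mode` and the toy score are hypothetical stand-ins, not the application's formulas. For simplicity the sketch starts at window size 1, whereas the text initializes the window with size 0.

```python
def best_window(neighbor_ids, t, score):
    """Try every window size 1..t at every position and keep the best.

    score(ids, start, size) is supplied by the caller; returns
    (best_score, start, size) with a 0-based start index.
    """
    ids = sorted(neighbor_ids)
    best = (float("-inf"), 0, 0)
    for size in range(1, min(t, len(ids)) + 1):        # B5: grow the window
        for start in range(0, len(ids) - size + 1):    # B4: slide right by one
            s = score(ids, start, size)                # B2: score this window
            if s > best[0]:                            # B3: remember the best state
                best = (s, start, size)
    return best


def write_mode(ids, start, size):
    """Second part of the code: 00 Left-most, 11 Right-most, 01 Middle."""
    if start == 0:                      # window contains the minimum neighbor id
        return "00"
    if start + size == len(ids):        # window contains the maximum neighbor id
        return "11"
    return "01"


# Toy score (an assumption): favor wide windows over tightly clustered ids.
toy = lambda ids, s, w: 2 * w - (ids[s + w - 1] - ids[s])
print(best_window([90, 1, 52, 50, 51], t=3, score=toy))  # (4, 1, 3)
```

Passing the score as a parameter keeps the search loop independent of whichever scoring formula is actually used.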
203. Determine the coding type of the first vertex and the coding type of the second vertex according to the code of the first vertex and the code of the second vertex.
In this embodiment, after obtaining the code of the first vertex and the code of the second vertex corresponding to the target query edge from the graph structure coding database, the graph structure query device may decode both codes and determine the coding types of the first vertex and the second vertex by checking the flag bit of each code: if the flag bit is 0, the coding type of the vertex is direct coding, and if the flag bit is 1, the coding type of the vertex is combined coding.
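A minimal sketch of the flag-bit check, modelling a code as a string of '0'/'1' characters. The representation and the name `coding_type` are our assumptions.

```python
def coding_type(code: str) -> str:
    """Return 'direct' or 'combined' according to the first (flag) bit."""
    if not code or code[0] not in ("0", "1"):
        raise ValueError("code must be a non-empty bit string")
    return "direct" if code[0] == "0" else "combined"


print(coding_type("0" + "1" * 31))  # direct
print(coding_type("1" + "0" * 31))  # combined
```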
204. Determine the query result of the target query edge according to the coding type of the first vertex and the coding type of the second vertex.
In this embodiment, after determining the coding type of the first vertex and the coding type of the second vertex, the graph structure query device may determine the query result of the target query edge according to those coding types. It will be appreciated that the coding type of each of the first vertex and the second vertex is one of two types, direct coding and combined coding. For ease of understanding, the following description takes the first vertex as v1 and the second vertex as v2 as an example:
Case 1: the first vertex and the second vertex are both directly coded, that is, both vertices of the target query edge are directly coded.
If the coding type of the first vertex and the coding type of the second vertex are both direct coding, the graph structure query device determining the query result of the target query edge according to the coding type of the first vertex and the coding type of the second vertex includes:
decoding the code of the first vertex to obtain the neighbor identification sequence corresponding to the first vertex;
if the neighbor identification sequence corresponding to the first vertex contains the second vertex, determining that the first vertex and the second vertex are in a neighbor relation;
if the neighbor identification sequence corresponding to the first vertex does not contain the second vertex, determining that the first vertex and the second vertex are in a non-neighbor relation;
decoding the code of the second vertex to obtain the neighbor identification sequence corresponding to the second vertex;
if the neighbor identification sequence corresponding to the second vertex contains the first vertex, determining that the second vertex and the first vertex are in a neighbor relation;
and if the neighbor identification sequence corresponding to the second vertex does not contain the first vertex, determining that the target query edge has no query result.
That is, if the coding type of the first vertex and the coding type of the second vertex are both direct coding, the graph structure query device first judges, through the neighbor detection algorithm, the neighbor relation of v2 with respect to v1. If the returned result is a neighbor relation, the query of the target query edge returns an edge existence result. If the returned result is a non-neighbor relation, the graph structure query device may judge, through the neighbor detection algorithm, the neighbor relation of v1 with respect to v2: if v1 and v2 are in a neighbor relation, the query of the target query edge returns an edge existence result; otherwise it returns an edge-absent result. The neighbor detection algorithm is explained in detail below:
The neighbor detection algorithm mainly uses the code of one vertex to determine its connection relation with another vertex. The return result of the neighbor detection algorithm is one of three types: 1. a non-neighbor relation; 2. a neighbor relation; 3. the neighbor relation cannot be determined.
The neighbor detection algorithm is described below by taking the query of v2 with respect to v1 as an example, i.e., determining from the code of v1 whether v2 is a neighbor of v1. Depending on the coding type of v1, there are two modes:
Direct coding:
Query and retrieve the code corresponding to v1, and decode it to obtain the neighbor identification sequence corresponding to v1. Decoding is straightforward because direct coding uses a continuous bit string, i.e., bits 2 to 1+b hold the first vertex id, and so on. Then judge whether the neighbor identification sequence corresponding to v1 contains v2: if not, determine that v2 is in a non-neighbor relation with v1; if so, determine that v2 is in a neighbor relation with v1. The coding structure of the direct coding is shown in fig. 3.
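The direct-coding check can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that a code is held as a string of '0'/'1' characters made up of the identification bit followed by whole b-bit neighbor ids with no padding; the function names `encode_direct`, `decode_direct` and `is_neighbor_direct` are hypothetical and do not come from the application.

```python
from typing import List


def encode_direct(neighbors: List[int], b: int) -> str:
    """Build a direct code: identification bit '0' followed by b-bit ids in ascending order."""
    for v in neighbors:
        if not 0 <= v < 2 ** b:
            raise ValueError("vertex id does not fit in b bits")
    return "0" + "".join(format(v, "0{}b".format(b)) for v in sorted(neighbors))


def decode_direct(code: str, b: int) -> List[int]:
    """Recover the neighbor identification sequence from a direct code."""
    if code[0] != "0":
        raise ValueError("identification bit is not 0, so this is not a direct code")
    body = code[1:]
    # Assumption: no padding bits, so the body is a whole number of b-bit ids.
    if len(body) % b != 0:
        raise ValueError("body length is not a multiple of b")
    return [int(body[i:i + b], 2) for i in range(0, len(body), b)]


def is_neighbor_direct(code_v1: str, b: int, v2: int) -> bool:
    """Return True if v2 appears in the neighbor sequence decoded from the code of v1."""
    return v2 in decode_direct(code_v1, b)


code = encode_direct([5, 2, 9], b=4)
print(code)                             # identification bit, then the ids 2, 5, 9
print(is_neighbor_direct(code, 4, 5))   # neighbor relation
print(is_neighbor_direct(code, 4, 7))   # non-neighbor relation
```

Because a direct code lists every neighbor it stores, this check never returns the "cannot be determined" result.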
Combined coding:
Query and retrieve the code corresponding to v1. First obtain the length of the direct neighbor id string in the code, i.e., the decimal number corresponding to bits 4 to 3+log k of the code. Denoting this number by w, intercept a bit string of length w×b starting from bit 4+log k, divide it equally into w substrings, each of length b, and convert each substring into decimal form, which finally forms the neighbor id sequence known in the code of v1.
Thereafter, the maximum id Vmax and the minimum id Vmin in the neighbor id sequence are determined, and the 2nd and 3rd bits in the code corresponding to v1 are queried. If they are 00, Vmin is set to 0 and Vmax is the maximum id in the neighbor id sequence corresponding to v1; if they are 11, Vmax is set to n, where n is the maximum id of a vertex in the data point set. Thereafter, v2 is compared with Vmax and Vmin.
If Vmin≤v2≤Vmax, directly query whether the neighbor id sequence corresponding to v1 contains v2: if not, the returned result is a non-neighbor relation; if so, the returned result is a neighbor relation.
If v2<Vmin or v2>Vmax, calculate the hash length h, where h = m×32 - 3 - log k - w×b, and substitute v2 into the hash function f(v) = v % h. If the result of the hash function is i, query bit 3 + log k + w×b + i of the code of v1: if that bit is 0, the returned result is a non-neighbor relation; if that bit is 1, the returned result is that the neighbor relation cannot be determined. The coding structure of the combined coding is shown in fig. 4.
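The combined-coding check can be sketched in Python as follows. This is a sketch under stated assumptions, not the applicant's implementation: the code is held as a string of '0'/'1' characters whose total length plays the role of m×32, `log_k` is the width of the window-size field, and the hash bit for v2 is read at zero-based offset v2 % h inside the hash region. The names `Relation` and `detect_combined` are hypothetical.

```python
from enum import Enum


class Relation(Enum):
    NON_NEIGHBOR = 1
    NEIGHBOR = 2
    UNDETERMINED = 3


def detect_combined(code: str, v2: int, b: int, log_k: int, n: int) -> Relation:
    """Judge the relation of v2 with respect to the vertex whose combined code is given."""
    if code[0] != "1":
        raise ValueError("identification bit is not 1, so this is not a combined code")
    position = code[1:3]                      # 2nd and 3rd bits
    w = int(code[3:3 + log_k], 2)             # number of ids stored explicitly
    if w == 0:
        raise ValueError("this sketch assumes at least one explicit neighbor id")
    start = 3 + log_k
    ids = [int(code[start + j * b:start + (j + 1) * b], 2) for j in range(w)]
    v_min, v_max = min(ids), max(ids)
    if position == "00":
        v_min = 0                             # explicit ids cover the low end of the id range
    elif position == "11":
        v_max = n                             # explicit ids cover the high end of the id range
    if v_min <= v2 <= v_max:
        return Relation.NEIGHBOR if v2 in ids else Relation.NON_NEIGHBOR
    hash_region = code[start + w * b:]
    h = len(hash_region)                      # equals total length - 3 - log_k - w*b
    if h == 0:
        raise ValueError("this sketch assumes a non-empty hash region")
    # Assumption: offset i = v2 % h is counted from the first bit of the hash region.
    if hash_region[v2 % h] == "1":
        return Relation.UNDETERMINED
    return Relation.NON_NEIGHBOR
```

A set hash bit only means that some hashed neighbor maps to the same position, which is why that branch returns "cannot be determined" rather than a neighbor relation.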
Second, the coding type of the first vertex is direct coding and the coding type of the second vertex is combined coding, that is, the coding type of one vertex in the target query edge is direct coding and the coding type of the other vertex is combined coding.
If the coding type of the first vertex is direct coding and the coding type of the second vertex is combined coding, determining the query result of the target query edge according to the coding type of the first vertex and the coding type of the second vertex comprises the following steps:
determining a neighbor identification sequence of neighbor identifications in a code corresponding to the second vertex;
determining a maximum neighbor identifier and a minimum neighbor identifier according to a neighbor identifier sequence of the neighbor identifiers;
determining a target value of the maximum neighbor identifier and a target value of the minimum neighbor identifier according to the specific position parameter in the code corresponding to the second vertex;
comparing the vertex identification of the first vertex, the target value of the maximum neighbor identification and the target value of the minimum neighbor identification to obtain a comparison result;
determining a query result corresponding to the target query edge according to the comparison result;
That is, if one of v1 and v2 is direct coding and the other is combined coding, assume that v1 is in the direct coding mode and v2 is in the combined coding mode. The graph structure query device may first judge, through the neighbor detection algorithm, the neighbor relation of v2 with respect to v1: if v2 and v1 are determined to be in a neighbor relation, the query result corresponding to the target query edge is an edge existence result; if v2 and v1 are determined not to be in a neighbor relation, the query result corresponding to the target query edge is an edge-absent result.
It should be noted that, the detection of the combined code in the neighbor detection algorithm has been described in detail above, and details are not described here.
Third, the coding type of the first vertex and the coding type of the second vertex are both combined coding, that is, the coding types of the two vertices of the target query edge are both combined coding.
If the coding type of the first vertex and the coding type of the second vertex are both combined codes, determining the query result of the target query edge according to the coding type of the first vertex and the coding type of the second vertex comprises:
determining a first query result corresponding to the target query edge based on the code corresponding to the first vertex;
determining a second query result corresponding to the target query edge based on the code corresponding to the second vertex;
if the first query result and the second query result contain a neighbor relation between the first vertex and the second vertex, determining the query result of the target query edge as an edge existence result;
if the first query result and the second query result contain the non-neighbor relation between the first vertex and the second vertex, determining the query result of the target query edge as the result without the edge;
and if the first query result and the second query result do not contain the neighbor relation between the first vertex and the second vertex and do not contain the non-neighbor relation between the first vertex and the second vertex, determining the query result of the target query edge by querying the bottom-layer vertex information database.
That is, if v1 and v2 are both in the combined coding mode, the graph structure query device judges, through the neighbor detection algorithm, the neighbor relation of v2 with respect to v1 (i.e., the first query result) and the neighbor relation of v1 with respect to v2 (i.e., the second query result). If either of the two query results is a neighbor relation, the query result corresponding to the target query edge is an edge existence result; if the two query results are both non-neighbor relations, the query result corresponding to the target query edge is an edge-absent result; otherwise, i.e., if the neighbor relation cannot be determined from the two query results, the relation between the two vertices can be obtained by querying the underlying database that stores the vertex information.
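How the two detection results are combined can be sketched in Python as below. The enum `Relation`, the function `resolve_edge` and the callback `lookup_edge` (standing in for the query to the underlying vertex information database) are hypothetical names. The sketch takes the conservative reading that a mix of a non-neighbor result and an undetermined result also falls back to the underlying database.

```python
from enum import Enum
from typing import Callable


class Relation(Enum):
    NON_NEIGHBOR = 1
    NEIGHBOR = 2
    UNDETERMINED = 3


def resolve_edge(first: Relation, second: Relation,
                 lookup_edge: Callable[[], bool]) -> bool:
    """Combine the two neighbor-detection results for an edge whose vertices are both combined-coded.

    first:  relation of v2 with respect to v1 (from the code of v1)
    second: relation of v1 with respect to v2 (from the code of v2)
    lookup_edge: hypothetical fallback that asks the underlying database
    """
    if Relation.NEIGHBOR in (first, second):
        return True                      # edge existence result
    if first is Relation.NON_NEIGHBOR and second is Relation.NON_NEIGHBOR:
        return False                     # edge-absent result
    return lookup_edge()                 # codes alone cannot decide


print(resolve_edge(Relation.NEIGHBOR, Relation.UNDETERMINED, lambda: False))
print(resolve_edge(Relation.UNDETERMINED, Relation.UNDETERMINED, lambda: True))
```

Passing the fallback as a callable keeps the database access lazy, so it is only paid for when both codes are inconclusive.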
When the graph data is updated, the operations are divided into insertion and deletion according to their nature. The following describes the code maintenance corresponding to the insertion and deletion of a single edge; when multiple edges are involved, the update only needs to be decomposed into multiple single-edge operations that are performed one at a time. For convenience of explanation, the inserted/deleted edge is denoted (v1, v2).
In one embodiment, the graph structure query device further performs the following operations:
acquiring a target updating edge, wherein the target updating edge comprises a third vertex and a fourth vertex;
determining the coding type of a third vertex and the coding type of a fourth vertex;
and updating the target updating edge to the graph structure coding database according to the coding type of the third vertex and the coding type of the fourth vertex.
In this embodiment, the graph structure querying device may obtain a target update edge, where the target update edge includes a third vertex and a fourth vertex. The target update edge may be an edge to be added or an edge to be deleted, which is not specifically limited. The graph structure querying device may then determine the coding type of the third vertex and the coding type of the fourth vertex, and update the target update edge into the graph structure coding database according to the coding types of the two vertices. How to add and delete is described in detail below according to the coding types of the two vertices of the target update edge:
First, the target update edge is an edge newly added to the graph structure coding database.
Step C1: query the codes corresponding to the two vertices v1 and v2 of the target update edge, and check the coding flag bits to determine the coding types of the two vertices of the target update edge.
Step C2: if one of v1 and v2 is directly coded and the number of neighbors in its code is smaller than the upper bound b of the number of neighbors (the calculation of b has been described in detail above and is not repeated here), that vertex is updated; if both vertices satisfy these conditions, one of them is randomly selected for updating. The specific update is as follows:
Let the vertex satisfying the above two conditions be v1. Parse the neighbor identification sequence of v1 from its original code, then insert v2 into the neighbor identification sequence of v1 while keeping the ids in ascending order, update the code in the direct coding mode (the direct coding mode has been described in detail above and is not repeated here), and then perform the update operation of the graph structure coding database based on the updated code of vertex v1.
Step C3: if one of v1 and v2 is directly coded with the number of neighbors in its code equal to the upper bound b of the number of neighbors, and the other vertex is in the combined coding mode.
Suppose vertex v1 is directly coded; then only the code of vertex v1 is updated. Determine the neighbor identification sequence of vertex v1, insert vertex v2 into the neighbor identification sequence of vertex v1 while keeping the ids in ascending order, perform the update coding in the combined coding mode (the combined coding mode has been described in detail above and is not repeated here), and then perform the update operation of the graph structure coding database based on the updated code of vertex v1.
Step C4: if both v1 and v2 are directly coded, and the number of neighbors of each of the two vertices is equal to the upper bound b of the number of neighbors.
Acquire the neighbor identification sequences of the two vertices. Add v1 to the neighbor identification sequence of v2 and apply the combined coding mode to obtain the coding score s1 and the code e1 of vertex v2 after vertex v1 is added to its neighbor identification sequence. Then add v2 to the neighbor identification sequence of v1 and apply the combined coding mode to obtain the coding score s2 and the code e2 of vertex v1 after vertex v2 is added to its neighbor identification sequence. Compare the two scores; assuming s1>s2, only the code of v1 is updated, and e1 replaces the original code of v1.
Step C5: if both v1 and v2 are in the combined coding mode, first query the neighbors of v1 from the graph structure coding database as the initial neighbor identification sequence, then query the code of each neighbor vertex of v1 and run the neighbor detection algorithm with vertex v1 as input (the neighbor detection algorithm has been described in detail above and is not repeated here). If the returned result is a neighbor relation, remove that neighbor vertex from the neighbor identification sequence of v1. Then execute the combined coding mode to obtain the coding score s1 and the code e1. Perform the same operation on v2 to obtain the coding score s2 and the code e2. Compare the two scores; assuming s1>s2, only the code of v1 is updated, and e1 replaces the original code of v1.
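The case analysis of steps C2 to C5 can be summarised as a small decision function. The Python sketch below only selects which step applies; it does not perform the re-encoding. The function name `insertion_case`, the string labels "direct" and "combined", and the parameter name `bound` (the upper bound of the number of neighbors in a direct code) are hypothetical.

```python
def insertion_case(type1: str, count1: int, type2: str, count2: int, bound: int) -> str:
    """Return which of steps C2..C5 applies when edge (v1, v2) is inserted.

    type1/type2:   "direct" or "combined", the coding types of v1 and v2
    count1/count2: number of neighbors currently stored in each code
    bound:         upper bound of the number of neighbors in a direct code
    """
    for t in (type1, type2):
        if t not in ("direct", "combined"):
            raise ValueError("unknown coding type: " + t)
    has_room = [t == "direct" and c < bound
                for t, c in ((type1, count1), (type2, count2))]
    if any(has_room):
        return "C2"   # a directly coded vertex can still take one more neighbor
    if type1 == "direct" and type2 == "direct":
        return "C4"   # both direct codes are full: re-encode one side as combined
    if type1 == "combined" and type2 == "combined":
        return "C5"   # both combined: re-encode the side with the better score
    return "C3"       # one full direct code, the other combined


print(insertion_case("direct", 2, "combined", 9, bound=4))
print(insertion_case("direct", 4, "direct", 4, bound=4))
```

In every case only one of the two vertices is re-encoded, which is consistent with the query procedure checking the codes of both endpoints.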
Second, the target update edge is an edge to be deleted from the graph structure coding database.
Step D1: query the codes corresponding to the two vertices v1 and v2 of the target update edge, and check the coding flag bits to determine the coding types of the two vertices v1 and v2.
Step D2: if v1 and v2 are both directly coded, first decode v1 to obtain the neighbor id sequence corresponding to v1. If that sequence contains v2, delete v2 from it, perform direct coding again, and update the code corresponding to v1 with the code obtained after re-encoding;
if the neighbor id sequence corresponding to v1 does not contain v2, decode v2 to obtain the neighbor id sequence corresponding to v2. If that sequence contains v1, delete v1 from it, re-encode v2 using the sequence obtained after deletion, and update the code corresponding to v2 with the code obtained after re-encoding.
Step D3: if one of v1 and v2 is in the direct coding mode and the other is in the combined coding mode, assume that v1 is directly coded. First decode v1 to obtain the neighbor id sequence corresponding to v1. If that sequence contains v2, delete v2 from it, perform direct coding again, and update the code corresponding to v1. If the neighbor id sequence corresponding to v1 does not contain v2, directly carry out the database updating operation.
Step D4: if both v1 and v2 are in the combined coding mode, first judge from the code of v1 whether it can be determined that v2 is not a neighbor of v1. How to determine the neighbor relation between two vertices has been described in detail above and is not repeated here.
If it cannot be determined that v2 is not a neighbor of v1, query the neighbor identification sequence of v1 through the database and delete v2 from it. If the length of the neighbor identification sequence after deletion is less than or equal to k, re-encode v1 in the direct coding mode; otherwise, re-encode v1 in the combined coding mode. Then perform the same steps for vertex v2. Finally, update the database based on the codes of the two vertices obtained after re-encoding.
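The re-encoding choice made in step D4 after a deletion can be sketched in Python as follows. The function name `delete_neighbor` and the mode labels are hypothetical; the sketch only removes the id and reports which coding mode the text prescribes for the shortened sequence, leaving the actual encoding to the direct and combined coding procedures.

```python
from typing import List, Tuple


def delete_neighbor(neighbors: List[int], target: int, k: int) -> Tuple[List[int], str]:
    """Remove `target` from a neighbor identification sequence and pick the coding mode.

    Returns the remaining sequence and "direct" if its length is at most k,
    otherwise "combined".
    """
    remaining = [v for v in neighbors if v != target]
    mode = "direct" if len(remaining) <= k else "combined"
    return remaining, mode


print(delete_neighbor([2, 5, 9, 11], 5, k=3))       # short enough for direct coding
print(delete_neighbor([2, 5, 9, 11, 14], 5, k=3))   # still needs combined coding
```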
In summary, it can be seen that, in the embodiment provided by the present application, the graph data is pre-encoded in a direct encoding and combined encoding manner, and the encoded graph structure code is stored in the graph structure encoding database, so that when the query edge is input, the encoding types of two vertices of the query edge can be determined, and the query is performed according to the encoding types, thereby considering both time and space efficiency on the premise of ensuring the correctness of the query result, and reducing the time consumption of graph structure query.
The present application is described above in terms of a graph structure query method, and is described below in terms of a graph structure query device.
Referring to fig. 5, fig. 5 is a schematic view of a virtual structure of a graph structure query device according to an embodiment of the present application, where the graph structure query device 500 includes:
an obtaining unit 501, configured to obtain an input query set for a graph structure, where the input query set includes at least one input query edge;
a querying unit 502, configured to query, from a graph structure coding database, codes of a first vertex and a second vertex corresponding to a target query edge, where the graph structure coding database stores codes corresponding to multiple vertices including the two vertices of the target query edge, the target query edge is any one query edge in the input query set, and a coding type of each vertex in the multiple vertices is direct coding or combined coding;
a first determining unit 503, configured to determine a coding type of the first vertex and a coding type of the second vertex according to the coding of the first vertex and the coding of the second vertex;
a second determining unit 504, configured to determine a query result of the target query edge according to the coding type of the first vertex and the coding type of the second vertex.
In a possible design, if the coding type of the first vertex and the coding type of the second vertex are both direct coding, the second determining unit 504 is specifically configured to:
decoding the code of the first vertex to obtain a neighbor identification sequence corresponding to the first vertex;
if the neighbor identification sequence corresponding to the first vertex contains the second vertex, determining that the first vertex and the second vertex are in a neighbor relation;
if the neighbor identification sequence corresponding to the first vertex does not contain the second vertex, determining that the first vertex and the second vertex are in a non-neighbor relation;
decoding the code of the second vertex to obtain a neighbor identification sequence corresponding to the second vertex;
if the neighbor identification sequence corresponding to the second vertex contains the first vertex, determining that the second vertex and the first vertex are in a neighbor relation;
and if the neighbor identification sequence corresponding to the second vertex does not contain the first vertex, determining that the target query edge does not have a query result.
In one possible design, if the coding type of the first vertex is direct coding and the coding type of the second vertex is combined coding, the second determining unit 504 is further specifically configured to:
determining a neighbor identification sequence of neighbor identifications in a code corresponding to the second vertex;
determining a maximum neighbor identifier and a minimum neighbor identifier according to the neighbor identifier sequence of the neighbor identifiers;
determining a target value of the maximum neighbor identifier and a target value of the minimum neighbor identifier according to a specific position parameter in a code corresponding to the second vertex;
comparing the vertex identification of the first vertex, the target value of the maximum neighbor identification and the target value of the minimum neighbor identification to obtain a comparison result;
and determining a query result corresponding to the target query edge according to the comparison result.
In a possible design, if the coding type of the first vertex and the coding type of the second vertex are both combined coding, the second determining unit 504 is further specifically configured to:
determining a first query result corresponding to the target query edge based on the code corresponding to the first vertex;
determining a second query result corresponding to the target query edge based on the code corresponding to the second vertex;
if the first query result and the second query result contain the neighbor relation between the first vertex and the second vertex, determining the query result of the target query edge as an edge existence result;
if the first query result and the second query result contain the non-neighbor relation between the first vertex and the second vertex, determining that the query result of the target query edge is an edge-absent result;
and if the first query result and the second query result do not contain the neighbor relation between the first vertex and the second vertex and do not contain the non-neighbor relation between the first vertex and the second vertex, determining the query result of the target query edge by querying a bottom layer vertex information database.
In one possible design, the first determining unit 503 is further configured to:
step 1, performing vertex serial number mapping on a data point set to obtain a target data set corresponding to the data point set;
step 2, calculating the maximum digit number of the vertex identification and the decomposition parameter corresponding to the target data set;
step 3, determining a vertex degree corresponding to each data in the target data set;
step 4, determining a vertex identification of a first target vertex and vertex identifications of neighbor vertices of the first target vertex, wherein the first target vertex is the vertex with the minimum vertex degree in the target data set;
step 5, directly coding the vertex identification of the first target vertex and the vertex identification of the neighbor vertex of the first target vertex to obtain a code corresponding to the target vertex;
step 6, removing the first target vertex and the edge corresponding to the first target vertex from the target data set to obtain a first data set;
step 7, based on the first data set, iteratively executing the steps 3 to 6 until the vertex degree of each data in the target data set is greater than the decomposition parameter;
step 8, performing combined coding on the vertices in the target data set whose vertex degrees are greater than the decomposition parameter.
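Steps 3 to 8 above amount to repeatedly peeling off the vertex with the smallest degree. A minimal Python sketch follows, assuming the graph is given as an adjacency dictionary of sets and that ties in degree are broken by the smaller vertex id; the function name `decompose` and the parameter name `k` for the decomposition parameter are hypothetical, and the actual bit-level encoding is left out.

```python
from typing import Dict, List, Set, Tuple


def decompose(adj: Dict[int, Set[int]], k: int) -> Tuple[Dict[int, List[int]], Dict[int, List[int]]]:
    """Split vertices into directly coded ones and ones left for combined coding.

    Repeatedly takes the vertex of minimum degree; while that degree is at most k,
    the vertex is recorded with its *current* neighbors (to be directly coded) and
    removed together with its edges. Vertices that remain all have degree > k.
    """
    work = {v: set(ns) for v, ns in adj.items()}       # copy so the input is untouched
    direct: Dict[int, List[int]] = {}
    while work:
        v = min(work, key=lambda u: (len(work[u]), u))  # tie-break on id (assumption)
        if len(work[v]) > k:
            break
        direct[v] = sorted(work[v])
        for u in work[v]:
            work[u].discard(v)                          # remove the edges of v
        del work[v]
    combined = {v: sorted(ns) for v, ns in work.items()}
    return direct, combined


# A 4-clique {1, 2, 3, 4} with one extra vertex 5 attached to vertex 1.
graph = {1: {2, 3, 4, 5}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {1, 2, 3}, 5: {1}}
direct, combined = decompose(graph, k=2)
print(direct)
print(combined)
```

In this example the edge between vertex 1 and vertex 5 ends up recorded only on the side of vertex 5, which illustrates why a query checks the codes of both endpoints.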
In one possible design, the first determining unit 503 performs combined coding on the vertices in the target data set whose vertex degrees are greater than the decomposition parameter by:
step 1, determining an identification sequence corresponding to a second target vertex and an identification sequence of a neighbor node corresponding to the second target vertex, wherein the second target vertex is any one of vertexes in the target data set, of which the vertex degrees are greater than the decomposition parameter;
step 2, determining the coding score of the code corresponding to the sliding window, wherein the sliding window is the sliding window with the size of zero neighbor nodes;
step 3, if the coding score is larger than a preset optimal score, determining the window state of the sliding window, wherein the window state comprises the size and the position of the sliding window;
step 4, moving the sliding window according to a first moving rule based on the size and the position of the sliding window, and iteratively executing the step 2 to the step 3 until a preset termination condition is reached;
step 5, adjusting the size of the sliding window, and iteratively executing the step 2 to the step 4 based on the adjusted sliding window until the size of the sliding window is larger than a preset value;
step 6, coding based on the coding score corresponding to the target sliding window and the window state corresponding to the target sliding window to obtain the combined code corresponding to the second target vertex, wherein the target sliding window is the sliding window with the highest coding score.
In one possible design, the encoding, by the first determining unit 503, based on the encoding score corresponding to the target sliding window and the window state corresponding to the target sliding window, to obtain the combined encoding corresponding to the second target vertex includes:
setting a first position code corresponding to the second target vertex to be 1;
determining a second position code according to the neighbor identification corresponding to the second target vertex contained in the target sliding window;
setting the size of the target sliding window to be a third position code;
setting a continuous bit string corresponding to the identifier of the neighbor node contained in the target sliding window as a fourth position code;
performing hash function processing on the remaining nodes in the neighbor nodes corresponding to the second target vertex to obtain a fifth position code;
the combined code corresponding to the second target vertex includes the first position code, the second position code, the third position code, the fourth position code and the fifth position code.
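The five position codes listed above can be assembled as in the following Python sketch. Several details are assumptions made only for illustration: the code is a string of '0'/'1' characters of length `total_bits`, the window is described by a start index and a size over the ascending neighbor sequence, the second position code is "00" when the window starts at the smallest neighbor and "11" when it ends at the largest, and the value "01" for a window in the middle is a placeholder that does not come from the application. The hash function v % h follows the description of the neighbor detection algorithm; the name `build_combined_code` is hypothetical.

```python
from typing import List


def build_combined_code(neighbors: List[int], start: int, size: int,
                        b: int, log_k: int, total_bits: int) -> str:
    """Assemble a combined code from the target sliding window over the sorted neighbors."""
    ordered = sorted(neighbors)
    if size >= 2 ** log_k or start + size > len(ordered):
        raise ValueError("window does not fit the size field or the sequence")
    window = ordered[start:start + size]
    rest = ordered[:start] + ordered[start + size:]

    first = "1"                                   # identification bit of combined coding
    if start == 0:
        second = "00"                             # window covers the smallest neighbor ids
    elif start + size == len(ordered):
        second = "11"                             # window covers the largest neighbor ids
    else:
        second = "01"                             # placeholder value (assumption)
    third = format(size, "0{}b".format(log_k))    # size of the target sliding window
    fourth = "".join(format(v, "0{}b".format(b)) for v in window)

    h = total_bits - 3 - log_k - size * b         # hash length
    if h <= 0:
        raise ValueError("no room left for the hash region")
    bits = ["0"] * h
    for v in rest:
        bits[v % h] = "1"                         # hash the remaining neighbors
    fifth = "".join(bits)
    return first + second + third + fourth + fifth


code = build_combined_code([3, 7, 12, 20, 41, 50], start=0, size=3,
                           b=6, log_k=3, total_bits=64)
print(code[:6], code[6:24], code[24:])
```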
In one possible design, the apparatus further includes:
an updating unit 505, the updating unit 505 being configured to:
acquiring a target updating edge, wherein the target updating edge comprises a third vertex and a fourth vertex;
determining a coding type of the third vertex and a coding type of the fourth vertex;
and updating the target updating edge to the graph structure coding database according to the coding type of the third vertex and the coding type of the fourth vertex.
Fig. 6 is a schematic structural diagram of a server according to the present application. As shown in fig. 6, a server 600 according to this embodiment includes at least one processor 601, at least one network interface 604 or other user interface 603, a memory 605, and at least one communication bus 602. The server 600 optionally contains a user interface 603 including a display, a keyboard, or a pointing device. The memory 605 may comprise a high-speed RAM memory and may also include a non-volatile memory, such as at least one disk memory. The memory 605 stores execution instructions; when the server 600 runs, the processor 601 communicates with the memory 605 and calls the instructions stored in the memory 605 to execute the graph structure query method. The operating system 606 contains various programs for implementing various basic services and for handling hardware-dependent tasks.
The server provided in the embodiment of the present application may execute the technical solution of the embodiment of the graph structure query method; the implementation principle and the technical effect are similar and are not described herein again.
The embodiment of the present application further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a computer, implements the method flows related to the graph structure query device in any of the above method embodiments. Correspondingly, the computer can be the graph structure query device.
The present application further provides a computer program or a computer program product including the computer program, which when executed on a computer, will make the computer implement the method flow related to the graph structure query device in any of the above method embodiments. Correspondingly, the computer can be the graph structure query device.
In the above-described embodiment corresponding to fig. 1, all or part of the implementation may be realized by software, hardware, firmware, or any combination thereof. When implemented in software, it may be implemented in whole or in part in the form of a computer program product.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present application.

Claims (10)

1. A method for querying a graph structure, comprising:
acquiring an input query set aiming at a graph structure, wherein the input query set comprises at least one input query edge;
querying codes of a first vertex and a second vertex corresponding to a target query edge from a graph structure coding database, wherein the graph structure coding database stores codes corresponding to a plurality of vertices including two vertices of the target query edge, the target query edge is any one query edge in the input query set, and the coding type of each vertex in the plurality of vertices is direct coding or combined coding;
determining the coding type of the first vertex and the coding type of the second vertex according to the coding of the first vertex and the coding of the second vertex;
and determining the query result of the target query edge according to the coding type of the first vertex and the coding type of the second vertex.
2. The method of claim 1, wherein if the coding type of the first vertex and the coding type of the second vertex are both direct codes, the determining the query result of the target query edge according to the coding type of the first vertex and the coding type of the second vertex comprises:
decoding the code of the first vertex to obtain a neighbor identification sequence corresponding to the first vertex;
if the neighbor identification sequence corresponding to the first vertex contains the second vertex, determining that the first vertex and the second vertex are in a neighbor relation;
if the neighbor identification sequence corresponding to the first vertex does not contain the second vertex, determining that the first vertex and the second vertex are in a non-neighbor relation;
decoding the code of the second vertex to obtain a neighbor identification sequence corresponding to the second vertex;
if the neighbor identification sequence corresponding to the second vertex contains the first vertex, determining that the second vertex and the first vertex are in a neighbor relation;
and if the neighbor identification sequence corresponding to the second vertex does not contain the first vertex, determining that the target query edge does not have a query result.
3. The method of claim 1, wherein if the coding type of the first vertex is direct coding and the coding type of the second vertex is combined coding, the determining the query result of the target query edge according to the coding type of the first vertex and the coding type of the second vertex comprises:
determining a neighbor identification sequence of neighbor identifications in a code corresponding to the second vertex;
determining a maximum neighbor identifier and a minimum neighbor identifier according to the neighbor identifier sequence of the neighbor identifiers;
determining a target value of the maximum neighbor identifier and a target value of the minimum neighbor identifier according to a specific position parameter in a code corresponding to the second vertex;
comparing the vertex identification of the first vertex, the target value of the maximum neighbor identification and the target value of the minimum neighbor identification to obtain a comparison result;
and determining the query result corresponding to the target query edge according to the comparison result.
4. The method of claim 1, wherein if the coding type of the first vertex and the coding type of the second vertex are both combined codes, the determining the query result of the target query edge according to the coding type of the first vertex and the coding type of the second vertex comprises:
determining a first query result corresponding to the target query edge based on the code corresponding to the first vertex;
determining a second query result corresponding to the target query edge based on the code corresponding to the second vertex;
if the first query result and the second query result contain the neighbor relation between the first vertex and the second vertex, determining the query result of the target query edge as an edge existence result;
if the first query result and the second query result contain the non-neighbor relation between the first vertex and the second vertex, determining that the query result of the target query edge is an edge-absent result;
and if the first query result and the second query result do not contain the neighbor relation between the first vertex and the second vertex and do not contain the non-neighbor relation between the first vertex and the second vertex, determining the query result of the target query edge by querying a bottom layer vertex information database.
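The three outcomes in this claim behave like a three-valued decision, which can be sketched as below. The sketch assumes each per-vertex result is `True` (neighbor), `False` (non-neighbor) or `None` (undetermined), and that a definite answer from either side is sufficient; the claim's wording leaves that last point open. `combine_results` and `fallback` are hypothetical names, with `fallback` standing in for the lookup in the underlying vertex information database.

```python
from typing import Callable, Optional


def combine_results(r1: Optional[bool], r2: Optional[bool],
                    fallback: Callable[[], bool]) -> bool:
    """Merge two per-vertex results; consult the fallback only when both are unknown."""
    if r1 is True or r2 is True:
        return True        # edge existence result
    if r1 is False or r2 is False:
        return False       # edge-absent result
    return fallback()      # neither code decides: query the underlying database


print(combine_results(None, None, lambda: True))
```

Passing the database lookup as a callable keeps it lazy, so the more expensive query runs only in the undetermined case.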
5. The method according to any one of claims 1 to 4, further comprising:
step 1, performing vertex serial number mapping on a data point set to obtain a target data set corresponding to the data point set;
step 2, calculating the maximum digit number of the vertex identification and the decomposition parameter corresponding to the target data set;
step 3, determining a vertex degree corresponding to each data in the target data set;
step 4, determining a vertex identification of a first target vertex and vertex identifications of neighbor vertices of the first target vertex, wherein the first target vertex is the vertex with the minimum vertex degree in the target data set;
step 5, directly coding the vertex identification of the first target vertex and the vertex identifications of the neighbor vertices of the first target vertex to obtain a code corresponding to the first target vertex;
step 6, removing the first target vertex and the edge corresponding to the first target vertex from the target data set to obtain a first data set;
step 7, based on the first data set, iteratively executing the steps 3 to 6 until the vertex degree of every vertex remaining in the target data set is greater than the decomposition parameter;
and step 8, performing combined coding on the vertices in the target data set whose vertex degrees are greater than the decomposition parameter.
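Steps 3 to 7 amount to repeatedly peeling off the minimum-degree vertex, which can be sketched as below. This is an illustration under stated assumptions: the graph is an undirected adjacency dictionary, a direct code is a sorted neighbor list, ties between equal degrees are broken by the smaller vertex identifier, and step 2 (the maximum digit number) is not modeled. The function name `peel_low_degree` and the parameter `k` (the decomposition parameter) are hypothetical.

```python
def peel_low_degree(adj, k):
    """Peel minimum-degree vertices until every remaining degree exceeds k.

    adj: dict mapping vertex id -> set of neighbor ids (undirected graph).
    Returns (direct_codes, remaining_adjacency).
    """
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    direct = {}
    while adj:
        # Steps 3-4: pick the vertex with the minimum current degree.
        v = min(adj, key=lambda x: (len(adj[x]), x))
        if len(adj[v]) > k:
            break  # step 7: all remaining degrees exceed k
        # Step 5: direct code = the neighbors still present in the graph.
        direct[v] = sorted(adj[v])
        # Step 6: remove the vertex and its edges.
        for n in adj[v]:
            adj[n].discard(v)
        del adj[v]
    return direct, adj


# Triangle 1-2-3 with a pendant vertex 4 attached to vertex 1.
graph = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2}, 4: {1}}
direct, remaining = peel_low_degree(graph, 1)
print(direct, sorted(remaining))
```

In this sketch an edge is written into the code of whichever endpoint is peeled first, so the vertices left in `remaining` are the candidates for combined coding in step 8.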
6. The method of claim 5, wherein the performing combined coding on the vertices in the target data set whose vertex degrees are greater than the decomposition parameter comprises:
step 1, determining an identification sequence corresponding to a second target vertex and an identification sequence of a neighbor node corresponding to the second target vertex, wherein the second target vertex is any one of vertexes in the target data set, of which the vertex degrees are greater than the decomposition parameter;
step 2, determining the coding score of the code corresponding to the sliding window;
step 3, if the coding score is larger than a preset optimal score, determining the window state of the sliding window, wherein the window state comprises the size and the position of the sliding window;
step 4, moving the sliding window according to a first moving rule based on the size and the position of the sliding window, and iteratively executing the step 2 to the step 3 until a preset termination condition is reached;
step 5, adjusting the size of the sliding window, and iteratively executing the step 2 to the step 4 based on the adjusted sliding window until the size of the sliding window is larger than a preset value;
and step 6, coding based on the coding score corresponding to the target sliding window and the window state corresponding to the target sliding window to obtain the combined code corresponding to the second target vertex, wherein the target sliding window is the sliding window with the highest coding score.
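The window search in steps 2 to 5 can be sketched as a nested loop over window sizes and positions. The claim does not define the coding score, the moving rule or the termination condition, so this sketch assumes a hypothetical score (the number of neighbor identifiers covered by the window), moves the window one identifier at a time, and stops growing at a hypothetical `max_size`. It is a naive exhaustive search for illustration only.

```python
def best_window(neighbors, max_size):
    """Return (score, start, size) of the best-scoring window over sorted neighbor ids."""
    ids = sorted(neighbors)
    best = (0, None, None)  # best score so far and its window state
    for size in range(1, max_size + 1):              # step 5: grow the window
        for start in range(ids[0], ids[-1] + 1):     # step 4: slide the window
            # Step 2 (assumed score): neighbors falling inside [start, start + size).
            score = sum(1 for n in ids if start <= n < start + size)
            if score > best[0]:                      # step 3: record a better state
                best = (score, start, size)
    return best


print(best_window([3, 10, 11, 13, 40], 4))
```

Because only strictly better scores replace the recorded state, ties are resolved in favor of the smaller, earlier window under this assumed scoring.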
7. The method of claim 6, wherein the encoding based on the encoding score corresponding to the target sliding window and the window state corresponding to the target sliding window to obtain the combined encoding corresponding to the second target vertex comprises:
setting a first position code corresponding to the second target vertex to be 1;
determining a second position code according to the neighbor identification corresponding to the second target vertex contained in the target sliding window;
setting the size of the target sliding window to be a third position code;
setting a continuous bit string corresponding to the identifier of the neighbor node contained in the target sliding window as a fourth position code;
performing hash function processing on the remaining nodes in the neighbor nodes corresponding to the second target vertex to obtain a fifth position code;
the combined code corresponding to the second target vertex includes the first position code, the second position code, the third position code, the fourth position code and the fifth position code.
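The five-part layout in this claim can be sketched as a small record. Several details are assumptions made for illustration: the second position code is taken to be the window's starting identifier, the fourth is a string of `0`/`1` characters, and the unspecified hash function is replaced by a simple modulo into a `hash_bits`-wide bitmask. `CombinedCode`, `encode_combined` and all field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CombinedCode:
    flag: int          # first position code: 1 marks a combined code
    window_start: int  # second position code (assumed: first id covered by the window)
    window_size: int   # third position code
    window_bits: str   # fourth position code: one bit per id in the window
    rest_hash: int     # fifth position code: hashed summary of remaining neighbors


def encode_combined(neighbors, start, size, hash_bits=16):
    nbrs = set(neighbors)
    # One bit per identifier in [start, start + size): 1 if it is a neighbor.
    bits = "".join("1" if start + i in nbrs else "0" for i in range(size))
    rest = 0
    for n in nbrs:
        if not (start <= n < start + size):
            rest |= 1 << (n % hash_bits)  # stand-in for the claim's hash function
    return CombinedCode(1, start, size, bits, rest)


code = encode_combined([3, 10, 11, 13, 40], start=10, size=4)
print(code)
```

With a lossy summary like this stand-in, a set bit may come from more than one identifier, so a hit in `rest_hash` alone would not prove that an edge exists.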
8. The method of any one of claims 1 to 4, 6 and 7, further comprising:
acquiring a target updating edge, wherein the target updating edge comprises a third vertex and a fourth vertex;
determining a coding type of the third vertex and a coding type of the fourth vertex;
and updating the target updating edge to the graph structure coding database according to the coding type of the third vertex and the coding type of the fourth vertex.
9. A graph structure query device, comprising:
an acquisition unit, configured to acquire an input query set for a graph structure, wherein the input query set comprises at least one input query edge;
a query unit, configured to query, from a graph structure coding database, the codes of a first vertex and a second vertex corresponding to a target query edge, wherein the graph structure coding database stores codes corresponding to a plurality of vertices including the two vertices of the target query edge, the target query edge is any one query edge in the input query set, and the coding type of each vertex in the plurality of vertices is direct coding or combined coding;
a first determining unit, configured to determine a coding type of the first vertex and a coding type of the second vertex according to the coding of the first vertex and the coding of the second vertex;
and a second determining unit, configured to determine the query result of the target query edge according to the coding type of the first vertex and the coding type of the second vertex.
10. A computer storage medium, comprising:
instructions which, when run on a computer, cause the computer to perform the steps of the graph structure query method of any one of claims 1 to 8.
CN202210348471.6A 2022-04-01 2022-04-01 Query method and device of graph structure and storage medium Active CN114647764B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210348471.6A CN114647764B (en) 2022-04-01 2022-04-01 Query method and device of graph structure and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210348471.6A CN114647764B (en) 2022-04-01 2022-04-01 Query method and device of graph structure and storage medium

Publications (2)

Publication Number Publication Date
CN114647764A true CN114647764A (en) 2022-06-21
CN114647764B CN114647764B (en) 2024-06-25

Family

ID=81996269

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210348471.6A Active CN114647764B (en) 2022-04-01 2022-04-01 Query method and device of graph structure and storage medium

Country Status (1)

Country Link
CN (1) CN114647764B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116383454A (en) * 2023-04-10 2023-07-04 星环信息科技(上海)股份有限公司 Data query method of graph database, electronic equipment and storage medium
CN118227815A (en) * 2024-05-24 2024-06-21 浙江邦盛科技股份有限公司 Dynamic intermediate state aggregation graph calculation method, device, equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB201514399D0 (en) * 2015-08-13 2015-09-30 Fujitsu Ltd Hybrid data storage system and method and program for storing hybrid data
US20160117358A1 (en) * 2014-10-27 2016-04-28 Oracle International Corporation Graph database system that dynamically compiles and executes custom graph analytic programs written in high-level, imperative programing language
CN106294739A (en) * 2016-08-10 2017-01-04 桂林电子科技大学 A kind of based on k2tree and the large-scale graph data processing method of multivalued decision diagram
CN114153987A (en) * 2021-11-30 2022-03-08 湖南大学 Distributed knowledge graph query method, device and storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160117358A1 (en) * 2014-10-27 2016-04-28 Oracle International Corporation Graph database system that dynamically compiles and executes custom graph analytic programs written in high-level, imperative programing language
GB201514399D0 (en) * 2015-08-13 2015-09-30 Fujitsu Ltd Hybrid data storage system and method and program for storing hybrid data
CN106294739A (en) * 2016-08-10 2017-01-04 桂林电子科技大学 A kind of based on k2tree and the large-scale graph data processing method of multivalued decision diagram
CN114153987A (en) * 2021-11-30 2022-03-08 湖南大学 Distributed knowledge graph query method, device and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Tan Wei; Yang Shuxin: "A hypergraph set query method based on double hash coding", Computer Applications and Software (计算机应用与软件), no. 03, 15 March 2013 (2013-03-15) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116383454A (en) * 2023-04-10 2023-07-04 星环信息科技(上海)股份有限公司 Data query method of graph database, electronic equipment and storage medium
CN116383454B (en) * 2023-04-10 2024-01-30 星环信息科技(上海)股份有限公司 Data query method of graph database, electronic equipment and storage medium
CN118227815A (en) * 2024-05-24 2024-06-21 浙江邦盛科技股份有限公司 Dynamic intermediate state aggregation graph calculation method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN114647764B (en) 2024-06-25

Similar Documents

Publication Publication Date Title
CN114647764A (en) Graph structure query method and device and storage medium
EP2924594B1 (en) Data encoding and corresponding data structure in a column-store database
US6904430B1 (en) Method and system for efficiently identifying differences between large files
CN103997346B (en) Data matching method and device based on assembly line
JP2020500382A (en) Method and apparatus for accessing structured bioinformatics data in an access unit
CN109325032B (en) Index data storage and retrieval method, device and storage medium
CN109299086B (en) Optimal sort key compression and index reconstruction
CN104283567A (en) Method for compressing or decompressing name data, and equipment thereof
JP2008299867A (en) Computer representation of data structure and encoding/decoding methods associated with the same
US6735600B1 (en) Editing protocol for flexible search engines
JP3083730B2 (en) System and method for compressing data information
CN110719106B (en) Social network graph compression method and system based on node classification and sorting
CN108628898A (en) The method, apparatus and equipment of data loading
CN111249736A (en) Code processing method and device
US20220005546A1 (en) Non-redundant gene set clustering method and system, and electronic device
CN110597847A (en) SQL statement automatic generation method, device, equipment and readable storage medium
KR102421458B1 (en) Method and apparatus for accessing structured bioinformatics data with an access unit
US8976048B2 (en) Efficient processing of Huffman encoded data
CN111767280A (en) Data processing method, device and storage medium
CN108829872B (en) Method, device, system and storage medium for rapidly processing lossless compressed file
CN104679775B (en) A kind of data processing method based on Huffman table
Bille et al. Finger search in grammar-compressed strings
Gagie et al. Compressing and indexing aligned readsets
CN109189996B (en) Based on K2Maximum common connectivity subgraph matching method of large-scale graph of MDD (minimization drive distribution)
CN113726342B (en) Segmented difference compression and inert decompression method for large-scale graph iterative computation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant