CN115482147B - Efficient parallel graph processing method and system based on compressed data direct calculation - Google Patents


Info

Publication number: CN115482147B (application CN202211115073.6A)
Authority: CN (China)
Other versions: CN115482147A
Original language: Chinese (zh)
Inventors: 张峰, 陈政, 卢卫, 杜小勇
Applicant and assignee: Renmin University of China
Prior art keywords: data, compressed, neighbor, graph, rule
Events: application filed by Renmin University of China; publication of CN115482147A; application granted; publication of CN115482147B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/20 Processor architectures; processor configuration, e.g. pipelining
    • G06T9/00 Image coding
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; database structures therefor; file system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; data structures therefor; storage structures
    • G06F16/9024 Graphs; linked lists
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT]
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Processing Or Creating Images (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to an efficient parallel graph processing method and system based on direct computation over compressed data, comprising the following steps: processing graph data represented by an adjacency list, together with an application based on that graph data, to obtain rule-based compressed graph data and an application based on the compressed graph data; then, according to the type of computing platform on which the application runs, computing over the rule-based compressed graph data and the application based on the compressed graph data with either the CPU compressed-graph direct processing method or the GPU compressed-graph direct processing method, to obtain the graph processing result. The invention can be widely applied in the technical field of big data processing.

Description

Efficient parallel graph processing method and system based on compressed data direct calculation
Technical Field
The invention relates to an efficient parallel graph processing method and system based on direct computation over compressed data, and belongs to the technical field of big data processing.
Background
Graph data processing plays a very important role in many fields, such as social networking, machine learning, and graph data analysis. Since the start of the big data era, graph data sizes have grown explosively. For example, in social networks, Facebook has over 2.85 billion accounts and billions of relationship edges, and this scale keeps expanding. Processing data at this scale presents a dual challenge in space and time: on the one hand, storing large-scale graph data requires large amounts of space, resulting in high costs; on the other hand, querying and analyzing it requires extremely long processing times, which is unacceptable in real business. In particular, GPUs, an important class of parallel acceleration devices, have been widely used for graph processing in recent years; however, a GPU has its own independent memory, which is usually relatively small and cannot accommodate large-scale graph data.
The graph representation format is critical to graph storage size. There are two conventional representations: the adjacency matrix and the adjacency list. The adjacency matrix is a |V| × |V| square matrix, where |V| is the number of vertices; its elements indicate whether two vertices are connected, and its space complexity is O(|V|^2). The adjacency list is more space-efficient: it stores only each vertex's neighbors, with space complexity O(|V| + |E|), where |E| is the number of edges in the graph. Even so, both representations consume a great deal of space and fail to meet the requirements of actual production and everyday use.
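The space trade-off described above can be sketched as follows; the tiny 5-vertex graph here is a hypothetical illustration, not from the patent:

```python
# Illustrative comparison of the two classic representations on a small sparse graph.
V = 5
edges = [(0, 1), (0, 2), (1, 2), (3, 4)]

# Adjacency matrix: O(|V|^2) cells regardless of how sparse the graph is.
matrix = [[0] * V for _ in range(V)]
for u, v in edges:
    matrix[u][v] = 1

# Adjacency list: O(|V| + |E|) entries, storing only actual neighbors.
adj = {u: [] for u in range(V)}
for u, v in edges:
    adj[u].append(v)

matrix_cells = V * V          # 25 stored cells
list_cells = V + len(edges)   # 9 stored entries
```

On real graphs with billions of edges the same gap is what makes the adjacency list the usual starting point, and why further compression of the neighbor sequences is still needed.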
Compressing graph data can greatly reduce storage space. Among existing graph compression methods, one family targets web graphs, applying multiple coding schemes to the neighbor sequences in the adjacency list and achieving good compression ratios; another family, based on the adjacency matrix, exploits the matrix's sparsity. However, most current graph compression schemes are encoding-based: analyzing the compressed graph data requires real-time decoding, which causes a large performance loss. Worse, because of data dependencies inside the compressed data, decoding often cannot be parallelized to recover performance. As a result, current compression methods struggle to deliver both space and time benefits compared with conventional graph storage formats.
Disclosure of Invention
In view of these problems, the object of the invention is to provide an efficient parallel graph processing method and system based on direct computation over compressed data. Graph data represented by an adjacency list is compressed into rules, which effectively removes the data redundancy in the graph data; at the same time, applications process the rule-described graph data directly, achieving both space savings and computation savings.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
in a first aspect, the present invention provides a method for efficient parallel graph processing based on direct computation of compressed data, comprising the steps of:
processing graph data represented by an adjacency list and an application based on the graph data to obtain compressed graph data based on rules and an application based on the compressed graph data;
according to the type of computing platform on which the application runs, computing over the rule-based compressed graph data and the application based on the compressed graph data with either the CPU compressed-graph direct processing method or the GPU compressed-graph direct processing method, to obtain the graph processing result.
Further, the processing of the graph data represented by the adjacency list includes:
according to a preset ordering rule, ordering neighbors of each vertex in the graph data represented by the adjacency list;
inserting separators between neighbor sequences of each vertex;
and compressing the text sequence formed by the processed adjacency list with a TADOC-style context-free-grammar text compression method to obtain rule-based compressed graph data.
Further, processing the application based on the graph data represented by the adjacency list to obtain the application based on the compressed graph data includes:
abstracting the application into a tuple of 6 elements: the graph data, operation, condition, result, start state, and end state;
and adapting the obtained 6-element tuple, via an application adaptation method, into a tuple targeting the compressed graph data, thereby obtaining the application based on the compressed graph data.
Further, processing the input rule-based compressed graph data and the application based on the compressed graph data with the CPU compressed-graph direct processing method comprises the following steps:
taking the rule-based compressed graph data and the application based on the compressed graph data as input;
checking the start state and end state for validity; if invalid, ending the calculation, otherwise proceeding to the next step;
performing branch reduction, merging branches whose conditions and operations are identical;
loading the rule-based compressed graph data into memory;
and preparing application auxiliary data and obtaining the graph processing result with the CPU double-layer traversal model.
Further, the CPU double-layer traversal model processing flow includes:
3.5.1) Judge whether the current state is the end state; if so, collect the calculation result and end the flow; otherwise, take a vertex v out of the set to be processed;
3.5.2) Judge whether all neighbors of vertex v have been processed; if so, return to step 3.5.1), otherwise take out an unprocessed neighbor u;
3.5.3) Judge the type of neighbor u: if u is a vertex, execute operation <vertex, vertex> and go to step 3.5.4); if u is a rule, execute operation <vertex, rule> and go to step 3.5.5);
3.5.4) Judge whether neighbor u satisfies the preset condition; if so, add u to the vertex set to be processed and return to step 3.5.2), otherwise return directly to step 3.5.2);
3.5.5) Judge whether neighbor u satisfies the preset condition; if so, add u to the rule set to be processed and go to step 3.5.6), otherwise return to step 3.5.2);
3.5.6) Judge whether the rule set to be processed is empty; if so, return to step 3.5.2), otherwise take out a rule r;
3.5.7) Judge whether all neighbors of rule r have been processed; if so, return to step 3.5.6), otherwise take out a neighbor t of rule r;
3.5.8) Judge the type of neighbor t: if t is a vertex, execute operation <rule, vertex> and go to step 3.5.9); if t is a rule, execute operation <rule, rule> and go to step 3.5.10);
3.5.9) Judge whether neighbor t satisfies the condition <vertex>; if so, add t to the vertex set to be processed and return to step 3.5.7), otherwise return directly to step 3.5.7);
3.5.10) Judge whether neighbor t satisfies the condition <rule>; if so, add t to the rule set to be processed and return to step 3.5.7), otherwise return directly to step 3.5.7).
Further, processing the input compressed graph data and application with the GPU compressed-graph direct processing method comprises the following steps:
taking the rule-based compressed graph data and the application based on the compressed graph data as input;
checking the start state and end state for validity; if invalid, ending the calculation, otherwise proceeding to the next step;
performing branch reduction, merging branches whose conditions and operations are identical;
loading the rule-based compressed graph data into memory and copying it to the GPU;
and preparing auxiliary data and running the GPU inter-layer asynchronous traversal model on GPU memory to obtain the graph processing result.
Further, the processing flow of the GPU inter-layer asynchronous traversal model comprises:
4.5.1) Judge whether the current state is the end state; if so, collect the calculation result and end the flow, otherwise go to step 4.5.2);
4.5.2) Process all elements in the set to be processed in parallel, with one GPU thread per element e;
4.5.3) Within a GPU thread, judge whether all neighbors of element e have been processed; if so, go to step 4.5.9), otherwise take out a neighbor u of e;
4.5.4) Judge the types of element e and neighbor u: if e is a vertex and u is a vertex, execute operation <vertex, vertex> and go to step 4.5.5); if e is a vertex and u is a rule, execute operation <vertex, rule> and go to step 4.5.6); if e is a rule and u is a vertex, execute operation <rule, vertex> and go to step 4.5.7); if e is a rule and u is a rule, execute operation <rule, rule> and go to step 4.5.8);
4.5.5) Judge whether neighbor u satisfies the condition <vertex>; if so, add u to the set to be traversed in the next state, otherwise return to step 4.5.3);
4.5.6) Judge whether neighbor u satisfies the condition <rule>; if so, add u to the set to be traversed in the next state, otherwise return to step 4.5.3);
4.5.7) Judge whether neighbor u satisfies the condition <vertex>; if so, add u to the set to be traversed in the next state, otherwise return to step 4.5.3);
4.5.8) Judge whether neighbor u satisfies the condition <rule>; if so, add u to the set to be traversed in the next state, otherwise return to step 4.5.3);
4.5.9) Judge whether all threads have finished; if so, return to step 4.5.1).
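The inter-layer asynchronous flow above can be sketched on the host side as a level-synchronous loop in which vertices and rules share one work list and each loop iteration stands in for one GPU thread. The encoding here (rule IDs as strings, `adj` as a dict) and the BFS-style distance update are illustrative assumptions, not the patent's actual implementation:

```python
import math

def gpu_style_bfs(adj, root):
    """Simulate the GPU model: each 'thread' handles one element e, and
    newly activated vertices or rules go into the next state's set."""
    dist = {x: math.inf for x in adj}
    dist[root] = 0
    frontier = [root]
    while frontier:                      # 4.5.1: end state = nothing left to traverse
        next_frontier = []
        for e in frontier:               # 4.5.2: one GPU thread per element e
            step = 0 if isinstance(e, str) else 1   # rules forward distance unchanged
            for u in adj[e]:             # 4.5.3: walk the neighbors of e
                if dist[e] + step < dist[u]:        # 4.5.4-4.5.8: the four edge cases
                    dist[u] = dist[e] + step        # collapse to src-type dispatch
                    next_frontier.append(u)
        frontier = next_frontier         # 4.5.9: all threads done -> next state
    return dist

# Vertex 0 reaches vertices 2 and 3 through the shared rule "R0".
adj = {0: ["R0"], 2: [], 3: [4], 4: [], "R0": [2, 3]}
dist = gpu_style_bfs(adj, 0)
```

Note that, unlike the CPU double-layer model, rules are not expanded in a separate inner loop: they ride along in the same frontier, which is the sense in which the layers are processed asynchronously.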
In a second aspect, the present invention provides an efficient parallel graph processing system based on direct computation of compressed data, comprising:
the preprocessing module is used for processing the graph data represented by the adjacency list and the application based on the graph data to obtain compressed graph data based on rules and the application based on the compressed graph data;
and the compressed-graph processing module, which, according to the type of computing platform running the application, computes over the rule-based compressed graph data and the application based on the compressed graph data with either the CPU compressed-graph direct processing method or the GPU compressed-graph direct processing method, to obtain the graph processing result.
In a third aspect, the present invention provides a processing device, at least comprising a processor and a memory, the memory having stored thereon a computer program, the processor executing the steps of the efficient parallel graph processing method based on compressed data direct computation when running the computer program.
In a fourth aspect, the present invention provides a computer storage medium having stored thereon computer readable instructions executable by a processor to perform the steps of the efficient parallel graph processing method based on direct computation of compressed data.
Due to the adoption of the technical scheme, the invention has the following advantages:
1. The invention compresses graph data represented by an adjacency list into rules to obtain compressed graph data, which effectively removes the data redundancy in the graph data; at the same time, processing the rule-described graph data directly achieves both space savings and computation savings.
2. The invention converts an application based on adjacency-list graph data into an application based on the compressed graph, and applies the CPU compressed-graph direct processing method or the GPU compressed-graph direct processing method on the corresponding computing platform, realizing efficient parallel processing of compressed graph data and effectively saving computation time.
Therefore, the invention can be widely applied to the technical field of big data processing.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Like parts are designated with like reference numerals throughout the drawings. In the drawings:
FIG. 1 is an overall process flow provided by an embodiment of the present invention;
FIG. 2 is a flowchart of a rule compression module process provided by an embodiment of the present invention;
FIG. 3 is a flowchart of an application adaptation module provided by an embodiment of the present invention;
FIG. 4 is a CPU direct computation engine provided by an embodiment of the present invention;
FIG. 5 is a CPU dual-layer traversal model provided by an embodiment of the invention;
FIG. 6 is a GPU direct computation module provided by an embodiment of the present invention;
fig. 7 is a GPU interlayer asynchronous traversal model according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more clear, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings of the embodiments of the present invention. It will be apparent that the described embodiments are some, but not all, embodiments of the invention. All other embodiments, which are obtained by a person skilled in the art based on the described embodiments of the invention, fall within the scope of protection of the invention.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments in accordance with the present application. As used herein, the singular is also intended to include the plural unless the context clearly indicates otherwise, and furthermore, it is to be understood that the terms "comprises" and/or "comprising" when used in this specification are taken to specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
Against the background of today's big data era, processing large-scale graph data poses significant challenges in both storage space and processing time. The invention provides an efficient parallel graph processing method based on direct computation over compressed data. The method targets the data redundancy in graph data and processes the rule-described graph data directly, achieving both space savings and computation savings. The invention mainly comprises application adaptation, a CPU direct computation module, and a GPU compressed-graph direct computation module, and achieves significant performance acceleration on both CPU and GPU platforms.
In accordance therewith, further embodiments of the present invention provide an efficient parallel graph processing system, apparatus, and medium based on rule compression.
Example 1
As shown in fig. 1, the present embodiment provides an efficient parallel graph processing method based on direct computation over compressed data, which includes the following steps:
1) Processing graph data represented by an adjacency list and an application based on the graph data to obtain compressed graph data based on rules and an application based on the compressed graph data;
2) Judging the type of a computing platform of an application running the graph data, entering the step 3) if the computing platform is a CPU, and entering the step 4) if the computing platform is a GPU;
3) And calculating the compressed graph data based on the rules and the application based on the compressed graph data by adopting a CPU compressed graph direct processing method to obtain a graph processing result.
4) And calculating the rule-based compressed graph data and the application based on the compressed graph data by adopting a GPU compressed graph direct processing method to obtain a graph processing result.
Preferably, the above step 1) may be achieved by:
1.1 Compressing graph data represented by the adjacency list by adopting a graph compression method based on rules to obtain compressed graph data;
1.2) Using an application adaptation method, modify the application based on the graph data represented by the adjacency list (such as Breadth-First Search (BFS)) to obtain the application based on the compressed graph data.
Preferably, as shown in fig. 2, the above step 1.1) can be achieved by:
1.1.1 Ordering the neighbors of each vertex in the graph data represented in the adjacency list;
when sorting the neighbors of each vertex, the default sorting rule is ascending node ID; a user may also specify another sorting rule, as long as the same rule is applied consistently to the neighbors of every vertex;
1.1.2 Inserting separators between the neighbor sequences of each vertex;
1.1.3) Using a TADOC-style context-free-grammar text compression method, compress the text sequence formed by the processed adjacency list to obtain rule-based compressed graph data, represented as a DAG over rules.
In this embodiment, for an original graph represented by an adjacency list, rule compression represents common neighbors shared by different vertices as a rule, so a neighbor sequence that appears many times in the original graph is stored only once in the compressed graph, saving space. To achieve this, the first step is to sort the neighbors of each vertex; otherwise vertices 0 and 1, which have the same neighbors (1, 2, 3, 4, 5), would not be recognized as sharing a neighbor sequence because their neighbors are listed in different orders. The second step is to insert separators between the neighbor sequences of different vertices. Each separator appears only once, so it is never identified as a repeated sequence and folded into a rule during compression; in the final compressed result, the neighbor positions of each vertex can be determined from the separator positions. For example, the neighbors of vertex 2 lie between separators spt2 and spt3. Finally, the text sequence formed by the neighbor sequences is compressed with the TADOC text compression method, yielding the final compression result, represented as a DAG over rules.
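A minimal sketch of this preprocessing, under the simplifying assumption that any neighbor sequence shared by several vertices directly becomes a rule (the real TADOC grammar construction is hierarchical and more involved); the separator and rule names (`spt0`, `R0`) are illustrative:

```python
from collections import Counter

def rule_compress(adj):
    """adj: {vertex: [neighbors]}. Returns (text sequence, rule table)."""
    # Step 1: sort each vertex's neighbors (default: ascending ID) so that
    # identical neighbor sets yield identical sequences.
    sorted_adj = {v: tuple(sorted(ns)) for v, ns in adj.items()}

    # Neighbor sequences shared by more than one vertex become rules.
    counts = Counter(sorted_adj.values())
    rules = {}
    for seq, c in counts.items():
        if c > 1 and len(seq) > 1:
            rules[seq] = f"R{len(rules)}"

    # Steps 2-3: emit one text sequence with a separator before each vertex's
    # neighbors; a shared sequence is emitted as a rule reference, stored once.
    out = []
    for v in sorted(sorted_adj):
        out.append(f"spt{v}")
        out.append(rules.get(sorted_adj[v], sorted_adj[v]))
    return out, {name: seq for seq, name in rules.items()}

# Vertices 0 and 1 share the neighbors (1, 2, 3, 4, 5) in different orders.
adj = {0: [5, 1, 3, 2, 4], 1: [1, 2, 3, 4, 5], 2: [6, 7]}
sequence, rules = rule_compress(adj)
```

After sorting, vertices 0 and 1 produce the same sequence, so it is stored once as rule `R0`; vertex 2's unique neighbors stay inline between its separators.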
Preferably, as shown in fig. 3, the above step 1.2) can be achieved by:
1.2.1) Abstract the application based on graph data represented by the adjacency list into a tuple of 6 elements;
1.2.2) Using an application adaptation method, adapt the obtained 6-element tuple into a tuple targeting the compressed graph data, thereby obtaining the application based on the compressed graph data.
Preferably, in step 1.2.1) above, according to the specific application, the application based on graph data represented by the adjacency list may be abstracted into a tuple of the graph data, operation, condition, result, start state, and end state, where:
a) Operation: specifies what operation the graph application performs on an edge;
b) Condition: determines whether an element enters the next iteration;
c) Result: a specified data structure representing the final result collected by the graph algorithm;
d) Start state. In this embodiment, (W, G, B) represents the state of one iteration round in the graph application, where W is the set of vertices and rules that do not need to be traversed, G is the set of vertices and rules that need to be traversed, and B is the set of vertices and rules that have been traversed;
e) End state.
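The 6-element tuple can be sketched as a container of data plus callbacks; the `GraphApp` name and field layout here are assumptions for illustration, not the patent's actual interface:

```python
from dataclasses import dataclass, field
from typing import Any, Callable

@dataclass
class GraphApp:
    """The 6-element tuple: graph data, operation, condition, result,
    start state, end state (the latter as a termination predicate)."""
    graph: Any
    operation: Callable      # edge handler for <src, dst>
    condition: Callable      # should this element enter the next iteration?
    end_state: Callable      # predicate on (W, G, B) detecting termination
    result: dict = field(default_factory=dict)
    start_state: tuple = (frozenset(), frozenset(), frozenset())

def bfs_end(state):
    # BFS terminates when the to-traverse set G is empty.
    W, G, B = state
    return len(G) == 0

# BFS from root 0 over a 4-vertex graph: G starts as {root}.
app = GraphApp(
    graph=None,
    operation=lambda src, dst: None,
    condition=lambda u: True,
    end_state=bfs_end,
    start_state=(frozenset({1, 2, 3}), frozenset({0}), frozenset()),
)
```

Keeping operation, condition, and end state as callbacks is what lets the adaptation step in 1.2.2) swap in rule-aware versions without touching the traversal engine.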
Preferably, in step 1.2.2) above, since rules are introduced when compressing the graph data in this embodiment, adapting the operations and their processing to compressed graph data amounts to modifying the operation and condition elements of the tuple.
Specifically, the invention treats a rule as a new type of vertex, so the compressed graph data contains 4 types of edges and 2 types of points. The four edge types are <vertex, vertex>, <vertex, rule>, <rule, vertex>, and <rule, rule>; the two point types are original vertices and rules. The basic goal of modifying the operations and conditions in the tuple is to design operations and conditions that involve rules, such as the condition for rules and the operations <vertex, rule>, <rule, vertex>, and <rule, rule>. Conditions and operations involving only vertices generally stay the same as in the original application. For operations involving rules, the basic principle is to treat the rule as a virtual vertex that passes information between the vertices pointing to it and the vertices it points to, and to modify the operation under this assumption. For conditions involving rules, the traversal is designed over the DAG formed by the rules, and the modification is generally similar to ordinary graph traversal.
Taking Breadth-First Search (BFS) as an example, this application computes the distance from one vertex, root, to the vertices of the graph. The original application can be described as:
a) Operation: given a vertex v and its neighbor u, judge whether u has been traversed; if so, skip it, otherwise set u's distance from root to v's distance from root plus 1;
b) Condition: given a neighbor u of a vertex, judge whether u has been traversed; if so, skip it, otherwise add it to the set of vertices to be traversed in the next state;
c) Results: the distance between all vertexes and root;
d) Start state: set to (W, G, B), where W is all vertices except root, G is { root }, B is null;
e) End state: in (W, G, B), W is a set of arbitrary vertices, G is null, and B is a set of arbitrary vertices.
After adaptation to the application based on compressed graph data, it can be described as:
a) Operation: given an element src and its neighbor dst, judge the types of src and dst. If both are vertices: if dst has been traversed, skip it, otherwise set dst's distance from root to src's distance plus 1. If src is a vertex and dst is a rule: if dst has been traversed, skip it, otherwise set dst's distance to src's distance plus 1. If src is a rule and dst is a vertex: if dst has been traversed, skip it, otherwise set dst's distance to src's distance. If both src and dst are rules: if dst has been traversed, skip it, otherwise set dst's distance to src's distance;
b) Condition: given a neighbor u of an element, judge u's type. If u is a vertex: if it has been traversed, skip it, otherwise add it to the set of elements to be traversed in the next state. If u is a rule: if it has been traversed, skip it, otherwise add it to the set of elements to be traversed in the next state;
c) Results: the distance between all vertexes and root;
d) Start state: in (W, G, B), W is all vertices except root, G is { root }, and B is null;
e) End state: in (W, G, B), W is a set of arbitrary vertices, G is null, and B is a set of arbitrary vertices.
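The four adapted operation cases collapse to a dispatch on the source's type: a rule forwards its distance unchanged, while a real vertex adds one hop. A sketch under an assumed representation (a `dist` map keyed by vertex IDs and rule IDs, with `src_is_rule` marking the source's type):

```python
import math

def bfs_op(dist, src, dst, src_is_rule):
    """Adapted BFS operation: set dst's distance the first time it is reached."""
    if dist[dst] != math.inf:
        return  # dst already traversed: skip
    if src_is_rule:
        dist[dst] = dist[src]       # <rule, vertex> / <rule, rule>: forward distance
    else:
        dist[dst] = dist[src] + 1   # <vertex, vertex> / <vertex, rule>: one hop

# Vertex 0 reaches vertex 1 through rule "R": the rule absorbs the +1
# on entry and passes the distance through unchanged on exit.
dist = {0: 0, "R": math.inf, 1: math.inf}
bfs_op(dist, 0, "R", src_is_rule=False)   # <vertex, rule>
bfs_op(dist, "R", 1, src_is_rule=True)    # <rule, vertex>
```

Vertex 1 thus ends up at distance 1 from root, the same answer BFS would give on the uncompressed graph where 0 and 1 are directly connected.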
Preferably, as shown in fig. 4, in step 3), when the computing platform is a CPU, processing the rule-based compressed graph data and the application based on the compressed graph data with the CPU compressed-graph direct processing method specifically includes the following steps:
3.1) Take the rule-based compressed graph data and the application based on the compressed graph data as input.
3.2) Check the start state and end state for validity; if invalid, end the calculation, otherwise go to step 3.3).
3.3) Perform branch reduction, merging branches whose conditions and operations are identical.
In practice, this embodiment observes that although the compressed graph contains two types of vertices and four types of edges, the conditions for the two vertex types and the operations for the four edge types may coincide, in which case merging the branches improves efficiency. For example, in the BFS example of the previous section, operations <vertex, vertex> and <vertex, rule> are identical, and operations <rule, vertex> and <rule, rule> are identical.
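Branch reduction can be sketched as grouping edge types by the handler they share; the function names and dict layout here are illustrative, not the patent's API:

```python
def reduce_branches(ops):
    """ops: {edge_type: handler}. Group edge types that share one handler."""
    by_handler = {}
    for edge_type, handler in ops.items():
        by_handler.setdefault(handler, []).append(edge_type)
    return {tuple(sorted(types)): h for h, types in by_handler.items()}

# The two distinct BFS handlers: dispatch depends only on the source's type.
def vertex_src(dist, s, d): dist[d] = dist[s] + 1
def rule_src(dist, s, d): dist[d] = dist[s]

ops = {
    ("vertex", "vertex"): vertex_src,
    ("vertex", "rule"): vertex_src,
    ("rule", "vertex"): rule_src,
    ("rule", "rule"): rule_src,
}
merged = reduce_branches(ops)  # four per-edge-type branches become two
```

In the BFS example the four edge-type branches collapse to two, so the hot traversal loop tests one predicate instead of four.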
3.4) Load the rule-based compressed graph data into memory.
3.5) Prepare the application auxiliary data and obtain the graph processing result with the CPU double-layer traversal model. The application auxiliary data comprises user-defined Result fields and some basic properties of vertices and rules, such as in-degree and out-degree.
Preferably, as shown in fig. 5, the CPU double-layer traversal model processing flow includes the following steps:
3.5.1 Judging whether the state is an ending state, if so, collecting a calculation result, and ending the flow; otherwise, a vertex v is taken out from the set to be processed;
3.5.2 Judging whether all neighbors of the vertex v are processed, if yes, returning to the step 3.5.1), otherwise, taking out an unprocessed neighbor u;
3.5.3 Judging the type of the neighbor u, if the neighbor u is a vertex, executing operation < vertex, vertex >; if neighbor u is a rule, then perform operation < vertex, rule > and go to step 3.5.5);
3.5.4 Judging whether the neighbor u meets a preset condition, if so, adding the neighbor u into the vertex set to be processed and returning to the step 3.5.2), otherwise, directly returning to the step 3.5.2);
3.5.5 Judging whether the neighbor u meets the preset condition, if so, adding the neighbor u into a rule set to be processed and entering a step 3.5.6), otherwise, returning to the step 3.5.2);
3.5.6 Judging whether the rule set to be processed is empty, if so, returning to the step 3.5.2), otherwise, taking out a rule r;
3.5.7 Judging whether all neighbors of the rule r are processed, if yes, returning to the step 3.5.6), otherwise, taking out an unprocessed neighbor t of the rule r;
3.5.8 Judging the type of the neighbor t, if the neighbor t is a vertex, executing operation < rule, vertex > and entering step 3.5.9); if the neighbor t is a rule, then perform operation < rule, rule > and go to step 3.5.10);
3.5.9 Judging whether the neighbor t meets the condition < vertex >, if so, adding the neighbor t into the vertex set to be processed and returning to the step 3.5.7), otherwise, directly returning to the step 3.5.7);
3.5.10 Judging whether the neighbor t meets the condition < rule >, if so, adding the neighbor t into the rule set to be processed and returning to the step 3.5.7), otherwise, directly returning to the step 3.5.7).
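The double-layer flow above can be sketched as follows. `neighbors`, `is_rule`, `op`, and `cond` are assumed user-supplied callbacks, and a single `op`/`cond` pair stands in for the four per-type variants after branch reduction:

```python
from collections import deque

def cpu_two_level_traverse(neighbors, is_rule, op, cond, start_vertices):
    """Sketch of the CPU double-layer traversal (steps 3.5.1-3.5.10).
    The outer layer walks vertices; rules met along the way are expanded
    in an inner layer before the next vertex is taken."""
    todo_vertices = deque(start_vertices)
    while todo_vertices:                       # 3.5.1: end state not reached
        v = todo_vertices.popleft()
        todo_rules = deque()
        for u in neighbors.get(v, ()):         # 3.5.2: outer layer
            op(v, u)                           # 3.5.3: <vertex, vertex/rule>
            if cond(u):                        # 3.5.4 / 3.5.5
                (todo_rules if is_rule(u) else todo_vertices).append(u)
        while todo_rules:                      # 3.5.6: inner layer
            r = todo_rules.popleft()
            for t in neighbors.get(r, ()):     # 3.5.7
                op(r, t)                       # 3.5.8: <rule, vertex/rule>
                if cond(t):                    # 3.5.9 / 3.5.10
                    (todo_rules if is_rule(t) else todo_vertices).append(t)

# Toy compressed graph: rule "R1" stands for the shared neighbors c, d.
neighbors = {"a": ["b", "R1"], "R1": ["c", "d"]}
is_rule = lambda x: x.startswith("R")
visited, order = {"a"}, []
op = lambda src, dst: order.append((src, dst))
def cond(x):                                   # visit each element once
    if x in visited:
        return False
    visited.add(x)
    return True
cpu_two_level_traverse(neighbors, is_rule, op, cond, ["a"])
```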
Preferably, as shown in fig. 6, in the step 4), when the GPU compressed graph direct calculation method is adopted to process the input compressed graph data and application, the method includes the following steps:
4.1 The rule-based compressed graph data and the application based on the compressed graph data are taken as input.
4.2 The start state and the end state are checked for validity; if invalid, the calculation ends, otherwise the method proceeds to step 4.3).
4.3 Branch reduction is performed, merging branches whose conditions and operations are identical.
4.4 The rule-based compressed graph data is loaded into memory and copied to the GPU;
4.5 Auxiliary data is prepared, and the GPU interlayer asynchronous traversal model is run on GPU memory to obtain the graph processing result.
As shown in fig. 7, the process flow of the GPU interlayer asynchronous traversal model includes the following steps:
4.5.1 Judging whether the current state is an ending state, if so, collecting a calculation result, ending the flow, otherwise, entering the step 4.5.2);
4.5.2 All elements in the set to be processed are processed in parallel, and one GPU thread processes one element e;
4.5.3 In one GPU thread, judging whether all neighbors of the element e are processed, if yes, entering step 4.5.9), otherwise, taking out an unprocessed neighbor u of e;
4.5.4 Judging the types of the element e and the neighbor u, if the element e is a vertex and the neighbor u is a vertex, executing operation < vertex, vertex > and entering into step 4.5.5); if e is a vertex and u is a rule, then perform operation < vertex, rule > and go to step 4.5.6); if e is a rule and u is a vertex, then perform operation < rule, vertex > and go to step 4.5.7); if e is a rule and u is a rule, then performing operation < rule, rule > and entering step 4.5.8);
4.5.5 Judging whether the neighbor u meets the condition < vertex >, if so, adding the neighbor u into a set to be traversed in the next state, otherwise, returning to the step 4.5.3);
4.5.6 Judging whether the neighbor u meets the condition < rule >, if so, adding the neighbor u into a set to be traversed by the next state, otherwise, returning to the step 4.5.3);
4.5.7 Judging whether the neighbor u meets the condition < vertex >, if so, adding the neighbor u into a set to be traversed in the next state, otherwise, returning to the step 4.5.3);
4.5.8 Judging whether the neighbor u meets the condition < rule >, if so, adding the neighbor u into the set to be traversed in the next state, otherwise, returning to the step 4.5.3).
4.5.9 Judging whether all threads are finished, and returning to the step 4.5.1) if all threads are finished.
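The flow above can be sketched serially in Python; on a GPU each frontier element would be handled by one thread, which the inner loop stands in for, and swapping in the next frontier models the all-threads-finished barrier of step 4.5.9 (all names are illustrative assumptions):

```python
def gpu_level_traverse(neighbors, op, cond, start_set):
    """Sketch of the GPU interlayer asynchronous traversal (steps 4.5.1-4.5.9).
    A single op/cond pair stands in for the four per-type variants after
    branch reduction."""
    frontier = set(start_set)
    while frontier:                            # 4.5.1: end state not reached
        next_frontier = set()
        for e in frontier:                     # 4.5.2: one thread per element e
            for u in neighbors.get(e, ()):     # 4.5.3: walk e's neighbors
                op(e, u)                       # 4.5.4: one of the four operations
                if cond(u):                    # 4.5.5-4.5.8: condition check
                    next_frontier.add(u)       # set to traverse in next state
        frontier = next_frontier               # 4.5.9: all threads finished

# Toy compressed graph: rule "R1" stands for the shared neighbors c, d.
neighbors = {"a": ["b", "R1"], "R1": ["c", "d"]}
seen = {"a"}
def cond(x):                                   # visit each element once
    if x in seen:
        return False
    seen.add(x)
    return True
gpu_level_traverse(neighbors, op=lambda e, u: None, cond=cond, start_set=["a"])
```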
Example 2
This embodiment was tested on several common graph data sets: it-2004 is a web graph and twitter-2010 is a social network graph. On these two graphs, the present invention achieves compression ratios of 17.58 and 11.51, respectively (compression ratio = original size / compressed size).
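The ratio defined in parentheses above can be computed directly (the sizes below are illustrative, chosen only to reproduce the reported ratios):

```python
def compression_ratio(original_size, compressed_size):
    """Compression ratio = original size / compressed size."""
    return original_size / compressed_size

# A 17.58x ratio means the compressed form is about 5.7% of the original.
ratio = compression_ratio(1758, 100)
```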
We compared performance with different state-of-the-art graph systems on the CPU and the GPU, including the Gunrock system, using three algorithms: Breadth-First Search (BFS), Connected Components (CC), and PageRank (PR). In the present invention, a rule represents common neighbors shared among vertices; it appears many times in the graph data but only once in the compressed data, so in practical graph applications the invention reuses the computation performed on each rule, which significantly improves performance. The invention achieves an average speedup of 3.49x on the CPU and 9.58x on the GPU.
Example 3
Corresponding to the efficient parallel graph processing method based on direct computation of compressed data provided in embodiment 1 above, this embodiment provides an efficient parallel graph processing system based on direct computation of compressed data. The system can implement the method of embodiment 1 and may be realized by software, hardware, or a combination of the two. For example, the system may comprise integrated or separate functional modules or functional units that perform the corresponding steps of the method in embodiment 1. Since this system embodiment is substantially similar to the method embodiment, its description is relatively brief; for the relevant points, refer to the description of embodiment 1, which this embodiment cites by way of illustration only.
The efficient parallel graph processing system based on direct computation of compressed data provided in this embodiment includes:
the preprocessing module is used for processing the graph data represented by an adjacency list and the application based on the graph data to obtain rule-based compressed graph data and an application based on the compressed graph data;
and the compressed graph processing module is used for computing the rule-based compressed graph data and the application based on the compressed graph data by adopting the CPU compressed graph direct processing method or the GPU compressed graph direct processing method according to the type of the computing platform running the application, so as to obtain the graph processing result.
Example 4
This embodiment provides a processing device corresponding to the efficient parallel graph processing method based on direct computation of compressed data provided in embodiment 1. The processing device may be a client device, for example, a mobile phone, a notebook computer, a tablet computer, or a desktop computer, that executes the method of embodiment 1.
The processing device comprises a processor, a memory, a communication interface, and a bus; the processor, the memory, and the communication interface are connected through the bus to communicate with one another. The memory stores a computer program executable on the processor; when the processor runs the computer program, the efficient parallel graph processing method based on direct computation of compressed data provided in embodiment 1 is executed.
In some embodiments, the memory may be a high-speed random access memory (RAM: Random Access Memory), and may also include non-volatile memory, such as at least one disk storage device.
In other embodiments, the processor may be a Central Processing Unit (CPU), a Digital Signal Processor (DSP), or other general purpose processor, which is not limited herein.
Example 5
The efficient parallel graph processing method based on direct computation of compressed data of embodiment 1 may be embodied as a computer program product, which may include a computer-readable storage medium bearing computer-readable program instructions for performing the efficient parallel graph processing method based on direct computation of compressed data described in embodiment 1.
The computer readable storage medium may be a tangible device that retains and stores instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any combination of the preceding.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (8)

1. The efficient parallel graph processing method based on the direct calculation of the compressed data is characterized by comprising the following steps of:
processing graph data represented by an adjacency list and an application based on the graph data to obtain compressed graph data based on rules and an application based on the compressed graph data;
wherein processing the application based on the graph data represented by the adjacency list to obtain the application based on the compressed graph data comprises the following steps:
abstracting an application of graph data expressed based on an adjacency list into a tuple containing 6 elements, wherein the tuple comprises the graph data, operation, condition, result, start state and end state;
adopting an application adaptation method to adapt the obtained tuple containing 6 elements into a tuple for the compressed graph data, thereby obtaining the application based on the compressed graph data;
based on the type of a computing platform of an application running the graph data, a CPU compressed graph direct processing method or a GPU compressed graph direct processing method is adopted to compute the compressed graph data based on rules and the application based on the compressed graph data, so that graph processing results are obtained;
the method for processing the input rule-based compressed graph data and the application based on the compressed graph data by adopting a CPU compressed graph direct processing method comprises the following steps:
taking as input the rule-based compressed graph data and the application based on the compressed graph data;
the starting state and the ending state are checked for rationality, if not, the calculation is ended, otherwise, the next step is carried out;
branch reduction is carried out, and branches with the same conditions and operation are combined;
loading the rule-based compressed graph data into a memory;
and preparing application auxiliary data, and obtaining a graph processing result by using a CPU double-layer traversal model.
2. A method for efficient parallel graph processing based on direct computation of compressed data according to claim 1, wherein said processing of graph data represented in adjacency lists comprises:
according to a preset ordering rule, ordering neighbors of each vertex in the graph data represented by the adjacency list;
inserting separators between neighbor sequences of each vertex;
and compressing the text sequence formed by the processed adjacency list by adopting the TADOC context-free grammar based text compression method to obtain the rule-based compressed graph data.
3. The method for efficient parallel graph processing based on direct computation of compressed data according to claim 1, wherein the CPU double-layer traversal model processing flow comprises:
3.5.1 Judging whether the state is an ending state, if so, collecting a calculation result, and ending the flow; otherwise, a vertex v is taken out from the set to be processed;
3.5.2 Judging whether all neighbors of the vertex v are processed, if yes, returning to the step 3.5.1), otherwise, taking out an unprocessed neighbor u;
3.5.3 Judging the type of the neighbor u, if the neighbor u is a vertex, executing operation < vertex, vertex >; if neighbor u is a rule, then perform operation < vertex, rule > and go to step 3.5.5);
3.5.4 Judging whether the neighbor u meets a preset condition, if so, adding the neighbor u into the vertex set to be processed and returning to the step 3.5.2), otherwise, directly returning to the step 3.5.2);
3.5.5 Judging whether the neighbor u meets the preset condition, if so, adding the neighbor u into a rule set to be processed and entering a step 3.5.6), otherwise, returning to the step 3.5.2);
3.5.6 Judging whether the rule set to be processed is empty, if so, returning to the step 3.5.2), otherwise, taking out a rule r;
3.5.7 Judging whether all neighbors of the rule r are processed, if yes, returning to the step 3.5.6), otherwise, taking out an unprocessed neighbor t of the rule r;
3.5.8 Judging the type of the neighbor t, if the neighbor t is a vertex, executing operation < rule, vertex > and entering step 3.5.9); if the neighbor t is a rule, then perform operation < rule, rule > and go to step 3.5.10);
3.5.9 Judging whether the neighbor t meets the condition < vertex >, if so, adding the neighbor t into the vertex set to be processed and returning to the step 3.5.7), otherwise, directly returning to the step 3.5.7);
3.5.10 Judging whether the neighbor t meets the condition < rule >, if so, adding the neighbor t into the rule set to be processed and returning to the step 3.5.7), otherwise, directly returning to the step 3.5.7).
4. The efficient parallel graph processing method based on compressed data direct computation according to claim 1, wherein when the GPU compressed graph direct calculation method is adopted to process the input compressed graph data and the application, the method comprises the following steps:
taking as input the rule-based compressed graph data and the application based on the compressed graph data;
the starting state and the ending state are checked for rationality, if not, the calculation is ended, otherwise, the next step is carried out;
branch reduction is carried out, and branches with the same conditions and operation are combined;
loading the compressed graph data based on the rules into a memory, and copying the compressed graph data to a GPU;
auxiliary data are prepared, and a GPU interlayer asynchronous traversal model is operated on the memory of the GPU to obtain a graph processing result.
5. The method for efficient parallel graph processing based on direct computation of compressed data according to claim 4, wherein the processing flow of the GPU interlayer asynchronous traversal model comprises the following steps:
4.5.1 Judging whether the current state is an ending state, if so, collecting a calculation result, ending the flow, otherwise, entering the step 4.5.2);
4.5.2 All elements in the set to be processed are processed in parallel, and one GPU thread processes one element e;
4.5.3 In one GPU thread, judging whether all neighbors of the element e are processed, if yes, entering step 4.5.9), otherwise, taking out an unprocessed neighbor u of e;
4.5.4 Judging the types of the element e and the neighbor u, if the element e is a vertex and the neighbor u is a vertex, executing operation < vertex, vertex > and entering into step 4.5.5); if e is a vertex and u is a rule, then perform operation < vertex, rule > and go to step 4.5.6); if e is a rule and u is a vertex, then perform operation < rule, vertex > and go to step 4.5.7); if e is a rule and u is a rule, then performing operation < rule, rule > and entering step 4.5.8);
4.5.5 Judging whether the neighbor u meets the condition < vertex >, if so, adding the neighbor u into a set to be traversed in the next state, otherwise, returning to the step 4.5.3);
4.5.6 Judging whether the neighbor u meets the condition < rule >, if so, adding the neighbor u into a set to be traversed by the next state, otherwise, returning to the step 4.5.3);
4.5.7 Judging whether the neighbor u meets the condition < vertex >, if so, adding the neighbor u into a set to be traversed in the next state, otherwise, returning to the step 4.5.3);
4.5.8 Judging whether the neighbor u meets the condition < rule >, if so, adding the neighbor u into a set to be traversed by the next state, otherwise, returning to the step 4.5.3);
4.5.9 Judging whether all threads are finished, and returning to the step 4.5.1) if all threads are finished.
6. An efficient parallel graph processing system based on direct computation of compressed data, comprising:
the preprocessing module is used for processing the graph data represented by the adjacency list and the application based on the graph data to obtain compressed graph data based on rules and the application based on the compressed graph data; wherein processing the application based on the graph data represented by the adjacency list to obtain the application based on the compressed graph data comprises the following steps:
abstracting an application of graph data expressed based on an adjacency list into a tuple containing 6 elements, wherein the tuple comprises the graph data, operation, condition, result, start state and end state;
adopting an application adaptation method to adapt the obtained tuple containing 6 elements into a tuple for the compressed graph data, thereby obtaining the application based on the compressed graph data;
the compressed graph processing module is used for computing the rule-based compressed graph data and the application based on the compressed graph data by adopting a CPU compressed graph direct processing method or a GPU compressed graph direct processing method according to the type of a computing platform running the application to obtain a graph processing result; the method for processing the input rule-based compressed graph data and the application based on the compressed graph data by adopting a CPU compressed graph direct processing method comprises the following steps:
taking as input the rule-based compressed graph data and the application based on the compressed graph data;
the starting state and the ending state are checked for rationality, if not, the calculation is ended, otherwise, the next step is carried out;
branch reduction is carried out, and branches with the same conditions and operation are combined;
loading the rule-based compressed graph data into a memory;
and preparing application auxiliary data, and obtaining a graph processing result by using a CPU double-layer traversal model.
7. A processing device comprising at least a processor and a memory, said memory having stored thereon a computer program, characterized in that the processor executes the steps of the method for efficient parallel graph processing based on direct computation of compressed data according to any of claims 1 to 5 when running said computer program.
8. A computer storage medium having stored thereon computer readable instructions executable by a processor to implement the steps of the method for efficient parallel graph processing based on direct computation of compressed data according to any one of claims 1 to 5.
CN202211115073.6A 2022-09-14 2022-09-14 Efficient parallel graph processing method and system based on compressed data direct calculation Active CN115482147B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211115073.6A CN115482147B (en) 2022-09-14 2022-09-14 Efficient parallel graph processing method and system based on compressed data direct calculation


Publications (2)

Publication Number Publication Date
CN115482147A CN115482147A (en) 2022-12-16
CN115482147B true CN115482147B (en) 2023-04-28

Family

ID=84424117

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211115073.6A Active CN115482147B (en) 2022-09-14 2022-09-14 Efficient parallel graph processing method and system based on compressed data direct calculation

Country Status (1)

Country Link
CN (1) CN115482147B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109785224A (en) * 2019-01-29 2019-05-21 华中科技大学 A kind of diagram data processing method and system based on FPGA

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9852522B2 (en) * 2014-03-17 2017-12-26 Sony Interactive Entertainment Inc. Image decoder, graphics processing system, image decoding method, and graphics processing method
CN107146274B (en) * 2017-05-05 2021-06-22 上海兆芯集成电路有限公司 Image data processing system, texture mapping compression method and method for generating panoramic video
US10585944B2 (en) * 2017-07-06 2020-03-10 International Business Machines Corporation Directed graph compression
CN109982088B (en) * 2017-12-28 2021-07-16 华为技术有限公司 Image processing method and device
WO2020061797A1 (en) * 2018-09-26 2020-04-02 华为技术有限公司 Method and apparatus for compressing and decompressing 3d graphic data
CN111737540B (en) * 2020-05-27 2022-11-29 中国科学院计算技术研究所 Graph data processing method and medium applied to distributed computing node cluster
CN113163198B (en) * 2021-03-19 2022-12-06 北京百度网讯科技有限公司 Image compression method, decompression method, device, equipment and storage medium
CN113064870B (en) * 2021-03-22 2021-11-30 中国人民大学 Big data processing method based on compressed data direct calculation
CN114780502B (en) * 2022-05-17 2022-09-16 中国人民大学 Database method, system, device and medium based on compressed data direct computation


Also Published As

Publication number Publication date
CN115482147A (en) 2022-12-16


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant