CN111177482A - Method, device and equipment for parallel processing of graph data and readable storage medium - Google Patents


Info

Publication number: CN111177482A
Application number: CN201911402930.9A
Authority: CN (China)
Legal status: Granted; Active
Other languages: Chinese (zh)
Other versions: CN111177482B (en)
Inventors: 梅国强, 郝锐, 王江为, 阚宏伟
Current Assignee: Inspur Beijing Electronic Information Industry Co Ltd
Original Assignee: Inspur Beijing Electronic Information Industry Co Ltd
Application filed by Inspur Beijing Electronic Information Industry Co Ltd
Priority to CN201911402930.9A
Publication of CN111177482A
Application granted
Publication of CN111177482B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/901Indexing; Data structures therefor; Storage structures
    • G06F16/9024Graphs; Linked lists
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/0811Multiuser, multiprocessor or multiprocessing cache systems with multilevel cache hierarchies

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Image Generation (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The application discloses a method for parallel processing of graph data, comprising: acquiring graph data; screening the source addresses in the graph data to obtain independent source addresses; determining the corresponding destination addresses from the independent source addresses, and screening those destination addresses to obtain independent destination addresses; and reading the corresponding independent source data and independent destination data according to the independent source addresses and independent destination addresses, then processing the independent source data and independent destination data in parallel to obtain a parallel processing result. With this method, multiple processing units working simultaneously are not required: a single processing unit can complete the parallel processing of the graph data, which greatly reduces the cache overhead and communication overhead of graph-data parallel processing. The application also provides a device, equipment and a readable storage medium for parallel processing of graph data, with the same beneficial effects.

Description

Method, device and equipment for parallel processing of graph data and readable storage medium
Technical Field
The present application relates to the field of graph data processing, and in particular, to a method, an apparatus, a device, and a readable storage medium for graph data parallel processing.
Background
In the big data era, the graph, as a fundamental data representation, is widely used in algorithms such as deep learning and user recommendation. Today graphs commonly reach tens of millions to hundreds of millions of nodes, with edge (node-to-node connection) counts on the order of billions. Real-world graphs are large and sparse, their node numbering is discontinuous, and their node degrees follow a power-law distribution; these characteristics pose great challenges for designing effective graph processing algorithms.
In the prior art, parallel processing of graph data is achieved by increasing the number of processing units. However, in this approach the processing power of a single unit is limited, and because multiple blocks must be processed simultaneously, a cache larger than a single unit's is required. Moreover, as the number of blocks grows, the communication overhead between blocks takes an increasing share of the total cost, which limits how many blocks can run in parallel at once.
Therefore, how to reduce the cache overhead and communication overhead of graph-data parallel processing is a technical problem that those skilled in the art currently need to solve.
Disclosure of Invention
The application aims to provide a method, a device and equipment for graph data parallel processing and a readable storage medium, which are used for reducing cache overhead and communication overhead in the graph data parallel processing process.
In order to solve the above technical problem, the present application provides a method for parallel processing of graph data, including:
acquiring graph data;
screening the source address in the graph data to obtain an independent source address; wherein the independent source addresses are source addresses with different values;
determining a corresponding destination address according to the independent source address, and screening the destination address to obtain an independent destination address; wherein the independent destination addresses are destination addresses with different values;
and acquiring corresponding independent source data and independent destination data according to the independent source address and the independent destination address, and performing parallel processing on the independent source data and the independent destination data to obtain a parallel processing result.
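The four steps above can be sketched in software. The following is a minimal illustration, assuming edges arrive as (source address, destination address) pairs and the two memories are plain lookup tables; the function names `select_independent_edges` and `process` are hypothetical, not taken from the patent:

```python
def select_independent_edges(edges):
    """Greedily keep edges so that no two kept edges share a source
    address or a destination address; the rest wait for a later round."""
    picked, deferred = [], []
    seen_src, seen_dst = set(), set()
    for s, d in edges:
        if s in seen_src or d in seen_dst:
            deferred.append((s, d))      # conflicting address: defer
        else:
            seen_src.add(s)
            seen_dst.add(d)
            picked.append((s, d))
    return picked, deferred

def process(edges, src_mem, dst_mem, op):
    """Process rounds of conflict-free edges until all edges are done."""
    results = []
    while edges:
        batch, edges = select_independent_edges(edges)
        # all addresses in `batch` are distinct, so one multi-port
        # memory can serve every read of the round simultaneously
        results.extend(op(src_mem[s], dst_mem[d]) for s, d in batch)
    return results
```

Because every batch contains only mutually distinct addresses, a single processing unit with a multi-port memory can execute each round fully in parallel, which is the point of the screening.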
Optionally, the screening the source addresses in the graph data to obtain independent source addresses includes:
storing each source address into a corresponding input FIFO respectively;
selecting a screening channel according to the source address; the two ends of the screening channel are the input FIFO and the output FIFO respectively, and only one source address is allowed to pass through the screening channel at each moment;
determining a source address in each of the output FIFOs as the independent source address.
Optionally, selecting a screening channel according to the source address includes:
determining the most significant bit of the source address as the tag data;
selecting a first-stage FIFO according to the tag data, and moving the source address from the input FIFO to the first-stage FIFO;
judging whether the tag data is the last bit of the address;
if not, updating the tag data to the next bit of the source address, selecting the next-stage FIFO according to the updated tag data, moving the source address to that next-stage FIFO, and returning to the step of judging whether the tag data is the last bit;
and if so, moving the source address to the corresponding output FIFO.
Optionally, obtaining corresponding independent source data and independent destination data according to the independent source address and the independent destination address includes:
simultaneously reading independent source data corresponding to each independent source address by using a source static random access memory with a preset number of ports;
and simultaneously reading independent destination data corresponding to each independent destination address by using a destination static random access memory with the preset number of ports.
The present application further provides an apparatus for parallel processing of graph data, the apparatus comprising:
the acquisition module is used for acquiring graph data;
the first screening module is used for screening the source address in the graph data to obtain an independent source address; wherein the independent source addresses are source addresses with different values;
the second screening module is used for determining a corresponding destination address according to the independent source address and screening the destination address to obtain an independent destination address; wherein the independent destination addresses are destination addresses with different values;
and the parallel processing module is used for acquiring corresponding independent source data and independent destination data according to the independent source address and the independent destination address, and performing parallel processing on the independent source data and the independent destination data to obtain a parallel processing result.
Optionally, the first screening module includes:
the storage submodule is used for respectively storing each source address into the corresponding input FIFO;
the selection submodule is used for selecting a screening channel according to the source address; the two ends of the screening channel are the input FIFO and the output FIFO respectively, and only one source address is allowed to pass through the screening channel at each moment;
a determining submodule for determining a source address in each of the output FIFOs as the independent source address.
Optionally, the selecting sub-module includes:
a determination unit, configured to determine the most significant bit of the source address as the tag data;
a selection unit, configured to select a first-stage FIFO according to the tag data, and move the source address from the input FIFO to the first-stage FIFO;
a judging unit, configured to judge whether the tag data is the last bit of the address;
an updating unit, configured to, when the tag data is not the last bit, update the tag data to the next bit of the source address, select the next-stage FIFO according to the updated tag data, move the source address to that FIFO, and return to the step of judging whether the tag data is the last bit;
and a moving unit, configured to move the source address to the corresponding output FIFO when the tag data is the last bit.
Optionally, the parallel processing module includes:
the first reading submodule is used for simultaneously reading independent source data corresponding to each independent source address by using a source static random access memory with a preset number of ports;
and the second reading submodule is used for simultaneously reading the independent destination data corresponding to each independent destination address by using the destination static random access memory with the preset number of ports.
The present application also provides a graph data parallel processing apparatus, including:
a memory for storing a computer program;
a processor for implementing the steps of the method for graph data parallel processing according to any one of the above when executing the computer program.
The present application also provides a readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method of graph data parallel processing according to any one of the above.
The method for parallel processing of graph data provided by the application comprises the following steps: acquiring graph data; screening source addresses in the graph data to obtain independent source addresses; wherein, the independent source addresses are source addresses with different values; determining a corresponding destination address according to the independent source address, and screening the destination address to obtain an independent destination address; wherein, the independent destination addresses are destination addresses with different values; and acquiring corresponding independent source data and independent destination data according to the independent source address and the independent destination address, and performing parallel processing on the independent source data and the independent destination data to obtain a parallel processing result.
According to the technical scheme, the source address in the graph data is screened to obtain the independent source address, the corresponding destination address is determined according to the independent source address, the destination address is screened to obtain the independent destination address, the corresponding independent source data and the independent destination data are finally obtained, parallel processing of the graph data is completed, a plurality of processing units are not needed to process simultaneously, only one processing unit is needed to complete parallel processing of the graph data, and cache overhead and communication overhead in the parallel processing process of the graph data are greatly reduced. The application also provides a device, equipment and a readable storage medium for parallel processing of the graph data, which have the beneficial effects and are not described herein again.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, it is obvious that the drawings in the following description are only embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
Fig. 1 is a flowchart illustrating a method for parallel processing of graph data according to an embodiment of the present disclosure;
FIG. 2 is a schematic structural diagram of a 64-port SRAM according to an embodiment of the present application;
FIG. 3 is a flowchart of a concrete implementation of S102 in the graph data parallel processing method of FIG. 1;
FIG. 4 is a flowchart of a concrete implementation of S302 in the graph data parallel processing method of FIG. 3;
fig. 5 is a structural diagram of an 8-channel parallel data screening architecture according to an embodiment of the present disclosure;
FIG. 6 is a block diagram of an apparatus for parallel processing of graph data according to an embodiment of the present disclosure;
fig. 7 is a block diagram of a graph data parallel processing device according to an embodiment of the present application.
Detailed Description
The core of the application is to provide a method, a device and equipment for graph data parallel processing and a readable storage medium, which are used for reducing cache overhead and communication overhead in the graph data parallel processing process.
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Referring to fig. 1, fig. 1 is a flowchart illustrating a method for parallel processing of graph data according to an embodiment of the present disclosure.
The method specifically comprises the following steps:
s101: acquiring graph data;
the graph is an abstract data structure for representing associations between objects, described using vertices and edges: vertices represent objects and edges represent relationships between objects. Data that can be abstracted into a graph description is graph data. The graph calculation is the process of expressing and solving the problem by taking the graph as a data model.
In the prior art, a graph is generally decomposed into many small blocks according to source nodes and destination nodes, with different blocks handled by different processing units; parallelism of graph-data processing is obtained by increasing the number of processing units. However, in this approach the processing power of a single unit is limited, and because multiple blocks must be processed simultaneously, a cache larger than a single unit's is required. Moreover, as the number of blocks grows, the communication overhead between blocks takes an increasing share of the total cost, which limits how many blocks can run in parallel at once. The present application therefore provides a method for parallel processing of graph data that addresses these problems.
S102: screening source addresses in the graph data to obtain independent source addresses;
Fig. 2 shows the structure of a 64-port static random access memory (SRAM). For a single processing unit to process multiple lanes of data in parallel, the parallel processing apparatus must be able to read multiple lanes of data simultaneously. Taking 64-way parallel processing as an example, the 64 ports of the SRAM can read the corresponding data at the same time for processing. The SRAM bank is addressed by the low bits of an edge's destination address, or by a hash of that destination address, so as to separate the different lanes. For a given set of graph data, however, several lanes may attempt to read the same SRAM bank simultaneously, which prevents parallel processing. The source addresses and destination addresses in the graph data are therefore screened so that no SRAM bank is read by more than one lane at a time.
The independent source addresses are source addresses with different values, and the meaning of screening the source addresses in the graph data is to ensure that two same source addresses do not exist at the same time;
optionally, the screening of the source address in the graph data may be implemented by a software program, or may also be implemented by a hardware device.
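The bank-selection rule described above, and the read conflict that motivates the screening, can be shown in a few lines. This is an illustrative sketch assuming 64 banks selected by the low 6 bits of the destination address; the function names are not from the patent:

```python
NUM_BANKS = 64  # one bank per SRAM port in the 64-way example

def bank_of(dest_addr):
    """Select an SRAM bank from the low bits of the destination address."""
    return dest_addr & (NUM_BANKS - 1)  # low 6 bits for 64 banks

def has_bank_conflict(dest_addrs):
    """True if two lanes would read the same bank in the same cycle."""
    banks = [bank_of(a) for a in dest_addrs]
    return len(banks) != len(set(banks))
```

Addresses 1 and 65 map to the same bank (their low 6 bits are equal), so they cannot be read in the same cycle; screening defers one of them to a later round.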
S103: determining a corresponding destination address according to the independent source address, and screening the destination address to obtain an independent destination address;
The independent destination addresses mentioned here are destination addresses with mutually distinct values; the point of screening the destination addresses is to ensure that no two identical destination addresses exist at the same time;
optionally, the filtering of the destination address may be implemented by a software program, or may also be implemented by a hardware device.
S104: and acquiring corresponding independent source data and independent destination data according to the independent source address and the independent destination address, and performing parallel processing on the independent source data and the independent destination data to obtain a parallel processing result.
Optionally, parallel processing of independent source data and independent destination data may be completed by a multi-path floating point parallel processing unit;
preferably, the obtaining of the corresponding independent source data and independent destination data according to the independent source address and the independent destination address mentioned here may specifically be:
simultaneously reading independent source data corresponding to each independent source address by using a source static random access memory with a preset number of ports;
and simultaneously reading independent destination data corresponding to each independent destination address by using a destination static random access memory with a preset number of ports.
In this embodiment, the independent source data and independent destination data are read in parallel using source and destination static random access memories each having a preset number of ports, so that a single processing unit achieves parallel processing of the graph data, further reducing the cache overhead and communication overhead of the process.
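As a toy software model of such a preset-number-of-ports memory — an illustrative sketch, not the patent's hardware — one can require that a single-cycle parallel read serve only distinct addresses:

```python
class MultiPortSram:
    """Toy model: one read per port per 'cycle', valid only when the
    requested addresses are pairwise distinct (i.e. already screened)."""

    def __init__(self, data, num_ports=64):
        self.data = list(data)
        self.num_ports = num_ports

    def read_parallel(self, addrs):
        if len(addrs) > self.num_ports:
            raise ValueError("more requests than ports")
        if len(set(addrs)) != len(addrs):
            raise ValueError("duplicate address: the reads would serialize")
        return [self.data[a] for a in addrs]
```

The screening steps of S102 and S103 guarantee exactly the precondition this model checks: every address presented in one cycle is unique, so all ports can fire simultaneously.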
Based on the above technical solution, the method for parallel processing of graph data provided by the application screens the source addresses in the graph data to obtain independent source addresses, determines the corresponding destination addresses from them, screens those destination addresses to obtain independent destination addresses, and finally reads the corresponding independent source data and independent destination data to complete the parallel processing of the graph data. No additional processing units are needed: a single processing unit completes the parallel processing, which greatly reduces the cache overhead and communication overhead of the process.
For step S102 of the previous embodiment, screening the source addresses in the graph data to obtain independent source addresses may specifically follow the steps shown in fig. 3, described below with reference to fig. 3.
Referring to fig. 3, fig. 3 is a flowchart illustrating an actual representation of S102 in the graph data parallel processing method of fig. 1.
The method specifically comprises the following steps:
s301: storing each source address into a corresponding input FIFO respectively;
the FIFO (First in First out, First in First out buffer) has the characteristic of First in First out, and the First stored source address is preferentially output.
S302: selecting a screening channel according to a source address;
the two ends of the screening channel are respectively an input FIFO and an output FIFO, and only one source address is allowed to pass through the screening channel at each moment;
Optionally, selecting the screening channel according to the source address may specifically mean selecting the intermediate FIFOs of the screening channel stage by stage according to the address bits of the source address, taken in order;
preferably, the selection of the screening channel according to the source address mentioned in step S302 may specifically be a step shown in fig. 4, which is described below with reference to fig. 4.
Referring to fig. 4, fig. 4 is a flowchart illustrating an actual representation of S302 in the graph data parallel processing method of fig. 3.
The method specifically comprises the following steps:
S401: determining the most significant bit of the source address as the tag data;
S402: selecting a first-stage FIFO according to the tag data, and moving the source address from the input FIFO to the first-stage FIFO;
S403: judging whether the tag data is the last bit of the address;
if not, go to step S404; if yes, go to step S405.
S404: updating the tag data to the next bit of the source address, selecting the next-stage FIFO according to the updated tag data, moving the source address to that FIFO, and returning to the step of judging whether the tag data is the last bit;
S405: moving the source address to the corresponding output FIFO.
In this embodiment, the tag data selects the next-stage FIFO in which the source address is stored; when the tag data reaches the last bit of the address, the source address is moved to the corresponding output FIFO, completing the source-address screening.
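Under the assumption (following S401) that the address bits are consumed most-significant-bit first, the staged selection can be sketched as a software routine; the names `route` and `output_fifos` are illustrative, and each intermediate FIFO is represented only by the index it contributes to:

```python
from collections import deque

def route(source_addr, num_bits, output_fifos):
    """Move an address stage by stage: at stage k the current tag bit
    selects the next-stage FIFO; after the last bit, the address lands
    in the output FIFO whose index equals the full address value."""
    path = []        # FIFO index chosen at each stage
    index = 0
    for k in range(num_bits):                  # most significant bit first
        bit = (source_addr >> (num_bits - 1 - k)) & 1
        index = (index << 1) | bit             # next-stage FIFO number
        path.append(index)
    output_fifos[index].append(source_addr)    # final stage: output FIFO
    return path
```

For the 3-bit address 110, the chosen FIFO indices are 1, 3 and 6, so the address ends in output FIFO No. 6; two addresses can only meet in a FIFO once all the bits routed so far agree, which is what lets the final output FIFOs serialize duplicates.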
S303: the source address in each output FIFO is determined to be an independent source address.
As shown in fig. 5, an 8-channel parallel data screening architecture, which is composed of an input FIFO, a three-level intermediate FIFO, a three-level processing, and an output FIFO, is taken as an example for detailed description.
When data screening is performed with this parallel screening architecture, the source addresses input on the eight channels D0 to D7 can be screened simultaneously. Suppose the input source addresses are 1, 2, 3, 4, 5, 6, 7 and 7, i.e., the binary addresses 001, 010, 011, 100, 101, 110, 111 and 111. The source addresses stored in input FIFO No. 6 and input FIFO No. 7 are then both 111, and the first two processing stages treat them identically:
In the first-stage processing, the tag data is 1, so the addresses enter the No. 2 and No. 3 first-stage FIFOs of the second part along the direction indicated by the dotted line; in the second-stage processing, the tag data is 1, so they enter the No. 0 and No. 1 second-stage FIFOs of the fourth part along the direction indicated by the dotted line;
In the third-stage processing, the tag data is again 1. The source address in channel D6 enters the S7 output FIFO along the direction indicated by the dotted line, and the source address in channel D7 enters the S7 output FIFO along the direction indicated by the solid line. Since both select the S7 output FIFO, the source address from channel D6, stored first, is output first, and the source address from channel D7 is output in the next cycle, so that only one source address passes through at any one time.
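The serialization in this example can be replayed with a simplified model that collapses the three-stage network into a direct address-to-output-FIFO mapping and drains at most one address per output FIFO per cycle (an illustration of the behavior, not the hardware):

```python
from collections import deque

def screen(addresses, num_outputs=8):
    """Route each source address to the output FIFO given by its low
    bits, then emit one address per non-empty FIFO per cycle, so the
    addresses emitted in any one cycle are pairwise distinct."""
    outputs = [deque() for _ in range(num_outputs)]
    for a in addresses:
        outputs[a % num_outputs].append(a)
    cycles = []
    while any(outputs):
        cycles.append([f.popleft() for f in outputs if f])
    return cycles
```

For the inputs 1 through 7 plus a second 7, the first cycle emits 1–7 and the duplicate 111 is emitted alone in the second cycle, matching the behavior described for channels D6 and D7.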
Referring to fig. 6, fig. 6 is a structural diagram of an apparatus for parallel processing of graph data according to an embodiment of the present disclosure.
The apparatus may include:
an obtaining module 100, configured to obtain graph data;
the first screening module 200 is configured to screen a source address in the graph data to obtain an independent source address; wherein, the independent source addresses are source addresses with different values;
the second screening module 300 is configured to determine a corresponding destination address according to the independent source address, and screen the destination address to obtain an independent destination address; wherein, the independent destination addresses are destination addresses with different values;
the parallel processing module 400 is configured to obtain corresponding independent source data and independent destination data according to the independent source address and the independent destination address, and perform parallel processing on the independent source data and the independent destination data to obtain a parallel processing result.
Optionally, the first screening module 200 may include:
the storage submodule is used for respectively storing each source address into the corresponding input FIFO;
the selection submodule is used for selecting a screening channel according to a source address; the two ends of the screening channel are respectively an input FIFO and an output FIFO, and only one source address is allowed to pass through the screening channel at each moment;
a determining submodule for determining the source address in each output FIFO to be an independent source address.
Further, the selection sub-module may include:
a determination unit configured to determine the most significant bit of the source address as the tag data;
a selection unit configured to select a first-stage FIFO according to the tag data and move the source address from the input FIFO to the first-stage FIFO;
a judging unit configured to judge whether the tag data is the last bit of the address;
an updating unit configured to, when the tag data is not the last bit, update the tag data to the next bit of the source address, select the next-stage FIFO according to the updated tag data, move the source address to that FIFO, and return to the step of judging whether the tag data is the last bit;
and a moving unit configured to move the source address to the corresponding output FIFO when the tag data is the last bit.
The parallel processing module 400 may include:
the first reading submodule is used for simultaneously reading independent source data corresponding to each independent source address by using a source static random access memory with a preset number of ports;
and the second reading submodule is used for simultaneously reading the independent destination data corresponding to each independent destination address by using the destination static random access memory with the preset number of ports.
Since the embodiments of the apparatus portion and the method portion correspond to each other, please refer to the description of the embodiments of the method portion for the embodiments of the apparatus portion, which is not repeated here.
Referring to fig. 7, fig. 7 is a structural diagram of a graph data parallel processing device according to an embodiment of the present application.
The graph data parallel processing apparatus 700 may have a relatively large difference due to different configurations or performances, and may include one or more processors (CPUs) 722 (e.g., one or more processors) and a memory 732, one or more storage media 730 (e.g., one or more mass storage devices) storing an application 742 or data 744. Memory 732 and storage medium 730 may be, among other things, transient storage or persistent storage. The program stored in the storage medium 730 may include one or more modules (not shown), each of which may include a sequence of instruction operations for the device. Further, the processor 722 may be configured to communicate with the storage medium 730 to execute a series of instruction operations in the storage medium 730 on the graph data parallel processing apparatus 700.
The graph data parallel processing apparatus 700 may also include one or more power supplies 727, one or more wired or wireless network interfaces 750, one or more input/output interfaces 757, and/or one or more operating systems 741, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on.
The steps in the method for graph data parallel processing described in fig. 1 to 5 above are implemented by a graph data parallel processing apparatus based on the structure shown in fig. 7.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses, apparatuses and modules may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus, device and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into modules is merely a division by logical function, and an actual implementation may use another division; for example, a plurality of modules or components may be combined or integrated into another apparatus, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses or modules, and may be electrical, mechanical or in other forms.
Modules described as separate parts may or may not be physically separate, and parts shown as modules may or may not be physical modules; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional modules in the embodiments of the present application may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module. The integrated module may be implemented in the form of hardware, or in the form of a software functional module.
The integrated module, if implemented in the form of a software functional module and sold or used as a separate product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application essentially, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
A method, an apparatus, a device and a readable storage medium for parallel processing of graph data provided by the present application have been described in detail above. The principles and embodiments of the present application are explained herein using specific examples, which are provided only to help understand the method and the core idea of the present application. It should be noted that those skilled in the art can make several improvements and modifications to the present application without departing from the principle of the present application, and such improvements and modifications also fall within the scope of the claims of the present application.
It is further noted that, in this specification, relational terms such as first and second are used solely to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "comprise", "comprising" or any other variation thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such process, method, article or apparatus. Without further limitation, an element defined by the phrase "comprising a(n) …" does not exclude the presence of other identical elements in the process, method, article or apparatus that comprises the element.

Claims (10)

1. A method for parallel processing of graph data, comprising:
acquiring graph data;
screening the source address in the graph data to obtain an independent source address; wherein the independent source addresses are source addresses with different values;
determining a corresponding destination address according to the independent source address, and screening the destination address to obtain an independent destination address; wherein the independent destination addresses are destination addresses with different values;
and acquiring corresponding independent source data and independent destination data according to the independent source address and the independent destination address, and performing parallel processing on the independent source data and the independent destination data to obtain a parallel processing result.
2. The method of claim 1, wherein screening the source addresses in the graph data to obtain independent source addresses comprises:
storing each source address into a corresponding input FIFO respectively;
selecting a screening channel according to the source address; the two ends of the screening channel are the input FIFO and the output FIFO respectively, and only one source address is allowed to pass through the screening channel at each moment;
determining a source address in each of the output FIFOs as the independent source address.
3. The method of claim 2, wherein selecting a screening channel based on the source address comprises:
determining the most significant bit address of the source address as mark data;
selecting a first-stage FIFO according to the mark data, and moving the source address from the input FIFO to the first-stage FIFO;
judging whether the mark data is the last bit address;
if not, updating the mark data to the next bit address of the source address, selecting a next-stage FIFO according to the updated mark data, moving the source address to the next-stage FIFO, and returning to the step of judging whether the mark data is the last bit address;
and if so, moving the source address to the corresponding output FIFO.
4. The method of claim 1, wherein obtaining corresponding independent source data and independent destination data according to the independent source address and the independent destination address comprises:
simultaneously reading independent source data corresponding to each independent source address by using a source static random access memory with a preset number of ports;
and simultaneously reading independent destination data corresponding to each independent destination address by using a destination static random access memory with the preset number of ports.
5. An apparatus for parallel processing of graph data, comprising:
the acquisition module is used for acquiring graph data;
the first screening module is used for screening the source address in the graph data to obtain an independent source address; wherein the independent source addresses are source addresses with different values;
the second screening module is used for determining a corresponding destination address according to the independent source address and screening the destination address to obtain an independent destination address; wherein the independent destination addresses are destination addresses with different values;
and the parallel processing module is used for acquiring corresponding independent source data and independent destination data according to the independent source address and the independent destination address, and performing parallel processing on the independent source data and the independent destination data to obtain a parallel processing result.
6. The apparatus of claim 5, wherein the first screening module comprises:
the storage submodule is used for respectively storing each source address into the corresponding input FIFO;
the selection submodule is used for selecting a screening channel according to the source address; the two ends of the screening channel are the input FIFO and the output FIFO respectively, and only one source address is allowed to pass through the screening channel at each moment;
a determining submodule for determining a source address in each of the output FIFOs as the independent source address.
7. The apparatus of claim 6, wherein the selection submodule comprises:
a determination unit, configured to determine the most significant bit address of the source address as mark data;
a selection unit, configured to select a first-stage FIFO according to the mark data, and move the source address from the input FIFO to the first-stage FIFO;
a judging unit, configured to judge whether the mark data is the last bit address;
an updating unit, configured to, when the mark data is not the last bit address, update the mark data to the next bit address of the source address, select a next-stage FIFO according to the updated mark data, move the source address to the next-stage FIFO, and return to the step of judging whether the mark data is the last bit address;
and a moving unit, configured to move the source address to the corresponding output FIFO when the mark data is the last bit address.
8. The apparatus of claim 5, wherein the parallel processing module comprises:
the first reading submodule is used for simultaneously reading independent source data corresponding to each independent source address by using a source static random access memory with a preset number of ports;
and the second reading submodule is used for simultaneously reading the independent destination data corresponding to each independent destination address by using the destination static random access memory with the preset number of ports.
9. A graph data parallel processing apparatus, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the method for parallel processing of graph data according to any one of claims 1 to 4 when executing the computer program.
10. A readable storage medium, characterized in that the readable storage medium has stored thereon a computer program which, when being executed by a processor, carries out the steps of a method for parallel processing of graph data according to any one of claims 1 to 4.
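The claimed flow can be illustrated with a small software model. The sketch below is an illustrative approximation only, not the patented hardware implementation: the multi-stage FIFO screening channel of claims 2 and 3 is modeled by routing each address bit by bit (most significant bit first) to an output FIFO that admits only one address per value, and the multi-port static random access memory reads of claim 4 are modeled as concurrent dictionary lookups. All names (`screen`, `process`, `ADDR_WIDTH`) are invented for illustration.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor

ADDR_WIDTH = 4  # illustrative address width; the claims do not fix one


def screen(addresses):
    """Model of the screening channel (claims 2-3): each address is routed
    through FIFO stages selected by its successive bits, most significant
    bit first. Because a channel admits only one address of a given value,
    duplicates are dropped and each output FIFO holds one independent
    address."""
    outputs = {}
    for addr in addresses:
        # The bit path plays the role of the per-stage FIFO selection.
        path = tuple((addr >> (ADDR_WIDTH - 1 - i)) & 1 for i in range(ADDR_WIDTH))
        fifo = outputs.setdefault(path, deque())
        if not fifo:  # only one address per value may pass the channel
            fifo.append(addr)
    return [fifo[0] for fifo in outputs.values()]


def process(edges, src_mem, dst_mem):
    """Model of claim 1: screen the source addresses, derive and screen the
    corresponding destination addresses, then fetch both data sets.  The
    concurrent lookups stand in for the multi-port SRAM reads of claim 4."""
    dst_of = dict(edges)  # maps source address -> destination address
    independent_src = screen([s for s, _ in edges])
    independent_dst = screen([dst_of[s] for s in independent_src])
    with ThreadPoolExecutor() as pool:
        src_data = list(pool.map(src_mem.__getitem__, independent_src))
        dst_data = list(pool.map(dst_mem.__getitem__, independent_dst))
    return src_data, dst_data
```

In this model, screening `[1, 2, 1, 3]` yields `[1, 2, 3]`: the duplicate source address is filtered out before any data is read, so the subsequent reads never touch the same location twice and can proceed in parallel without conflicts.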
CN201911402930.9A 2019-12-30 2019-12-30 Method, device and equipment for parallel processing of graph data and readable storage medium Active CN111177482B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911402930.9A CN111177482B (en) 2019-12-30 2019-12-30 Method, device and equipment for parallel processing of graph data and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911402930.9A CN111177482B (en) 2019-12-30 2019-12-30 Method, device and equipment for parallel processing of graph data and readable storage medium

Publications (2)

Publication Number Publication Date
CN111177482A true CN111177482A (en) 2020-05-19
CN111177482B CN111177482B (en) 2022-04-22

Family

ID=70650529

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911402930.9A Active CN111177482B (en) 2019-12-30 2019-12-30 Method, device and equipment for parallel processing of graph data and readable storage medium

Country Status (1)

Country Link
CN (1) CN111177482B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101556534A (en) * 2009-04-21 2009-10-14 浪潮电子信息产业股份有限公司 Large-scale data parallel computation method with many-core structure
US20160026677A1 (en) * 2014-07-23 2016-01-28 Battelle Memorial Institute System and method of storing and analyzing information
CN106161254A (en) * 2016-07-18 2016-11-23 中国科学院计算技术研究所 A kind of many purposes data transmission network road route device, method, chip, router
CN110134664A (en) * 2019-04-12 2019-08-16 中国平安财产保险股份有限公司 Acquisition methods, device and the computer equipment in Data Migration path

Also Published As

Publication number Publication date
CN111177482B (en) 2022-04-22

Similar Documents

Publication Publication Date Title
EP3369045B1 (en) Determining orders of execution of a neural network
US10650047B2 (en) Dense subgraph identification
JP5950285B2 (en) A method for searching a tree using an instruction that operates on data having a plurality of predetermined bit widths, a computer for searching a tree using the instruction, and a computer thereof program
US20140188893A1 (en) Data retrieval apparatus, data storage method and data retrieval method
CN103970604A (en) Method and device for realizing image processing based on MapReduce framework
CN112734034A (en) Model training method, calling method, device, computer equipment and storage medium
WO2023138188A1 (en) Feature fusion model training method and apparatus, sample retrieval method and apparatus, and computer device
CN116822422B (en) Analysis optimization method of digital logic circuit and related equipment
CN105302536A (en) Configuration method and apparatus for related parameters of MapReduce application
CN111274455B (en) Graph data processing method and device, electronic equipment and computer readable medium
CN115358397A (en) Parallel graph rule mining method and device based on data sampling
CN111177482B (en) Method, device and equipment for parallel processing of graph data and readable storage medium
US9740797B2 (en) Counting bloom filter
CN113157695B (en) Data processing method and device, readable medium and electronic equipment
Fischer et al. Unrooted non-binary tree-based phylogenetic networks
CN113626650A (en) Service processing method and device and electronic equipment
CN105677801A (en) Data processing method and system based on graph
CN111049988A (en) Intimacy prediction method, system, equipment and storage medium for mobile equipment
CN116541421B (en) Address query information generation method and device, electronic equipment and computer medium
CN111915002B (en) Operation method, device and related product
JP6485594B2 (en) Memory control device and memory control method
CN116933880A (en) Quantum circuit depth optimization method and system based on genetic algorithm and electronic equipment
CN114298203A (en) Method, device, equipment and computer readable medium for data classification
Morshed et al. SOF: An Efficient String Graph Construction Algorithm
CN118229509A (en) Image processing optimization method, system, equipment and medium suitable for DSP

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant