CN111953614B

CN111953614B - Data transmission method, device, processing equipment and medium

Info

Publication number: CN111953614B
Application number: CN202010793135.3A
Authority: CN
Inventors: 姜曦楠; 朱子霖; 周飞虎; 郭振宇
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2020-08-07
Filing date: 2020-08-07
Publication date: 2023-10-24
Anticipated expiration: 2040-08-07
Also published as: CN111953614A

Abstract

The embodiments of the application disclose a data transmission method, device, processing equipment and medium based on cloud technology. The method comprises: acquiring dominant point information of a plurality of target nodes in a computational graph of a target object; aggregating at least two target nodes into a target aggregation node according to the dominant point information of each target node; and updating the computational graph with the target aggregation node and sending the updated computational graph to a computing device, wherein the updated computational graph is used for instructing the computing device, in the process of computing the target object, to aggregate the execution result data of the data processing operations represented by the aggregated target nodes according to the indication of the target aggregation node, and to transmit the aggregation result. According to the embodiments of the application, the computing device can be instructed, through the updated computational graph, to aggregate and transmit the execution result data of the target nodes, so that the number of data transmissions is reduced, network resources are saved, and the total transmission time is shortened.

Description

Data transmission method, device, processing equipment and medium

Technical Field

The present application relates to the field of internet technologies, specifically to the field of computer technologies, and in particular to a data transmission method, a data transmission device, a processing apparatus, and a computer storage medium.

Background

In mathematical graph theory, a graph is an abstraction used to express relationships between objects, and consists essentially of nodes, which represent objects, and edges, which represent relationships between objects; a graph in which each edge has a direction may be referred to as a directed graph. With the development of graph technology and internet technology, the computational graph has emerged; the so-called computational graph, which may also be referred to as a data flow graph, refers in particular to a directed graph used for characterizing the data flow computation of a target object. The nodes in the computational graph are used for representing data processing operations involved in the process of computing the target object, and one data processing operation corresponds to one piece of execution result data; the edges in the computational graph are used to represent dependencies, such as data dependencies and control dependencies, between data processing operations (nodes). A computational graph typically has specific target nodes, which represent data processing operations whose execution result data needs to be transmitted.

At present, before a computing device computes a target object, a computational graph of the target object is generally constructed and sent directly to the computing device, so that, in the process of computing the target object, the computing device transmits the corresponding execution result data immediately after executing the data processing operation represented by each target node. Such a data transmission manner may result in an excessive number of data transmissions and excessive consumption of network resources; moreover, each transmission typically incurs a network delay, which also results in a longer total transmission duration.
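By way of a non-limiting illustration, the cost of per-node transmission can be estimated with a simple model in which each transmission costs a fixed network delay plus the payload size divided by the bandwidth. This model, the function name `total_transfer_time`, and all numbers below are assumptions made for illustration and are not taken from the application.

```python
def total_transfer_time(sizes_bytes, latency_s, bandwidth_bytes_per_s, aggregate):
    """Estimate the total time needed to send the given results.

    Assumed model (not from the application): every transmission costs a
    fixed network delay plus payload size divided by bandwidth.
    """
    if aggregate:
        # One transmission carrying all payloads: the delay is paid once.
        return latency_s + sum(sizes_bytes) / bandwidth_bytes_per_s
    # One transmission per result: the delay is paid every time.
    return sum(latency_s + size / bandwidth_bytes_per_s for size in sizes_bytes)


# Hypothetical numbers: four results of 1 MB each, 1 ms delay, 1 GB/s link.
sizes = [1_000_000] * 4
separate = total_transfer_time(sizes, 0.001, 1e9, aggregate=False)
combined = total_transfer_time(sizes, 0.001, 1e9, aggregate=True)
print(f"separate: {separate:.4f} s, aggregated: {combined:.4f} s")
```

Under this assumed model the payload time is unchanged by aggregation; the saving comes only from paying the fixed delay once instead of once per target node.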

Disclosure of Invention

The embodiments of the invention provide a data transmission method, device, processing equipment and medium, which can instruct a computing device, through an updated computational graph, to aggregate and transmit the execution result data of target nodes, thereby reducing the number of data transmissions, saving network resources and shortening the total transmission time.

In one aspect, an embodiment of the present invention provides a data transmission method, where the method includes:

acquiring dominant point information of a plurality of target nodes in a calculation graph of a target object; each target node is used for representing a data processing operation which needs to be executed in the calculation process of the target object, and execution result data of the data processing operation represented by each target node needs to be transmitted; the dominant point information of any target node is used for reflecting the dominant relationship between any target node and other target nodes;

according to the dominant point information of each target node, aggregating at least two target nodes into a target aggregation node, wherein the target aggregation node is used for indicating to aggregate the execution result data of the data processing operation represented by the aggregated target nodes;

updating the computational graph by adopting the target aggregation node, and sending the updated computational graph to a computing device, wherein the updated computational graph is used for instructing the computing device, in the process of computing the target object, to aggregate the execution result data of the data processing operations represented by the aggregated target nodes according to the indication of the target aggregation node, and to transmit the aggregation result.

In another aspect, an embodiment of the present invention provides a data transmission apparatus, including:

an acquisition unit configured to acquire dominant point information of a plurality of target nodes in a computation graph of a target object; each target node is used for representing a data processing operation which needs to be executed in the calculation process of the target object, and execution result data of the data processing operation represented by each target node needs to be transmitted; the dominant point information of any target node is used for reflecting the dominant relationship between any target node and other target nodes;

the aggregation unit is used for aggregating at least two target nodes into a target aggregation node according to the dominant point information of each target node, and the target aggregation node is used for indicating to aggregate the execution result data of the data processing operation represented by the aggregated target node;

the processing unit is used for updating the computational graph by adopting the target aggregation node and sending the updated computational graph to a computing device, wherein the updated computational graph is used for instructing the computing device, in the process of computing the target object, to aggregate the execution result data of the data processing operations represented by the aggregated target nodes according to the indication of the target aggregation node, and to transmit the aggregation result.

In one embodiment, the aggregation unit, when configured to aggregate at least two target nodes into a target aggregation node according to dominant point information of each target node, may be specifically configured to:

constructing a dominance tree formed by the plurality of target nodes according to dominance point information of each target node;

extracting aggregation level information based on the dominance tree; the aggregation level information includes: the node groups required by N layers of aggregation, wherein N is a positive integer; at least one node in each node group is a target node;

and performing at least one layer of aggregation iteration processing on the target nodes according to the aggregation level information to obtain target aggregation nodes.

In still another embodiment, the aggregation unit, when configured to construct a dominance tree composed of the plurality of target nodes according to the dominant point information of each target node, may be specifically configured to:

taking the starting target node in the target directed graph as the root node of the dominance tree, and determining, among the plurality of target nodes, the remaining target nodes other than the starting target node in the target directed graph;

acquiring the nearest dominant point of each remaining target node from the dominant point set in the dominant point information of that remaining target node;

determining the nearest dominance relation among the target nodes according to the nearest dominant point of each remaining target node;

and adding the remaining target nodes under the root node according to the nearest dominance relation to obtain the dominance tree.
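The tree construction above can be sketched as follows. This is a minimal illustration, assuming that each dominant point set contains the node itself and that the sets shown are those of a six-node target directed graph reconstructed from the paths named later in the description (B→E→G→J→O and B→E→L→O). The function name `build_dominance_tree` is hypothetical.

```python
def build_dominance_tree(dom_sets, root):
    """Return {node: parent}, where parent is the nearest dominant point.

    dom_sets maps each node to its dominant point set, assumed here to
    include the node itself. Every non-root node is assumed to be
    dominated at least by the root.
    """
    parent = {root: None}
    for node, doms in dom_sets.items():
        if node == root:
            continue
        strict = doms - {node}
        # The strict dominant points of a node form a chain, so the nearest
        # one is the one that itself has the largest dominant point set.
        parent[node] = max(strict, key=lambda d: len(dom_sets[d]))
    return parent


# Dominant point sets assumed for the reconstructed example graph.
dom_sets = {
    "B": {"B"},
    "E": {"B", "E"},
    "G": {"B", "E", "G"},
    "J": {"B", "E", "G", "J"},
    "L": {"B", "E", "L"},
    "O": {"B", "E", "O"},
}
tree = build_dominance_tree(dom_sets, "B")
print(tree)
```

In the resulting parent map, each non-root target node hangs under its nearest dominant point, which matches the parent relation stated for the dominance tree.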

In yet another embodiment, the parent node of each target node in the dominance tree except the root node is the nearest dominant point of that target node; K dominant pairs exist among the target nodes, and one dominant pair is used for associating the node groups required by at least one layer of aggregation, wherein K is a positive integer; accordingly, the aggregation unit, when configured to extract aggregation level information based on the dominance tree, may be specifically configured to:

selecting a first target node from the target nodes that have not been traversed in the dominance tree, in a bottom-up traversal order;

detecting, according to the inverse dominant point set of each target node other than the end target node in the target directed graph, whether a second target node and the first target node form a kth dominant pair, where k ∈ [1, K]; the second target node satisfies the following condition: the second target node is the nearest dominant point of the first target node, and the first target node is the nearest inverse dominant point of the second target node;

if so, selecting at least one target node from the plurality of target nodes according to the second target node, adding the selected target node to the node group required by the target layer aggregation associated with the kth dominant pair, and continuing to traverse the dominance tree;

if not, reselecting the first target node, until each target node in the dominance tree has been traversed.
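The pair-detection condition above can be sketched as follows. This is an illustration only: the bottom-up order is approximated by visiting deeper nodes of the dominance tree first, the function name `find_dominant_pairs` is hypothetical, and the two maps are those of a six-node graph reconstructed from the paths named later in the description.

```python
def find_dominant_pairs(idom, ipdom):
    """Return (second, first) pairs found in a bottom-up pass.

    idom:  node -> nearest dominant point (None for the root)
    ipdom: node -> nearest inverse dominant point (None for the end node)
    """
    def depth(node):
        # Depth of a node in the dominance tree (root has depth 0).
        d = 0
        while idom[node] is not None:
            node = idom[node]
            d += 1
        return d

    pairs = []
    # Assumed traversal: deepest nodes first approximates "bottom to top".
    for first in sorted(idom, key=depth, reverse=True):
        second = idom[first]
        # Pair condition: second is first's nearest dominant point and
        # first is second's nearest inverse dominant point.
        if second is not None and ipdom.get(second) == first:
            pairs.append((second, first))
    return pairs


idom = {"B": None, "E": "B", "G": "E", "J": "G", "L": "E", "O": "E"}
ipdom = {"O": None, "J": "O", "L": "O", "G": "J", "E": "O", "B": "E"}
pairs = find_dominant_pairs(idom, ipdom)
print(pairs)
```

For these assumed maps the pass finds three dominant pairs, so K would be 3 in this illustration.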

In still another embodiment, the aggregation unit, when configured to select, according to the second target node, at least one target node from the plurality of target nodes to be added to a node group required for target layer aggregation associated with the kth dominant pair, may be specifically configured to:

acquiring a descendant node set of the second target node from the dominance tree;

if the descendant node set only comprises the first target node and the descendant nodes of the first target node, selecting the first target node and the second target node, and adding the first target node and the second target node into a node group required by target layer aggregation associated with the kth dominant pair;

and if the descendant node set comprises other descendant nodes except the first target node and the descendant nodes of the first target node, selecting the other descendant nodes and adding the other descendant nodes into the node group required by the target layer aggregation.
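The two cases above can be sketched as follows, assuming the dominance tree is stored as a parent-to-children map. The helper names `descendants` and `select_for_group` are hypothetical, and the tree shown is the one assumed for a six-node graph reconstructed from the paths named later in the description.

```python
def descendants(children, node):
    """All descendants of node in a tree given as {parent: [children]}."""
    found, stack = set(), list(children.get(node, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(children.get(current, []))
    return found


def select_for_group(children, first, second):
    """Pick the nodes to add to the node group for the pair (second, first)."""
    own = {first} | descendants(children, first)
    others = descendants(children, second) - own
    if not others:
        # Case 1: second's descendants are only first and first's descendants.
        return {first, second}
    # Case 2: other descendants exist and are selected instead.
    return others


# Assumed dominance tree: B -> E -> {G, L, O}, G -> J.
children = {"B": ["E"], "E": ["G", "L", "O"], "G": ["J"]}
print(select_for_group(children, first="J", second="G"))
print(select_for_group(children, first="O", second="E"))
```

The handling of history node groups described in the following embodiments is deliberately left out of this sketch.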

In yet another embodiment, the aggregation unit, when configured to select the first target node and the second target node and add them to the node group required by the target layer aggregation associated with the kth dominant pair, may be specifically configured to:

detecting whether a first history node group that includes the first target node exists among the node groups required by the history layer aggregations associated with the first k-1 dominant pairs;

if the first history node group exists, adding the aggregation node corresponding to the first history node group and the second target node into the node group required by target layer aggregation associated with the kth dominant pair;

and if the first historical node group does not exist, adding the first target node and the second target node into the node group required by the target layer aggregation.

In yet another embodiment, the aggregation unit, when used for selecting the other descendant nodes to be added to the node group required by the target layer aggregation, may be specifically configured to:

detecting whether a second history node group whose corresponding aggregation node includes the other descendant nodes exists among the node groups required by the history layer aggregations associated with the first k-1 dominant pairs;

if the second history node group exists, adding the aggregation node corresponding to the second history node group, the first target node and the second target node into the node group required by the target layer aggregation;

if the second historical node group does not exist, adding the other descendant nodes into the node group required by the target layer aggregation, and adding the aggregation nodes aggregated by the other descendant nodes, the first target node and the second target node into the node group required by the next layer aggregation below the target layer aggregation, which is associated with the kth dominant pair.

In still another embodiment, when the aggregation unit is configured to perform at least one layer of aggregation iterative processing on the plurality of target nodes according to the aggregation level information, the aggregation unit may be specifically configured to:

determining the nth node group required by the nth layer aggregation according to the aggregation level information, and determining the traffic sum of the nth node group according to the traffic of each node in the nth node group, where n ∈ [1, N];

when the traffic sum of the nth node group is smaller than or equal to a traffic threshold, performing aggregation processing on the nodes in the nth node group to obtain an nth aggregation node;

and if the current value of n is smaller than N and the traffic sum of the (n+1)th node group required by the (n+1)th layer aggregation, determined according to the aggregation level information, is greater than the traffic threshold, obtaining the target aggregation node according to the nth aggregation node.

In yet another embodiment, the aggregation unit may further be specifically configured to:

if the current value of n is smaller than N and the traffic sum of the (n+1)th node group is smaller than or equal to the traffic threshold, adding one to the current value of n to update n, and executing again the step of determining the nth node group required by the nth layer aggregation according to the aggregation level information;

and if the current value of n is equal to N, obtaining the target aggregation node according to the nth aggregation node.
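The layer-by-layer iteration under a traffic threshold can be sketched as follows. Several points are assumptions made for illustration: the traffic of an aggregation node is taken to be the sum of its members' traffic, nothing is aggregated when the first group already exceeds the threshold, and the names `aggregate_layers` and `agg1`, `agg2`, ... as well as all numbers are hypothetical.

```python
def aggregate_layers(groups, traffic, threshold):
    """Aggregate node groups layer by layer while the traffic sum fits.

    groups:    list of N node groups, layer 1 first; a later group may name
               an earlier aggregation node such as "agg1"
    traffic:   node -> traffic
    threshold: traffic threshold
    Returns a list of (aggregation_node_name, members) per aggregated layer.
    """
    traffic = dict(traffic)  # work on a copy so the caller's map is untouched
    result = []
    # Assumption: if even the first group exceeds the threshold, stop early.
    if not groups or sum(traffic[x] for x in groups[0]) > threshold:
        return result
    n = 0  # zero-based index of the current layer
    while True:
        name = f"agg{n + 1}"
        # Assumption: an aggregation node carries the sum of its members.
        traffic[name] = sum(traffic[x] for x in groups[n])
        result.append((name, list(groups[n])))
        if n + 1 == len(groups):
            break  # n equals N: the last layer has been aggregated
        if sum(traffic[x] for x in groups[n + 1]) > threshold:
            break  # the next layer would exceed the threshold
        n += 1
    return result


# Hypothetical traffic values and aggregation level information.
traffic = {"G": 4, "J": 3, "L": 2, "E": 5, "B": 50}
groups = [["G", "J"], ["agg1", "L"], ["agg2", "E"], ["agg3", "B"]]
layers = aggregate_layers(groups, traffic, threshold=16)
print(layers)
```

With these hypothetical values the traffic sums of the first three groups are 7, 9 and 14, all within the threshold, while the fourth group would reach 64, so the iteration stops after the third layer.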

In yet another embodiment, the aggregation unit, when configured to obtain the target aggregation node according to the nth aggregation node, may be specifically configured to:

if the value of n is 1, taking the 1st aggregation node as the target aggregation node;

if the value of n is not 1, obtaining at least one history aggregation node produced by the previous n-1 layers of aggregation, selecting from the at least one history aggregation node the history aggregation nodes that have not themselves been aggregated, and taking the selected history aggregation nodes and the nth aggregation node as target aggregation nodes.

In yet another embodiment, the processing unit, when configured to update the computational graph with the target aggregation node, may be specifically configured to:

adding the target aggregation node in the calculation graph, and connecting the target aggregation node and the aggregated target node by adopting a directed edge;

adding a matched communication node to the target node which is not aggregated in the calculation graph, and adding a matched communication node to the target aggregation node in the calculation graph; the communication node is configured to represent a data transfer operation.
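The graph update above can be sketched on an edge-set representation. The edge direction (from an aggregated target node to the aggregation node, and from a node to its communication node), the naming scheme `agg(...)` and `send(...)`, and the function name `update_graph` are all assumptions made for illustration; the description only states that directed edges and matched communication nodes are added.

```python
def update_graph(edges, target_nodes, aggregated):
    """Return (aggregation_node, new_edges) for an updated graph.

    edges:        set of (source, destination) directed edges
    target_nodes: all target nodes of the graph
    aggregated:   the target nodes folded into one aggregation node
    """
    new_edges = set(edges)
    agg = "agg(" + ",".join(sorted(aggregated)) + ")"
    # Connect each aggregated target node to the aggregation node.
    for node in aggregated:
        new_edges.add((node, agg))
    # Target nodes that were not aggregated get their own communication node.
    for node in target_nodes:
        if node not in aggregated:
            new_edges.add((node, f"send({node})"))
    # The aggregation node gets a single communication node for the group.
    new_edges.add((agg, f"send({agg})"))
    return agg, new_edges


# Hypothetical three-node graph in which a and b are aggregated.
agg, updated = update_graph({("a", "b"), ("b", "c")}, ["a", "b", "c"], {"a", "b"})
print(agg)
print(sorted(updated))
```

In this sketch the aggregated nodes a and b no longer have communication nodes of their own, which is where the reduction in the number of transmissions comes from.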

In still another aspect, an embodiment of the present invention provides a processing apparatus, including an input interface and an output interface, the processing apparatus further including:

a processor adapted to implement one or more instructions; and,

a computer storage medium storing one or more instructions adapted to be loaded by the processor and to perform the steps of:

In yet another aspect, embodiments of the present invention provide a computer storage medium storing one or more instructions adapted to be loaded by a processor and to perform the steps of:

According to the embodiment of the invention, at least two target nodes can be aggregated into a target aggregation node according to the dominant point information of a plurality of target nodes in the calculation graph of the target object; the target aggregation node is configured to instruct aggregation of execution result data of the data processing operation represented by the aggregated target node. The computational graph may then be updated with the target aggregation node and the updated computational graph sent to the computing device. In the process of calculating the target object, after the data processing operation represented by the aggregated target node is executed, the computing device can aggregate and transmit the execution result data of the data processing operation represented by the aggregated target node according to the indication of the target aggregation node, so that the number of times of data transmission is reduced, network resources are saved, and the total transmission time is shortened.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1a is a schematic diagram of a data transmission system according to an embodiment of the present invention;

FIG. 1b is a schematic diagram of a data transmission system according to another embodiment of the present invention;

FIG. 2 is a schematic flow chart of a data transmission method according to an embodiment of the present invention;

FIG. 3a is a schematic diagram of a computational graph and a target directed graph provided by an embodiment of the present invention;

FIG. 3b is a schematic diagram of a reverse graph provided by an embodiment of the present invention;

FIG. 4 is a schematic flow chart of a data transmission method according to another embodiment of the present invention;

FIG. 5a is a schematic diagram of an adjacency matrix according to another embodiment of the present invention;

FIG. 5b is a schematic diagram of an initial reachability matrix provided by another embodiment of the present invention;

FIG. 5c is a schematic diagram of a target reachability matrix provided by another embodiment of the present invention;

FIG. 5d is a schematic diagram of the construction of a target directed graph according to another embodiment of the present invention;

FIG. 5e is a schematic diagram of a dominance tree provided by another embodiment of the present invention;

FIG. 5f is a schematic diagram of an inverse dominant tree provided by another embodiment of the present invention;

FIG. 5g is a schematic diagram illustrating extraction of aggregation level information according to another embodiment of the present invention;

FIG. 5h is a schematic diagram illustrating extraction of aggregation level information according to another embodiment of the present invention;

FIG. 5i is a schematic diagram illustrating extraction of aggregation level information according to another embodiment of the present invention;

FIG. 5j is a schematic diagram illustrating extraction of aggregation level information according to another embodiment of the present invention;

FIG. 5k is a schematic diagram of generating a target aggregation node according to another embodiment of the present invention;

FIG. 5l is a schematic diagram of adding a target aggregation node according to another embodiment of the present invention;

FIG. 5m is a schematic diagram of adding a communication node according to another embodiment of the present invention;

FIG. 6 is a schematic diagram of an application scenario of distributed machine learning according to an embodiment of the present invention;

FIG. 7 is a schematic structural diagram of a data transmission device according to an embodiment of the present invention;

FIG. 8 is a schematic structural diagram of a processing apparatus according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.

In the calculation process of the target object, in order to better transmit the execution result data of the data processing operation represented by each target node, the embodiment of the invention firstly provides a data transmission system. The target object herein refers to any object involved in multiple data processing operations in the calculation process, for example, the target object may be a neural network model involved in multiple data processing operations such as convolution operation and pooling operation in the model training process; as another example, the target object may be an application program that involves multiple data processing operations such as a test operation on the application function 1, a test operation on the application function 2, and the like in the application test process; for another example, the target object may be a hardware device involved in a plurality of data processing operations such as a test operation on the module 1, a test operation on the module 2, and the like in a hardware test process.

Specifically, the data transmission system may include: one processing device 11 and one or more computing devices 12; the processing device 11 and the computing devices 12 may communicate with each other. The processing device 11 is mainly configured to generate and update a computational graph (i.e. a data flow graph) of a target object, and to send the computational graph to each computing device 12; the processing device 11 may be any terminal or server having a data processing function. The computing device 12 is mainly used for executing multiple data processing operations on the target object, and for transmitting the execution result data of some or all of the data processing operations according to the indication of the computational graph; the computing device 12 may be any terminal or server having a data computing function and a communication function. In one specific implementation, when transmitting the execution result data of some or all of the data processing operations according to the indication of the computational graph, each computing device 12 may return that execution result data to the processing device 11, so that the processing device 11 may perform subsequent processing on the target object, such as model updating processing, application test analysis processing, or module test analysis processing, according to the execution result data sent by each computing device 12; in this specific implementation, the system architecture of the data transmission system is shown in FIG. 1a.
In still another specific implementation, when transmitting the execution result data of some or all of the data processing operations according to the indication of the computational graph, each computing device 12 may transmit that execution result data to a separate management device 13, so that the management device 13 may perform subsequent processing according to the execution result data sent by each computing device 12; in this specific implementation, the system architecture of the data transmission system is shown in FIG. 1b. For ease of illustration, the following description is based on the system architecture shown in FIG. 1b.

It should be noted that FIG. 1a and FIG. 1b merely illustrate the specific architecture of the data transmission system by way of example and do not limit it. For example, in FIGS. 1a and 1b a separate processing device 11 is physically deployed to perform the generation and updating operations of the computational graph; in other embodiments, any one of the plurality of computing devices 12 may serve as the processing device and perform the generation and updating operations of the computational graph, in which case a separate processing device 11 need not be deployed. It should be further noted that the above-mentioned terminals may include, but are not limited to: smart phones, tablet computers, notebook computers, desktop computers, and the like. The above-mentioned servers may be independent physical servers, server clusters or distributed systems formed by a plurality of physical servers, or cloud servers that provide basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (Content Delivery Network), and big data and artificial intelligence platforms.

Based on the data transmission system, the embodiment of the invention also provides a data transmission scheme. Specifically, the general principle of the data transmission scheme is as follows: the processing device may compare the dominant points (Dominators) and inverse dominant points (Post-Dominators) of the nodes in the computational graph of the target object that need to transmit synchronization data (i.e., execution result data), aggregate nodes that are a dominant point and an inverse dominant point of each other into one aggregation node, and update the computational graph with the aggregation node. The dominant point is defined as follows: a node d in a directed graph is considered to dominate a node n if and only if every path from the starting node of the directed graph (which can be understood as the source node) to node n passes through node d; node d is then a dominant point of node n, written as d dom n (or d > n). Accordingly, an inverse dominant point is a dominant point obtained from the graph in which all edges of the directed graph are reversed. The aggregation node is used for indicating that the execution result data of the data processing operations represented by the aggregated target nodes is to be aggregated. The updated computational graph may then be issued to each computing device; in the process of computing the target object, each computing device can aggregate and transmit the execution result data of the data processing operations represented by the aggregated target nodes according to the indication of the aggregation node. Therefore, the data transmission scheme provided by the embodiment of the invention can realize the aggregated transmission of the execution result data corresponding to at least two target nodes, so that the number of data transmissions can be effectively reduced, network resources can be saved, and the total transmission time can be shortened.
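By way of a non-limiting illustration, the definition of dominant points and inverse dominant points can be sketched with the classic iterative set-intersection computation. This is a generic textbook technique used here for illustration, not necessarily the procedure used by the application; the function names `dominators` and `reverse` and the four-node diamond graph are hypothetical, and every node is assumed to be reachable from the starting node.

```python
def dominators(succ, start):
    """Map each node to its set of dominators (including the node itself).

    succ maps a node to the list of its successors. Every node is assumed
    to be reachable from start.
    """
    nodes = set(succ) | {v for targets in succ.values() for v in targets}
    pred = {n: set() for n in nodes}
    for u, targets in succ.items():
        for v in targets:
            pred[v].add(u)
    # Start from "everything dominates everything" and shrink to a fixpoint.
    dom = {n: set(nodes) for n in nodes}
    dom[start] = {start}
    changed = True
    while changed:
        changed = False
        for n in nodes - {start}:
            incoming = [dom[p] for p in pred[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def reverse(succ):
    """Return the graph with every directed edge reversed."""
    rev = {}
    for u, targets in succ.items():
        rev.setdefault(u, [])
        for v in targets:
            rev.setdefault(v, []).append(u)
    return rev


# Hypothetical diamond graph: s -> a -> t and s -> b -> t.
graph = {"s": ["a", "b"], "a": ["t"], "b": ["t"], "t": []}
dom = dominators(graph, "s")
# Inverse dominant points: dominators of the reversed graph, started at t.
pdom = dominators(reverse(graph), "t")
print(dom["t"], pdom["s"])
```

In the diamond, s dominates t and t inversely dominates s, so s and t are a dominant point and an inverse dominant point of each other, which is the relationship the scheme looks for when choosing nodes to aggregate.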

Based on the above description, the embodiments of the present invention propose a data transmission method, which can be performed by the processing device mentioned above. Referring to fig. 2, the data transmission method may include the following steps S201 to S203:

S201, obtaining dominant point information of a plurality of target nodes in a computational graph of the target object.

In the embodiment of the invention, each target node can be used for representing one data processing operation that needs to be executed in the process of computing the target object, and the execution result data of the data processing operation represented by each target node needs to be transmitted. Specifically, the plurality of target nodes may correspond to a target directed graph, where the target directed graph is obtained by connecting the target nodes with a plurality of directed edges based on the reachability relationship of the target nodes in the computational graph. The reachability relationship here may be used to indicate the ability of one target node to reach other target nodes along at least one edge in the computational graph: if a target node A can reach another target node B through a series of edges, target node B is considered reachable from target node A; otherwise, target node B is considered unreachable from target node A. For convenience of explanation, the computational graph shown in the upper part of FIG. 3a is taken as an example; referring to the upper part of FIG. 3a, the computational graph may include a plurality of computation nodes, where the number on each computation node represents the duration of the operation, each connection (i.e., directed edge) between computation nodes represents a dependency, and the number on a connection represents the number of that directed edge. The black computation nodes in the computational graph are the target nodes; accordingly, the target directed graph corresponding to the target nodes in the computational graph is shown in the lower part of FIG. 3a. The dominant point information of any target node can be used to reflect the dominance relationship between that target node and the other target nodes.
Specifically, the dominant point information of any target node may include at least one of: a dominant point set of that target node and an inverse dominant point set of that target node.
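The reachability relationship between target nodes can be sketched with a depth-first search over the computational graph. The five-node graph, the choice of target nodes, and the function name `reachable_targets` are hypothetical; how the application turns this relationship into the edges of the target directed graph is not reproduced here.

```python
def reachable_targets(succ, targets):
    """For each target node, the other target nodes reachable from it.

    succ maps a node to the list of its successors in the computational
    graph, which is assumed to be acyclic.
    """
    target_set = set(targets)
    result = {}
    for target in targets:
        seen, stack = set(), list(succ.get(target, []))
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(succ.get(node, []))
        # Keep only target nodes; intermediate computation nodes are dropped.
        result[target] = (seen & target_set) - {target}
    return result


# Hypothetical computational graph; a, c and e are the target nodes.
graph = {"a": ["b", "d"], "b": ["c"], "d": ["c"], "c": ["e"]}
reach = reachable_targets(graph, ["a", "c", "e"])
print(reach)
```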

The dominant point set of any target node refers to the set formed by all dominant points of that target node; a dominant point of a target node is a target node through which every forward path from the starting target node to that target node in the target directed graph passes. The dominant point in the dominant point set that is closest to the target node is its nearest dominant point; that is, the nearest dominant point of any target node is the one closest to it among all dominant points in its dominant point set other than the node itself. It should be noted that the starting target node in the target directed graph has no dominant point. For example, see the target directed graph in fig. 3a, and let the target node in question be target node J. Since there is only one forward path (i.e., forward path B→E→G→J) from the starting target node (i.e., target node B) to target node J in the target directed graph, target node B, target node E, target node G, and target node J, through which this forward path passes, may all be dominant points of target node J. That is, the dominant point set of target node J may include: target node B, target node E, target node G, and target node J. The nearest dominant point of target node J is target node G, the dominant point closest to target node J. As another example, let the target node in question be target node O. Since there are two forward paths (i.e., forward path B→E→G→J→O and forward path B→E→L→O) from the starting target node (i.e., target node B) to target node O in the target directed graph, target node B and target node E, through which both forward paths pass, may both serve as dominant points of target node O. That is, the dominant point set of target node O may include: target node B and target node E; the nearest dominant point of target node O is target node E, the dominant point closest to target node O.
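
As a non-limiting illustration, the dominant point sets of the example above may be computed with a standard iterative data-flow method. The following is a minimal sketch, assuming the edges B→E, E→G, G→J, J→O, E→L, and L→O read from the forward paths described above; the names `EDGES`, `dominator_sets`, and `nearest_dominator` are hypothetical and are not part of the described embodiment. In this sketch every set contains the node itself, as in the example for target node J.

```python
# Target directed graph assumed from the forward paths in the text.
EDGES = {"B": ["E"], "E": ["G", "L"], "G": ["J"], "J": ["O"], "L": ["O"], "O": []}

def dominator_sets(edges, start):
    """Iteratively compute, for each node, the set of nodes on every path from start."""
    nodes = list(edges)
    preds = {n: [] for n in nodes}
    for src, dsts in edges.items():
        for dst in dsts:
            preds[dst].append(src)
    dom = {n: set(nodes) for n in nodes}  # start from "everything dominates"
    dom[start] = {start}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n == start:
                continue
            pred_sets = [dom[p] for p in preds[n]]
            new = (set.intersection(*pred_sets) if pred_sets else set()) | {n}
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom

def nearest_dominator(dom, node):
    """Among the dominant points other than the node itself, pick the closest one.

    Dominant points of a node form a chain, so the closest one is the one
    with the largest dominant point set of its own.
    """
    strict = dom[node] - {node}
    return max(strict, key=lambda d: len(dom[d])) if strict else None

dom = dominator_sets(EDGES, "B")
print(sorted(dom["J"]), nearest_dominator(dom, "J"))
print(sorted(dom["O"]), nearest_dominator(dom, "O"))
```

The starting node B has no nearest dominant point in this sketch, which corresponds to the statement that the starting target node has no dominant point.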

The inverse dominant point set of any target node refers to the set formed by all inverse dominant points of that target node; an inverse dominant point of a target node is a target node through which every reverse path from the starting target node to that target node passes in the reverse graph corresponding to the target directed graph. The inverse dominant point in the inverse dominant point set that is closest to the target node is its nearest inverse dominant point; that is, the nearest inverse dominant point of any target node is the one closest to it among all inverse dominant points in its inverse dominant point set other than the node itself. The reverse graph is the graph obtained by reversing each directed edge in the target directed graph. It should be noted that the starting target node in the reverse graph (i.e., the end target node in the target directed graph) has no inverse dominant point. For example, still taking the target directed graph shown in the lower graph of fig. 3a as an example, the corresponding reverse graph can be seen in fig. 3b. Let the target node in question be target node G. Since there is only one reverse path (i.e., reverse path O→J→G) from the starting target node (i.e., target node O) to target node G in the reverse graph, target node O and target node J, through which this reverse path passes, may both be inverse dominant points of target node G. That is, the inverse dominant point set of target node G may include: target node O and target node J. The nearest inverse dominant point of target node G is target node J, the node in the inverse dominant point set closest to target node G. As another example, let the target node in question be target node B. Since there are two reverse paths (i.e., reverse path O→J→G→E→B and reverse path O→L→E→B) from the starting target node (i.e., target node O) to target node B in the reverse graph, target node E and target node O, through which both reverse paths pass, may both serve as inverse dominant points of target node B. That is, the inverse dominant point set of target node B may include: target node E and target node O; the nearest inverse dominant point of target node B is target node E, the node in the inverse dominant point set closest to target node B.
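
As a non-limiting illustration, the definition above may be applied literally by enumerating the reverse paths and keeping the nodes common to all of them. The following is a minimal sketch, assuming an acyclic target directed graph with the edges B→E, E→G, G→J, J→O, E→L, and L→O, and assuming the queried node is reachable in the reverse graph; `reverse_graph`, `all_paths`, and `inverse_dominators` are hypothetical names.

```python
# Target directed graph assumed from the forward paths in the text.
EDGES = {"B": ["E"], "E": ["G", "L"], "G": ["J"], "J": ["O"], "L": ["O"], "O": []}

def reverse_graph(edges):
    """Reverse every directed edge."""
    rev = {n: [] for n in edges}
    for src, dsts in edges.items():
        for dst in dsts:
            rev[dst].append(src)
    return rev

def all_paths(graph, src, dst, prefix=()):
    """Yield every path from src to dst in an acyclic graph."""
    prefix = prefix + (src,)
    if src == dst:
        yield prefix
        return
    for nxt in graph[src]:
        yield from all_paths(graph, nxt, dst, prefix)

def inverse_dominators(edges, end, node):
    """Nodes other than `node` lying on every reverse path from `end` to `node`.

    The result is ordered from nearest to farthest, so the first entry
    (if any) is the nearest inverse dominant point.
    """
    paths = list(all_paths(reverse_graph(edges), end, node))
    common = set.intersection(*(set(p) for p in paths)) - {node}
    return [n for n in reversed(paths[0]) if n in common]

print(inverse_dominators(EDGES, "O", "G"))
print(inverse_dominators(EDGES, "O", "B"))
```

Path enumeration is chosen here only because it mirrors the wording of the definition; it can be exponential on large graphs, where a data-flow or tree-based method would normally be preferred.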

In the embodiment of the invention, if two nodes are, respectively, the nearest dominant point and the nearest inverse dominant point of each other, the node pair formed by the two nodes may be called a dominant pair. That is, if the nearest inverse dominant point of the nearest dominant point Y of a node X is node X itself (i.e., node Y is the nearest dominant point of node X, and node X is the nearest inverse dominant point of node Y), the node pair {Y, X} may be defined as a dominant pair. It should be noted that node X and node Y here are generic and do not refer to specific nodes. As can be seen, a dominant pair refers to a node pair formed by target nodes satisfying the following condition: one target node is the nearest dominant point of the other target node, and the other target node is the nearest inverse dominant point of the one target node. For example, still taking fig. 3a and fig. 3b as examples: since target node B is the nearest dominant point of target node E and target node E is the nearest inverse dominant point of target node B, the node pair {B, E} formed by target node B and target node E may be a dominant pair.
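
As a non-limiting illustration, the dominant pair condition may be checked with two lookup tables. The following is a minimal sketch; the tables `IDOM` (nearest dominant point of each node) and `IPDOM` (nearest inverse dominant point of each node) are assumed values derived from the example graph with forward paths B→E→G→J→O and B→E→L→O, and `dominant_pairs` is a hypothetical name.

```python
# Nearest dominant point of each node (the starting node B has none).
IDOM = {"E": "B", "G": "E", "L": "E", "J": "G", "O": "E"}
# Nearest inverse dominant point of each node (the end node O has none).
IPDOM = {"B": "E", "E": "O", "G": "J", "L": "O", "J": "O"}

def dominant_pairs(idom, ipdom):
    """Return every pair (Y, X) where Y is the nearest dominant point of X
    and X is the nearest inverse dominant point of Y."""
    return sorted((y, x) for x, y in idom.items() if ipdom.get(y) == x)

print(dominant_pairs(IDOM, IPDOM))
```

Note that {E, G} is not reported: E is the nearest dominant point of G, but the nearest inverse dominant point of E is O rather than G.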

S202, aggregating at least two target nodes into a target aggregation node according to the dominant point information of each target node.

Studies have shown that, in the calculation process of the target object, if a dominant pair exists, one of the following two cases generally applies:

Case one: the two target nodes that form the dominant pair are directly connected in the target directed graph (or the reverse graph), i.e., no other target node in the target directed graph (or the reverse graph) is located between them. In this case, the data processing operations represented by the two target nodes are adjacent in execution order among the plurality of target nodes, whether in the forward execution order of the data processing operations indicated by the target directed graph or in the reverse execution order indicated by the reverse graph. For example, for the dominant pair {B, E}: in the forward execution order indicated by the target directed graph, the data processing operation represented by target node B may be performed first, and then the data processing operation represented by target node E; in the reverse execution order indicated by the reverse graph, the data processing operation represented by target node E may be performed first, followed by the data processing operation represented by target node B. It can be seen that the data processing operations represented by target node B and target node E are adjacent in execution order among the plurality of target nodes, whether in the forward or the reverse execution order. Therefore, if the execution result data of the data processing operations represented by the two target nodes is aggregated and then transmitted, the overall data transmission effect is generally not affected.

Case two: the two target nodes that form the dominant pair are indirectly connected in the target directed graph, i.e., at least one other target node in the target directed graph is located between the two target nodes. In this case, the data processing operations represented by the other target nodes located between the two target nodes of the dominant pair are executed between the data processing operations represented by the two target nodes of the dominant pair, whether in the forward execution order indicated by the target directed graph or in the reverse execution order indicated by the reverse graph. For example, for the dominant pair {E, O}: in the forward execution order indicated by the target directed graph, after the data processing operation represented by target node E is performed, the data processing operations represented by the other target nodes (such as target node G, target node J, and target node L) may be performed, and then the data processing operation represented by target node O is performed; in the reverse execution order indicated by the reverse graph, after the data processing operation represented by target node O is performed, the data processing operations represented by the other target nodes (e.g., target node G, target node J, and target node L) may need to be performed before the data processing operation represented by target node E. It can be seen that the data processing operations represented by the other target nodes (e.g., target node G, target node J, and target node L) fall between those of target node E and target node O of the dominant pair {E, O}, whether in the forward or the reverse execution order. Then, if the data processing operations represented by the other target nodes located between the two target nodes of the dominant pair are regarded as internal operations of the dominant pair, that is, if the two target nodes forming the dominant pair and the other target nodes located between them are treated as a whole and their corresponding execution result data is aggregated and then transmitted, the overall data transmission effect is generally not affected.

Based on the above description of the two cases, the embodiment of the invention may, according to the dominant point information of each target node, aggregate target nodes that are each other's dominant point and inverse dominant point into one target aggregation node, so that the target aggregation node instructs the computing device to subsequently aggregate and transmit the execution result data of the data processing operations represented by those target nodes. That is, the target aggregation node may be used to indicate that the execution result data of the data processing operations represented by the aggregated target nodes is to be aggregated.

In a specific implementation, at least one dominant pair may be found directly from the plurality of target nodes according to the dominant point information of each target node; the target nodes in each dominant pair found are then aggregated, respectively, to obtain at least one target aggregation node. For example, taking the target directed graph shown in fig. 3a as an example, according to the dominant point information of each target node mentioned in step S201, three dominant pairs can be found from the plurality of target nodes: the dominant pair {G, J} formed by target node G and target node J, the dominant pair {E, O} formed by target node E and target node O, and the dominant pair {B, E} formed by target node B and target node E. The target nodes in the three dominant pairs may then be aggregated, respectively, to obtain three target aggregation nodes: target aggregation node (GJ), target aggregation node (EO), and target aggregation node (BE). It should be noted that the dominant pair {E, O} and the dominant pair {B, E} share the common target node E; thus, in other embodiments, target node B, target node E, and target node O may be aggregated directly to obtain target aggregation node (BEO). In this case, two target aggregation nodes are obtained: target aggregation node (GJ) and target aggregation node (BEO).
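
As a non-limiting illustration, merging dominant pairs that share a common target node may be sketched as follows. The function name `merge_pairs` and the labeling of an aggregation node by its sorted member names are assumptions made for this example only.

```python
def merge_pairs(pairs):
    """Merge pairs that share a node into larger groups.

    Each incoming pair absorbs every existing group it overlaps with,
    so chains such as {B, E} and {E, O} collapse into {B, E, O}.
    """
    groups = []
    for pair in pairs:
        merged = set(pair)
        untouched = []
        for group in groups:
            if group & merged:
                merged |= group
            else:
                untouched.append(group)
        groups = untouched + [merged]
    return groups

pairs = [("G", "J"), ("E", "O"), ("B", "E")]
labels = sorted("".join(sorted(group)) for group in merge_pairs(pairs))
print(labels)
```

With the three dominant pairs of the example, the sketch yields the two groups corresponding to target aggregation node (GJ) and target aggregation node (BEO).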

In another embodiment, a dominance tree composed of the plurality of target nodes may be constructed according to the dominant point information of each target node; the dominance tree may be used to indicate the dominance order among the target nodes. Then, multiple rounds of aggregation iteration processing may be performed on at least two target nodes based on the dominance order indicated by the dominance tree to obtain the target aggregation node. Specifically, a first dominated target node (e.g., target node J) is determined according to the dominance order indicated by the dominance tree, and the nearest dominant point (e.g., target node G) of the first dominated target node is obtained; if the nearest dominant point of the first dominated target node (e.g., target node G) and the first dominated target node (e.g., target node J) can form a dominant pair, the two may be aggregated once to obtain target aggregation node (GJ). A second dominated target node may then be determined according to the dominance order indicated by the dominance tree; if the nearest dominant point of the first dominated target node and the first dominated target node cannot form a dominant pair, the second dominated target node may be determined directly according to the dominance order indicated by the dominance tree. After the second dominated target node is determined, its nearest dominant point may be obtained; if the nearest dominant point of the second dominated target node and the second dominated target node can form a dominant pair and the stop condition of the aggregation iteration processing is not satisfied, the two are aggregated once to obtain a target aggregation node; otherwise, a third dominated target node is determined directly according to the dominance order indicated by the dominance tree, and so on, until all of the plurality of target nodes have been traversed.

The iteration stop condition of the above aggregation iteration processing may be, for example: the number of aggregation iterations is greater than a number threshold, the sum of the traffic of the nodes required for the current aggregation is greater than a traffic threshold, and so on; the embodiments of the present invention do not limit this. It should be noted that, in the above embodiment, the processing device searches for the nodes to be aggregated in real time according to the dominance order indicated by the dominance tree, and determines in real time whether the iteration stop condition is satisfied each time node aggregation is performed. In other embodiments, the processing device may first extract aggregation level information based on the dominance tree without considering the iteration stop condition; the aggregation level information is used to indicate the nodes required for each layer of aggregation. Then, at least one layer of aggregation iteration processing is performed on the plurality of target nodes according to the aggregation level information, and whether the iteration stop condition is satisfied is determined each time a layer of aggregation iteration processing is executed.
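
As a non-limiting illustration, the real-time variant with the two example stop conditions may be sketched as follows. The traversal order `ORDER`, the per-node traffic values in `TRAFFIC`, both thresholds, and the function name `aggregate_iteratively` are hypothetical; the lookup tables are assumed values derived from the example graph with forward paths B→E→G→J→O and B→E→L→O.

```python
IDOM = {"E": "B", "G": "E", "L": "E", "J": "G", "O": "E"}   # nearest dominant points
IPDOM = {"B": "E", "E": "O", "G": "J", "L": "O", "J": "O"}  # nearest inverse dominant points
ORDER = ["J", "G", "L", "O", "E"]                           # assumed bottom-up order
TRAFFIC = {"B": 4, "E": 4, "G": 1, "J": 1, "L": 2, "O": 8}  # made-up traffic per node

def aggregate_iteratively(order, idom, ipdom, traffic, max_aggregations, max_traffic):
    """Visit dominated nodes in order and aggregate each dominant pair found,
    skipping an aggregation whenever a stop condition would be violated."""
    aggregates = []
    for node in order:
        parent = idom.get(node)
        if parent is None or ipdom.get(parent) != node:
            continue  # no dominant pair: move on to the next dominated node
        if len(aggregates) >= max_aggregations:
            continue  # one more aggregation would exceed the number threshold
        if traffic[parent] + traffic[node] > max_traffic:
            continue  # traffic threshold exceeded for this aggregation
        aggregates.append((parent, node))
    return aggregates

print(aggregate_iteratively(ORDER, IDOM, IPDOM, TRAFFIC, max_aggregations=2, max_traffic=10))
```

With these made-up values, the pair {E, O} is skipped because its combined traffic exceeds the traffic threshold, while {G, J} and {B, E} are aggregated.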

S203, updating the calculation graph by adopting a target aggregation node, and sending the updated calculation graph to the computing equipment; the updated computational graph is used for indicating: according to the indication of the target aggregation node, the computing device aggregates the execution result data of the data processing operation represented by the aggregated target node in the computing process of the target object, and transmits an aggregation result.
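
As a non-limiting illustration of the effect on the number of transmissions, the following minimal sketch lists which execution results would be sent together once target aggregation nodes exist. The function name `plan_transmissions` is hypothetical, and the aggregation nodes (GJ) and (BEO) are taken from the example of fig. 3a.

```python
def plan_transmissions(nodes, aggregates):
    """Return one transmission per aggregation node, plus one per
    target node that was not aggregated."""
    grouped = {n for group in aggregates for n in group}
    return [tuple(g) for g in aggregates] + [(n,) for n in nodes if n not in grouped]

nodes = ["B", "E", "G", "J", "L", "O"]
plan = plan_transmissions(nodes, [("G", "J"), ("B", "E", "O")])
print(len(nodes), "transmissions without aggregation,", len(plan), "with aggregation")
```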

Fig. 4 is a flowchart of another data transmission method according to an embodiment of the present invention. The data transmission method may be performed by the processing device mentioned above. Referring to fig. 4, the data transmission method may include the following steps S401 to S406:

S401, obtaining dominant point information of a plurality of target nodes in a calculation graph of a target object.

In an embodiment of the present invention, the calculation graph of the target object may include the following computing nodes: a plurality of target nodes and non-target nodes; each computing node may be used to represent a data processing operation that the target object needs to perform during the calculation process. A target node is a computing node for which the execution result of the represented data processing operation needs to be transmitted, and a non-target node is a computing node for which the execution result of the represented data processing operation does not need to be transmitted. In a specific implementation of step S401, a target directed graph formed by the plurality of target nodes may be determined according to the topological relationship of the calculation graph of the target object; then, the dominant point information of the plurality of target nodes is acquired according to the target directed graph. Specific implementations for determining the target directed graph formed by the plurality of target nodes according to the topological relationship of the calculation graph of the target object include at least the following:

Embodiment one: the non-target nodes may be deleted directly from the calculation graph of the target object, and the connection relationships among the target nodes are adjusted according to the deleted non-target nodes, so as to obtain the target directed graph formed by the plurality of target nodes.
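
As a non-limiting illustration of embodiment one, a non-target node may be deleted by connecting each of its predecessors to each of its successors. The following is a minimal sketch on a made-up acyclic calculation graph in which C and F are assumed to be non-target nodes; `GRAPH` and `remove_non_targets` are hypothetical names, and the graph is not the one shown in fig. 5a.

```python
# Hypothetical calculation graph: node -> successors.
GRAPH = {"B": ["C"], "C": ["E"], "E": ["F", "G"], "F": ["G"], "G": []}

def remove_non_targets(edges, targets):
    """Delete every non-target node and reconnect its predecessors
    to its successors so that reachability among targets is preserved."""
    graph = {n: set(succs) for n, succs in edges.items()}
    for node in [n for n in graph if n not in targets]:
        succs = graph.pop(node)
        for other in graph.values():
            if node in other:
                other.discard(node)
                other |= succs - {node}
    return graph

print(remove_non_targets(GRAPH, {"B", "E", "G"}))
```

This simple bypass may leave edges that are redundant for reachability; a further pruning step would be needed if a minimum number of edges is desired.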

Embodiment two: the calculation graph of the target object may be obtained, and a target reachability matrix including the plurality of target nodes may be calculated based on the topological relationship of the calculation graph; the target reachability matrix is used to indicate the reachability relationships among the target nodes. In a specific implementation, an adjacency matrix including each computing node in the calculation graph may first be calculated according to the topological relationship of the calculation graph, as shown in fig. 5a. The adjacency matrix may be used to indicate the connection relationships between computing nodes in the calculation graph; specifically, if the element in the x-th row and y-th column of the adjacency matrix is non-zero, the computing node corresponding to the x-th row and the computing node corresponding to the y-th column are connected in the calculation graph, that is, the computing node corresponding to the x-th row can reach the computing node corresponding to the y-th column through a directed edge in the calculation graph. It can be seen that, if the element in the x-th row and y-th column of the adjacency matrix is non-zero, the computing node corresponding to the y-th column is a node reachable from the computing node corresponding to the x-th row, and the computing node corresponding to the x-th row is a node that can reach the computing node corresponding to the y-th column. Here, x and y are both greater than 0 and less than or equal to the number of computing nodes. It should be noted that fig. 5a only exemplarily shows a calculation graph of a target object and the corresponding adjacency matrix, and is not limiting. For example, each directed edge in the calculation graph shown in fig. 5a has a corresponding number; in other embodiments, however, the directed edges may be unnumbered. In that case, the adjacency matrix may use only "0" and "1" to represent the connection relationships between computing nodes, where "0" means unconnected and "1" means connected. That is, if the computing node corresponding to the x-th row and the computing node corresponding to the y-th column are connected in the calculation graph, the element in the x-th row and y-th column of the adjacency matrix is "1".
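
As a non-limiting illustration, a 0/1 adjacency matrix may be built from a list of directed edges as follows. The four-node graph used here (A→C, C→F, F→K) is a made-up fragment, not the full graph of fig. 5a, and `adjacency_matrix` is a hypothetical name.

```python
def adjacency_matrix(nodes, edges):
    """Element [x][y] is 1 when there is a directed edge from nodes[x] to nodes[y]."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for src, dst in edges:
        matrix[index[src]][index[dst]] = 1
    return matrix

NODES = ["A", "C", "F", "K"]
for row in adjacency_matrix(NODES, [("A", "C"), ("C", "F"), ("F", "K")]):
    print(row)
```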

Next, the transitive closure of the adjacency matrix may be computed to obtain an initial reachability matrix that includes each computing node in the calculation graph; the initial reachability matrix may be used to indicate the reachability relationships between the computing nodes in the calculation graph. The transitive closure is the smallest transitive relation that contains the original relation; computing the transitive closure means: searching for computing nodes having a transfer relationship according to the connection relationships indicated by the adjacency matrix, and determining the reachability relationships among the computing nodes according to the transfer relationships found. For example, from the adjacency matrix shown in fig. 5a, it can be seen that computing node A is connected to computing node C, and computing node C is connected to computing node F; computing node C may then be determined to be a computing node having the following transfer relationship: from computing node A to computing node C, and from computing node C to computing node F. From this transfer relationship, it may be determined that there is a reachability relationship between computing node A and computing node F. On this basis, the reachability relationships between the computing nodes may be obtained, yielding the initial reachability matrix shown in fig. 5b. From the initial reachability matrix, it can be known which computing nodes any computing node can reach along a path (i.e., at least one directed edge), and also which computing nodes can reach it; for example, it can be known that computing node F can reach the four computing nodes KMNO, and that computing node F can be reached by the six computing nodes ABCMEG.
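
As a non-limiting illustration, the transitive closure of a 0/1 adjacency matrix may be computed with Warshall's algorithm, which is one common choice and is not stated by the embodiment. The matrix `ADJ` below encodes a made-up fragment (A→C, C→F, F→K) with rows and columns in the order A, C, F, K; `transitive_closure` is a hypothetical name.

```python
ADJ = [
    [0, 1, 0, 0],  # A -> C
    [0, 0, 1, 0],  # C -> F
    [0, 0, 0, 1],  # F -> K
    [0, 0, 0, 0],  # K has no outgoing edge
]

def transitive_closure(adj):
    """Warshall's algorithm: reach[i][j] becomes 1 when j is reachable
    from i along at least one directed edge."""
    n = len(adj)
    reach = [row[:] for row in adj]
    for k in range(n):          # allow node k as an intermediate hop
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = 1
    return reach

for row in transitive_closure(ADJ):
    print(row)
```

In the result, the row for A marks C, F, and K as reachable, which reflects the transfer relationship from A through C to F described above.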

Non-target nodes in the initial reachability matrix may then be removed, resulting in a target reachability matrix including the plurality of target nodes, as shown in fig. 5c. Finally, the target directed graph formed by the plurality of target nodes may be constructed according to the minimum-edge-number construction principle and the target reachability matrix, as shown in fig. 5d. The minimum-edge-number construction principle means that the constructed target directed graph contains the smallest possible number of directed edges.
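
As a non-limiting illustration, for an acyclic graph the minimum number of edges can be obtained by keeping an edge between two target nodes only when no third target node lies between them (a transitive reduction). The following is a minimal sketch under that assumption; the matrix `REACH` is a made-up reachability matrix for the fragment A→C→F→K, the node C is assumed to be a non-target node, and `target_directed_graph` is a hypothetical name.

```python
NODES = ["A", "C", "F", "K"]
REACH = [
    [0, 1, 1, 1],  # A reaches C, F, K
    [0, 0, 1, 1],  # C reaches F, K
    [0, 0, 0, 1],  # F reaches K
    [0, 0, 0, 0],
]

def target_directed_graph(nodes, reach, targets):
    """Drop non-target rows/columns, then keep edge i -> j only if no
    other target k satisfies i -> k and k -> j (assumes an acyclic graph)."""
    keep = [i for i, name in enumerate(nodes) if name in targets]
    edges = []
    for i in keep:
        for j in keep:
            if reach[i][j] and not any(reach[i][k] and reach[k][j] for k in keep):
                edges.append((nodes[i], nodes[j]))
    return edges

print(target_directed_graph(NODES, REACH, {"A", "F", "K"}))
```

The edge from A to K is omitted because K is already reachable from A through F.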

It should be noted that the embodiments of the present invention are merely exemplary of the manner in which the target directed graph may be determined, and are not intended to be exhaustive. For example, in other embodiments, the reachability relationship between the target nodes may also be obtained directly according to the initial reachability matrix or the computation graph, and based on the reachability relationship between the target nodes, multiple directed edges are used to connect the target nodes, so as to obtain the target directed graph.

S402, constructing a dominance tree composed of a plurality of target nodes according to dominance point information of each target node.

In a specific implementation, since the starting target node in the target directed graph does not have a dominant point, the starting target node in the target directed graph may be used as a root node of the dominant tree, and the remaining target nodes of the plurality of target nodes except the starting target node in the target directed graph may be determined. Then, the nearest dominant point of each remaining target node can be obtained from the dominant point set in the dominant point information of each remaining target node; and determining the nearest dominant relationship between the target nodes according to the nearest dominant point of each remaining target node. Finally, each remaining target node may be added under the root node according to the most recent dominance relationship to obtain the dominance tree.

To facilitate a clearer understanding of the process of constructing the dominance tree, the target directed graph shown in fig. 5d is used as an example below. Referring to fig. 5d, the starting target node in the target directed graph is target node B; since target node B has no dominant point, it may be used as the root node of the dominance tree. Next, the remaining target nodes of the plurality of target nodes other than the starting target node in the target directed graph, and the nearest dominant point of each remaining target node, may be determined as follows: target node E (nearest dominant point is target node B), target node G (nearest dominant point is target node E), target node L (nearest dominant point is target node E), target node O (nearest dominant point is target node E), and target node J (nearest dominant point is target node G). Then, the nearest dominance relationships between the target nodes may be determined as follows: target node E is most recently dominated by target node B; target node G, target node L, and target node O are all most recently dominated by target node E; and target node J is most recently dominated by target node G. Each remaining target node is then added under the root node according to the nearest dominance relationships, resulting in the dominance tree shown in fig. 5e.
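
As a non-limiting illustration, the dominance tree of this example may be built from the nearest dominant point of each remaining target node. In the following minimal sketch, the table `IDOM` restates the nearest dominance relationships listed above, and `build_tree` and `depth` are hypothetical names.

```python
# Nearest dominant point of each remaining target node (root B has none).
IDOM = {"E": "B", "G": "E", "L": "E", "O": "E", "J": "G"}

def build_tree(idom, root):
    """Return a parent -> children mapping with every node present as a key."""
    children = {root: []}
    for node in idom:
        children.setdefault(node, [])
    for node, parent in idom.items():
        children[parent].append(node)
    return children

def depth(idom, node):
    """Level of a node in the tree, counting the root as level 1."""
    level = 1
    while node in idom:
        node = idom[node]
        level += 1
    return level

tree = build_tree(IDOM, "B")
print(tree)
print(depth(IDOM, "J"))
```

Target node J ends up on the fourth level, below B, E, and G.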

S403, extracting aggregation level information based on the dominance tree.

The aggregation level information includes: node groups required for N layers of aggregation, where N is a positive integer; at least one node in each node group is a target node. As described above, the parent node of each target node other than the root node in the dominance tree is the nearest dominant point of that target node. There may be K dominant pairs among the plurality of target nodes, and one dominant pair may be associated with the node group required for at least one layer of aggregation, where K is a positive integer. Specifically, step S403 may include the following steps s11-s14:

s11, selecting a first target node from the target nodes that have not been traversed in the dominance tree, according to a bottom-up traversal order.

s12, detecting, according to the inverse dominant point set of each target node other than the end target node in the target directed graph, whether there exists a second target node that forms a k-th dominant pair with the first target node, where k ∈ [1, K]; the second target node satisfies the following condition: the second target node is the nearest dominant point of the first target node, and the first target node is the nearest inverse dominant point of the second target node. Specifically, if the first target node is the root node of the dominance tree, it may be directly determined that no second target node forming the k-th dominant pair with the first target node exists. If the first target node is not the root node of the dominance tree, the parent node of the first target node may be obtained from the dominance tree. Then, in the inverse dominant point set of the parent node of the first target node, it is queried whether the nearest inverse dominant point of that parent node is the first target node; if so, the parent node of the first target node may be used as the second target node, and it is determined that a second target node forming the k-th dominant pair with the first target node exists.

Optionally, to make it easier to query whether the nearest inverse dominant point of the parent node of the first target node is the first target node, the processing device may also construct an inverse dominance tree composed of the plurality of target nodes according to the dominant point information of each target node, where the parent node of each target node other than the root node in the inverse dominance tree is the nearest inverse dominant point of that target node. The processing device may then directly query, in the inverse dominance tree, whether the parent node of the dominance-tree parent of the first target node is the first target node; if so, the nearest inverse dominant point of the parent node of the first target node may be determined to be the first target node. The inverse dominance tree composed of the plurality of target nodes may be constructed according to the dominant point information of each target node as follows: since the end target node in the target directed graph has no inverse dominant point, it may be used as the root node of the inverse dominance tree, and the target nodes to be added, i.e., those of the plurality of target nodes other than the end target node in the target directed graph, may be determined. Then, the nearest inverse dominant point of each target node to be added may be obtained from the inverse dominant point set in the dominant point information of that target node, and the nearest inverse dominance relationships among the target nodes to be added are determined according to the nearest inverse dominant point of each target node to be added. Finally, each target node to be added may be added under the root node of the inverse dominance tree according to the nearest inverse dominance relationships to obtain the inverse dominance tree. Still taking the target directed graph shown in fig. 5d as an example, the corresponding inverse dominance tree can be seen in fig. 5f.

s13, if it exists, selecting at least one target node from the plurality of target nodes according to the second target node, and adding the selected target node to the node group required for the target layer aggregation associated with the k-th dominant pair; and continuing to traverse the dominance tree, i.e., continuing to execute step s11 to reselect the first target node, detecting through step s12 whether a (k+1)-th dominant pair exists, and so on, until every target node in the dominance tree has been traversed. When at least one target node is selected from the plurality of target nodes according to the second target node and added to the node group required for the target layer aggregation associated with the k-th dominant pair, the following embodiments may be used:

In one embodiment, if the second target node exists, the processing device may directly select the first target node and the second target node from the plurality of target nodes and add them to the node group required for the target layer aggregation associated with the k-th dominant pair. In yet another embodiment, if the second target node exists, the processing device may further obtain the descendant node set of the second target node from the dominance tree. The descendant node set of the second target node includes at least the first target node; if the first target node has descendant nodes (e.g., child nodes of the first target node, child nodes of those child nodes, etc.), the descendant node set of the second target node may also include the descendant nodes of the first target node. It may then be detected whether the descendant node set includes other descendant nodes besides the first target node and the descendant nodes of the first target node.

If the descendant node set includes only the first target node and the descendant nodes of the first target node, the first target node and the second target node are selected and added to the node group required for the target layer aggregation associated with the k-th dominant pair. In one embodiment, the first target node and the second target node may be selected directly and added to that node group. In yet another embodiment, it may further be detected whether, among the node groups required for the history layer aggregation associated with the first k-1 dominant pairs, there exists a first history node group that includes the first target node. If the first history node group exists, the aggregation node corresponding to the first history node group and the second target node are added to the node group required for the target layer aggregation associated with the k-th dominant pair, so that, in the subsequent aggregation iteration processing, the second target node and the aggregation node in which the first target node is located can be aggregated directly when the target layer aggregation is reached, which avoids adding new aggregation nodes and aggregation layers. If the first history node group does not exist, the first target node and the second target node may be added to the node group required for the target layer aggregation.

If the descendant node set includes other descendant nodes besides the first target node and the descendant nodes of the first target node, the other descendant nodes may be selected and added to the node group required for the target layer aggregation, so that the other descendant nodes are first aggregated into an aggregation node, and then the first target node, the second target node, and that aggregation node are aggregated into a new aggregation node. In one embodiment, the other descendant nodes may be selected directly and added to the node group required for the target layer aggregation. In yet another embodiment, it may further be detected whether, among the node groups required for the history layer aggregation associated with the first k-1 dominant pairs, there exists a second history node group that includes the aggregation node corresponding to the other descendant nodes. If the second history node group exists, the aggregation node corresponding to the second history node group, the first target node, and the second target node are added to the node group required for the target layer aggregation, so that, in the subsequent aggregation iteration processing, the first target node, the second target node, and the aggregation node in which the aggregation node corresponding to the other descendant nodes is located can be aggregated directly when the target layer aggregation is reached, which avoids adding new aggregation nodes and aggregation layers. If the second history node group does not exist, the other descendant nodes may be added to the node group required for the target layer aggregation, and the aggregation node obtained by aggregating the other descendant nodes, the first target node, and the second target node may be added to the node group required for the next layer of aggregation, after the target layer aggregation, associated with the k-th dominant pair.

Optionally, it may further be detected whether a third history node group including the other descendant nodes exists among the node groups required for the history layer aggregation associated with the first k-1 dominant pairs. If the third history node group exists, the aggregation node corresponding to the third history node group, the first target node and the second target node are added to the node group required for the target layer aggregation; in this way, when the subsequent aggregation iteration reaches the target layer aggregation, the first target node, the second target node and the aggregation node that contains the other descendant nodes can be aggregated directly, which avoids creating an additional aggregation node and aggregation layer. If the third history node group does not exist, the other descendant nodes may be added to the node group required for the target layer aggregation, and the aggregation node obtained by aggregating the other descendant nodes, the first target node and the second target node may be added to the node group required for the next layer aggregation following the target layer aggregation associated with the kth dominant pair.

s14, if not, the first target node is selected again until all target nodes in the dominance tree are traversed.

Based on the description of steps S11-S14, in order to more clearly understand the implementation of step S402, the implementation of step S402 will be further described below with reference to the dominance tree shown in fig. 5 e:

(I) The first dominant pair {G, J}:

First, starting from the lowest level (i.e., the fourth level) of the dominance tree shown in fig. 5e and following the bottom-up traversal order, target node J may be selected as the first target node from the target nodes that have not been traversed. Then, from the inverse dominant point sets of the target nodes other than the last target node (i.e., target node O) in the target directed graph, it may be detected that a second target node (i.e., target node G) exists that can form the 1st dominant pair {G, J} (also denoted the first dominant pair {G, J}) with the first target node (i.e., target node J). Since, in the dominance tree, the descendant node set of target node G contains no descendant nodes other than target node J, the number of aggregation layers associated with the first dominant pair may be 1; that is, the first dominant pair may be associated only with the node group required for the first layer aggregation. Target node G and target node J may therefore be added directly to the node group required for the target layer aggregation (i.e., the first layer aggregation) associated with the first dominant pair, so that target node G and target node J can be aggregated directly when the first layer aggregation is subsequently reached, as shown in fig. 5g.

The dominance tree may then continue to be traversed to reselect the first target node. Specifically, the lowest level (i.e., the fourth level) of the dominance tree shown in fig. 5e includes only one target node, target node J, and target node J has already been traversed; traversal may therefore continue with the target nodes in the third level of the dominance tree. Because target nodes in the same level have no inherent order, they may be traversed in any order. Assuming that the target nodes in the third level are traversed from left to right, target node G may be selected as the first target node from the target nodes of the third level that have not been traversed (target node G, target node L, and target node O). From the inverse dominant point sets of the target nodes other than the last target node in the target directed graph, it is detected that no second target node exists that can form the 2nd dominant pair (i.e., the second dominant pair) with the first target node (i.e., target node G). Traversal of the third level may therefore continue, and target node L may be reselected as the first target node from the target nodes of the third level that have not been traversed (target node L and target node O).

Similarly, from the inverse dominant point sets of the target nodes other than the last target node in the target directed graph, it is detected that no second target node exists that can form the 2nd dominant pair (i.e., the second dominant pair) with the first target node (i.e., target node L). Traversal of the third level may therefore continue, and target node O may be reselected as the first target node from the target nodes of the third level that have not been traversed (target node O). From the inverse dominant point sets of the target nodes other than the last target node in the target directed graph, it is detected that a second target node (i.e., target node E) exists that can form the 2nd dominant pair {E, O} (also denoted the second dominant pair {E, O}) with the first target node (i.e., target node O). At least one target node may therefore be selected from the plurality of target nodes according to the second target node (i.e., target node E) and added to the node group required for the target layer aggregation associated with the 2nd dominant pair, as described below.

(II) The second dominant pair {E, O}:

Since, in the dominance tree, the descendant node set of target node E contains other descendant nodes besides target node O, namely target node G, target node J, and target node L, the number of aggregation layers associated with the second dominant pair may be 2; that is, the second dominant pair may be associated with the node group required for the second layer aggregation and the node group required for the third layer aggregation. Specifically, the three other descendant nodes (target node G, target node J, and target node L) may be added to the node group required for the second layer aggregation associated with the second dominant pair, as shown in fig. 5h. The aggregation node obtained by aggregating these three other descendant nodes (i.e., aggregation node (GJL)), the first target node (i.e., target node O), and the second target node (i.e., target node E) are then added to the node group required for the next layer aggregation (i.e., the third layer aggregation) following the target layer aggregation associated with the second dominant pair, as shown in fig. 5i.

The dominance tree may then continue to be traversed to reselect the first target node. Specifically, every target node in the third level of the dominance tree shown in fig. 5e has been traversed; traversal may therefore continue with the target nodes in the second level of the dominance tree. Since the second level includes only one target node, target node E may be selected directly as the first target node. Then, from the inverse dominant point sets of the target nodes other than the last target node in the target directed graph, it may be detected that a second target node (i.e., target node B) exists that can form the 3rd dominant pair {B, E} (also denoted the third dominant pair {B, E}) with the first target node (i.e., target node E). At least one target node may therefore be selected from the plurality of target nodes according to the second target node (i.e., target node B) and added to the node group required for the target layer aggregation associated with the 3rd dominant pair, as described below.

(III) The third dominant pair {B, E}:

Since, in the dominance tree, the descendant node set of target node B contains no descendant nodes other than target node E and the descendant nodes of target node E, the number of aggregation layers associated with the third dominant pair may be 1; that is, the third dominant pair may be associated only with the node group required for the fourth layer aggregation. Specifically, since a first history node group including the first target node (i.e., target node E) exists (namely the node group required for the third layer aggregation), the aggregation node corresponding to the first history node group (i.e., aggregation node (EGJLO)) and the second target node (i.e., target node B) may be added directly to the node group required for the fourth layer aggregation, as shown in fig. 5j. The dominance tree may then continue to be traversed to reselect the first target node. Specifically, every target node in the second level of the dominance tree shown in fig. 5e has been traversed; traversal may therefore continue with the root node in the first level of the dominance tree. Because the root node has no dominant point, the extraction of the aggregation level information may stop, and the final aggregation level information is obtained; the aggregation level information may be represented schematically by the lower graph in fig. 5j.
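As a non-limiting illustration, the extraction walked through above may be sketched in Python as follows. The sketch takes the dominance tree and the ordered dominant pairs as given inputs. The function names (`descendants`, `covered`, `collapse`, `extract_layers`), the representation of an aggregation node as a frozenset of the target nodes it covers, and the rule of reusing earlier node groups inside a new node group are assumptions made only for illustration; they are not asserted to be the implementation of the embodiment.

```python
def descendants(children, node):
    """Return all descendants of `node` in the dominance tree."""
    found, stack = set(), list(children.get(node, []))
    while stack:
        current = stack.pop()
        found.add(current)
        stack.extend(children.get(current, []))
    return found


def covered(group):
    """Target nodes covered by a node group (each member is a frozenset)."""
    return frozenset().union(*group)


def collapse(nodes, layers):
    """Assumed helper: reuse earlier aggregation nodes lying wholly inside `nodes`."""
    members, remaining = [], set(nodes)
    for group in reversed(layers):
        span = covered(group)
        if span <= remaining:
            members.append(span)
            remaining -= span
    return members + [frozenset([n]) for n in sorted(remaining)]


def extract_layers(children, pairs):
    """Build one node group per aggregation layer from ordered dominant pairs."""
    layers = []
    for second, first in pairs:  # the second target node dominates the first
        others = (descendants(children, second) - {first}
                  - descendants(children, first))
        spans = [covered(group) for group in layers]
        if not others:
            # Reuse a history node group that already contains the first target node.
            hist = next((s for s in reversed(spans) if first in s), None)
            head = hist if hist is not None else frozenset([first])
            layers.append([head, frozenset([second])])
        elif frozenset(others) in spans:
            # A history node group already aggregates the other descendant nodes.
            layers.append([frozenset(others), frozenset([first]), frozenset([second])])
        else:
            # Aggregate the other descendants first, then add one more layer.
            layers.append(collapse(others, layers))
            layers.append([frozenset(others), frozenset([first]), frozenset([second])])
    return layers


# Dominance tree and dominant pairs of the example (B is the root).
children = {"B": ["E"], "E": ["G", "L", "O"], "G": ["J"]}
pairs = [("G", "J"), ("E", "O"), ("B", "E")]
layers = extract_layers(children, pairs)
for depth, group in enumerate(layers, start=1):
    print(depth, sorted("".join(sorted(member)) for member in group))
```

For the example, the sketch yields four node groups, one per aggregation layer: {G, J}; {(GJ), L}; {(GJL), E, O}; and {(EGJLO), B}.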

S404, performing at least one layer of aggregation iteration processing on the plurality of target nodes according to the aggregation level information to obtain target aggregation nodes.

In a specific implementation, node aggregation may be performed at the granularity of the traffic threshold, starting from the innermost aggregation (i.e., the first layer aggregation) of the aggregation level information. Specifically, an embodiment of step S404 may include the following steps s21-s25:

s21, determining the nth node group required for the nth layer aggregation according to the aggregation level information, and determining the traffic sum of the nth node group according to the traffic of each node in the nth node group; wherein n ∈ [1, N]. It should be noted that, as described above, the nodes in any node group may include at least one of: a target node, and an aggregation node obtained by aggregating at least two target nodes. For a target node, the traffic of the target node is determined according to the data size of the execution result data corresponding to the target node; for an aggregation node, the traffic of the aggregation node is obtained by summing the traffic of the target nodes corresponding to the aggregation node.

s22, when the traffic sum of the nth node group is smaller than or equal to the traffic threshold, performing aggregation processing on the nodes in the nth node group to obtain the nth aggregation node.

s23, if the current value of n is smaller than N, and the traffic sum of the (n+1)th node group required for the (n+1)th layer aggregation, acquired according to the aggregation level information, is larger than the traffic threshold, obtaining the target aggregation node according to the nth aggregation node.

s24, if the current value of n is smaller than N, and the traffic sum of the (n+1)th node group is smaller than or equal to the traffic threshold, incrementing the current value of n by one to update n, and returning to the step of determining the nth node group required for the nth layer aggregation according to the aggregation level information.

s25, if the current value of n is equal to N, obtaining the target aggregation node according to the nth aggregation node.

In steps s21-s25, a specific implementation of obtaining the target aggregation node according to the nth aggregation node may be as follows. If the value of n is 1, the 1st aggregation node is taken as the target aggregation node. If the value of n is not 1, at least one history aggregation node obtained by the first n-1 layers of aggregation is acquired, and the history aggregation nodes that have not themselves been aggregated, selected from the at least one history aggregation node, together with the nth aggregation node are taken as target aggregation nodes. The first n-1 layers of aggregation means all layers from the 1st layer aggregation to the (n-1)th layer aggregation.
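As a non-limiting illustration, steps s21-s25 may be sketched in Python as follows. The function names (`traffic_sum`, `select_target_aggregation_nodes`), the frozenset representation of nodes, and the behaviour of returning an empty list when even the first node group exceeds the traffic threshold are assumptions made only for illustration. The node groups and traffic values are those of the example discussed in this embodiment.

```python
def traffic_sum(group, traffic):
    """Traffic sum of a node group; an aggregation node sums its target nodes."""
    return sum(traffic[node] for member in group for node in member)


def select_target_aggregation_nodes(layers, traffic, threshold):
    """Aggregate layer by layer until the next node group exceeds the threshold."""
    aggregated = []
    for group in layers:  # layer 1 is the innermost aggregation
        if traffic_sum(group, traffic) > threshold:
            break  # s23: stop the aggregation iteration
        aggregated.append(frozenset().union(*group))  # s22: nth aggregation node
    # Keep the aggregation nodes that were not absorbed by a later aggregation.
    return [a for a in aggregated if not any(a < b for b in aggregated)]


# Node groups per aggregation layer; a frozenset with several letters
# stands for an aggregation node such as (GJ).
layers = [
    [frozenset("G"), frozenset("J")],
    [frozenset("GJ"), frozenset("L")],
    [frozenset("GJL"), frozenset("E"), frozenset("O")],
    [frozenset("EGJLO"), frozenset("B")],
]
traffic = {"B": 50, "E": 20, "L": 120, "G": 10, "J": 80, "O": 50}

for threshold in (100, 260):
    targets = select_target_aggregation_nodes(layers, traffic, threshold)
    print(threshold, ["".join(sorted(t)) for t in targets])
```

With a traffic threshold of 100 the sketch returns the aggregation node (GJ); with a traffic threshold of 260 it returns the aggregation node (GJL).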

Based on the description of steps s21 to s25, and in order to make the implementation of step S404 easier to understand, the implementation of step S404 is further described below with a specific example. Specifically, continuing the above example, the traffic threshold is set to 100 and the traffic of each target node is set as follows: the traffic of target node B is 50 (i.e., B=50), the traffic of target node E is 20 (i.e., E=20), the traffic of target node L is 120 (i.e., L=120), the traffic of target node G is 10 (i.e., G=10), the traffic of target node J is 80 (i.e., J=80), and the traffic of target node O is 50 (i.e., O=50). Correspondingly, the specific implementation procedure of step S404 is as follows:

(I) n has a value of 1:

First, the first node group required for the first layer aggregation is determined according to the aggregation level information; it includes target node G and target node J. The traffic sum of the first node group may then be calculated as 90 from the traffic of each node in the first node group. Since the traffic sum of the first node group is smaller than the traffic threshold (100), the nodes in the first node group may be aggregated to obtain the first aggregation node (GJ), as shown in fig. 5k. Then, because the current value of n (1) is smaller than N (4), the traffic sum of the (n+1)th node group required for the (n+1)th layer aggregation may be acquired according to the aggregation level information. The second node group required for the second layer aggregation, obtained according to the aggregation level information, includes the first aggregation node (GJ) and target node L, and its traffic sum is 210. Since the acquired traffic sum of the second node group is larger than the traffic threshold (100), the aggregation iteration stops; at this time, the target aggregation node may be obtained according to the first aggregation node, that is, the target aggregation node includes the first aggregation node (GJ).

It should be noted that, in other embodiments, the traffic threshold may be 260. In that case, since the traffic sum of the second node group (210) is smaller than the traffic threshold (260) and the current value of n is smaller than N, the current value of n may be incremented by one so that n is updated to 2. The step of determining the second node group required for the second layer aggregation according to the aggregation level information may then be performed, as described below.

(II) n has a value of 2:

First, the second node group required for the second layer aggregation is determined according to the aggregation level information; it includes the first aggregation node (GJ) and target node L. The traffic sum of the second node group is determined to be 210 from the traffic of each node in the second node group. Since the traffic sum of the second node group is smaller than the traffic threshold (260), the nodes in the second node group may be aggregated to obtain the second aggregation node (GJL). Then, because the current value of n (2) is smaller than N (4), the traffic sum of the (n+1)th node group required for the (n+1)th layer aggregation, i.e., the traffic sum of the third node group required for the third layer aggregation, may be acquired according to the aggregation level information. Specifically, the third node group required for the third layer aggregation is first obtained according to the aggregation level information; it includes the second aggregation node (GJL), target node E and target node O. The traffic sum of the third node group may then be calculated as 280 from the traffic of each node in the third node group. Since the acquired traffic sum of the third node group is larger than the traffic threshold (260), the aggregation iteration stops; at this time, the target aggregation node may be obtained according to the second aggregation node. Specifically, the first aggregation node (GJ) obtained by the first 1 layer of aggregation may be acquired; since the first aggregation node (GJ) has itself been aggregated, no history aggregation node remains that has not been aggregated, and only the second aggregation node (GJL) is taken as the target aggregation node. That is, the target aggregation node in this case includes the second aggregation node (GJL).

S405, updating the calculation graph by using the target aggregation node.

In a specific implementation, the target aggregation node may be added to the computational graph, and directed edges may be used to connect the target aggregation node with the aggregated target nodes; taking the target aggregation node shown in fig. 5k (i.e., the first aggregation node (GJ)) as an example, a schematic diagram of adding the target aggregation node is shown in fig. 5l. Then, a matching communication node may be added for each target node in the computational graph that has not been aggregated, and a matching communication node may be added for the target aggregation node in the computational graph; the communication node is used to represent a data transmission operation. Continuing the above example, the target nodes that have not been aggregated include target node B, target node E, target node L and target node O, and the target aggregation node includes the first aggregation node (GJ); a schematic diagram of adding the communication nodes is shown in fig. 5m.
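As a non-limiting illustration, the graph update described above may be sketched in Python as follows. The adjacency-list graph, the toy edges between the target nodes, the node names `Agg(...)` and `Comm(...)`, the function name `update_graph`, and the direction of the added edges (from an aggregated target node to its aggregation node, and from a node to its communication node) are all assumptions made only for illustration.

```python
def update_graph(edges, target_nodes, aggregation_nodes):
    """Return a copy of the graph with aggregation and communication nodes added."""
    updated = {node: list(succ) for node, succ in edges.items()}
    aggregated = set()
    for members in aggregation_nodes:
        agg = "Agg(" + "".join(sorted(members)) + ")"
        updated[agg] = []
        for node in sorted(members):
            # Directed edge from each aggregated target node to the aggregation node.
            updated.setdefault(node, []).append(agg)
        aggregated |= set(members)
        # Matching communication node for the aggregation node.
        comm = f"Comm({agg})"
        updated[agg].append(comm)
        updated[comm] = []
    for node in target_nodes:
        if node not in aggregated:
            # Matching communication node for each target node left unaggregated.
            comm = f"Comm({node})"
            updated.setdefault(node, []).append(comm)
            updated[comm] = []
    return updated


# Toy graph over the six target nodes of the example (assumed edges).
edges = {"B": ["E"], "E": ["G", "L"], "G": ["J"], "J": ["O"], "L": ["O"], "O": []}
updated = update_graph(edges, ["B", "E", "G", "J", "L", "O"], [frozenset("GJ")])
for node, successors in updated.items():
    print(node, "->", successors)
```

In this sketch, target nodes G and J feed the aggregation node `Agg(GJ)`, which has a single communication node, while target nodes B, E, L and O each receive their own communication node.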

And S406, sending the updated calculation graph to the computing equipment.

In the embodiment of the invention, the dominant point information of a plurality of target nodes in the computational graph of the target object may first be acquired. Next, a dominance tree may be constructed from the dominant point information of the plurality of target nodes, and aggregation level information may be extracted based on the dominance tree. At least one layer of aggregation iteration processing is then performed on the plurality of target nodes according to the aggregation level information, so as to improve the accuracy of the target aggregation node. The computational graph may then be updated with the target aggregation node, and the updated computational graph may be sent to the computing device. In the process of computing the target object, after the data processing operations represented by the aggregated target nodes have been executed, the computing device can aggregate the execution result data of those data processing operations according to the indication of the target aggregation node and transmit the aggregation result, so that the number of data transmissions is reduced, network resources are saved, and the total transmission time is shortened.

In practical applications, the above-mentioned data transmission method can be applied in different application scenarios, for example, a distributed machine learning scenario, a scenario in which one or more computing devices are used to test applications, a scenario in which one or more computing devices are used to test hardware devices, and so on. Distributed machine learning refers to a machine learning mode in which the machine learning task of a neural network model is distributed to multiple computing devices for parallel processing. Distributed machine learning may support multiple modes, such as a data parallelism mode and a model parallelism mode. In the data parallelism mode, different computing devices hold copies of the same model; each computing device trains its own copy in parallel using different training data, and the computation results (e.g., gradients) produced by all computing devices during model training are then combined in some manner. In the model parallelism mode, different parts of the same model are assigned to different computing devices, for example different network layers, or different parameters of the same network layer; each computing device trains the part for which it is responsible in parallel, and the training results of all computing devices are then combined.

Machine learning is a multidisciplinary field involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It studies how a computer device can simulate or implement human learning behavior in order to acquire new knowledge or skills, and how it can reorganize existing knowledge structures to continuously improve its own performance. Machine learning is the core of AI (Artificial Intelligence), which refers to the theories, methods, techniques, and application systems that use a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain optimal results. In other words, AI is a comprehensive technology of computer science; it seeks to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence, so that such a machine has functions such as perception, reasoning, and decision making.

The specific application of the data transmission method is described below, taking its application to a distributed machine learning scenario as an example. In the distributed machine learning scenario, the target object may be a neural network model on which distributed machine learning is to be performed, and the execution result data of the data processing operation represented by each target node includes the gradients generated by the neural network model in the distributed machine learning. The general principle of the data transmission method is shown in fig. 6:

The processing device may first obtain the computational graph of the neural network model, which may include a plurality of target nodes representing data processing operations whose execution result data (e.g., gradients) need to be transmitted. Second, by comparing the reachability information of the target nodes in the computational graph that need to transmit synchronization data (i.e., gradients), target nodes with the same or similar reachability information may be aggregated into one target aggregation node (Concat node). The target aggregation node may then be added to the computational graph, and a communication node (All Reduce node) may be added for each tensor that needs to be communicated (i.e., the gradient corresponding to a target node that has not been aggregated, and the aggregation result corresponding to an aggregation node), so as to update the computational graph. At run time, the processing device may issue the updated computational graph to each computing device. While training its copy of the neural network model, each computing device can perform gradient fusion on the gradients corresponding to the aggregated target nodes according to the indication of the aggregation node in the updated computational graph; gradient fusion means fusing different gradients into one communication data segment for transmission. After gradient fusion, the communication node can be run; when a computing device reaches a communication node, it can communicate synchronously with the management device to transmit the corresponding tensor (the gradient corresponding to a target node that has not been aggregated, or the aggregation result corresponding to an aggregation node) to the management device.

Correspondingly, after receiving the tensors transmitted by the computing devices, if the tensor transmitted by each computing device is a gradient corresponding to a target node that has not been aggregated, the management device may directly perform a merging calculation (such as a mean calculation) on the gradients transmitted by the computing devices and update the network parameters of the neural network model (i.e., the target object) with the merged gradient. If the tensor transmitted by each computing device is an aggregation result corresponding to an aggregation node, the management device may perform separation processing on the aggregation result to recover the individual gradients that were fused. The gradients of the same data processing operation transmitted by the computing devices may then be merged (for example, by mean calculation), and each merged gradient is used to update the corresponding network parameters of the neural network model (i.e., the target object). After updating the network parameters, the management device may issue the updated network parameters to each computing device, or may issue them to a computing device after receiving a pull request from that computing device, so that each computing device performs the next round of model training with the updated network parameters; these steps are repeated until model training is completed.
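As a non-limiting illustration, the gradient fusion on the computing device side and the separation and merging on the management device side may be sketched in Python as follows. The function names (`fuse`, `merge_mean`, `split`), the use of flat Python lists as gradients, the layout record of slice boundaries, and the gradient values are assumptions made only for illustration; a real framework would operate on tensors and use its own communication primitives.

```python
def fuse(gradients, names):
    """Concatenate the named gradients into one communication data segment."""
    buffer, layout = [], []
    for name in names:
        start = len(buffer)
        buffer.extend(gradients[name])
        layout.append((name, start, len(buffer)))  # remember each gradient's slice
    return buffer, layout


def merge_mean(buffers):
    """Element-wise mean of the segments received from all computing devices."""
    count = len(buffers)
    return [sum(values) / count for values in zip(*buffers)]


def split(buffer, layout):
    """Separate a fused segment back into the individual gradients."""
    return {name: buffer[start:end] for name, start, end in layout}


# Gradients of the aggregated target nodes G and J on two computing devices.
device_gradients = [
    {"G": [1.0, 3.0], "J": [2.0]},
    {"G": [3.0, 5.0], "J": [4.0]},
]

# Each device fuses G and J into one segment and transmits it once.
fused = [fuse(grads, ["G", "J"]) for grads in device_gradients]
layout = fused[0][1]

# The management device merges the segments and then separates the result.
merged = split(merge_mean([segment for segment, _ in fused]), layout)
print(merged)
```

Because the fused segments share the same layout on every device, the mean can be taken once over the whole segment before separation; in this sketch each device performs one transmission for G and J instead of two.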

Therefore, when the data transmission method provided by the embodiment of the invention is applied to a distributed machine learning scenario, the gradients obtained by each computing device during model training can be fused and transmitted effectively, so that transmission delay is reduced and communication is accelerated. Moreover, the gradient fusion method can adapt to complex computational graph topologies and to different traffic thresholds, and can fuse communication information flexibly so that computation and communication proceed in parallel. It should be understood that the data transmission method provided by the embodiment of the invention can be applied flexibly to machine learning platforms such as distributed machine learning frameworks, and can be extended to other distributed systems that need computation and communication to proceed in parallel; the embodiments of the present invention are not limited in this regard.

Based on the above description of the embodiments of the data transmission method, the embodiments of the present invention also disclose a data transmission device, which may be a computer program (including program code) running in a processing device. The data transmission device may perform the method shown in fig. 2 or fig. 4. Referring to fig. 7, the data transmission device may run the following units:

An obtaining unit 701, configured to obtain dominant point information of a plurality of target nodes in a computation graph of a target object; each target node is used for representing a data processing operation which needs to be executed in the calculation process of the target object, and execution result data of the data processing operation represented by each target node needs to be transmitted; the dominant point information of any target node is used for reflecting the dominant relationship between any target node and other target nodes;

an aggregation unit 702, configured to aggregate at least two target nodes into a target aggregate node according to dominant point information of each target node, where the target aggregate node is configured to instruct aggregation of execution result data of a data processing operation represented by the aggregated target node;

a processing unit 703, configured to update the computational graph with the target aggregation node, and send the updated computational graph to a computing device, where the updated computational graph is used to indicate that the computing device, in the process of computing the target object, aggregates the execution result data of the data processing operations represented by the aggregated target nodes according to the indication of the target aggregation node, and transmits the aggregation result.

In one embodiment, the target nodes correspond to a target directed graph, and the target directed graph is obtained by connecting the target nodes by using a plurality of directed edges based on the reachability relation of the target nodes in the calculation graph; the reachability relationship is used to indicate: the ability of one target node to reach other target nodes along at least one edge in the computational graph;

the dominance information for any target node includes at least one of: the dominant point set of any target node and the inverse dominant point set of any target node;

the dominant points in the dominant point set of any target node are: the target nodes through which all forward paths from a starting target node to the any target node in the target directed graph pass; the dominant point in the dominant point set of the any target node that is closest to the any target node is the nearest dominant point of the any target node;

the inverse dominant points in the inverse dominant point set of any target node are: the target nodes through which all reverse paths from a starting target node to the any target node in the reverse graph corresponding to the target directed graph pass; the inverse dominant point in the inverse dominant point set of the any target node that is closest to the any target node is the nearest inverse dominant point of the any target node; the reverse graph is a graph obtained by reversing each directed edge in the target directed graph.
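As a non-limiting illustration, the dominant point sets and inverse dominant point sets defined above may be computed with a standard iterative dominator analysis, sketched in Python as follows. The function names (`reverse`, `dominator_sets`, `nearest`), the choice to exclude a node from its own dominant point set, and the edges of the target directed graph (B→E, E→G, E→L, G→J, J→O, L→O) are assumptions made only for illustration.

```python
def reverse(edges):
    """Reverse every directed edge of the graph."""
    reversed_edges = {}
    for node, successors in edges.items():
        reversed_edges.setdefault(node, [])
        for succ in successors:
            reversed_edges.setdefault(succ, []).append(node)
    return reversed_edges


def dominator_sets(edges, start):
    """Return, for each node, the nodes lying on every path from `start` to it."""
    nodes = set(edges) | {s for succ in edges.values() for s in succ}
    preds = {node: set() for node in nodes}
    for node, successors in edges.items():
        for succ in successors:
            preds[succ].add(node)
    dom = {node: set(nodes) for node in nodes}
    dom[start] = {start}
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for node in nodes - {start}:
            if not preds[node]:
                continue  # not reachable from the start node
            new = set.intersection(*(dom[p] for p in preds[node])) | {node}
            if new != dom[node]:
                dom[node], changed = new, True
    # Assumption: a node is not counted as its own dominant point.
    return {node: d - {node} for node, d in dom.items()}


def nearest(dominators, node):
    """Nearest dominant point: the dominant point that itself has the most dominant points."""
    candidates = dominators[node]
    return max(candidates, key=lambda d: len(dominators[d])) if candidates else None


edges = {"B": ["E"], "E": ["G", "L"], "G": ["J"], "J": ["O"], "L": ["O"], "O": []}
dom = dominator_sets(edges, "B")           # dominant point sets
inv = dominator_sets(reverse(edges), "O")  # inverse dominant point sets
print(sorted(dom["O"]), nearest(dom, "O"))
print(sorted(inv["G"]), nearest(inv, "G"))
```

Under the assumed edges, the nearest dominant points reproduce a dominance tree with root B, child E, grandchildren G, L and O, and J below G, and target node J appears in the inverse dominant point set of target node G.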

In yet another embodiment, the aggregation unit 702, when configured to aggregate at least two target nodes into a target aggregate node according to dominant point information of each target node, may be specifically configured to:

In yet another embodiment, the aggregation unit 702, when configured to construct a dominant tree composed of the plurality of target nodes according to dominant point information of each target node, may be specifically configured to:

In yet another embodiment, the parent node of each target node in the dominance tree other than the root node is the nearest dominant point of that target node; K dominant pairs exist among the plurality of target nodes, and each dominant pair is used to associate the node groups required for at least one layer of aggregation, wherein K is a positive integer; accordingly, the aggregation unit 702, when configured to extract aggregation level information based on the dominance tree, may be specifically configured to:

In yet another embodiment, the aggregation unit 702, when configured to select, according to the second target node, at least one target node from the plurality of target nodes to be added to a node group required for target layer aggregation associated with the kth dominant pair, may be specifically configured to:

In yet another embodiment, the aggregation unit 702, when configured to select the first target node and the second target node and add them to the node group required for the target layer aggregation associated with the kth dominant pair, may be specifically configured to:

In yet another embodiment, the aggregation unit 702, when configured to select the other descendant nodes to be added to the node group required for the target layer aggregation, may be specifically configured to:

In another embodiment, the aggregation unit 702, when configured to perform at least one layer of aggregation iterative processing on the plurality of target nodes according to the aggregation level information, may be specifically configured to:

In yet another embodiment, the aggregation unit 702 may be further specifically configured to:

In yet another embodiment, the aggregation unit 702, when configured to obtain the target aggregation node according to the nth aggregation node, may be specifically configured to:

In yet another embodiment, the processing unit 703, when configured to update the computational graph with the target aggregation node, may be specifically configured to:

In yet another embodiment, the target object includes a neural network model to be subjected to distributed machine learning, and the execution result data of the data processing operation represented by each target node includes: gradients generated by the neural network model in the distributed machine learning.
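As an illustration of why aggregating gradients reduces the number of transmissions, the following is a minimal sketch, not the method of this application. It assumes that "aggregation" means concatenating the gradients of one node group into a single buffer, which the text does not specify; the names `fuse`, `unfuse`, `w1`, `b1` and `w2` are hypothetical.

```python
from typing import Dict, List, Tuple

Layout = List[Tuple[str, int]]  # (gradient name, number of elements)


def fuse(grads: Dict[str, List[float]], group: List[str]) -> Tuple[List[float], Layout]:
    """Concatenate the gradients of one node group into a single buffer.

    Assumption: aggregation is modelled as plain concatenation, so the
    group can be sent in one transmission instead of len(group) sends.
    """
    buffer: List[float] = []
    layout: Layout = []
    for name in group:
        layout.append((name, len(grads[name])))
        buffer.extend(grads[name])
    return buffer, layout


def unfuse(buffer: List[float], layout: Layout) -> Dict[str, List[float]]:
    """Split a received buffer back into per-node gradients."""
    out: Dict[str, List[float]] = {}
    offset = 0
    for name, size in layout:
        out[name] = buffer[offset:offset + size]
        offset += size
    return out


# Hypothetical gradients produced by three target nodes.
grads = {"w1": [0.1, 0.2], "b1": [0.3], "w2": [0.4, 0.5, 0.6]}
fused, layout = fuse(grads, ["w1", "b1", "w2"])
restored = unfuse(fused, layout)
```

In this sketch the three gradients travel as one six-element buffer, and the receiver recovers the original per-node gradients from the recorded layout.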

According to one embodiment of the invention, the steps involved in the method of fig. 2 or fig. 4 may be performed by the units of the data transmission device of fig. 7. For example, steps S201 to S203 shown in fig. 2 may be performed by the acquisition unit 701, the aggregation unit 702, and the processing unit 703 shown in fig. 7, respectively; as another example, step S401 shown in fig. 4 may be performed by the acquisition unit 701 shown in fig. 7, steps S402 to S404 may be performed by the aggregation unit 702 shown in fig. 7, and steps S405 to S406 may be performed by the processing unit 703 shown in fig. 7.

According to another embodiment of the present invention, the units in the data transmission device shown in fig. 7 may be separately or entirely combined into one or several other units, or one or more of the units may be further split into a plurality of functionally smaller units; either arrangement achieves the same operation without affecting the technical effects of the embodiments of the present invention. The above units are divided based on logical functions; in practical applications, the function of one unit may be implemented by a plurality of units, or the functions of a plurality of units may be implemented by one unit. In other embodiments of the present invention, the data transmission device may also include other units; in practical applications, these functions may also be implemented with the assistance of other units, and may be implemented by a plurality of units in cooperation.

According to another embodiment of the present invention, the data transmission device shown in fig. 7 may be constructed, and the data transmission method of the embodiment of the present invention may be implemented, by running a computer program (including program code) capable of executing the steps involved in the respective methods shown in fig. 2 or fig. 4 on a general-purpose processing device, such as a computer, that includes processing elements and storage elements such as a central processing unit (CPU), a random access storage medium (RAM), and a read-only storage medium (ROM). The computer program may be recorded on, for example, a computer-readable recording medium, and loaded into and executed in the processing device described above via the computer-readable recording medium.

Based on the description of the method embodiment and the device embodiment, the embodiment of the invention also provides a processing device. Referring to fig. 8, the processing device includes at least a processor 801, an input interface 802, an output interface 803, and a computer storage medium 804. The processor 801, the input interface 802, the output interface 803, and the computer storage medium 804 within the processing device may be connected by a bus or by other means.

The computer storage medium 804 may be stored in a memory of the processing device; the computer storage medium 804 is adapted to store a computer program comprising program instructions, and the processor 801 is adapted to execute the program instructions stored in the computer storage medium 804. The processor 801 (or central processing unit, CPU) is the computing core and control core of the processing device, and is adapted to implement one or more instructions, in particular to load and execute one or more instructions so as to implement a corresponding method flow or a corresponding function. In one embodiment, the processor 801 according to the embodiments of the present invention may be configured to perform a series of data transmission processes, including: acquiring dominant point information of a plurality of target nodes in a computational graph of a target object, wherein each target node is used for representing a data processing operation that needs to be executed in the computing process of the target object, the execution result data of the data processing operation represented by each target node needs to be transmitted, and the dominant point information of any target node is used for reflecting the dominance relationship between that target node and the other target nodes; aggregating at least two target nodes into a target aggregation node according to the dominant point information of each target node, wherein the target aggregation node is used for indicating that the execution result data of the data processing operations represented by the aggregated target nodes is to be aggregated; updating the computational graph with the target aggregation node, and sending the updated computational graph to a computing device, wherein the updated computational graph is used for indicating that the computing device, in the computing process of the target object, aggregates the execution result data of the data processing operations represented by the aggregated target nodes according to the indication of the target aggregation node and transmits the aggregation result; and the like.
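The dominant point information referred to above can be illustrated with the standard notion of a dominator in a rooted directed graph: node d dominates node n if every path from the root to n passes through d. The following is a minimal sketch under that assumption; it uses the classic iterative set-intersection computation rather than any procedure stated in this application, and the graph `succ`, the node names, and the function names are hypothetical.

```python
from typing import Dict, List, Optional, Set


def dominator_sets(succ: Dict[str, List[str]], root: str) -> Dict[str, Set[str]]:
    """Iteratively compute Dom(n) = {n} | intersection of Dom(p) over predecessors p."""
    nodes = set(succ) | {v for vs in succ.values() for v in vs}
    preds: Dict[str, List[str]] = {n: [] for n in nodes}
    for u, vs in succ.items():
        for v in vs:
            preds[v].append(u)

    # Start from the most conservative guess and shrink to a fixed point.
    dom = {n: set(nodes) for n in nodes}
    dom[root] = {root}
    changed = True
    while changed:
        changed = False
        for n in sorted(nodes - {root}):
            pred_doms = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*pred_doms) if pred_doms else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def nearest_dominator(dom: Dict[str, Set[str]], node: str) -> Optional[str]:
    """Return the nearest strict dominator, i.e. the parent in a dominance tree."""
    strict = dom[node] - {node}
    if not strict:
        return None
    # Strict dominators form a chain; the nearest one has the largest set.
    return max(strict, key=lambda d: len(dom[d]))


# Hypothetical directed graph over five target nodes, rooted at "a".
succ = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": []}
dom = dominator_sets(succ, "a")
parents = {n: nearest_dominator(dom, n) for n in sorted(dom)}
```

Here `d` is reachable through either `b` or `c`, so neither of them dominates it and its nearest dominator is `a`; the resulting `parents` mapping is the parent relation of a dominance tree over these nodes.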

The embodiment of the invention also provides a computer storage medium (memory), which is a memory device in the processing device and is used for storing programs and data. It is understood that the computer storage medium herein may include both a built-in storage medium in the processing device and an extended storage medium supported by the processing device. The computer storage medium provides a storage space that stores the operating system of the processing device. Also stored in this storage space are one or more instructions, which may be one or more computer programs (including program code), adapted to be loaded and executed by the processor 801. The computer storage medium herein may be a high-speed RAM memory or a non-volatile memory, such as at least one magnetic disk memory; optionally, it may be at least one computer storage medium located remotely from the processor.

In one embodiment, one or more instructions stored in computer storage medium 804 may be loaded and executed by processor 801 to implement the corresponding method steps described above in connection with the data transmission method embodiments shown in fig. 2 or fig. 4; in particular implementations, one or more instructions in computer storage media 804 are loaded by processor 801 and perform the steps of:

In yet another embodiment, the one or more instructions may be loaded and executed by the processor 801 when aggregating at least two target nodes into a target aggregate node according to dominant point information of each target node:

In yet another embodiment, when constructing a dominance tree composed of the plurality of target nodes according to the dominant point information of each target node, the one or more instructions may be loaded and specifically executed by the processor 801:

In yet another embodiment, the parent node of each target node in the dominance tree, other than the root node, is the nearest dominant point of that target node; K dominant pairs exist among the plurality of target nodes, and each dominant pair is used to associate the node groups required by at least one layer of aggregation, where K is a positive integer. Accordingly, when extracting the aggregation level information based on the dominance tree, the one or more instructions may be loaded and specifically executed by the processor 801:

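In yet another embodiment, when selecting, according to the second target node, at least one target node from the plurality of target nodes to add to the node group required for target layer aggregation associated with the kth dominant pair, the one or more instructions may be loaded and specifically executed by the processor 801: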

In yet another embodiment, when selecting the first target node and the second target node to add to the node group required for target layer aggregation associated with the kth dominant pair, the one or more instructions may be loaded and specifically executed by the processor 801:

In yet another embodiment, the one or more instructions may be loaded and executed by the processor 801 when the other descendant node is selected to be added to the node group required for target layer aggregation:

In yet another embodiment, when at least one layer of aggregation iteration processing is performed on the plurality of target nodes according to the aggregation level information to obtain a target aggregation node, the one or more instructions may be loaded and specifically executed by the processor 801:

In yet another embodiment, the one or more instructions may also be loaded and executed in particular by the processor 801:

In yet another embodiment, the one or more instructions may be loaded and executed by the processor 801 to:

In yet another embodiment, the one or more instructions may be loaded and executed in particular by the processor 801 when updating the computational graph with the target aggregation node:

It should be noted that, according to an aspect of the present application, there is also provided a computer program product or a computer program comprising computer instructions stored in a computer-readable storage medium. The processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, so that the computer device performs the methods provided in the various alternatives of the data transmission method embodiments shown in fig. 2 or fig. 4 described above.

It is also to be understood that the foregoing is merely illustrative of the present application and is not to be construed as limiting the scope of the application, which is defined by the appended claims.

Claims

1. A data transmission method, comprising:

2. The method of claim 1, wherein the plurality of target nodes correspond to a target directed graph, the target directed graph being obtained by connecting each target node with a plurality of directed edges based on a reachability relationship of each target node in the computational graph; the reachability relationship is used to indicate: the ability of one target node to reach other target nodes along at least one edge in the computational graph;

the dominant point information of any target node includes at least one of: the dominant point set of any target node and the inverse dominant point set of any target node;

3. The method of claim 2, wherein aggregating at least two target nodes into a target aggregate node based on dominant point information for each target node, comprises:

4. The method of claim 3, wherein constructing a dominance tree of the plurality of target nodes based on dominance point information for each target node comprises:

5. The method of claim 3, wherein the parent node of each target node in the dominance tree other than the root node is: the nearest dominant point of each target node; k pairs of branches exist in the target nodes, one pair of branches is used for associating node groups required by at least one layer of aggregation; wherein K is a positive integer; the extracting aggregation level information based on the dominance tree includes:

6. The method of claim 5, wherein the selecting at least one target node from the plurality of target nodes based on the second target node, if any, to add to the group of nodes required for target layer aggregation associated with the kth dominant pair comprises:

7. The method of claim 6, wherein the selecting the first target node and the second target node for addition to the group of nodes required for target layer aggregation associated with the kth dominant pair comprises:

8. The method of claim 6, wherein the selecting the other descendant nodes to add to the group of nodes required for the target layer aggregation comprises:

9. The method of claim 3, wherein performing at least one layer of aggregation iteration processing on the plurality of target nodes according to the aggregation level information to obtain target aggregation nodes comprises:

10. The method of claim 9, wherein the method further comprises:

11. The method according to claim 9 or 10, wherein the obtaining a target aggregation node according to the nth aggregation node comprises:

12. The method of claim 1, wherein the updating the computational graph with the target aggregation node comprises:

13. The method of claim 1, wherein the target object comprises a neural network model to be subjected to distributed machine learning, and the execution result data of the data processing operation represented by each target node comprises: gradients generated by the neural network model in the distributed machine learning.

14. A data transmission apparatus, comprising:

15. A processing device comprising an input interface and an output interface, further comprising:

a computer storage medium storing one or more instructions adapted to be loaded by the processor to perform the data transmission method according to any one of claims 1-13.