CN107291533B - Method and device for determining upstream node bottleneck degree and system bottleneck degree

Info

Publication number: CN107291533B
Application number: CN201610197115.3A
Authority: CN
Inventors: 胡于响
Original assignee: Alibaba Group Holding Ltd
Current assignee: Alibaba Group Holding Ltd
Priority date: 2016-03-31
Filing date: 2016-03-31
Publication date: 2020-10-30
Anticipated expiration: 2036-03-31
Also published as: CN107291533A

Abstract

Embodiments of the present application disclose a method and a device for determining an upstream node bottleneck degree and a system bottleneck degree. The method for determining the upstream node bottleneck degree comprises the following steps: determining all upstream nodes of a current node; determining an optimizable time range according to the starting operation time of the computing tasks corresponding to the upstream nodes and the starting operation time of the computing task corresponding to the current node; splitting the optimizable time range into a plurality of time intervals; and calculating the bottleneck degree of each upstream node according to the time intervals in which the computing task corresponding to that upstream node runs, the total number of computing tasks running in each of those time intervals, and the optimizable time range. With this method and device, the degree to which each upstream node constrains the time at which the computing task corresponding to the current node starts to execute can be determined, which facilitates optimization of the output time of that computing task.

Description

Method and device for determining upstream node bottleneck degree and system bottleneck degree

Technical Field

The present application relates to the field of internet, and in particular, to a method and an apparatus for determining an upstream node bottleneck degree and a system bottleneck degree.

Background

In a scheduling system, a scheduling topology graph (i.e., a workflow) is usually provided, and the scheduling system calls application programs according to the scheduling topology graph to execute computing tasks. For example, as shown in fig. 1, the scheduling system first calls an application program to execute the computing task corresponding to node A; after that task is completed, it calls application programs to execute the computing tasks corresponding to node B and node C, and so on, until the computing tasks corresponding to node D and node E have been executed.

In the scheduling system, the output time (i.e., the ending operation time) of each computing task is an important index, so there is often a need to optimize it. Because the output time of a computing task equals its starting operation time plus its running duration, the output time can be optimized in two ways: by advancing the starting operation time of the computing task and by reducing its running duration.

In practice, the running duration of a computing task can be reduced by optimizing the code of the application program that executes it. In the scheduling system, however, the starting operation time of each computing task is constrained by the ending operation time of the computing tasks corresponding to its upstream nodes in the scheduling topology graph. For example, the starting operation time of node D in fig. 1 is constrained by the ending operation times of the computing tasks corresponding to the upstream nodes A, B and C; that is, the computing task corresponding to node D can start only after the computing tasks corresponding to all three upstream nodes have been executed. Further, because each upstream node constrains the current computing task to a different degree, there is a need for a method and a device for calculating the upstream node bottleneck degree and the system bottleneck degree, so as to determine the degree to which each upstream node constrains the time at which the current computing task starts to execute, and thereby facilitate optimization of the output time of the current computing task.

Summary of the Application

The embodiment of the application provides a method and a device for determining the bottleneck degree of an upstream node and the bottleneck degree of a system, so as to determine the degree of restriction of each upstream node on the starting execution time of a calculation task corresponding to a current node, thereby facilitating the optimization of the output time of the calculation task corresponding to the current node.

In order to solve the technical problem, the embodiment of the application discloses the following technical scheme:

In one aspect, the present application discloses a method for determining an upstream node bottleneck degree, comprising:

determining all upstream nodes of the current node;

determining an optimizable time range according to the starting operation time of the computing task corresponding to the upstream node and the starting operation time of the computing task corresponding to the current node;

splitting the optimizable time range into a plurality of time intervals;

and calculating the bottleneck degree of each upstream node according to the time intervals in which the computing task corresponding to each upstream node runs, the total number of computing tasks running in those time intervals, and the optimizable time range.

Optionally, determining an optimizable time range according to the starting operation time of the computing task corresponding to the upstream node and the starting operation time of the computing task corresponding to the current node includes:

judging whether a timed computing task exists among the computing tasks corresponding to the upstream nodes;

if a timed computing task exists, taking the starting operation time of the timed computing task as the target starting operation time;

if no timed computing task exists, determining the earliest starting operation time among the computing tasks of the upstream nodes as the target starting operation time;

and determining the optimizable time range according to the target starting operation time and the starting operation time of the computing task corresponding to the current node.

Optionally, the method further includes:

sequencing all upstream nodes of the current node according to the bottleneck degree;

and taking a preset number of upstream nodes as the upstream bottleneck nodes of the current node.
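As a minimal sketch of this optional ranking step, the Python snippet below sorts upstream nodes by bottleneck degree and keeps a preset number of them. The degrees are the ones obtained in the worked example for node I in the detailed description. The function name `top_bottlenecks`, the parameter `preset_count`, and the choice to keep the highest-ranked nodes are illustrative assumptions rather than wording from the application.

```python
from typing import Dict, List


def top_bottlenecks(degrees: Dict[str, float], preset_count: int) -> List[str]:
    """Sort upstream nodes by bottleneck degree, largest first, and keep a preset number."""
    ranked = sorted(degrees, key=degrees.get, reverse=True)
    return ranked[:preset_count]


# Bottleneck degrees of node I's upstream nodes, taken from the worked example.
degrees = {"A": 0.0, "B": 0.111, "C": 0.111, "D": 0.167,
           "E": 0.111, "F": 0.361, "G": 0.139}
print(top_bottlenecks(degrees, 2))  # ['F', 'D']
```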

Optionally, the method further includes:

taking a time interval in which a plurality of computing tasks run simultaneously as an indirect bottleneck interval;

and determining the optimization difficulty of the current node according to the indirect bottleneck interval and the optimizable time range.
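The application does not state how the optimization difficulty is derived from the indirect bottleneck intervals and the optimizable time range. The sketch below therefore uses an assumed metric, namely the share of the optimizable time range covered by intervals in which more than one computing task runs. The function names, the minute-offset representation, and the ratio itself are hypothetical; the interval counts are those of the worked example for node I.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end) in minutes after the target starting operation time


def indirect_bottleneck_intervals(intervals: List[Interval],
                                  task_counts: List[int]) -> List[Interval]:
    """Keep the intervals in which more than one computing task runs at once."""
    return [iv for iv, count in zip(intervals, task_counts) if count > 1]


def optimization_difficulty(intervals: List[Interval], task_counts: List[int]) -> float:
    """ASSUMED metric: fraction of the optimizable range covered by indirect bottleneck intervals."""
    total = intervals[-1][1] - intervals[0][0]
    shared = indirect_bottleneck_intervals(intervals, task_counts)
    return sum(end - start for start, end in shared) / total


# Six 10-minute intervals between 5:00 and 6:00 and the number of tasks running in each.
intervals = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
task_counts = [3, 3, 3, 3, 2, 1]
print(round(optimization_difficulty(intervals, task_counts), 3))  # 0.833
```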

On the other hand, the application also discloses a method for determining the system bottleneck degree, which comprises the following steps:

calculating a shortest path from a non-leaf node to each leaf node in a scheduling topological graph, wherein the non-leaf node is a node with a downstream node in the scheduling topological graph;

when the running of the computing task corresponding to the non-leaf node is increased by a preset duration, determining, according to the shortest path, the duration by which the running of the computing task corresponding to each leaf node is increased;

and determining the system bottleneck degree of the non-leaf node according to the duration by which the running of each leaf node is increased.

Optionally, calculating a shortest path from a non-leaf node to each leaf node in the scheduling topology includes:

determining all paths from a non-leaf node to a leaf node in a scheduling topological graph; wherein each path comprises a plurality of nodes;

determining the path length of each path according to the end operation time and the start operation time of the computing task corresponding to the adjacent nodes in each path;

and determining the path with the shortest path length as the shortest path.
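One possible reading of the three steps above is sketched below: the length of a path is taken to be the sum of the gaps between the ending operation time of each node and the starting operation time of the next node on the path. That reading, the function name `shortest_slack`, and the small graph with its times (in minutes) are assumptions made for illustration.

```python
from typing import Dict, List


def shortest_slack(edges: Dict[str, List[str]], start: Dict[str, int],
                   end: Dict[str, int], source: str) -> Dict[str, int]:
    """Return, for each leaf reachable from `source`, the smallest total gap over all paths."""
    best: Dict[str, int] = {}

    def walk(node: str, slack: int) -> None:
        children = edges.get(node, [])
        if not children:  # a leaf node: record the shortest length seen so far
            best[node] = min(slack, best.get(node, slack))
            return
        for child in children:
            # Gap between this node's end and the adjacent downstream node's start.
            walk(child, slack + start[child] - end[node])

    walk(source, 0)
    return best


# Hypothetical scheduling topology graph; D and E are leaf nodes.
edges = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"]}
start = {"A": 0, "B": 10, "C": 10, "D": 30, "E": 25}
end = {"A": 10, "B": 30, "C": 20, "D": 40, "E": 35}
print(shortest_slack(edges, start, end, "A"))  # {'D': 0, 'E': 5}
```

The exhaustive walk mirrors the "determine all paths" wording; on large graphs a dynamic-programming pass in topological order would give the same result with less work.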

Optionally, when the running of the computing task corresponding to the non-leaf node is increased by a preset duration, determining, according to the shortest path, the duration by which the running of the computing task corresponding to each leaf node is increased includes:

judging whether the preset duration by which the running of the computing task corresponding to the current node is increased is greater than the length of the shortest path from the current node to a leaf node;

if so, determining that the duration by which the running of the computing task corresponding to the leaf node is increased is: the preset duration by which the running of the computing task corresponding to the current node is increased, minus the length of the shortest path;

and if the preset duration is less than or equal to the length of the shortest path, determining that the duration by which the running of the computing task corresponding to the leaf node is increased is zero.
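Read this way, the rule reduces to a clamped subtraction: the leaf node is delayed by whatever part of the preset increase is not absorbed by the shortest path. The sketch below assumes that reading; the function names and the shortest-path lengths (in minutes) are hypothetical.

```python
from typing import Dict


def leaf_increase(preset_increase: int, shortest_path_length: int) -> int:
    """Increase seen at one leaf: the preset increase minus the shortest path, never below zero."""
    return max(0, preset_increase - shortest_path_length)


def leaf_increases(preset_increase: int, shortest: Dict[str, int]) -> Dict[str, int]:
    """Apply the rule to every leaf node reachable from the current node."""
    return {leaf: leaf_increase(preset_increase, length)
            for leaf, length in shortest.items()}


# Hypothetical shortest-path lengths from the current node to leaves D and E.
shortest = {"D": 0, "E": 10}
print(leaf_increases(5, shortest))  # {'D': 5, 'E': 0}
```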

Optionally, the method further includes:

sequencing all non-leaf nodes in the scheduling topological graph according to the size of the bottleneck degree of the system;

and taking a preset number of non-leaf nodes as system bottleneck nodes.

In another aspect, the present application further discloses a method for determining a system bottleneck degree, comprising:

determining a non-leaf node;

when the running of the computing task corresponding to the non-leaf node is reduced by a preset duration, calculating the number of activated leaf nodes in a scheduling topological graph, wherein the non-leaf node is a node having a downstream node in the scheduling topological graph, and an activated leaf node is a leaf node for which the starting operation time and the ending operation time of the corresponding computing task have changed;

and determining the system bottleneck degree of the non-leaf node according to the number of activated leaf nodes and the preset duration.

Optionally, when the running of the non-leaf node corresponding to the computation task is reduced by a preset time length, calculating the number of activated leaf nodes in the scheduling topology, including:

when the running of a non-leaf node corresponding to the calculation task in the scheduling topological graph is reduced by a preset time length, recalculating the running ending time of the calculation task;

setting the non-leaf node as an active node;

judging whether the end running time of the computing task corresponding to the active node influences the start running time of the computing task corresponding to a direct downstream node of the active node, wherein the direct downstream node is a downstream node directly connected with the active node in the scheduling topological graph;

if so, determining the direct downstream node as an active node, and recalculating the starting running time and the ending running time of the corresponding calculation task of the active node;

judging whether the activated node is a leaf node;

if yes, determining the activated node as an activated leaf node;

if not, returning to the step of judging whether the ending operation time of the computing task corresponding to the active node influences the starting operation time of the computing task corresponding to a direct downstream node of the active node, and executing that step in a loop.

Optionally, calculating the ending operation time of the computing task corresponding to the active node, and whether to influence the starting operation time of the computing task corresponding to the direct downstream node of the active node, includes:

determining, in a scheduling topology, a node immediately downstream of an active node and a node immediately upstream of the immediately downstream node;

determining the latest finishing running time of the corresponding computing task in all direct upstream nodes of a direct downstream node;

judging whether the latest ending operation time is the same as the recalculated ending operation time of the active node;

if they are the same, determining that the active node influences the starting operation time of the computing task corresponding to the direct downstream node; otherwise, determining that the active node does not influence the starting operation time of the computing task corresponding to the direct downstream node.
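The propagation described above can be sketched as follows. The sketch keeps the comparison "latest upstream ending time equals the recalculated ending time of an active node" but evaluates nodes in topological order, so that every direct upstream node is up to date before a downstream node is examined. That ordering is a substitution for the loop wording, not something the application states. It further assumes that a task starts as soon as its latest direct upstream task ends; the function name, the graph, and the times (in minutes) are hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9+
from typing import Dict, List


def activated_leaves(edges: Dict[str, List[str]], start: Dict[str, int],
                     end: Dict[str, int], node: str, reduction: int) -> List[str]:
    """Leaf nodes whose start/end times move when `node` finishes `reduction` earlier."""
    parents: Dict[str, List[str]] = {n: [] for n in start}
    for up, downs in edges.items():
        for down in downs:
            parents[down].append(up)

    new_start, new_end = dict(start), dict(end)  # leave the inputs untouched
    new_end[node] -= reduction
    active = {node}

    for n in TopologicalSorter(parents).static_order():
        ups = parents[n]
        if n == node or not ups:
            continue
        latest = max(new_end[p] for p in ups)
        # Influenced only if an active upstream node now determines the latest end.
        if any(p in active and new_end[p] == latest for p in ups):
            duration = end[n] - start[n]
            new_start[n], new_end[n] = latest, latest + duration
            active.add(n)

    return sorted(n for n in active if not edges.get(n))


edges = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "X": ["E"]}
start = {"A": 0, "B": 10, "C": 10, "X": 0, "D": 30, "E": 40}
end = {"A": 10, "B": 30, "C": 20, "X": 40, "D": 40, "E": 50}
print(activated_leaves(edges, start, end, "A", 5))  # ['D']
```

In this example leaf E is not activated when A is shortened, because the unrelated upstream node X still finishes last and keeps E's starting operation time unchanged.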

Optionally, the method further includes:

and taking a preset number of non-leaf nodes as system bottleneck nodes.

In another aspect, the application also discloses a device for determining an upstream node bottleneck degree, comprising:

an upstream node determining module, configured to determine all upstream nodes of a current node;

the optimization time range determining module is used for determining an optimization time range according to the starting operation time of the computing task corresponding to the upstream node and the starting operation time of the computing task corresponding to the current node;

the splitting module is used for splitting the time range which can be optimized into a plurality of time intervals;

and the upstream node bottleneck degree calculating module is used for calculating the bottleneck degree of each upstream node according to the time intervals in which the computing task corresponding to each upstream node runs, the total number of computing tasks running in those time intervals, and the optimizable time range.

Optionally, the optimizable time range determining module includes:

the timed computing task judging unit is used for judging whether a timed computing task exists among the computing tasks corresponding to the upstream nodes;

a first target starting operation time determining unit, configured to, when a timed computing task exists, take the starting operation time of the timed computing task as the target starting operation time;

a second target starting operation time determining unit, configured to, when no timed computing task exists, determine the earliest starting operation time among the computing tasks of the upstream nodes as the target starting operation time;

and the optimizable time range determining unit is used for determining the optimizable time range according to the target starting operation time and the starting operation time of the computing task corresponding to the current node.

Optionally, the apparatus further comprises:

the first sequencing module is used for sequencing all upstream nodes of the current node according to the bottleneck degree;

and the upstream bottleneck node determining module is used for taking a preset number of upstream nodes as the upstream bottleneck nodes of the current node.

Optionally, the apparatus further comprises:

an indirect bottleneck interval determining unit, configured to use a time interval in which a plurality of computation tasks operate simultaneously as an indirect bottleneck interval;

and the optimization difficulty determining unit is used for determining the optimization difficulty of the current node according to the indirect bottleneck interval and the optimizable time range.

In another aspect, the present application further discloses an apparatus for determining a system bottleneck, comprising:

the shortest path calculation module is used for calculating the shortest path from a non-leaf node to each leaf node in the scheduling topological graph, wherein the non-leaf node is a node with a downstream node in the scheduling topological graph;

the leaf node running time length determining module is used for determining the running increased time length of the computing task corresponding to each leaf node according to the shortest path when the non-leaf node increases the running preset time length of the computing task corresponding to each leaf node;

and the first system bottleneck degree determining module is used for determining the system bottleneck degree of the non-leaf nodes according to the increased time length of each leaf node in operation.

Optionally, the shortest path calculating module includes:

a path determining unit, configured to determine all paths from a non-leaf node to a leaf node in the scheduling topology; wherein each path comprises a plurality of nodes;

the path length determining unit is used for determining the path length of each path according to the end operation time and the start operation time of the computing task corresponding to the adjacent nodes in each path;

and the shortest path determining unit is used for determining the path with the shortest path length as the shortest path.

Optionally, the module for determining the operation duration of the leaf node includes:

the first judgment unit is used for judging whether the preset duration by which the running of the computing task corresponding to the current node is increased is greater than the length of the shortest path from the current node to a leaf node;

a first operation duration determining unit, configured to determine, when the preset duration by which the running of the computing task corresponding to the current node is increased is greater than the length of the shortest path from the current node to a leaf node, that the duration by which the running of the computing task corresponding to the leaf node is increased is: the preset duration by which the running of the computing task corresponding to the current node is increased, minus the length of the shortest path;

and the second operation time length determining unit is used for determining that the time length increased by the leaf node corresponding to the operation of the calculation task is zero when the preset time length increased by the operation of the current node corresponding to the calculation task is less than or equal to the length of the shortest path from the current node to the leaf node.

Optionally, the apparatus further comprises:

the second sequencing module is used for sequencing all the non-leaf nodes in the scheduling topological graph according to the system bottleneck degree;

and the first system bottleneck determining module is used for taking the preset number of non-leaf nodes as the system bottleneck nodes.

On the other hand, the application also discloses a device for determining the bottleneck degree of the system, which comprises:

a non-leaf node determining module for determining a non-leaf node;

the module for calculating the number of activated leaf nodes is used for calculating the number of activated leaf nodes in a scheduling topological graph when the operation of the non-leaf nodes corresponding to the calculation tasks is reduced by a preset time length, wherein the non-leaf nodes are nodes with downstream nodes in the scheduling topological graph, and the activated leaf nodes are leaf nodes with changed starting operation time and ending operation time corresponding to the calculation tasks;

and the system bottleneck degree calculating module is used for determining the system bottleneck degree of the non-leaf node according to the number of the activated leaf nodes and the preset duration.

Optionally, the module for calculating the number of activated leaf nodes includes:

the first recalculation unit is used for recalculating the running finishing time of the calculation task when the running of a non-leaf node corresponding to the calculation task in the scheduling topological graph is reduced by a preset time length;

a setting unit, configured to set the non-leaf node as an active node;

an influence judging unit, configured to judge whether an ending operation time of the computation task corresponding to the active node influences a starting operation time of the computation task corresponding to a direct downstream node of the active node, where the direct downstream node is a downstream node directly connected to the active node in the scheduling topological graph;

an active node determining unit, configured to determine, if the influence is caused, that the direct downstream node is an active node, and recalculate a start operation time and an end operation time of the active node corresponding to the calculation task;

a leaf node determining unit, configured to determine whether the activated node is a leaf node;

an activated leaf node determining unit, configured to determine that an activation node is an activated leaf node when the activation node is a leaf node.

Optionally, the influence determining unit includes:

a direct downstream and direct upstream determining subunit, configured to determine, in a scheduling topology, a direct downstream node of an active node and a direct upstream node of the direct downstream node;

the latest ending running time determining subunit is used for determining the latest ending running time of the corresponding computing task in all direct upstream nodes of a direct downstream node;

the same judgment subunit is used for judging whether the latest ending running time is the same as the recalculation ending running time of the activation node;

the influence determining unit is used for determining that the active node influences the starting operation time of the computing task corresponding to the direct downstream node when the latest ending operation time is the same as the recalculated ending operation time of the active node;

and the unaffected determining unit is used for determining that the activated node does not affect the starting running time of the corresponding calculation task of the direct downstream node when the latest finishing running time is different from the recalculating finishing running time of the activated node.

Optionally, the apparatus further comprises:

the third sequencing module is used for sequencing all the non-leaf nodes in the scheduling topological graph according to the bottleneck degree of the system;

and the second system bottleneck determining module is used for taking the preset number of non-leaf nodes as the system bottleneck nodes.

As can be seen from the above technical solutions, in the embodiment of the present application, first, all upstream nodes of a current node are determined in a scheduling topological graph; then, determining an optimizable time range according to the starting operation time of the computing task corresponding to the upstream node and the starting operation time of the computing task corresponding to the current node; then, dividing the optimizable time range into a plurality of time intervals, determining the time interval in which each upstream node corresponds to the calculation task to run and the total number of the calculation tasks to run in the time intervals; and finally, calculating the bottleneck degree of each upstream node according to the time interval of the corresponding calculation task of each upstream node, the total number of the calculation tasks operated in the time interval and the optimized time range. In the embodiment of the application, the larger the bottleneck degree of the upstream node is, the larger the restriction degree of the upstream node on the starting execution time of the corresponding calculation task of the current node is, so that by adopting the method and the device disclosed by the application, the restriction degree of the upstream node on the starting execution time of the corresponding calculation task of the current node can be calculated, and the output time of the corresponding calculation task of the current node can be optimized conveniently.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly described below. Those skilled in the art can obtain other drawings from these drawings without inventive effort.

FIG. 1 is a schematic diagram of a scheduling topology disclosed in an embodiment of the present application;

FIG. 2 is another schematic diagram of a scheduling topology disclosed in an embodiment of the present application;

fig. 3 is a schematic diagram illustrating a method for determining a bottleneck degree of an upstream node according to an embodiment of the present disclosure;

fig. 4 is another schematic diagram of a method for determining a bottleneck degree of an upstream node according to an embodiment of the present disclosure;

fig. 5 is another schematic diagram of a method for determining a bottleneck degree of an upstream node according to an embodiment of the present disclosure;

FIG. 6 is a diagram illustrating a method for determining system bottleneck in accordance with an embodiment of the present disclosure;

FIG. 7 is another schematic diagram of a method for determining system bottleneck disclosed in the embodiments of the present application;

FIG. 8 is a further schematic diagram of a method for determining system bottleneck disclosed in the embodiments of the present application;

fig. 9 is a schematic structural diagram of an apparatus for determining an upstream node bottleneck disclosed in an embodiment of the present application;

fig. 10 is another schematic diagram of a method for determining an upstream node bottleneck degree according to an embodiment of the present disclosure.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

The application discloses a method for determining upstream node bottleneck degree, which can be used for calculating the upstream node bottleneck degree of all non-root nodes in a scheduling topological graph, wherein the non-root nodes refer to nodes (such as C, D, E, F, G, H, I, J nodes in fig. 2) with upstream nodes in the scheduling topological graph. The present application will take node I shown in fig. 2 as an example to describe the process of the present application in detail, and the method for calculating the bottleneck of the upstream node disclosed in the present application, as shown in fig. 3, may at least include the following steps:

Step S31: determining all upstream nodes of a current node in a scheduling topological graph;

In the embodiment of the present application, all upstream nodes on which the current node I depends are first determined in the scheduling topology shown in fig. 3; they are found to include node A, node B, node C, node D, node E, node F and node G.

Step S32: determining an optimizable time range according to the starting operation time of the computing task corresponding to the upstream node and the starting operation time of the computing task corresponding to the current node;

In the embodiment of the present application, the specific implementation process of step S32 may be as follows:

a: judging whether a timed computing task exists among the computing tasks corresponding to the upstream nodes;

In the embodiment of the present application, among the upstream nodes A, B, C, D, E, F and G of node I, the computing task corresponding to node B is a timed computing task, and it is scheduled to start at 5:00.

b: if a timed computing task exists, taking the starting operation time of the timed computing task as the target starting operation time;

In the embodiment of the present application, no matter how the computing task corresponding to the current node is optimized, its starting operation time is always later than the starting operation time of the timed computing task of the upstream node, so the starting operation time of the timed computing task needs to be used as the target starting operation time. Here, the starting operation time (5:00) of the timed computing task corresponding to node B is used as the target starting operation time.

c: if no timed computing task exists, determining the earliest starting operation time among the computing tasks of the upstream nodes and taking it as the target starting operation time;

d: determining the optimizable time range according to the target starting operation time and the starting operation time of the computing task corresponding to the current node.

In this embodiment, the time difference between the target starting operation time and the starting operation time of the computing task corresponding to the current node may be used as the optimizable time range. As can be seen from fig. 3, the starting operation time of node I is 6:00, so the time difference between 6:00 and the target starting operation time 5:00, namely 1 hour (60 minutes), is taken as the optimizable time range.
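Sub-steps a to d can be sketched in Python as follows. The starting operation times are those of the example (node B timed at 5:00, node E starting at 5:20, node I starting at 6:00); the calendar date, the function names, and the dictionary layout are placeholders chosen for illustration.

```python
from datetime import datetime, timedelta
from typing import Dict, Optional


def target_start(upstream_starts: Dict[str, datetime],
                 timed_node: Optional[str]) -> datetime:
    """Sub-steps a to c: use the timed task's start if one exists, else the earliest start."""
    if timed_node is not None:
        return upstream_starts[timed_node]
    return min(upstream_starts.values())


def optimizable_range(upstream_starts: Dict[str, datetime],
                      timed_node: Optional[str],
                      current_start: datetime) -> timedelta:
    """Sub-step d: difference between the current node's start and the target start."""
    return current_start - target_start(upstream_starts, timed_node)


day = datetime(2016, 3, 31)  # arbitrary date; only the clock times matter
starts = {"B": day.replace(hour=5), "E": day.replace(hour=5, minute=20)}
print(optimizable_range(starts, "B", day.replace(hour=6)))  # 1:00:00
```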

Step S33: splitting the optimizable time range into a plurality of time intervals;

In the embodiment of the application, a user may set the interval length, and the optimizable time range is then split into a plurality of time intervals accordingly. In this example, the optimizable time range (60 minutes) may be split into 6 time intervals: 5:00-5:10, 5:10-5:20, 5:20-5:30, 5:30-5:40, 5:40-5:50 and 5:50-6:00.
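A minimal sketch of this split, assuming a user-chosen step of 10 minutes as in the example; the function name `split_range` and the calendar date are placeholders.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def split_range(start: datetime, end: datetime,
                step: timedelta) -> List[Tuple[datetime, datetime]]:
    """Split [start, end) into consecutive intervals of length `step` (the last may be shorter)."""
    intervals = []
    cursor = start
    while cursor < end:
        nxt = min(cursor + step, end)
        intervals.append((cursor, nxt))
        cursor = nxt
    return intervals


day = datetime(2016, 3, 31)  # arbitrary date; only the clock times matter
parts = split_range(day.replace(hour=5), day.replace(hour=6), timedelta(minutes=10))
for lo, hi in parts:
    print(f"{lo:%H:%M}-{hi:%H:%M}")
```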

Step S34: determining the time intervals in which the computing task corresponding to each upstream node runs, and the total number of computing tasks running in each time interval;

In the embodiment of the present application, as can be seen from fig. 3, the computing task corresponding to upstream node A does not run within the optimizable time range, that is, it runs in no time interval. The computing task corresponding to upstream node B runs in the time intervals 5:00-5:10 and 5:10-5:20; that of upstream node C runs in 5:00-5:10 and 5:10-5:20; that of upstream node D runs in 5:00-5:10, 5:10-5:20 and 5:20-5:30; that of upstream node E runs in 5:20-5:30 and 5:30-5:40; that of upstream node F runs in 5:20-5:30, 5:30-5:40, 5:40-5:50 and 5:50-6:00; and that of upstream node G runs in 5:30-5:40 and 5:40-5:50.

From the above analysis, 3 computing tasks (those of nodes B, C and D) run in the time interval 5:00-5:10; 3 computing tasks (nodes B, C and D) run in 5:10-5:20; 3 computing tasks (nodes D, E and F) run in 5:20-5:30; 3 computing tasks (nodes E, F and G) run in 5:30-5:40; 2 computing tasks (nodes F and G) run in 5:40-5:50; and only 1 computing task (node F) runs in 5:50-6:00.
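The per-interval counts can be reproduced with a short overlap test. In this sketch the run windows are written as minute offsets from 5:00 and are read off the intervals listed above; the function name and the use of `None` for node A, whose task does not run inside the optimizable time range, are illustrative choices.

```python
from typing import Dict, List, Optional, Tuple

Window = Optional[Tuple[int, int]]  # (start, end) in minutes after 5:00, or None

# Run windows of node I's upstream tasks inside the optimizable time range.
runs: Dict[str, Window] = {"A": None, "B": (0, 20), "C": (0, 20), "D": (0, 30),
                           "E": (20, 40), "F": (20, 60), "G": (30, 50)}
intervals = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]


def tasks_per_interval(runs: Dict[str, Window],
                       intervals: List[Tuple[int, int]]) -> List[int]:
    """Count, for each interval, the tasks whose run window overlaps it."""
    counts = []
    for lo, hi in intervals:
        counts.append(sum(1 for w in runs.values()
                          if w is not None and w[0] < hi and w[1] > lo))
    return counts


print(tasks_per_interval(runs, intervals))  # [3, 3, 3, 3, 2, 1]
```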

Step S35: calculating the bottleneck degree of each upstream node according to the time intervals in which the computing task corresponding to each upstream node runs, the total number of computing tasks running in each of those time intervals, and the optimizable time range.

In the embodiment of the present application, assuming that the computing task corresponding to an upstream node runs in time interval 1, time interval 2, ..., time interval i, ..., time interval n, the bottleneck degree of that upstream node can be calculated by using the following upstream node bottleneck degree calculation formula:

$$\text{bottleneck degree} = \frac{\displaystyle\sum_{i=1}^{n} \frac{\text{length of time interval } i}{\text{total number of computing tasks running in time interval } i}}{\text{length of the optimizable time range}}$$

where i and n are positive integers, and i is less than or equal to n.

In the embodiment of the present application, still following the above example, it can be seen from the above discussion that the computing task corresponding to the node A runs in no time interval, and therefore the bottleneck degree of the upstream node A is 0. The computing task corresponding to the node B runs in the time intervals 5:00-5:10 and 5:10-5:20, and the total number of computing tasks running in each of these two time intervals is 3, so the bottleneck degree of the upstream node B is [(5:00-5:10)/3 + (5:10-5:20)/3]/[5:00-6:00] = [10 minutes/3 + 10 minutes/3]/60 minutes = 11.1%; similarly, by adopting the above method, the bottleneck degree of the upstream node C is 11.1%, the bottleneck degree of the upstream node D is 16.7%, the bottleneck degree of the upstream node E is 11.1%, the bottleneck degree of the upstream node F is 36.1%, and the bottleneck degree of the upstream node G is 13.9%.
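The calculation in this example can be reproduced with the short sketch below. It assumes the six 10-minute intervals are indexed 0 (5:00-5:10) to 5 (5:50-6:00); the names `RUNS` and `bottleneck_degrees` are hypothetical, and exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction

# Indices of the intervals in which each upstream task runs
# (0 = 5:00-5:10, 1 = 5:10-5:20, ..., 5 = 5:50-6:00), taken from the example.
RUNS = {
    "A": [],
    "B": [0, 1],
    "C": [0, 1],
    "D": [0, 1, 2],
    "E": [2, 3],
    "F": [2, 3, 4, 5],
    "G": [3, 4],
}
INTERVAL_MINUTES = 10
RANGE_MINUTES = 60


def bottleneck_degrees(runs, interval_minutes, range_minutes):
    """Share each interval's length among the tasks running in it, then
    divide each node's total share by the optimizable time range."""
    counts = {}
    for intervals in runs.values():
        for i in intervals:
            counts[i] = counts.get(i, 0) + 1
    return {
        node: sum(
            (Fraction(interval_minutes, counts[i]) for i in intervals),
            Fraction(0),
        ) / range_minutes
        for node, intervals in runs.items()
    }


degrees = bottleneck_degrees(RUNS, INTERVAL_MINUTES, RANGE_MINUTES)
for node, value in degrees.items():
    print(f"{node}: {float(value):.1%}")
```

In this example the degrees add up to 100%, because every interval of the optimizable time range is occupied by at least one upstream task and its length is split among the tasks running in it.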

As can be seen from the above technical solutions, in the embodiment of the present application, first, all upstream nodes of a current node are determined in a scheduling topological graph; then, an optimizable time range is determined according to the start running time of the computing task corresponding to the upstream node and the start running time of the computing task corresponding to the current node; then, the optimizable time range is split into a plurality of time intervals, and the time intervals in which the computing task corresponding to each upstream node runs and the total number of computing tasks running in each time interval are determined; and finally, the bottleneck degree of each upstream node is calculated according to the time intervals in which the computing task corresponding to each upstream node runs, the total number of computing tasks running in those time intervals, and the optimizable time range. In the embodiment of the application, the larger the bottleneck degree of an upstream node is, the more that upstream node restricts the start running time of the computing task corresponding to the current node; therefore, by adopting the method disclosed by the application, the degree to which each upstream node restricts the start running time of the computing task corresponding to the current node can be calculated, which facilitates optimizing the output time of the computing task corresponding to the current node.

In another possible embodiment of the present application, as shown in fig. 4, the method in all the above embodiments further includes:

step S41: sequencing all upstream nodes of the current node according to the bottleneck degree;

In the embodiment of the application, all the upstream nodes of the current node can be sorted in descending order of bottleneck degree, or in ascending order. Still following the above example, sorting all the upstream nodes A, B, C, D, E, F and G of the current node I in descending order of bottleneck degree gives the upstream nodes F, D, G, B, C, E, A.

Step S42: taking a preset number of upstream nodes as the upstream bottleneck nodes of the current node.

In the embodiment of the present application, a preset number of upstream nodes with a large bottleneck degree can be specifically used as the upstream bottleneck nodes of the current node. Following the above example, the node F and the node D with the larger bottleneck degree can be used as the upstream bottleneck node of the current node I.
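Steps S41 and S42 amount to a sort followed by a cut-off, as in the sketch below. The function name `upstream_bottleneck_nodes` and the preset number 2 are assumptions for illustration; the percentages are the values from the example.

```python
# Bottleneck degrees (in percent) of the upstream nodes from the example.
DEGREES = {"A": 0.0, "B": 11.1, "C": 11.1, "D": 16.7, "E": 11.1, "F": 36.1, "G": 13.9}


def rank_by_bottleneck(degrees):
    """Step S41: sort nodes in descending order of bottleneck degree.
    Python's sort is stable, so nodes with equal degree keep their input order."""
    return sorted(degrees, key=degrees.get, reverse=True)


def upstream_bottleneck_nodes(degrees, preset_number):
    """Step S42: keep the preset number of nodes with the largest degree."""
    return rank_by_bottleneck(degrees)[:preset_number]


ranked = rank_by_bottleneck(DEGREES)
top = upstream_bottleneck_nodes(DEGREES, 2)
print(ranked, top)
```

How ties (here B, C and E) are ordered is not specified in the text; the sketch simply keeps their original order.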

As can be seen from the above, in the embodiment of the present application, an upstream bottleneck node of the current node may be determined.

In another possible embodiment of the present application, as shown in fig. 5, the method in all the above embodiments may further include:

step S51: taking a time interval in which a plurality of computing tasks run simultaneously as an indirect bottleneck interval;

In the embodiment of the present application, still following the above example, the time intervals in which a plurality of computing tasks run simultaneously may be determined as 5:00-5:10, 5:10-5:20, 5:20-5:30, 5:30-5:40 and 5:40-5:50; the time interval 5:50-6:00, in which only the computing task corresponding to the node F runs, is not an indirect bottleneck interval;

step S52: and calculating the optimization difficulty of the current node according to the indirect bottleneck interval and the optimizable time range.

In the embodiment of the application, a time interval in which only one upstream node runs is easier to optimize, because the start running time of the current node can be optimized by optimizing that upstream node alone; in a time interval in which a plurality of upstream nodes run simultaneously, namely an indirect bottleneck interval, the plurality of upstream nodes must be optimized simultaneously before the start running time of the current node can be optimized. Therefore, the optimization difficulty of the current node can be represented by the indirect bottleneck intervals.

In this embodiment of the present application, when the indirect bottleneck intervals are indirect bottleneck interval 1, indirect bottleneck interval 2, ..., indirect bottleneck interval p, ..., indirect bottleneck interval m, the optimization difficulty of the current node may be calculated by using an optimization difficulty calculation formula, where the optimization difficulty calculation formula specifically is:

$$\text{optimization difficulty} = \frac{\displaystyle\sum_{p=1}^{m} \text{length of indirect bottleneck interval } p}{\text{length of the optimizable time range}}$$

where both p and m are positive integers, and p is less than or equal to m;

In the embodiment of the present application, the above formula gives the optimization difficulty of the current node I as [(5:00-5:10) + (5:10-5:20) + (5:20-5:30) + (5:30-5:40) + (5:40-5:50)]/[5:00-6:00] = [10 minutes + 10 minutes + 10 minutes + 10 minutes + 10 minutes]/60 minutes = 83.3%.
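The 83.3% figure can be checked with the sketch below, which assumes 10-minute intervals and takes the per-interval task counts from the example; the names `COUNTS` and `optimization_difficulty` are hypothetical. Only intervals with more than one running task are treated as indirect bottleneck intervals, so 5:50-6:00 (one task) is excluded.

```python
from fractions import Fraction

# Number of upstream tasks running in each 10-minute interval (from the example).
COUNTS = {
    "5:00-5:10": 3,
    "5:10-5:20": 3,
    "5:20-5:30": 3,
    "5:30-5:40": 3,
    "5:40-5:50": 2,
    "5:50-6:00": 1,
}


def optimization_difficulty(counts, interval_minutes, range_minutes):
    """Return the indirect bottleneck intervals and their share of the range."""
    indirect = [label for label, n in counts.items() if n > 1]
    return indirect, Fraction(len(indirect) * interval_minutes, range_minutes)


indirect, difficulty = optimization_difficulty(COUNTS, 10, 60)
print(indirect, f"{float(difficulty):.1%}")
```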

As can be seen from the above, in the embodiment of the present application, the optimization difficulty of the current node can be calculated.

The application also discloses a method for determining the system bottleneck degree, which can be used for calculating the system bottleneck degree of all non-leaf nodes in the scheduling topological graph, where a non-leaf node is a node having a downstream node in the scheduling topological graph (such as nodes A, B, C, D, E, F and G in fig. 2). The process is described in detail below by taking as an example the case where the computing task corresponding to the non-leaf node B shown in fig. 2 runs 1 minute longer; the method for determining the system bottleneck degree disclosed in the present application, as shown in fig. 6, includes at least the following steps:

step S61: calculating the shortest path from a non-leaf node to each leaf node in the scheduling topological graph;

in the embodiment of the present application, taking calculating the shortest path from the non-leaf node B to the leaf node I in fig. 3 as an example, a specific implementation process of step S61 is described in detail, and a specific implementation process of step S61 may be as follows:

a: determining all paths from a non-leaf node to a leaf node in a scheduling topological graph; wherein each path comprises a plurality of nodes;

in the embodiment of the present application, first, all paths from node B to node I are determined, and in the scheduling topology shown in fig. 3, the paths from node B to node I may be found, including: B-G-I path, B-I path, and B-E-I path;

b: determining the path length of each path according to the end operation time and the start operation time of the calculation task corresponding to the adjacent nodes of each path;

in the embodiment of the present application, a time difference between the end operation time and the start operation time of the corresponding calculation task of the adjacent node in each path may be specifically used as the sub-path length; and then adding all the sub-path lengths in the current path to obtain the path length of the current path.

In the embodiment of the present application, the process is described in detail by taking the calculation of the path length of the path B-G-I as an example: first, the time difference between the end running time (5:20) of the node B and the start running time (5:30) of the node G is calculated (10 minutes), and this time difference of 10 is taken as one sub-path length; then, the time difference between the end running time (5:50) of the node G and the start running time (6:00) of the node I is calculated (also 10 minutes), and this time difference of 10 is taken as another sub-path length; finally, all the sub-path lengths in the current path are added (10 + 10 = 20) to obtain the path length of the path B-G-I; similarly, the path length of the path B-I is calculated to be 40, and the path length of the path B-E-I is calculated to be 20.

c: determining the path with the shortest path length as the shortest path.

Since the path lengths of both the paths B-G-I and B-E-I are 20 in the above embodiment, it can be determined that the shortest path from the current node B to the node I is either B-G-I or B-E-I.
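Steps a to c can be sketched as below, using the running times quoted in the example (minutes after 5:00). The names `TIMES`, `PATHS` and `path_length` are hypothetical, and the end running time of node I is left unset because the example does not state it and the calculation does not need it.

```python
# (start, end) running times in minutes after 5:00, from the example.
# Node I's end running time is not given and is not used.
TIMES = {"B": (0, 20), "E": (20, 40), "G": (30, 50), "I": (60, None)}
PATHS = [["B", "G", "I"], ["B", "I"], ["B", "E", "I"]]


def path_length(path, times):
    """Sum, over adjacent nodes, the gap between the upstream node's end
    running time and the downstream node's start running time."""
    return sum(times[nxt][0] - times[cur][1] for cur, nxt in zip(path, path[1:]))


lengths = {"-".join(p): path_length(p, TIMES) for p in PATHS}
shortest = min(lengths.values())
shortest_paths = [name for name, length in lengths.items() if length == shortest]
print(lengths, shortest_paths)
```

Here the path enumeration is given by hand; in a real scheduling topological graph the paths would first have to be enumerated, for example by a depth-first traversal.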

Step S62: when the calculation task corresponding to the non-leaf node increases the operation preset time length, determining the time length increased by the operation of the calculation task corresponding to each leaf node according to the shortest path;

In the embodiment of the present application, step S62 is described by taking as an example the determination of the time length added to the running of the computing task corresponding to the leaf node I when the computing task corresponding to the node B runs 1 minute longer; the specific implementation process of step S62 is as follows:

a: judging whether the preset duration added to the running of the computing task corresponding to the current node is greater than the length of the shortest path from the current node to the leaf node;

b: if so, determining that the time length added to the running of the computing task corresponding to the leaf node is: the preset duration added to the running of the computing task corresponding to the current node minus the shortest path length;

c: if not, that is, if the preset duration is less than or equal to the shortest path length, determining that the time length added to the running of the computing task corresponding to the leaf node is zero.

In the embodiment of the present application, still continuing with the above example, since the running of the computing task corresponding to the node B is increased by 1 minute, which is less than or equal to the shortest path length 20 from the node B to the node I, step c is executed; it can thus be determined that the time length added to the running of the computing task corresponding to the node I is zero; similarly, the time length added to the running of the computing task corresponding to the node H is 0 minutes, and the time length added to the running of the computing task corresponding to the node J is 1 minute;

In this embodiment, the path length may represent the buffer time from the current node to the leaf node; therefore, when the preset duration added to the running of the computing task corresponding to the current node does not exceed the buffer time, it may be determined that the current node has no influence on the leaf node. In practical applications, the computing tasks corresponding to the leaf nodes are generally specific applications whose output time matters more; therefore, in the embodiment of the application, the influence of each non-leaf node on the leaf nodes is used as the scale for measuring the influence of that non-leaf node on the whole scheduling system.

Step S63: and determining the system bottleneck degree of the non-leaf nodes according to the increased time length of each leaf node in operation.

In this embodiment of the present application, the duration increased by the operation of each leaf node may be specifically added to serve as the system bottleneck of the non-leaf node.

In the embodiment of the present application, the time lengths added to the running of the leaf nodes H, I and J in the scheduling topology shown in fig. 3 may be added (0 + 0 + 1 = 1), and the sum is taken as the system bottleneck degree of the node B.
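Steps S62 and S63 can be sketched together as follows. The function names are hypothetical. Only the shortest path length from node B to node I (20) is stated in the example; the values used for H and J are assumptions chosen so that the delays match the ones stated (0 and 1 minute).

```python
def leaf_delay(extra_minutes, shortest_path_minutes):
    """Steps a-c of S62: the leaf is delayed only by the part of the extra
    running time that exceeds the buffer on the shortest path."""
    return max(0, extra_minutes - shortest_path_minutes)


def system_bottleneck(extra_minutes, shortest_paths):
    """Step S63: sum the delays over all leaf nodes."""
    return sum(leaf_delay(extra_minutes, length) for length in shortest_paths.values())


# Shortest path lengths (minutes) from node B to each leaf node.
# I = 20 comes from the example; H = 10 and J = 0 are assumed values.
SHORTEST_FROM_B = {"H": 10, "I": 20, "J": 0}

delays = {leaf: leaf_delay(1, length) for leaf, length in SHORTEST_FROM_B.items()}
bottleneck_b = system_bottleneck(1, SHORTEST_FROM_B)
print(delays, bottleneck_b)
```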

Therefore, by adopting the method, the system bottleneck degree of each non-leaf node in the scheduling topological graph can be calculated.

In another possible embodiment of the present application, the method in all the above embodiments may further include:

and taking a preset number of non-leaf nodes as system bottleneck nodes.

In the embodiment of the application, a preset number of non-leaf nodes with the largest system bottleneck degree can be specifically used as the system bottleneck nodes.

By adopting the method in the foregoing embodiment, the system bottleneck degrees of all non-leaf nodes (nodes A, B, C, D, E, F and G) in the scheduling topological graph can be calculated; in the embodiment of the present application, the system bottleneck degrees of all non-leaf nodes in the scheduling topological graph can be sorted, and the non-leaf nodes with the larger system bottleneck degree are used as the system bottleneck nodes.

The application also discloses another method for determining the system bottleneck degree, which can be used for calculating the system bottleneck degrees of all non-leaf nodes in the scheduling topological graph. The process is described in detail below by taking as an example the case where the running of the computing task corresponding to the non-leaf node B shown in fig. 2 is reduced by 1 minute; the method for determining the system bottleneck degree disclosed in the present application, as shown in fig. 7, includes at least the following steps:

step S71: determining a non-leaf node in a scheduling topological graph;

Step S72: when the running of the computing task corresponding to the non-leaf node is reduced by a preset duration, calculating the number of activated leaf nodes in the scheduling topological graph, where the non-leaf node is a node having a downstream node in the scheduling topological graph, and an activated leaf node is a leaf node for which the start running time and the end running time of the corresponding computing task are changed;

in the embodiment of the present application, a specific implementation process of step S72 may be as shown in fig. 8:

Step S71-1: when the running of the computing task corresponding to a non-leaf node in the scheduling topological graph is reduced by the preset duration, recalculating the end running time of the computing task;

In this embodiment of the present application, first, in the scheduling topology shown in fig. 3, it is determined that the start running time and the end running time of the computing task corresponding to the node B are 5:00 and 5:20, respectively; when the running of the computing task corresponding to the node B is reduced by 1 minute, the end running time of the computing task corresponding to the node B is determined to be 5:19.

step S71-2: setting the non-leaf node as an active node;

in the embodiment of the application, a node B is set as an active node;

Step S71-3: judging whether the end running time of the computing task corresponding to the active node influences the start running time of the computing task corresponding to a direct downstream node of the active node, where a direct downstream node is a downstream node directly connected with the active node in the scheduling topological graph; if so, executing step S71-4; if not, executing step S71-7;

in the embodiment of the present application, the specific implementation process of step S71-3 may be as follows:

a: determining, in the scheduling topological graph, the direct downstream nodes of the active node and the direct upstream nodes of each direct downstream node;

in the embodiment of the present application, first, in the scheduling topology shown in fig. 3, it is determined that nodes directly downstream of the active node B are nodes G, I and E; then, in the scheduling topology, the direct upstream nodes of the direct downstream node G, which are nodes B and D, the direct upstream node of the direct downstream node I, which are nodes G, B and E, and the direct upstream node of the direct downstream node E, which is node B, are determined.

b: determining the latest end running time of the corresponding computing tasks among all direct upstream nodes of a direct downstream node;

In the embodiment of the present application, the direct downstream node G is taken as an example to describe the process in detail. First, in the scheduling topology shown in fig. 3, the end running times of the computing tasks corresponding to all direct upstream nodes (nodes B and D) of the node G are determined; it can be found that the end running time of the computing task corresponding to the node B has become 5:19, and the end running time of the computing task corresponding to the node D is 5:30; the latest end running time can therefore be determined to be 5:30.

c: judging whether the latest end running time is the same as the recalculated end running time of the active node; if the two are the same, executing d, otherwise executing e;

In the embodiment of the present application, it can be found that the latest end running time 5:30 does not coincide with the recalculated end running time 5:19 of the active node B, and therefore it can be determined that the node B does not affect the start running time of the computing task corresponding to the direct downstream node G. Similarly, by using the above method, it can be determined that the direct downstream node E of the node B is affected, and the node I is not affected. In the embodiment of the application, a node can start to run its corresponding computing task only when all of its upstream nodes have finished running their corresponding computing tasks; therefore, when the latest end running time among the direct upstream nodes of a node differs from the recalculated end running time of the upstream active node, it can be determined that the reduced running time of the active node has no influence on that node.

d: determining that the active node affects the start running time of the computing task corresponding to the direct downstream node, and executing step S71-4;

e: determining that the active node does not affect the start running time of the computing task corresponding to the direct downstream node, and executing step S71-7.
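The check in steps a to e can be sketched as a single comparison. This is a minimal illustration using the end running times from the example in minutes after 5:00, with node B already reduced to 5:19; the names `END`, `DIRECT_UPSTREAM` and `affects_downstream` are hypothetical.

```python
def affects_downstream(active_end, upstream_ends):
    """Return True if the active node's recalculated end running time is the
    latest end running time among the direct upstream nodes of a downstream node."""
    return max(upstream_ends.values()) == active_end


# End running times in minutes after 5:00; node B already reduced from 5:20 to 5:19.
END = {"B": 19, "D": 30, "E": 40, "G": 50}
# Direct upstream nodes of each direct downstream node of B, as listed in the example.
DIRECT_UPSTREAM = {"G": ["B", "D"], "I": ["G", "B", "E"], "E": ["B"]}

affected = {
    node: affects_downstream(END["B"], {u: END[u] for u in ups})
    for node, ups in DIRECT_UPSTREAM.items()
}
print(affected)
```

Only node E, whose sole direct upstream node is B, is affected; for G and I another upstream node finishes later than B and still determines when they can start.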

Step S71-4: determining a direct downstream node as an activation node, and recalculating the starting operation time and the ending operation time of the corresponding calculation task of the activation node;

In the embodiment of the present application, still following the above example, it may be determined that the node E is an active node. As can be seen from the scheduling topology shown in fig. 3, when the end running time of the node B becomes 5:19, the start running time of the node E becomes 5:19 and its end running time becomes 5:39.

Step S71-5: judging whether the activated node is a leaf node; if yes, go to step S71-6; if not, returning to execute the step S71-3;

In the embodiment of the present application, since the node E is not a leaf node, the process returns to step S71-3; with the method described above, the node J may be determined to be an active node, and since the node J is also a leaf node, step S71-6 is executed, in which the node J is determined to be an activated leaf node.

Step S71-6: determining the activated node as an activated leaf node, and executing step S72;

step S71-7: and determining that the active leaf node does not exist in the scheduling topological graph, and executing the step S72.

Step S73: and determining the system bottleneck degree of the non-leaf nodes according to the number of the activated leaf nodes and the preset duration.

In this embodiment of the present application, the system bottleneck degree of the non-leaf node may be calculated specifically according to a system bottleneck degree calculation formula, where the system bottleneck degree calculation formula is: system bottleneck degree = number of activated leaf nodes × preset duration.

In the embodiment of the present application, still following the above example, it can be seen that when the running of the computing task corresponding to the node B is reduced by 1 minute, the number of activated leaf nodes in the scheduling topology shown in fig. 3 is 1 (only the node J), and therefore the system bottleneck degree of the node B is 1 × 1 = 1.
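The propagation in steps S71-1 to S71-7 and the formula of step S73 can be sketched as a worklist traversal. This is only an illustration on a hypothetical sub-graph: the edges B→G, B→I, B→E, D→G, G→I and E→I follow the example, while the edge E→J and the running times of the leaf nodes I and J are assumptions, and all function names are invented for the sketch.

```python
from collections import defaultdict

# Direct downstream nodes. The edge E -> J is an assumption inferred from the
# example, in which J becomes active after E.
DOWNSTREAM = {
    "B": ["G", "I", "E"],
    "D": ["G"],
    "E": ["I", "J"],
    "G": ["I"],
    "I": [],
    "J": [],
}
# (start, end) running times in minutes after 5:00.
TIMES = {
    "B": (0, 20), "D": (0, 30), "E": (20, 40), "G": (30, 50),
    "I": (60, 70),  # end running time assumed
    "J": (40, 50),  # start and end running times assumed
}


def count_activated_leaves(node, reduce_by, downstream, times):
    """Shorten `node` by `reduce_by` minutes and count the leaf nodes whose
    start running time changes as a result."""
    times = dict(times)  # work on a copy so the input is not modified
    upstream = defaultdict(list)
    for parent, children in downstream.items():
        for child in children:
            upstream[child].append(parent)

    start, end = times[node]
    times[node] = (start, end - reduce_by)  # S71-1: recalculate end running time
    activated = set()
    worklist = [node]  # S71-2: the non-leaf node is the first active node
    while worklist:
        active = worklist.pop()
        active_end = times[active][1]
        for child in downstream[active]:
            # S71-3: the child is affected only if the active node is its
            # latest-finishing direct upstream node.
            latest_end = max(times[u][1] for u in upstream[child])
            if latest_end != active_end:
                continue
            # S71-4: shift the child, keeping its running duration.
            child_start, child_end = times[child]
            times[child] = (active_end, active_end + child_end - child_start)
            if downstream[child]:
                worklist.append(child)  # S71-5: not a leaf, keep propagating
            else:
                activated.add(child)    # S71-6: activated leaf node
    return len(activated)


def system_bottleneck(node, reduce_by, downstream, times):
    """S73: number of activated leaf nodes multiplied by the preset duration."""
    return count_activated_leaves(node, reduce_by, downstream, times) * reduce_by


bottleneck_b = system_bottleneck("B", 1, DOWNSTREAM, TIMES)
print(bottleneck_b)
```

With these assumed inputs, shortening node B by 1 minute activates only the leaf node J, matching the result 1 × 1 = 1 in the example.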

Therefore, by adopting the method, the system bottleneck degree of each non-leaf node in the scheduling topological graph can be determined.

In another possible embodiment of the present application, the method may further include: taking a preset number of non-leaf nodes as system bottleneck nodes.

The present application also discloses a method for determining an upstream node bottleneck degree, as shown in fig. 10, the method at least includes the following steps:

step S101: determining all upstream nodes of the current node;

step S102: determining an optimizable time range according to the starting operation time of the computing task corresponding to the upstream node and the starting operation time of the computing task corresponding to the current node;

step S103: splitting the optimizable time range into a plurality of time intervals;

Step S104: calculating the bottleneck degree of each upstream node according to the time intervals in which the computing task corresponding to each upstream node runs, the total number of computing tasks running in those time intervals, and the optimizable time range.

Through the above description of the method embodiments, those skilled in the art can clearly understand that the present application can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation manner in many cases. Based on such understanding, the technical solutions of the present application may be essentially or partially implemented in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: various media that can store program codes, such as Read Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and so on.

Corresponding to the embodiment of the method for determining the bottleneck degree of the upstream node provided by the present application, the present application further provides a device for determining the bottleneck degree of the upstream node, as shown in fig. 9, the device at least includes:

an upstream node determining module 91, configured to determine all upstream nodes of the current node;

an optimizable time range determining module 92, configured to determine an optimizable time range according to a time when the upstream node starts to operate corresponding to the computing task and a time when the current node starts to operate corresponding to the computing task;

a splitting module 93, configured to split the optimizable time range into a plurality of time intervals;

and an upstream node bottleneck degree calculating module 94, configured to calculate a bottleneck degree of each upstream node according to the time interval in which the corresponding calculation task runs, the total number of calculation tasks run in the time interval, and the optimizable time range.

As can be seen from the above, in the present embodiment, the bottleneck of each upstream node may be determined.

In another possible embodiment of the present application, the determining an optimizable time range module in all the above embodiments includes:

In another possible embodiment of the present application, the apparatus in all the above embodiments further includes:

Corresponding to the embodiment of the method for determining the system bottleneck, the present application further discloses a device for determining the system bottleneck, which includes:

Therefore, by adopting the device, the system bottleneck degree of each non-leaf node in the scheduling topological graph can be calculated.

In another possible embodiment of the present application, the shortest path calculating module in all the above embodiments includes:

and the shortest path determining unit is used for determining the path with the shortest path length as the shortest path.

In another possible embodiment of the present application, the leaf node operation duration determining module in all the embodiments includes:

Corresponding to the above disclosed method embodiment for calculating the system bottleneck, the present application also discloses a device for calculating the system bottleneck, comprising:

a non-leaf node determining module for determining a non-leaf node;

the module for calculating the number of activated leaf nodes is used for calculating the number of activated leaf nodes in the scheduling topological graph when the operation of the non-leaf nodes corresponding to the calculation task is reduced by a preset time length, wherein the non-leaf nodes are nodes with downstream nodes in the scheduling topological graph, and the activated leaf nodes are leaf nodes with changed starting operation time and ending operation time corresponding to the calculation task;

and the system bottleneck degree calculating module is used for determining the system bottleneck degree of the non-leaf node according to the number of the activated leaf nodes and the preset duration. Therefore, by adopting the device, the system bottleneck degree of each non-leaf node in the scheduling topological graph can be calculated.

In another possible embodiment of the present application, the module for calculating the number of activated leaf nodes in all the above embodiments includes:

a setting unit, configured to set the non-leaf node as an active node;

In another possible embodiment of the present application, the influence determining unit in all the above embodiments includes:

It is noted that, in this document, relational terms such as "first" and "second," and the like, may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

The above description is merely exemplary of the present application and is presented to enable those skilled in the art to understand and practice the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. A method of determining upstream node bottleneck, comprising:

determining all upstream nodes of the current node;

splitting the optimizable time range into a plurality of time intervals;

calculating the bottleneck degree of each upstream node according to the time interval of the corresponding calculation task of each upstream node, the total number of the calculation tasks operated in the time interval and the optimizable time range;

determining the optimization difficulty of the current node according to the indirect bottleneck interval and the optimizable time range;

the i and the n are positive integers, and i is less than or equal to n;

and p and m are positive integers, and p is less than or equal to m.

2. The method of claim 1, wherein determining an optimizable time horizon from the time the upstream node corresponds to the start of the computing task and the time the current node corresponds to the start of the computing task comprises:

3. The method according to claim 1 or 2, characterized in that the method further comprises:

4. A method of determining a system bottleneck, comprising:

determining the system bottleneck degree of the non-leaf nodes according to the increased time length of each leaf node in operation;

when the non-leaf node reduces the preset time length corresponding to the running of the computing task, the number of activated leaf nodes in the scheduling topological graph is computed, and the method comprises the following steps:

setting the non-leaf node as an active node;

judging whether the activated node is a leaf node;

if yes, determining the activated node as an activated leaf node;

5. The method of claim 4, wherein calculating the shortest path to each leaf node from a non-leaf node in the scheduling topology comprises:

and determining the path with the shortest path length as the shortest path.

6. The method of claim 4, wherein determining the increased duration of the running of the computing task corresponding to each leaf node according to the shortest path when the non-leaf node increases the running time of the computing task by a preset duration comprises:

7. The method according to any one of claims 4-6, further comprising:

and taking a preset number of non-leaf nodes as system bottleneck nodes.

8. A method of determining a system bottleneck, comprising:

determining a non-leaf node;

determining the system bottleneck degree of the non-leaf node according to the number of the activated leaf nodes and preset duration;

setting the non-leaf node as an active node;

judging whether the activated node is a leaf node;

if yes, determining the activated node as an activated leaf node;

9. The method of claim 8, wherein calculating whether the end runtime of the computing task corresponding to the active node affects the start runtime of the computing task corresponding to a directly downstream node of the active node comprises:

10. The method according to claim 8 or 9, characterized in that the method further comprises:

and taking a preset number of non-leaf nodes as system bottleneck nodes.

11. An apparatus for determining an upstream node bottleneck degree, comprising:

an upstream node bottleneck degree calculating module, configured to calculate the bottleneck degree of each upstream node according to the time interval in which its corresponding computing task runs, the total number of computing tasks running in that time interval, and the optimizable time range;

an optimization difficulty determining unit, configured to determine the optimization difficulty of the current node according to the indirect bottleneck interval and the optimizable time range;

wherein i and n are positive integers, and i is less than or equal to n;

and p and m are positive integers, and p is less than or equal to m.

12. The apparatus of claim 11, wherein the optimizable time range determining module comprises:

13. The apparatus of claim 11 or 12, further comprising:

14. An apparatus for determining a system bottleneck degree, comprising:

a first system bottleneck degree determining module, configured to determine the system bottleneck degree of the non-leaf node according to the increase in the running duration of the computing task corresponding to each leaf node;

a module for calculating the number of activated leaf nodes, comprising:

a setting unit, configured to set the non-leaf node as an activated node;

15. The apparatus of claim 14, wherein the shortest path computation module comprises:

16. The apparatus of claim 14, wherein the leaf node runtime determination module comprises:

17. The apparatus of any one of claims 14-16, further comprising:

18. An apparatus for determining a system bottleneck degree, comprising:

a non-leaf node determining module, configured to determine a non-leaf node;

a system bottleneck degree calculating module, configured to determine the system bottleneck degree of the non-leaf node according to the number of activated leaf nodes and the preset duration;

a module for calculating the number of activated leaf nodes, comprising:

a setting unit, configured to set the non-leaf node as an activated node;

19. The apparatus of claim 18, wherein the influence determining unit comprises:

20. The apparatus of claim 18 or 19, further comprising: