CN113535400A - Parallel computing resource allocation method and device, storage medium and terminal equipment - Google Patents


Info

Publication number
CN113535400A
CN113535400A
Authority
CN
China
Prior art keywords
node
calculation
computing
nodes
parallel
Prior art date
Legal status (the status listed is an assumption and is not a legal conclusion)
Pending
Application number
CN202110814155.9A
Other languages
Chinese (zh)
Inventor
林炜彦
万绵元
肖乐
Current Assignee (the listed assignees may be inaccurate)
Wingtech Communication Co Ltd
Original Assignee
Wingtech Communication Co Ltd
Priority date (the priority date is an assumption and is not a legal conclusion)
Filing date
Publication date
Application filed by Wingtech Communication Co Ltd filed Critical Wingtech Communication Co Ltd
Priority to CN202110814155.9A
Priority to PCT/CN2021/115790 (published as WO2023000443A1)
Publication of CN113535400A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F 9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 30/00 Computer-aided design [CAD]
    • G06F 30/10 Geometric CAD
    • G06F 30/15 Vehicle, aircraft or watercraft design
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 30/00 Computer-aided design [CAD]
    • G06F 30/10 Geometric CAD
    • G06F 30/17 Mechanical parametric or variational design
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 30/00 Computer-aided design [CAD]
    • G06F 30/20 Design optimisation, verification or simulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2111/00 Details relating to CAD techniques
    • G06F 2111/02 CAD in a network environment, e.g. collaborative CAD or distributed simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Geometry (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Computational Mathematics (AREA)
  • Software Systems (AREA)
  • Automation & Control Theory (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a parallel computing resource allocation method and device, a storage medium and a terminal device. The parallel computing resource allocation method comprises the following steps: determining the computation problem amount and the total number m of available computing nodes; generating a computing resource scheduling scheme permutation set according to the total number m and the computation problem amount; mapping the permutation set onto a pre-constructed time cost matrix to obtain a consumed duration permutation set; performing the isochronism calculation on the consumed duration permutation set; obtaining, from the scheduling scheme permutation set according to the calculation result, the computing node combination with the minimum isochronism ratio; and allocating that node combination to the computation problems for processing. The parallel computing resource allocation method of this embodiment can therefore greatly improve the isochronism of the computations, reduce the waiting time of workers, and improve work efficiency.

Description

Parallel computing resource allocation method and device, storage medium and terminal equipment
Technical Field
The present invention relates to the field of resource allocation technologies, and in particular, to a distributed parallel computing resource allocation method, a computer-readable storage medium, a terminal device, and a distributed parallel computing resource allocation apparatus.
Background
In the production of products such as terminal devices and automobiles, collision analysis is generally required, and during the analysis, data calculation often needs to be performed for different collision surfaces. Because these calculations proceed at different speeds, a fast computing node must wait for a slow computing node to finish before the overall result is available. This often consumes a long calculation time, affects the development progress of the product, and reduces work efficiency.
The approach adopted in the related art is to directly select two or more idle computing nodes to work together so as to increase the calculation speed. However, because the computing capacity of each node differs, the multiple computation tasks still cannot finish at the same time; a worker must still wait a long time for all tasks to complete before the next operation can begin, so the waiting time is not eliminated and work efficiency is not improved.
Disclosure of Invention
The present invention is directed to solving, at least to some extent, one of the technical problems in the related art. Therefore, an object of the present invention is to provide a distributed parallel computing resource allocation method, which can greatly improve the computing isochronism of computing quantity, reduce the waiting time of the staff, and improve the work efficiency.
A second object of the invention is to propose a computer-readable storage medium.
A third object of the present invention is to provide a terminal device.
The fourth purpose of the invention is to provide a distributed parallel computing resource allocation device.
To achieve the above object, an embodiment of a first aspect of the present invention provides a distributed parallel computing resource allocation method, where the method includes: determining the computation problem amount, and determining the total number m of available computing nodes, wherein m is the total number of the available computing nodes; generating a computing resource scheduling scheme permutation set according to the total number m of the available computing nodes and the computation problem amount; mapping the computing resource scheduling scheme permutation set onto a pre-constructed time cost matrix to obtain a consumed duration permutation set; performing the isochronism calculation on the consumed duration permutation set, and obtaining the computing node combination with the minimum isochronism ratio from the computing resource scheduling scheme permutation set according to the calculation result; and allocating the computing node combination with the minimum isochronism ratio to the computation problem amount.
The embodiment of the invention first determines the computation problem amount and the total number m of available computing nodes, generates a computing resource scheduling scheme permutation set according to the total number m and the computation problem amount, maps the permutation set onto a pre-constructed time cost matrix to obtain a consumed duration permutation set, performs the isochronism calculation on the consumed duration permutation set, obtains from the scheduling scheme permutation set, according to the calculation result, the computing node combination with the minimum isochronism ratio, and allocates that node combination to the computation problems for processing. Therefore, the distributed parallel computing resource allocation method of this embodiment can greatly improve the isochronism of the computations, reduce the waiting time of workers, and improve work efficiency.
In some embodiments of the invention, the determining the total number m of available computing nodes comprises: multiplying the computation problem amount by n to determine the total number m of the available computing nodes, wherein n is the number of nodes computing in parallel.
In some embodiments of the present invention, the generating a computing resource scheduling scheme permutation set according to the total number m of available computing nodes and the computation problem amount includes: sequentially taking n computing nodes out of the m available computing nodes, then sequentially taking n computing nodes out of the remaining available computing nodes, and repeating this u times until all m available computing nodes are taken out, so as to generate the computing resource scheduling scheme permutation set, wherein u is the computation problem amount.
In some embodiments of the invention, the time-consuming cost matrix is constructed by: acquiring the total number of the available computing nodes and the number of parallel computing nodes; arranging according to the total number of the available computing nodes and the number of the parallel computing nodes to generate a node resource set, and generating a null matrix with the same row and column number as the number of the available computing nodes according to the total number of the available computing nodes and the number of the parallel computing nodes; acquiring the consumed duration corresponding to each node resource; and writing the consumed duration corresponding to each node resource into the empty matrix to generate a time-consumed cost matrix.
In some embodiments of the present invention, the distributed parallel computing resource allocation method further includes: and after the calculation of the calculation quantity is finished, updating the time consumption cost matrix according to the actual consumption duration of the calculation quantity.
In some embodiments of the invention, the isochronism ratio is equal to the maximum value in the consumed duration permutation set divided by the minimum value in the consumed duration permutation set.
To achieve the above object, a second aspect of the present invention provides a computer-readable storage medium, on which a distributed parallel computing resource allocation program is stored, which when executed by a processor implements the distributed parallel computing resource allocation method according to the above embodiments.
In the embodiment of the invention, the processor executes the distributed parallel computing resource allocation program stored on the computer-readable storage medium, thereby greatly improving the isochronism of the computations, reducing the waiting time of workers, and improving work efficiency.
To achieve the above object, a terminal device according to a third aspect of the present invention includes a memory, a processor, and a distributed parallel computing resource allocation program stored in the memory and executable on the processor, where the computing resource allocation program, when executed by the processor, implements the distributed parallel computing resource allocation method according to the above embodiments.
The terminal equipment in the embodiment of the invention comprises the memory and the processor, and the processor executes the distributed parallel computing resource allocation program stored on the memory, so that the computing isochronism of the computing quantity can be greatly improved, the waiting time of workers is reduced, and the working efficiency is improved.
To achieve the above object, a fourth aspect of the present invention provides a distributed parallel computing resource allocation apparatus, including: the determining module is used for determining the calculation quantity and determining the total number m of the available calculating nodes, wherein m is the total number of the available calculating nodes; a generating module, configured to generate a computing resource scheduling scheme permutation set according to the total number m of the available computing nodes and the computing quantity; the mapping module is used for mapping the computing resource scheduling scheme permutation set to a pre-constructed time cost matrix to obtain a time duration permutation set; the calculation module is used for performing isochronous calculation on the consumed duration permutation set and acquiring a calculation node combination with the minimum isochronous ratio value from the calculation resource scheduling scheme permutation set according to a calculation result; and the control module is used for allocating the calculation node combination with the minimum isochronism ratio value to the calculation problem amount.
The distributed parallel computing resource allocation apparatus of the embodiment of the invention comprises a determining module, a generating module, a mapping module, a calculation module, and a control module. The determining module determines the computation problem amount and the total number m of available computing nodes; the generating module generates a computing resource scheduling scheme permutation set according to the total number m and the computation problem amount; the mapping module maps the permutation set onto the pre-constructed time cost matrix to obtain a consumed duration permutation set; the calculation module performs the isochronism calculation on the consumed duration permutation set and obtains, according to the calculation result, the computing node combination with the minimum isochronism ratio from the scheduling scheme permutation set; and the control module allocates that node combination to the computation problems for processing. Therefore, the distributed parallel computing resource allocation apparatus of this embodiment can greatly improve the isochronism of the computations, reduce the waiting time of workers, and improve work efficiency.
In some embodiments of the invention, the apparatus further comprises: the construction module is used for acquiring the total number of the available computing nodes and the number of the parallel computing nodes; arranging according to the total number of the available computing nodes and the number of the parallel computing nodes to generate a node resource set, and generating a null matrix with the same row and column number as the number of the available computing nodes according to the total number of the available computing nodes and the number of the parallel computing nodes; acquiring the consumed duration corresponding to each node resource; and writing the consumed duration corresponding to each node resource into the empty matrix to generate a time-consumed cost matrix.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
FIG. 1 is a flow diagram of a distributed parallel computing resource allocation method according to one embodiment of the invention;
FIG. 2 is a flow diagram of a distributed parallel computing resource allocation method according to another embodiment of the invention;
FIG. 3 is a flow diagram of a distributed parallel computing resource allocation method in accordance with one embodiment of the present invention;
fig. 4 is a block diagram of a structure of a terminal device according to an embodiment of the present invention;
FIG. 5 is a block diagram of a distributed parallel computing resource allocation apparatus according to one embodiment of the present invention;
FIG. 6 is a block diagram of a distributed parallel computing resource allocation apparatus according to another embodiment of the present invention;
fig. 7 is a block diagram of a distributed parallel computing resource allocation apparatus according to still another embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are illustrative and intended to be illustrative of the invention and are not to be construed as limiting the invention.
The following describes a distributed parallel computing resource allocation method and apparatus, a computer-readable storage medium, and a terminal device according to an embodiment of the present invention with reference to the drawings.
FIG. 1 is a flow diagram of a distributed parallel computing resource allocation method according to one embodiment of the invention.
As shown in fig. 1, the present invention provides a distributed parallel computing resource allocation method, which includes the following steps:
S10, determining the computation problem amount, and determining the total number m of the available computing nodes, wherein m is the total number of the available computing nodes.
First, it should be noted that the distributed computing resource allocation method in the present invention is a multi-node distributed computing resource allocation method. In this embodiment the two-node case is taken as an example; other cases such as three-node and four-node parallel computation may refer to the specific implementation of the two-node case.
Specifically, the computation problem amount in this embodiment depends on the product. For example, for an electronic product with an approximately rectangular shape, collision data for its six surfaces needs to be calculated during collision analysis, so the computation problem amount is six. After the computation problem amount is determined, the total number m of available computing nodes can be determined. It should be noted that in an embodiment with two-node parallel computation, m may be greater than or equal to twice the computation problem amount; for example, when the computation problem amount is six, m may be twelve, thirteen, and so on.
In some embodiments of the present invention, the determination of the total number of available computing nodes m may be determined by multiplying the number of computing problems by n, where n is the number of nodes computed in parallel.
Specifically, in the case of two-node distributed parallel computing, if the number of nodes in parallel computing is two, the total number m of available computing nodes may be twice the computation workload, and in this embodiment, if the computation workload is six, the total number m of available computing nodes is twelve.
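The node-count rule just described can be written out directly as a minimal check (the six-problem figure is the worked example above):

```python
# Minimal check of the rule m = u * n for the worked example above:
# u computation problems, each computed by n nodes in parallel.
u = 6   # six computation problems (six collision surfaces)
n = 2   # two-node parallel computation
m = u * n
print(m)  # 12
```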
S20, generating a computing resource scheduling scheme permutation set according to the total number m of the available computing nodes and the computation problem amount.
Specifically, for example, if the computation problem amount in this embodiment is two and the total number m of available computing nodes is four, the four available computing nodes may be permuted to obtain a plurality of computing resource scheduling schemes. More specifically, taking the four computing nodes a, b, c, and d as an example, the computing resource scheduling scheme permutation set may include: scheme one, node a parallel with node b, and node c parallel with node d; scheme two, node a parallel with node c, and node b parallel with node d; scheme three, node a parallel with node d, and node b parallel with node c; scheme four, node b parallel with node a, and node c parallel with node d; scheme five, node c parallel with node a, and node b parallel with node d; scheme six, node d parallel with node a, and node b parallel with node c; scheme seven, node a parallel with node b, and node d parallel with node c; scheme eight, node a parallel with node c, and node d parallel with node b; scheme nine, node a parallel with node d, and node c parallel with node b; scheme ten, node b parallel with node a, and node d parallel with node c; scheme eleven, node c parallel with node a, and node d parallel with node b; and scheme twelve, node d parallel with node a, and node c parallel with node b.
In this embodiment, if the number of computing nodes exceeds four, for example five computing nodes, groups of four nodes are first taken out of the five; for each such group of four nodes, the two-node parallel case again yields a computing resource scheduling scheme permutation set of twelve schemes, which may be found in the set of schemes listed in the example above.
It should be noted that in this embodiment the parallel computing nodes have a primary/secondary distinction. For example, node a parallel with node b can be represented as primary node a and secondary node b computing in parallel, while node b parallel with node a represents primary node b and secondary node a computing in parallel; that is, the time consumed by primary node a with secondary node b on a computation problem may differ from the time consumed by primary node b with secondary node a on the same computation problem.
It should be noted that when more than two nodes perform parallel computation, for example three nodes, the permutation may be generated in the same manner as above. For example, when six computing nodes a, b, c, d, e, and f are allocated as two groups of three nodes for parallel computation, the schemes may include: scheme one, nodes a, b, and c parallel, and nodes d, e, and f parallel; scheme two, nodes a, b, and d parallel, and nodes c, e, and f parallel; and so on.
In some embodiments of the present invention, generating the computing resource scheduling scheme permutation set according to the total number m of available computing nodes and the computation problem amount comprises: sequentially taking n computing nodes out of the m available computing nodes, then sequentially taking n computing nodes out of the remaining available computing nodes, and repeating this u times until all m available computing nodes are taken out, thereby generating the computing resource scheduling scheme permutation set, wherein u is the computation problem amount.
Specifically, take the computation problem amount u to be two. In this embodiment the total number m of computing nodes is twice u, so m is four, and this is two-node parallel computation, that is, n is two. Two computing nodes can therefore be taken in order out of the four computing nodes in 12 ways, and then two computing nodes can be taken out of the remaining two in 2 ways, giving 12 × 2 = 24 cases in total. However, since the case where node a is parallel with node b and node c is parallel with node d gives the same result as the case where node c is parallel with node d and node a is parallel with node b, there are actually only 12 cases, namely the twelve schemes listed in the embodiment above.
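The counting argument above can be sketched in Python (a hedged illustration; the function name is ours, not the patent's). Ordered selections of n nodes are drawn u times, and schemes that differ only in the order of whole groups are merged:

```python
from itertools import permutations

def scheduling_schemes(nodes, n, u):
    """Enumerate distinct scheduling schemes: u groups of n nodes each,
    where order inside a group matters (the first node is the primary)
    but the order of the groups themselves does not."""
    seen, schemes = set(), []
    for perm in permutations(nodes, n * u):
        groups = tuple(perm[i * n:(i + 1) * n] for i in range(u))
        key = frozenset(groups)  # swapping whole groups gives the same scheme
        if key not in seen:
            seen.add(key)
            schemes.append(groups)
    return schemes

schemes = scheduling_schemes(['a', 'b', 'c', 'd'], n=2, u=2)
print(len(schemes))  # 12, matching the twelve schemes listed above
```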
S30, mapping the computing resource scheduling scheme permutation set onto a pre-constructed time cost matrix to obtain a consumed duration permutation set.
Specifically, as shown in Table 1 below, in the case of two-node parallel computation there are 12 combination modes in total. Since the master node and the slave node in a diagonal combination are the same node, such a combination is equivalent to single-machine computation, and this case is not considered in this embodiment.
TABLE 1
[Table 1: the 4 × 4 master/slave combination table for nodes a to d; the 12 off-diagonal cells are the valid two-node combinations, and the 4 diagonal cells (same master and slave) are excluded.]
In this embodiment, each computing resource scheduling scheme may be mapped onto a pre-constructed time cost matrix to obtain the consumed duration of each scheme, and thereby obtain the consumed duration permutation set. In this embodiment the time cost matrix may be denoted cost, where, for the four nodes a to d,

cost = [ cost_aa  cost_ab  cost_ac  cost_ad
         cost_ba  cost_bb  cost_bc  cost_bd
         cost_ca  cost_cb  cost_cc  cost_cd
         cost_da  cost_db  cost_dc  cost_dd ]

and cost_ij denotes the duration consumed when primary node i and secondary node j compute in parallel; the diagonal entries correspond to single-machine computation and are not used.
in some embodiments of the present invention, as shown in FIG. 2, the time-consuming cost matrix is constructed by:
s301, acquiring the total number of available computing nodes and the number of parallel computing nodes.
S302, arranging according to the total number of the available computing nodes and the number of the parallel computing nodes to generate a node resource set, and generating a null matrix with the same row and column number as the number of the available computing nodes according to the total number of the available computing nodes and the number of the parallel computing nodes.
Specifically, the total number of available computing nodes and the number of parallel computing nodes are first obtained, and it should be noted that, in this embodiment, the manner of generating the node resource set according to the total number of available computing nodes and the number of parallel computing nodes may be to generate a computing resource scheduling scheme permutation set of twelve schemes in the above embodiments, and a specific generation manner of the calculation resource scheduling scheme permutation set may refer to a specific description process in the above embodiments, which is not described herein again.
It should be noted that after the total number of available computing nodes and the number of parallel computing nodes are obtained, an empty matrix with the same number of rows and columns as the number of available computing nodes may also be generated, and it is understood that the empty matrix has the same format as the time-consuming cost matrix in the foregoing embodiment, but corresponding time is not mapped in the empty matrix in this embodiment.
S303, acquiring the consumed duration corresponding to each node resource.
After the node resource set is generated, the consumed duration corresponding to each node resource may then be obtained. It can be understood that the consumed duration of each node resource may be obtained from historical data, or, when no historical data exists, set directly according to the working experience of the workers. Here the consumed duration of node resource ab may be represented as cost_ab, and the consumed durations of the other node resources may be represented in the same way.
S304, writing the consumed duration corresponding to each node resource into a blank matrix to generate a time-consumed cost matrix.
Specifically, after the consumed duration and the empty matrix corresponding to each node resource are obtained, the consumed duration corresponding to each node resource may be correspondingly written into the empty matrix, so as to generate a time consumed cost matrix of the node resource set.
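Steps S301 to S304 admit a minimal sketch for four nodes; the durations below are invented placeholder values, not data from the patent:

```python
import math

nodes = ['a', 'b', 'c', 'd']
idx = {node: i for i, node in enumerate(nodes)}

# S302: an empty matrix with as many rows and columns as available nodes;
# NaN marks cells not yet written (the diagonal stays NaN: single-machine
# combinations are not used).
cost = [[math.nan] * len(nodes) for _ in range(len(nodes))]

# S303: consumed duration per node resource (hypothetical historical hours,
# keyed by (primary, secondary) node pair).
historical = {
    ('a', 'b'): 5.0, ('a', 'c'): 4.0, ('a', 'd'): 6.0,
    ('b', 'a'): 5.5, ('b', 'c'): 7.0, ('b', 'd'): 4.5,
    ('c', 'a'): 4.2, ('c', 'b'): 6.8, ('c', 'd'): 5.1,
    ('d', 'a'): 6.1, ('d', 'b'): 4.4, ('d', 'c'): 5.3,
}

# S304: write each duration into the empty matrix (row = primary node,
# column = secondary node) to obtain the time cost matrix.
for (primary, secondary), hours in historical.items():
    cost[idx[primary]][idx[secondary]] = hours

print(cost[idx['a']][idx['b']])  # 5.0
```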
S40, performing the isochronism calculation on the consumed duration permutation set, and obtaining the computing node combination with the minimum isochronism ratio from the computing resource scheduling scheme permutation set according to the calculation result.
Specifically, the consumed duration corresponding to each node resource can be obtained from the time cost matrix, and the isochronism of each computing resource scheduling scheme can then be calculated. For each of the twelve schemes listed above, from scheme one (node a parallel with node b, and node c parallel with node d) through scheme twelve (node d parallel with node a, and node c parallel with node b), the corresponding isochronism can be calculated according to the isochronism ratio.
In some embodiments of the invention, the isochronism ratio is equal to the maximum value in the consumed duration permutation set divided by the minimum value in the consumed duration permutation set.
For example, in the first scheme, node a is parallel to node b and node c is parallel to node d, with consumed durations cost_ab and cost_cd respectively, so the maximum of cost_ab and cost_cd can be divided by the minimum of cost_ab and cost_cd to obtain the isochronism ratio corresponding to the first scheme. It can be understood that, across the twelve schemes, there are twelve isochronism ratios. Note that in this example each scheme contains only two consumed durations, cost_ab and cost_cd; assuming cost_ab is larger than cost_cd, the isochronism ratio of the first scheme is cost_ab/cost_cd.
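The ratio described above can be sketched in Python; the function name and the sample durations are illustrative, not taken from the patent:

```python
def isochronism_ratio(durations):
    """Ratio of the longest to the shortest consumed duration in a scheme.

    A value close to 1.0 means the parallel node pairs finish at nearly
    the same time, so neither pair is left waiting for the other.
    """
    return max(durations) / min(durations)

# First scheme: pair (a, b) and pair (c, d) run in parallel; the durations
# cost_ab and cost_cd below are illustrative placeholders.
cost_ab, cost_cd = 120.0, 100.0
ratio = isochronism_ratio([cost_ab, cost_cd])  # 1.2
```

A ratio of exactly 1.0 would mean both pairs finish simultaneously, the ideal case the scheduler is searching for.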
And S50, allocating the calculation node combination with the minimum isochronism ratio value to the calculation problem amount.
Specifically, in this embodiment, after the twelve isochronism ratios are obtained, the calculation node combination with the smallest isochronism ratio may be assigned to the computation problem amount for calculation. For example, the first isochronism ratio is the maximum of cost_ab and cost_cd divided by the minimum of cost_ab and cost_cd; if the first isochronism ratio is the smallest of the twelve, then node a parallel to node b and node c parallel to node d may be allocated to the first computation problem amount and the second computation problem amount respectively, so that the computation problem amounts are processed by node a in parallel with node b and by node c in parallel with node d. It can be understood that the calculation node combination corresponding to the minimum isochronism ratio offers high isochronism when processing the computation problem amounts, which reduces the waiting time of workers.
In some embodiments of the present invention, after the computation of the computation problem amount is completed, the time-consuming cost matrix is updated according to the actual consumed duration of the computation problem amount.
Specifically, in this embodiment, after the computation problem amount is computed by the calculation node combination corresponding to the minimum isochronism ratio, the time-consuming cost of the corresponding calculation node combination may be updated in the time-consuming cost matrix, so as to improve the accuracy of the time-consuming cost matrix.
FIG. 3 is a flow diagram of a distributed parallel computing resource allocation method according to one embodiment of the invention.
As shown in fig. 3, in the process of batch submission of calculation files, the number of calculation files may be read to obtain the computation problem amount, and the available computing nodes are obtained from the computing node list and then arranged to obtain the computing resource scheduling scheme permutation set, where end represents the last computing node and the value of end is equal to twice the computation problem amount. The time-consuming cost matrix is initialized to obtain a time-consuming cost matrix cost. The computing resource scheduling scheme permutation set is then mapped to the time-consuming cost matrix cost to obtain the consumed duration permutation corresponding to the permutation set. Next, the isochronism ratio judge corresponding to each computing resource scheduling scheme is calculated, the computing resource scheduling scheme with the smallest ratio is selected, and the combination of computing nodes corresponding to that scheme is taken as perfect_group. Computing resources are then scheduled for the calculation files according to perfect_group for computation. After the computation is finished, the result data can be processed, and the time-consuming data generated during the computation can be written back into the time-consuming cost matrix to obtain the latest time-consuming cost matrix.
In summary, the distributed parallel computing resource allocation method of the embodiment of the invention can greatly improve the computing isochronism of the computing quantity, reduce the waiting time of the staff and improve the working efficiency.
Further, an embodiment of the present invention also provides a computer-readable storage medium, where the storage medium stores a distributed parallel computing resource allocation program, and the computing resource allocation program, when executed by a processor, implements the distributed parallel computing resource allocation method in the foregoing embodiments.
The computer readable storage medium of the embodiment of the invention executes the distributed parallel computing resource allocation program stored on the processor through the processor, thereby greatly improving the computing isochronism of computing quantity, reducing the waiting time of workers and improving the working efficiency.
Fig. 4 is a block diagram of a terminal device according to an embodiment of the present invention.
Further, as shown in fig. 4, an embodiment of the present invention provides a terminal device 10, where the terminal device 10 includes a memory 11, a processor 12, and a distributed parallel computing resource allocation program stored on the memory 11 and executable on the processor 12, and when the computing resource allocation program is executed by the processor 12, the distributed parallel computing resource allocation method in the above embodiment is implemented.
The terminal equipment of the embodiment of the invention comprises the memory and the processor, and the processor executes the distributed parallel computing resource allocation program stored on the memory, so that the computing isochronism of the computing quantity can be greatly improved, the waiting time of workers is reduced, and the working efficiency is improved.
Fig. 5 is a block diagram of a distributed parallel computing resource allocation apparatus according to an embodiment of the present invention.
Further, as shown in fig. 5, an embodiment of the present invention proposes a distributed parallel computing resource allocation apparatus 100, where the distributed parallel computing resource allocation apparatus 100 includes a determination module 101, a generation module 102, a mapping module 103, a computation module 104, and a control module 105.
The determining module 101 is configured to determine the computation problem amount and determine the total number m of available computing nodes, where m is the total number of available computing nodes; the generating module 102 is configured to generate a computing resource scheduling scheme permutation set according to the total number m of available computing nodes and the computation problem amount; the mapping module 103 is configured to map the computing resource scheduling scheme permutation set to a pre-constructed time-consuming cost matrix to obtain a consumed duration permutation set; the calculation module 104 is configured to perform an isochronism calculation on the consumed duration permutation set, and obtain the calculation node combination with the minimum isochronism ratio from the computing resource scheduling scheme permutation set according to the calculation result; the control module 105 is configured to assign the calculation node combination with the smallest isochronism ratio to the computation problem amount.
First, it should be noted that the distributed computing resource allocation apparatus in the present invention is a multi-node distributed computing resource allocation apparatus, in this embodiment, an execution method of the distributed computing resource allocation apparatus in the embodiment of the present invention is described by taking two nodes as an example, and other embodiments such as three nodes and four nodes may refer to a specific implementation manner of two nodes.
Specifically, the computation problem amount in this embodiment can be determined according to different products. For example, for an electronic product with an approximately rectangular shape, the collision data of its six surfaces needs to be calculated during collision analysis, so the computation problem amount is six. After the determining module 101 determines the computation problem amount, the total number m of available computing nodes may be determined. It should be noted that, in an embodiment of two-node parallel computation, the total number m of available computing nodes may be greater than or equal to twice the computation problem amount; for example, when the computation problem amount is six, the total number m of available computing nodes may be twelve, thirteen, and the like.
In some embodiments of the present invention, the determining module 101 is specifically configured to multiply the computation problem quantity by n to determine the total number m of available computation nodes, where n is the number of nodes in parallel computation.
Specifically, in the case of two-node distributed parallel computing, if the number of nodes in parallel computing is two, the total number m of available computing nodes may be twice the computation problem amount; in this embodiment, if the computation problem amount is six, the total number m of available computing nodes is twelve.
For example, if the computation problem amount in this embodiment is two and the total number m of available computing nodes is four, the four available computing nodes may be arranged, and the generating module 102 may be utilized to generate a plurality of computing resource scheduling schemes. More specifically, taking four computing nodes a, b, c and d as an example, the computing resource scheduling scheme permutation set may include: in the first scheme, node a is parallel to node b, and node c is parallel to node d; in the second scheme, node a is parallel to node c, and node b is parallel to node d; in the third scheme, node a is parallel to node d, and node b is parallel to node c; in the fourth scheme, node b is parallel to node a, and node c is parallel to node d; in the fifth scheme, node c is parallel to node a, and node b is parallel to node d; in the sixth scheme, node d is parallel to node a, and node b is parallel to node c; in the seventh scheme, node a is parallel to node b, and node d is parallel to node c; in the eighth scheme, node a is parallel to node c, and node d is parallel to node b; in the ninth scheme, node a is parallel to node d, and node c is parallel to node b; in the tenth scheme, node b is parallel to node a, and node d is parallel to node c; in the eleventh scheme, node c is parallel to node a, and node d is parallel to node b; and in the twelfth scheme, node d is parallel to node a, and node c is parallel to node b.
It should be noted that the computing nodes in this embodiment are distinguished as primary and secondary: for example, the computation time of primary node a with secondary node b for a computation problem amount differs from the computation time of primary node b with secondary node a for the same computation problem amount.
In some embodiments of the present invention, the generating module 102 may be specifically configured to sequentially take out n computing nodes from the total number m of available computing nodes, then sequentially take out n computing nodes from the remaining available computing nodes, repeating u times until all of the m available computing nodes are taken out, so as to generate the computing resource scheduling scheme permutation set, where u is the computation problem amount.
Specifically, taking the computation problem amount u as two: in this embodiment the total number m of computing nodes is twice the computation problem amount u, so m is four, and this embodiment uses two-node parallel computation, that is, n is two. Two computing nodes can therefore be sequentially taken out of the four computing nodes in 12 ways, and then two computing nodes can be taken out of the remaining two computing nodes in 2 ways, for a total of 12 × 2 = 24 cases. However, since the scheme in which node a is parallel to node b and node c is parallel to node d yields the same computation result as the scheme in which node c is parallel to node d and node a is parallel to node b, there are actually only 12 cases, namely the twelve schemes listed in the above embodiment.
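The take-n-nodes-repeatedly enumeration described above can be sketched as follows; the function name is illustrative, and the de-duplication step mirrors the 24-to-12 reduction in the text:

```python
from itertools import permutations

def scheme_permutation_set(nodes, n, u):
    """Enumerate scheduling schemes by sequentially taking n nodes at a
    time, u times, until every available node has been taken out once."""
    schemes = []
    for order in permutations(nodes):
        # slice the ordered nodes into u consecutive groups of n nodes
        schemes.append(tuple(tuple(order[n * i:n * (i + 1)]) for i in range(u)))
    return schemes

raw = scheme_permutation_set(["a", "b", "c", "d"], n=2, u=2)  # 24 ordered cases
# Merging schemes that only swap which pair handles which problem leaves 12.
distinct = {frozenset(s) for s in raw}
```

Running this with four nodes reproduces the counts from the text: 24 ordered cases, 12 after merging schemes that differ only in the order of the node pairs.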
As shown in table 1 below, it can be seen that, in the case of two-node parallel computing, there are 12 combination modes in total. Since the master node and the slave node in a diagonal combination are the same, which is equivalent to single-machine computing, this case is not considered in this embodiment.
TABLE 1
          slave a   slave b   slave c   slave d
master a     -         ab        ac        ad
master b     ba        -         bc        bd
master c     ca        cb        -         cd
master d     da        db        dc        -
In this embodiment, the mapping module 103 may map each computing resource scheduling scheme to the pre-constructed time-consuming cost matrix to obtain the consumed duration of each computing resource scheduling scheme, and further obtain the consumed duration permutation set. In this embodiment, the time-consuming cost matrix may be represented by cost, an example of which is:
          a          b          c          d
a         -        cost_ab    cost_ac    cost_ad
b      cost_ba       -        cost_bc    cost_bd
c      cost_ca    cost_cb       -        cost_cd
d      cost_da    cost_db    cost_dc       -
In some embodiments of the present invention, as shown in fig. 6, the distributed parallel computing resource allocation apparatus 100 further includes a building module 106. The building module 106 is configured to obtain the total number of available computing nodes and the number of parallel computing nodes; arrange them according to the total number of available computing nodes and the number of parallel computing nodes to generate a node resource set, and generate an empty matrix whose numbers of rows and columns both equal the number of available computing nodes; obtain the consumed duration corresponding to each node resource; and write the consumed duration corresponding to each node resource into the empty matrix to generate the time-consuming cost matrix.
Specifically, the building module 106 first obtains the total number of available computing nodes and the number of parallel computing nodes, and it should be noted that, in this embodiment, the manner of generating the node resource set according to the total number of available computing nodes and the number of parallel computing nodes may be to generate a computing resource scheduling scheme permutation set of twelve schemes in the above embodiments, and a specific generation manner of the calculation resource scheduling scheme permutation set may refer to a specific description process in the above embodiments, which is not described herein again.
It should be noted that after the total number of available computing nodes and the number of parallel computing nodes are obtained, an empty matrix with the same number of rows and columns as the number of available computing nodes may also be generated, and it is understood that the empty matrix has the same format as the time-consuming cost matrix in the foregoing embodiment, but corresponding time is not mapped in the empty matrix in this embodiment.
After the node resource set is generated, the consumed duration corresponding to each node resource may be obtained. It can be understood that the consumed duration of each node resource may be obtained from historical data, or, in the absence of historical data, set directly according to the working experience of the staff. The consumed duration of the node resource ab may be represented as cost_ab, and the consumed durations of the other node resources can be represented analogously.
Specifically, after the consumed duration and the empty matrix corresponding to each node resource are obtained, the consumed duration corresponding to each node resource may be correspondingly written into the empty matrix, so as to generate a time consumed cost matrix of the node resource set.
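The construction steps above (generate an empty matrix, then write each node resource's consumed duration into it) might look like this minimal sketch; the plain-list matrix representation and the sample durations are assumptions:

```python
def build_cost_matrix(nodes, durations):
    """Build the time-consuming cost matrix: start from an empty square
    matrix with one row and column per available computing node, then
    write each (master, slave) node resource's consumed duration into
    its cell. The diagonal stays empty, since master == slave would be
    single-machine computing and is not scheduled."""
    index = {node: i for i, node in enumerate(nodes)}
    cost = [[None] * len(nodes) for _ in range(len(nodes))]  # empty matrix
    for (master, slave), duration in durations.items():
        cost[index[master]][index[slave]] = duration
    return cost

# Illustrative historical durations for a few node resources.
cost = build_cost_matrix(
    ["a", "b", "c", "d"],
    {("a", "b"): 10.0, ("b", "a"): 12.0, ("c", "d"): 11.0, ("d", "c"): 9.0},
)
```

Cells without historical data remain None until a duration is recorded, matching the note above that missing values can be seeded from staff experience.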
The consumed duration corresponding to each node resource can be obtained from the time-consuming cost matrix, and the calculation module 104 can then be used to calculate the isochronism of each computing resource scheduling scheme. Taking four computing nodes a, b, c and d as an example, the schemes are: in the first scheme, node a is parallel to node b, and node c is parallel to node d; in the second scheme, node a is parallel to node c, and node b is parallel to node d; in the third scheme, node a is parallel to node d, and node b is parallel to node c; in the fourth scheme, node b is parallel to node a, and node c is parallel to node d; in the fifth scheme, node c is parallel to node a, and node b is parallel to node d; in the sixth scheme, node d is parallel to node a, and node b is parallel to node c; in the seventh scheme, node a is parallel to node b, and node d is parallel to node c; in the eighth scheme, node a is parallel to node c, and node d is parallel to node b; in the ninth scheme, node a is parallel to node d, and node c is parallel to node b; in the tenth scheme, node b is parallel to node a, and node d is parallel to node c; in the eleventh scheme, node c is parallel to node a, and node d is parallel to node b; and in the twelfth scheme, node d is parallel to node a, and node c is parallel to node b. The isochronism corresponding to each of the twelve schemes can then be calculated as an isochronism ratio.
In some embodiments of the invention, the isochronism ratio is equal to the maximum value in the consumed duration permutation set divided by the minimum value in the consumed duration permutation set.
For example, in the first scheme, node a is parallel to node b and node c is parallel to node d, with consumed durations cost_ab and cost_cd respectively, so the maximum of cost_ab and cost_cd can be divided by the minimum of cost_ab and cost_cd to obtain the isochronism ratio. It can be understood that, across the twelve schemes, there are twelve isochronism ratios.
In this embodiment, after the twelve isochronism ratios are obtained, the control module 105 may assign the calculation node combination with the smallest isochronism ratio to the computation problem amount for calculation. For example, the first isochronism ratio is the maximum of cost_ab and cost_cd divided by the minimum of cost_ab and cost_cd; if the first isochronism ratio is the smallest of the twelve, then node a parallel to node b and node c parallel to node d may be allocated to the first computation problem amount and the second computation problem amount respectively, so that the computation problem amounts are processed by node a in parallel with node b and by node c in parallel with node d. It can be understood that the calculation node combination corresponding to the minimum isochronism ratio offers high isochronism when processing the computation problem amounts, which reduces the waiting time of workers.
In some embodiments of the present invention, as shown in fig. 7, the distributed parallel computing resource allocation apparatus 100 further includes an updating module 107, where the updating module 107 is configured to update the time-consuming cost matrix according to the actual consumed duration of the computation problem amount after the computation problem amount is completely computed.
Specifically, in this embodiment, after the computation problem amount is computed by the calculation node combination corresponding to the minimum isochronism ratio, the time-consuming cost of the corresponding calculation node combination may be updated in the time-consuming cost matrix by the updating module 107, so as to improve the accuracy of the time-consuming cost matrix.
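A minimal sketch of this write-back step, assuming the cost data is kept as a (master, slave) → duration mapping; a plain overwrite is shown, though the source does not specify the update rule:

```python
def update_cost_matrix(cost, scheme, actual_durations):
    """Write the actual consumed durations of a finished computation back
    into the cost table so later scheduling uses fresher estimates.
    A plain overwrite is used here as an assumption; a moving average
    over past runs would be an equally plausible update rule."""
    for pair, duration in zip(scheme, actual_durations):
        cost[pair] = duration
    return cost

cost = {("a", "b"): 10.0, ("c", "d"): 11.0}
cost = update_cost_matrix(cost, [("a", "b"), ("c", "d")], [10.5, 10.8])
```

After the update, the next scheduling pass computes its isochronism ratios from the measured durations rather than the original estimates.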
It should be noted that, for other specific embodiments of the embodiment of the present invention, reference may be made to the specific implementation manner of the distributed parallel computing resource allocation method in the foregoing embodiment, and details are not described here again.
In summary, the distributed parallel computing resource allocation device of the embodiment of the invention can greatly improve the computing isochronism of the computing quantity, reduce the waiting time of the staff and improve the working efficiency.
It should be noted that the logic and/or steps represented in the flowcharts or otherwise described herein, such as an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
In the description of the present invention, it is to be understood that the terms "central," "longitudinal," "lateral," "length," "width," "thickness," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," "clockwise," "counterclockwise," "axial," "radial," "circumferential," and the like are used in the orientations and positional relationships indicated in the drawings for convenience in describing the invention and to simplify the description, and are not intended to indicate or imply that the referenced devices or elements must have a particular orientation, be constructed and operated in a particular orientation, and are therefore not to be considered limiting of the invention.
Furthermore, the terms "first", "second", and the like used in the embodiments of the present invention are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated in the embodiments. Thus, a feature of an embodiment of the present invention that is defined by the terms "first," "second," etc. may explicitly or implicitly indicate that at least one of the feature is included in the embodiment. In the description of the present invention, the word "plurality" means at least two or two and more, such as two, three, four, etc., unless specifically limited otherwise in the examples.
In the present invention, unless otherwise explicitly stated or limited by the relevant description or limitation, the terms "mounted," "connected," and "fixed" in the embodiments are to be understood in a broad sense, for example, the connection may be a fixed connection, a detachable connection, or an integrated connection, and it may be understood that the connection may also be a mechanical connection, an electrical connection, etc.; of course, they may be directly connected or indirectly connected through intervening media, or they may be interconnected within one another or in an interactive relationship. Those of ordinary skill in the art will understand the specific meaning of the above terms in the present invention according to their specific implementation.
In the present invention, unless otherwise expressly stated or limited, the first feature "on" or "under" the second feature may be directly contacting the first and second features or indirectly contacting the first and second features through an intermediate. Also, a first feature "on," "over," and "above" a second feature may be directly or diagonally above the second feature, or may simply indicate that the first feature is at a higher level than the second feature. A first feature being "under," "below," and "beneath" a second feature may be directly under or obliquely under the first feature, or may simply mean that the first feature is at a lesser elevation than the second feature.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims (10)

1. A method of distributed parallel computing resource allocation, the method comprising:
determining the amount of calculation problems, and determining the total number m of available calculation nodes, wherein m is the total number of the available calculation nodes;
generating a computing resource scheduling scheme permutation set according to the total number m of the available computing nodes and the computing quantity;
mapping the computing resource scheduling scheme permutation set to a pre-constructed time cost matrix to obtain a time duration permutation set;
performing an isochronism calculation on the consumed duration permutation set, and acquiring the calculation node combination with the minimum isochronism ratio from the calculation resource scheduling scheme permutation set according to a calculation result;
and allocating the computing node combination with the minimum isochronism ratio value to the computing problem amount.
2. The method of claim 1, wherein determining the total number of available computing nodes, m, comprises:
and multiplying the calculation problem quantity by n to determine the total number m of the available calculation nodes, wherein n is the number of the nodes calculated in parallel.
3. The method according to claim 1, wherein said generating a computing resource scheduling scheme permutation set according to the total number m of available computing nodes and the computation problem amount comprises: sequentially taking out n computing nodes from the total number m of available computing nodes, sequentially taking out n computing nodes from the remaining available computing nodes, and repeating u times until all of the m available computing nodes are taken out, so as to generate the computing resource scheduling scheme permutation set, wherein u is the computation problem amount.
4. The method of claim 1, wherein the time-consuming cost matrix is constructed by:
acquiring the total number of the available computing nodes and the number of parallel computing nodes;
arranging according to the total number of the available computing nodes and the number of the parallel computing nodes to generate a node resource set, and generating a null matrix with the same row and column number as the number of the available computing nodes according to the total number of the available computing nodes and the number of the parallel computing nodes;
acquiring the consumed duration corresponding to each node resource;
and writing the consumed duration corresponding to each node resource into the empty matrix to generate a time-consumed cost matrix.
5. The method according to any one of claims 1-4, further comprising:
and after the calculation of the calculation quantity is finished, updating the time consumption cost matrix according to the actual consumption duration of the calculation quantity.
6. The method of claim 1, wherein the isochronism ratio is equal to the maximum value in the consumed duration permutation set divided by the minimum value in the consumed duration permutation set.
7. A computer-readable storage medium, having stored thereon a distributed parallel computing resource allocation program which, when executed by a processor, implements the distributed parallel computing resource allocation method of any one of claims 1-6.
8. A terminal device comprising a memory, a processor and a distributed parallel computing resource allocation program stored on the memory and executable on the processor, the computing resource allocation program when executed by the processor implementing the distributed parallel computing resource allocation method of any one of claims 1 to 6.
9. A distributed parallel computing resource allocation apparatus, the apparatus comprising:
the determining module is used for determining the calculation quantity and determining the total number m of the available calculating nodes, wherein m is the total number of the available calculating nodes;
a generating module, configured to generate a computing resource scheduling scheme permutation set according to the total number m of the available computing nodes and the computing quantity;
the mapping module is used for mapping the computing resource scheduling scheme permutation set to a pre-constructed time cost matrix to obtain a time duration permutation set;
the calculation module is used for performing isochronous calculation on the consumed duration permutation set and acquiring a calculation node combination with the minimum isochronous ratio value from the calculation resource scheduling scheme permutation set according to a calculation result;
and the control module is used for distributing the calculation node combination with the minimum isochronism ratio value to the calculation problem amount.
10. The apparatus of claim 9, further comprising:
a construction module, configured to obtain the total number of available computing nodes and the number of parallel computing nodes; generate a node resource set by permuting the available computing nodes according to the total number of available computing nodes and the number of parallel computing nodes, and generate an empty matrix whose row and column counts both equal the total number of available computing nodes; obtain the consumed duration corresponding to each node resource; and write the consumed duration corresponding to each node resource into the empty matrix to generate the time-cost matrix.
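The scheduling flow recited in claims 6, 9, and 10 — build a time-cost matrix, enumerate candidate node combinations, map each to its consumed durations, and select the combination with the smallest isochronism ratio (maximum duration divided by minimum duration) — can be sketched as follows. This is a minimal illustration rather than the patented implementation: all names, the matrix layout (row = node index, column index = parallel-node count − 1), and the use of combinations instead of full permutations are assumptions.

```python
from itertools import combinations

def isochronism_ratio(durations):
    """Claim 6 (sketch): ratio of the longest to the shortest consumed
    duration; a value close to 1.0 means the nodes finish in near-equal time."""
    return max(durations) / min(durations)

def build_time_cost_matrix(n_nodes, measured):
    """Claim 10 (sketch): start from an empty n_nodes x n_nodes matrix and
    write each measured duration into it. `measured` maps the assumed key
    (node_index, parallel_count - 1) to a consumed duration."""
    matrix = [[0.0] * n_nodes for _ in range(n_nodes)]
    for (node, col), duration in measured.items():
        matrix[node][col] = duration
    return matrix

def best_combination(n_nodes, n_parallel, matrix):
    """Claim 9 (sketch): enumerate candidate node combinations, look up each
    node's duration in the time-cost matrix, and return the combination
    whose isochronism ratio is minimal."""
    best, best_ratio = None, float("inf")
    for combo in combinations(range(n_nodes), n_parallel):
        durations = [matrix[node][n_parallel - 1] for node in combo]
        ratio = isochronism_ratio(durations)
        if ratio < best_ratio:
            best, best_ratio = combo, ratio
    return best, best_ratio

# Three nodes, two running in parallel: nodes 0 and 1 are well matched
# (8 s vs 9 s), so they form the most evenly loaded pair.
matrix = build_time_cost_matrix(3, {(0, 1): 8.0, (1, 1): 9.0, (2, 1): 20.0})
combo, ratio = best_combination(3, 2, matrix)  # combo == (0, 1), ratio == 1.125
```

Minimizing the isochronism ratio favors node combinations whose members finish at nearly the same time, which is why the claims use it as the selection criterion: a lopsided combination leaves fast nodes idle while the slowest node finishes.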
CN202110814155.9A 2021-07-19 2021-07-19 Parallel computing resource allocation method and device, storage medium and terminal equipment Pending CN113535400A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110814155.9A CN113535400A (en) 2021-07-19 2021-07-19 Parallel computing resource allocation method and device, storage medium and terminal equipment
PCT/CN2021/115790 WO2023000443A1 (en) 2021-07-19 2021-08-31 Parallel computing resource allocation method and apparatus, storage medium, and terminal device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110814155.9A CN113535400A (en) 2021-07-19 2021-07-19 Parallel computing resource allocation method and device, storage medium and terminal equipment

Publications (1)

Publication Number Publication Date
CN113535400A true CN113535400A (en) 2021-10-22

Family

ID=78128739

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110814155.9A Pending CN113535400A (en) 2021-07-19 2021-07-19 Parallel computing resource allocation method and device, storage medium and terminal equipment

Country Status (2)

Country Link
CN (1) CN113535400A (en)
WO (1) WO2023000443A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115328654A (en) * 2022-08-15 2022-11-11 中国建设银行股份有限公司 Resource allocation method and device, electronic equipment and storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160301624A1 (en) * 2015-04-10 2016-10-13 International Business Machines Corporation Predictive computing resource allocation for distributed environments
US20160378560A1 (en) * 2014-02-28 2016-12-29 Pivotal Software, Inc. Executing a foreign program on a parallel computing system
CN106598727A (en) * 2016-11-07 2017-04-26 北京邮电大学 Computation resource distribution method and system for communication system
CN106776044A (en) * 2017-01-11 2017-05-31 上海鲲云信息科技有限公司 Hardware-accelerated method and system, hardware accelerator perform method and system
CN110245023A (en) * 2019-06-05 2019-09-17 欧冶云商股份有限公司 Distributed scheduling method and device, electronic equipment and computer storage medium
CN110825526A (en) * 2019-11-08 2020-02-21 欧冶云商股份有限公司 Distributed scheduling method and device based on ER relationship, equipment and storage medium
CN111309479A (en) * 2020-02-14 2020-06-19 北京百度网讯科技有限公司 Method, device, equipment and medium for realizing task parallel processing
CN112150262A (en) * 2020-09-29 2020-12-29 中国银行股份有限公司 Method and device for processing data of account and fact checking
CN112363821A (en) * 2021-01-12 2021-02-12 湖南大学 Computing resource scheduling method and device and computer equipment


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Feng Liang: "Optimized Simulation of Resource Scheduling for Heterogeneous Distributed Storage Systems", Computer Simulation, no. 03, 15 March 2016 (2016-03-15) *
Gong Zijie; Zhang Yaping; Zhang Mingdong: "An Adaptive Min-Min Algorithm Based on Resource Grading in Distributed Computing", Application Research of Computers, no. 03, 15 March 2016 (2016-03-15) *


Also Published As

Publication number Publication date
WO2023000443A1 (en) 2023-01-26

Similar Documents

Publication Publication Date Title
CN110705705B (en) Convolutional neural network model synchronous training method, cluster and readable storage medium
CN112559163B (en) Method and device for optimizing tensor calculation performance
US20130151707A1 (en) Scalable scheduling for distributed data processing
CN110795241B (en) Job scheduling management method, scheduling center and system
CN103207774A (en) Method And System For Resolving Thread Divergences
CN114185818B (en) GPU (graphics processing Unit) memory access self-adaptive optimization method and device based on extended page table
EP2657842A1 (en) Workload optimization in a multi-processor system executing sparse-matrix vector multiplication
CN111078394B (en) GPU thread load balancing method and device
CN103425536A (en) Test resource management method oriented towards distributed system performance tests
KR20140044596A (en) Computing system including multi core processor and load balancing method thereof
JP2022539955A (en) Task scheduling method and apparatus
CN109800092A (en) A kind of processing method of shared data, device and server
CN116257345B (en) Deep learning task scheduling method and device
CN113535400A (en) Parallel computing resource allocation method and device, storage medium and terminal equipment
CN104850505A (en) Memory management method and system based on chain type stacking
CN106844024B (en) GPU/CPU scheduling method and system of self-learning running time prediction model
CN115952385B (en) Parallel supernode ordering method and system for solving large-scale sparse equation set
CN111506400A (en) Computing resource allocation system, method, device and computer equipment
CN110175073B (en) Scheduling method, sending method, device and related equipment of data exchange job
JP5355152B2 (en) Dynamic reconfiguration device
CN116302327A (en) Resource scheduling method and related equipment
CN115757260A (en) Data interaction method, graphics processor and graphics processing system
CN105389212A (en) Job assigning method and apparatus
CN115878517A (en) Memory device, operation method of memory device, and electronic device
CN112433847A (en) OpenCL kernel submission method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination