CN102708404A - Machine learning based method for predicating parameters during MPI (message passing interface) optimal operation in multi-core environments - Google Patents

Machine learning based method for predicating parameters during MPI (message passing interface) optimal operation in multi-core environments

Info

Publication number
CN102708404A
CN102708404A CN2012100420437A CN201210042043A
Authority
CN
China
Prior art keywords
mpi
training
model
program
parameter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012100420437A
Other languages
Chinese (zh)
Other versions
CN102708404B (en
Inventor
曾宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEJING COMPUTING CENTER
Original Assignee
BEJING COMPUTING CENTER
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEJING COMPUTING CENTER filed Critical BEJING COMPUTING CENTER
Priority to CN201210042043.7A priority Critical patent/CN102708404B/en
Publication of CN102708404A publication Critical patent/CN102708404A/en
Application granted granted Critical
Publication of CN102708404B publication Critical patent/CN102708404B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention provides a novel method for optimizing MPI (message passing interface) applications in multi-core environments, and particularly relates to a machine learning based method for predicting optimal MPI runtime parameters on multi-core clusters. According to the method, training benchmarks with different ratios of point-to-point communication data to collective communication data are designed to generate training data on the specific multi-core cluster; runtime parameter optimization models are constructed with the decision tree REPTree, which outputs results quickly, and an ANN (artificial neural network), which can produce multiple outputs and has good noise immunity; the optimization models are trained with the training data generated by the training benchmarks, and the trained models are used to predict the optimal runtime parameters of unknown input MPI applications. Experiments show that the speedups produced by the optimal runtime parameters obtained from the REPTree-based and ANN-based prediction models on average exceed 90% of the actual maximum speedup.

Description

A machine learning based method for predicting optimal MPI runtime parameters in multi-core environments
Technical field
The present invention relates to MPI optimization in multi-core environments, and specifically to a machine learning based method for predicting optimal MPI runtime parameters in multi-core environments.
Background art
As multi-core technology becomes widely used in clusters, performance optimization of MPI applications on multi-core clusters has become a research focus. The mainstream MPI library implementations (Open MPI, MPICH, etc.) all provide tunable runtime parameter mechanisms, allowing users to tune the runtime parameters according to specific application demands, hardware and operating system to improve the performance of MPI applications.
We have designed and implemented a machine learning based MPI runtime parameter optimization model for general multi-core environments, which can automatically predict near-optimal runtime parameter combinations for MPI programs on a multi-core cluster with a given software and hardware configuration. The prediction model we propose, based on decision trees and artificial neural networks from machine learning, can automatically predict near-optimal runtime parameters for unknown MPI programs after offline training and online learning. The MPI program to be predicted is described jointly by dynamic features obtained from one run of the source code and static features such as the communicator size. The proposed machine learning based optimal MPI runtime parameter prediction method is validated on an InfiniBand-based multi-core SMP cluster, using the mainstream Open MPI library as the environment for predicting optimal MPI runtime parameters. Experiments with the IS and LU benchmarks from the NAS Parallel Benchmarks suite 2.4 show that, compared with the Open MPI default configuration, the optimized runtime parameter combinations obtained from the machine learning based prediction model can bring up to about 20% performance improvement to MPI applications on a multi-core cluster.
Multi-core technology refers to integrating two or more processing cores into a single processor chip, accelerating applications by distributing load across the cores. Clusters based on multi-core technology have become the mainstream platform in high-performance computing, and more and more clusters adopt multi-core processors as core components [CHAI08]. The Message Passing Interface (MPI) is the most widely used parallel programming model on clusters, applied in both distributed and shared memory systems.
The new features of multi-core processors make the memory hierarchy of a multi-core cluster more complex, while also bringing new optimization opportunities for MPI programs. Although factors such as the data locality of the algorithm and load balancing affect MPI application performance, they are tied to the characteristics of specific applications; directly porting existing MPI programs to multi-core cluster platforms does not greatly improve application performance and scalability [SW 09]. Current research on MPI optimization for multi-core systems mainly focuses on hybrid MPI/OpenMP, tuning MPI runtime parameters, optimizing MPI process topology, and optimizing MPI collective communication. Among these, tunable runtime parameters have an important influence on the performance of MPI applications in multi-core environments, but the optimal runtime parameters depend on the underlying architecture of the multi-core node or multi-core cluster and on the characteristics of the MPI program itself.
The mainstream MPI library implementations all provide tunable runtime parameter mechanisms that allow users to obtain higher performance by adjusting the runtime parameters. For example, the protocol used by point-to-point communication can be changed, i.e., the threshold parameter in the MPI library that switches from the immediate (Eager) protocol to the rendezvous (Rendezvous) protocol according to message size can be modified. Tunable runtime parameters have an important influence on the performance of MPI applications on a multi-core cluster, but the optimal runtime parameters depend to a large extent on factors such as the memory hierarchy of the multi-core cluster (including the sharing of second- or third-level caches within a node), the network interconnect of the cluster (including InfiniBand, Gigabit Ethernet, Myrinet, etc.), the communication performance of the cluster (including memory and network communication latency and bandwidth), and the communication levels of the MPI application within the cluster (including intra-chip, inter-chip and intra-node communication).
Fig. 1 shows the performance impact of different configuration combinations of five runtime parameters on the IS benchmark (Class B) from the NAS Parallel Benchmarks suite, on a 10-node multi-core cluster with 8 cores per node. On a cluster of 10 InfiniBand-interconnected dual-core AMD nodes, the best runtime parameter configuration brings at most about 20% performance improvement compared with the Open MPI library default settings, while a wrong configuration causes about 30% performance loss compared with the default configuration.
Fig. 2 shows the influence of runtime parameters on the Jacobi benchmark. Experiments on a 32-core AMD node with a matrix size of 4096*6096 show that, for the Jacobi benchmark, the optimal parameter configuration combination that obtains the maximum speedup (relative to the default configuration) with 8 MPI processes differs from that with 16 MPI processes. The experimental results also show that with 8 MPI processes, the optimal MPI runtime parameters can bring about 70% performance improvement to the Jacobi benchmark.
Figs. 1 and 2 show that tunable runtime parameters can bring considerable performance improvement to MPI applications, but at the same time the configuration space of runtime parameters and the corresponding optimization space are so large that manual tuning is impractical. Take the mainstream Open MPI, built on a modular component architecture, as an example: suppose one tunable numeric parameter and one boolean parameter are selected from each of the commonly used btl (point-to-point) and coll (collective operation) components, with 20 values tested for each numeric parameter and 2 values for each boolean parameter; an automated iterative technique would then need to test 1600 runtime parameter configuration combinations for these four parameters. At an average MPI program execution time of 5 minutes per configuration, finding the best runtime parameter combination would take more than five days in total. A fast automatic parameter optimization method is therefore urgently needed to improve the performance of MPI applications on multi-core clusters.
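The search-space arithmetic above can be reproduced in a few lines (a sketch; the abstract value ranges stand in for actual Open MPI MCA parameters, which are not named in this passage):

```python
from itertools import product

# Hypothetical sweep: one numeric and one boolean parameter from each of
# the btl (point-to-point) and coll (collective) component families.
numeric_values = range(20)          # 20 tested values per numeric parameter
boolean_values = (False, True)      # 2 values per boolean parameter

configs = list(product(numeric_values, boolean_values,
                       numeric_values, boolean_values))
print(len(configs))                 # 20 * 2 * 20 * 2 = 1600 combinations

minutes_per_run = 5                 # average MPI program execution time
total_days = len(configs) * minutes_per_run / (60 * 24)
print(round(total_days, 1))         # ~5.6 days of exhaustive search
```

The exhaustive grid grows multiplicatively with every added parameter, which is why the patent replaces it with a learned model.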
Summary of the invention
To achieve the above object, the invention provides a machine learning based method for predicting optimal MPI runtime parameters in multi-core environments.
A machine learning based method for predicting optimal MPI runtime parameters in multi-core environments:
optimization models are built with two standard techniques, decision trees and artificial neural networks;
training data are generated by running a constructed training benchmark on the target multi-core cluster with many groups of runtime parameter combinations, and the constructed models are trained offline;
the trained models are used to predict the optimal runtime parameter configuration for new MPI programs;
the predicted results are compared with the actual optimal runtime parameter vector to evaluate the accuracy of the prediction models.
Preferably, the decision-tree model takes the program features of the training benchmark and the runtime parameter configuration combinations as the input of the decision-tree model, the training data being {F_i, C_i}, where F_i is the program feature of the training benchmark and C_i is the runtime parameter combination under the current program features; the actually obtained speedup serves as the output of the decision tree.
Preferably, the artificial neural network model selects the data that produce the highest speedups in the training benchmark to train the parameter prediction model, the training data being {F_i, C_i_best}, where F_i = &lt;f_1, f_2, ..., f_m&gt; is the program feature of the training benchmark and C_i_best = &lt;c_1, c_2, ..., c_n&gt; is the optimal runtime parameter combination under the current program features.
Preferably, in the model training stage of the decision-tree model, different speedup results are produced by varying the vectors F and C; during performance model prediction, if F_p represents the program feature vector of the input MPI program, then the runtime parameter configuration C_best that obtains the maximum speedup S_max is the optimal runtime parameter combination vector of this MPI program, i.e., S_max = M_REPTree(F_p, C_best).
Preferably, in the model training stage of the artificial neural network model, different speedup results are produced by varying the vectors F and C; during performance model prediction, if M_ANN is the trained artificial neural network model, then C_best = M_ANN(F_p), where F_p represents the program feature vector of the input MPI program and C_best is the optimal runtime parameter combination vector of this MPI program.
Preferably, the training benchmark comprises two MPI communication modes: synchronous MPI point-to-point communication and MPI collective operations; the training benchmark accepts 5 parameters, which respectively control the proportion of point-to-point communication in the training benchmark, the proportion of collective communication, the size of messages in synchronous point-to-point communication between two MPI processes, the size of messages exchanged in collective communication, and the size of the communicator.
Preferably, the offline training varies the 5 input parameters of the training benchmark so that the ratio of point-to-point to collective communication is respectively: 100% point-to-point communication, 100% collective communication, and 50% point-to-point with 50% collective communication; under the three communication ratios, the message sizes in point-to-point and collective communication and the size of the MPI communicator are varied, as are the runtime parameter configuration combinations; in total 3000 training data records are produced and used to train the neural network optimization model.
Preferably, once the prediction model has been established and trained with a large amount of learning data, prediction tasks are carried out according to actual demand.
Preferably, before prediction, the MPI program to be predicted needs one instrumented run on the target multi-core cluster to obtain the feature vector F_p of the input MPI program; with F_p as the model input, the optimal runtime parameter combination of the input MPI program is obtained; when the target multi-core cluster changes, the above process needs to be repeated.
The present invention proposes a new method of optimizing MPI applications in multi-core environments: using machine learning to predict the optimal runtime parameters of MPI applications on a multi-core cluster. We designed training benchmarks with different ratios of point-to-point to collective communication data to generate training data on a specific multi-core cluster; we build the runtime parameter optimization models with the decision tree REPTree, which can output results quickly, and the artificial neural network ANN, which can produce multiple outputs and has good noise immunity; the optimization models are trained with the training data produced by the training benchmarks, and the trained models are used to predict the optimal runtime parameters of unknown input MPI programs. Experiments show that the speedups produced by the optimized runtime parameters obtained from the REPTree-based and ANN-based prediction models on average exceed 90% of the actual maximum speedup.
Description of drawings
Fig. 1 Performance impact of runtime parameters on the IS benchmark (Class B)
Fig. 2 Performance impact of runtime parameters on the Jacobi benchmark (4096*6096)
Fig. 3 The machine learning based prediction model
Fig. 4 The decision tree prediction model
Fig. 5 The neural network prediction model
Embodiment
The invention is further described below with reference to the accompanying drawings and specific embodiments.
Tunable runtime parameters have an important influence on the performance of MPI applications on a multi-core cluster, but the optimal runtime parameters depend on the underlying architecture of the multi-core cluster and the characteristics of the MPI program itself. This section introduces the method and steps of using machine learning techniques to predict optimal MPI runtime parameters in multi-core environments.
Our method comprises four stages: model construction, model training, parameter prediction using the trained models, and evaluation of prediction accuracy. In the first stage we adopt two standard machine learning techniques, decision trees and artificial neural networks, to build the optimization models. In the model training stage we run the constructed training benchmark on the target multi-core cluster with many groups of runtime parameter combinations to generate training data, and train the constructed models offline. The trained models can then be used to predict the optimal runtime parameter configuration for new, unknown MPI programs. Comparing the predicted results with the actual optimal runtime parameter vector evaluates the accuracy of the prediction models.
The essence of machine learning is to solve practical problems with computer learning systems. A machine learning based prediction model can be regarded as a mapping or function y = F(X), where X is the input and the output y is a continuous or ordered value. The goal of learning is to obtain a mapping or function F that models the relationship between X and y. The accuracy of a predictor is assessed by computing, for each test tuple X, the difference between the predicted value of y and the actual given value [HAN 07].
Because MPI runtime parameters are difficult to tune manually on multi-core systems, we adopt machine learning methods to establish a prediction model for optimal parameters; this model can predict the optimal runtime parameters of any unknown MPI workload on a given multi-core cluster platform.
Fig. 3 depicts the workflow of the prediction model. First, the training benchmark is run on the target multi-core cluster with different runtime parameter configurations to produce training data; the constructed prediction model is trained offline with the generated training data; then the program features extracted from a given MPI program serve as the input of the prediction model; finally, the model outputs near-optimal runtime parameter predictions, so as to obtain a near-maximum speedup. Formally, the machine learning based prediction of optimal MPI runtime parameters can be expressed as follows: let M be the trained prediction model and F = &lt;f_1, f_2, ..., f_m&gt; the extracted program features of the input MPI program; then C = M(F), where the resulting vector C = &lt;c_1, c_2, ..., c_n&gt; is the optimal runtime parameter combination of this program.
Decision-tree model
A decision tree is a tree-shaped prediction model; the root node of the tree is the entire data set space; each internal node corresponds to a splitting problem, i.e., a test on a single variable that splits the data set space into two or more blocks; and each leaf node is a segment of the data carrying a classification result. We chose decision trees for building the prediction model because a decision tree does not need much background knowledge during learning; a decision tree can be produced from the information provided by the sample data set alone. Through the branching decisions at tree nodes, a given class in a classification problem depends only on the attribute values of the variables at the relevant tree nodes; that is, not all variable values are needed to determine the corresponding class or to perform a prediction.
We adopt REPTree, a fast decision tree learner, to build our decision-tree model. REPTree uses a reduced-error pruning strategy and can create regression trees, so it can effectively handle continuous attributes and missing attribute values [PIER 02].
Fig. 4 depicts our decision tree prediction model. During model training we take the program features of the training benchmark and the runtime parameter configuration combinations as the input of the decision-tree model; that is, the training data of the REPTree model are {F_i, C_i}, where F_i is the program feature of the training benchmark and C_i is the runtime parameter combination under the current program features, and the actually obtained speedup serves as the output of the decision tree. In other words, the program features of the training benchmark, the different runtime parameter combinations, and the actually obtained speedups serve as the input and output for modeling REPTree, generating if-then decision rules. The decision tree learned from the sample data set can then predict near-optimal runtime parameter combinations for new, unknown MPI programs whose program features have been extracted.
The model can be formalized as follows: let M_REPTree be the decision tree prediction model; then the relation among the model, the training data and the output data is defined as S = M_REPTree(f_1, f_2, ..., f_m, c_1, c_2, ..., c_n), where F = &lt;f_1, f_2, ..., f_m&gt; is the program feature vector, C = &lt;c_1, c_2, ..., c_n&gt; is the runtime parameter combination vector, and S is the actual speedup produced when the inputs are F and C. In the model training stage, we produce different speedup results by varying the vectors F and C. During prediction, if F_p represents the program feature vector of the input MPI program, then the runtime parameter configuration C_best that obtains the maximum speedup S_max is the optimal runtime parameter combination vector of this MPI program, i.e., S_max = M_REPTree(F_p, C_best).
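The patent's REPTree model is a Weka learner; as an illustrative stand-in only, the same scheme — learn S = f(F, C) from measurements, then pick the parameter combination C with the highest predicted speedup — can be sketched with scikit-learn's DecisionTreeRegressor on synthetic data (all feature and parameter values here are hypothetical, not the patent's):

```python
import numpy as np
from itertools import product
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Synthetic training records: rows are [program features F | parameter combo C],
# target is the measured speedup S, mimicking {F_i, C_i} -> S.
F_train = rng.uniform(0, 1, size=(500, 3))     # e.g. p2p ratio, msg size, comm size
C_train = rng.integers(0, 4, size=(500, 2))    # two tunable runtime parameters
X_train = np.hstack([F_train, C_train])
# Toy ground truth: speedup peaks when both parameters equal 2.
S_train = 2.0 - 0.3 * np.abs(C_train - 2).sum(axis=1) + rng.normal(0, 0.01, 500)

model = DecisionTreeRegressor(max_depth=8, random_state=0).fit(X_train, S_train)

def predict_best_params(f_p, candidate_values=range(4)):
    """Return (C_best, S_max): the combo with the highest predicted speedup."""
    candidates = list(product(candidate_values, repeat=2))
    X = np.array([list(f_p) + list(c) for c in candidates])
    speedups = model.predict(X)
    i = int(np.argmax(speedups))
    return candidates[i], float(speedups[i])

c_best, s_max = predict_best_params([0.5, 0.5, 0.5])
print(c_best, s_max)   # expected near (2, 2), the toy optimum
```

Note the tree predicts the speedup of a (feature, parameter) pair, so finding C_best requires scoring candidate combinations, exactly as in S_max = M_REPTree(F_p, C_best).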
Neural network model
The artificial neural network (ANN) is a class of machine learning models that can map a set of input parameters to a set of target outputs. We adopt the ANN because it applies well to both linear and nonlinear regression problems and has good noise immunity [ZHANG 08].
A three-layer feed-forward error back-propagation neural network is used to build the prediction model. Experiments verified the ANN design that performs best for our prediction problem: the transfer function of the hidden layer is the tangent sigmoid function f(x) = (e^x - e^(-x)) / (e^x + e^(-x)), and the transfer function of the output layer is the logarithmic sigmoid function f(x) = 1 / (1 + e^(-x)); the hidden layer has 10 neurons, and its training function adopts the Levenberg-Marquardt algorithm, because it well combines the speed of Newton's method with the stability of gradient descent [BATR 92].
Fig. 5 depicts our neural network prediction model. During model training we select the data that produce the highest speedups in the training benchmark to train the ANN-based parameter prediction model; that is, the training data of the ANN model are {F_i, C_i_best}, where F_i = &lt;f_1, f_2, ..., f_m&gt; is the program feature of the training benchmark and C_i_best = &lt;c_1, c_2, ..., c_n&gt; is the optimal runtime parameter combination under the current program features. In the formal notation introduced above, let M_ANN be the trained ANN model; then C_best = M_ANN(F_p), where F_p represents the program feature vector of the input MPI program and C_best is the optimal runtime parameter combination vector of this MPI program.
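The patent's network is a Levenberg-Marquardt-trained three-layer model; as a rough stand-in, the direct mapping C_best = M_ANN(F_p) can be sketched with scikit-learn's MLPRegressor, which likewise supports one hidden layer of 10 neurons and multiple outputs (synthetic data and hypothetical names; note that MLPRegressor's solvers are L-BFGS/SGD/Adam, not Levenberg-Marquardt):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)

# Synthetic training records {F_i, C_i_best}: only the best-speedup rows.
# F: m=3 program features; C_best: n=2 optimal runtime parameter values.
F = rng.uniform(0, 1, size=(400, 3))
# Toy ground truth: optimal parameters are a smooth function of the features.
C_best_train = np.column_stack([2 * F[:, 0] + F[:, 1], 3 * F[:, 2]])

# One hidden layer with 10 neurons, mirroring the patent's topology.
ann = MLPRegressor(hidden_layer_sizes=(10,), solver="lbfgs",
                   max_iter=5000, random_state=0).fit(F, C_best_train)

f_p = np.array([[0.5, 0.5, 0.5]])   # feature vector of an unseen MPI program
c_pred = ann.predict(f_p)[0]        # predicted near-optimal parameter vector
print(c_pred)                       # expected near [1.5, 1.5] for this toy target
```

Unlike the decision-tree scheme, the ANN outputs the parameter vector directly in one forward pass, with no search over candidate combinations.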
MPI program feature extraction
Our method uses the offline-trained optimization models to predict the optimal runtime parameters of unknown MPI applications, so suitable program features must be extracted from the unknown MPI program to serve as optimization model input and obtain accurate predictions. Because runtime parameters mainly influence the communication performance between MPI processes, during feature extraction we mainly consider the communication pattern of the MPI program, the volume of data exchanged in communication, and the size of the communicator. Table 2 describes the MPI program features; these required program features can be obtained through one instrumented run of the MPI program to be predicted.
Table 2 MPI program features and descriptions
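The body of Table 2 is not reproduced in this text. Based on the features named in the surrounding passages (communication pattern, exchanged data volume, communicator size, and the five benchmark parameters), one plausible container for the feature vector F_p might look like this — the field names are our assumption, not the patent's:

```python
from dataclasses import dataclass, astuple

@dataclass
class MPIProgramFeatures:
    """Candidate feature vector F_p gathered from one instrumented run."""
    p2p_ratio: float           # share of point-to-point communication
    collective_ratio: float    # share of collective communication
    p2p_msg_bytes: int         # message size in synchronous p2p exchanges
    collective_msg_bytes: int  # message size exchanged in collectives
    communicator_size: int     # number of processes in the communicator

    def as_vector(self):
        # Flatten into the ordered list form <f_1, ..., f_m> used by the models.
        return list(astuple(self))

f_p = MPIProgramFeatures(0.5, 0.5, 4096, 8192, 16)
print(f_p.as_vector())   # [0.5, 0.5, 4096, 8192, 16]
```

The same ordered vector would be fed to either M_REPTree (concatenated with a candidate C) or M_ANN (alone).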
Training benchmark configuration and training data generation
To produce the data for training the prediction models, we designed a training benchmark program. Running the training benchmark on a multi-core cluster of the target architecture with many different combinations of tunable runtime parameters produces the training data. The training benchmark also accepts several input parameters that control the amount of data transferred in point-to-point and collective operations and the communicator size in the training benchmark.
We design the training benchmark according to the MPI program features defined in Table 2. The benchmark mainly comprises two MPI communication modes: synchronous MPI point-to-point communication and MPI collective operations. The training benchmark accepts 5 parameters, which respectively control the proportion of point-to-point communication in the training benchmark, the proportion of collective communication, the size of messages in synchronous point-to-point communication between two MPI processes, the size of messages exchanged in collective communication, and the size of the communicator.
By varying the 5 input parameters of the training benchmark, the ratio of point-to-point to collective communication is controlled to be respectively: 100% point-to-point communication, 100% collective communication, and 50% point-to-point with 50% collective communication. Under the three communication ratios, the message sizes in point-to-point and collective communication and the size of the MPI communicator are varied, as are the runtime parameter configuration combinations; in total 3000 training data records are produced and used to train the neural network optimization model.
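As a sketch of how the 3000 training records could be enumerated — the specific message sizes, communicator sizes and number of parameter configurations below are our assumptions, chosen only so the grid lands at 3000; the patent fixes only the three communication ratios and the total:

```python
from itertools import product

# The three communication mixes fixed by the method: (p2p ratio, collective ratio).
ratios = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]

# Assumed sweep values (hypothetical).
msg_sizes = [1 << k for k in (10, 14, 18, 22)]   # 1 KiB .. 4 MiB, 4 sizes
comm_sizes = [2, 4, 8, 16, 32]                   # 5 communicator sizes
param_configs = range(50)                        # 50 runtime parameter combos

grid = [(r, m, c, p) for r, m, c, p
        in product(ratios, msg_sizes, comm_sizes, param_configs)]
print(len(grid))   # 3 * 4 * 5 * 50 = 3000 training runs
```

Each grid point corresponds to one benchmark run whose measured speedup becomes one training record for the models.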
Performing prediction
Once the prediction models have been established and trained with a large amount of learning data, prediction tasks are carried out according to actual demand. Because both our decision tree prediction model S_max = M_REPTree(F_p, C_best) and our neural network prediction model C_best = M_ANN(F_p) take the program feature vector F_p of the program to be predicted as input, before prediction we need one instrumented run of the MPI program to be predicted on the target multi-core cluster to obtain the feature vector F_p of the input MPI program; with F_p as the model input, the optimal runtime parameter combination of the input MPI program is obtained. However, when the target multi-core cluster changes, the above process needs to be repeated.

Claims (9)

1. A machine learning based method for predicting optimal MPI runtime parameters in multi-core environments, characterized in that:
optimization models are built with two standard techniques, decision trees and artificial neural networks;
training data are generated by running a constructed training benchmark on the target multi-core cluster with many groups of runtime parameter combinations, and the constructed models are trained offline;
the trained models are used to predict the optimal runtime parameter configuration for new MPI programs;
the predicted results are compared with the actual optimal runtime parameter vector to evaluate the accuracy of the prediction models.
2. The method of claim 1, characterized in that: the decision-tree model takes the program features of the training benchmark and the runtime parameter configuration combinations as the input of the decision-tree model, the training data being {F_i, C_i}, where F_i is the program feature of the training benchmark and C_i is the runtime parameter combination under the current program features; the actually obtained speedup serves as the output of the decision tree.
3. The method of claim 1, characterized in that: the artificial neural network model selects the data that produce the highest speedups in the training benchmark to train the parameter prediction model, the training data being {F_i, C_i_best}, where F_i = &lt;f_1, f_2, ..., f_m&gt; is the program feature of the training benchmark and C_i_best = &lt;c_1, c_2, ..., c_n&gt; is the optimal runtime parameter combination under the current program features.
4. The method of claim 2, characterized in that: in the model training stage of the decision-tree model, different speedup results are produced by varying the vectors F and C; during performance model prediction, if F_p represents the program feature vector of the input MPI program, then the runtime parameter configuration C_best that obtains the maximum speedup S_max is the optimal runtime parameter combination vector of this MPI program, i.e., S_max = M_REPTree(F_p, C_best).
5. The method of claim 3, characterized in that: in the model training stage of the artificial neural network model, different speedup results are produced by varying the vectors F and C; during performance model prediction, if M_ANN is the trained artificial neural network model, then C_best = M_ANN(F_p), where F_p represents the program feature vector of the input MPI program and C_best is the optimal runtime parameter combination vector of this MPI program.
6. The method of claim 1, characterized in that: the training benchmark comprises two MPI communication modes: synchronous MPI point-to-point communication and MPI collective operations; the training benchmark accepts 5 parameters, which respectively control the proportion of point-to-point communication in the training benchmark, the proportion of collective communication, the size of messages in synchronous point-to-point communication between two MPI processes, the size of messages exchanged in collective communication, and the size of the communicator.
7. the method for claim 1; It is characterized in that: said off-line training is through 5 input parameters of conversion training benchmark; The ratio of control point to point link and collective communication is respectively: 100% point to point link, 100% collective communication, 50% point to point link and 50% collective communication; Under three kinds of different communication ratios, the size of message size and MPI communicator in conversion point-to-point and the collective communication, and the configuration of conversion runtime parameter combination respectively; Common property is given birth to 3000 of training datas, is used for the neural network training Optimization Model.
8. the method for claim 1 is characterized in that: when forecast model foundation and after with a large amount of learning data trained, will carry out the prediction task based on actual demand.
9. the method for claim 1 is characterized in that: before carrying out prediction, need be under a target multinuclear group of planes to will predicting the instrument operation of carrying out of MPI program, with the proper vector F of the MPI program that obtains importing pWith F pParameter combinations in the time of can obtaining importing the optimum operation of MPI program as the input of model; When a target multinuclear group of planes changed, above process need repeated.
CN201210042043.7A 2012-02-23 2012-02-23 A kind of parameter prediction method during MPI optimized operation under multinuclear based on machine learning Expired - Fee Related CN102708404B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210042043.7A CN102708404B (en) 2012-02-23 2012-02-23 A kind of parameter prediction method during MPI optimized operation under multinuclear based on machine learning

Publications (2)

Publication Number Publication Date
CN102708404A true CN102708404A (en) 2012-10-03
CN102708404B CN102708404B (en) 2016-08-03

Family

ID=46901144

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210042043.7A Expired - Fee Related CN102708404B (en) 2012-02-23 2012-02-23 A kind of parameter prediction method during MPI optimized operation under multinuclear based on machine learning

Country Status (1)

Country Link
CN (1) CN102708404B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020165838A1 (en) * 2001-05-01 2002-11-07 The Regents Of The University Of California Performance analysis of distributed applications using automatic classification of communication inefficiencies
CN101520748A (en) * 2009-01-12 2009-09-02 浪潮电子信息产业股份有限公司 Method for testing speed-up ratio of Intel multicore CPU

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
JELENA PJESIVAC-GRBOVIC ET AL: "Decision trees and MPI collective algorithm selection problem", EURO-PAR 2007 PARALLEL PROCESSING *
WANG Jie et al.: "Research on MPI program optimization techniques on multi-core clusters", COMPUTER SCIENCE *
WANG Jie et al.: "Neural-network-based MPI runtime parameter optimization on multi-core clusters", COMPUTER SCIENCE *

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104951425B (en) * 2015-07-20 2018-03-13 东北大学 A kind of cloud service performance self-adapting type of action system of selection based on deep learning
CN104951425A (en) * 2015-07-20 2015-09-30 东北大学 Cloud service performance adaptive action type selection method based on deep learning
CN106250686A (en) * 2016-07-27 2016-12-21 哈尔滨工业大学 A kind of collective communication function modelling method of concurrent program
CN111860826A (en) * 2016-11-17 2020-10-30 北京图森智途科技有限公司 Image data processing method and device of low-computing-capacity processing equipment
CN106909452B (en) * 2017-03-06 2020-08-25 中国科学技术大学 Parallel program runtime parameter optimization method
CN106909452A (en) * 2017-03-06 2017-06-30 中国科学技术大学 Concurrent program runtime parameter optimization method
US11373266B2 (en) 2017-05-05 2022-06-28 Intel Corporation Data parallelism and halo exchange for distributed machine learning
CN109146081A (en) * 2017-06-27 2019-01-04 阿里巴巴集团控股有限公司 It is a kind of for quickly creating the method and device of model item in machine learning platform
CN109146081B (en) * 2017-06-27 2022-04-29 阿里巴巴集团控股有限公司 Method and device for creating model project in machine learning platform
US11010681B2 (en) 2017-08-31 2021-05-18 Huawei Technologies Co., Ltd. Distributed computing system, and data transmission method and apparatus in distributed computing system
EP3506095A3 (en) * 2017-12-29 2019-09-25 INTEL Corporation Communication optimizations for distributed machine learning
CN107992295B (en) * 2017-12-29 2021-01-19 西安交通大学 Particle-oriented dynamic algorithm selection method
US11270201B2 (en) 2017-12-29 2022-03-08 Intel Corporation Communication optimizations for distributed machine learning
CN107992295A (en) * 2017-12-29 2018-05-04 西安交通大学 A kind of dynamic algorithm system of selection towards grain
US11704565B2 (en) 2017-12-29 2023-07-18 Intel Corporation Communication optimizations for distributed machine learning
CN109710330A (en) * 2018-12-20 2019-05-03 Oppo广东移动通信有限公司 Method for determining operation parameters, device, terminal and the storage medium of application program
CN111324532A (en) * 2020-02-13 2020-06-23 苏州浪潮智能科技有限公司 MPI parameter determination method, device and equipment of parallel computing software
CN113760766A (en) * 2021-09-10 2021-12-07 曙光信息产业(北京)有限公司 MPI parameter tuning method and device, storage medium and electronic equipment
US20230116246A1 (en) * 2021-09-27 2023-04-13 Indian Institute Of Technology Delhi System and method for optimizing data transmission in a communication network
US12021928B2 (en) * 2021-09-27 2024-06-25 Indian Institute Of Technology Delhi System and method for optimizing data transmission in a communication network

Also Published As

Publication number Publication date
CN102708404B (en) 2016-08-03

Similar Documents

Publication Publication Date Title
CN102708404A (en) Machine learning based method for predicating parameters during MPI (message passing interface) optimal operation in multi-core environments
Zou et al. Novel global harmony search algorithm for unconstrained problems
Lin et al. A hybrid evolutionary immune algorithm for multiobjective optimization problems
WO2018161468A1 (en) Global optimization, searching and machine learning method based on lamarck acquired genetic principle
CN102629106B (en) Water supply control method and water supply control system
US20170330078A1 (en) Method and system for automated model building
CN109445935A (en) A kind of high-performance big data analysis system self-adaption configuration method under cloud computing environment
Akopov et al. A multi-agent genetic algorithm for multi-objective optimization
Li et al. An improved multiobjective estimation of distribution algorithm for environmental economic dispatch of hydrothermal power systems
Lin et al. A K-means clustering with optimized initial center based on Hadoop platform
Amruthnath et al. Modified rank order clustering algorithm approach by including manufacturing data
CN109145342A (en) Automatic wiring system and method
Moradi et al. The application of water cycle algorithm to portfolio selection
Muhsen et al. Enhancing NoC-based MPSoC performance: a predictive approach with ANN and guaranteed convergence arithmetic optimization algorithm
CN114065646B (en) Energy consumption prediction method based on hybrid optimization algorithm, cloud computing platform and system
Gong et al. Evolutionary computation in China: A literature survey
CN104392317A (en) Project scheduling method based on genetic culture gene algorithm
De Moraes et al. A random forest-assisted decomposition-based evolutionary algorithm for multi-objective combinatorial optimization problems
Wen et al. MapReduce-based BP neural network classification of aquaculture water quality
Weihong et al. Optimization of BP neural network classifier using genetic algorithm
Aziz et al. Assessment of evolutionary programming models for single-objective optimization
Malhotra et al. Application of evolutionary algorithms for software maintainability prediction using object-oriented metrics
Ouyang et al. Amended harmony search algorithm with perturbation strategy for large-scale system reliability problems
Han et al. A Kriging Model‐Based Expensive Multiobjective Optimization Algorithm Using R2 Indicator of Expectation Improvement
CN115879824A (en) Method, device, equipment and medium for assisting expert decision based on ensemble learning

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160803