CN115456202A - Method, device, equipment and medium for improving learning performance of working machine - Google Patents
- Publication number
- CN115456202A (application CN202211394593.5A)
- Authority
- CN
- China
- Prior art keywords
- working machine
- local
- data set
- prediction
- variance
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
- 238000000034 method Methods 0.000 title claims abstract description 84
- 230000007786 learning performance Effects 0.000 title claims abstract description 24
- 238000012360 testing method Methods 0.000 claims abstract description 84
- 238000004422 calculation algorithm Methods 0.000 claims abstract description 75
- 238000012549 training Methods 0.000 claims abstract description 75
- 230000004927 fusion Effects 0.000 claims abstract description 39
- 230000008569 process Effects 0.000 claims abstract description 37
- 238000012937 correction Methods 0.000 claims abstract description 31
- 230000002776 aggregation Effects 0.000 claims abstract description 30
- 238000004220 aggregation Methods 0.000 claims abstract description 30
- 230000004931 aggregating effect Effects 0.000 claims abstract description 15
- 230000006870 function Effects 0.000 claims description 44
- 238000009826 distribution Methods 0.000 claims description 20
- 238000003860 storage Methods 0.000 claims description 17
- 238000004590 computer program Methods 0.000 claims description 9
- 230000004044 response Effects 0.000 claims description 4
- 238000010801 machine learning Methods 0.000 abstract description 12
- 238000010586 diagram Methods 0.000 description 8
- 238000013528 artificial neural network Methods 0.000 description 7
- 238000004364 calculation method Methods 0.000 description 7
- 230000008901 benefit Effects 0.000 description 4
- 238000004891 communication Methods 0.000 description 4
- 230000003287 optical effect Effects 0.000 description 4
- 230000000694 effects Effects 0.000 description 3
- 238000013461 design Methods 0.000 description 2
- 238000005516 engineering process Methods 0.000 description 2
- 239000000835 fiber Substances 0.000 description 2
- 230000014509 gene expression Effects 0.000 description 2
- 230000006872 improvement Effects 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 238000005457 optimization Methods 0.000 description 2
- 238000003909 pattern recognition Methods 0.000 description 2
- 238000013459 approach Methods 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 230000005540 biological transmission Effects 0.000 description 1
- 238000002474 experimental method Methods 0.000 description 1
- 238000007499 fusion processing Methods 0.000 description 1
- 238000012417 linear regression Methods 0.000 description 1
- 238000007477 logistic regression Methods 0.000 description 1
- 238000012546 transfer Methods 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Medical Informatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Artificial Intelligence (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention relates to the field of machine learning, and provides a method, a device, equipment and a medium for improving the learning performance of a working machine. The method comprises the following steps: establishing a local training data set corresponding to each working machine, and training on the local training data set through a Gaussian process regression algorithm to obtain a local prediction model corresponding to each working machine; setting an uncertainty correction term for the data of the local prediction model corresponding to each working machine through a server, and aggregating based on the rBCM (robust Bayesian committee machine) aggregation algorithm to obtain a global prediction model; and sending the global prediction model to each working machine, and setting a fusion algorithm and an uncertainty test data set for each working machine to perform fusion, so as to obtain the prediction error minimum model corresponding to each working machine. The method disclosed by the invention can obviously improve the final learning precision of the working machine model after prediction fusion.
Description
Technical Field
The invention relates to the field of machine learning, in particular to a method, a device, equipment and a medium for improving the learning performance of a working machine.
Background
The internet of things generates a large amount of distributed data. A typical training mode is to store the data on a server and train the model on the server; however, the communication and calculation efficiency problems of this mode are obvious: for example, the hundreds of GB of data generated by a car over several hours are a great burden to transmit and process. In general distributed machine learning, a deep neural network is adopted as the machine learning model; it has achieved unprecedented success in many applications, such as classification and pattern recognition, but it is mainly limited to offline learning. In practical applications, the working machine obtains a data stream, so online learning is an effective way to solve the problem.
In the prior art, a method for improving the learning performance of a working machine aggregates global model predictions by utilizing gPoE (generalized product of experts). One disadvantage of gPoE is that, when the local predictions provided by the working machines are used for gPoE aggregation, the obtained global prediction model has a large uncertainty, i.e. the prediction variance is large and conservative, so the conservative global prediction variance affects the final learning performance of the working machine in the distributed framework. The global prediction variance obtained by the existing aggregation algorithm is not sufficiently distinguished from the local prediction variance obtained by a working machine using its local data set and GPR (Gaussian process regression). Therefore, when the two variances are compared, the advantage of the global prediction is not obvious: the global prediction variance obtained by such an aggregation algorithm is rarely smaller than the local prediction variance of the working machine, so the improvement that a global model brings to the local prediction performance cannot be reflected in the final local fusion process.
Disclosure of Invention
In view of this, the present invention provides a method, an apparatus, a device, and a medium for improving the learning performance of a working machine. The method uses Gaussian process regression (GPR) as the prediction model of each working machine together with the rBCM (robust Bayesian committee machine) aggregation algorithm: each working machine learns the target function with its local data set to realize prediction of the test output, then sends the locally predicted expectation and variance to the server. After receiving the prediction expectations and variances of all the working machines, the server aggregates them into a global model with the rBCM algorithm and sends the obtained global prediction expectation and variance back to each working machine, so that the working machines realize the final prediction fusion. Under an online learning framework, by introducing a precision correction term, the variance of the global prediction can be made smaller, and the final learning precision of the working machine is improved.
In view of the above objects, an aspect of an embodiment of the present invention provides a method of improving learning performance of a working machine, the method including the steps of: establishing a local training data set corresponding to each working machine, and training on the local training data set through a Gaussian process regression algorithm to obtain a local prediction model corresponding to each working machine; setting an uncertainty correction term for the data of the local prediction model corresponding to each working machine through a server, and aggregating based on the rBCM (robust Bayesian committee machine) aggregation algorithm to obtain a global prediction model; and sending the global prediction model to each working machine, and setting a fusion algorithm and an uncertainty test data set for each working machine to perform fusion, so as to obtain the prediction error minimum model corresponding to each working machine.
In some embodiments, the establishing a local training data set corresponding to each working machine, and training the local training data set through a gaussian process regression algorithm to obtain a local prediction model corresponding to each working machine includes: establishing a target function and a local training data set corresponding to each working machine, and establishing a test data set through the local training data sets; and approximating the local training data set to the target function on the test data set through a Gaussian process regression algorithm to obtain a local prediction model corresponding to each working machine.
In some embodiments, the creating a local training data set corresponding to the objective function and each working machine, and the building a test data set from the local training data set includes: calculating the projection of each test data to the local training data set to obtain a local projection set; and constructing a test data set based on the neighborhood corresponding to each projection point in the local projection set.
In some embodiments, said approximating said local training data set to said objective function on said test data set by a gaussian process regression algorithm to obtain a corresponding local prediction model for said each working machine comprises: calculating Gaussian posterior probability distribution of each working machine on the test data set to obtain expectation and variance of local prediction corresponding to each working machine; and establishing a local prediction model through the expectation and the variance of the local prediction corresponding to each working machine.
In some embodiments, said calculating a gaussian posterior probability distribution over said test data set for said each work machine, obtaining said expectation and variance of said corresponding local prediction for said each work machine comprises: and selecting a kernel function matched with the calculated Gaussian posterior probability, and calculating the Gaussian posterior probability distribution of each working machine on the test data set based on the kernel function to obtain the expectation and the variance of the local prediction corresponding to each working machine.
In some embodiments, the setting, by the server, a correction term of uncertainty on the data of the local prediction model corresponding to each working machine and aggregating the correction term based on an rBCM aggregation algorithm to obtain a global prediction model includes: setting a correction term of uncertainty for the expectation and variance of the local prediction corresponding to each working machine through a server; and aggregating the expectation and variance of the local prediction corresponding to each working machine based on the uncertain correction term and the rBCM aggregation algorithm to obtain a global prediction model.
In some embodiments, the aggregating the expectation and variance of the local prediction corresponding to each working machine based on the uncertainty correction term and the rBCM aggregation algorithm to obtain a global prediction model includes: and aggregating the expectation and variance of the local prediction corresponding to each working machine based on the uncertain correction term and the rBCM aggregation algorithm to obtain the expectation and variance of the global prediction.
In some embodiments, the sending the global prediction model to each of the working machines, and fusing the set fusion algorithm and an uncertainty test data set for each of the working machines to obtain the minimum prediction error model corresponding to each of the working machines includes: transmitting the global predicted expectation and variance to each of the work machines; and setting a fusion algorithm for each working machine and fusing an uncertainty test data set to obtain a prediction error minimum model corresponding to each working machine.
In some embodiments, the fusing the set fusion algorithm for each working machine and an uncertainty test data set to obtain the minimum prediction error model for each working machine includes: setting a fusion algorithm and an uncertainty test data set for each working machine according to the global prediction variance and the local prediction variance of each working machine; and obtaining a prediction error minimum model corresponding to each working machine on the uncertainty test data set through the fusion algorithm so as to realize the error minimum of the expected value on the uncertainty test data set.
In some embodiments, said setting a fusion algorithm and an uncertainty test data set for said each work machine based on said global predicted variance and said local predicted variance for said each work machine comprises: and establishing an uncertainty test data set, and setting a fusion algorithm for each working machine according to the magnitude of the global prediction variance and the local prediction variance of the data in the uncertainty test data set.
In some embodiments, the creating an uncertainty test data set and setting a fusion algorithm for each of the working machines according to the magnitude of the global predicted variance and the local predicted variance of the data in the uncertainty test data set comprises: in response to the variance of the global prediction for data in the uncertainty test dataset not being greater than the variance of the local prediction, using the global prediction model as a minimum model of prediction error for the work machine.
In some embodiments, said creating an uncertainty test data set and setting a fusion algorithm for each of said work machines based on the magnitude of the variance of the global prediction and the variance of the local prediction of data in said uncertainty test data set further comprises: and in response to the variance of the global prediction of the data in the uncertainty test data set being greater than the variance of the local prediction, using the local prediction model corresponding to the working machine as the minimum prediction error model of the working machine.
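The two response branches above amount to a per-point selection rule: keep the global prediction wherever its variance does not exceed the local one, otherwise keep the local prediction. A minimal sketch under assumed array inputs (all names are illustrative, not from the patent):

```python
import numpy as np

def fuse_predictions(mu_local, var_local, mu_global, var_global):
    """Per-point fusion rule from the embodiments above: use the global
    prediction where its variance is not greater than the local variance,
    otherwise fall back to the local prediction."""
    mu_local, var_local = np.asarray(mu_local, float), np.asarray(var_local, float)
    mu_global, var_global = np.asarray(mu_global, float), np.asarray(var_global, float)
    use_global = var_global <= var_local        # "not greater than" branch
    mu = np.where(use_global, mu_global, mu_local)
    var = np.where(use_global, var_global, var_local)
    return mu, var

# first point: global wins (0.2 <= 0.5); second point: local wins (0.1 < 0.4)
mu, var = fuse_predictions([1.0, 2.0], [0.5, 0.1],
                           [1.1, 2.5], [0.2, 0.4])
```

The resulting model has, at every test point, the smaller of the two variances, which is the "prediction error minimum" property the claims describe.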
In another aspect of the embodiments of the present invention, there is provided an apparatus for improving learning performance of a working machine, the apparatus including: the system comprises a first module, a second module and a third module, wherein the first module is configured to establish a local training data set corresponding to each working machine and train the local training data set through a Gaussian process regression algorithm to obtain a local prediction model corresponding to each working machine; the second module is configured and used for setting a correction term of uncertainty for data of the local prediction model corresponding to each working machine through the server and carrying out aggregation based on an rBCM aggregation algorithm to obtain a global prediction model; and the third module is configured to send the global prediction model to each working machine, and set a fusion algorithm for each working machine and fuse an uncertainty test data set to obtain a prediction error minimum model corresponding to each working machine.
In some embodiments, the first module is further configured to: establishing a target function and a local training data set corresponding to each working machine, and establishing a test data set through the local training data sets; and approximating the local training data set to the target function on the test data set through a Gaussian process regression algorithm to obtain a local prediction model corresponding to each working machine.
In some embodiments, the first module is further configured to: calculating the projection of each test data to the local training data set to obtain a local projection set; and constructing a test data set based on the neighborhood corresponding to each projection point in the local projection set.
In some embodiments, the first module is further configured to: calculating Gaussian posterior probability distribution of each working machine on the test data set to obtain expectation and variance of local prediction corresponding to each working machine; and establishing a local prediction model through the expectation and the variance of the local prediction corresponding to each working machine.
In some embodiments, the first module is further configured to: and selecting a kernel function matched with the calculated Gaussian posterior probability, and calculating the Gaussian posterior probability distribution of each working machine on the test data set based on the kernel function to obtain the expectation and the variance of the local prediction corresponding to each working machine.
In some embodiments, the second module is further configured to: setting a correction term of uncertainty for the expectation and variance of the local prediction corresponding to each working machine through a server; and aggregating the expectation and variance of the local prediction corresponding to each working machine based on the uncertain correction term and the rBCM aggregation algorithm to obtain a global prediction model.
In another aspect of the embodiments of the present invention, there is also provided a computer device, including at least one processor; and a memory storing computer instructions executable on the processor, the instructions when executed by the processor implementing the steps of any of the methods described above.
In another aspect of the embodiments of the present invention, a computer-readable storage medium is also provided, which stores a computer program that, when executed by a processor, implements the steps of any one of the methods described above.
The invention has at least the following beneficial effects: the invention provides a method, a device, equipment and a medium for improving the learning performance of a working machine. The method adopts Gaussian process regression (GPR) as the prediction model of the working machine, performs global model aggregation through the rBCM algorithm, and sends the obtained global prediction expectation and variance to each working machine to realize the final prediction fusion of the working machine. The global prediction precision can be improved, that is, the global model prediction variance (the uncertainty) is greatly reduced, so that a better model fusion effect on the working machine can be realized. In particular, comparing the local variance and the global variance, if the global prediction variance is very small, the fusion rule of using the global model to replace the local model is more valuable for a working machine with a large local prediction variance. On the one hand, the rBCM global model aggregation algorithm reduces the uncertainty of the global prediction, i.e. its conservativeness; on the other hand, given the global prediction variance obtained by the server through rBCM, the working machine, by comparing the global and local model variances, remarkably improves its final learning precision after prediction fusion.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other embodiments can be obtained according to the drawings without creative efforts.
FIG. 1 is a schematic diagram of an embodiment of a method for improving learning performance of a work machine according to the present invention;
FIG. 2 is a schematic diagram of an embodiment of an apparatus for improving learning performance of a working machine according to the present invention;
FIG. 3 is a schematic diagram of one embodiment of a computer device provided by the present invention;
fig. 4 is a schematic diagram of an embodiment of a computer-readable storage medium provided in the present invention.
Detailed Description
Embodiments of the present invention are described below. However, it is to be understood that the disclosed embodiments are merely examples and that other embodiments may take various and alternative forms.
In addition, it should be noted that all expressions using "first" and "second" in the embodiments of the present invention are used to distinguish two entities or parameters that have the same name but are not identical. "First" and "second" are used only for convenience of expression, should not be construed as limiting the embodiments of the present invention, and this note will not be repeated in subsequent embodiments. The terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus.
One or more embodiments of the present application will be described below in conjunction with the following drawings.
In view of the above objects, a first aspect of an embodiment of the present invention proposes an embodiment of a method of improving learning performance of a working machine. Fig. 1 is a schematic diagram illustrating an embodiment of a method for improving the learning performance of a working machine according to the present invention. As shown in fig. 1, a method for improving learning performance of a working machine according to an embodiment of the present invention includes the following steps:
s1, establishing a local training data set corresponding to each working machine, and training the local training data set through a Gaussian process regression algorithm to obtain a local prediction model corresponding to each working machine;
s2, setting an uncertain correction term for the data of the local prediction model corresponding to each working machine through a server, and aggregating the correction terms based on an rBCM (random binary coded modulation) aggregation algorithm to obtain a global prediction model;
and S3, sending the global prediction model to each working machine, and setting a fusion algorithm for each working machine and fusing an uncertainty test data set to obtain a prediction error minimum model corresponding to each working machine.
In view of the above object, the first aspect of the embodiments of the present invention also provides another embodiment of a method for improving the learning performance of a working machine.
Distributed machine learning is used to address situations where the amount of computation is too large, the training data is too large, or the model is too large. For the case of too large an amount of computation, multi-thread or multi-machine parallel operation based on shared memory (or virtual memory) can be adopted. For the case of too much training data, the data needs to be divided and distributed to a plurality of working nodes for training, so that the amount of local data at each working node stays within tolerance. Each working node trains a sub-model on its local data and communicates with other working nodes according to certain rules (the communication content is mainly sub-model parameters or parameter updates), so that the training results of all working nodes can finally be effectively integrated into a global machine learning model. For the case where the model is too large, the model needs to be divided and distributed to different working nodes for training. Unlike data parallelism, the dependency between sub-models under the model-parallel framework is very strong, because the output of one sub-model may be the input of another sub-model; if the intermediate calculation results are not communicated, the training of the whole model cannot be completed.
In general distributed machine learning, a deep neural network is adopted as the machine learning model and is mainly applied to pattern classification and pattern recognition, but such distributed machine learning is limited to offline learning. In practical applications a working machine obtains a data stream in real time, so online learning is required, and Gaussian process regression is one effective means. The Gaussian process model can be made equivalent to existing machine learning models, including Bayesian linear models and multi-layer neural networks. According to the central limit theorem, assuming that the weights in a neural network follow a Gaussian normal distribution, as the width of the neural network approaches infinity such a neural network is equivalent to a Gaussian process regression. However, Gaussian process regression is a nonparametric statistical probability model: unlike conventional learning models such as linear regression, logistic regression, and neural networks, which need to solve an optimization problem that minimizes a loss function to obtain the optimal model parameters, Gaussian process regression does not need to solve an optimization problem. Given training data and test inputs, the prediction of Gaussian process regression is divided into two steps, inference and prediction. The inference process assumes that the function to be learned obeys a Gaussian process, gives the Gaussian prior probability distribution of the model, and then uses the observed values and the Bayesian rule to solve the Gaussian posterior probability distribution of the model. When the local model prediction is completed, each working machine sends the obtained local prediction (expectation and variance) to the server, and the server completes the calculation of the global model, for example by using an average aggregation algorithm to solve the global model.
Finally, the server sends the calculated global model (global expectation and variance) back to each working machine, and the working machine performs fusion calculation with the obtained global model and the local model obtained by its own training, in the hope of obtaining an updated prediction of the target function that is closer to the true value of the function.
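The average aggregation algorithm mentioned above can be sketched as follows (a deliberately naive baseline, shown for contrast with the rBCM aggregation used by the invention; the function and variable names are illustrative, not from the patent):

```python
import numpy as np

def average_aggregate(mus, vars_):
    """Plain averaging of the local expectations and variances, the
    simplest server-side aggregation mentioned in the background above."""
    mus = np.asarray(mus, float)
    vars_ = np.asarray(vars_, float)
    return mus.mean(), vars_.mean()

mu_g, var_g = average_aggregate([1.0, 2.0, 3.0], [0.1, 0.2, 0.3])
```

Note that plain averaging cannot make the global variance smaller than the smallest local variance, which is precisely the conservativeness problem the invention addresses with rBCM.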
The method for improving the learning performance of the working machine uses Gaussian Process Regression (GPR) as a prediction model of the working machine, and learns functions by using a local data set to realize prediction of test output. Each work machine then sends the locally predicted expectations and variances to the server. And after receiving the prediction expectation and the variance of all the working machines, the server carries out global model aggregation through an rBCM algorithm, and sends the obtained global prediction expectation and variance to each working machine to realize final prediction fusion of the working machines. The global aggregation of the rBCM algorithm can improve the accuracy of global prediction, namely the prediction variance (uncertainty) of a global model is greatly reduced, so that a better model fusion effect of a working machine can be realized. In particular, considering the comparison between the local variance and the global variance, if the global prediction variance is very small, the fusion algorithm of replacing the local model with the global model is more valuable for the working machine with the larger local prediction variance.
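The rBCM aggregation described above can be sketched as follows, taking each correction term beta_i as the differential entropy between the GP prior and the local posterior, which is the standard rBCM choice (the patent's exact correction term may differ; all names are illustrative, not from the patent):

```python
import numpy as np

def rbcm_aggregate(mus, vars_, prior_var=1.0):
    """rBCM aggregation sketch: precision-weighted combination of local
    predictions with entropy-based weights beta_i and a prior-precision
    correction term (1 - sum(beta_i)) / prior_var."""
    mus = np.asarray(mus, float)
    vars_ = np.asarray(vars_, float)
    beta = 0.5 * (np.log(prior_var) - np.log(vars_))   # correction terms
    prec_g = np.sum(beta / vars_) + (1.0 - np.sum(beta)) / prior_var
    var_g = 1.0 / prec_g
    mu_g = var_g * np.sum(beta * mus / vars_)
    return mu_g, var_g

mu_g, var_g = rbcm_aggregate([1.0, 1.2], [0.05, 0.1], prior_var=1.0)
```

In this toy run the aggregated variance falls below every local variance, which is the reduced conservativeness that the description emphasizes as the advantage of rBCM over averaging.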
Define the objective function as f: X → R, where X ⊂ R^d is the d-dimensional input space. Without loss of generality, we assume that the output is one-dimensional. At time t, given an input x_t, the corresponding output is y_t = f(x_t) + ε_t,
where ε_t obeys a Gaussian probability distribution with mean 0 and variance σ_n², i.e. ε_t ~ N(0, σ_n²). Define a training set of the form D = (X, y), where X = {x_1, …, x_N} is the set of input data and y = [y_1, …, y_N]^T is the column vector that aggregates the outputs. The Gaussian process regression objective is to utilize the training set D to approximate the function f on a test data set.
Define a symmetric positive semi-definite kernel function k, i.e. k: X × X → R with k(x, x') = k(x', x), where ||·|| denotes the metric on X. Let k(X, x_*) return a column vector whose i-th element equals k(x_i, x_*). Assume the function f is a sample from a Gaussian process prior probability distribution with mean function m(·) and kernel function k(·,·). Then the training output y and the test output f_* = f(x_*) obey the joint probability distribution
[y; f_*] ~ N([m(X); m(x_*)], [K(X, X) + σ_n² I, k(X, x_*); k(x_*, X), k(x_*, x_*)]),
where m(X) and m(x_*) return the vectors formed by evaluating the mean function at X and at x_*, and K(X, X) returns a matrix whose element in row i and column j is k(x_i, x_j).
Using the properties of the Gaussian process, Gaussian process regression uses the training set D to predict the output at the test data set. This output obeys a normal distribution, i.e. f_* | X, y, x_* ~ N(μ_*, σ_*²), where
μ_* = m(x_*) + k(x_*, X)[K(X, X) + σ_n² I]^(−1)(y − m(X)),
σ_*² = k(x_*, x_*) − k(x_*, X)[K(X, X) + σ_n² I]^(−1) k(X, x_*).
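The Gaussian process regression prediction described above can be sketched numerically as follows (a minimal NumPy illustration with a zero mean function and a squared-exponential kernel; all names and hyperparameter values are illustrative, not from the patent):

```python
import numpy as np

def rbf_kernel(A, B, sf2=1.0, ell=1.0):
    # squared-exponential kernel: sf2 * exp(-||a - b||^2 / (2 ell^2))
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return sf2 * np.exp(-0.5 * d2 / ell**2)

def gpr_posterior(X, y, Xs, noise=1e-2):
    """Zero-mean GP posterior: predictive expectation and variance at the
    test inputs Xs, following the standard GPR posterior expressions."""
    K = rbf_kernel(X, X) + noise * np.eye(len(X))   # K(X,X) + sigma_n^2 I
    Ks = rbf_kernel(X, Xs)                          # k(X, x_*)
    Kss = rbf_kernel(Xs, Xs)                        # k(x_*, x_*)
    alpha = np.linalg.solve(K, y)
    mu = Ks.T @ alpha
    var = np.diag(Kss - Ks.T @ np.linalg.solve(K, Ks))
    return mu, var

X = np.linspace(0.0, 3.0, 20)[:, None]
y = np.sin(X).ravel()
mu, var = gpr_posterior(X, y, np.array([[1.5]]))
```

With dense training data around the test input, the predictive variance collapses toward the noise level; this expectation/variance pair is exactly what each working machine later reports to the server.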
In distributed machine learning, consider a network with $M$ working machines and define this set as $\mathcal{V}=\{1,2,\dots,M\}$. At each time instant $t$, each working machine $i\in\mathcal{V}$ uses its local training data $\mathcal{D}_i=\{X_i,\mathbf{y}_i\}$ to predict the function output at a test input $x_*$. The local prediction trained by machine $i$ is the pair

$$\mu_i(x_*),\ \sigma_i^2(x_*),\qquad i\in\mathcal{V}.$$

Under the federated learning framework, every working machine $i\in\mathcal{V}$ sends its trained local prediction $\{\mu_i(x_*),\sigma_i^2(x_*)\}$ to the server.
The specific steps of distributed training and fusion are as follows:
(1) Construct a training subset based on projection onto the training set. Define the distance between two training data points $x$ and $x'$ as $d(x,x')=\|x-x'\|$, and the distance from a data point $x$ to a set $S$ as $d(x,S)=\min_{s\in S}d(x,s)$. Define the projection of a data point $x$ onto a set $S$ as the set $\mathrm{proj}_S(x)=\{s\in S:\ d(x,s)=d(x,S)\}$.

Consider each working machine $i\in\mathcal{V}$ and its local training data set $\mathcal{D}_i=\{X_i,\mathbf{y}_i\}$. For a test data point $x_*\in X_*$, calculate the projection of the test data $x_*$ onto the training inputs $X_i$, denoted as:

$$\mathrm{proj}_{X_i}(x_*)=\arg\min_{x\in X_i}\|x_*-x\|.$$

For each working machine $i$ and its projection set $\mathrm{proj}_{X_i}(X_*)$, take out each projection point, denoted $p_j$; the subscript $j$ here denotes the $j$-th projection point. Then, for each projection point $p_j$, find a neighborhood $N(p_j)\subseteq X_i$ of it, such that $p_j\in N(p_j)$ and the points of $N(p_j)$ are the training inputs closest to $p_j$. It should be noted here that the number of points in each neighborhood is adjustable, and a fixed value can be selected. The union of these neighborhoods, together with the corresponding outputs, forms the reduced training subset $\tilde{\mathcal{D}}_i$.
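The projection-and-neighborhood construction of step (1) can be sketched as follows. The helper below is illustrative, not taken from the patent: it takes the nearest training input as the projection of each test point and keeps a fixed-size neighborhood of `k` training points around it (the text notes this size is adjustable):

```python
import numpy as np

def projection_subset(X_train, X_test, k=3):
    # For each test point: find its projection (nearest training input),
    # then collect the k training points nearest to that projection.
    # Returns the sorted indices of the union of all neighborhoods.
    keep = set()
    for xs in X_test:
        d = np.linalg.norm(X_train - xs, axis=1)
        p = int(np.argmin(d))                              # projection of xs onto X_train
        dp = np.linalg.norm(X_train - X_train[p], axis=1)
        keep.update(int(j) for j in np.argsort(dp)[:k])    # fixed-size neighborhood
    return sorted(keep)

X_train = np.arange(10.0).reshape(-1, 1)       # training inputs 0, 1, ..., 9
X_test = np.array([[2.2], [7.8]])
idx = projection_subset(X_train, X_test, k=3)  # indices forming the reduced subset
```

Training regression on this reduced subset rather than the full local data set keeps the Gaussian process computation focused on the region of the test inputs.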
(2) Select a kernel function. In practical applications, the squared exponential (RBF) kernel is generally selected:

$$k(x,x')=\sigma_f^2\exp\!\left(-\frac{\|x-x'\|^2}{2l^2}\right),$$

where $\sigma_f^2$ is the signal variance and $l$ is the length scale.
(3) For each working machine $i$, calculate the Gaussian posterior probability distribution on the new training subset $\tilde{\mathcal{D}}_i=\{\tilde{X}_i,\tilde{\mathbf{y}}_i\}$, i.e.:

$$\mu_i(x_*) = m(x_*) + k(\tilde{X}_i,x_*)^{\top}\left[K(\tilde{X}_i,\tilde{X}_i)+\sigma_e^2 I\right]^{-1}(\tilde{\mathbf{y}}_i-m(\tilde{X}_i)),$$
$$\sigma_i^2(x_*) = k(x_*,x_*) - k(\tilde{X}_i,x_*)^{\top}\left[K(\tilde{X}_i,\tilde{X}_i)+\sigma_e^2 I\right]^{-1}k(\tilde{X}_i,x_*).\tag{7}$$

On the training subset $\tilde{\mathcal{D}}_i$, the local prediction $\mu_i(x_*)$ and $\sigma_i^2(x_*)$ is obtained using equation (7), and the local prediction is then sent to the server. It can be verified that the local prediction error is less than an upper bound determined by the local prediction variance; that is, for the test input $x_*$, the following inequality holds with high probability:

$$|f(x_*)-\mu_i(x_*)|\le\sqrt{\gamma}\,\sigma_i(x_*),\tag{8}$$

where $\gamma>0$ is a confidence parameter.
(4) The server utilizes the rBCM aggregation algorithm to aggregate the local predicted values, giving the global prediction expectation and variance:

$$\sigma_{\mathrm{g}}^{-2}(x_*) = \sum_{i\in\mathcal{V}}\beta_i\,\sigma_i^{-2}(x_*) + \Big(1-\sum_{i\in\mathcal{V}}\beta_i\Big)\sigma_{**}^{-2},$$
$$\mu_{\mathrm{g}}(x_*) = \sigma_{\mathrm{g}}^{2}(x_*)\sum_{i\in\mathcal{V}}\beta_i\,\sigma_i^{-2}(x_*)\,\mu_i(x_*),\tag{9}$$

where $\sigma_{**}^2=k(x_*,x_*)$ is the prior variance at the test input.
Here $\beta_i=\tfrac{1}{2}\big(\log\sigma_{**}^{2}-\log\sigma_i^{2}(x_*)\big)$ is an uncertainty correction term that makes the global prediction variance smaller. The global prediction expectation obtained by the rBCM algorithm has consistency, that is, when the training data is large enough, the global expectation $\mu_{\mathrm{g}}$ can approximate the function $f$. Therefore, the approximation error satisfies, with high probability,

$$|f(x_*)-\mu_{\mathrm{g}}(x_*)|\le\sqrt{\gamma}\,\sigma_{\mathrm{g}}(x_*).\tag{10}$$
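The rBCM aggregation of step (4) can be sketched numerically. This follows the published robust Bayesian committee machine combination rule with differential-entropy weights; the patent does not reproduce its exact correction term in the extracted text, so treat the weight formula here as an assumption:

```python
import numpy as np

def rbcm_aggregate(mus, local_vars, prior_var=1.0):
    # Robust Bayesian committee machine aggregation of local GP predictions.
    # beta_i = 0.5 * (log prior_var - log var_i) is the uncertainty correction term.
    mus = np.asarray(mus, dtype=float)
    local_vars = np.asarray(local_vars, dtype=float)
    beta = 0.5 * (np.log(prior_var) - np.log(local_vars))
    precision = (beta / local_vars).sum() + (1.0 - beta.sum()) / prior_var
    var_g = 1.0 / precision                         # global prediction variance
    mu_g = var_g * (beta * mus / local_vars).sum()  # global prediction expectation
    return mu_g, var_g

# Two working machines predicting the same test point
mu_g, var_g = rbcm_aggregate(mus=[1.0, 1.2], local_vars=[0.04, 0.09])
```

In this example the aggregated variance falls below the smallest local variance, which is the property the fusion step relies on: a confident machine pulls the global estimate toward its prediction, while the correction terms discount uninformative (near-prior) machines.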
(5) The server sends the global prediction expectation $\mu_{\mathrm{g}}(x_*)$ and variance $\sigma_{\mathrm{g}}^2(x_*)$ to each working machine. According to the global prediction variance and the local prediction variance $\sigma_i^2(x_*)$, a fusion algorithm is designed for each working machine so that the fused prediction expectation better approximates the true value of the objective function $f$. Construct the set of test data with small uncertainty as follows:

$$\Omega_i=\big\{x_*\in X_*:\ \sigma_{\mathrm{g}}^2(x_*)\le\sigma_i^2(x_*)\big\}.$$
If this set is not empty, the global prediction expectation and variance from the server are used on it; if it is empty, the local prediction expectation and variance of the working machine are used. Because the global prediction variance obtained by the rBCM algorithm is smaller, the global prediction tends to dominate when the two variances are compared in the working machine's fusion algorithm. If the local variance is large, replacing the local values with the global prediction expectation and variance significantly improves the local prediction of the working machine. On the other hand, by comparing the upper bounds of equation (8) and equation (10), when the global prediction variance is small enough, the confidence interval becomes narrower, reflecting a smaller approximation error.
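The per-machine fusion of step (5) reduces to a point-wise comparison of the two variances (consistent with claims 11 and 12). A minimal sketch, with all numerical values hypothetical:

```python
import numpy as np

def fuse(mu_local, var_local, mu_global, var_global):
    # Where the global prediction variance is not greater than the local one,
    # replace the local prediction with the global prediction; otherwise keep local.
    mu_l = np.asarray(mu_local, dtype=float)
    v_l = np.asarray(var_local, dtype=float)
    mu_g = np.asarray(mu_global, dtype=float)
    v_g = np.asarray(var_global, dtype=float)
    use_global = v_g <= v_l          # the small-uncertainty test set
    return np.where(use_global, mu_g, mu_l), np.where(use_global, v_g, v_l)

# Hypothetical predictions at two test points
mu, var = fuse(mu_local=[0.9, 2.1], var_local=[0.5, 0.02],
               mu_global=[1.0, 2.0], var_global=[0.05, 0.04])
```

The fused result keeps, at every test point, the prediction with the smaller variance, so the fused variance is never larger than the local one.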
In a second aspect of an embodiment of the present invention, an apparatus for improving learning performance of a working machine is provided. Fig. 2 is a schematic diagram illustrating an embodiment of an apparatus for improving learning performance of a working machine according to the present invention. As shown in fig. 2, the apparatus for improving learning performance of a working machine according to the present invention includes: the first module 011 is configured to establish a local training data set corresponding to each working machine, and train the local training data set through a gaussian process regression algorithm to obtain a local prediction model corresponding to each working machine; a second module 012, configured to set, by the server, a correction term of uncertainty for the data of the local prediction model corresponding to each working machine, and aggregate the data based on the rBCM aggregation algorithm to obtain a global prediction model; and a third module 013, configured to send the global prediction model to each of the working machines, and set a fusion algorithm for each of the working machines and fuse an uncertainty test data set to obtain a minimum prediction error model corresponding to each of the working machines.
The first module 011 is further configured for: establishing a target function and a local training data set corresponding to each working machine, and establishing a test data set through the local training data sets; and approximating the local training data set to the target function on the test data set through a Gaussian process regression algorithm to obtain a local prediction model corresponding to each working machine.
The first module 011 is further configured for: calculating the projection of each test data to the local training data set to obtain a local projection set; and constructing a test data set based on the neighborhood corresponding to each projection point in the local projection set.
The first module 011 is further configured for: calculating Gaussian posterior probability distribution of each working machine on the test data set to obtain expectation and variance of local prediction corresponding to each working machine; and establishing a local prediction model through the expectation and the variance of the local prediction corresponding to each working machine.
The first module 011 is further configured for: and selecting a kernel function matched with the calculated Gaussian posterior probability, and calculating the Gaussian posterior probability distribution of each working machine on the test data set based on the kernel function to obtain the expectation and the variance of the local prediction corresponding to each working machine.
The second module 012 is further configured to: setting a correction term of uncertainty for the expectation and variance of the local prediction corresponding to each working machine through a server; and aggregating the expectation and variance of the local prediction corresponding to each working machine based on the uncertain correction term and the rBCM aggregation algorithm to obtain a global prediction model.
In view of the above object, a third aspect of the embodiments of the present invention provides a computer device, and fig. 3 is a schematic diagram illustrating an embodiment of a computer device provided by the present invention. As shown in FIG. 3, an embodiment of a computer device provided by the present invention includes the following modules: at least one processor 021; and a memory 022, the memory 022 storing computer instructions 023 executable on the processor 021, the computer instructions 023, when executed by the processor 021, implementing the steps of the method as described above.
The invention also provides a computer readable storage medium. FIG. 4 is a schematic diagram illustrating an embodiment of a computer-readable storage medium provided by the present invention. As shown in fig. 4, the computer readable storage medium 031 stores a computer program 032 which, when executed by a processor, performs the method as described above.
Finally, it should be noted that, as one of ordinary skill in the art can appreciate that all or part of the processes of the methods of the above embodiments can be implemented by a computer program to instruct related hardware, and the program of the method for setting system parameters can be stored in a computer readable storage medium, and when executed, the program can include the processes of the embodiments of the methods as described above. The storage medium of the program may be a magnetic disk, an optical disk, a Read Only Memory (ROM), a Random Access Memory (RAM), or the like. The embodiments of the computer program may achieve the same or similar effects as any of the above-described method embodiments.
Furthermore, the methods disclosed according to embodiments of the present invention may also be implemented as a computer program executed by a processor, which may be stored in a computer-readable storage medium. Which when executed by a processor performs the above-described functions defined in the methods disclosed in embodiments of the invention.
Further, the above method steps and system elements may also be implemented using a controller and a computer readable storage medium for storing a computer program for causing the controller to implement the functions of the above steps or elements.
Those of skill would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as software or hardware depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the disclosed embodiments of the present invention.
In one or more exemplary designs, the functions may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Computer-readable media includes both computer storage media and communication media, including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer. By way of example, and not limitation, such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a general-purpose or special-purpose computer or processor. Also, any connection is properly termed a computer-readable medium. For example, if the software is transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, Digital Subscriber Line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, include Compact Disc (CD), laser disc, optical disc, Digital Versatile Disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
The foregoing is an exemplary embodiment of the present disclosure, but it should be noted that various changes and modifications could be made herein without departing from the scope of the present disclosure as defined by the appended claims. The functions, steps and/or actions of the method claims in accordance with the disclosed embodiments described herein need not be performed in any particular order. Furthermore, although elements of the disclosed embodiments of the invention may be described or claimed in the singular, the plural is contemplated unless limitation to the singular is explicitly stated.
It should be understood that, as used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly supports the exception. It should also be understood that "and/or" as used herein is meant to include any and all possible combinations of one or more of the associated listed items.
The numbers of the embodiments disclosed in the above embodiments of the present invention are merely for description, and do not represent the advantages or disadvantages of the embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, and the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
Those of ordinary skill in the art will understand that: the discussion of any embodiment above is meant to be exemplary only, and is not intended to intimate that the scope of the disclosure, including the claims, of embodiments of the invention is limited to these examples; within the idea of an embodiment of the invention, also technical features in the above embodiment or in different embodiments may be combined and there are many other variations of the different aspects of the embodiments of the invention as described above, which are not provided in detail for the sake of brevity. Therefore, any omissions, modifications, substitutions, improvements, and the like that may be made without departing from the spirit and principles of the embodiments of the present invention are intended to be included within the scope of the embodiments of the present invention.
Claims (20)
1. A method of improving learning performance of a work machine, comprising:
establishing a local training data set corresponding to each working machine, and training the local training data set through a Gaussian process regression algorithm to obtain a local prediction model corresponding to each working machine;
setting an uncertainty correction term for the data of the local prediction model corresponding to each working machine through a server, and aggregating the data based on an rBCM (robust Bayesian committee machine) aggregation algorithm to obtain a global prediction model;
and sending the global prediction model to each working machine, and setting a fusion algorithm for each working machine and fusing an uncertainty test data set to obtain a prediction error minimum model corresponding to each working machine.
2. The method of claim 1, wherein the establishing a local training data set corresponding to each working machine and training the local training data set through a gaussian process regression algorithm to obtain a local prediction model corresponding to each working machine comprises:
establishing a target function and a local training data set corresponding to each working machine, and establishing a test data set through the local training data sets;
and approximating the local training data set to the target function on the test data set through a Gaussian process regression algorithm to obtain a local prediction model corresponding to each working machine.
3. The method of claim 2, wherein the establishing a local training data set corresponding to the objective function and each working machine, and the constructing a test data set from the local training data set comprises:
calculating the projection of each test data to the local training data set to obtain a local projection set;
and constructing a test data set based on the neighborhood corresponding to each projection point in the local projection set.
4. The method of claim 2, wherein approximating the local training data set to the objective function on the test data set by a gaussian process regression algorithm to obtain the local prediction model for each of the work machines comprises:
calculating Gaussian posterior probability distribution of each working machine on the test data set to obtain expectation and variance of local prediction corresponding to each working machine;
and establishing a local prediction model through the expectation and the variance of the local prediction corresponding to each working machine.
5. The method of claim 4, wherein said calculating a Gaussian posterior probability distribution over said test data set for said each work machine, and obtaining the expectation and variance of the corresponding local prediction for said each work machine comprises:
and selecting a kernel function matched with the calculated Gaussian posterior probability, and calculating the Gaussian posterior probability distribution of each working machine on the test data set based on the kernel function to obtain the expectation and the variance of the local prediction corresponding to each working machine.
6. The method according to claim 4, wherein the setting, by the server, a correction term of uncertainty for the data of the local prediction model corresponding to each working machine and aggregating the correction term based on an rBCM aggregation algorithm to obtain a global prediction model comprises:
setting a correction term of uncertainty for the expectation and variance of the local prediction corresponding to each working machine through a server;
and aggregating the expectation and variance of the local prediction corresponding to each working machine based on the uncertain correction term and the rBCM aggregation algorithm to obtain a global prediction model.
7. The method according to claim 6, wherein the aggregating the expectation and variance of the local prediction corresponding to each of the working machines based on the uncertainty correction term and the rBCM aggregation algorithm to obtain a global prediction model comprises:
and aggregating the expectation and variance of the local prediction corresponding to each working machine based on the uncertain correction term and the rBCM aggregation algorithm to obtain the expectation and variance of the global prediction.
8. The method of claim 7, wherein sending the global prediction model to each of the plurality of working machines, and fusing the set fusion algorithm and an uncertainty test data set for each of the plurality of working machines to obtain the minimum prediction error model for each of the plurality of working machines comprises:
transmitting the global predicted expectation and variance to each of the work machines;
and setting a fusion algorithm for each working machine and fusing an uncertainty test data set to obtain a prediction error minimum model corresponding to each working machine.
9. The method of claim 8, wherein said fusing said set of uncertainty test data with said each work machine to obtain a corresponding prediction error minimization model for said each work machine comprises:
setting a fusion algorithm and an uncertainty test data set for each working machine according to the global prediction variance and the local prediction variance of each working machine;
and obtaining a prediction error minimum model corresponding to each working machine on the uncertainty test data set through the fusion algorithm so as to realize the error minimum of the expected value on the uncertainty test data set.
10. The method of claim 9, wherein said setting a fusion algorithm and an uncertainty test data set for said each work machine based on said global predicted variance and said local predicted variance for said each work machine comprises:
and establishing an uncertainty test data set, and setting a fusion algorithm for each working machine according to the variance of global prediction and the variance of local prediction of data in the uncertainty test data set.
11. The method of claim 10, wherein said creating an uncertainty test data set and setting a fusion algorithm for each of said work machines based on the magnitude of the global predicted variance and the local predicted variance of data in said uncertainty test data set comprises:
in response to a variance of a global prediction of data in the uncertainty test dataset being not greater than a variance of a local prediction, using the global prediction model as a minimum model of prediction error for the work machine.
12. The method of claim 10, wherein said creating an uncertainty test data set and setting a fusion algorithm for each of said work machines based on the magnitude of the global predicted variance and the local predicted variance of data in said uncertainty test data set further comprises:
and in response to the variance of the global prediction of the data in the uncertainty test data set being greater than the variance of the local prediction, using the local prediction model corresponding to the working machine as the minimum prediction error model of the working machine.
13. An apparatus for improving learning performance of a working machine, the apparatus comprising:
the system comprises a first module, a second module and a third module, wherein the first module is configured to establish a local training data set corresponding to each working machine, and train the local training data set through a Gaussian process regression algorithm to obtain a local prediction model corresponding to each working machine;
the second module is configured to set an uncertainty correction term for the data of the local prediction model corresponding to each working machine through the server, and aggregate the data based on an rBCM (robust Bayesian committee machine) aggregation algorithm to obtain a global prediction model; and
and the third module is configured to send the global prediction model to each working machine, and set a fusion algorithm and fuse an uncertainty test data set for each working machine to obtain a prediction error minimum model corresponding to each working machine.
14. The apparatus of claim 13, wherein the first module is further configured to:
establishing a target function and a local training data set corresponding to each working machine, and establishing a test data set through the local training data sets;
and approximating the local training data set to the target function on the test data set through a Gaussian process regression algorithm to obtain a local prediction model corresponding to each working machine.
15. The apparatus of claim 14, wherein the first module is further configured for:
calculating the projection of each test data to the local training data set to obtain a local projection set;
and constructing a test data set based on the neighborhood corresponding to each projection point in the local projection set.
16. The apparatus of claim 14, wherein the first module is further configured for:
calculating Gaussian posterior probability distribution of each working machine on the test data set to obtain expectation and variance of local prediction corresponding to each working machine;
and establishing a local prediction model through the expectation and the variance of the local prediction corresponding to each working machine.
17. The apparatus of claim 16, wherein the first module is further configured for:
and selecting a kernel function matched with the calculated Gaussian posterior probability, and calculating the Gaussian posterior probability distribution of each working machine on the test data set based on the kernel function to obtain the expectation and the variance of the local prediction corresponding to each working machine.
18. The apparatus of claim 16, wherein the second module is further configured for:
setting a correction term of uncertainty for the expectation and variance of the local prediction corresponding to each working machine through a server;
and aggregating the expectation and variance of the local prediction corresponding to each working machine based on the uncertain correction term and the rBCM aggregation algorithm to obtain a global prediction model.
19. A computer device, comprising:
at least one processor; and
a memory storing computer instructions executable on the processor, the instructions when executed by the processor implementing the steps of the method of any one of claims 1 to 12.
20. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 12.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211394593.5A CN115456202B (en) | 2022-11-08 | 2022-11-08 | Method, device, equipment and medium for improving learning performance of working machine |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115456202A true CN115456202A (en) | 2022-12-09 |
CN115456202B CN115456202B (en) | 2023-04-07 |
Family
ID=84309944
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211394593.5A Active CN115456202B (en) | 2022-11-08 | 2022-11-08 | Method, device, equipment and medium for improving learning performance of working machine |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115456202B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117370473A (en) * | 2023-12-07 | 2024-01-09 | 苏州元脑智能科技有限公司 | Data processing method, device, equipment and storage medium based on integrity attack |
CN117473331A (en) * | 2023-12-27 | 2024-01-30 | 苏州元脑智能科技有限公司 | Stream data processing method, device, equipment and storage medium |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107451101A (en) * | 2017-07-21 | 2017-12-08 | 江南大学 | It is a kind of to be layered integrated Gaussian process recurrence soft-measuring modeling method |
US20180018586A1 (en) * | 2016-07-13 | 2018-01-18 | Fujitsu Limited | Apparatus and method for managing machine learning |
CN109688110A (en) * | 2018-11-22 | 2019-04-26 | 顺丰科技有限公司 | DGA domain name detection model construction method, device, server and storage medium |
CN112381145A (en) * | 2020-11-16 | 2021-02-19 | 江康(上海)科技有限公司 | Gaussian process regression multi-model fusion modeling method based on nearest correlation spectral clustering |
US20220101178A1 (en) * | 2020-09-25 | 2022-03-31 | EMC IP Holding Company LLC | Adaptive distributed learning model optimization for performance prediction under data privacy constraints |
CN114912626A (en) * | 2022-04-15 | 2022-08-16 | 上海交通大学 | Method for processing distributed data of federal learning mobile equipment based on summer pril value |
CN115174191A (en) * | 2022-06-30 | 2022-10-11 | 山东云海国创云计算装备产业创新中心有限公司 | Local prediction value safe transmission method, computer equipment and storage medium |
Non-Patent Citations (1)
Title |
---|
MARC et al.: "Distributed Gaussian Processes", Proceedings of the 32nd International Conference on Machine Learning *
Also Published As
Publication number | Publication date |
---|---|
CN115456202B (en) | 2023-04-07 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |