CN111191726A - Fault classification method based on weak supervised learning multi-layer perceptron - Google Patents

Fault classification method based on weak supervised learning multi-layer perceptron

Info

Publication number
CN111191726A
CN111191726A
Authority
CN
China
Prior art keywords
sample
label
layer
network
mlp
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911418196.5A
Other languages
Chinese (zh)
Other versions
CN111191726B (en)
Inventor
Ge Zhiqiang (葛志强)
Liao Sifen (廖思奋)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN201911418196.5A priority Critical patent/CN111191726B/en
Publication of CN111191726A publication Critical patent/CN111191726A/en
Application granted granted Critical
Publication of CN111191726B publication Critical patent/CN111191726B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/217 Validation; Performance evaluation; Active pattern learning techniques
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a process data fault classification method based on a weakly supervised learning multilayer perceptron. The model consists of a supervised classification network, built from a multilayer perceptron with BatchNormalization layers, Dropout layers and a Softmax output layer, together with a Gaussian mixture model for handling inaccurate sample labels. The multilayer perceptron learns a feature representation of the data from inaccurately labeled samples; the Gaussian mixture model then performs unsupervised clustering on the features extracted by the multilayer perceptron, and the clustering result is used to estimate the relation between the inaccurate labels of each class of samples and their latent true labels, namely the label probability transition matrix. The estimated label probability transition matrix is used to correct the network loss function for a second training of the classification network, improving the network's classification accuracy on inaccurately labeled samples. The method is applicable to the situation in which part of the labels of industrial process data samples are labeled incorrectly, i.e. fault classification under inaccurate labels.

Description

Fault classification method based on weak supervised learning multi-layer perceptron
Technical Field
The invention belongs to the field of fault diagnosis and classification in industrial processes, and particularly relates to a fault classification method based on a weakly supervised learning multi-layer perceptron.
Background
In industrial process monitoring, once a fault is detected the fault information must be analyzed further. Fault classification is an important link in this analysis: identifying the fault class facilitates recovery of the industrial process.
Traditional fault classification assumes that the labels of the collected data samples are accurate and trains models accordingly. In industrial processes, however, labels are produced by external knowledge bases, rule bases or manual annotation, so sample labels may be inaccurate. Moreover, inaccurately labeled samples are easier and cheaper to obtain than accurately labeled ones. Label inaccuracy has therefore become a characteristic of the data that a model cannot ignore, and modeling inaccurately labeled samples by weakly supervised learning can improve the model's classification accuracy on fault samples.
Disclosure of Invention
Aiming at the problem that sample labels in current industrial processes may be inaccurate, the invention provides a fault classification method based on a weakly supervised learning multi-layer perceptron.
The purpose of the invention is realized by the following technical scheme: a process data fault classification method of a multi-layer perceptron based on weakly supervised learning, wherein the weakly supervised multi-layer perceptron comprises a two-hidden-layer perceptron MLP, a Softmax output layer and a Gaussian mixture model GMM. The process data fault classification method specifically comprises the following steps:
Step one: collect samples containing inaccurate labels from the historical industrial process as the training data set $D = \{(x_i, \tilde{y}_i)\}_{i=1}^{N}$, where $x_i$ is an inaccurately labeled data sample, $\tilde{y}_i \in \{1, 2, \dots, K\}$ is the label of the sample, N represents the number of samples in the training data set, and K is the number of sample classes.
Step two: standardize the training data set D collected in step one, i.e. map each variable of the labeled sample set X to mean 0 and variance 1 to obtain the sample set $X_{std}$, and convert each sample label of the label set Y into a K-dimensional one-hot vector, obtaining the standardized data set $D_{std} = \{(x_i^{std}, \tilde{y}_i)\}_{i=1}^{N}$.
Step three: take the standardized data set $D_{std}$ as input and carry out the first supervised training of the perceptron MLP network, obtaining at the Softmax output layer the posterior probability that each sample of $X_{std}$ belongs to its label $\tilde{y}$.
Step four: take the posterior probabilities obtained in step three as the input of the Gaussian mixture model GMM, train the Gaussian mixture model, and use the trained Gaussian mixture model parameters $\{\hat{\alpha}_m, \hat{\mu}_m, \hat{\Sigma}_m\}$ to estimate the label probability transition matrix T, obtaining the estimation matrix $\hat{T}$.
Step five: correct, according to $\hat{T}$, the loss function with which the step-three MLP fits the inaccurately labeled samples, take the data set $D_{std}$ obtained in step two as input, and carry out the second supervised training of the step-three perceptron MLP, completing the weakly supervised learning and obtaining the trained WS-MLP network;
Step six: collect new industrial process data of unknown fault class, standardize the process data according to the method of step two to obtain the data set $d_{std}$, input it into the WS-MLP network trained in step five, compute the posterior probability of the sample for each fault class, and take the class with the maximum posterior probability as the class of the sample, realizing fault classification of the sample.
Further, the third step specifically comprises the following steps:
(3.1) Construct the perceptron MLP network, which consists of a first hidden layer, a BatchNormalization layer, a Dropout layer, a second hidden layer, a BatchNormalization layer, a Dropout layer and a Softmax layer connected in sequence. The weight matrices and bias vectors of the first and second hidden layers are $W_1, b_1$ and $W_2, b_2$ respectively, and the weight matrix and bias vector from the second hidden layer to the Softmax layer are $W_3, b_3$; these network parameters are denoted $\theta = \{W_1, b_1, W_2, b_2, W_3, b_3\}$.
(3.2) Take the standardized sample set $D_{std}$ as input and train the perceptron MLP network in a supervised manner with the cross-entropy loss function
$$L(\theta) = -\frac{1}{N} \sum_{i=1}^{N} \tilde{y}_i^{\top} \log h_\theta(x_i^{std})$$
where $h_\theta(\cdot)$ is the Softmax representation of the last layer of the MLP network.
The loss function adjusts the parameters of the whole perceptron MLP network through the back-propagation (BP) algorithm; after iterating until the loss converges, the parameters of the whole network are obtained and the training is complete.
Further, the fourth step specifically includes the following steps:
(4.1) Each class of the inaccurately labeled sample set consists of correctly labeled samples and incorrectly labeled samples. Make the following assumptions: the generation of inaccurate labels is independent of the input, i.e. the probability that a sample of a given class is mislabeled as any particular other class is the same; and the MLP network has perceptual consistency, i.e. the feature representations the MLP network produces for the correctly labeled samples and for the incorrectly labeled samples of each class each follow a Gaussian distribution.
According to these assumptions, one can obtain:
$$p(\tilde{y} = e_k \mid y = e_i, x) = p(\tilde{y} = e_k \mid y = e_i) = T_{ik}$$
$$p(h_\theta(x) \mid y = e_i) = N(h_\theta(x); \mu_i, \Sigma_i)$$
$$p(h_\theta(x) \mid \tilde{y} = e_i) = \sum_{k=1}^{K} p(y = e_k \mid \tilde{y} = e_i)\, N(h_\theta(x); \mu_k, \Sigma_k)$$
where $h_\theta(x)$ is the feature representation of a sample of $D_{std}$, y is the latent true label of the sample, $p(\cdot)$ denotes a probability, $e_i$, $i \in \{1, 2, \dots, K\}$, is the vector in $\mathbb{R}^K$ whose i-th element is 1 and whose other elements are 0, $\theta$ denotes all weight-matrix and bias-vector parameters of the MLP network, $\mu$ and $\Sigma$ denote the mean vector and covariance matrix of an unknown Gaussian distribution, $N(\cdot; \mu, \Sigma)$ and $N(\cdot; \mu_i, \Sigma_i)$ denote the Gaussian density of all samples and of class-i samples respectively, and T denotes the label probability transition matrix, defined by $T_{ik} = p(\tilde{y} = e_k \mid y = e_i)$.
(4.2) For the sample subset of each labeled class i, $\{h_\theta(x_n) : \tilde{y}_n = e_i\}$, build a model using a Gaussian mixture model:
$$p(h_\theta(x) \mid \tilde{y} = e_i) = \alpha_i\, N(h_\theta(x); \mu_i, \Sigma_i) + \alpha_{\neg i}\, N(h_\theta(x); \mu_{\neg i}, \Sigma_{\neg i}), \qquad \alpha_i + \alpha_{\neg i} = 1$$
where $x_n$ denotes a sample belonging to the subset of samples labeled class i, and $\neg i$ denotes the classes other than class i.
(4.3) Establish the two-component Gaussian mixture model and complete its parameter estimation with the expectation-maximization (EM) algorithm, solving for the parameters $\{\alpha_m, \mu_m, \Sigma_m\}$, $m \in \{i, \neg i\}$.
In the expectation step (E step), compute the Q function
$$Q\left(\Theta, \Theta^{(t)}\right) = \sum_{n} \sum_{m} \gamma_{nm}^{(t)} \left[ \log \alpha_m + \log N\left(h_\theta(x_n); \mu_m, \Sigma_m\right) \right]$$
where t is the iteration number. For the observed data, compute the responsibility $\gamma_{nm}$ of each mixture component:
$$\gamma_{nm}^{(t)} = \frac{\alpha_m^{(t)} N\left(h_\theta(x_n); \mu_m^{(t)}, \Sigma_m^{(t)}\right)}{\sum_{m'} \alpha_{m'}^{(t)} N\left(h_\theta(x_n); \mu_{m'}^{(t)}, \Sigma_{m'}^{(t)}\right)}$$
where $x_n$ denotes the n-th sample of the subset.
In the maximization step (M step), estimate the Gaussian mean $\mu_m$, covariance $\Sigma_m$ and mixing coefficient $\alpha_m$:
$$\mu_m^{(t+1)} = \frac{\sum_{n} \gamma_{nm}^{(t)} h_\theta(x_n)}{\sum_{n} \gamma_{nm}^{(t)}}, \qquad \Sigma_m^{(t+1)} = \frac{\sum_{n} \gamma_{nm}^{(t)} \left(h_\theta(x_n) - \mu_m^{(t+1)}\right)\left(h_\theta(x_n) - \mu_m^{(t+1)}\right)^{\top}}{\sum_{n} \gamma_{nm}^{(t)}}, \qquad \alpha_m^{(t+1)} = \frac{\sum_{n} \gamma_{nm}^{(t)}}{S_i}$$
where $S_i$ denotes the number of samples in the subset labeled class i.
Iterate the E step and the M step alternately until the model parameters converge or a preset maximum number of iterations is reached, solving for $\{\hat{\alpha}_m, \hat{\mu}_m, \hat{\Sigma}_m\}$.
(4.4) From the mixing coefficients $\hat{\alpha}_i$ and $\hat{\alpha}_{\neg i} = 1 - \hat{\alpha}_i$ solved for each labeled class, derive the estimate $\hat{T}$ of the label probability transition matrix T; under the assumption that mislabeling is independent of the input,
$$\hat{T}_{ii} = \hat{\alpha}_i, \qquad \hat{T}_{ik} = \frac{1 - \hat{\alpha}_i}{K - 1} \quad (k \neq i)$$
where $\hat{T}_{ik}$ denotes the element in row i and column k of the estimation matrix $\hat{T}$.
Further, in step five, the second training of the perceptron MLP network uses the corrected loss function
$$L_{corr}(\theta) = -\frac{1}{N} \sum_{i=1}^{N} \tilde{y}_i^{\top} \log\left( \hat{T}^{\top} h_\theta(x_i^{std}) \right)$$
Compared with the prior art, the method can model scenarios in which sample labels are inaccurate: it estimates the label probability transition matrix from the inaccurately labeled samples and uses the estimate to correct the loss function of the classification network, completing the weakly supervised learning and thereby improving the model's classification accuracy on inaccurately labeled samples.
Drawings
FIG. 1 is a Tennessee Eastman (TE) process flow diagram;
FIG. 2 is a comparison of the classification accuracy of an MLP network and the weakly supervised learning based multi-layer perceptron (WS-MLP) for 9 TE process fault cases at 5 label noise ratios.
Detailed Description
The method for classifying faults based on the weakly supervised learning multi-layer perceptron of the present invention is further described in detail with reference to the following embodiments.
A process data fault classification method of a multi-layer perceptron based on weakly supervised learning, wherein the weakly supervised multi-layer perceptron comprises a two-hidden-layer perceptron MLP, a Softmax output layer and a Gaussian mixture model GMM. The process data fault classification method specifically comprises the following steps:
Step one: collect samples containing inaccurate labels from the historical industrial process as the training data set $D = \{(x_i, \tilde{y}_i)\}_{i=1}^{N}$, where $x_i$ is an inaccurately labeled data sample, $\tilde{y}_i \in \{1, 2, \dots, K\}$ is the label of the sample, N represents the number of samples in the training data set, and K is the number of sample classes.
Step two: standardize the training data set D collected in step one, i.e. map each variable of the labeled sample set X to mean 0 and variance 1 to obtain the sample set $X_{std}$, and convert each sample label of the label set Y into a K-dimensional one-hot vector, obtaining the standardized data set $D_{std} = \{(x_i^{std}, \tilde{y}_i)\}_{i=1}^{N}$.
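By way of illustration, the standardization and one-hot encoding of step two could be sketched as follows; this is a minimal NumPy sketch, and the function name and array shapes are assumptions of this sketch rather than taken from the patent:

```python
import numpy as np

def standardize_and_encode(X, y, K):
    """Map each variable of X to zero mean / unit variance and one-hot encode y.

    X: (N, m) array of process samples; y: (N,) integer labels in {0, ..., K-1}.
    Returns the standardized sample set X_std and the (N, K) one-hot label matrix Y.
    """
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0            # guard against constant variables
    X_std = (X - mean) / std
    Y = np.eye(K)[y]               # one-hot encoding of the (possibly inaccurate) labels
    return X_std, Y
```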
Step three: take the standardized data set $D_{std}$ as input and carry out the first supervised training of the perceptron MLP network, obtaining at the Softmax output layer the posterior probability that each sample of $X_{std}$ belongs to its label $\tilde{y}$. The process specifically comprises the following substeps:
(3.1) Construct the perceptron MLP network, which consists of a first hidden layer, a BatchNormalization layer, a Dropout layer, a second hidden layer, a BatchNormalization layer, a Dropout layer and a Softmax layer connected in sequence. The weight matrices and bias vectors of the first and second hidden layers are $W_1, b_1$ and $W_2, b_2$ respectively, and the weight matrix and bias vector from the second hidden layer to the Softmax layer are $W_3, b_3$; these network parameters are denoted $\theta = \{W_1, b_1, W_2, b_2, W_3, b_3\}$.
(3.2) Take the standardized sample set $D_{std}$ as input and train the perceptron MLP network in a supervised manner with the cross-entropy loss function
$$L(\theta) = -\frac{1}{N} \sum_{i=1}^{N} \tilde{y}_i^{\top} \log h_\theta(x_i^{std})$$
where $h_\theta(\cdot)$ is the Softmax representation of the last layer of the MLP network.
The loss function adjusts the parameters of the whole perceptron MLP network through the back-propagation (BP) algorithm; after iterating until the loss converges, the parameters of the whole network are obtained and the training is complete.
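The step-three network and its first supervised training can be sketched in PyTorch as below. This is a hedged sketch, not the patent's implementation: the ReLU activations and the training-loop details are assumptions (the patent does not specify them), while the layer sizes follow the embodiment given later (34 inputs, hidden layers of 200 and 100 nodes, 9 output classes, Dropout 0.5, BatchNormalization momentum 0.5, Adam with learning rate 0.001, 30 iterations):

```python
import torch
import torch.nn as nn

class MLPNet(nn.Module):
    """Two-hidden-layer perceptron with BatchNormalization, Dropout and a Softmax output."""
    def __init__(self, n_in=34, n_h1=200, n_h2=100, n_out=9, p_drop=0.5):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(n_in, n_h1),
            nn.BatchNorm1d(n_h1, momentum=0.5),   # momentum value from the embodiment; note that
            nn.ReLU(),                            # PyTorch's momentum convention may differ from
            nn.Dropout(p_drop),                   # the framework assumed by the patent
            nn.Linear(n_h1, n_h2),
            nn.BatchNorm1d(n_h2, momentum=0.5),
            nn.ReLU(),
            nn.Dropout(p_drop),
            nn.Linear(n_h2, n_out),
        )

    def forward(self, x):
        # h_theta(x): Softmax posterior over the K fault classes
        return torch.softmax(self.body(x), dim=1)

def train_first_stage(model, loader, epochs=30, lr=1e-3):
    """First supervised training with the cross-entropy loss L(theta) of (3.2)."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for x, y_onehot in loader:                # y_onehot: (batch, K) inaccurate labels
            h = model(x).clamp_min(1e-12)         # avoid log(0)
            loss = -(y_onehot * h.log()).sum(dim=1).mean()
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```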
Step four: take the posterior probabilities obtained in step three as the input of the Gaussian mixture model GMM, train the Gaussian mixture model, and use the trained Gaussian mixture model parameters $\{\hat{\alpha}_m, \hat{\mu}_m, \hat{\Sigma}_m\}$ to estimate the label probability transition matrix T, obtaining the estimation matrix $\hat{T}$.
In general the label probability transition matrix is difficult to obtain directly. Under the assumption that inaccurate-label generation is independent of the input, and given the perceptual consistency of the MLP network, unsupervised learning can be performed on the result of the first MLP training with a Gaussian mixture model, so that the mixing coefficients learned by the Gaussian mixture model approximate the elements of the label probability transition matrix. The specific steps are as follows:
(4.1) Each class of the inaccurately labeled sample set consists of correctly labeled samples and incorrectly labeled samples. Make the following assumptions: the generation of inaccurate labels is independent of the input, i.e. the probability that a sample of a given class is mislabeled as any particular other class is the same; and the MLP network has perceptual consistency, i.e. the feature representations the MLP network produces for the correctly labeled samples and for the incorrectly labeled samples of each class each follow a Gaussian distribution.
According to these assumptions, one can obtain:
$$p(\tilde{y} = e_k \mid y = e_i, x) = p(\tilde{y} = e_k \mid y = e_i) = T_{ik}$$
$$p(h_\theta(x) \mid y = e_i) = N(h_\theta(x); \mu_i, \Sigma_i)$$
$$p(h_\theta(x) \mid \tilde{y} = e_i) = \sum_{k=1}^{K} p(y = e_k \mid \tilde{y} = e_i)\, N(h_\theta(x); \mu_k, \Sigma_k)$$
where $h_\theta(x)$ is the feature representation of a sample of $D_{std}$, y is the latent true label of the sample, $p(\cdot)$ denotes a probability, $e_i$, $i \in \{1, 2, \dots, K\}$, is the vector in $\mathbb{R}^K$ whose i-th element is 1 and whose other elements are 0, $\theta$ denotes all weight-matrix and bias-vector parameters of the MLP network, $\mu$ and $\Sigma$ denote the mean vector and covariance matrix of an unknown Gaussian distribution, $N(\cdot; \mu, \Sigma)$ and $N(\cdot; \mu_i, \Sigma_i)$ denote the Gaussian density of all samples and of class-i samples respectively, and T denotes the label probability transition matrix, defined by $T_{ik} = p(\tilde{y} = e_k \mid y = e_i)$.
(4.2) For the sample subset of each labeled class i, $\{h_\theta(x_n) : \tilde{y}_n = e_i\}$, build a model using a Gaussian mixture model:
$$p(h_\theta(x) \mid \tilde{y} = e_i) = \alpha_i\, N(h_\theta(x); \mu_i, \Sigma_i) + \alpha_{\neg i}\, N(h_\theta(x); \mu_{\neg i}, \Sigma_{\neg i}), \qquad \alpha_i + \alpha_{\neg i} = 1$$
where $x_n$ denotes a sample belonging to the subset of samples labeled class i, and $\neg i$ denotes the classes other than class i.
(4.3) Establish the two-component Gaussian mixture model and complete its parameter estimation with the expectation-maximization (EM) algorithm, solving for the parameters $\{\alpha_m, \mu_m, \Sigma_m\}$, $m \in \{i, \neg i\}$.
In the expectation step (E step), compute the Q function
$$Q\left(\Theta, \Theta^{(t)}\right) = \sum_{n} \sum_{m} \gamma_{nm}^{(t)} \left[ \log \alpha_m + \log N\left(h_\theta(x_n); \mu_m, \Sigma_m\right) \right]$$
where t is the iteration number. For the observed data, compute the responsibility $\gamma_{nm}$ of each mixture component:
$$\gamma_{nm}^{(t)} = \frac{\alpha_m^{(t)} N\left(h_\theta(x_n); \mu_m^{(t)}, \Sigma_m^{(t)}\right)}{\sum_{m'} \alpha_{m'}^{(t)} N\left(h_\theta(x_n); \mu_{m'}^{(t)}, \Sigma_{m'}^{(t)}\right)}$$
where $x_n$ denotes the n-th sample of the subset.
In the maximization step (M step), estimate the Gaussian mean $\mu_m$, covariance $\Sigma_m$ and mixing coefficient $\alpha_m$:
$$\mu_m^{(t+1)} = \frac{\sum_{n} \gamma_{nm}^{(t)} h_\theta(x_n)}{\sum_{n} \gamma_{nm}^{(t)}}, \qquad \Sigma_m^{(t+1)} = \frac{\sum_{n} \gamma_{nm}^{(t)} \left(h_\theta(x_n) - \mu_m^{(t+1)}\right)\left(h_\theta(x_n) - \mu_m^{(t+1)}\right)^{\top}}{\sum_{n} \gamma_{nm}^{(t)}}, \qquad \alpha_m^{(t+1)} = \frac{\sum_{n} \gamma_{nm}^{(t)}}{S_i}$$
where $S_i$ denotes the number of samples in the subset labeled class i.
Iterate the E step and the M step alternately until the model parameters converge or a preset maximum number of iterations is reached, solving for $\{\hat{\alpha}_m, \hat{\mu}_m, \hat{\Sigma}_m\}$.
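The EM recursion of (4.3) can be sketched in NumPy as follows; fit_two_component_gmm is an illustrative name, the random initialization and the small regularization term are assumptions of this sketch, and SciPy's multivariate_normal supplies the Gaussian density $N(\cdot; \mu, \Sigma)$:

```python
import numpy as np
from scipy.stats import multivariate_normal

def fit_two_component_gmm(F, n_iter=100, tol=1e-6, seed=0):
    """EM for a two-component GMM on the feature matrix F (S_i, d) of one labeled class.

    Returns mixing coefficients alpha (2,), means mu (2, d) and covariances sigma (2, d, d).
    """
    rng = np.random.default_rng(seed)
    S, d = F.shape
    alpha = np.array([0.5, 0.5])
    mu = F[rng.choice(S, size=2, replace=False)]             # random initial means
    sigma = np.stack([np.cov(F.T) + 1e-6 * np.eye(d)] * 2)   # shared initial covariance
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E step: responsibilities gamma[n, m] of each component for each sample
        pdf = np.stack([alpha[m] * multivariate_normal.pdf(F, mu[m], sigma[m])
                        for m in range(2)], axis=1)          # (S, 2)
        gamma = pdf / (pdf.sum(axis=1, keepdims=True) + 1e-300)
        # M step: re-estimate means, covariances and mixing coefficients
        Nm = gamma.sum(axis=0)
        mu = (gamma.T @ F) / Nm[:, None]
        for m in range(2):
            diff = F - mu[m]
            sigma[m] = (gamma[:, m, None] * diff).T @ diff / Nm[m] + 1e-6 * np.eye(d)
        alpha = Nm / S
        ll = np.log(pdf.sum(axis=1) + 1e-300).sum()          # log-likelihood for convergence
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return alpha, mu, sigma
```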
(4.4) From the mixing coefficients $\hat{\alpha}_i$ and $\hat{\alpha}_{\neg i} = 1 - \hat{\alpha}_i$ solved for each labeled class, derive the estimate $\hat{T}$ of the label probability transition matrix T; under the assumption that mislabeling is independent of the input,
$$\hat{T}_{ii} = \hat{\alpha}_i, \qquad \hat{T}_{ik} = \frac{1 - \hat{\alpha}_i}{K - 1} \quad (k \neq i)$$
where $\hat{T}_{ik}$ denotes the element in row i and column k of the estimation matrix $\hat{T}$.
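Continuing the sketch, the estimate of (4.4) could be assembled as below; reading the mixing coefficient of the correctly labeled component as $\hat{T}_{ii}$ and spreading the remainder uniformly over the other K-1 classes follows the input-independence assumption, but this exact mapping is an assumption of the sketch:

```python
import numpy as np

def estimate_transition_matrix(alpha_hat, K):
    """Assemble T_hat from alpha_hat[i], read here as the weight of the correctly
    labeled Gaussian component of class i (an assumption of this sketch)."""
    T_hat = np.empty((K, K))
    for i in range(K):
        T_hat[i, :] = (1.0 - alpha_hat[i]) / (K - 1)   # uniform mislabeling over other classes
        T_hat[i, i] = alpha_hat[i]                     # probability of keeping the true label
    return T_hat
```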
Step five: correct, according to $\hat{T}$, the loss function with which the step-three MLP fits the inaccurately labeled samples, take the data set $D_{std}$ obtained in step two as input, and carry out the second supervised training of the step-three perceptron MLP, completing the weakly supervised learning and obtaining the trained WS-MLP network.
The second training of the perceptron MLP network uses the corrected loss function
$$L_{corr}(\theta) = -\frac{1}{N} \sum_{i=1}^{N} \tilde{y}_i^{\top} \log\left( \hat{T}^{\top} h_\theta(x_i^{std}) \right)$$
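A hedged PyTorch sketch of this corrected loss follows; interpreting the correction as passing the Softmax posterior through $\hat{T}$ (a forward-style correction, with T_hat held as a fixed torch tensor) is this sketch's reading of the formula:

```python
import torch

def corrected_loss(h, y_onehot, T_hat):
    """Cross entropy after passing the Softmax posterior h (batch, K) through the
    estimated transition matrix T_hat (K, K): a sketch of the step-five loss."""
    h_corr = (h @ T_hat).clamp_min(1e-12)   # row-wise computation of T_hat^T h
    return -(y_onehot * h_corr.log()).sum(dim=1).mean()
```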
Step six: collect new industrial process data of unknown fault class, standardize the process data according to the method of step two to obtain the data set $d_{std}$, input it into the WS-MLP network trained in step five, compute the posterior probability of the sample for each fault class, and take the class with the maximum posterior probability as the class of the sample, realizing fault classification of the sample.
In order to evaluate the classification performance of the fault classification model, the $F_1$ index of each fault class is defined, with the following calculation formulas:
$$F_1 = \frac{2 \cdot precision \cdot recall}{precision + recall}$$
$$precision = \frac{TP}{TP + FP}, \qquad recall = \frac{TP}{TP + FN}$$
where TP is the number of samples of the fault class that are classified correctly, FP is the number of samples of other classes misclassified as this fault class, and FN is the number of samples of this fault class that are misclassified.
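For concreteness, the per-class F1 computation could be sketched as follows (function name and arguments are illustrative):

```python
import numpy as np

def f1_per_class(y_true, y_pred, k):
    """Per-class F1 from integer label arrays, following the TP/FP/FN definitions above."""
    tp = np.sum((y_pred == k) & (y_true == k))
    fp = np.sum((y_pred == k) & (y_true != k))
    fn = np.sum((y_pred != k) & (y_true == k))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0
```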
Examples
The performance of the fault classification method of the weakly supervised learning multi-layer perceptron is described below with reference to a specific TE process example. The TE process is a standard data set commonly used in the fields of fault diagnosis and fault classification; the whole data set includes 53 process variables, and its process flow is shown in FIG. 1. The process consists of 5 operating units: a gas-liquid separation tower, a continuous stirred-tank reactor, a partial condenser, a centrifugal compressor and a reboiler.
9 faults in the TE process are selected, and the specific conditions of the 9 selected faults are given in table 1.
Table 1: TE Process Fault Listing
For this process, 34 variables (22 process measurement variables and 12 control variables) are used as modeling variables, and classification performance is tested on the 9 classes of fault condition data.
The MLP network consists of a first hidden layer, a BatchNormalization layer, a Dropout layer, a second hidden layer, a BatchNormalization layer, a Dropout layer and a Softmax layer connected in sequence. The number of input nodes of the MLP network is 34, the two hidden layers have 200 and 100 nodes respectively, the final Softmax layer has 9 nodes, the momentum of the BatchNormalization layers is set to 0.5, the Dropout node-drop ratio is 0.5, an Adam optimizer with an initial learning rate of 0.001 is used, the batch size is 110, and the number of iterations is 30.
FIG. 2 compares the classification performance of the MLP network and the weakly supervised learning based multi-layer perceptron (WS-MLP) model under the F1 index. The MLP hidden nodes of the two networks are kept identical, and the label inaccuracy rate of the input samples is varied so that 0%, 10%, 20%, 30%, 40% and 50% of the sample labels are labeled incorrectly, in order to observe the change in the classification index F1. Except when the sample labels are accurate (i.e. 0% of sample labels are labeled incorrectly), the WS-MLP achieves a better classification result than the MLP network in every case, verifying the improvement in classification performance brought by estimating the label probability transition matrix with the Gaussian mixture model and using it to correct the MLP network loss function. Meanwhile, under accurate labels the classification performance of the WS-MLP model is close to that of the MLP network, which shows that the WS-MLP model is suitable not only for inaccurately labeled samples but also for fault classification of accurately labeled samples.
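Assuming the earlier sketches are in scope, the whole WS-MLP procedure of steps two through five could be strung together roughly as follows; the random placeholder arrays merely stand in for TE training data, and choosing the heavier mixture component as the correctly labeled one is an assumption of this sketch:

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset

K = 9
X_raw = np.random.randn(1100, 34).astype(np.float32)   # placeholder for TE training data
y_noisy = np.random.randint(0, K, size=1100)           # placeholder inaccurate labels

X_std, Y = standardize_and_encode(X_raw, y_noisy, K)   # step two
loader = DataLoader(TensorDataset(torch.tensor(X_std, dtype=torch.float32),
                                  torch.tensor(Y, dtype=torch.float32)),
                    batch_size=110, shuffle=True)

model = train_first_stage(MLPNet(), loader)            # step three: first training

model.eval()
with torch.no_grad():                                  # step four: features for the GMMs
    H = model(torch.tensor(X_std, dtype=torch.float32)).numpy()

alpha_hat = np.empty(K)
for i in range(K):                                     # two-component GMM per labeled class
    a, _, _ = fit_two_component_gmm(H[y_noisy == i])
    alpha_hat[i] = a.max()   # assumption: the heavier component is the correctly labeled one
T_hat = torch.tensor(estimate_transition_matrix(alpha_hat, K), dtype=torch.float32)

# Step five would repeat the training loop of train_first_stage on the same loader,
# replacing the cross entropy by corrected_loss(h, y_onehot, T_hat).
```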

Claims (4)

1. A process data fault classification method of a multi-layer perceptron based on weakly supervised learning, characterized in that the weakly supervised multi-layer perceptron comprises a two-hidden-layer perceptron MLP, a Softmax output layer and a Gaussian mixture model GMM, and that the process data fault classification method specifically comprises the following steps:
Step one: collect samples containing inaccurate labels from the historical industrial process as the training data set $D = \{(x_i, \tilde{y}_i)\}_{i=1}^{N}$, where $x_i$ is an inaccurately labeled data sample, $\tilde{y}_i \in \{1, 2, \dots, K\}$ is the label of the sample, N represents the number of samples in the training data set, and K is the number of sample classes.
Step two: standardize the training data set D collected in step one, i.e. map each variable of the labeled sample set X to mean 0 and variance 1 to obtain the sample set $X_{std}$, and convert each sample label of the label set Y into a K-dimensional one-hot vector, obtaining the standardized data set $D_{std} = \{(x_i^{std}, \tilde{y}_i)\}_{i=1}^{N}$.
Step three: take the standardized data set $D_{std}$ as input and carry out the first supervised training of the perceptron MLP network, obtaining at the Softmax output layer the posterior probability that each sample of $X_{std}$ belongs to its label $\tilde{y}$.
Step four: take the posterior probabilities obtained in step three as the input of the Gaussian mixture model GMM, train the Gaussian mixture model, and use the trained Gaussian mixture model parameters $\{\hat{\alpha}_m, \hat{\mu}_m, \hat{\Sigma}_m\}$ to estimate the label probability transition matrix T, obtaining the estimation matrix $\hat{T}$.
Step five: according to
Figure FDA00023516928900000110
Correcting the loss function of the inaccurate label sample fitted by the MLP of the step three, and obtaining a data set D by the step twostdAs input, a second supervised training step, namely a third sensor MLP, completes weak supervised learning to obtain a trained WS-MLP network;
Step six: collect new industrial process data of unknown fault class, standardize the process data according to the method of step two to obtain the data set $d_{std}$, input it into the WS-MLP network trained in step five, compute the posterior probability of the sample for each fault class, and take the class with the maximum posterior probability as the class of the sample, realizing fault classification of the sample.
2. The fault classification method according to claim 1, wherein step three specifically comprises the steps of:
(3.1) Construct the perceptron MLP network, which consists of a first hidden layer, a BatchNormalization layer, a Dropout layer, a second hidden layer, a BatchNormalization layer, a Dropout layer and a Softmax layer connected in sequence. The weight matrices and bias vectors of the first and second hidden layers are $W_1, b_1$ and $W_2, b_2$ respectively, and the weight matrix and bias vector from the second hidden layer to the Softmax layer are $W_3, b_3$; these network parameters are denoted $\theta = \{W_1, b_1, W_2, b_2, W_3, b_3\}$.
(3.2) Take the standardized sample set $D_{std}$ as input and train the perceptron MLP network in a supervised manner with the cross-entropy loss function
$$L(\theta) = -\frac{1}{N} \sum_{i=1}^{N} \tilde{y}_i^{\top} \log h_\theta(x_i^{std})$$
where $h_\theta(\cdot)$ is the Softmax representation of the last layer of the MLP network.
The loss function adjusts the parameters of the whole perceptron MLP network through the back-propagation (BP) algorithm; after iterating until the loss converges, the parameters of the whole network are obtained and the training is complete.
3. The fault classification method according to claim 1, wherein the fourth step specifically comprises the steps of:
(4.1) Each class of the inaccurately labeled sample set consists of correctly labeled samples and incorrectly labeled samples. Make the following assumptions: the generation of inaccurate labels is independent of the input, i.e. the probability that a sample of a given class is mislabeled as any particular other class is the same; and the MLP network has perceptual consistency, i.e. the feature representations the MLP network produces for the correctly labeled samples and for the incorrectly labeled samples of each class each follow a Gaussian distribution.
According to these assumptions, one can obtain:
$$p(\tilde{y} = e_k \mid y = e_i, x) = p(\tilde{y} = e_k \mid y = e_i) = T_{ik}$$
$$p(h_\theta(x) \mid y = e_i) = N(h_\theta(x); \mu_i, \Sigma_i)$$
$$p(h_\theta(x) \mid \tilde{y} = e_i) = \sum_{k=1}^{K} p(y = e_k \mid \tilde{y} = e_i)\, N(h_\theta(x); \mu_k, \Sigma_k)$$
where $h_\theta(x)$ is the feature representation of a sample of $D_{std}$, y is the latent true label of the sample, $p(\cdot)$ denotes a probability, $e_i$, $i \in \{1, 2, \dots, K\}$, is the vector in $\mathbb{R}^K$ whose i-th element is 1 and whose other elements are 0, $\theta$ denotes all weight-matrix and bias-vector parameters of the MLP network, $\mu$ and $\Sigma$ denote the mean vector and covariance matrix of an unknown Gaussian distribution, $N(\cdot; \mu, \Sigma)$ and $N(\cdot; \mu_i, \Sigma_i)$ denote the Gaussian density of all samples and of class-i samples respectively, and T denotes the label probability transition matrix, defined by $T_{ik} = p(\tilde{y} = e_k \mid y = e_i)$.
(4.2) For the sample subset of each labeled class i, $\{h_\theta(x_n) : \tilde{y}_n = e_i\}$, build a model using a Gaussian mixture model:
$$p(h_\theta(x) \mid \tilde{y} = e_i) = \alpha_i\, N(h_\theta(x); \mu_i, \Sigma_i) + \alpha_{\neg i}\, N(h_\theta(x); \mu_{\neg i}, \Sigma_{\neg i}), \qquad \alpha_i + \alpha_{\neg i} = 1$$
where $x_n$ denotes a sample belonging to the subset of samples labeled class i, and $\neg i$ denotes the classes other than class i.
(4.3) Establish the two-component Gaussian mixture model and complete its parameter estimation with the expectation-maximization (EM) algorithm, solving for the parameters $\{\alpha_m, \mu_m, \Sigma_m\}$, $m \in \{i, \neg i\}$.
In the expectation step (E step), compute the Q function
$$Q\left(\Theta, \Theta^{(t)}\right) = \sum_{n} \sum_{m} \gamma_{nm}^{(t)} \left[ \log \alpha_m + \log N\left(h_\theta(x_n); \mu_m, \Sigma_m\right) \right]$$
where t is the iteration number. For the observed data, compute the responsibility $\gamma_{nm}$ of each mixture component:
$$\gamma_{nm}^{(t)} = \frac{\alpha_m^{(t)} N\left(h_\theta(x_n); \mu_m^{(t)}, \Sigma_m^{(t)}\right)}{\sum_{m'} \alpha_{m'}^{(t)} N\left(h_\theta(x_n); \mu_{m'}^{(t)}, \Sigma_{m'}^{(t)}\right)}$$
where $x_n$ denotes the n-th sample of the subset.
In the maximization step (M step), estimate the Gaussian mean $\mu_m$, covariance $\Sigma_m$ and mixing coefficient $\alpha_m$:
$$\mu_m^{(t+1)} = \frac{\sum_{n} \gamma_{nm}^{(t)} h_\theta(x_n)}{\sum_{n} \gamma_{nm}^{(t)}}, \qquad \Sigma_m^{(t+1)} = \frac{\sum_{n} \gamma_{nm}^{(t)} \left(h_\theta(x_n) - \mu_m^{(t+1)}\right)\left(h_\theta(x_n) - \mu_m^{(t+1)}\right)^{\top}}{\sum_{n} \gamma_{nm}^{(t)}}, \qquad \alpha_m^{(t+1)} = \frac{\sum_{n} \gamma_{nm}^{(t)}}{S_i}$$
where $S_i$ denotes the number of samples in the subset labeled class i.
Iterate the E step and the M step alternately until the model parameters converge or a preset maximum number of iterations is reached, solving for $\{\hat{\alpha}_m, \hat{\mu}_m, \hat{\Sigma}_m\}$.
(4.4) From the mixing coefficients $\hat{\alpha}_i$ and $\hat{\alpha}_{\neg i} = 1 - \hat{\alpha}_i$ solved for each labeled class, derive the estimate $\hat{T}$ of the label probability transition matrix T; under the assumption that mislabeling is independent of the input,
$$\hat{T}_{ii} = \hat{\alpha}_i, \qquad \hat{T}_{ik} = \frac{1 - \hat{\alpha}_i}{K - 1} \quad (k \neq i)$$
where $\hat{T}_{ik}$ denotes the element in row i and column k of the estimation matrix $\hat{T}$.
4. The fault classification method according to claim 1, wherein in step five the second training of the perceptron MLP network uses the corrected loss function
$$L_{corr}(\theta) = -\frac{1}{N} \sum_{i=1}^{N} \tilde{y}_i^{\top} \log\left( \hat{T}^{\top} h_\theta(x_i^{std}) \right)$$
CN201911418196.5A 2019-12-31 2019-12-31 Fault classification method based on weak supervision learning multilayer perceptron Active CN111191726B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911418196.5A CN111191726B (en) 2019-12-31 2019-12-31 Fault classification method based on weak supervision learning multilayer perceptron

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911418196.5A CN111191726B (en) 2019-12-31 2019-12-31 Fault classification method based on weak supervision learning multilayer perceptron

Publications (2)

Publication Number Publication Date
CN111191726A true CN111191726A (en) 2020-05-22
CN111191726B CN111191726B (en) 2023-07-21

Family

ID=70709761

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911418196.5A Active CN111191726B (en) 2019-12-31 2019-12-31 Fault classification method based on weak supervision learning multilayer perceptron

Country Status (1)

Country Link
CN (1) CN111191726B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111814962A (en) * 2020-07-09 2020-10-23 平安科技(深圳)有限公司 Method and device for acquiring parameters of recognition model, electronic equipment and storage medium
CN112989971A (en) * 2021-03-01 2021-06-18 武汉中旗生物医疗电子有限公司 Electrocardiogram data fusion method and device for different data sources
CN114925196A (en) * 2022-03-01 2022-08-19 健康云(上海)数字科技有限公司 Diabetes blood test abnormal value auxiliary removing method under multilayer perception network
CN116090872A (en) * 2022-12-07 2023-05-09 湖北华中电力科技开发有限责任公司 Power distribution area health state evaluation method
CN117347788A (en) * 2023-10-17 2024-01-05 国网四川省电力公司电力科学研究院 Power distribution network single-phase earth fault class probability prediction method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108875771A (en) * 2018-03-30 2018-11-23 Zhejiang University Fault classification model and method based on sparse Gaussian-Bernoulli restricted Boltzmann machine and recurrent neural network
WO2019048324A1 (en) * 2017-09-07 2019-03-14 Nokia Solutions And Networks Oy Method and device for monitoring a telecommunication network
CN110472665A (en) * 2019-07-17 2019-11-19 New H3C Big Data Technologies Co., Ltd. Model training method, text classification method and related apparatus

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019048324A1 (en) * 2017-09-07 2019-03-14 Nokia Solutions And Networks Oy Method and device for monitoring a telecommunication network
CN108875771A (en) * 2018-03-30 2018-11-23 Zhejiang University Fault classification model and method based on sparse Gaussian-Bernoulli restricted Boltzmann machine and recurrent neural network
CN110472665A (en) * 2019-07-17 2019-11-19 New H3C Big Data Technologies Co., Ltd. Model training method, text classification method and related apparatus

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
VAHID GOLMAH ET AL.: "Developing A Fault Diagnosis Approach Based On Artificial Neural Network And Self Organization Map For Occurred ADSL Faults" *
XIAO HAN: "Research on fault identification based on Gaussian mixture models and subspace techniques" (基于高斯混合模型与子空间技术的故障识别研究) *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111814962A (en) * 2020-07-09 2020-10-23 平安科技(深圳)有限公司 Method and device for acquiring parameters of recognition model, electronic equipment and storage medium
WO2021151345A1 (en) * 2020-07-09 2021-08-05 平安科技(深圳)有限公司 Method and apparatus for parameter acquisition for recognition model, electronic device, and storage medium
CN111814962B (en) * 2020-07-09 2024-05-10 平安科技(深圳)有限公司 Parameter acquisition method and device for identification model, electronic equipment and storage medium
CN112989971A (en) * 2021-03-01 2021-06-18 武汉中旗生物医疗电子有限公司 Electrocardiogram data fusion method and device for different data sources
CN112989971B (en) * 2021-03-01 2024-03-22 武汉中旗生物医疗电子有限公司 Electrocardiogram data fusion method and device for different data sources
CN114925196A (en) * 2022-03-01 2022-08-19 健康云(上海)数字科技有限公司 Diabetes blood test abnormal value auxiliary removing method under multilayer perception network
CN114925196B (en) * 2022-03-01 2024-05-21 健康云(上海)数字科技有限公司 Auxiliary eliminating method for abnormal blood test value of diabetes under multi-layer sensing network
CN116090872A (en) * 2022-12-07 2023-05-09 湖北华中电力科技开发有限责任公司 Power distribution area health state evaluation method
CN117347788A (en) * 2023-10-17 2024-01-05 国网四川省电力公司电力科学研究院 Power distribution network single-phase earth fault class probability prediction method
CN117347788B (en) * 2023-10-17 2024-06-11 国网四川省电力公司电力科学研究院 Power distribution network single-phase earth fault class probability prediction method

Also Published As

Publication number Publication date
CN111191726B (en) 2023-07-21

Similar Documents

Publication Publication Date Title
CN111079836B (en) Process data fault classification method based on pseudo label method and weak supervised learning
CN111191726B (en) Fault classification method based on weak supervision learning multilayer perceptron
CN106355030B (en) A kind of fault detection method based on analytic hierarchy process (AHP) and Nearest Neighbor with Weighted Voting Decision fusion
CN103914064B (en) Based on the commercial run method for diagnosing faults that multi-categorizer and D-S evidence merge
CN107274020B (en) Learner subject total measured result prediction system and method based on collaborative filtering thought
CN106843195B (en) The Fault Classification differentiated based on adaptive set at semi-supervised Fei Sheer
CN112200104B (en) Chemical engineering fault diagnosis method based on novel Bayesian framework for enhanced principal component analysis
CN108875772B (en) Fault classification model and method based on stacked sparse Gaussian Bernoulli limited Boltzmann machine and reinforcement learning
CN112085252B (en) Anti-fact prediction method for set type decision effect
CN111046961B (en) Fault classification method based on bidirectional long-time and short-time memory unit and capsule network
CN106897774B (en) Multiple soft measurement algorithm cluster modeling methods based on Monte Carlo cross validation
CN108875771A (en) A kind of failure modes model and method being limited Boltzmann machine and Recognition with Recurrent Neural Network based on sparse Gauss Bernoulli Jacob
CN108090515B (en) Data fusion-based environment grade evaluation method
CN111343147B (en) Network attack detection device and method based on deep learning
CN111768000A (en) Industrial process data modeling method for online adaptive fine-tuning deep learning
CN110880369A (en) Gas marker detection method based on radial basis function neural network and application
CN109240276B (en) Multi-block PCA fault monitoring method based on fault sensitive principal component selection
CN112116002A (en) Determination method, verification method and device of detection model
CN111950195B (en) Project progress prediction method based on portrait system and depth regression model
CN115757103A (en) Neural network test case generation method based on tree structure
CN113283288A (en) Nuclear power station evaporator eddy current signal type identification method based on LSTM-CNN
CN115096627A (en) Method and system for fault diagnosis and operation and maintenance in manufacturing process of hydraulic forming intelligent equipment
CN112149884A (en) Academic early warning monitoring method for large-scale students
CN116930042A (en) Building waterproof material performance detection equipment and method
CN110717602A (en) Machine learning model robustness assessment method based on noise data

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant