CN117493946A - VGG network-based small sample indicator diagram diagnosis method - Google Patents

VGG network-based small sample indicator diagram diagnosis method

Info

Publication number
CN117493946A
CN117493946A CN202311841934.3A CN202311841934A CN117493946A CN 117493946 A CN117493946 A CN 117493946A CN 202311841934 A CN202311841934 A CN 202311841934A CN 117493946 A CN117493946 A CN 117493946A
Authority
CN
China
Prior art keywords
network
parameters
diagnosis
vgg
classification model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311841934.3A
Other languages
Chinese (zh)
Inventor
王强
张峰
李照川
王冠军
张野
常靓
李捷明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Software Technology Co Ltd
Original Assignee
Inspur Software Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Software Technology Co Ltd filed Critical Inspur Software Technology Co Ltd
Priority to CN202311841934.3A priority Critical patent/CN117493946A/en
Publication of CN117493946A publication Critical patent/CN117493946A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01M TESTING STATIC OR DYNAMIC BALANCE OF MACHINES OR STRUCTURES; TESTING OF STRUCTURES OR APPARATUS, NOT OTHERWISE PROVIDED FOR
    • G01M 99/00 Subject matter not provided for in other groups of this subclass
    • G01M 99/005 Testing of complete machines, e.g. washing-machines or mobile phones
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/10 Pre-processing; Data cleansing
    • G06F 18/15 Statistical pre-processing, e.g. techniques for normalisation or restoring missing data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/0985 Hyperparameter optimisation; Meta-learning; Learning-to-learn
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/048 Activation functions

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a small-sample indicator diagram diagnosis method based on a VGG network, relating to the technical field of working-condition diagnosis for mechanical equipment. The method comprises the following steps: constructing a VGG network unit module and building a pre-training network; reducing the dimensionality of the pre-training network parameters; constructing a diagnosis classification model and initializing its parameters with the dimension-reduced network parameters; preparing a small-sample indicator diagram data set and preprocessing it; running the outer training loop of the diagnosis classification model on the preprocessed data, running the inner loop, and optimizing the model hyper-parameters; testing the diagnosis classification model to obtain the optimal model parameters; and using the diagnosis classification model that passes the test for industrial case testing, to guide equipment working-condition diagnosis. The invention no longer relies on human experience but lets the model learn by itself, achieves fast convergence of parameter optimization and improved accuracy, and solves the small-sample working-condition diagnosis problem of industrial production machinery.

Description

VGG network-based small sample indicator diagram diagnosis method
Technical Field
The invention relates to the technical field of mechanical equipment working condition diagnosis, in particular to a small sample indicator diagram diagnosis method based on a VGG network.
Background
In industrial production, working-condition diagnosis of mechanical equipment is a complex and important inspection task. The traditional approach of periodic manual inspection depends on the experience and judgment of inspection staff, consumes considerable manpower and material resources, and still cannot analyse certain faults accurately. After years of digital construction, domestic production enterprises have largely achieved remote data acquisition and transmission based on the Internet of Things, yet fault early warning and fault diagnosis during the operation of mechanical equipment remain difficult problems in field management.
Many mechanical devices are tightly enclosed, so their working condition cannot be observed and judged directly. The usual practice is to collect relevant parameters with measuring equipment and analyse the working condition from them, and indicator diagram diagnosis is one of the common analytical means. An indicator diagram is the closed curve traced by the suspension-point load of a moving mechanical device as a function of suspension-point displacement, and it reflects the running state of the equipment.
In actual production, different faults occur with different probabilities, so the numbers of samples of the different indicator diagram types are unbalanced: normal-operation samples are plentiful while fault samples are scarce. Traditional computer diagnosis relies on conventional mathematical methods or on deep networks with many parameters; it needs large amounts of data, involves a complex analysis process, takes long to test, is inefficient, and performs poorly on small-sample working conditions. Against these problems, finding a new artificial neural network architecture that enables rapid diagnosis, and selecting a suitable working-condition diagnosis model that improves diagnosis accuracy on small-sample data, is an important measure for promoting efficient production.
Disclosure of Invention
In view of the needs and shortcomings of the prior art, the invention provides a small-sample indicator diagram diagnosis method based on a VGG network, which addresses the working-condition diagnosis task under a small number of samples from two aspects: network structure design and parameter optimization.
To solve the above technical problems, the small-sample indicator diagram diagnosis method based on a VGG network disclosed by the invention adopts the following technical scheme:
A small-sample indicator diagram diagnosis method based on a VGG network includes the following steps:
S1, constructing a VGG network unit module and building a pre-training network;
S2, reducing the dimensionality of the pre-training network parameters and learning the dimension-reduced network parameters;
S3, constructing a diagnosis classification model and initializing its parameters with the dimension-reduced pre-training network parameters;
S4, preparing a small-sample indicator diagram data set and preprocessing the data set;
S5, running the outer training loop of the diagnosis classification model on the preprocessed indicator diagram curves;
S6, running the inner loop of the diagnosis classification model on the preprocessed indicator diagram curves and optimizing the model hyper-parameters;
S7, testing the diagnosis classification model to obtain the optimal model parameters;
S8, using a diagnosis classification model that passes the test for industrial case testing, to guide equipment working-condition diagnosis.
Optionally, step S1 specifically comprises the following operations:
S1.1, selecting a VGG16 network, whose structure comprises 5 convolutional layers and 2 fully connected layers, with pooling layers separating adjacent convolutional layers and separating the convolutional layers from the adjacent fully connected layer; the convolutional layers use 3x3 convolution kernels and the pooling layers use 2x2 max pooling;
S1.2, adding a 1x1 convolution and an equivalent (identity) branch in parallel with the convolution kernel, and constructing the VGG-type network unit module together with a max-pooling layer and a BN regularization layer;
S1.3, connecting twenty VGG-type network unit modules end to end in sequence to build a twenty-layer VGG-type convolutional network, which is used as the pre-training network;
S1.4, training the pre-training network on the general-purpose data set ImageNet-1k; after training, the parameters of the pre-training network serve as the objects of subsequent learning.
Further optionally, in step S2 the parameters of the pre-training network, comprising weight parameters and bias parameters, are dimension-reduced with a 1x1 convolution; the dimension-reduction process involves formula (1):
formula (1),
wherein SS denotes this operation; W denotes the weight parameter; B denotes the bias parameter; X denotes the object being learned; Φ_S1 denotes the learning coefficient of the weight, and Φ_S2 denotes the learning coefficient of the bias.
Further optionally, in step S3 five VGG-type network unit modules are connected end to end in sequence to build a five-layer VGG-type convolutional network, which is used as the diagnosis classification model.
Optionally, in step S4 the preprocessing of the data set includes sample screening, sample classification and sample normalization, where sample normalization standardizes the sample data and specifically comprises point normalization, coordinate digitization and size normalization.
Further optionally, in step S5 a meta-learner is used to run the outer training loop of the diagnosis classification model, using the following formula:
formula (2),
where θ relates to a plurality of categories; Θ is the set of parameters θ; θ' corresponds to the optimal parameters for the current data set and can be calculated from the sample labels; β is the dimensionless learning rate corresponding to the parameter θ; L_T is the dimensionless error loss corresponding to task T;
based on formula (2), the parameter θ is iteratively optimized so that the error between θ and the optimal parameter θ' is minimized;
during the outer training loop, the number of training samples selected in the meta-training stage is not more than the number of test samples selected in the meta-testing stage.
Further optionally, in step S6 a gradient iteration of the meta-learner is performed to obtain the optimized meta-learning parameters, i.e. the hyper-parameters, using the following formula:
formula (3),
where θ relates to a plurality of categories; γ is the dimensionless learning rate corresponding to the gradient change of this process; Θ is the set of parameters θ; θ' corresponds to the optimal parameters for the current data set and can be calculated from the sample labels; L_T is the dimensionless error loss corresponding to task T.
Optionally, in step S7 the diagnosis classification model is tested with test data taken from working-condition indicator diagrams of actual plant machinery;
a set number of labelled actual indicator diagrams is obtained and fed into the diagnosis classification model to obtain the corresponding predictions; the predictions are compared with the actual labels, and when the prediction accuracy reaches a set threshold the diagnosis classification model is judged to have passed the test, its parameters then being the optimal parameters.
Compared with the prior art, the small-sample indicator diagram diagnosis method based on a VGG network has the following beneficial effects:
The invention builds a VGG network unit module from the VGG network and constructs both the pre-training network and the diagnosis classification model from this unit module; using convolution kernels of a fixed size speeds up the analysis, and optimizing the model parameters with meta-learning means the model no longer depends on human experience but learns by itself. This achieves fast convergence of parameter optimization and improved accuracy, effectively solves the small-sample working-condition diagnosis problem of industrial production machinery, and has practical application value.
Drawings
FIG. 1 is a block diagram of the method flow according to the first embodiment of the invention;
FIG. 2 is a network structure diagram of the VGG16 network according to the first embodiment of the invention;
FIG. 3 is a structure and parameter diagram of the VGG network unit module according to the first embodiment of the invention;
FIG. 4 is a network architecture diagram of the VGG network unit module according to the first embodiment of the invention;
FIG. 5 is a schematic diagram of the theoretical training results of the diagnosis classification model according to the embodiment of the invention;
FIG. 6 is the P-R curve of the diagnosis classification model in the test phase according to the embodiment of the invention;
FIG. 7 is the ROC curve of the diagnosis classification model in the test phase according to the embodiment of the invention;
FIG. 8 shows the confusion matrix of the diagnosis classification model in actual industrial production according to the embodiment of the invention.
Detailed Description
To make the technical solution, the technical problems to be solved and the technical effects of the invention clearer, the technical solution of the invention is described clearly and completely below with reference to specific embodiments.
Embodiment one:
Referring to FIGS. 1-4, this embodiment provides a small-sample indicator diagram diagnosis method based on a VGG network, comprising the following steps:
S1, constructing a VGG network unit module and building a pre-training network, with the specific operations as follows:
S1.1, selecting a VGG16 network, whose structure comprises 5 convolutional layers and 2 fully connected layers, with pooling layers separating adjacent convolutional layers and separating the convolutional layers from the adjacent fully connected layer; the convolutional layers use 3x3 convolution kernels and the pooling layers use 2x2 max pooling;
S1.2, adding a 1x1 convolution and an equivalent (identity) branch in parallel with the convolution kernel, and constructing the VGG-type network unit module together with a max-pooling layer and a BN regularization layer;
S1.3, connecting twenty VGG-type network unit modules end to end in sequence to build a twenty-layer VGG-type convolutional network, which is used as the pre-training network; the pre-training network has only one input layer and one output layer;
S1.4, training the pre-training network on the general-purpose data set ImageNet-1k; after training, the parameters of the pre-training network serve as the objects of subsequent learning.
When performing step S1.2, refer to FIGS. 3 and 4, which are schematic diagrams of the VGG network unit module of this step; with the max-pooling and regularization parts omitted, the module contains two convolutions and ReLU activation functions. A 1x1 convolution and an equivalent (identity) branch are added alongside the convolution kernel, giving a three-branch network structure. Structural re-parameterization is used to convert the multi-branch structure into a single-branch structure, as follows. There are three branches in total: a 3x3 convolution, a 1x1 convolution and the equivalent branch. For the fusion of the 3x3 and 1x1 convolutions: the 1x1 convolution does not change the size of the feature map, so its output size stays consistent without any padding (adding pixels on each side), whereas the 3x3 convolution shrinks the feature map, so the original feature map must be padded by one ring of pixels to keep its size unchanged. A 1x1 convolution can be regarded as a special 3x3 convolution whose surroundings are filled with zeros, so the 1x1 kernel can be added into the centre of the 3x3 kernel, completing the fusion of the convolution branches. The equivalent branch is equivalent to a special 1x1 convolution (whose kernel is an identity matrix) and therefore also to a special 3x3 convolution; the remaining problem is to represent the equivalent branch as a convolutional layer so that it can be fused. To keep the values unchanged before and after the equivalence, a convolution kernel with weight equal to 1 is chosen and the convolutions are performed per channel, i.e. a 1x1 depthwise convolution with the weight fixed to 1. This multiplies every element of every channel by 1 and outputs it again, which is exactly the equivalent operation required. All three branches carry BN (regularization) layers; at inference time a convolutional layer followed by a BN layer can be converted into a single convolutional layer with a bias parameter. Finally, the convolution kernels and bias parameters obtained from the three branches are added together, completing the equivalent conversion into a single-branch network model containing only 3x3 convolutions. The essence of structural re-parameterization is that one structure corresponds to one set of parameters during training and the target structure corresponds to another set during inference: as long as the parameters of the former can be equivalently converted into those of the latter, the structure of the former can be equivalently converted into that of the latter.
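As an illustration of the above fusion, the following is a minimal PyTorch sketch of such a three-branch unit and of folding its branches into a single 3x3 convolution for inference. It is a sketch under assumptions, not the patent's literal implementation: the class name VGGUnit, the method name reparameterize, the per-branch BN placement and the channel counts are illustrative, and the max-pooling layer of the module is omitted, as in FIGS. 3 and 4.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VGGUnit(nn.Module):
    """Training-time three-branch block: 3x3 conv, 1x1 conv and identity, each with its own BN."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv3 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn3 = nn.BatchNorm2d(channels)
        self.conv1 = nn.Conv2d(channels, channels, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.bn_id = nn.BatchNorm2d(channels)  # equivalent (identity) branch keeps only a BN layer

    def forward(self, x):
        return F.relu(self.bn3(self.conv3(x)) + self.bn1(self.conv1(x)) + self.bn_id(x))

    @staticmethod
    def _fuse_conv_bn(kernel, bn):
        # Fold a BN layer into the preceding convolution: scale the kernel per output channel
        # and turn the BN statistics into a bias term.
        std = (bn.running_var + bn.eps).sqrt()
        scale = (bn.weight / std).reshape(-1, 1, 1, 1)
        return kernel * scale, bn.bias - bn.running_mean * bn.weight / std

    def reparameterize(self):
        """Return the weight and bias of the equivalent single 3x3 convolution used at inference."""
        c = self.conv3.out_channels
        k3, b3 = self._fuse_conv_bn(self.conv3.weight, self.bn3)
        # Pad the 1x1 kernel with zeros to 3x3 so it can be added to the 3x3 kernel.
        k1, b1 = self._fuse_conv_bn(F.pad(self.conv1.weight, [1, 1, 1, 1]), self.bn1)
        # Represent the identity branch as a 3x3 kernel whose centre is 1 on the matching channel.
        k_id = torch.zeros(c, c, 3, 3, device=self.conv3.weight.device)
        for i in range(c):
            k_id[i, i, 1, 1] = 1.0
        k_id, b_id = self._fuse_conv_bn(k_id, self.bn_id)
        return k3 + k1 + k_id, b3 + b1 + b_id
```

Twenty such units chained together would then give the twenty-layer pre-training backbone of step S1.3, and five of them the diagnosis classification model of step S3; the exact layer widths are not reproduced in this text and would follow FIGS. 3 and 4.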
S2, the parameters of the pre-training network, comprising weight parameters and bias parameters, are dimension-reduced with a 1x1 convolution, and the dimension-reduced network parameters are learned.
This step uses a 1x1 convolution for parameter dimension reduction, thereby reducing the number of parameters; it is equivalent to mapping the convolutional layers of the pre-training network onto the 1x1 convolution with specified coefficients. The dimension-reduction effect of the 1x1 convolution is explained here. By default, "1x1" refers to a width and height of 1; for a high-dimensional input, the width and height remain unchanged while the channel dimension (number of channels) is reduced. For example, with an input of 32x32x10, the 1x1 convolution has the form 1x1x10 (in the case of a single convolution kernel) and the output is 32x32x1. The essential effect of the 1x1 convolution is therefore dimension reduction: one convolution kernel integrates all input channels and outputs a single channel. A convolutional layer outputs a multi-channel feature map only because several convolution kernels each output one feature map; each kernel produces exactly one feature map, whose information is sampled from every input channel. When the number of 1x1 kernels is smaller than the number of input channels, the dimensionality is reduced. The 1x1 convolution can also be regarded as similar to a fully connected operation. Its internal parameters resemble a bias and a weight, corresponding to formula (1):
formula (1),
wherein SS denotes this operation; W denotes the weight parameter; B denotes the bias parameter; X denotes the object being learned; Φ_S1 denotes the learning coefficient of the weight, and Φ_S2 denotes the learning coefficient of the bias.
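For illustration only, a short PyTorch sketch of this channel-wise dimension reduction (the 32x32x10 example above); the learning coefficients Φ_S1 and Φ_S2 of formula (1) are not modelled here, and the variable names are assumptions:

```python
import torch
import torch.nn as nn

# One 1x1 kernel integrates all 10 input channels into a single output channel while
# leaving width and height unchanged: 32x32x10 -> 32x32x1.
x = torch.randn(1, 10, 32, 32)                                      # one 32x32 feature map with 10 channels
reduce = nn.Conv2d(in_channels=10, out_channels=1, kernel_size=1)   # internal parameters: a weight and a bias
y = reduce(x)
print(y.shape)   # torch.Size([1, 1, 32, 32])
```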
S3, five VGG-type network unit modules are connected end to end in sequence to build a five-layer VGG-type convolutional network, which is used as the diagnosis classification model; the diagnosis classification model has only one input layer and one output layer.
Model parameter initialization is performed with the dimension-reduced pre-training network parameters W and b.
S4, preparing a small sample indicator diagram data set, and preprocessing the data set.
The preprocessing of the data set includes sample screening, sample classification and sample normalization; sample normalization standardizes the sample data and specifically comprises point normalization, coordinate digitization and size normalization.
Taking the indicator diagrams of oil-field pumping units as an example:
Select actual pumping-well data, including well-related data, sucker-rod displacement, load variation over time, daily liquid production, maximum load and so on, and draw the indicator diagram from the displacement data and the suspension-point load data obtained by analysing the suspension-point motion of the pumping unit.
Sample screening: from a total of 8134 samples, 8000 were retained after screening. Based on the diagnosis results of the corresponding working states, the curve data fall into 30 sample types, but the amount of data per category differs greatly, which makes the data imbalance between samples rather serious.
Sample classification: many problems can occur during pumping-unit operation, but the main status types are divided into five: normal operation, insufficient liquid supply, gas influence, discharge faults and suction faults; indicator diagrams of these five types are selected as the classification targets of the theoretical experiments.
Sample normalization: the sample data are normalized in three respects: point normalization, coordinate digitization and size normalization.
After the normalized data are obtained, the indicator diagram curve is drawn from the characteristic points; considering noise such as vibration load while the pumping unit works, a filter is selected to smooth the curve, and the smoothed indicator diagram curve is provided for subsequent use by the diagnosis classification model.
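The normalization and smoothing described above can be sketched as follows; the patent names the three normalization steps but does not give their formulas or the filter type, so the min-max size normalization and the Savitzky-Golay filter below are assumed choices, and the function names are illustrative:

```python
import numpy as np
from scipy.signal import savgol_filter

def size_normalize(displacement: np.ndarray, load: np.ndarray):
    """Size normalization: rescale one indicator-diagram curve to [0, 1] on both axes."""
    d = (displacement - displacement.min()) / (np.ptp(displacement) + 1e-12)
    f = (load - load.min()) / (np.ptp(load) + 1e-12)
    return d, f

def smooth_load(load: np.ndarray, window: int = 11, poly: int = 3) -> np.ndarray:
    """Smooth the load curve against vibration-load noise with a Savitzky-Golay filter."""
    return savgol_filter(load, window_length=window, polyorder=poly)
```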
S5, based on the smoothed indicator diagram curves, the outer training loop of the diagnosis classification model is run to reduce the diagnosis classification error loss.
In this process, 5 classes with 5 training samples are selected in the meta-training stage and 5 classes with 15 test samples in the meta-testing stage; all training and test samples come from the data set of smoothed indicator diagram curves. The meta-training and meta-testing stages each use 5 classes drawn from the 30 sample types obtained in the earlier sample screening, but the two stages use different classes, 10 classes in total. The sample data of the two stages are therefore not directly related, yet both are working-condition diagrams of mechanical equipment, so knowledge can be transferred. The larger number of samples in the meta-testing stage serves to verify the performance of the classification model more reliably and to avoid chance test results. The outer loop first trains on all samples and optimizes parameters suited to the given samples; it is used to evaluate how well the network hyper-parameters perform. Hyper-parameters normally refer to the subset of parameters preset by a person in advance, whereas the hyper-parameters of the diagnosis classification model in this embodiment are learned.
In this embodiment a meta-learner is specifically used to run the outer training loop of the diagnosis classification model, using the following formula:
formula (2),
where θ relates to a plurality of categories; Θ is the set of parameters θ; θ' corresponds to the optimal parameters for the current data set and can be calculated from the sample labels; β is the dimensionless learning rate corresponding to the parameter θ; L_T is the dimensionless error loss corresponding to task T;
based on formula (2), the loop iteratively optimizes the parameter θ so that the error between θ and the optimal parameter θ' is minimized.
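Since formula (2) itself is not reproduced in this text, the following PyTorch sketch only illustrates the MAML-style outer loop the description suggests: adapt a copy of the parameters on each task's support set, then update the shared parameters θ from the query-set loss with the learning rate β held by the meta-optimizer. The function name, the task tuple layout and the use of cross-entropy are assumptions, not the patent's literal algorithm.

```python
import torch
import torch.nn.functional as F

def outer_loop_step(model, meta_opt, tasks, inner_lr, inner_steps=1):
    """One outer-loop (meta-training) step: tasks yields (support_x, support_y, query_x, query_y),
    e.g. 5 classes x 5 support samples and 5 classes x 15 query samples as in this embodiment."""
    meta_opt.zero_grad()
    meta_loss = 0.0
    for support_x, support_y, query_x, query_y in tasks:
        # Inner adaptation on the support set, keeping the graph so the outer update can
        # differentiate through it.
        fast = {n: p.clone() for n, p in model.named_parameters()}
        for _ in range(inner_steps):
            loss = F.cross_entropy(torch.func.functional_call(model, fast, (support_x,)), support_y)
            grads = torch.autograd.grad(loss, list(fast.values()), create_graph=True)
            fast = {n: w - inner_lr * g for (n, w), g in zip(fast.items(), grads)}
        # The query-set loss drives the update of the shared parameters theta.
        meta_loss = meta_loss + F.cross_entropy(
            torch.func.functional_call(model, fast, (query_x,)), query_y)
    meta_loss.backward()
    meta_opt.step()  # e.g. SGD over model.parameters() with lr = beta
    return meta_loss.item()
```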
FIG. 5 is a schematic diagram of the theoretical training results of the diagnosis classification model in this embodiment; an accuracy curve is plotted from the accuracy changes obtained in the meta-training stage. The figure shows that training accuracy reaches its maximum early on, the network converges quickly and optimization is efficient, which paves the way for the subsequent meta-testing stage.
S6, based on the preprocessed indicator diagram curves, the inner loop of the diagnosis classification model is run and the model hyper-parameters are optimized; the specific process is as follows:
a gradient iteration of the meta-learner is performed to obtain the optimized meta-learning parameters, i.e. the hyper-parameters, using the following formula:
formula (3),
where θ relates to a plurality of categories; γ is the dimensionless learning rate corresponding to the gradient change of this process; Θ is the set of parameters θ; θ' corresponds to the optimal parameters for the current data set and can be calculated from the sample labels; L_T is the dimensionless error loss corresponding to task T.
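Formula (3) is likewise not reproduced in this text; under the assumption that it denotes a plain gradient iteration on the task loss L_T with the learning rate γ, a small sketch is:

```python
import torch

def inner_update(params: dict, task_loss_fn, gamma: float) -> dict:
    """One gradient iteration of the meta-learner: step each parameter against the gradient
    of the task loss L_T with the dimensionless learning rate gamma."""
    loss = task_loss_fn(params)
    grads = torch.autograd.grad(loss, list(params.values()))
    return {name: p - gamma * g for (name, p), g in zip(params.items(), grads)}
```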
S7, the diagnosis classification model is tested to obtain the optimal model parameters.
When testing the diagnosis classification model, the test data are taken from working-condition indicator diagrams of actual plant machinery;
a set number of labelled actual indicator diagrams is obtained and fed into the diagnosis classification model to obtain the corresponding predictions; the predictions are compared with the actual labels, and when the prediction accuracy reaches the set threshold of 95% the diagnosis classification model is judged to have passed the test, its parameters then being the optimal parameters.
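A sketch of this acceptance test is given below; the data-loader layout and the function name are assumptions, while the 95% threshold follows the text:

```python
import torch

@torch.no_grad()
def passes_acceptance_test(model, loader, threshold: float = 0.95) -> bool:
    """Feed labelled factory indicator diagrams through the diagnosis classification model and
    check whether prediction accuracy reaches the set threshold; loader yields (images, labels)."""
    model.eval()
    correct, total = 0, 0
    for x, y in loader:
        pred = model(x).argmax(dim=1)
        correct += (pred == y).sum().item()
        total += y.numel()
    return total > 0 and correct / total >= threshold
```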
FIG. 6 is the P-R curve of the diagnosis classification model in the test phase of this embodiment, with recall on the abscissa and precision on the ordinate, which helps in making sound trade-offs later. The figure shows that the area enclosed by the P-R curve of every category is relatively large and that the model as a whole is good at rejecting wrong categories; that is, the network model of the invention discriminates well against indicator diagram samples that do not belong to a given class.
FIG. 7 is the ROC curve of the diagnosis classification model in the test phase of this embodiment. The ROC curve is also known as the receiver operating characteristic. Its ordinate is the true positive rate, the proportion of actual positive samples that are predicted positive, and its abscissa is the false positive rate, the proportion of negative samples that are predicted positive. The ideal goal is a true positive rate of 1 and a false positive rate of 0, i.e. the point (0, 1) in the figure; hence the closer the ROC curve lies to (0, 1), i.e. the farther it lies from the 45-degree diagonal, the better. The figure shows the curve departing a long way from the 45-degree diagonal, indicating that the classifier here performs quite well.
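The per-class P-R and ROC curves of FIGS. 6 and 7 can be computed one-vs-rest from the model's class scores; a sketch using scikit-learn follows, where the label and score array layout is an assumption:

```python
import numpy as np
from sklearn.metrics import precision_recall_curve, roc_curve, auc

def curves_for_class(y_true: np.ndarray, y_score: np.ndarray, cls: int):
    """One-vs-rest P-R and ROC curves for one of the five diagnosis classes.
    y_true: integer labels of shape (N,); y_score: per-class scores of shape (N, 5)."""
    positive = (y_true == cls).astype(int)
    precision, recall, _ = precision_recall_curve(positive, y_score[:, cls])
    fpr, tpr, _ = roc_curve(positive, y_score[:, cls])
    return (recall, precision), (fpr, tpr, auc(fpr, tpr))
```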
S8, the diagnosis classification model that passed the test is used for industrial case testing, to guide equipment working-condition diagnosis.
FIG. 8 shows the confusion matrix of the diagnosis classification model in actual industrial production in this embodiment.
In summary, the VGG-network-based small-sample indicator diagram diagnosis method no longer depends on human experience but lets the model learn by itself; it achieves fast convergence of parameter optimization and improved accuracy, and efficiently solves the small-sample working-condition diagnosis problem of industrial production machinery.
The foregoing describes the principles and embodiments of the invention so that the invention may be better understood. Based on the above embodiments, any improvements and modifications made by those skilled in the art without departing from the principles of the invention shall fall within the scope of protection of the invention.

Claims (8)

1. A small-sample indicator diagram diagnosis method based on a VGG network, characterized by comprising the following steps:
S1, constructing a VGG network unit module and building a pre-training network;
S2, reducing the dimensionality of the pre-training network parameters and learning the dimension-reduced network parameters;
S3, constructing a diagnosis classification model and initializing its parameters with the dimension-reduced pre-training network parameters;
S4, preparing a small-sample indicator diagram data set and preprocessing the data set;
S5, running the outer training loop of the diagnosis classification model on the preprocessed indicator diagram curves;
S6, running the inner loop of the diagnosis classification model on the preprocessed indicator diagram curves and optimizing the model hyper-parameters;
S7, testing the diagnosis classification model to obtain the optimal model parameters;
S8, using a diagnosis classification model that passes the test for industrial case testing, to guide equipment working-condition diagnosis.
2. The VGG-network-based small-sample indicator diagram diagnosis method according to claim 1, wherein step S1 specifically comprises the following operations:
S1.1, selecting a VGG16 network, whose structure comprises 5 convolutional layers and 2 fully connected layers, with pooling layers separating adjacent convolutional layers and separating the convolutional layers from the adjacent fully connected layer; the convolutional layers use 3x3 convolution kernels and the pooling layers use 2x2 max pooling;
S1.2, adding a 1x1 convolution and an equivalent (identity) branch in parallel with the convolution kernel, and constructing the VGG-type network unit module together with a max-pooling layer and a BN regularization layer;
S1.3, connecting twenty VGG-type network unit modules end to end in sequence to build a twenty-layer VGG-type convolutional network, which is used as the pre-training network;
S1.4, training the pre-training network on the general-purpose data set ImageNet-1k; after training, the parameters of the pre-training network serve as the objects of subsequent learning.
3. The VGG-network-based small-sample indicator diagram diagnosis method according to claim 2, wherein in step S2 the parameters of the pre-training network, comprising weight parameters and bias parameters, are dimension-reduced with a 1x1 convolution, the dimension-reduction process involving formula (1):
formula (1),
wherein SS denotes this operation; W denotes the weight parameter; B denotes the bias parameter; X denotes the object being learned; Φ_S1 denotes the learning coefficient of the weight, and Φ_S2 denotes the learning coefficient of the bias.
4. The VGG-network-based small-sample indicator diagram diagnosis method according to claim 2, wherein in step S3 five VGG-type network unit modules are connected end to end in sequence to build a five-layer VGG-type convolutional network, which is used as the diagnosis classification model.
5. The VGG-network-based small-sample indicator diagram diagnosis method according to claim 1, wherein in step S4 the preprocessing of the data set comprises sample screening, sample classification and sample normalization, the sample normalization standardizing the sample data and specifically comprising point normalization, coordinate digitization and size normalization.
6. The VGG-network-based small-sample indicator diagram diagnosis method according to claim 1, wherein in step S5 a meta-learner is used to run the outer training loop of the diagnosis classification model, using the following formula:
formula (2),
where θ relates to a plurality of categories; Θ is the set of parameters θ; θ' corresponds to the optimal parameters for the current data set and can be calculated from the sample labels; β is the dimensionless learning rate corresponding to the parameter θ; L_T is the dimensionless error loss corresponding to task T;
based on formula (2), the parameter θ is iteratively optimized so that the error between θ and the optimal parameter θ' is minimized;
during the outer training loop, the number of training samples selected in the meta-training stage is not more than the number of test samples selected in the meta-testing stage.
7. The VGG-network-based small-sample indicator diagram diagnosis method according to claim 6, wherein in step S6 a gradient iteration of the meta-learner is performed to obtain the optimized meta-learning parameters, i.e. the hyper-parameters, using the following formula:
formula (3),
where θ relates to a plurality of categories; γ is the dimensionless learning rate corresponding to the gradient change of this process; Θ is the set of parameters θ; θ' corresponds to the optimal parameters for the current data set and can be calculated from the sample labels; L_T is the dimensionless error loss corresponding to task T.
8. The VGG-network-based small-sample indicator diagram diagnosis method according to claim 1, wherein in step S7 the diagnosis classification model is tested with test data taken from working-condition indicator diagrams of actual plant machinery;
a set number of labelled actual indicator diagrams is obtained and fed into the diagnosis classification model to obtain the corresponding predictions; the predictions are compared with the actual labels, and when the prediction accuracy reaches a set threshold the diagnosis classification model is judged to have passed the test, its parameters then being the optimal parameters.
CN202311841934.3A 2023-12-29 2023-12-29 VGG network-based small sample indicator diagram diagnosis method Pending CN117493946A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311841934.3A CN117493946A (en) 2023-12-29 2023-12-29 VGG network-based small sample indicator diagram diagnosis method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311841934.3A CN117493946A (en) 2023-12-29 2023-12-29 VGG network-based small sample indicator diagram diagnosis method

Publications (1)

Publication Number Publication Date
CN117493946A true CN117493946A (en) 2024-02-02

Family

ID=89676790

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311841934.3A Pending CN117493946A (en) 2023-12-29 2023-12-29 VGG network-based small sample indicator diagram diagnosis method

Country Status (1)

Country Link
CN (1) CN117493946A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110647923A (en) * 2019-09-04 2020-01-03 西安交通大学 Variable working condition mechanical fault intelligent diagnosis method based on self-learning under small sample
CN113469219A (en) * 2021-06-08 2021-10-01 中国地质大学(武汉) Rotary machine fault diagnosis method under complex working condition based on element transfer learning
US20230358638A1 (en) * 2021-12-30 2023-11-09 Zhejiang University City College Aero-engine bearing fault diagnosis method based on variational mode decomposition and residual network
CN114429009A (en) * 2022-04-07 2022-05-03 中国石油大学(华东) Small sample sucker-rod pump well working condition diagnosis method based on meta-migration learning
CN116465628A (en) * 2023-03-13 2023-07-21 哈尔滨理工大学 Rolling bearing fault diagnosis method based on improved multi-source domain heterogeneous model parameter transmission
CN116597167A (en) * 2023-06-06 2023-08-15 中国人民解放军92942部队 Permanent magnet synchronous motor small sample demagnetization fault diagnosis method, storage medium and system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
仲志丹; 樊浩杰; 李鹏辉: "Fault diagnosis of sucker-rod pumping wells based on sparse autoencoder neural network" (基于稀疏自编码神经网络的抽油机井故障诊断), Journal of Xi'an University of Science and Technology (西安科技大学学报), no. 04, 31 July 2018 (2018-07-31) *
吴魁; 王仙勇; 孙洁; 黄玉龙: "Research on a multi-sensor signal fault diagnosis method based on deep convolutional networks" (基于深度卷积网络的多传感器信号故障诊断方法研究), Computer Measurement & Control (计算机测量与控制), no. 01, 25 January 2018 (2018-01-25) *

Similar Documents

Publication Publication Date Title
CN109766921B (en) Vibration data fault classification method based on depth field self-adaption
CN112561148B (en) Ship track prediction method and system based on one-dimensional convolutional neural network and LSTM
CN111914883B (en) Spindle bearing state evaluation method and device based on deep fusion network
CN110441065B (en) Gas turbine on-line detection method and device based on LSTM
CN111860982A (en) Wind power plant short-term wind power prediction method based on VMD-FCM-GRU
CN109272123B (en) Sucker-rod pump working condition early warning method based on convolution-circulation neural network
CN111810124B (en) Oil pumping well fault diagnosis method based on characteristic recalibration residual convolutional neural network model
CN110363337B (en) Oil measuring method and system of oil pumping unit based on data driving
CN114048769A (en) Multi-source multi-domain information entropy fusion and model self-optimization method for bearing fault diagnosis
CN111680788A (en) Equipment fault diagnosis method based on deep learning
CN113344045B (en) Method for improving SAR ship classification precision by combining HOG characteristics
CN116842459B (en) Electric energy metering fault diagnosis method and diagnosis terminal based on small sample learning
CN115272776B (en) Hyperspectral image classification method based on double-path convolution and double attention and storage medium
CN113255972A (en) Short-term rainfall prediction method based on Attention mechanism
CN110289987B (en) Multi-agent system network anti-attack capability assessment method based on characterization learning
CN114565038A (en) Intelligent electric meter fault diagnosis method based on improved capsule network
CN116596151A (en) Traffic flow prediction method and computing device based on time-space diagram attention
CN110598326A (en) Well testing interpretation method based on artificial intelligence
CN112288744B (en) SAR image change detection method based on integer reasoning quantification CNN
CN117056865B (en) Method and device for diagnosing operation faults of machine pump equipment based on feature fusion
CN117493946A (en) VGG network-based small sample indicator diagram diagnosis method
CN117034808A (en) Natural gas pipe network pressure estimation method based on graph attention network
CN115660221B (en) Oil and gas reservoir economic recoverable reserve assessment method and system based on hybrid neural network
CN116754231A (en) Method for rapidly diagnosing faults of rolling bearing based on RegNet-SES
Hu et al. Research on the fault identification method of oil pumping unit based on residual network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination