CN117668493A - Tobacco equipment fault prediction method and system

Info

Publication number: CN117668493A
Application number: CN202311756833.6A
Authority: CN
Inventors: 蔡嘉; 杨渊策; 贾宁; 林琴萍; 陈燎
Original assignee: Tianjin Yingzhi Technology Co ltd
Current assignee: Tianjin Yingzhi Technology Co ltd
Priority date: 2023-12-20
Filing date: 2023-12-20
Publication date: 2024-03-08

Abstract

The invention discloses a tobacco equipment fault prediction method and system. The method comprises: obtaining historical data of target tobacco equipment; preprocessing the historical data to obtain feature extraction data; generating fault data from the feature extraction data using the Fast AutoAugment method to obtain balanced data; constructing and optimizing a tobacco equipment fault prediction model, wherein a KD-WaveNet fault prediction model comprising an input module, a dilated convolution module and an output module is constructed and then optimized using a knowledge distillation method based on a generative adversarial network; and predicting the fault probability from real-time data of the target tobacco equipment using the KD-WaveNet fault prediction model, and determining the fault condition at the next moment according to a set fault threshold. The invention can predict equipment faults in time and provide important early warning information for maintenance personnel.

Description

Tobacco equipment fault prediction method and system

Technical Field

The invention relates to the technical field of tobacco equipment maintenance, in particular to a tobacco equipment fault prediction method and system.

Background

Equipment is self-evidently important in manufacturing enterprises, particularly those that rely on highly automated equipment, such as tobacco companies. As enterprises grow, downtime losses caused by equipment failure increase dramatically, and a single outage may cost an enterprise hundreds of millions of dollars. In an automated manufacturing environment, sudden equipment failure may even lead to dangerous events such as fires and explosions, which threaten the lives of operators and maintenance personnel. Maintenance and management of tobacco company equipment is therefore particularly important.

Equipment maintenance plays an irreplaceable role in enterprise production and operation. However, most current maintenance schemes rely on regular preventive maintenance and post-failure diagnostic maintenance. On the one hand, because equipment conditions differ, regular maintenance inevitably leads to either excessive or insufficient maintenance; on the other hand, post-failure maintenance is unplanned, easily disrupts the overall production plan, and greatly affects company operations. In addition, the emerging field of fault prediction struggles to balance efficiency and accuracy, and therefore has difficulty meeting enterprises' requirements for both.

Disclosure of Invention

The invention aims to provide an accurate and efficient tobacco equipment fault prediction method and system. The prediction model provided by the invention has a simple structure and is easy to implement; it is convenient to deploy, predicts efficiently in real time, accurately predicts equipment faults, and provides important early warning information for maintenance personnel.

In order to achieve the above object, the present invention provides the following solutions:

A tobacco equipment fault prediction method, comprising:

acquiring and preprocessing historical data of target tobacco equipment, wherein the historical data comprises equipment operation data, external environment data and fault data; preprocessing the historical data to obtain feature extraction data; generating fault data from the feature extraction data using the Fast AutoAugment method to obtain balanced data;

constructing and optimizing a fault prediction model, wherein the constructing comprises constructing a KD-WaveNet fault prediction model comprising an input module, a dilated convolution module and an output module; the input module adjusts the data dimension; the dilated convolution module extracts and analyzes data features; the output module outputs the fault probability; the optimizing comprises optimizing the KD-WaveNet fault prediction model using a knowledge distillation method based on a generative adversarial network;

acquiring real-time data generated by the operation of the target tobacco equipment, inputting the real-time data into the KD-WaveNet fault prediction model for analysis, outputting the fault probability at the next moment, comparing the fault probability with a preset fault threshold, determining the fault condition of the tobacco equipment at the next moment, and raising a timely alarm to alert maintenance personnel; the fault condition is either faulty or non-faulty.

Optionally, preprocessing the historical data to obtain feature extraction data specifically includes:

performing outlier processing and normalization on the historical data to obtain model input data;

extracting features from the model input data using a one-dimensional dilated convolution network to obtain feature extraction data; the one-dimensional dilated convolution network comprises two one-dimensional dilated convolution layers for feature extraction and one max pooling layer for dimension reduction; a batch normalization (BN) layer is added between the two one-dimensional dilated convolution layers for normalization.

Optionally, generating fault data from the feature extraction data using the Fast AutoAugment method specifically includes:

constructing a policy space; the policy space contains a plurality of policies; each policy is composed of a plurality of sub-policies; each sub-policy includes a plurality of data augmentation operations;

performing policy search on the policy space based on Bayesian optimization to generate a candidate policy set: dividing the feature extraction data into K sub-datasets, performing Bayesian optimization on each sub-dataset, selecting the best N samples of each Bayesian optimization run as candidate policies, and repeating the Bayesian optimization T times on each sub-dataset to generate Y candidate policies, where Y = K × N × T;

performing policy evaluation on the candidate policy set using a density-based method to find the optimal data generation policy: dividing the feature extraction data into a training set Dtrain and a validation set Dvalid, training a random forest prediction model on Dtrain, predicting on Dvalid with the random forest model, judging the similarity between Dtrain and Dvalid by minimizing a loss function, and finding the optimal data generation policy; Dtrain is not augmented, while Dvalid is augmented;

generating fault data from the feature extraction data using the optimal data generation policy, and outputting balanced data.

Optionally, constructing the fault prediction model specifically includes:

building the KD-WaveNet fault prediction model, which comprises an input module, two identical dilated convolution modules and an output module, wherein gated activation units and residual and skip connections are introduced to mitigate gradient explosion and speed up training; the input module adjusts the data dimension to meet the requirements of model analysis; the dilated convolution modules extract features and analyze information; the output module outputs the fault probability.

Optionally, pre-training the fault prediction model specifically includes:

inputting the balanced data into the KD-WaveNet fault prediction model and training it to minimize the loss between the predicted and true fault values, so as to find an optimal KD-WaveNet fault prediction model and reduce the subsequent optimization time.

Optionally, the fault prediction model is optimized as follows:

building and training a high-performing teacher network; the teacher network is a Multi-Channel Transformer model comprising a data input and processing layer, an encoder, a decoder and an output layer; the data input and processing layer comprises a patch division layer and a positional encoding layer; the encoder comprises a locally enhanced multi-head attention mechanism, a feed-forward layer, and two residual connection and layer normalization modules; the decoder comprises two locally enhanced multi-head attention mechanisms, a feed-forward layer, and two residual connection and layer normalization modules; the output layer comprises a flatten layer and a linear layer; the training process finds an optimal teacher network by minimizing a loss function;

training the KD-WaveNet fault prediction model through knowledge distillation: constructing a generative adversarial network in which the KD-WaveNet fault prediction model serves as both the student network and the generator; inputting the feature information output by the teacher network (Multi-Channel Transformer) and the feature information output by the student network (KD-WaveNet) into a discriminator, which judges how similar the two are; training the generative adversarial network until the difference between the feature information output by the student network and by the teacher network is minimal; and training the KD-WaveNet student network by minimizing the weighted sum of the soft-label and hard-label terms to obtain an optimal KD-WaveNet fault prediction model; the soft label is the difference between the teacher network's predicted value and the student network's predicted value; the hard label is the difference between the student network's predicted value and the actual value.

Optionally, finding the optimal teacher network by minimizing the loss function specifically includes:

uniformly dividing the balanced data into a plurality of non-overlapping time-period patches, each patch being regarded as the minimum unit of prediction; inputting the patches into the data input and processing layer and splitting them into M dimensions; feeding each dimension independently into the teacher network for training, each outputting a fault probability for the next moment; computing a weighted sum of the M predicted probabilities in the output layer and taking it as the final predicted value; and obtaining the best-performing teacher network Multi-Channel Transformer model by minimizing a loss function.
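The patch division and the weighted summation of per-dimension predictions can be illustrated with a minimal sketch. The function names, the patch length, the example probabilities and the choice to normalize the weights so that they sum to 1 are assumptions made for illustration; the source does not specify them, and the Transformer itself is not modelled here.

```python
from typing import List, Sequence


def split_into_patches(series: Sequence[float], patch_len: int) -> List[List[float]]:
    """Split a series into equal, non-overlapping patches; a ragged tail is dropped."""
    n_patches = len(series) // patch_len
    return [list(series[i * patch_len:(i + 1) * patch_len]) for i in range(n_patches)]


def combine_channel_predictions(probs: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of the M per-dimension fault probabilities.

    Assumption: weights are normalized to sum to 1 so the result stays in [0, 1].
    """
    if len(probs) != len(weights) or not probs:
        raise ValueError("probs and weights must be non-empty and of equal length")
    total = sum(weights)
    return sum(p * w for p, w in zip(probs, weights)) / total


# Hypothetical data: one feature dimension, cut into patches of length 3.
voltage = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
patches = split_into_patches(voltage, patch_len=3)

# Hypothetical per-dimension predictions (M = 4) combined with equal weights.
final_prob = combine_channel_predictions([0.2, 0.8, 0.5, 0.5], [1.0, 1.0, 1.0, 1.0])
```

In a trained model the weights would presumably be learned by the output layer rather than fixed by hand.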

The invention also provides a tobacco equipment fault prediction system, which comprises:

a data collection and processing module, used for acquiring historical data of the target tobacco equipment and preprocessing the historical data to obtain feature extraction data; the historical data comprises equipment operation data, external environment data and fault data; fault data is generated from the feature extraction data using the Fast AutoAugment method to obtain balanced data;

an offline model construction and optimization module, used for receiving the balanced data of the tobacco equipment, constructing a fault prediction model to analyze and predict on the balanced data, and optimizing the fault prediction model using a knowledge distillation method based on a generative adversarial network; the fault prediction model is KD-WaveNet and comprises an input module, a dilated convolution module and an output module;

an online real-time prediction module, used for receiving data generated by the equipment during real-time operation, inputting the real-time data into the KD-WaveNet fault prediction model for analysis and prediction, outputting the fault probability at the next moment, comparing the fault probability with a preset fault threshold, determining whether the tobacco equipment will fail at the next moment, and raising a timely alarm to alert maintenance personnel.

According to the specific embodiment provided by the invention, the invention discloses the following technical effects:

The invention discloses a tobacco equipment fault prediction method and system. The method comprises: obtaining historical data of target tobacco equipment, the historical data comprising equipment operation data, external environment data and fault data; preprocessing the historical data to obtain feature extraction data; generating fault data from the feature extraction data using the Fast AutoAugment method to obtain balanced data; constructing and optimizing a tobacco equipment fault prediction model, wherein a KD-WaveNet fault prediction model comprising an input module, a dilated convolution module and an output module is constructed and then optimized using a knowledge distillation method based on a generative adversarial network; and predicting from the real-time data of the target tobacco equipment with the KD-WaveNet fault prediction model to obtain the fault probability at the next moment, which is compared with a set fault threshold to obtain the fault condition at the next moment. The invention can predict equipment faults in time and provide important early warning information for maintenance personnel.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions of the prior art, the drawings that are needed in the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a schematic flow chart of a method for predicting faults of tobacco equipment according to the present invention;

FIG. 2 is a logic flow diagram of the tobacco equipment fault prediction method in this embodiment;

FIG. 3 is an overall structure diagram of KD-WaveNet in this embodiment;

FIG. 4 is an overall structure diagram of the Multi-Channel Transformer in this embodiment;

FIG. 5 is an overall structure diagram of knowledge distillation based on a generative adversarial network in this embodiment;

FIG. 6 is a schematic diagram of the tobacco equipment fault prediction system of the present invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

The invention aims to provide a tobacco equipment fault prediction method and system, which can accurately predict equipment faults in time and provide important early warning information for maintenance personnel.

In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description.

As shown in fig. 1-2, the present invention provides a method for predicting a failure of tobacco equipment, comprising:

Step 100: acquiring and preprocessing historical data of target tobacco equipment, wherein the historical data comprises equipment operation data, external environment data and fault data; preprocessing the historical data to obtain feature extraction data; generating fault data from the feature extraction data using the Fast AutoAugment method to obtain balanced data;

Step 200: constructing and optimizing a fault prediction model, wherein the constructing comprises constructing a KD-WaveNet fault prediction model comprising an input module, a dilated convolution module and an output module; the input module adjusts the data dimension; the dilated convolution module extracts and analyzes data features; the output module outputs the fault probability; the optimizing comprises optimizing the KD-WaveNet fault prediction model using a knowledge distillation method based on a generative adversarial network;

Step 300: acquiring real-time data generated by the operation of the target tobacco equipment, inputting the real-time data into the KD-WaveNet fault prediction model for analysis, outputting the fault probability at the next moment, comparing the fault probability with a preset fault threshold, determining the fault condition of the tobacco equipment at the next moment, and raising a timely alarm to alert maintenance personnel; the fault condition is either faulty or non-faulty.
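The threshold comparison in step 300 can be sketched as follows. The function name, the default threshold of 0.5 and the use of "greater than or equal to" at the boundary are illustrative assumptions; the source only states that the probability is compared with a preset threshold.

```python
def fault_condition(fault_probability: float, threshold: float = 0.5) -> str:
    """Map the predicted next-moment fault probability to a fault condition.

    Assumption: a probability at or above the threshold counts as faulty.
    """
    if not 0.0 <= fault_probability <= 1.0:
        raise ValueError("fault_probability must lie in [0, 1]")
    return "faulty" if fault_probability >= threshold else "non-faulty"


# Hypothetical model outputs for two consecutive moments.
first = fault_condition(0.7)
second = fault_condition(0.2)
```

An alarm to maintenance personnel would be triggered whenever the returned condition is "faulty".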

As a specific embodiment, step 100 includes:

performing outlier processing and normalization on the historical data to obtain model input data; extracting features from the model input data with a one-dimensional dilated convolution network to obtain feature extraction data; and generating fault data from the feature extraction data using the Fast AutoAugment method to obtain balanced data.

The specific method for feature extraction is as follows:

inputting the model input data into a one-dimensional dilated convolution network and outputting feature extraction data; the one-dimensional dilated convolution network comprises two one-dimensional dilated convolution layers for feature extraction and one max pooling layer for dimension reduction; a batch normalization (BN) layer is added between the two one-dimensional dilated convolution layers for normalization.

The specific method for generating the fault data is as follows:

constructing a policy space; the policy space contains a plurality of policies; each policy is composed of a plurality of sub-policies; each sub-policy includes a plurality of data augmentation operations; performing policy search on the policy space based on Bayesian optimization to generate a candidate policy set; performing policy evaluation on the candidate policy set using a density-based method to find the optimal data generation policy; and generating fault data from the feature extraction data using the optimal data generation policy, and outputting balanced data.

The specific method for policy search is as follows:

dividing the feature extraction data into K sub-datasets, performing Bayesian optimization on each sub-dataset, selecting the best N samples of each Bayesian optimization run as candidate policies, and repeating the Bayesian optimization T times on each sub-dataset to generate Y candidate policies, where Y = K × N × T.

The specific method for policy evaluation is as follows:

dividing the feature extraction data into a training set Dtrain and a validation set Dvalid, training a random forest prediction model on Dtrain, predicting on Dvalid with the random forest model, judging the similarity between the two data sets by minimizing a loss function, and finding the optimal data generation policy; Dtrain is not augmented, while Dvalid is augmented.

As a specific embodiment, step 200 includes:

building the KD-WaveNet fault prediction model, which comprises an input module, two identical dilated convolution modules and an output module, wherein gated activation units and residual and skip connections are introduced to mitigate gradient explosion and speed up training; pre-training the KD-WaveNet fault prediction model by loss minimization, so as to reduce optimization complexity; and optimizing the KD-WaveNet fault prediction model using a knowledge distillation method based on a generative adversarial network to obtain the most efficient KD-WaveNet fault prediction model.

The fault prediction model is optimized as follows:

constructing a high-performing teacher network, the Multi-Channel Transformer; training the teacher network by minimizing a loss function to obtain an optimal model; and optimizing the KD-WaveNet fault prediction model using a knowledge distillation method based on a generative adversarial network.

The specific method for building the teacher model Multi-Channel Transformer is as follows:

the teacher network Multi-Channel Transformer model comprises a data input and processing layer, an encoder, a decoder and an output layer; the data input and processing layer comprises a word embedding layer and a positional encoding layer; the encoder comprises a locally enhanced multi-head attention mechanism, a feed-forward layer, and two residual connection and layer normalization modules; the decoder comprises two locally enhanced multi-head attention mechanisms, a feed-forward layer, and two residual connection and layer normalization modules; the output layer comprises a flatten layer and a linear layer.

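The method for finding the optimal teacher network by minimizing the loss function specifically comprises the following steps:

uniformly dividing the balanced data into a plurality of non-overlapping time-period patches, each patch being regarded as the minimum unit of prediction; inputting the patches into the data input and processing layer and splitting them into M dimensions; feeding each dimension independently into the teacher network for training, each outputting a fault probability for the next moment; computing a weighted sum of the M predicted probabilities in the output layer and taking it as the final predicted value; and obtaining the best-performing teacher network Multi-Channel Transformer model by minimizing a loss function.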

The specific method for optimizing the KD-WaveNet fault prediction model using the knowledge distillation method based on a generative adversarial network is as follows:

constructing a generative adversarial network in which the KD-WaveNet fault prediction model serves as both the student network and the generator; inputting the feature information output by the teacher network (Multi-Channel Transformer) and the feature information output by the student network (KD-WaveNet) into a discriminator, which judges how similar the two are; training the generative adversarial network until the difference between the feature information output by the student network and by the teacher network is minimal; and training the KD-WaveNet student network by minimizing the weighted sum of the soft-label and hard-label terms to obtain an optimal KD-WaveNet fault prediction model; the soft label is the difference between the teacher network's predicted value and the student network's predicted value; the hard label is the difference between the student network's predicted value and the actual value.
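The weighted sum of the soft-label and hard-label terms can be sketched as below. This is only an illustration under stated assumptions: both differences are measured with mean squared error, and the weight `alpha` and the example values are hypothetical. The adversarial part (the discriminator comparing teacher and student features) is not covered by this sketch.

```python
from typing import Sequence


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equally long sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distillation_loss(student: Sequence[float], teacher: Sequence[float],
                      target: Sequence[float], alpha: float = 0.5) -> float:
    """Weighted sum of the soft term (student vs. teacher) and hard term (student vs. truth).

    Assumption: MSE is used for both terms and alpha weights the soft term.
    """
    soft = mse(student, teacher)   # difference between teacher and student predictions
    hard = mse(student, target)    # difference between student prediction and actual value
    return alpha * soft + (1.0 - alpha) * hard


# Hypothetical fault probabilities for two time steps.
loss = distillation_loss(student=[0.6, 0.2], teacher=[0.8, 0.2], target=[1.0, 0.0])
```

With these numbers the soft term is 0.02 and the hard term is 0.10, so equal weighting gives 0.06.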

Based on the above technical scheme, a specific implementation is described below, taking a rail guided vehicle of a tobacco company as an example.

S1, collecting historical data of the rail guided vehicle, which is the target tobacco equipment, and constructing a data set;

S2, preprocessing the data set, converting the collected raw data into the information required for model analysis, namely the model input data;

S3, building a one-dimensional dilated convolutional neural network to extract data features and generate feature extraction data;

S4, generating fault data from the feature extraction data using the Fast AutoAugment algorithm to obtain balanced data;

S5, building and pre-training the KD-WaveNet fault prediction model;

S6, building and training the Multi-Channel Transformer fault prediction model;

S7, optimizing the KD-WaveNet fault prediction model using a knowledge distillation method based on a generative adversarial network, that is, training the KD-WaveNet fault prediction model with the Multi-Channel Transformer fault prediction model as the teacher;

S8, acquiring real-time data and analyzing it with the KD-WaveNet fault prediction model to predict the fault condition at the next moment, so that an early warning is issued before the fault actually occurs.

Further, the specific method for constructing the data set of the tobacco equipment in step S1 is as follows:

S1-1, collecting operation feature data of the tobacco equipment by means of voltage and current sensors, the operation feature data comprising the voltage and current values of the equipment during operation, and segmenting the operation feature data at a fixed standard time interval;

S1-2, collecting external environment feature data of the tobacco equipment by means of temperature and humidity sensors, the external environment feature data comprising the temperature and humidity values during equipment operation, and segmenting the external environment feature data at the same standard time interval;

S1-3, collecting fault log data of the tobacco equipment, assigning a value of 1 to each time period in which a fault occurs, and using the same time interval as the sensors to facilitate subsequent matching;

S1-4, joining the tobacco equipment feature data and the fault data on the time index to build a prediction table.
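Steps S1-1 to S1-4 can be sketched as a join on a shared time index. The function name, the column names and the example readings are hypothetical. Two further assumptions are made: intervals without a logged fault are labelled 0, and only intervals present in both sensor sources are kept.

```python
from typing import Dict, List, Set


def build_prediction_table(operation: Dict[int, Dict[str, float]],
                           environment: Dict[int, Dict[str, float]],
                           fault_intervals: Set[int]) -> List[dict]:
    """Join operation data, environment data and fault labels on the time index."""
    rows = []
    shared = sorted(set(operation) & set(environment))  # assumption: inner join
    for t in shared:
        row = {"time_index": t}
        row.update(operation[t])
        row.update(environment[t])
        # Fault periods get the value 1 (from the source); other periods get 0 (assumption).
        row["fault"] = 1 if t in fault_intervals else 0
        rows.append(row)
    return rows


# Hypothetical readings, one entry per standard time interval.
operation = {0: {"voltage": 220.0, "current": 1.5},
             1: {"voltage": 218.0, "current": 1.9},
             2: {"voltage": 221.0, "current": 1.4}}
environment = {0: {"temperature": 24.0, "humidity": 0.55},
               1: {"temperature": 25.5, "humidity": 0.60}}
table = build_prediction_table(operation, environment, fault_intervals={1})
```

Interval 2 is dropped here because it has no environment reading; missing individual values would instead be handled in the preprocessing step.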

Further, the specific method for preprocessing the data in step S2 is as follows:

S2-1, handling missing values in the prediction table: filling rows in which individual feature values are missing using Newton interpolation, and removing rows in which all features are missing;

S2-2, handling outliers in the prediction table: computing descriptive statistics of the features and handling unreasonable values and outliers with the aid of K-means clustering, to improve prediction performance;

S2-3, normalizing the prediction table: using the MinMaxScale data normalization method to scale all samples in the prediction table to the interval [0, 1], eliminating the influence of differing units.

Further, the specific method for filling missing values by Newton interpolation in step S2-1 is as follows:

$$P(x) = f(x_0) + f(x_0, x_1)(x - x_0) + f(x_0, x_1, x_2)(x - x_0)(x - x_1) + \cdots + f(x_0, x_1, \ldots, x_n)(x - x_0)(x - x_1)\cdots(x - x_{n-1})$$

where $P(x)$ is the value used to fill the missing entry, and $f(x_0, x_1, \ldots, x_n)$ is the n-th order divided difference of the data, calculated as follows:

$$f(x_0, x_1, \ldots, x_n) = \frac{f(x_1, \ldots, x_n) - f(x_0, \ldots, x_{n-1})}{x_n - x_0}$$

where $(x_0, y_0), (x_1, y_1), \ldots, (x_n, y_n)$ is a set of data points with $x_0 < x_1 < \cdots < x_n$ and $f(x_i) = y_i$.
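A minimal sketch of Newton interpolation for filling a single missing value is given below. The function names and the sample points are hypothetical; the divided differences are computed with the standard recursive definition, and the polynomial is evaluated in nested form.

```python
from typing import List, Sequence


def divided_differences(xs: Sequence[float], ys: Sequence[float]) -> List[float]:
    """Return [f(x0), f(x0,x1), ..., f(x0,...,xn)] for distinct xs."""
    coef = [float(y) for y in ys]
    n = len(xs)
    for order in range(1, n):
        # Update from the end so lower-order values are still available.
        for i in range(n - 1, order - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - order])
    return coef


def newton_interpolate(xs: Sequence[float], ys: Sequence[float], x: float) -> float:
    """Evaluate the Newton interpolation polynomial P(x) through the given points."""
    coef = divided_differences(xs, ys)
    result = coef[-1]
    for i in range(len(coef) - 2, -1, -1):
        result = result * (x - xs[i]) + coef[i]
    return result


# Hypothetical known readings at time indices 0, 1 and 3; index 2 is missing.
filled = newton_interpolate([0.0, 1.0, 3.0], [1.0, 2.0, 10.0], 2.0)
```

The three sample points lie on y = x² + 1, so the filled value at x = 2 is 5.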

Further, the specific method for MinMaxScale data normalization in step S2-3 is as follows:

$$y_i = \frac{x_i - \min(x)}{\max(x) - \min(x)}$$

where $y_i$ is the i-th value after normalization, $x_i$ is the original i-th value, $\max(x)$ is the maximum value in the original data, and $\min(x)$ is the minimum value in the original data.
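The normalization can be sketched for a single feature column as follows. The function name and the sample values are hypothetical, and mapping a constant column to 0 is an assumption added to avoid division by zero; the source does not address that case.

```python
from typing import List, Sequence


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Scale values to [0, 1] via (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column carries no information and maps to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical temperature readings.
scaled = min_max_scale([10.0, 20.0, 30.0])
```

In practice each column of the prediction table would be scaled separately, using the minimum and maximum observed in the historical data.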

Further, the specific method for constructing the one-dimensional dilated convolutional neural network in step S3 is as follows:

S3-1, building one-dimensional dilated convolution layers to extract features: two one-dimensional dilated convolution layers are used to extract data features, and a batch normalization (BN) layer is added between them for normalization;

S3-2, building a max pooling layer to reduce the dimension of the data: the max pooling layer downsamples the data in each channel, reducing the data dimension. During max pooling, the length of each feature vector is compressed to half of its original length, which further improves the running speed and accuracy of the model;

S3-3, inputting data into the model and outputting the processed feature extraction data.
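The two building blocks of step S3 can be illustrated with a single-channel sketch. The kernel values, the dilation factor and the absence of padding are assumptions for illustration; the BN layer and learned weights are omitted.

```python
from typing import List, Sequence


def dilated_conv1d(signal: Sequence[float], kernel: Sequence[float],
                   dilation: int = 1) -> List[float]:
    """One-dimensional dilated convolution without padding (single channel)."""
    span = (len(kernel) - 1) * dilation  # distance covered by the dilated kernel
    return [sum(k * signal[i + j * dilation] for j, k in enumerate(kernel))
            for i in range(len(signal) - span)]


def max_pool1d(signal: Sequence[float], pool_size: int = 2) -> List[float]:
    """Non-overlapping max pooling; pool_size=2 halves the sequence length."""
    return [max(signal[i:i + pool_size])
            for i in range(0, len(signal) - pool_size + 1, pool_size)]


# Hypothetical input: kernel [1, 1] with dilation 2 sums samples two steps apart.
x = [1, 2, 3, 4, 5, 6, 7, 8]
features = dilated_conv1d(x, kernel=[1, 1], dilation=2)
pooled = max_pool1d(features, pool_size=2)
```

The dilation lets a width-2 kernel cover three time steps, and the pooling step reduces six feature values to three, matching the halving described in S3-2.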

Further, the specific method for generating the data in step S4 is as follows:

S4-1, constructing the policy space: the policy space contains several policies, each policy contains 5 sub-policies, and each sub-policy contains 2 data augmentation operations. Each operation corresponds to 1 data augmentation method and has 3 parameters to be tuned: the specific data augmentation method used, the probability p of applying it, and the intensity λ with which it is applied. The operations in each sub-policy are combined in series to form a policy pool. The data augmentation operations comprise data flipping, data scaling, data replacement, window normalization, window slicing, Gaussian noise addition and anomaly label expansion;

S4-2, searching policies based on Bayesian optimization to generate a candidate policy set: splitting the feature extraction data into 5 sub-datasets, performing Bayesian optimization on each of the 5 sub-datasets, selecting the best 10 samples of each Bayesian optimization run as candidate policies, and repeating the Bayesian optimization 2 times on each sub-dataset, generating a set of 5 × 10 × 2 = 100 candidate policies;

S4-3, evaluating policies using a density-based method to obtain the optimal data generation policy: dividing the feature extraction data into a training set Dtrain and a validation set Dvalid, wherein Dtrain is not augmented and Dvalid is augmented to achieve balance; training a random forest prediction model on Dtrain and using it to predict on Dvalid; judging how well Dtrain and Dvalid match through the loss function of the prediction results; and finding the optimal data generation policy;

S4-4, generating fault data using the optimal data generation policy to obtain balanced data.
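The structure of a sub-policy (two operations applied in series, each with a probability p and an intensity λ) and the candidate count from S4-2 can be sketched as below. The operation implementations, the names and the parameter values are illustrative assumptions; the Bayesian search and the random-forest evaluation are not modelled.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

Operation = Tuple[str, float, float]  # (method name, probability p, intensity lam)

# Hypothetical implementations of three of the listed augmentation methods.
OPS: Dict[str, Callable[[List[float], float, random.Random], List[float]]] = {
    "flip": lambda x, lam, rng: list(reversed(x)),            # intensity unused
    "scale": lambda x, lam, rng: [v * (1.0 + lam) for v in x],
    "gaussian_noise": lambda x, lam, rng: [v + rng.gauss(0.0, lam) for v in x],
}


def apply_sub_policy(x: Sequence[float], sub_policy: Sequence[Operation],
                     rng: random.Random) -> List[float]:
    """Apply the operations of one sub-policy in series, each with probability p."""
    out = list(x)
    for name, p, lam in sub_policy:
        if rng.random() < p:
            out = OPS[name](out, lam, rng)
    return out


def candidate_count(k: int, n: int, t: int) -> int:
    """Number of candidate policies: Y = K x N x T."""
    return k * n * t


rng = random.Random(0)
sub_policy = [("flip", 1.0, 0.0), ("scale", 1.0, 0.5)]  # p = 1.0 makes both always apply
augmented = apply_sub_policy([1.0, 2.0, 3.0], sub_policy, rng)
n_candidates = candidate_count(k=5, n=10, t=2)
```

With p = 1.0 the sample is first reversed and then scaled by 1.5; with p below 1 each operation would be skipped at random, which is what the search tunes alongside λ.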

Further, the specific steps of constructing and pre-training the KD-WaveNet fault prediction model in step S5 are as follows, and the specific architecture thereof is shown in fig. 3:

and S5-1, building a KD-WaveNet fault prediction model, wherein the model comprises an input module, an expansion convolution module, residual error and jump connection, a gate control activation unit and an output module. The input module is used for adjusting the data dimension; the expansion convolution module is used for increasing receptive fields and enhancing the capturing capacity of the model to the characteristics; the residual error and jump connection are used for relieving gradient explosion and accelerating training; the gating activation unit is used for adjusting output information; the output module outputs fault probability;

s5-1, constructing two identical expansion convolution modules, setting the maximum expansion factor of each expansion convolution module to be 64, enabling the width of a convolution layer filter to be 2, and enabling each layer of convolution to adopt 32 filters; the expansion convolution module comprises a causal convolution and an expansion convolution; the causal hole convolution layer is used for processing time sequence data and long-term dependence, so that complexity is reduced; the expansion convolution layer comprises a cavity convolution layer, two activation functions (one tanh, one sigmoid) and two one-dimensional convolution layers, the receptive field is enlarged, and sequence information is captured;

S5-2, constructing residual connections: residual connections are adopted between the convolution layers to avoid vanishing gradients, and each convolution layer feeds a gated activation unit that regulates the information passed to the next layer, the gated activation unit being given by the following formula:

z = tanh(W_{f,k} * x) ⊙ σ(W_{g,k} * x)

wherein * denotes the convolution operation, ⊙ denotes the element-wise multiplication operation, σ(·) is the sigmoid function, k is the layer index, f and g denote the filter and the gate respectively, and W is a learnable convolution kernel;

S5-3, constructing an output layer comprising two one-dimensional convolution layers and two ReLU activation functions, and outputting the result through softmax;

S5-4, taking the balance data as input data and the fault probability as output data, and pre-training the model by minimizing the MSE loss function, so as to reduce the subsequent optimization time.
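The gated activation unit and the effect of the expansion factors on the receptive field can be illustrated with a small sketch. It makes several assumptions that the text does not state: the expansion factors double from 1 up to the stated maximum of 64 (the usual WaveNet schedule), a single channel with scalar weights replaces the 32 filters, and the 1x1 convolutions of the full architecture are omitted. All function names are hypothetical.

```python
import math

def causal_dilated_conv(x, w, dilation):
    """Width-2 causal convolution: y[t] = w[0]*x[t-dilation] + w[1]*x[t].

    Samples before the start of the sequence are treated as zero, so y[t]
    never depends on future inputs.
    """
    return [
        w[0] * (x[t - dilation] if t >= dilation else 0.0) + w[1] * x[t]
        for t in range(len(x))
    ]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def gated_activation(x, w_f, w_g, dilation):
    """z = tanh(W_f * x) (element-wise product) sigmoid(W_g * x)."""
    filt = causal_dilated_conv(x, w_f, dilation)
    gate = causal_dilated_conv(x, w_g, dilation)
    return [math.tanh(f) * sigmoid(g) for f, g in zip(filt, gate)]

def residual_layer(x, w_f, w_g, dilation):
    """Return (residual output passed to the next layer, skip output)."""
    z = gated_activation(x, w_f, w_g, dilation)
    return [xi + zi for xi, zi in zip(x, z)], z

def receptive_field(dilations, width=2):
    """Number of past time steps visible to the last layer of the stack."""
    return 1 + sum((width - 1) * d for d in dilations)

# Assumed schedule: doubling expansion factors up to the stated maximum of 64.
DILATIONS = [1, 2, 4, 8, 16, 32, 64]

one_module = receptive_field(DILATIONS)        # a single module
two_modules = receptive_field(DILATIONS * 2)   # two identical modules stacked

signal = [0.1 * t for t in range(8)]
residual_out, skip_out = residual_layer(signal, (0.5, -0.3), (0.2, 0.4), dilation=2)
```

Under the doubling assumption, one module sees 128 time steps and two stacked modules see 255, even though every filter has width 2. The sigmoid branch acts as a gate in (0, 1) that scales how much of the tanh branch is passed on.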

Further, the specific steps for constructing the Multi-Channel Transformer fault prediction model in the step S6 are as follows, and the specific architecture thereof is shown in fig. 4:

S6-1, constructing a data input and processing layer: dividing the input data evenly into N non-overlapping patches, regarding each patch as the minimum unit of prediction, and applying position encoding to each patch using a trigonometric (sine and cosine) position encoding formula;

S6-2, constructing an encoder: constructing a locally enhanced multi-head attention mechanism, a feed-forward layer, and two residual connection and layer normalization modules, the encoder converting the input encoded information of each channel into a corresponding group of representation vectors;

S6-3, building a decoder: taking the representation vectors produced by the encoder and the output vector of the previous moment as input data, constructing two locally enhanced multi-head attention mechanisms, two residual connection and layer normalization modules, and a feed-forward layer, and converting the input data into an output sequence through the feed-forward neural network;

S6-4, building an output layer: building a flatten layer and a linear layer, and outputting the result through a softmax layer; the input vector is flattened, the multidimensional vector being reduced to a one-dimensional vector by weighted summation, to obtain the final fault probability vector;

S6-5, training the Multi-Channel Transformer model: following the channel-independence idea, dividing the balance data into independent channel data according to the feature dimension, feeding each single-channel data into the model as input for training, outputting the prediction probability of the next prediction unit, and selecting the optimal Multi-Channel Transformer model by minimizing the MSE loss function.
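The data preparation described in S6-1 and S6-5 (channel splitting, non-overlapping patches, trigonometric position encoding) can be sketched as below. The text does not give the exact encoding formula, so the sketch assumes the standard sinusoidal encoding of the original Transformer; it also assumes that an incomplete trailing patch is dropped. Function names and the toy data are hypothetical.

```python
import math

def split_channels(series):
    """Turn T time steps of M features into M independent single-channel lists."""
    return [list(channel) for channel in zip(*series)]

def make_patches(channel, patch_len):
    """Divide one channel into non-overlapping patches of equal length.

    Assumption: leftover samples that do not fill a whole patch are dropped.
    """
    n_patches = len(channel) // patch_len
    return [channel[i * patch_len:(i + 1) * patch_len] for i in range(n_patches)]

def positional_encoding(n_positions, d_model):
    """Sinusoidal encoding: sin on even indices, cos on odd indices."""
    table = []
    for pos in range(n_positions):
        row = []
        for i in range(d_model):
            angle = pos / (10000 ** (2 * (i // 2) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table

# Toy input: 6 time steps, 2 features (channels).
series = [[1, 10], [2, 20], [3, 30], [4, 40], [5, 50], [6, 60]]
channels = split_channels(series)
patches_per_channel = [make_patches(ch, patch_len=3) for ch in channels]

# Encode each patch position and add it to the patch values.
encoding = positional_encoding(n_positions=2, d_model=3)
encoded = [
    [[v + e for v, e in zip(patch, encoding[pos])] for pos, patch in enumerate(patches)]
    for patches in patches_per_channel
]
```

Each channel is processed on its own, so the same model weights can be reused across all feature dimensions, and the patch index rather than the individual time step becomes the position that the attention mechanism sees.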

Further, the specific steps of optimizing the model in step S7 by using the knowledge distillation method are as follows, and the specific architecture is shown in fig. 5:

S7-1, constructing a discriminator: the features generated by the Multi-Channel Transformer model and by the KD-WaveNet fault prediction model are input to the discriminator, and the KD-WaveNet fault prediction model is trained in a generative adversarial manner so that its features become consistent with those of the Multi-Channel Transformer model;

S7-2, training the KD-WaveNet fault prediction model by minimizing a loss function, the specific expression of which is as follows:

L = α·L_soft + (1-α)·L_hard

wherein L_soft refers to the difference between the prediction result output by the teacher network Multi-Channel Transformer and the prediction result output by the student network KD-WaveNet, L_hard refers to the difference between the prediction result of the student network and the true value, and the final loss function is the sum of the two differences weighted by α and (1-α).
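The weighted loss of S7-2 can be written out as a short sketch. The text only speaks of "differences", so measuring both terms with the mean squared error is an assumption (MSE is the loss the document uses elsewhere); the default α of 0.5 and the function names are likewise hypothetical. The adversarial discriminator term of S7-1 is not modelled here.

```python
def mse(a, b):
    """Mean squared error between two equally long sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def distillation_loss(teacher_pred, student_pred, target, alpha=0.5):
    """L = alpha * L_soft + (1 - alpha) * L_hard.

    L_soft: teacher prediction vs. student prediction.
    L_hard: student prediction vs. true value.
    """
    l_soft = mse(teacher_pred, student_pred)
    l_hard = mse(student_pred, target)
    return alpha * l_soft + (1.0 - alpha) * l_hard

# Toy fault probabilities for two time steps.
teacher = [0.8, 0.2]
student = [0.6, 0.4]
truth = [1.0, 0.0]
loss = distillation_loss(teacher, student, truth, alpha=0.5)
```

With α close to 1 the student mainly imitates the teacher; with α close to 0 it is trained almost entirely on the labels.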

Further, the specific steps of the online real-time prediction in step S8 are as follows:

S8-1, inputting real-time data into the trained KD-WaveNet fault prediction model, and outputting the fault probability y_t at the next moment;

S8-2, comparing the predicted fault probability with a preset fault threshold, and if the predicted fault probability exceeds the threshold, raising an alarm in time to remind the maintenance personnel of the tobacco enterprise to carry out maintenance.
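The threshold comparison of S8-2 amounts to a single check per prediction. In the sketch below the threshold value of 0.5 is purely illustrative (the text only says it is preset), and the function names are hypothetical.

```python
def fault_alarm(fault_probability, threshold=0.5):
    """Return True when the predicted probability exceeds the preset threshold."""
    if not 0.0 <= fault_probability <= 1.0:
        raise ValueError("fault probability must lie in [0, 1]")
    return fault_probability > threshold

def alarm_times(probabilities, threshold=0.5):
    """Indices of the time steps at which an alarm would be raised."""
    return [t for t, p in enumerate(probabilities) if fault_alarm(p, threshold)]

# Hypothetical stream of predicted fault probabilities y_t.
stream = [0.1, 0.7, 0.5, 0.9]
alarms = alarm_times(stream)
```

Lowering the threshold trades more false alarms for fewer missed faults, which matters when recall is the priority.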

The final aim of this patent is to predict whether a fault will occur at the next moment and to give timely early warning; the classification metrics AUC, accuracy and recall are therefore used to evaluate the models.

Because the fault data used in this patent are extremely unbalanced, a classification metric may come out too low and fail to reflect the actual effect of the model comprehensively. Considering also that the focus of this patent is the fault prediction of the guided vehicle, AUC is emphasized as the index of overall model performance, and recall is emphasized as the measure of how accurately faults are predicted. The performance of the models in the fault prediction of the guided vehicle is therefore evaluated mainly through AUC and recall.
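Why accuracy alone is misleading on unbalanced data can be seen from a small sketch of the three metrics. The data are invented for illustration, and the AUC is computed by the pairwise-ranking definition (the fraction of fault/normal pairs ranked correctly, ties counting one half); function names are hypothetical.

```python
def accuracy(labels, preds):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(1 for y, p in zip(labels, preds) if y == p) / len(labels)

def recall(labels, preds):
    """Fraction of true faults (label 1) that were predicted as faults."""
    positives = sum(1 for y in labels if y == 1)
    if positives == 0:
        raise ValueError("recall is undefined without positive samples")
    hits = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    return hits / positives

def auc(labels, scores):
    """Probability that a random fault scores higher than a random normal sample."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Nine normal samples and one fault: a model that never predicts a fault.
labels = [0] * 9 + [1]
always_normal = [0] * 10
acc_trivial = accuracy(labels, always_normal)
rec_trivial = recall(labels, always_normal)

# AUC uses the predicted probabilities rather than hard labels.
auc_example = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The trivial model reaches 90 percent accuracy while detecting no fault at all, which is why recall and the threshold-independent AUC are the more informative metrics here.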

Table 1. Comparison and evaluation of the performance of various models

As can be seen from table 1:

1) The teacher model proposed in this patent, the Multi-Channel Transformer model, performs best: its AUC is 0.86, 10.3 percent higher than that of the LSTM model, and its recall is 0.78, improved by 1.1 times compared with LSTM. Although its accuracy is only 0.89, this is mainly due to the imbalance of the dataset; the Multi-Channel Transformer model still identifies fault samples better, capturing faults at a higher recall.

2) Before knowledge distillation optimization (that is, as a plain WaveNet model), the KD-WaveNet fault prediction model proposed in this patent outperforms only the reference convolutional neural network (CNN) model. After knowledge distillation optimization it outperforms all reference models: its AUC is 0.83, 6.4 percent higher than LSTM and 36 percent higher than the model without distillation training; its recall is 0.57, 54 percent higher than LSTM and 1.1 times higher than the model without distillation training. This demonstrates that knowledge distillation greatly improves the performance of the KD-WaveNet model and brings it closer to that of the teacher network.

3) The teacher network Multi-Channel Transformer model has a parameter size of 110M, while the student network KD-WaveNet model has a parameter size of only 1M. Trained by knowledge distillation, the KD-WaveNet fault prediction model approaches the performance of the teacher network at much lower complexity, which facilitates deployment and real-time operation on a computer.
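The percentages quoted in points 1) to 3) can be cross-checked against each other with simple arithmetic. The sketch below derives the LSTM baseline implied by the student figures and then recomputes the teacher's gains; it assumes that "improved by 1.1 times" means a relative gain of about 110 percent. The derived baselines are implied values, not figures taken from Table 1, and the names are hypothetical.

```python
def baseline_from_gain(value, relative_gain):
    """Invert value = baseline * (1 + relative_gain)."""
    return value / (1.0 + relative_gain)

# Student (KD-WaveNet) figures: AUC 0.83 is 6.4% above LSTM, recall 0.57 is 54% above.
implied_lstm_auc = baseline_from_gain(0.83, 0.064)
implied_lstm_recall = baseline_from_gain(0.57, 0.54)

# Teacher (Multi-Channel Transformer) figures: AUC 0.86, recall 0.78.
teacher_auc_gain = 0.86 / implied_lstm_auc - 1.0
teacher_recall_gain = 0.78 / implied_lstm_recall - 1.0

# Parameter sizes: 110M (teacher) vs. 1M (student).
compression_ratio = 110 / 1

print(f"implied LSTM AUC    ~ {implied_lstm_auc:.2f}")
print(f"implied LSTM recall ~ {implied_lstm_recall:.2f}")
print(f"teacher AUC gain    ~ {teacher_auc_gain:.1%}")
print(f"teacher recall gain ~ {teacher_recall_gain:.1%}")
print(f"parameter reduction ~ {compression_ratio:.0f}x")
```

Under that reading the two sets of statements agree to within rounding: the same implied LSTM baseline reproduces both the student's and the teacher's quoted gains.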

Compared with the prior art, the invention has the beneficial effects that:

1. The tobacco equipment fault prediction method provided by the invention is both efficient and accurate. On the one hand, the invention uses the Fast AutoAugment method to balance the training data, which improves prediction accuracy; on the other hand, through knowledge distillation based on a generative adversarial network, the highly accurate but highly complex teacher network Multi-Channel Transformer is used to optimize the student model, yielding a KD-WaveNet fault prediction model with high accuracy and high efficiency. Efficient and accurate prediction is of great significance for the actual early warning process.

2. According to the method, the accurate and efficient fault prediction model is constructed through the collection, integration and analysis of the multi-source data, and compared with a traditional manual maintenance mode, the tobacco equipment fault prediction method based on deep learning is intelligent, efficient and accurate, and meets the requirements of enterprises on pursuing efficiency and quality.

First, in terms of data processing, the invention uses Newton interpolation, K-means clustering, MinMaxScaler normalization, Fast AutoAugment data generation and other methods to convert the original data into model input data.

Secondly, in terms of model construction, the invention provides a knowledge distillation method based on a generative adversarial network, which trains a KD-WaveNet fault prediction model that is both accurate and efficient, and realizes accurate real-time prediction of future fault conditions.

In addition, as shown in fig. 6, the present invention provides a tobacco apparatus failure prediction system, comprising:

In the present specification, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another.

The principles and embodiments of the present invention have been described herein with reference to specific examples, and the description of these examples is intended only to assist in understanding the core concept of the invention; modifications made by those of ordinary skill in the art in light of the present teachings also fall within the scope of the present invention. In view of the foregoing, this description should not be construed as limiting the invention.

Claims

1. A method for predicting a malfunction of a tobacco plant, comprising:

2. The tobacco equipment failure prediction method according to claim 1, wherein preprocessing the history data to obtain feature extraction data specifically comprises:

3. The tobacco plant fault prediction method according to claim 1, wherein generating fault data for the feature extraction data by using the Fast AutoAugment method specifically comprises:

constructing a strategy space; the strategy space contains a plurality of strategies; each strategy is composed of a plurality of sub-strategies; each sub-strategy includes a plurality of data enhancement operations;

4. The tobacco plant fault prediction method according to claim 1, wherein constructing the KD-WaveNet fault prediction model specifically comprises: the KD-WaveNet fault prediction model is built, and comprises an input module, two identical expansion convolution modules and an output module, wherein a gating activation unit and residual error and jump connection are introduced to relieve gradient explosion and accelerate training speed; the input module is used for adjusting the data dimension and accords with the model analysis requirement; the expansion convolution module is used for extracting characteristics and analyzing information; the output module is used for outputting the fault probability.

5. The tobacco plant fault prediction method according to claim 1, wherein pre-training the KD-WaveNet fault prediction model specifically comprises: inputting the balance data into the KD-WaveNet fault prediction model, training with the goal of minimizing the loss between the predicted fault value and the true fault value, and finding the optimal KD-WaveNet fault prediction model, thereby reducing the optimization time.

6. The tobacco plant fault prediction method of claim 1, wherein the KD-WaveNet fault prediction model is optimized using a knowledge distillation method based on a generative adversarial network:

7. The tobacco plant failure prediction method of claim 6, wherein finding an optimal teacher network based on loss function minimization comprises: uniformly dividing the balance data into a plurality of non-overlapping time-period patches, regarding each patch as the minimum unit of prediction, inputting the patches into the data input and processing layer, splitting them into M dimensions, feeding each dimension independently into the teacher network for training, outputting the fault probability at the next moment for each dimension, carrying out a weighted summation of the M predicted probabilities through the output layer, taking the weighted sum as the final predicted value, and obtaining the teacher network Multi-Channel Transformer model with the best effect by minimizing a loss function.

8. A tobacco plant fault prediction system, comprising: