CN110929786A - Data augmentation method and electronic equipment - Google Patents

Data augmentation method and electronic equipment

Info

Publication number
CN110929786A
CN110929786A (application CN201911157799.4A)
Authority
CN
China
Prior art keywords
data
augmentation
label
sample
trained
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911157799.4A
Other languages
Chinese (zh)
Other versions
CN110929786B (en)
Inventor
蔺思宇
杨晨旺
马君
刘勇攀
刘涛
李素洁
王伟
史超
周景源
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Meifang Science And Technology Beijing Co ltd
Original Assignee
Meifang Science And Technology Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Meifang Science And Technology Beijing Co ltd filed Critical Meifang Science And Technology Beijing Co ltd
Priority to CN201911157799.4A priority Critical patent/CN110929786B/en
Publication of CN110929786A publication Critical patent/CN110929786A/en
Application granted granted Critical
Publication of CN110929786B publication Critical patent/CN110929786B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Eyeglasses (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The embodiment of the invention provides a data augmentation method and an electronic device. The method comprises: acquiring working condition label data, and generating random data of the same size as the working condition label data to serve as initialization data of the augmented data; and inputting the working condition label data and the initialization data of the augmented data into a trained working condition data augmentation model to obtain augmented working condition data, wherein the trained working condition data augmentation model is obtained by training with real-label normal data and falsely labeled initialization sample augmentation data. Starting from the large-sample data of the equipment, a data augmentation model is constructed that can generate data similar to the equipment's large-sample labeled data; this model is then adjusted with the small-sample working condition label data, so that the small-sample working condition label data can be augmented and the distribution space of the augmented small-sample data approaches the distribution breadth of the large-sample data.

Description

Data augmentation method and electronic equipment
Technical Field
The invention relates to the technical field of industrial information, in particular to a data augmentation method and electronic equipment.
Background
With the increasing level of modern industrial automation, the scale of modern industrial systems keeps growing and the cooperation among their parts keeps becoming more complex. Once a fault occurs in any part of an industrial system, the whole system cannot work normally, causing huge shutdown losses.
Industrial mechanical equipment is long-running equipment: industry practice generally requires an operating life of no less than 20 years and continuous operation of no less than three years. Provided the equipment is qualified when it leaves the factory, abnormal working conditions rarely occur during operation in real scenarios, so data carrying working condition labels from actual operation is hard to obtain. Many working condition diagnosis models for industrial mechanical equipment are therefore trained on small samples of working condition data; because normal data dominates, such models tend to classify most inputs as normal, which raises their missed-detection rate.
In addition, the prior art lacks a method for augmenting small-sample, one-dimensional signal data of industrial mechanical equipment, yet only a sufficient amount of data can effectively improve the accuracy of a working condition diagnosis model for such equipment.
Therefore, how to augment industrial machinery signal data has become an urgent problem to be solved in the industry.
Disclosure of Invention
Embodiments of the present invention provide a data augmentation method and an electronic device, so as to solve the technical problems mentioned in the foregoing background art, or at least partially solve the technical problems mentioned in the foregoing background art.
In a first aspect, an embodiment of the present invention provides a data augmentation method, including:
acquiring working condition label data, and generating random data of the same size as the working condition label data as initialization data of the augmented data;
inputting the working condition label data and the initialization data of the augmented data into a trained working condition data augmentation model to obtain augmented working condition data;
wherein the trained working condition data augmentation model is obtained by training with real-label normal data and falsely labeled initialization sample augmentation data.
More specifically, before the step of acquiring the working condition label data, the method further comprises:
splitting industrial raw data into real-label non-working-condition data and working condition label data;
and cleaning abnormal points from the real-label non-working-condition data to obtain the real-label normal data.
More specifically, the trained condition data augmentation model comprises a trained generator and a trained discriminator.
More specifically, before the step of inputting the working condition label data and the initialization data of the augmented data into the trained working condition data augmentation model, the method further includes:
acquiring real-label normal data, and generating any number of pieces of random data of the same size as the real-label normal data as initialization sample augmentation data;
inputting the initialization sample augmentation data into the generator of the data augmentation model to obtain falsely labeled sample augmentation pseudo-data, taking that pseudo-data as new initialization sample augmentation data and feeding it back into the generator for further training, until the loss function of the generator converges stably, thereby obtaining a trained generator;
mixing the falsely labeled sample augmentation pseudo-data with the real-label normal data and inputting the mixture into the discriminator of the data augmentation model, and obtaining a trained discriminator when the loss function of the discriminator converges stably;
and obtaining the trained data augmentation model from the trained generator and the trained discriminator.
More specifically, the generator in the data augmentation model is formed by combining a convolutional neural network encoder and a convolutional neural network decoder.
More specifically, the loss function of the generator is constructed as follows:
the real-label normal data and the initialization sample augmentation data are each processed by the convolutional neural network encoder to extract a feature vector of the real-label normal data and an initial augmentation feature vector, and a first loss function is obtained from the mean square error between these two feature vectors;
the initial augmentation feature vector is decoded by the convolutional neural network decoder to obtain sample augmentation pseudo-data, and a second loss function is obtained from the mean square error between the sample augmentation pseudo-data and the real-label normal data;
a third loss function is obtained from the cosine distance between the fast Fourier transform of the sample augmentation pseudo-data and the fast Fourier transform of the real-label normal data;
and the first, second and third loss functions are weighted and summed to obtain the loss function of the generator.
In a second aspect, an embodiment of the present invention provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the computer program to implement the steps of the data augmentation method according to the first aspect.
In a third aspect, an embodiment of the present invention provides a non-transitory computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the steps of the data augmentation method according to the first aspect.
According to the data augmentation method and the electronic device provided by the embodiments of the invention, a data augmentation model capable of generating data similar to the equipment's large-sample labeled data is first constructed from the large-sample data of the equipment, and this model is then adjusted with the small-sample working condition label data. As a result, the small-sample working condition label data can be augmented, the distribution space of the augmented small-sample data approaches the distribution breadth of the large-sample data, the data augmentation model expands the distribution breadth of the small sample reasonably, and the reliability of the augmented data at the edge of the small-sample distribution is ensured.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a flow chart illustrating a data augmentation method according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a data augmentation apparatus according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
FIG. 1 is a schematic flow chart of a data augmentation method according to an embodiment of the present invention. As shown in FIG. 1, the method includes:
step S1, acquiring working condition label data, and generating random data of the same size as the working condition label data as initialization data of the augmented data;
step S2, inputting the working condition label data and the initialization data of the augmented data into a trained working condition data augmentation model to obtain augmented working condition data;
wherein the trained working condition data augmentation model is obtained by training with real-label normal data and falsely labeled initialization sample augmentation data.
The working condition label data described in the embodiments of the invention refers to fault information of industrial mechanical equipment, where each piece of fault information carries a working condition label.
The initialization data of the augmented data described in the embodiments of the invention refers to randomly generated data whose size is consistent with that of the working condition label data; it serves as the starting point for the augmented data.
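As a minimal sketch of step S1 under assumed shapes (the array sizes and variable names below are illustrative, not taken from the patent), the initialization data is simply random data shaped like the working condition label data:

```python
import numpy as np

# Stand-in for the small-sample working condition label data:
# 32 one-dimensional signals, each 1024 points long (shapes assumed).
condition_label_data = np.random.randn(32, 1024)

# Step S1: random data of the same size as the working condition label data,
# used as the initialization data of the augmented data.
init_augmentation_data = np.random.randn(*condition_label_data.shape)
```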
The trained working condition data augmentation model described in the embodiments of the invention is applied in the field of modern industrial machinery. Because machine-learning models for industrial mechanical equipment have only a small number of working condition samples, the working condition label data can be input into the trained working condition data augmentation model to augment it, thereby obtaining augmented working condition data.
The augmented working condition data described in the embodiments of the invention is a larger, expanded set of working condition data; once obtained, it can effectively improve the accuracy of machine-learning models for industrial mechanical equipment.
The trained working condition data augmentation model described in the embodiments of the invention is obtained by training with real-label normal data and falsely labeled initialization sample augmentation data.
The real-label normal data described herein is data of large volume carrying real data labels; a real data label indicates that the data was not generated randomly but obtained from actual measurements. A false label, by contrast, marks data randomly generated by the generator of the data augmentation model.
First, a model capable of augmenting the large data sample is trained from the real-label normal data and any number of pieces of random data of the same size as the real-label normal data (the initialization samples). This model then serves as a base model and is further trained on the working condition data, yielding a trained working condition data augmentation model that can expand the working condition data.
According to the embodiments of the invention, a data augmentation model capable of generating data similar to the equipment's large-sample labeled data is constructed from the large-sample data of the equipment, and the model is then adjusted with the small-sample working condition label data, so that the small-sample working condition label data can be augmented, the model expands the distribution range of the small sample reasonably, and the reliability of the augmented data at the edge of the small-sample distribution is ensured.
On the basis of the above embodiment, before the step of acquiring the working condition label data, the method further includes:
splitting industrial raw data into real-label non-working-condition data and working condition label data;
and cleaning abnormal points from the real-label non-working-condition data to obtain the real-label normal data.
Specifically, the industrial raw data described in the embodiments of the present invention is raw data extracted directly from an industrial mechanical equipment system.
The real-label non-working-condition data described in the embodiments of the invention refers to the real large-sample data in an industrial mechanical equipment system, and the working condition label data refers to the real small-sample working condition data in that system.
In the embodiments of the invention, cleaning the abnormal points from the real-label non-working-condition data improves data reliability, and distinguishing the real-label non-working-condition data from the working condition label data allows the model to be trained effectively on the working condition data.
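A minimal sketch of this splitting and cleaning step is given below. The patent does not specify a cleaning rule, so the per-sample energy z-score filter used here, as well as the array shapes and the convention that label 0 means normal operation, are assumptions for illustration only.

```python
import numpy as np

def split_and_clean(raw_signals: np.ndarray, labels: np.ndarray, z_thresh: float = 3.0):
    """Split industrial raw data into working condition label data and
    real-label non-working-condition data, then clean abnormal points
    from the non-working-condition part to get real-label normal data."""
    condition_label_data = raw_signals[labels != 0]   # small-sample condition data
    non_condition_data = raw_signals[labels == 0]     # large-sample non-condition data

    # Assumed cleaning rule: drop samples whose energy deviates from the mean
    # energy by more than z_thresh standard deviations.
    energy = (non_condition_data ** 2).mean(axis=1)
    z = (energy - energy.mean()) / (energy.std() + 1e-12)
    real_label_normal_data = non_condition_data[np.abs(z) < z_thresh]
    return condition_label_data, real_label_normal_data
```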
On the basis of the above embodiment, the trained working condition data augmentation model includes a trained generator and a trained discriminator.
Before the step of inputting the working condition label data and the initialization data of the augmented data into the trained working condition data augmentation model, the method further comprises:
acquiring real-label normal data, and generating any number of pieces of random data of the same size as the real-label normal data as initialization sample augmentation data;
inputting the initialization sample augmentation data into the generator of the data augmentation model to obtain falsely labeled sample augmentation pseudo-data, taking that pseudo-data as new initialization sample augmentation data and feeding it back into the generator for further training, until the loss function of the generator converges stably, thereby obtaining a trained generator;
mixing the falsely labeled sample augmentation pseudo-data with the real-label normal data and inputting the mixture into the discriminator of the data augmentation model, and obtaining a trained discriminator when the loss function of the discriminator converges stably;
and obtaining the trained data augmentation model from the trained generator and the trained discriminator.
Specifically, in the embodiments of the invention, the initialization sample augmentation data consists of any number of randomly generated pieces of data whose format is consistent with the size of the real-label normal data.
The convolutional neural network encoder processes the initialization sample augmentation data and the real-label normal data separately to obtain an initial augmentation feature vector and a feature vector of the real-label normal data, and the mean square error between the two feature vectors gives a first loss function;
the convolutional neural network decoder decodes the initial augmentation feature vector to obtain sample augmentation pseudo-data, and the mean square error between the sample augmentation pseudo-data and the real-label normal data gives a second loss function;
a third loss function is obtained from the cosine distance between the fast Fourier transform of the sample augmentation pseudo-data and the fast Fourier transform of the real-label normal data;
the convolutional neural network encoder and decoder together form the generator of the data augmentation model, and the first, second and third loss functions are weighted and summed to obtain the loss function of the generator.
In the initial training phase, the initialization sample augmentation data is input into the generator of the data augmentation model to obtain falsely labeled sample augmentation pseudo-data. If the loss function of the generator has not yet converged stably, training continues cyclically: the falsely labeled pseudo-data output by the generator replaces the previous initialization sample augmentation data as the next input, and the loop repeats until the generator's loss function converges stably, at which point training stops and a trained generator is obtained.
In each training round, the output of the generator serves both as the generator's input in the following round and as part of the input of the discriminator, the other part being the real-label normal data. The discriminator is trained along with the generator's cyclic training, the two steps alternating: each time the generator is trained once, the discriminator is trained once, until the loss function of the discriminator converges stably. The result is a data augmentation model capable of augmenting the large-sample data.
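A minimal sketch of this alternating training scheme is given below, continuing the PyTorch sketch above. The optimizer, learning rate, and fixed step count standing in for "stable convergence" are assumptions; `discriminator` can be any 1-D convolutional network ending in a single logit, such as the one sketched after the cross-entropy passage further below.

```python
import torch
import torch.nn.functional as F

def train_augmentation_model(encoder, decoder, discriminator, real_normal,
                             steps: int = 1000, lr: float = 1e-3):
    """Alternate one generator step and one discriminator step, feeding the
    generator's output back as its next input (see generator_loss above)."""
    g_opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=lr)
    d_opt = torch.optim.Adam(discriminator.parameters(), lr=lr)

    # Initialization sample augmentation data: random, same shape as real_normal.
    aug_input = torch.randn_like(real_normal)

    for _ in range(steps):                     # the patent trains "until stable convergence"
        # Generator step: minimize the weighted three-part loss.
        g_loss, fake = generator_loss(encoder, decoder, real_normal, aug_input)
        g_opt.zero_grad(); g_loss.backward(); g_opt.step()

        # Discriminator step: mix pseudo-data (label 0) with real normal data (label 1).
        x = torch.cat([fake.detach(), real_normal], dim=0)
        y = torch.cat([torch.zeros(len(fake)), torch.ones(len(real_normal))])
        d_loss = F.binary_cross_entropy_with_logits(discriminator(x).squeeze(-1), y)
        d_opt.zero_grad(); d_loss.backward(); d_opt.step()

        # The generator's output becomes its input for the next round.
        aug_input = fake.detach()

    return encoder, decoder, discriminator
```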
In the embodiments of the invention, the data augmentation model trained on the large amount of sample data conveniently guides the training of the data augmentation model for the small-sample data; using the base model to train the working condition data augmentation model expands the distribution space of the working condition data more reasonably, bringing it closer to the distribution breadth of the large-sample data.
On the basis of the above embodiment, after the step of obtaining the trained data augmentation model from the trained generator and the trained discriminator, the method further includes:
acquiring sample working condition label data, and generating any number of pieces of initialization sample working condition augmentation data of the same size as the sample working condition label data;
and, taking the trained data augmentation model as a base model, continuing to train the base model with the initialization sample working condition augmentation data and the sample working condition label data, and obtaining the trained working condition data augmentation model when the loss function of the base model converges stably.
Specifically, the data augmentation model serves as the base model; the real-label normal data is replaced by the working condition label data, the base model is trained, and the trained model yields augmented working condition label data once its loss function has converged stably.
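Continuing the sketches above, the fine-tuning stage can be expressed by calling the same training routine with the small-sample working condition label data in place of the real-label normal data; the tensor shapes below and the way the final augmented data is produced (decoding encoded random initialization data) are assumptions for illustration.

```python
import torch

# Stand-ins for the small-sample working condition label data and its
# random initialization data (shapes assumed: 16 signals of 1024 points).
condition_label_data = torch.randn(16, 1, 1024)
init_condition_aug = torch.randn_like(condition_label_data)

# Fine-tune the base model (encoder/decoder/discriminator trained on the
# real-label normal data above) on the working condition label data.
encoder, decoder, discriminator = train_augmentation_model(
    encoder, decoder, discriminator, real_normal=condition_label_data, steps=200)

# Augmented working condition data generated from the fine-tuned model.
with torch.no_grad():
    augmented_condition_data = decoder(encoder(init_condition_aug))
```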
According to the embodiments of the invention, a data augmentation model capable of generating data similar to the equipment's large-sample labeled data is constructed from the large-sample data of the equipment, and the model is then adjusted with the small-sample working condition label data, so that the small-sample working condition label data can be augmented, the distribution space of the augmented small-sample data approaches the distribution breadth of the large-sample data, the model expands the distribution breadth of the small sample reasonably, and the reliability of the augmented data at the edge of the small-sample distribution is ensured.
On the basis of the above embodiment, the loss function of the generator is constructed as follows:
the real-label normal data and the initialization sample augmentation data are each processed by the convolutional neural network encoder to extract a feature vector of the real-label normal data and an initial augmentation feature vector, and a first loss function is obtained from the mean square error between these two feature vectors;
the initial augmentation feature vector is decoded by the convolutional neural network decoder to obtain sample augmentation pseudo-data, and a second loss function is obtained from the mean square error between the sample augmentation pseudo-data and the real-label normal data;
a third loss function is obtained from the cosine distance between the fast Fourier transform of the sample augmentation pseudo-data and the fast Fourier transform of the real-label normal data;
and the first, second and third loss functions are weighted and summed to obtain the loss function of the generator.
A convolutional neural network is used as the discriminator, with real and false as the labels and cross entropy as the loss function of the discriminator.
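A minimal sketch of such a discriminator in PyTorch follows; the layer sizes are illustrative, and binary cross entropy on logits stands in for the cross-entropy loss named above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Discriminator(nn.Module):
    """1-D CNN scoring each signal as real (label 1) or generated/false (label 0)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=9, stride=2, padding=4), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=9, stride=2, padding=4), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(32, 1),
        )

    def forward(self, x):          # x: (batch, 1, length)
        return self.net(x)         # (batch, 1) logit

# Cross entropy on a mixed batch of real and generated samples (stand-in data).
disc = Discriminator()
real = torch.randn(8, 1, 1024)     # real-label normal data
fake = torch.randn(8, 1, 1024)     # falsely labeled sample augmentation pseudo-data
x = torch.cat([fake, real])
y = torch.cat([torch.zeros(8), torch.ones(8)])
loss = F.binary_cross_entropy_with_logits(disc(x).squeeze(-1), y)
```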
After the augmented working condition data is obtained, the method further comprises:
inputting the augmented working condition data into the automatic learning algorithm for the industrial mechanical equipment, which effectively improves the accuracy of that algorithm and, combined with it, yields an improved automatic learning algorithm.
The embodiments of the invention do not start directly from the small-sample working condition data; they start from the large sample of the same equipment, whose distribution breadth is greater, and then adjust the model with the small sample, so that the parameters of the augmentation model expand the distribution breadth of the small sample reasonably and the classification accuracy of data points at the edge of the small-sample distribution is ensured.
FIG. 2 is a schematic structural diagram of a data augmentation apparatus according to an embodiment of the present invention. As shown in FIG. 2, the apparatus includes an acquisition module 210 and an augmentation module 220. The acquisition module 210 is configured to acquire working condition label data and to generate any number of pieces of random data of the same size as the working condition label data as initialization data of the augmented data. The augmentation module 220 is configured to input the working condition label data and the initialization data of the augmented data into a trained working condition data augmentation model to obtain augmented working condition data; the trained working condition data augmentation model is obtained by training with real-label normal data and falsely labeled initialization sample augmentation data.
The apparatus provided in the embodiment of the present invention is used to execute the above method embodiments; for details of the process, reference is made to the above embodiments, which are not repeated here.
According to the embodiments of the invention, a data augmentation model capable of generating data similar to the equipment's large-sample labeled data is constructed from the large-sample data of the equipment, and the model is then adjusted with the small-sample working condition label data, so that the small-sample working condition label data can be augmented, the distribution space of the augmented small-sample data approaches the distribution breadth of the large-sample data, the model expands the distribution breadth of the small sample reasonably, and the reliability of the augmented data at the edge of the small-sample distribution is ensured.
FIG. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. As shown in FIG. 3, the electronic device may include: a processor 310, a communication interface 320, a memory 330 and a communication bus 340, wherein the processor 310, the communication interface 320 and the memory 330 communicate with one another via the communication bus 340. The processor 310 may call logic instructions in the memory 330 to perform the following method: acquiring working condition label data, and generating random data of the same size as the working condition label data as initialization data of the augmented data; inputting the working condition label data and the initialization data of the augmented data into a trained working condition data augmentation model to obtain augmented working condition data; wherein the trained working condition data augmentation model is obtained by training with real-label normal data and falsely labeled initialization sample augmentation data.
In addition, the logic instructions in the memory 330 may be implemented in the form of software functional units and stored in a computer readable storage medium when the software functional units are sold or used as independent products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
An embodiment of the present invention discloses a computer program product, which includes a computer program stored on a non-transitory computer-readable storage medium. The computer program includes program instructions that, when executed by a computer, enable the computer to perform the methods provided by the above method embodiments, for example: acquiring working condition label data, and generating random data of the same size as the working condition label data as initialization data of the augmented data; inputting the working condition label data and the initialization data of the augmented data into a trained working condition data augmentation model to obtain augmented working condition data; wherein the trained working condition data augmentation model is obtained by training with real-label normal data and falsely labeled initialization sample augmentation data.
An embodiment of the present invention provides a non-transitory computer-readable storage medium storing server instructions that cause a computer to execute the method provided by the above embodiments, for example: acquiring working condition label data, and generating random data of the same size as the working condition label data as initialization data of the augmented data; inputting the working condition label data and the initialization data of the augmented data into a trained working condition data augmentation model to obtain augmented working condition data; wherein the trained working condition data augmentation model is obtained by training with real-label normal data and falsely labeled initialization sample augmentation data.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A method of data augmentation, comprising:
acquiring working condition label data, and generating random data of the same size as the working condition label data as initialization data of the augmented data;
inputting the working condition label data and the initialization data of the augmented data into a trained working condition data augmentation model to obtain augmented working condition data;
wherein the trained working condition data augmentation model is obtained by training with real-label normal data and falsely labeled initialization sample augmentation data.
2. The data augmentation method of claim 1, wherein, prior to the step of acquiring the working condition label data, the method further comprises:
splitting industrial raw data into real-label non-working-condition data and working condition label data;
and cleaning abnormal points from the real-label non-working-condition data to obtain the real-label normal data.
3. The data augmentation method of claim 2, wherein the trained condition data augmentation model comprises a trained generator and a trained discriminator.
4. The data augmentation method of claim 1, wherein, prior to the step of inputting the working condition label data and the initialization data of the augmented data into the trained working condition data augmentation model, the method further comprises:
acquiring real-label normal data, and generating any number of pieces of random data of the same size as the real-label normal data as initialization sample augmentation data;
inputting the initialization sample augmentation data into the generator of the data augmentation model to obtain falsely labeled sample augmentation pseudo-data, taking that pseudo-data as new initialization sample augmentation data and feeding it back into the generator for further training, until the loss function of the generator converges stably, thereby obtaining a trained generator;
mixing the falsely labeled sample augmentation pseudo-data with the real-label normal data and inputting the mixture into the discriminator of the data augmentation model, and obtaining a trained discriminator when the loss function of the discriminator converges stably;
and obtaining the trained data augmentation model from the trained generator and the trained discriminator.
5. The data augmentation method of claim 4, wherein after the step of deriving a trained data augmentation model from the trained generator and the trained discriminator, the method further comprises:
acquiring sample working condition label data, and generating any number of pieces of initialization sample working condition augmentation data of the same size as the sample working condition label data;
and, taking the trained data augmentation model as a base model, continuing to train the base model with the initialization sample working condition augmentation data and the sample working condition label data, and obtaining the trained working condition data augmentation model when the loss function of the base model converges stably.
6. The data augmentation method of claim 4, wherein the generator in the data augmentation model is formed by a combination of a convolutional neural network encoder and a convolutional neural network decoder.
7. The data augmentation method of claim 6, wherein the loss function of the generator is constructed as follows:
the real-label normal data and the initialization sample augmentation data are each processed by the convolutional neural network encoder to extract a feature vector of the real-label normal data and an initial augmentation feature vector, and a first loss function is obtained from the mean square error between these two feature vectors;
the initial augmentation feature vector is decoded by the convolutional neural network decoder to obtain sample augmentation pseudo-data, and a second loss function is obtained from the mean square error between the sample augmentation pseudo-data and the real-label normal data;
a third loss function is obtained from the cosine distance between the fast Fourier transform of the sample augmentation pseudo-data and the fast Fourier transform of the real-label normal data;
and the first, second and third loss functions are weighted and summed to obtain the loss function of the generator.
8. The data augmentation method of claim 1, wherein after the step of obtaining augmented operating condition data, the method further comprises:
inputting the augmented working condition data into an automatic learning improvement algorithm of the industrial mechanical equipment.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program implements the steps of the data augmentation method of any one of claims 1 to 8.
10. A non-transitory computer readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the data augmentation method according to any one of claims 1 to 8.
CN201911157799.4A 2019-11-22 2019-11-22 Data augmentation method and electronic equipment Active CN110929786B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911157799.4A CN110929786B (en) 2019-11-22 2019-11-22 Data augmentation method and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911157799.4A CN110929786B (en) 2019-11-22 2019-11-22 Data augmentation method and electronic equipment

Publications (2)

Publication Number Publication Date
CN110929786A true CN110929786A (en) 2020-03-27
CN110929786B CN110929786B (en) 2023-08-01

Family

ID=69850823

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911157799.4A Active CN110929786B (en) 2019-11-22 2019-11-22 Data augmentation method and electronic equipment

Country Status (1)

Country Link
CN (1) CN110929786B (en)


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190197358A1 (en) * 2017-12-21 2019-06-27 International Business Machines Corporation Generative Adversarial Network Medical Image Generation for Training of a Classifier
CN108680807A (en) * 2018-05-17 2018-10-19 国网山东省电力公司青岛供电公司 The Diagnosis Method of Transformer Faults and system of network are fought based on condition production
WO2019221654A1 (en) * 2018-05-17 2019-11-21 Tobii Ab Autoencoding generative adversarial network for augmenting training data usable to train predictive models
CN109543674A (en) * 2018-10-19 2019-03-29 天津大学 A kind of image copy detection method based on generation confrontation network
CN109635774A (en) * 2018-12-21 2019-04-16 中山大学 A kind of human face synthesizing method based on generation confrontation network
CN109815928A (en) * 2019-01-31 2019-05-28 中国电子进出口有限公司 A kind of face image synthesis method and apparatus based on confrontation study
CN110119787A (en) * 2019-05-23 2019-08-13 湃方科技(北京)有限责任公司 A kind of rotary-type operating condition of mechanical equipment detection method and equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
MAAYAN FRID-ADAR et al.: "Synthetic Data Augmentation Using GAN for Improved Liver Lesion Classification" *
尤鸣宇 et al.: "基于样本扩充的小样本车牌识别" (Small-sample license plate recognition based on sample expansion) *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112464939A (en) * 2021-01-28 2021-03-09 知行汽车科技(苏州)有限公司 Data augmentation method, device and storage medium in target detection

Also Published As

Publication number Publication date
CN110929786B (en) 2023-08-01

Similar Documents

Publication Publication Date Title
CN111726248A (en) Alarm root cause positioning method and device
CN112397057B (en) Voice processing method, device, equipment and medium based on generation countermeasure network
CN112612664B (en) Electronic equipment testing method and device, electronic equipment and storage medium
CN110261080B (en) Heterogeneous rotary mechanical anomaly detection method and system based on multi-mode data
CN112988440A (en) System fault prediction method and device, electronic equipment and storage medium
CN110929786A (en) Data augmentation method and electronic equipment
CN116226676B (en) Machine tool fault prediction model generation method suitable for extreme environment and related equipment
CN112446389A (en) Fault judgment method and device
CN112687274A (en) Voice information processing method, device, equipment and medium
CN114118295A (en) Anomaly detection model training method, anomaly detection device and medium
CN111026087B (en) Weight-containing nonlinear industrial system fault detection method and device based on data
CN110442439B (en) Task process processing method and device and computer equipment
CN112836807A (en) Data processing method and device based on neural network
CN110955603A (en) Automatic testing method and device, electronic equipment and computer readable storage medium
CN114020640A (en) Automatic testing method and device
CN114760109A (en) Numerical behavior security baseline generation method and device for security analysis
CN110175456A (en) Software action sampling method, relevant device and software systems
CN117827620B (en) Abnormality diagnosis method, training device, training equipment, and recording medium
CN106547679B (en) Script management method and script management platform
US20100262416A1 (en) Computer and method for simulating an attention command test of a mobile phone
CN113992436B (en) Local information generating method, device, equipment and storage medium
CN115391213A (en) Test script generation method and device
CN113535655A (en) Log analysis method and device
CN115309645A (en) Defect positioning method, device, equipment and storage medium for development and test
CN116263816A (en) Method, server, storage medium and system for identifying user by mobile phone terminal in high privacy protection environment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information
CB03 Change of inventor or designer information

Inventor after: Lin Siyu

Inventor after: Yang Chenwang

Inventor after: Ma Jun

Inventor after: Liu Tao

Inventor after: Li Sujie

Inventor after: Wang Wei

Inventor after: Shi Chao

Inventor after: Zhou Jingyuan

Inventor before: Lin Siyu

Inventor before: Yang Chenwang

Inventor before: Ma Jun

Inventor before: Liu Yongpan

Inventor before: Liu Tao

Inventor before: Li Sujie

Inventor before: Wang Wei

Inventor before: Shi Chao

Inventor before: Zhou Jingyuan

GR01 Patent grant
GR01 Patent grant