CN115205616A - Cloth defect data enhancement method based on a generative adversarial network - Google Patents

Cloth defect data enhancement method based on a generative adversarial network

Info

Publication number
CN115205616A
CN115205616A (application CN202210595194.9A)
Authority
CN
China
Prior art keywords
flaw
defect
generation
cloth
sampling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210595194.9A
Other languages
Chinese (zh)
Inventor
朱威 (Zhu Wei)
徐希舟 (Xu Xizhou)
张佳伟 (Zhang Jiawei)
郑雅羽 (Zheng Yayu)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT filed Critical Zhejiang University of Technology ZJUT
Priority to CN202210595194.9A
Publication of CN115205616A
Legal status: Pending


Classifications

    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06T7/0004 Industrial image inspection
    • G06V10/52 Scale-space analysis, e.g. wavelet analysis
    • G06V10/82 Image or video recognition or understanding using neural networks
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/20221 Image fusion; Image merging
    • G06T2207/30124 Fabrics; Textile; Paper


Abstract

The invention relates to a cloth defect data enhancement method based on a generative adversarial network. The method comprises: building and training a generative adversarial network; performing multi-scale feature extraction and style encoding on an input defective cloth image to be data-enhanced using multilayer convolution and instance normalization; building an up-sampling convolution module using spatially-adaptive normalization; decoupling the network task with 2 parallel convolution branches, corresponding respectively to generation of the defect foreground and of the defect transparency; and combining transparency control with defect morphology and spatial constraints to finally obtain the data-enhanced defective cloth image. The method performs data enhancement on an existing data set, improving the performance of a cloth defect detector by balancing the number of samples per category and expanding the defect samples; it can also superimpose defects on new flawless cloth to quickly generate a sufficiently large defect data set under a new background.

Description

Cloth defect data enhancement method based on a generative adversarial network
Technical Field
The invention relates to the technical field of general image data processing or generation, belongs to the field of machine vision, and particularly relates to a cloth defect data enhancement method based on a generative adversarial network.
Background
Cloth defect detection is an important link in the cloth production process; early defect detection was mainly done by manual screening. In recent years, with the rapid development of machine learning and its large-scale application in computer vision, cloth defect detection has gradually moved toward automation and intelligence.
Machine-learning-based cloth defect detection typically uses object detection networks such as YOLO, SSD and Cascade R-CNN, whose deployment requires a sufficiently large data set as support. However, in textile production, because cloth types are so varied, existing data sets cannot cover all of them well; in addition, collecting cloth defect samples is difficult and inefficient. These factors limit the improvement of cloth defect detection performance and easily cause network overfitting. Effective data enhancement of the collected cloth data set, so as to improve the generalization and robustness of the neural network, is therefore work of practical application value.
Conventional data enhancement methods can be classified into spatial geometric transformations, color transformations and multi-sample synthesis. Spatial geometric transformations include flipping, rotation, cropping, deformation, scaling and the like; color transformations include superimposing noise, blurring, erasing, color change, filling and the like. In the multi-sample synthesis class, Chawla et al. (see Chawla N. V., Bowyer K. W., Hall L. O., et al. SMOTE: Synthetic Minority Over-sampling Technique [J]. Journal of Artificial Intelligence Research, 2002, 16: 321-357.) use feature-space interpolation to synthesize new samples for small-sample classes so as to balance the sample counts when the classes are imbalanced. Inoue proposed a simple multi-sample synthesis method, SamplePairing (see Inoue H. Data Augmentation by Pairing Samples for Images Classification [J]. arXiv preprint arXiv:1801.02929, 2018.): two pictures in the training set first undergo basic data enhancement and are then superimposed, as a per-pixel average, into a new sample whose label is either of the original sample categories. A data set processed in this way grows from N to N×N samples, and the performance gain is considerable. Zhang et al. proposed the Mixup method (see Zhang H., Cissé M., Dauphin Y. N., et al. mixup: Beyond Empirical Risk Minimization [J]. 2017.), using linear interpolation to obtain new sample data. Experiments show that this method reduces the generalization error of the network on the data set and improves the robustness and stability of the network model.
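The SamplePairing and Mixup schemes described above are simple enough to sketch directly; the snippet below is an illustrative NumPy sketch (not part of the patent), where the Mixup coefficient `lam` is drawn from a Beta distribution as in the original paper.

```python
import numpy as np

def sample_pairing(img_a, img_b):
    # SamplePairing: the new sample is the per-pixel average of two images;
    # the label stays that of one of the originals (img_a here).
    return (img_a.astype(np.float64) + img_b.astype(np.float64)) / 2.0

def mixup(img_a, label_a, img_b, label_b, alpha=0.2, rng=None):
    # Mixup: linear interpolation of both the images and their (one-hot)
    # labels, with the coefficient drawn from Beta(alpha, alpha).
    rng = rng if rng is not None else np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    img = lam * img_a + (1.0 - lam) * img_b
    label = lam * label_a + (1.0 - lam) * label_b
    return img, label
```

With one-hot labels, the mixed label remains a valid probability vector (it sums to 1), which is what lets Mixup train with an ordinary cross-entropy loss.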
With the development of deep learning, Goodfellow et al. proposed the generative adversarial network architecture GAN in 2014 (see Goodfellow I. J., Pouget-Abadie J., Mirza M., et al. Generative Adversarial Networks [J]. Advances in Neural Information Processing Systems, 2014, 3.). Numerous researchers have continuously improved the GAN architecture, and many methods applying GANs to data enhancement have been proposed in succession. Tanaka et al. trained a GAN to synthesize medical data for cancer detection, achieving better performance than the original small data set (see Tanaka F., Aranha C. Data Augmentation Using GANs [J]. arXiv preprint arXiv:1904.09135, 2019.). Liu et al., building on the pix2pixHD model (see Wang T. C., Liu M. Y., Zhu J. Y., et al. High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs [J]. IEEE Conference on Computer Vision and Pattern Recognition, 2018: 8798-8807.), designed an end-to-end data enhancement scheme for the Cityscapes data set (see Liu S., Zhang J., Chen Y., et al. Pixel Level Data Augmentation for Semantic Image Segmentation Using Generative Adversarial Networks [J]. IEEE International Conference on Acoustics, Speech and Signal Processing, 2019.). Compared with traditional data enhancement schemes, this scheme further improved PSPNet by 2.1% mIoU on the 19-class Cityscapes semantic segmentation task. Zhang et al. proposed the defect synthesis framework Defect-GAN (see Zhang G., Cui K., Hung T. Y., et al. Defect-GAN: High-Fidelity Defect Synthesis for Automated Defect Inspection [J]. IEEE Winter Conference on Applications of Computer Vision, 2021: 2523-2533.), which obtained the best visual effect on the CODEBRIM concrete defect data set; with ResNet34 and DenseNet152 used as concrete defect detectors, the accuracy of both was significantly improved after data enhancement with Defect-GAN.
Existing data enhancement algorithms, whether traditional or deep-learning-based, are not suitable for direct use in enhancing cloth defect data, mainly for the following reasons. (1) Traditional data enhancement methods cannot directly and quickly superimpose defects from an existing defect sample library onto new cloth. (2) Cloth defect samples are few, making this a small-sample learning problem, whereas training a GAN requires a large sample library. (3) Data enhancement usually superimposes a defect on a certain region of the cloth, so a smooth transition is required between the enhanced region and the whole cloth image; if the superimposed defect is not natural enough, detector performance degrades.
Disclosure of Invention
The invention addresses the problems in the prior art and provides an optimized cloth defect data enhancement method based on a generative adversarial network.
The technical scheme is a cloth defect data enhancement method based on a generative adversarial network, comprising: building and training the generative adversarial network; performing multi-scale feature extraction and style encoding on an input defective cloth image to be enhanced using multilayer convolution and instance normalization; building an up-sampling convolution module using spatially-adaptive normalization; decoupling the network task with 2 parallel convolution branches for defect foreground generation and defect transparency generation; and combining transparency control with defect morphology and spatial constraints to finally obtain the data-enhanced defective cloth image.
Preferably, the generative adversarial network built comprises:
a feature encoding module, for extracting feature maps at different resolutions during down-sampling and encoding the image features to obtain a mean μ and a variance σ² representing the image style;
a defect generation module, for up-sampling the style noise sampled from the image-style mean μ and variance σ², continuously introducing the feature maps extracted by the feature encoding module during up-sampling to obtain a synthesized defect sample;
and a discriminator module, for judging whether an input sample is a real defect sample or a synthesized one, and guiding the defect generation module during iterative model training so that synthesized defect samples approach real ones in image authenticity and sharpness.
Preferably, the feature encoding module comprises 5 convolution blocks and 2 fully connected layers;
each convolution block comprises a standard 3×3 convolution layer, an instance normalization layer and a leaky linear activation function layer;
the 5 convolution blocks output feature maps F0, F1, F2, F3 and F4 at five different resolutions respectively; F4 is also fed into two parallel 256-dimensional fully connected layers to compute the image-style mean vector μ and variance vector σ² respectively, and a single Gaussian sampling of μ and σ² yields the style noise z.
In the invention, the feature encoding module takes a flawless cloth sample as input and consists of five convolution blocks and two fully connected layers. The convolution blocks extract feature information and perform down-sampling; each consists of three parts: a standard 3×3 convolution (Conv), instance normalization (InstanceNorm) and a leaky linear activation function (LeakyReLU). The first Conv layer has stride 1 and zero padding 1; the other four Conv layers have stride 2 and zero padding 2, realizing the down-sampling function and yielding feature maps at different resolutions. Instance normalization operates on a single image instance, guaranteeing independence between image instances and preserving each instance's style, which makes it more suitable than batch normalization for image synthesis tasks. LeakyReLU, like ReLU, introduces sparsity into the network and improves computational performance; in addition, because LeakyReLU still has a slight gradient for negative activations, it avoids the dying-neuron problem and accelerates network convergence.
In the invention, after the feature encoding module, feature maps F0, F1, F2, F3 and F4 at five different resolutions are obtained from the outputs of the five convolution blocks. F0 has the same resolution as the input image and the richest texture feature information; F1, F2, F3 and F4 have 1/2, 1/4, 1/8 and 1/16 of the input resolution respectively. F4 contains high-dimensional semantic information and is fed into two parallel 256-dimensional fully connected layers to compute the image-style mean μ and variance σ² respectively; sampling from μ and σ² yields the style noise z.
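The style-encoding step just described (two parallel fully connected heads producing μ and σ², then a single Gaussian sample) can be sketched as follows. This is an illustrative NumPy stand-in, not the patent's implementation: the weight matrices stand in for the trained fully connected layers, and predicting the log-variance rather than σ² directly is an assumption made here so the variance stays positive.

```python
import numpy as np

def encode_style(f4_flat, w_mu, w_var, rng=None):
    """Sketch: f4_flat is the flattened F4 feature vector; w_mu and w_var
    stand in for the two parallel 256-d fully connected heads. One head
    gives the style mean mu, the other the log-variance; a single
    reparameterized Gaussian sample then gives the style noise z."""
    rng = rng if rng is not None else np.random.default_rng()
    mu = f4_flat @ w_mu                  # style mean, 256-d
    log_var = f4_flat @ w_var            # log of sigma^2 (assumption)
    sigma = np.exp(0.5 * log_var)
    z = mu + sigma * rng.standard_normal(mu.shape)  # single Gaussian sample
    return mu, sigma ** 2, z
```

The reparameterized form z = μ + σ·ε keeps the sampling step differentiable, which is what allows the encoder to be trained end-to-end.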
Preferably, the defect generation module comprises a defect foreground generation branch and a defect transparency generation branch arranged in parallel;
the defect foreground generation branch comprises a fully connected layer and 6 up-sampling modules, with a residual connection between the input and output of each up-sampling module; the input style noise z passes through the fully connected layer to give the projected noise z_project, then through the 6 up-sampling modules to give a high-resolution feature map, and a Tanh activation function outputs the foreground map O_forge;
the defect transparency generation branch comprises a fully connected layer and 6 up-sampling modules, with a residual connection between the input and output of each up-sampling module; the input standard-normal noise z_noise passes through the 6 up-sampling modules to give a high-resolution feature map, and a Sigmoid activation function outputs the transparency map O_alpha;
so as to obtain S_defect = S_normal·(1 − O_alpha) + O_forge·O_alpha, where S_defect is the synthesized defect map, S_normal is the flawless sample input to the feature encoding module, O_forge is the output of the defect foreground generation branch, and O_alpha is the output of the defect transparency generation branch.
Preferably, the up-sampling module comprises two convolution blocks connected in series, each convolution block comprising a SPADE layer, a nonlinear activation function ReLU and a standard 3 × 3 convolution arranged in sequence.
Preferably, the inputs of the defect generation module comprise the 256-dimensional style noise z obtained by the feature encoding module, the feature maps F0, F1, F2, F3 and F4 output by the intermediate layers of the feature encoding module, and a defect mask map M randomly sampled from a preset defect sample library;
the projected noise z_project, obtained by feeding the style noise z into the corresponding fully connected layer, and the standard-normal noise z_noise are each fed into the 6 up-sampling modules of the corresponding generation branch for decoding, while the defect mask map M is fed into the up-sampling modules to guide decoding; the feature maps F4, F3, F2, F1 and F0 are concatenated in turn at the outputs of the first 5 up-sampling modules as inputs to the next up-sampling module; for the defect foreground generation branch, the last up-sampling module compresses the feature map to 3 dimensions, output through the corresponding activation function; for the defect transparency generation branch, the last up-sampling module compresses the feature map to 1 dimension, output through the corresponding activation function.
In the invention, the defect foreground generation branch takes random style noise z as input and continuously raises the image resolution through a series of convolutions and bilinear-interpolation up-sampling. During defect foreground generation, a defect semantic mask map must be introduced as a constraint so that the shape and position of the generated defect are controllable. Among mainstream methods for translating a semantic map into an actual image, the pix2pix family of models takes the mask directly as input to generate an image; with the same mask input, the requirement to approach the target image under any input noise blurs the generated image, and because batch normalization normalizes the data distribution, the styles of the generated images converge. To address these problems, NVIDIA researchers proposed spatially-adaptive normalization (SPADE) (see Park T., Liu M. Y., Wang T. C., et al. Semantic Image Synthesis With Spatially-Adaptive Normalization [C]. IEEE Conference on Computer Vision and Pattern Recognition, 2019: 2332-2341.), which does not use the mask map directly as input but introduces the semantic mask map on top of batch normalization to guide the computation of the normalization coefficients, obtaining an excellent visual effect; with the same semantic mask input, synthetic images of different styles can be obtained from different input noise.
In the invention, the defect foreground generation branch uses SPADE normalization to introduce the semantic map, with the SPADE residual block of GauGAN as the basic up-sampling module. The module consists of two convolution blocks in series, each composed in turn of a SPADE layer, a nonlinear activation function ReLU and a standard 3×3 convolution with stride 1 and zero padding 1, with a residual connection between input and output; the module showed a good visual effect in actual experiments. Meanwhile, the image features F0~F4 extracted by the feature encoding module are taken as a common input, so that the texture, morphology and other properties of the defect match the flawless cloth original during up-sampling.
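A minimal sketch of the spatially-adaptive normalization idea used in these up-sampling blocks: activations are normalized, then modulated per pixel by γ and β derived from the defect mask. The per-channel scalar weights standing in for the learned mask convolutions are an assumption of this sketch, not the patent's layers.

```python
import numpy as np

def spade_normalize(x, mask, gamma_w, beta_w, eps=1e-5):
    """x: (C, H, W) activations; mask: (H, W) binary defect mask.
    Normalize x per channel, then modulate with spatially varying
    gamma(mask) and beta(mask). Here gamma_w/beta_w are per-channel
    scalars standing in for the learned convolutions over the mask."""
    mean = x.mean(axis=(1, 2), keepdims=True)
    std = x.std(axis=(1, 2), keepdims=True)
    x_norm = (x - mean) / (std + eps)
    gamma = gamma_w[:, None, None] * mask[None, :, :]   # (C, H, W)
    beta = beta_w[:, None, None] * mask[None, :, :]
    return x_norm * (1.0 + gamma) + beta
```

Because γ and β vary with pixel position, the semantic mask steers the synthesis spatially instead of being washed out by a global normalization, which is the point of SPADE over plain batch normalization.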
In the invention, the defect transparency generation branch has the same structure as the defect foreground branch, and its up-sampling intermediate layers likewise receive the image features F0~F4 extracted by the feature encoding module; the difference is that its input is standard normal noise with mean 0 and variance 1, which increases the diversity of the model-generated images to some extent.
The final defect image synthesis formula is as follows:
S_defect = S_normal·(1 − O_alpha) + O_forge·O_alpha   (1)
where S_defect is the synthesized defect map, S_normal is the flawless sample input to the feature encoding module, O_forge is the output of the defect foreground generation branch, and O_alpha is the output of the defect transparency generation branch.
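Equation (1) is ordinary alpha compositing, so it transcribes directly; the sketch below assumes images normalized to [0, 1].

```python
import numpy as np

def composite_defect(s_normal, o_forge, o_alpha):
    """Equation (1): blend the generated defect foreground O_forge onto the
    flawless cloth S_normal using the generated transparency map O_alpha.
    s_normal, o_forge: (H, W, 3) in [0, 1]; o_alpha: (H, W, 1) in [0, 1]."""
    return s_normal * (1.0 - o_alpha) + o_forge * o_alpha
```

Where O_alpha is 0 the flawless background shows through untouched; where it approaches 1 the generated defect replaces it, which is what makes the transition between the defect and the surrounding cloth smooth.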
Preferably, the discriminator module comprises three parallel branches B0, B1 and B2; each branch comprises an input layer and several CIL modules, each CIL module comprising, in sequence, a standard 3×3 convolution layer, an instance normalization layer and a leaky nonlinear activation function layer;
branch B0 has 3 CIL modules and receives and discriminates images at the original size W×H;
branch B1 has 2 CIL modules and receives and discriminates images of size (W/2)×(H/2);
branch B2 has 1 CIL module and receives and discriminates images of size (W/4)×(H/4).
In the invention, PatchGAN provides a patch-based discriminator: after passing through the convolution layers, the image is not fed directly to an activation function or a fully connected layer; instead the convolution output is mapped to an N×N matrix in which each value is an evaluation score for a patch of the original image. A discriminator designed this way attends to more regional detail and improves the quality of GAN-generated images. pix2pixHD adds a multi-scale design on top of the PatchGAN discriminator: the images are fed at different resolutions into patch discriminators of identical structure, and an evaluation score combining the multi-scale images is finally output, so that the discriminator not only attends to local details but also enlarges its receptive field, gaining the ability to evaluate the whole image and further improving the quality and sharpness of GAN images. The method uses the multi-scale patch discriminator of pix2pixHD to score the original image and its 1/2 and 1/4 length-and-width versions.
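Building the three discriminator inputs amounts to constructing an image pyramid; the sketch below uses 2×2 average pooling for the downscaling, which is an assumption (any standard downsampling would do).

```python
import numpy as np

def avg_pool2(img):
    # 2x2 average pooling; img: (H, W, C) with even H and W.
    h, w, c = img.shape
    return img.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def discriminator_pyramid(img):
    """Inputs for the three branches B0, B1, B2: the original image and
    its 1/2- and 1/4-resolution versions."""
    half = avg_pool2(img)
    quarter = avg_pool2(half)
    return img, half, quarter
```

Each branch then sees the same scene at a different scale: B0 judges fine texture while B2, with the coarsest input, effectively judges global layout.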
Preferably, training the generative adversarial network comprises the following steps:
step 1.1: designing the loss functions;
step 1.2: making and enhancing the data sets to obtain the defect data set Ω1 and the flawless cloth data set Ω2;
step 1.3: optimizing the network model parameters with an Adam optimizer, using the loss function L_gan to train the defect foreground generation branch and the loss function L_alpha to train the defect transparency generation branch, the latter being optimized during back propagation.
Preferably,
L_gan = λ_disc·E_{z∼N(0,1)}[D(G(z))] − E_{x∼p_r}[D(x)] + λ_grad·E_{x̂∼p_x̂}[(‖∇_x̂ D(x̂)‖₂ − 1)²]
L_alpha = E_{m∼p_ralpha, a∼p_galpha}[‖(1 − m)⊙a‖₁]
and the final loss function is L_all = L_gan + λ_alpha × L_alpha,
where x represents a true defect sample drawn from the true defect data distribution p_r, z represents random noise sampled from the standard normal distribution N(0,1), G(z) represents a synthesized defect sample drawn from the data distribution p_g of the defect generator, x̂ represents a sample interpolated between real and synthesized samples for the gradient penalty, D represents the discriminator, G represents the generator, m represents a true defect mask map drawn from the real defect mask data distribution p_ralpha, and a represents a synthesized defect transparency map drawn from the data distribution p_galpha of the defect transparency generator. λ_disc is a hyper-parameter controlling the realism of the synthetic defects, λ_grad is a hyper-parameter controlling the strength of the gradient penalty, and λ_alpha is a hyper-parameter controlling the similarity between the synthesized defect transparency map and the real defect mask map.
In the invention, a common GAN discriminator must map its output into a certain value range through an activation function; as a result, when the activation value of either the generator or the discriminator is too large, the other cannot learn an effective gradient, and the training of generator and discriminator must be carefully balanced to keep GAN training stable. WGAN proposes a new loss function with three modifications to the traditional GAN, resolving the common problems of unstable training and mode collapse. The WGAN modifications are: the activation function in the discriminator is removed; no logarithmic loss is used in the loss function; and weight clipping is adopted to limit the discriminator parameters to [−0.01, 0.01] so as to satisfy the 1-Lipschitz constraint. The WGAN loss function is:
L = E_{x∼p_r}[D(x)] − E_{z∼N(0,1)}[D(G(z))]   (2)
the WGAN-GP is continuously improved on the basis of the WGAN, because the model modeling capability is weakened by the simple and rough Weight clipping in the WGAN, and the phenomenon of Gradient disappearance or Gradient explosion is generated, the WGAN-GP uses Gradient Penalty (GP) to replace the Weight clipping, and finally the loss function is obtained as follows:
Figure BDA0003667653990000091
in addition, the method introduces flaw morphology and spatial constraints by computing the similarity of transparent channels and mask maps. Considering that the transparency in the mask region should be determined by the network itself, and is not limited by this constraint, the similarity outside the mask region is only calculated, and the morphological spatial constraint loss function is obtained as follows:
Figure BDA0003667653990000092
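The morphology and spatial constraint just described (similarity computed only outside the mask region) admits a very short concrete reading; in the sketch below, averaging rather than summing over pixels is an assumption made for scale invariance.

```python
import numpy as np

def alpha_constraint_loss(o_alpha, mask):
    """Morphology-space constraint: the generated transparency map should be
    zero outside the sampled defect mask, while inside the mask it is left
    free for the network to decide. o_alpha, mask: (H, W) arrays, mask
    binary in {0, 1}."""
    outside = 1.0 - mask
    return np.abs(o_alpha * outside).mean()
```

The loss is zero exactly when all transparency lies inside the mask, which pins the generated defect to the sampled coordinates and shape without dictating its opacity.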
in summary, the method of the present invention combines the WGAN-GP loss function and the morphological-spatial constraint loss function to obtain a final loss function:
L all =L ganalpha ×L alpha (5)
in the invention, because the network model parameter quantity is larger, the gradient penalty value of WGAN-GP is larger than other loss function values by several orders of magnitude, and the method is easy to cause during trainingThe network falls into a mediocre solution that does not generate images, so λ grad The selection is not too large, and the range is [1e-3,1e-5 ]]。
In the invention, because the defect generation module is a multi-output network, the different branches must be trained separately to speed up model convergence and improve the model's effect. During training, the Adam optimizer is used to optimize the network, and the learning rate is adjusted with a per-epoch decay strategy; the learning rate decay formula is:
lr_new = lr_old × (1 − curepoch/maxepoch)^power
where lr_new is the learning rate after the update, lr_old is the learning rate before the update, curepoch is the current training epoch, maxepoch is the maximum number of training epochs, and power is the attenuation factor.
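The per-epoch decay described here is a one-line function; the polynomial form with the attenuation factor `power` follows the surrounding description, and the default value of `power` is an assumption of this sketch.

```python
def decayed_lr(lr_old, cur_epoch, max_epoch, power=0.9):
    """Per-epoch learning-rate decay: lr_new = lr_old * (1 - cur/max)^power,
    where power is the attenuation factor. At cur_epoch = 0 the rate is
    unchanged; at cur_epoch = max_epoch it reaches zero."""
    return lr_old * (1.0 - cur_epoch / max_epoch) ** power
```

Note that applying the formula to the previous epoch's rate (as written here, following the text) compounds the decay, which makes it fall faster than the usual polynomial schedule computed from the initial rate.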
According to the method, different cloths and different defect types photographed on the production line are first semantically annotated to obtain the defect data set Ω1; the data in Ω1 then undergo vertical and horizontal random-flip data enhancement to increase the diversity of the defect data; finally, images of flawless cloth are photographed from the production line to form the flawless data set Ω2.
In the invention, the process of training the network model with the defect data set Ω1 and flawless data set Ω2 is divided into three stages: generator content training, generator transparency training and discriminator training.
In the generator content training stage, a defect-region mask map is first obtained by random sampling from the defect data set Ω1, and a flawless cloth image by random sampling from the flawless data set Ω2; the two serve as the generator's input data to obtain a synthetic sample, which is sent to the discriminator for evaluation, and the defect foreground generation branch is trained with the loss function L_gan of equation (3).
The generator transparency training stage coincides with the generator content training stage, except that it uses the loss function L_alpha of equation (4) for training and only the defect transparency generation branch is optimized during back propagation.
In the discriminator training stage, the discriminator receives a defect sample and the corresponding defect mask map as input and outputs whether the defect sample is a real image or a composite from the generator. The real defect sample and defect mask map come from the defect data set Ω1; the synthetic defect sample is produced by the generator. This stage trains the discriminator to judge real from fake with the loss function L_gan of equation (3).
Preferably, enhancing the cloth flaw data with the trained generative adversarial network comprises the following steps:
step 2.1: for each image of the flawless cloth data set Ω_2, perform N_r random frame selections to obtain several background images of 256 × 256 pixels, where N_r is in the range [2,10];
step 2.2: for each background image, randomly sample a mask map from the flaw data set Ω_1, and input the background image and the mask map together into the flaw generation network model to obtain a synthesized flaw image;
step 2.3: paste the synthesized flaw image back into the frame-selected region of the corresponding flawless cloth image from step 2.1 to obtain a flawed cloth image generated by data enhancement.
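Steps 2.1 to 2.3 can be sketched as below; `synth_fn` is a hypothetical stand-in for the trained flaw generation network:

```python
import numpy as np

rng = np.random.default_rng(0)

def enhance(cloth, synth_fn, n_r=5, size=256):
    """Steps 2.1-2.3: randomly frame-select n_r background patches from a
    flawless cloth image, synthesize a flaw on each, and paste it back.
    `synth_fn` stands in for the flaw generation network (an assumption)."""
    out = cloth.copy()
    h, w = cloth.shape[:2]
    for _ in range(n_r):
        y = rng.integers(0, h - size + 1)          # step 2.1: random frame
        x = rng.integers(0, w - size + 1)
        patch = cloth[y:y + size, x:x + size]
        out[y:y + size, x:x + size] = synth_fn(patch)  # steps 2.2 + 2.3
    return out

cloth = np.zeros((512, 512, 3), dtype=np.float32)
result = enhance(cloth, synth_fn=lambda p: p + 0.5)    # dummy "flaw generator"
```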
The invention relates to an optimized cloth flaw data enhancement method based on a generative adversarial network: after the generative adversarial network is built and trained, multi-scale feature extraction and style encoding are performed on the input flawed cloth image to be enhanced through multi-layer convolution and instance normalization; an up-sampling convolution module is built with spatially-adaptive normalization; the network task is decoupled through 2 parallel convolution branches, corresponding respectively to the generation of the flaw foreground and the flaw transparency; and, combined with transparency control and flaw shape and spatial constraints, the data-enhanced flawed cloth image is finally obtained.
The method uses multi-layer convolution and instance normalization (InstanceNorm) to perform multi-scale feature extraction and style encoding on the input image. Drawing on GauGAN, SPADE normalization is used to build the up-sampling convolution module, which preserves the style characteristics of the feature maps while introducing semantic segmentation information, finally yielding high-resolution features. For the high-resolution features obtained by up-sampling, two parallel convolution branches are designed to decouple the network task, responsible respectively for generating the flaw foreground and the flaw transparency, and the concept of transparency control is introduced to make flaw synthesis more natural and smooth. Flaw shape and spatial constraints are also introduced to limit the coordinates and shapes of generated flaws, making the data enhancement effect controllable. In addition, the method designs a discriminator modeled on the patch discriminator of PatchGAN to supervise the quality of flaw synthesis.
The method can be used to enhance an existing data set, improving the performance of a cloth flaw detector by balancing the sample class counts and expanding the flaw samples; it can also superimpose flaws on new flawless cloth to quickly generate a sufficient flaw data set for a new background.
Drawings
FIG. 1 is a content block diagram of the present invention;
FIG. 2 is an overall network architecture of the present invention;
FIG. 3 is a feature encoding module network structure;
FIG. 4 is a network structure of a fault generation module;
FIG. 5 is a SPADEResblock structure;
FIG. 6 is a multi-scale block discriminator structure;
FIG. 7 is a graph illustrating the data enhancement effect of the invention on knot defects;
FIG. 8 is a diagram illustrating the effect of data enhancement on dirty defects according to the present invention;
FIG. 9 is a graph showing the data enhancement effect of the present invention on color yarn defects;
FIG. 10 is a graph showing the effect of the present invention on data enhancement of a spinning defect.
Detailed Description
The present invention is described in further detail with reference to the following examples, but the scope of the present invention is not limited thereto.
The computer hardware configuration selected by the invention is as follows: CPU Intel i7-11700K @ 3.6 GHz, GPU RTX 3090 with 24 GB of video memory, and 16 GB of RAM. The software platform is a 64-bit Ubuntu 18.04 system, implemented with PyTorch 1.8.
As shown in FIG. 1, the cloth flaw data enhancement method based on a generative adversarial network comprises three parts:
(1) Building the generative adversarial network;
(2) Training and optimizing the neural network;
(3) Performing data enhancement on the existing data set with the trained neural network model.
Building the generative adversarial network specifically comprises:
(1-1) feature encoding Module
The feature encoding module takes a flawless cloth sample S_normal as input; the sample picture resolution is width W × height H. The module consists of five convolution blocks and two fully-connected layers, with the structure shown in FIG. 3. The convolution blocks perform feature extraction and down-sampling; each convolution block consists of three parts: a standard 3×3 convolution Conv, instance normalization InstanceNorm, and the leaky linear activation function LeakyReLU. The first Conv layer has stride 1 and zero padding 1; the remaining four Conv layers have stride 2 and zero padding 2, realizing the down-sampling function and yielding feature maps at different resolutions. The LeakyReLU slope for negative activation values is 0.2. After the feature encoding module, five feature maps at different resolutions, F_0, F_1, F_2, F_3 and F_4, are obtained from the outputs of the five convolution blocks and used as input for the subsequent network. F_0 has the same resolution as the input image, while F_1, F_2, F_3 and F_4 are 1/2, 1/4, 1/8 and 1/16 of the input image resolution, respectively.
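A sketch of the five convolution blocks, assuming a uniform channel width of 64 (not specified in the text) and padding 1 in the stride-2 blocks so that the stated 1/2 to 1/16 resolutions come out exactly:

```python
import torch
import torch.nn as nn

def conv_block(cin, cout, stride):
    # Standard 3x3 Conv + InstanceNorm + LeakyReLU(0.2). Padding 1 is used
    # here so that stride-2 blocks halve the resolution exactly (the patent
    # text states zero padding 2, which gives only approximate halving).
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, stride=stride, padding=1),
        nn.InstanceNorm2d(cout),
        nn.LeakyReLU(0.2),
    )

# Channel widths are not given in the text; 64 throughout is an assumption.
blocks = nn.ModuleList([
    conv_block(3, 64, 1),    # F0: full resolution
    conv_block(64, 64, 2),   # F1: 1/2
    conv_block(64, 64, 2),   # F2: 1/4
    conv_block(64, 64, 2),   # F3: 1/8
    conv_block(64, 64, 2),   # F4: 1/16
])

x = torch.rand(1, 3, 256, 256)   # flawless sample S_normal, W = H = 256
feats = []
for b in blocks:
    x = b(x)
    feats.append(x)
F0, F1, F2, F3, F4 = feats
```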
The fully-connected layers compute the style mean and variance of the feature maps: the highest-level feature map F_4 is flattened into a one-dimensional sequence and fed into two parallel 256-dimensional fully-connected layers, which output the 256-dimensional style mean μ and variance σ² of the image, respectively; sampling from the mean μ and variance σ² yields the 256-dimensional style noise z.
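The style-encoding step can be sketched as below; predicting log σ² rather than σ² is a common stabilising choice and an assumption here:

```python
import torch
import torch.nn as nn

# F4 from the feature encoder, flattened into a one-dimensional sequence.
F4 = torch.rand(1, 64, 16, 16)          # channel width 64 is an assumption
flat = F4.flatten(start_dim=1)          # shape (1, 64*16*16)

fc_mu = nn.Linear(flat.shape[1], 256)       # 256-d style mean head
fc_logvar = nn.Linear(flat.shape[1], 256)   # 256-d variance head

mu = fc_mu(flat)
logvar = fc_logvar(flat)   # log sigma^2 parameterization (assumption)

# Reparameterized Gaussian sampling yields the 256-d style noise z.
z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
```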
(1-2) Defect Generation Module
The flaw generation module consists of two parallel branches with the same structure: a flaw foreground generation branch and a flaw transparency generation branch, with the overall architecture shown in FIG. 4. Both are decoding networks that decode noise into a flaw image and a flaw transparency map, respectively. The specific structure is as follows:
(1-2-1) Defect foreground generation branch
The flaw foreground generation branch uses SPADE normalization to introduce the semantic map and takes the SPADEResblock from GauGAN as its basic module. The structure of the SPADEResblock module is shown in FIG. 5: the module consists of two convolution modules in series, each composed in turn of a SPADE layer, the nonlinear activation function ReLU, and a standard 3×3 convolution with stride 1 and zero padding 1; a residual connection is established between the input and output.
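A sketch of a SPADEResblock, with the mask-encoder width and layout following GauGAN conventions (assumptions where the text is silent):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SPADE(nn.Module):
    """Spatially-adaptive normalization: parameter-free normalization whose
    per-pixel scale and shift are predicted from the resized flaw mask."""
    def __init__(self, channels, hidden=64):
        super().__init__()
        self.norm = nn.InstanceNorm2d(channels, affine=False)
        self.shared = nn.Sequential(nn.Conv2d(1, hidden, 3, padding=1), nn.ReLU())
        self.gamma = nn.Conv2d(hidden, channels, 3, padding=1)
        self.beta = nn.Conv2d(hidden, channels, 3, padding=1)

    def forward(self, x, mask):
        mask = F.interpolate(mask, size=x.shape[-2:], mode="nearest")
        h = self.shared(mask)
        return self.norm(x) * (1 + self.gamma(h)) + self.beta(h)

class SPADEResblock(nn.Module):
    """Two serial SPADE -> ReLU -> 3x3 Conv (stride 1, padding 1) modules
    with a residual connection between input and output."""
    def __init__(self, channels):
        super().__init__()
        self.spade1, self.spade2 = SPADE(channels), SPADE(channels)
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x, mask):
        h = self.conv1(F.relu(self.spade1(x, mask)))
        h = self.conv2(F.relu(self.spade2(h, mask)))
        return x + h

block = SPADEResblock(64)
y = block(torch.rand(1, 64, 32, 32), torch.rand(1, 1, 256, 256))
```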
The flaw foreground generation branch receives three groups of input data: the 256-dimensional style noise z obtained by the feature encoding module; the feature maps F_0, F_1, F_2, F_3 and F_4 output by the intermediate layers of the feature encoding module; and a flaw mask map M randomly sampled from the existing flaw sample library. The projection noise z_project is obtained from the style noise z through a fully-connected layer, whose output dimension is an integer multiple of the input dimension, the multiple being in the range [128, 512]; here 256 is taken. z_project is reshaped into a two-dimensional feature map with resolution (W/32) × (H/32). A basic decoding module is formed by a SPADEResblock module followed by bilinear-interpolation 2× up-sampling; the two-dimensional feature map is fed through 5 serial basic decoding modules for decoding, while the flaw mask map M is input at the SPADEResblock side to guide the decoding.
At the output of each basic decoding module, the feature map F_1/F_2/F_3/F_4 of the corresponding resolution from the feature encoding module is concatenated along the channel dimension and used as the input of the next-stage basic decoding module. After the 5 basic decoding modules, a high-resolution feature map F_out with resolution W × H is finally obtained; F_out is concatenated with F_0 along the channel dimension, the dimensionality is compressed to 3 channels through one more SPADEResblock layer, and the foreground map O_forge is output through a Tanh activation function.
(1-2-2) Defect transparency generating Branch
The flaw transparency generation branch has the same structure as the flaw foreground branch, and its up-sampling intermediate layers likewise receive the image features F_0 to F_4 extracted by the feature encoding module. The main differences are: the input is standard normally distributed noise z_noise with mean 0 and variance 1; the activation function at the output is changed from Tanh to Sigmoid; and the final output is a 1-dimensional transparency image O_alpha. The synthesized flaw map is finally obtained by equation (1).
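Equation (1) itself is not reproduced in this text; from claim 4 it is the alpha-blend S_defect = S_normal·(1 − O_alpha) + O_forge·O_alpha, which can be checked numerically:

```python
import numpy as np

def composite(s_normal, o_forge, o_alpha):
    """Equation (1): blend the generated flaw foreground into the flawless
    cloth using the generated per-pixel transparency."""
    return s_normal * (1.0 - o_alpha) + o_forge * o_alpha

s_normal = np.full((4, 4, 3), 0.8)   # flawless cloth
o_forge = np.zeros((4, 4, 3))        # dark flaw foreground
o_alpha = np.zeros((4, 4, 1))
o_alpha[1:3, 1:3] = 1.0              # flaw occupies the centre only

s_defect = composite(s_normal, o_forge, o_alpha)
# Outside the flaw region the cloth is untouched; inside, the foreground wins.
```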
(1-3) discriminator Module
The invention uses a multi-scale patch discriminator to discriminate the generated cloth flaw maps, with the main structure shown in FIG. 6. There are three parallel branches B_0, B_1 and B_2, whose input layers have identical parameters: each is a standard 3×3 convolution plus the leaky nonlinear activation function LeakyReLU, where the convolution stride is 2, the zero padding is 2, and the negative slope of the LeakyReLU is 0.2. Furthermore, a standard 3×3 convolution, instance normalization InstanceNorm and the leaky nonlinear activation function LeakyReLU together form the basic module CIL, where the convolution stride is 2, the zero padding is 2, and the negative slope of the LeakyReLU is 0.2. Branch B_0 contains 3 serial CILs, B_1 contains 2 serial CILs, and B_2 contains 1 CIL.
The multi-scale patch discriminator receives four-channel input image data, where channels 0 to 2 are the cloth flaw image and channel 3 is the flaw mask map. The original W × H image is fed to branch B_0 for discrimination, the (W/2) × (H/2) image to branch B_1, and the (W/4) × (H/4) image to branch B_2.
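The three-branch input pyramid can be sketched as follows, again with padding 1 for exact halving (the text states padding 2) and an assumed channel width of 64:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def cil(cin, cout):
    # Basic module CIL: 3x3 Conv (stride 2) + InstanceNorm + LeakyReLU(0.2).
    return nn.Sequential(nn.Conv2d(cin, cout, 3, 2, 1),
                         nn.InstanceNorm2d(cout), nn.LeakyReLU(0.2))

def input_layer(cin, cout):
    # Shared-parameter input layer: 3x3 Conv (stride 2) + LeakyReLU(0.2).
    return nn.Sequential(nn.Conv2d(cin, cout, 3, 2, 1), nn.LeakyReLU(0.2))

b0 = nn.Sequential(input_layer(4, 64), cil(64, 64), cil(64, 64), cil(64, 64))
b1 = nn.Sequential(input_layer(4, 64), cil(64, 64), cil(64, 64))
b2 = nn.Sequential(input_layer(4, 64), cil(64, 64))

flaw_img = torch.rand(1, 3, 256, 256)   # channels 0-2: cloth flaw image
mask = torch.rand(1, 1, 256, 256)       # channel 3: flaw mask map
x = torch.cat([flaw_img, mask], dim=1)

# Full, half and quarter resolution inputs reach B0, B1 and B2 respectively;
# with one fewer CIL per branch, all three emit patch maps of the same size.
outs = [b0(x),
        b1(F.interpolate(x, scale_factor=0.5)),
        b2(F.interpolate(x, scale_factor=0.25))]
```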
The neural network training optimization specifically comprises the following steps:
(2-1) loss function design
The model is trained with the loss function of equation (5). Because the network model has a large number of parameters, the WGAN-GP gradient penalty value can be several orders of magnitude larger than the other loss terms, which easily drives the network into a mediocre solution that generates no images during training; therefore λ_grad should not be too large, and its range is [1e-3, 1e-5], with 1e-4 taken here. λ_disc is in the range [0.1, 5], with 1 taken here; λ_alpha is in the range [1, 10], with 5 taken here.
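The WGAN-GP gradient penalty referred to here, weighted by λ_grad, has the standard form (penalizing the discriminator's gradient norm away from 1 on random interpolates):

```python
import torch

def gradient_penalty(disc, real, fake):
    """WGAN-GP term: penalize the discriminator's gradient norm deviating
    from 1 on random interpolates of real and synthesized samples."""
    eps = torch.rand(real.size(0), 1, 1, 1)
    x_hat = (eps * real + (1 - eps) * fake).requires_grad_(True)
    d_out = disc(x_hat).sum()
    (grad,) = torch.autograd.grad(d_out, x_hat, create_graph=True)
    return ((grad.flatten(1).norm(2, dim=1) - 1) ** 2).mean()

# Hyper-parameter weights chosen in the text:
lambda_grad, lambda_disc, lambda_alpha = 1e-4, 1.0, 5.0

disc = torch.nn.Conv2d(3, 1, 3, padding=1)   # stand-in discriminator
gp = gradient_penalty(disc, torch.rand(2, 3, 32, 32), torch.rand(2, 3, 32, 32))
```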
(2-2) data set creation and enhancement
Because no open semantically-annotated cloth flaw data set is available, the method performs semantic annotation on different cloths and different defect types captured on the production line to obtain the data set Ω_1, which contains 460 images and 652 flaw instances covering 4 common defects: knots, stains, colored yarns and silk defects. Because the data set is small, in order to keep the model from mode collapse and overfitting, the data are augmented during training by random vertical and horizontal flipping, each with probability 0.5. The flawless cloth data set Ω_2 is obtained by shooting on the factory production line; it requires no additional annotation work and is unlimited in quantity.
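The random flip augmentation can be sketched as:

```python
import numpy as np

rng = np.random.default_rng(42)

def random_flip(img, p=0.5):
    """Augmentation used during training: flip vertically and horizontally,
    each independently with probability 0.5."""
    if rng.random() < p:
        img = img[::-1, :]    # vertical flip
    if rng.random() < p:
        img = img[:, ::-1]    # horizontal flip
    return img

img = np.arange(12).reshape(3, 4)
aug = random_flip(img)   # same pixels, possibly reordered
```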
(2-3) model training Process
The method uses the Adam optimizer to optimize the network model parameters; the initial learning rate range is [1e-6, 5e-4], with 1e-5 taken here. The learning rate decay uses equation (6), where the power range is [0.1, 10], with 4 taken here.
The discriminator training stage uses the loss function L_gan to train the discriminator to distinguish real from fake samples.
In the generator content training stage, a flaw mask map and a flawless sample are sent to the generator, and the generated flaw sample is sent to the discriminator for evaluation; the flaw foreground generation branch is trained with the loss function L_gan. The generator transparency training stage coincides with the generator content training stage, except that this stage trains with the loss function L_alpha and only the flaw transparency generation branch is optimized during back-propagation.
The process of using the trained neural network model to perform data enhancement on the existing data set specifically comprises the following steps:
The trained feature encoding module and flaw generation module are used to perform data enhancement on existing flawless cloth images. First, N_r random frame selections are performed on each flawless cloth image in data set Ω_2, yielding several background images of 256 × 256 pixels; N_r is in the range [2,10], with 5 taken here. For each background image, a mask map is randomly sampled from data set Ω_1 and input together with the background image into the feature encoding module for feature extraction; the obtained feature maps are fed into the flaw generation module to obtain a synthesized flaw image, and finally the synthesized flaw is pasted back into the frame-selected region of the cloth to complete the data enhancement.
FIGS. 7 to 10 show the data enhancement effect of the method of the present invention on four defects: knots, stains, colored yarns and silk defects, respectively. As shown in the figures, the cloth flaw data enhancement algorithm can superimpose realistic flaw patterns on flawless cloth, realizing the expansion of the cloth flaw data.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (10)

1. A cloth flaw data enhancement method based on a generative adversarial network, characterized by comprising: after the generative adversarial network is built and trained, performing multi-scale feature extraction and style encoding on an input flawed cloth image to be enhanced through multi-layer convolution and instance normalization; building an up-sampling convolution module with spatially-adaptive normalization; decoupling the network task with 2 parallel convolution branches, corresponding respectively to the generation of the flaw foreground and the flaw transparency; and, combined with transparency control and flaw shape and spatial constraints, finally obtaining the data-enhanced flawed cloth image.
2. The cloth flaw data enhancement method based on a generative adversarial network according to claim 1, characterized in that the built generative adversarial network comprises:
a feature encoding module for extracting feature maps at different resolutions during down-sampling and encoding the image features to obtain a mean μ and a variance σ² representing the image style;
a flaw generation module for up-sampling the style noise obtained by sampling from the image-style mean μ and variance σ², continuously introducing the feature maps extracted by the feature encoding module during up-sampling, to obtain a synthesized flaw sample;
and a discriminator module for judging whether an input sample is a real flaw sample or a synthesized flaw sample, and for guiding the flaw generation module to synthesize flaw samples during iterative model training.
3. The cloth flaw data enhancement method based on a generative adversarial network according to claim 2, characterized in that the feature encoding module comprises 5 convolution blocks and 2 fully-connected layers;
each convolution block comprises a standard 3×3 convolution layer, an instance normalization layer, and a leaky linear activation function layer;
the 5 convolution blocks output feature maps F_0, F_1, F_2, F_3 and F_4 at five different resolutions, respectively; F_4 is input into two parallel 256-dimensional fully-connected layers to compute the image-style mean vector μ and variance vector σ², respectively, and a single Gaussian sampling of μ and σ² yields the style noise z.
4. The cloth flaw data enhancement method based on a generative adversarial network according to claim 3, characterized in that the flaw generation module comprises a parallel flaw foreground generation branch and flaw transparency generation branch;
the flaw foreground generation branch comprises a fully-connected layer and 6 up-sampling modules, with a residual connection between the input and output of each up-sampling module; the input style noise z yields the projection noise z_project after the fully-connected layer, a high-resolution feature map is obtained after the 6 up-sampling modules, and the foreground map O_forge is output through a Tanh activation function;
the flaw transparency generation branch comprises a fully-connected layer and 6 up-sampling modules, with a residual connection between the input and output of each up-sampling module; the input standard normally distributed noise z_noise yields a high-resolution feature map after the 6 up-sampling modules, and the transparency image O_alpha is output through a Sigmoid activation function;
thereby obtaining S_defect = S_normal · (1 − O_alpha) + O_forge · O_alpha, where S_defect is the synthesized flaw map, S_normal is the flawless sample input to the feature encoding module, O_forge is the output of the flaw foreground generation branch, and O_alpha is the output of the flaw transparency generation branch.
5. The cloth flaw data enhancement method based on a generative adversarial network according to claim 4, characterized in that the up-sampling module consists of two convolution blocks in series, each comprising, in sequence, a SPADE layer, the nonlinear activation function ReLU, and a standard 3×3 convolution.
6. The cloth flaw data enhancement method based on a generative adversarial network according to claim 3, characterized in that the input of the flaw generation module comprises the 256-dimensional style noise z obtained by the feature encoding module, the feature maps F_0, F_1, F_2, F_3 and F_4 output by the intermediate layers of the feature encoding module, and a flaw mask map M randomly sampled from a preset flaw sample library;
the projection noise z_project obtained by feeding the style noise z into the corresponding fully-connected layer and the standard normally distributed noise z_noise are input respectively into the 6 up-sampling modules of the corresponding generation branches for decoding, while the flaw mask map M is input to the up-sampling modules to guide the decoding; at the outputs of the first 5 up-sampling modules, the feature maps F_4, F_3, F_2, F_1 and F_0 are concatenated in sequence as input to the next-level up-sampling module; for the flaw foreground generation branch, the feature-map dimensionality is compressed to 3 channels through the last up-sampling module and output through the corresponding activation function; for the flaw transparency generation branch, the feature-map dimensionality is compressed to 1 channel through the last up-sampling module and output through the corresponding activation function.
7. The cloth flaw data enhancement method based on a generative adversarial network according to claim 2, characterized in that the discriminator module comprises three parallel branches B_0, B_1 and B_2; each branch comprises an input layer and several modules CIL, each module CIL comprising, in sequence, a standard 3×3 convolution layer, an instance normalization layer, and a leaky nonlinear activation function layer; branch B_0 has 3 CIL modules and is used for receiving and discriminating images at the original size W × H;
branch B_1 has 2 CIL modules and is used for receiving and discriminating images of size (W/2) × (H/2);
branch B_2 has 1 CIL module and is used for receiving and discriminating images of size (W/4) × (H/4).
8. The cloth flaw data enhancement method based on a generative adversarial network according to claim 1, characterized in that training the generative adversarial network comprises the following steps:
step 1.1: designing a loss function;
step 1.2: producing and enhancing the data set to obtain the flaw data set Ω_1 and the flawless cloth data set Ω_2;
step 1.3: optimizing the network model parameters with the Adam optimizer, training the flaw foreground generation branch with the loss function L_gan, and training the flaw transparency generation branch with the loss function L_alpha, which is optimized during back-propagation.
9. The cloth flaw data enhancement method based on a generative adversarial network according to claim 8, characterized in that:
Figure FDA0003667653980000041
Figure FDA0003667653980000042
the final loss function is L_all = L_gan + λ_alpha × L_alpha,
where x represents a true flaw sample sampled from the true flaw data distribution p_r, z represents random noise sampled from the standard normal distribution N(0, 1), G(z) represents a synthesized flaw sample sampled from the flaw generator's data distribution, D denotes the discriminator, G denotes the generator, m denotes a true flaw mask map sampled from the real flaw mask data distribution p_ralpha, and a denotes a synthesized flaw transparency map sampled from the flaw transparency generator's data distribution p_galpha; λ_disc is a hyper-parameter controlling the realism of the synthesized flaws, λ_grad is a hyper-parameter controlling the gradient penalty strength, and λ_alpha is a hyper-parameter controlling the similarity between the synthesized flaw transparency map and the real flaw mask map.
10. The cloth flaw data enhancement method based on a generative adversarial network according to claim 8, characterized in that enhancing the cloth flaw data with the trained generative adversarial network comprises the following steps:
step 2.1: for each image of the flawless cloth data set Ω_2, perform N_r random frame selections to obtain several background images of 256 × 256 pixels, where N_r is in the range [2,10];
step 2.2: for each background image, randomly sample a mask map from the flaw data set Ω_1, and input the background image and the mask map together into the flaw generation network model to obtain a synthesized flaw image;
step 2.3: paste the synthesized flaw image back into the frame-selected region of the corresponding flawless cloth image from step 2.1 to obtain a flawed cloth image generated by data enhancement.
CN202210595194.9A 2022-05-28 2022-05-28 Cloth flaw data enhancement method based on generation countermeasure network Pending CN115205616A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210595194.9A CN115205616A (en) 2022-05-28 2022-05-28 Cloth flaw data enhancement method based on generation countermeasure network


Publications (1)

Publication Number Publication Date
CN115205616A true CN115205616A (en) 2022-10-18

Family

ID=83576203

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210595194.9A Pending CN115205616A (en) 2022-05-28 2022-05-28 Cloth flaw data enhancement method based on generation countermeasure network

Country Status (1)

Country Link
CN (1) CN115205616A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116167923A (en) * 2023-04-26 2023-05-26 无锡日联科技股份有限公司 Sample expansion method and sample expansion device for x-ray image


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination