CN113962893A - Face image restoration method based on multi-scale local self-attention generation countermeasure network - Google Patents
- Publication number
- CN113962893A (application CN202111253713.5A)
- Authority
- CN
- China
- Prior art keywords
- attention
- image
- channel
- network
- module
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06T 5/00 — Image enhancement or restoration
- G06F 18/213 — Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G06F 18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06N 3/045 — Combinations of networks
- G06N 3/08 — Learning methods
- G06T 7/0012 — Biomedical image inspection
- G06T 2207/10004 — Still image; Photographic image
- G06T 2207/20081 — Training; Learning
- G06T 2207/20084 — Artificial neural networks [ANN]
- G06T 2207/30201 — Face
Abstract
The invention relates to a face image restoration method based on a multi-scale local self-attention generative adversarial network (GAN), which comprises the following steps: acquiring a face image with missing regions and the corresponding mask, and preprocessing them; constructing a multi-scale local self-attention generative adversarial network, and training it on a defective face image dataset to obtain a face restoration model; and restoring the defective face image under test with the trained multi-scale local self-attention generative adversarial network model. By adding a multi-scale structure and a dual-channel local self-attention module to the generator network, the invention effectively addresses the unstable training, low restoration precision and efficiency, lack of symmetry, and mode collapse that afflict GAN-based face restoration, and provides an efficient, accurate, and stable method for face restoration.
Description
Technical Field
The invention belongs to the technical field of computer-based face image restoration, and particularly relates to a face image restoration method based on a multi-scale local self-attention generative adversarial network.
Background
Image restoration recovers the damaged region of an image by technical means, so that the damaged region is consistent with the surrounding features and the restored image carries the same semantic content as the original. At present, the classical algorithms for face image restoration are mainly diffusion-based algorithms and image-block (patch) matching algorithms. These classical algorithms rest on mathematical and physical models: they require the input image to contain information similar to the missing region — similar pixels, structures, or patches — and cannot generate new content. If a large area of the image is missing, they cannot reconstruct it effectively.
Image restoration methods based on deep generative adversarial networks repair a missing image by learning the distribution of the input images. Compared with traditional reconstruction methods, they can capture the high-level semantic information of the image without requiring similar pixels or patches to be present in the damaged image, and can generate missing regions with the same semantics as the original, thereby restoring the image effectively. Such methods therefore repair not only small missing regions but also large-area losses reconstructed from semantic content, making them an effective approach to face restoration.
At present, the main networks that implement image restoration with generative adversarial networks are the Context Encoder (CE) proposed by Pathak et al. and the globally and locally consistent image completion network (GLCIC) proposed by Iizuka et al. Both architectures reconstruct images from semantic information, using a reconstruction loss and an adversarial loss to guide the image generation process. However, the context encoder focuses mainly on repairing the missing region, and pasting the original image back over the non-missing region can leave a visible repair boundary between the missing and non-missing regions, harming the integrity of the generated image. GLCIC controls the generation process with a global discriminator and a local discriminator, which removes the repair boundary produced by CE; but this method does not focus on the missing region, so the generated content there tends to be blurred.
Disclosure of Invention
The invention aims to solve the technical problems that existing GAN-based image restoration methods do not focus on restoring the missing region and produce blurred content there, and provides a face image restoration method based on a multi-scale local self-attention generative adversarial network. The generator adopts a multi-scale local self-attention module that focuses on the information of the missing region, so the problem of high-precision face restoration can be addressed in a targeted manner, and the added multi-scale image information makes the training process more efficient and stable.
In order to solve the above technical problems, the invention adopts the following technical scheme:
a facial image restoration method based on a multi-scale local self-attention generation countermeasure network comprises the following steps:
Step one: acquire an original face image x and a corresponding binary defect mask M; construct a defective face image dataset {x_M | x_M = M ⊙ x} and the corresponding original image dataset {x}, and divide the defective face image dataset into a training set and a test set according to a preset proportion, where ⊙ denotes element-wise multiplication;
Step two: construct the multi-scale local self-attention generative adversarial network, which consists of a generator network and a discriminator network, and embed a dual-channel local self-attention module at different scales of the generator network; the dual-channel local self-attention module comprises a cross-attention channel and a spatial self-attention channel, connected in parallel;
Step three: set the hyper-parameters of the network model and train the multi-scale local self-attention generative adversarial network on the defective face image training set, adopting an Adam optimizer for the generator network and stochastic gradient descent (SGD) for the discriminator network, and optimizing the sum of several loss functions during the adversarial training to obtain the multi-scale local self-attention generative adversarial face restoration model;
Step four: test the multi-scale local self-attention generative adversarial face restoration model on the defective face image test set, and evaluate the restoration performance of the model with the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) indexes.
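Step one's mask-based dataset construction and step four's PSNR evaluation can be sketched in NumPy. This is a minimal illustration under assumed conventions (mask value 1 marks a known pixel, 0 a missing one; `make_defective` and `psnr` are hypothetical helper names, not from the patent); SSIM is omitted for brevity.

```python
import numpy as np

def make_defective(x: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Element-wise product x_M = M (.) x: zeroes out the missing region."""
    return mask * x

def psnr(restored: np.ndarray, original: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB, one of the two metrics in step four."""
    mse = np.mean((restored.astype(np.float64) - original.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

# toy 4x4 grayscale "face" with a 2x2 hole in the centre
x = np.arange(16, dtype=np.float64).reshape(4, 4)
M = np.ones_like(x)
M[1:3, 1:3] = 0          # binary defect mask: 0 marks the missing region
x_M = make_defective(x, M)
```

A perfect restoration gives infinite PSNR; a single off-by-one pixel on this toy image already yields roughly 60 dB, which shows why PSNR discriminates small reconstruction errors well.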
Further, the generator network comprises an encoder module, a semantic feature repair module, and a decoder module; the encoder and decoder modules are essentially symmetrical in structure.
the encoder module comprises a plurality of encoding feature extraction modules, and the encoding feature extraction modules comprise encoding convolution layers, batch normalization and Leaky ReLu activation functions. The method comprises the steps that each coding convolution layer carries out feature extraction on an input defect image through convolution kernels with the size of k, the scanning step length of s and the number of filling pixels of p, batch normalization is carried out after each convolution operation, the convolution kernels are activated through a nonlinear activation function Leaky ReLu, and along with the increase of the number of convolution layers, extracted features gradually evolve from low-level features based on colors and textures to high-level abstract features based on image semantic information; compressing the input defect image into feature maps of different scales through coding operation;
the semantic feature repairing module comprises a plurality of feature restoring modules, and each feature restoring module consists of an expansion convolution layer, batch normalization and a Leaky ReLu activating function. The convolution kernel of each expansion convolution layer is 3 × 3 expansion convolution, and the expansion rate of the t-th layer convolution kernel is 2tWherein T is 1,2, … T0(ii) a The method is used for performing semantic feature extraction and face image restoration on the compressed feature map;
the decoder module consists of a plurality of decoding feature mapping modules, m-scale dual-channel local self-attention modules, a plurality of up-sampling modules and a nonlinear image balancing module; the decoding feature mapping module consists of a decoding convolution layer-batch normalization-Leaky ReLu activation function; the up-sampling module consists of a deconvolution layer, batch normalization and a Leaky ReLu activation function; the nonlinear image equalization module consists of a decoding convolution layer-Tanh activation function; a double-channel local self-attention module is added in front of each up-sampling module; the concrete connection mode is as follows: the first decoding feature mapping module is connected with the second decoding feature mapping module and used for extracting a feature map of a corresponding scale, the second decoding feature mapping module is connected with the mth scale dual-channel local self-attention module, and the missing information is repaired through the difference between the known area and the missing area of the focused image; the m-scale dual-channel local self-attention module is connected with the 1 st up-sampling module, up-sampling of the image is achieved through deconvolution operation, and batch normalization operation and Leaky ReLu function activation are conducted; the 1 st upsampling module is connected with the m +1 th scale double-channel local self-attention module through the third decoding feature mapping module, and the function of the module is to focus the difference between a known region and a missing region in the upsampled feature map again on the m +1 scale to repair and adjust the missing information, so that the repair of the feature map is realized on multiple scales; the (m + 1) th scale dual-channel local self-attention module is connected with the (2) th up-sampling module, the (2) th up-sampling module is connected with the nonlinear image equalization module after 
passing through the fourth decoding feature mapping module, namely, the image is converted into a three-channel RGB image, so that the effective reconstruction of the image is realized.
Furthermore, the discriminator network comprises a plurality of feature discrimination modules, each consisting of a discrimination convolution layer, batch normalization, and a Leaky ReLU activation function. Each discrimination convolution layer extracts and compresses the features of the reconstructed image with convolution kernels of size k′ and scanning stride s′; the network finally outputs a probability value that judges the restoration quality of the generated image, and the number of channels of the feature map output by each discrimination convolution layer is at least double the number of channels of the corresponding convolution layer in the generator network.
Further, the feature map obtained before each dual-channel local self-attention module in the decoder is convolved to obtain an RGB image at the corresponding scale; during restoration this image is compared, through an L2 reconstruction loss, with the real image at the same scale, so that the generation of the face image is controlled step by step and the training process is more stable.
Further, the cross-attention channel in the dual-channel local self-attention module repairs the defective region of the image by focusing attention across the missing and non-missing regions, specifically:
(I) the input of each channel of the dual-channel local self-attention module is the feature map F before each deconvolution layer in the decoder, of size M1 × M2 × C, where M1, M2, and C are respectively the number of pixels in the height dimension, the number of pixels in the width dimension, and the number of channels of F;
(II) F is divided into a defective area and a non-defective area according to the size of the mask, the defective area being defined as the foreground Ff and the non-defective area as the background Fb;
(III) the foreground Ff and the background Fb are reshaped into one-dimensional vectors of size Pf × C and Pb′ × C respectively, where Pf = m1 × m2 and Pb′ = (M1 × M2) − (m1 × m2); m1 and m2 are the numbers of pixels in the height and width dimensions of the foreground Ff, and Pf and Pb′ are the numbers of pixels of the foreground and the background;
(IV) in the cross-attention channel, one-dimensional convolution operations are applied to the reshaped foreground Ff and background Fb, yielding one transformation feature Q of the foreground and two transformation features K and V of the background: Q = Wq Ff, K = Wk Fb, and V = Wv Fb, where Wq, Wk, and Wv are the feature transformation matrices of the cross-attention channel and are learnable parameters of the network;
(V) the element E_ij of the cross-attention map E can be expressed as: E_ij = exp(Q_i^T K_j) / Σ_j exp(Q_i^T K_j), where the subscripts i and j denote the indexes of the elements in the corresponding quantities and the superscript T denotes the transposition operation;
(VI) the output of the cross-attention channel is: Y_f = pad(β1 · V^T E^T), where β1 is the weight-allocation parameter of the cross-attention channel and a learnable parameter of the network, pad(·) denotes the zero-padding operation, V^T is the transpose of the transformation feature V of the background Fb in the cross-attention channel, and E^T is the transpose of the attention map E in the cross-attention channel.
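A shape-level NumPy sketch of the cross-attention channel: random matrices stand in for the learned transformations Wq, Wk, Wv, and a row-wise softmax is assumed for normalizing the attention map, so this illustrates the data flow (foreground queries attending to background keys) rather than the trained operator.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z: np.ndarray, axis: int = -1) -> np.ndarray:
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

C, Pf, Pb = 8, 4, 12           # channels, foreground pixels, background pixels
Ff = rng.normal(size=(Pf, C))  # flattened foreground (missing region)
Fb = rng.normal(size=(Pb, C))  # flattened background (known region)

# stand-ins for the learnable 1-D convolutions Wq, Wk, Wv
Wq, Wk, Wv = (rng.normal(size=(C, C)) for _ in range(3))
Q, K, V = Ff @ Wq, Fb @ Wk, Fb @ Wv

E = softmax(Q @ K.T, axis=1)   # (Pf, Pb): each foreground pixel weighs every background pixel
Yf = E @ V                     # (Pf, C): foreground features reassembled from background values
```

Each row of E sums to 1, so every missing-region pixel receives a convex combination of known-region features — the "borrow from the background" behaviour the channel is designed for.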
Further, the spatial self-attention channel focuses attention within the missing region and captures the internal relations of the facial features to repair the image, specifically:
(i) the spatial self-attention channel reshapes the foreground Ff into a one-dimensional vector of size Pf × C and applies three one-dimensional convolution operations to obtain three transformation features Q′, K′, and V′ of the foreground: Q′ = W′q Ff, K′ = W′k Ff, and V′ = W′v Ff, where W′q, W′k, and W′v are the feature transformation matrices of the spatial self-attention channel and are learnable parameters of the network;
(ii) the element E′_ij of the spatial self-attention map E′ can be expressed as: E′_ij = exp(Q′_i^T K′_j) / Σ_j exp(Q′_i^T K′_j), where the subscripts i and j denote the indexes of the elements in the corresponding quantities and the superscript T denotes the transposition operation;
(iii) the output of the spatial self-attention channel is: Y′_f = pad(β2 · V′^T E′^T), where β2 is the weight-allocation parameter of the spatial self-attention channel and a learnable parameter of the network, pad(·) denotes the zero-padding operation, V′^T is the transpose of the transformation feature V′ of the foreground Ff in the spatial self-attention channel, and E′^T is the transpose of the attention map E′ in the spatial self-attention channel;
(iv) the feature maps of the cross-attention channel and the spatial self-attention channel are fused to obtain the refined feature map Y, expressed as: Y = conv(Y_f + Y′_f), where conv(·) denotes a 1 × 1 convolution operation.
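The spatial self-attention channel and the 1 × 1 fusion can be sketched in the same style; over flattened pixels a 1 × 1 convolution reduces to one linear map per pixel, and all weight matrices below are random stand-ins for the learned parameters, so only shapes and data flow are illustrated.

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

C, Pf = 8, 4
Ff = rng.normal(size=(Pf, C))        # flattened foreground (missing region)
Yf_cross = rng.normal(size=(Pf, C))  # stand-in for the cross-attention channel output

# spatial self-attention: all three projections come from the foreground itself
Wq, Wk, Wv = (rng.normal(size=(C, C)) for _ in range(3))
Ep = softmax((Ff @ Wq) @ (Ff @ Wk).T, axis=1)  # (Pf, Pf): attention within the hole
Yf_spatial = Ep @ (Ff @ Wv)

# fuse the two channels: a 1x1 convolution == one shared linear map per pixel
W_fuse = rng.normal(size=(C, C))
Y = (Yf_cross + Yf_spatial) @ W_fuse
```

The parallel design lets the fused map Y carry both background-borrowed features (cross channel) and internally consistent foreground structure (spatial channel) before the next up-sampling step.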
Further, the loss function includes: the multi-scale reconstruction loss function L_m, the adversarial loss function of the reconstructed image L_adv, the perceptual loss function L_perceptual, the style loss function L_style, and the total variation loss function L_TV. Specifically:
The multi-scale reconstruction loss function is defined as: L_m = Σ_i λ_i ||S_i − T_i||_2, where x_M = M ⊙ x denotes the defective image, x the original image, M the binary mask, G(·) the generated image, S_i the RGB output image at the i-th scale extracted from the decoder, T_i the real image at the i-th scale, and λ_i the weight of each scale.
The adversarial loss function L_adv of the reconstructed image is derived from the cost function of the adversarial training, min_G max_D E_x[log D(x)] + E_{x_M}[log(1 − D(G(x_M)))]; the adversarial loss of the reconstructed image is then expressed as: L_adv = E_{x_M}[log(1 − D(G(x_M)))].
The perceptual loss function is expressed as: L_perceptual = Σ_{j=1}^{N} (1 / (H_j W_j C_j)) ||φ_j(x̂) − φ_j(x)||_2.
The style loss function is expressed as: L_style = Σ_{j=1}^{N} ||G_j^φ(x̂) − G_j^φ(x)||_2.
The total loss function is expressed as: L = α1 L_m + α2 L_adv + α3 L_perceptual + α4 L_style + α5 L_TV,
where x̂ and x denote the restored face image and the real face image, φ denotes the VGG-16 network, φ_j(x̂) and φ_j(x) denote the layer-j feature maps of the restored and real images extracted with the VGG-16 network, H_j, W_j, and C_j are the height, width, and number of channels of the layer-j feature map extracted by the VGG-16 network, N is the number of layers in the VGG-16 feature extractor, D(·) denotes the discriminator's judgment of the bracketed image, E_x(·) denotes the expectation over the distribution function, G_j^φ(·) denotes the Gram matrix of the VGG-16 layer-j feature map, ||·||_2 denotes the L2 norm, x̂_{h,w,c} denotes the pixel value of the restored RGB face image at height h, width w, and channel c (on which the total variation loss L_TV is computed by penalizing differences between adjacent pixels), and {α1, …, α5} denote the weights of each loss in the total loss function.
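How the weighted losses combine can be sketched in NumPy; the L2 multi-scale term, Gram-matrix style term, and total-variation term below follow the standard textbook forms (the patent's exact normalizations may differ), and the α weights, helper names, and toy feature maps are illustrative.

```python
import numpy as np

def multiscale_l2(outputs, targets, weights):
    """L_m: weighted sum of L2 reconstruction errors over the decoder scales."""
    return sum(w * np.linalg.norm(o - t) for o, t, w in zip(outputs, targets, weights))

def gram(feat: np.ndarray) -> np.ndarray:
    """Gram matrix of an (H, W, C) feature map, used by the style loss."""
    f = feat.reshape(-1, feat.shape[-1])   # (H*W, C)
    return f.T @ f / f.shape[0]

def tv_loss(img: np.ndarray) -> float:
    """Total variation: penalizes differences between neighbouring pixels."""
    return float(np.abs(np.diff(img, axis=0)).sum() + np.abs(np.diff(img, axis=1)).sum())

rng = np.random.default_rng(2)
scales = [(32, 32, 3), (64, 64, 3)]            # two decoder output scales
outs = [rng.normal(size=s) for s in scales]     # stand-ins for S_i
tgts = [rng.normal(size=s) for s in scales]     # stand-ins for T_i

L_m = multiscale_l2(outs, tgts, weights=[0.5, 1.0])
L_style = np.linalg.norm(gram(outs[1]) - gram(tgts[1]))
L_tv = tv_loss(outs[1][..., 0])
total = 1.0 * L_m + 0.1 * L_style + 0.01 * L_tv  # illustrative alpha weights
```

In the real model L_adv and L_perceptual would be added with their own weights; the point of the sketch is that all five terms reduce to scalars that are summed with the α coefficients.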
Further, by optimizing the sum of the above loss functions, the parameters of the multi-scale local self-attention generative adversarial network are obtained as θ_d, θ_g = argmin(L), and the restored face image x̂ = x_M + (1 − M) ⊙ G(x_M) is then obtained, where θ_d and θ_g are the parameters of the discriminator network and the generator network respectively.
The beneficial effects of the invention are:
1. By adding the dual-channel local self-attention module to the generator, the network captures the internal relations of the facial features by attending across the missing and non-missing regions and within the missing region itself, which improves learning efficiency, enables restoration of fine facial details, and provides an effective route to high-precision reconstruction of missing face regions;
2. A multi-scale local self-attention mechanism is applied at each scale of the image generation process, controlling the generation of the face image step by step, so that the dual-channel local self-attention module takes effect at every scale and the training process is more stable;
3. 'Skip' connections are adopted in the generator, which enhances the expression and repair of high-level semantic image information and avoids mode collapse;
4. A 'high-capacity' discriminator network is adopted, 'high capacity' meaning that the number of channels of the output feature map of each discrimination convolution layer is at least double that of the corresponding convolution layer in the generator network. The high-capacity discriminator judges a large number of feature maps of the generated image, so small differences between the restored image and the original image are discriminated effectively and the precision of the restored image improves.
Drawings
FIG. 1 is a schematic overall framework diagram of the present invention for face defect image restoration;
FIG. 2 is a schematic diagram of the operation of the self-attention mechanism of the present invention;
fig. 3 is a schematic diagram of the test results of the present invention for repairing a defective image of a human face.
Detailed Description
The invention is described in detail below with reference to the figures and examples.
In this embodiment, a face image restoration method based on a multi-scale local self-attention generative adversarial network includes:
Step one: acquire an original face image x and a corresponding binary defect mask M; construct the defective face image dataset {x_M | x_M = M ⊙ x} (⊙ denotes element-wise multiplication) and the corresponding original image dataset {x}, and preprocess the acquired defective face images: the image size is uniformly set to N0 × N0, where N0 = 128 is the number of pixels in the width and height dimensions, and the images are normalized before entering the network. The preprocessed face image data are divided into a training set and a test set in the proportion 10:1; in this embodiment, 22,000 different face images are divided 10:1, giving 20k training images and 2k test images;
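The preprocessing and 10:1 split of this embodiment can be sketched as follows; normalization to [−1, 1] is an assumption (the text only states that the images are normalized), and the resize is shown as nearest-neighbour subsampling to keep the sketch dependency-free — both helper names are illustrative.

```python
import numpy as np

N0 = 128  # target width/height in pixels

def preprocess(img: np.ndarray) -> np.ndarray:
    """Nearest-neighbour resize to N0 x N0, then scale [0, 255] -> [-1, 1]."""
    h, w = img.shape[:2]
    rows = np.arange(N0) * h // N0
    cols = np.arange(N0) * w // N0
    resized = img[rows][:, cols]
    return resized.astype(np.float64) / 127.5 - 1.0

def split_10_to_1(items):
    """Divide a dataset into training and test sets in the proportion 10:1."""
    n_test = len(items) // 11
    return items[:-n_test], items[-n_test:]

train, test = split_10_to_1(list(range(22_000)))
sample = preprocess(np.full((256, 256), 255.0))
```

On 22,000 items the split reproduces the embodiment's 20k/2k partition exactly.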
Step two: construct the multi-scale local self-attention generative adversarial network, which consists of a generator network and a discriminator network, and embed a dual-channel local self-attention module at different scales of the generator network; the dual-channel local self-attention module comprises a cross-attention channel and a spatial self-attention channel, connected in parallel, as shown in FIG. 1. The generator network comprises an encoder module, a semantic feature repair module, and a decoder module; the encoder and decoder modules are essentially symmetrical in structure.
the encoder module comprises 6 encoding feature extraction modules, and the encoding feature extraction module comprises encoding convolution layers, batch normalization and Leaky ReLu activation functions. The number of the convolution kernel size k, the corresponding step length s and the feature map filling pixel number p adopted by each coding convolution layer is { k, s, p } { (5,1, 1); (3,2, 1); (3,1, 1); (3,2, 1); (3,1, 1); (3,1,1) }, generating feature maps having sizes of 128 × 128 × 64,64 × 64 × 128,64 × 64 × 128,32 × 32 × 256,32 × 32 × 256, and 32 × 32 × 256, respectively; carrying out batch normalization operation after each coding convolution layer and activating the coding convolution layer through a Leaky ReLu function with the slope of 0.2; with the increase of the number of the convolution layers, the extracted features gradually evolve from low-level features based on color and texture to high-level abstract features based on image semantic information; compressing the input defect image into feature maps of different scales through coding operation;
the semantic feature repairing module comprises 4 feature restoring modules, and each feature restoring module consists of an expansion convolution layer, batch normalization and a Leaky ReLu activating function. The convolution kernel of each expansion convolution layer is expansion convolution of 3 multiplied by 3, the expansion rates are respectively 2,4,8 and 16, and the expansion convolution kernels are used for performing semantic feature extraction and face image restoration on the compressed feature map;
the decoder consists of 4 decoding feature mapping modules, 2 dual-channel local self-attention modules at different scales, 2 upsampling modules and 1 nonlinear image equalization module. Each decoding feature mapping module consists of a decoding convolution layer, batch normalization and a Leaky ReLU activation function; each upsampling module consists of a deconvolution layer, batch normalization and a Leaky ReLU activation function; the nonlinear image equalization module consists of a decoding convolution layer followed by a Tanh activation function. A dual-channel local self-attention module is placed in front of each upsampling module. The concrete connection is as follows: the first and second decoding feature mapping modules, each with a 3 × 3 convolution kernel, are connected in series to extract feature maps at the corresponding scale; the second decoding feature mapping module is connected to the first-scale dual-channel local self-attention module, which repairs the missing information of the 32 × 32 feature map by attending to the difference between the known and missing regions of the image; the first-scale dual-channel local self-attention module is connected to the 1st upsampling module, which upsamples the image to 64 × 64 through a deconvolution with a 4 × 4 kernel; the 1st upsampling module is connected, through the third decoding feature mapping module with a 3 × 3 kernel, to the second-scale dual-channel local self-attention module, whose function is to attend once more, at the second scale, to the difference between the known and missing regions of the upsampled feature map and thereby repair and adjust the missing information of the 64 × 64 feature map, realizing feature-map repair at multiple scales. The second-scale dual-channel local self-attention module is connected to the 2nd upsampling module, which upsamples the image again through a deconvolution with a 4 × 4 kernel, restoring it to the original 128 × 128 size; the 2nd upsampling module is connected to the nonlinear image equalization module through the fourth decoding feature mapping module with a 3 × 3 kernel, converting the output into a three-channel RGB image and thus effectively reconstructing the image.
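The 32 → 64 → 128 upsampling path above matches the standard transposed-convolution output-size formula out = (in − 1)·s − 2p + k; the sketch below assumes stride 2 and padding 1, values the text does not state explicitly:

```python
def deconv_out(size, k, s, p):
    """Spatial output size of a transposed convolution (no output padding)."""
    return (size - 1) * s - 2 * p + k

# 4x4 kernels with stride 2; padding 1 is an assumed value
print(deconv_out(32, 4, 2, 1))   # 32 -> 64
print(deconv_out(64, 4, 2, 1))   # 64 -> 128
```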
The feature map in front of each dual-channel local self-attention module in the decoder is convolved into an RGB image of the corresponding scale, which is compared with the real image of that scale through an L2 reconstruction loss during image restoration; comparing against the real image at each scale progressively controls the face image generation process and makes training more stable.
The discrimination network comprises 6 feature discrimination modules, each consisting of a discrimination convolution layer, batch normalization and a Leaky ReLU activation function. The first 5 discrimination convolution layers use 4 × 4 kernels with a scanning stride of 2 × 2; the number of channels of the produced feature maps is roughly 2-4 times that of the corresponding layers of the generation network, namely 128, 128, 256, 512 and 1024 "large-capacity" feature maps, respectively. After the 5th convolution operation the network output tensor has size 4 × 4 × 1024; features are extracted from this tensor once more with a 4 × 4 kernel and activated by a Sigmoid function, and a 1 × 1 × 1 probability value is output to represent the realness of the input image. Batch normalization operations are added after the convolution layers of both the generation network and the discrimination network, normalizing each convolved feature map in batches to accelerate network convergence.
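The 128 → 4 spatial shrinkage of the discriminator follows from the usual convolution output-size formula; padding of 1 for the strided layers is an assumption here, since the text specifies only kernel size and stride:

```python
def conv_out(size, k, s, p):
    """Spatial output size of a standard convolution layer."""
    return (size + 2 * p - k) // s + 1

size = 128
for _ in range(5):                  # five 4x4 convs, stride 2, padding 1 (assumed)
    size = conv_out(size, 4, 2, 1)
print(size)                         # 4  -> the 4 x 4 x 1024 tensor
print(conv_out(size, 4, 1, 0))      # 1  -> the final 1 x 1 x 1 probability value
```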
In this embodiment, a dual-channel local self-attention module is added in front of each deconvolution layer of the decoder module, as shown in Fig. 2. The dual-channel local self-attention module comprises a cross attention channel and a spatial self-attention channel connected in parallel, and the generated feature map passes through both channels. The network model restores the image along two dimensions, the feature information of the known region and the self-attention within the unknown region, thereby achieving high-precision and efficient reconstruction of the missing region of the face image.
The cross attention channel restores the image by attending jointly to the missing and non-missing regions, specifically:
(I) The input of each channel of the dual-channel local self-attention module is the feature map F before each deconvolution layer in the decoder, of size M1 × M2 × C, where M1, M2 and C are the number of pixels in the height dimension, the number of pixels in the width dimension and the number of channels of the feature map F, respectively;
(II) The feature map F is divided into a defective region and a non-defective region according to the size of the mask; the defective region is defined as the foreground Ff and the non-defective region as the background Fb;
(III) The foreground Ff and the background Fb are reshaped into one-dimensional vectors of size Pf × C and Pb' × C, where Pf = m1 × m2 and Pb' = (M1 × M2) − (m1 × m2); m1 and m2 are the numbers of pixels of the foreground Ff in the height and width dimensions, Pf and Pb' are the numbers of foreground and background pixels, and C is the number of channels;
(IV) In the cross attention channel, one-dimensional convolution operations are applied to the reshaped foreground Ff and background Fb to obtain the transformation feature Q of the foreground Ff and the two transformation features K and V of the background Fb: Q = Wq Ff, K = Wk Fb and V = Wv Fb, where the feature transformation matrices Wq, Wk and Wv of the cross attention channel are learnable network parameters;
(V) The element E_{i,j} of the attention map E of the cross attention channel can be expressed as:
E_{i,j} = exp(Q_i K_j^T) / Σ_n exp(Q_i K_n^T)
where subscripts i and j denote the indexes of elements in the corresponding quantities and superscript T denotes the transposition operation;
(VI) The output of the cross attention channel is:
Yf = β1 · pad(V^T E^T)
where β1 is the weight assignment parameter of the cross attention channel and a learnable network parameter, pad(·) denotes the zero-padding operation, V^T is the transpose of the transformation feature V of the background Fb in the cross attention channel, and E^T is the transpose of the attention map E in the cross attention channel.
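A minimal NumPy sketch of the cross attention channel described in steps (III)-(VI), under two stated assumptions: the attention map is softmax-normalized, and the zero-padding step that scatters the repaired foreground pixels back into the full feature map is omitted. The shapes, weights and the `cross_attention` helper are illustrative, not the patent's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(Ff, Fb, Wq, Wk, Wv, beta1=1.0):
    """Foreground pixels (Pf x C) attend over background pixels (Pb' x C)."""
    Q, K, V = Ff @ Wq, Fb @ Wk, Fb @ Wv   # 1-D convs act as per-pixel linear maps
    E = softmax(Q @ K.T, axis=-1)         # attention map E, shape Pf x Pb'
    return beta1 * (E @ V)                # row-form of beta1 * (V^T E^T)^T

Pf, Pb, C = 16, 48, 8                     # toy pixel counts and channel depth
Ff = rng.normal(size=(Pf, C)) * 0.1       # missing-region features
Fb = rng.normal(size=(Pb, C)) * 0.1       # known-region features
Wq, Wk, Wv = (rng.normal(size=(C, C)) * 0.1 for _ in range(3))
Yf = cross_attention(Ff, Fb, Wq, Wk, Wv)
print(Yf.shape)  # (16, 8)
```

Each repaired foreground pixel is thus a convex combination of background values, weighted by query-key similarity.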
The spatial self-attention channel focuses on the attention within the missing region and acquires the internal relations of the face image features to repair the face image, specifically:
(i) The spatial self-attention channel reshapes the foreground Ff into a one-dimensional vector of size Pf × C and applies three one-dimensional convolution operations to obtain the three transformation features Q', K' and V' of the foreground Ff: Q' = W'q Ff, K' = W'k Ff and V' = W'v Ff, where the feature transformation matrices W'q, W'k and W'v of the spatial self-attention channel are learnable network parameters;
(ii) The element E'_{i,j} of the attention map E' of the spatial self-attention channel can be expressed as:
E'_{i,j} = exp(Q'_i K'_j^T) / Σ_n exp(Q'_i K'_n^T)
where subscripts i and j denote the indexes of elements in the corresponding quantities and superscript T denotes the transposition operation;
(iii) The output of the spatial self-attention channel is:
Y'f = β2 · pad(V'^T E'^T)
where β2 is the weight assignment parameter of the spatial self-attention channel and a learnable network parameter, pad(·) denotes the zero-padding operation, V'^T is the transpose of the transformation feature V' of the foreground Ff in the spatial self-attention channel, and E'^T is the transpose of the attention map E' in the spatial self-attention channel;
(iv) The feature maps of the cross attention channel and the spatial self-attention channel are fused to obtain the refined feature map Y: Y = conv(Yf + Y'f), where conv(·) denotes a 1 × 1 convolution operation.
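The two channels and the 1 × 1 fusion convolution of step (iv) can be sketched together in NumPy. The generic `attention` helper, the random weights and the softmax normalization are illustrative assumptions rather than the patent's exact formulation; the zero-padding scatter step is again omitted:

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Fq, Fkv, Wq, Wk, Wv, beta):
    """Generic channel: queries from Fq attend over keys/values from Fkv."""
    Q, K, V = Fq @ Wq, Fkv @ Wk, Fkv @ Wv
    return beta * (softmax(Q @ K.T, axis=-1) @ V)

Pf, Pb, C = 16, 48, 8
Ff, Fb = rng.normal(size=(Pf, C)), rng.normal(size=(Pb, C))
W = lambda: rng.normal(size=(C, C)) * 0.1

Yf  = attention(Ff, Fb, W(), W(), W(), beta=1.0)  # cross attention channel
Yfp = attention(Ff, Ff, W(), W(), W(), beta=1.0)  # spatial self-attention channel
Wfuse = rng.normal(size=(C, C)) * 0.1             # 1x1 conv as a C x C matrix
Y = (Yf + Yfp) @ Wfuse                            # Y = conv(Yf + Y'f)
print(Y.shape)  # (16, 8)
```

Note how the same helper expresses both channels: the cross channel draws its keys and values from the background, the self channel from the foreground itself.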
Step three: setting network model hyper-parameters, wherein the hyper-parameters comprise an initial learning rate (gamma), an optimization algorithm for distinguishing a network and generating the network, a batch size (batch size) and iteration times (epoch), and values of the hyper-parameters are respectively as follows: the method comprises the steps that gamma is 0.001, batch size is 64, epoch is 200, a multi-scale local self-attention generation countermeasure network model is trained and modeled by using a defective face image training set, in the modeling process, an Adam optimizer and a random gradient descent (SGD) algorithm are respectively adopted for generating a network and a discrimination network, in the training process of countermeasures, the sum of a plurality of loss functions is optimized, and a parameter theta of the multi-scale local self-attention generation countermeasure is obtainedd,θgArgmin (L), and further obtaining a repaired face imageWherein, thetadAnd thetagTo discriminate between the network and generate the parameters of the network,representing a repaired face image; thereby obtaining a multi-scale local self-attention generation disfigurement-resistant face repairing model;
The loss functions include: the multi-scale reconstruction loss function Lm, the adversarial loss function Ladv of the reconstructed image, the perceptual loss function Lperceptual, the style loss function Lstyle and the total variation loss function LTV; specifically:
the multi-scale reconstruction loss function is defined as:
L_m = Σ_{i=1}^{m} λ_i ‖S_i − T_i‖_2
where x denotes the original image, M denotes the binary mask, x_M = M ☉ x denotes the defective image, G(·) denotes the generated image, S_i denotes the RGB output image at the ith scale extracted from the decoder, T_i denotes the real image at the ith scale, and λ_i is the weight of each scale; in this embodiment the total number of scales m is 3 and the corresponding weights are 0.4, 0.6 and 0.8, respectively;
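A hedged sketch of the multi-scale reconstruction term, assuming the per-scale distance is a plain L2 norm over the RGB images at the 32, 64 and 128 pixel scales; the function name and the random test data are illustrative:

```python
import numpy as np

def multiscale_l2(outputs, targets, weights):
    """L_m = sum_i lambda_i * ||S_i - T_i||_2 over the decoder output scales."""
    return sum(w * np.linalg.norm(s - t)
               for s, t, w in zip(outputs, targets, weights))

rng = np.random.default_rng(0)
scales, weights = [32, 64, 128], [0.4, 0.6, 0.8]  # weights from the embodiment
fake = [rng.normal(size=(n, n, 3)) for n in scales]
real = [rng.normal(size=(n, n, 3)) for n in scales]
loss = multiscale_l2(fake, real, weights)
print(loss > 0.0)  # True
```

Weighting the finer scales more heavily (0.8 at 128 × 128) pushes the optimizer toward sharp full-resolution detail while the coarse terms stabilize early training.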
the adversarial loss function L_adv of the reconstructed image is derived from the cost function of the adversarial training, min_G max_D E_x[log D(x)] + E_{x_M}[log(1 − D(G(x_M)))]; after transformation, the adversarial loss of the reconstructed image is expressed as:
L_adv = E_{x_M}[log(1 − D(G(x_M)))]
the perceptual loss function, computed on VGG-16 feature maps, is expressed as:
L_perceptual = Σ_{j=1}^{N} (1 / (C_j H_j W_j)) ‖φ_j(x̂) − φ_j(x)‖_2
and the style loss function is expressed as:
L_style = Σ_{j=1}^{N} ‖ψ_j^G(x̂) − ψ_j^G(x)‖_2
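The Gram-matrix style term can be sketched as follows; normalizing the Gram matrix by C·H·W is an assumption borrowed from common style-transfer practice, and the feature maps here stand in for VGG-16 activations:

```python
import numpy as np

def gram(feat):
    """Normalized Gram matrix of a C x H x W feature map."""
    C, H, W = feat.shape
    f = feat.reshape(C, H * W)
    return (f @ f.T) / (C * H * W)

def style_loss(feats_fake, feats_real):
    """Sum over layers of the L2 distance between Gram matrices."""
    return sum(np.linalg.norm(gram(a) - gram(b))
               for a, b in zip(feats_fake, feats_real))

feats = [np.arange(4 * 6 * 6, dtype=float).reshape(4, 6, 6)]
print(style_loss(feats, feats))  # 0.0
```

Because the Gram matrix discards spatial position and keeps only channel co-activation statistics, this term matches texture rather than layout, which suits face skin and hair regions.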
the total loss function is expressed as: L = α1 L_m + α2 L_adv + α3 L_perceptual + α4 L_style + α5 L_TV,
where x̂ and x denote the repaired face image and the real face image, φ denotes the VGG-16 network, φ_j(x̂) and φ_j(x) denote the layer-j feature maps of the repaired and real images extracted with the VGG-16 network, H_j, W_j and C_j denote the height, width and number of channels of the layer-j feature map extracted by the VGG-16 network, N is the number of layers in the VGG-16 feature extractor, D(·) denotes the discrimination of the bracketed image, E_x(·) denotes the expectation over the distribution, ψ_j^G denotes the Gram matrix of the VGG-16 layer-j feature map, ‖·‖_2 denotes the L2 norm, x̂_{h,w,c} denotes the pixel value of the repaired RGB face image at height h, width w and channel c, and {α1, ..., α5} denote the weights of the losses in the total loss function, set to 100, 10, 1, 1, 1 in this embodiment.
Step four: and testing the anti-human face restoration model by adopting a defective human face image test set to generate the multi-scale local self-attention, and evaluating the restoration performance of the model through peak signal to noise ratio (PSNR) and Structural Similarity (SSIM) indexes.
Fig. 3 shows the repair results of the multi-scale local self-attention generation countermeasure network model on a test set of 2k face defect images, where the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) reach 25.39 and 0.87, respectively. The method not only improves network learning efficiency but also restores fine facial details, demonstrating its excellent performance in repairing defective face images.
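PSNR, one of the two reported metrics, reduces to a few lines of NumPy; the `peak` default of 255 assumes 8-bit images:

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the reference."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

a = np.zeros((8, 8))
b = np.full((8, 8), 16.0)  # every pixel off by 16 -> MSE = 256
print(psnr(a, b))
```

A reported PSNR of 25.39 dB thus corresponds to an average per-pixel error of roughly 13-14 gray levels on the 0-255 scale.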
Claims (8)
1. A facial image restoration method based on a multi-scale local self-attention generation countermeasure network is characterized by comprising the following steps:
the method comprises the following steps: step one: acquire an original face image x and the corresponding binary defect mask M; construct the defective face image data set {x_M | x_M = M ☉ x} and the corresponding original image data set {x}, and divide the defective face image data set into a training set and a test set according to a preset proportion, where ☉ denotes element-wise multiplication;
step two: construct a multi-scale local self-attention generation countermeasure network, which consists of a generation network and a discrimination network; dual-channel local self-attention modules are embedded at different scales of the generation network, each comprising a cross attention channel and a spatial self-attention channel connected in parallel;
step three: set the network model hyper-parameters, and train and model the multi-scale local self-attention generation countermeasure network model with the defective face image training set, where the generation network and the discrimination network adopt the Adam optimizer and the stochastic gradient descent (SGD) algorithm respectively in the modeling process, and the sum of several loss functions is optimized in the adversarial training process to obtain the multi-scale local self-attention generation countermeasure defective-face repair model;
step four: test the multi-scale local self-attention generation countermeasure face repair model with the defective face image test set, and evaluate the repair performance of the model through the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) indexes.
2. The facial image restoration method based on the multi-scale local self-attention generation countermeasure network as claimed in claim 1, characterized in that: the generation network comprises an encoder module, a semantic feature repair module and a decoder module, and the encoder module and the decoder module are basically symmetrical in structure;
the encoder module comprises a plurality of encoding feature extraction modules, each comprising an encoding convolution layer, batch normalization and a Leaky ReLU activation function; each encoding convolution layer performs feature extraction on the input defective image with a convolution kernel of size k, a scanning stride of s and a padding of p pixels; batch normalization is performed after each convolution operation, followed by activation with the nonlinear Leaky ReLU function; as the number of convolution layers increases, the extracted features gradually evolve from low-level features based on color and texture to high-level abstract features based on image semantic information; the encoding operation compresses the input defective image into feature maps of different scales;
the semantic feature repair module comprises a plurality of feature restoration modules, each consisting of a dilated convolution layer, batch normalization and a Leaky ReLU activation function; each dilated convolution layer uses a 3 × 3 kernel, and the dilation rate of the t-th layer kernel is 2^t, where t = 1, 2, …, T0; the module performs semantic feature extraction and face image restoration on the compressed feature map;
the decoder module consists of a plurality of decoding feature mapping modules, m-scale dual-channel local self-attention modules, a plurality of upsampling modules and a nonlinear image equalization module; each decoding feature mapping module consists of a decoding convolution layer, batch normalization and a Leaky ReLU activation function; each upsampling module consists of a deconvolution layer, batch normalization and a Leaky ReLU activation function; the nonlinear image equalization module consists of a decoding convolution layer followed by a Tanh activation function; a dual-channel local self-attention module is placed in front of each upsampling module; the concrete connection is as follows: the first decoding feature mapping module is connected to the second decoding feature mapping module to extract the feature map of the corresponding scale; the second decoding feature mapping module is connected to the mth-scale dual-channel local self-attention module, which repairs the missing information by attending to the difference between the known and missing regions of the image; the mth-scale dual-channel local self-attention module is connected to the 1st upsampling module, which upsamples the image through a deconvolution operation followed by batch normalization and Leaky ReLU activation; the 1st upsampling module is connected, through the third decoding feature mapping module, to the (m+1)th-scale dual-channel local self-attention module, whose function is to attend once more, at the (m+1)th scale, to the difference between the known and missing regions of the upsampled feature map and thereby repair and adjust the missing information, realizing feature-map repair at multiple scales; the (m+1)th-scale dual-channel local self-attention module is connected to the 2nd upsampling module, and the 2nd upsampling module is connected to the nonlinear image equalization module through the fourth decoding feature mapping module, converting the image into a three-channel RGB image and thus effectively reconstructing the image.
3. The facial image restoration method based on the multi-scale local self-attention generation countermeasure network as claimed in claim 1, characterized in that: the discrimination network comprises a plurality of feature discrimination modules, each consisting of a discrimination convolution layer, batch normalization and a Leaky ReLU activation function; each discrimination convolution layer performs feature extraction and compression on the reconstructed image with a convolution kernel of size k' and a scanning stride of s'; a probability value is finally output to judge the repair quality of the generated image, and the number of channels of the feature map output by each discrimination convolution layer of the discrimination network is at least double that of the corresponding convolution layer of the generation network.
4. The facial image restoration method based on the multi-scale local self-attention generation countermeasure network as claimed in claim 2, characterized in that: the feature map in front of each dual-channel local self-attention module in the decoder is convolved into an RGB image of the corresponding scale, which is compared with the real image of that scale through an L2 reconstruction loss during image restoration; comparing against the real image at each scale progressively controls the face image generation process and makes training more stable.
5. The facial image restoration method based on the multi-scale local self-attention generation countermeasure network as claimed in claim 1, characterized in that: the cross attention channel in the dual-channel local self-attention module repairs the defective region of the image by attending jointly to the missing and non-missing regions, specifically:
(I) The input of each channel of the dual-channel local self-attention module is the feature map F before each deconvolution layer in the decoder, of size M1 × M2 × C, where M1, M2 and C are the number of pixels in the height dimension, the number of pixels in the width dimension and the number of channels of the feature map F, respectively;
(II) The feature map F is divided into a defective region and a non-defective region according to the size of the mask; the defective region is defined as the foreground Ff and the non-defective region as the background Fb;
(III) The foreground Ff and the background Fb are reshaped into one-dimensional vectors of size Pf × C and Pb' × C, where Pf = m1 × m2 and Pb' = (M1 × M2) − (m1 × m2); m1 and m2 are the numbers of pixels of the foreground Ff in the height and width dimensions, and Pf and Pb' are the numbers of foreground and background pixels;
(IV) In the cross attention channel, one-dimensional convolution operations are applied to the reshaped foreground Ff and background Fb to obtain the transformation feature Q of the foreground Ff and the two transformation features K and V of the background Fb: Q = Wq Ff, K = Wk Fb and V = Wv Fb, where the feature transformation matrices Wq, Wk and Wv of the cross attention channel are learnable network parameters;
(V) The element E_{i,j} of the attention map E of the cross attention channel can be expressed as:
E_{i,j} = exp(Q_i K_j^T) / Σ_n exp(Q_i K_n^T)
where subscripts i and j denote the indexes of elements in the corresponding quantities and superscript T denotes the transposition operation;
(VI) The output of the cross attention channel is:
Yf = β1 · pad(V^T E^T)
where β1 is the weight assignment parameter of the cross attention channel and a learnable network parameter, pad(·) denotes the zero-padding operation, V^T is the transpose of the transformation feature V of the background Fb in the cross attention channel, and E^T is the transpose of the attention map E in the cross attention channel.
6. The facial image restoration method based on the multi-scale local self-attention generation countermeasure network as claimed in claim 1, characterized in that: the spatial self-attention channel focuses on the attention within the missing region and acquires the internal relations of the face image features to repair the image, specifically:
(i) The spatial self-attention channel reshapes the foreground Ff into a one-dimensional vector of size Pf × C and applies three one-dimensional convolution operations to obtain the three transformation features Q', K' and V' of the foreground Ff: Q' = W'q Ff, K' = W'k Ff and V' = W'v Ff, where the feature transformation matrices W'q, W'k and W'v of the spatial self-attention channel are learnable network parameters;
(ii) The element E'_{i,j} of the attention map E' of the spatial self-attention channel can be expressed as:
E'_{i,j} = exp(Q'_i K'_j^T) / Σ_n exp(Q'_i K'_n^T)
where subscripts i and j denote the indexes of elements in the corresponding quantities and superscript T denotes the transposition operation;
(iii) The output of the spatial self-attention channel is:
Y'f = β2 · pad(V'^T E'^T)
where β2 is the weight assignment parameter of the spatial self-attention channel and a learnable network parameter, pad(·) denotes the zero-padding operation, V'^T is the transpose of the transformation feature V' of the foreground Ff in the spatial self-attention channel, and E'^T is the transpose of the attention map E' in the spatial self-attention channel;
(iv) The feature maps of the cross attention channel and the spatial self-attention channel are fused to obtain the refined feature map Y: Y = conv(Yf + Y'f), where conv(·) denotes a 1 × 1 convolution operation.
7. The facial image restoration method based on the multi-scale local self-attention generation countermeasure network as claimed in claim 1, characterized in that: the loss functions include: the multi-scale reconstruction loss function L_m, the adversarial loss function L_adv of the reconstructed image, the perceptual loss function L_perceptual, the style loss function L_style and the total variation loss function L_TV; specifically:
the multi-scale reconstruction loss function is defined as:
L_m = Σ_{i=1}^{m} λ_i ‖S_i − T_i‖_2
where x denotes the original image, M denotes the binary mask, x_M = M ☉ x denotes the defective image, G(·) denotes the generated image, S_i denotes the RGB output image at the ith scale extracted from the decoder, T_i denotes the real image at the ith scale, and λ_i is the weight of each scale;
the adversarial loss function L_adv of the reconstructed image is derived from the cost function of the adversarial training, min_G max_D E_x[log D(x)] + E_{x_M}[log(1 − D(G(x_M)))]; after transformation, the adversarial loss of the reconstructed image is expressed as:
L_adv = E_{x_M}[log(1 − D(G(x_M)))]
the perceptual loss function is expressed as:
L_perceptual = Σ_{j=1}^{N} (1 / (C_j H_j W_j)) ‖φ_j(x̂) − φ_j(x)‖_2
the style loss function is expressed as:
L_style = Σ_{j=1}^{N} ‖ψ_j^G(x̂) − ψ_j^G(x)‖_2
the total loss function is expressed as: L = α1 L_m + α2 L_adv + α3 L_perceptual + α4 L_style + α5 L_TV,
where x̂ and x denote the repaired face image and the real face image, φ denotes the VGG-16 network, φ_j(x̂) and φ_j(x) denote the layer-j feature maps of the repaired and real images extracted with the VGG-16 network, H_j, W_j and C_j denote the height, width and number of channels of the layer-j feature map, N is the number of layers in the VGG-16 feature extractor, D(·) denotes the discrimination of the bracketed image, E_x(·) denotes the expectation over the distribution, ψ_j^G denotes the Gram matrix of the VGG-16 layer-j feature map, ‖·‖_2 denotes the L2 norm, x̂_{h,w,c} denotes the pixel value of the repaired RGB face image at height h, width w and channel c, and {α1, ..., α5} denote the weights of the losses in the total loss function.
8. The facial image restoration method based on the multi-scale local self-attention generation countermeasure network as claimed in claim 7, characterized in that: the parameters of the multi-scale local self-attention generation countermeasure network are obtained by optimizing the sum of the loss functions, θd, θg = argmin(L), and the repaired face image x̂ is thereby obtained, where θd and θg are the parameters of the discrimination network and the generation network.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111253713.5A CN113962893A (en) | 2021-10-27 | 2021-10-27 | Face image restoration method based on multi-scale local self-attention generation countermeasure network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113962893A true CN113962893A (en) | 2022-01-21 |
Family
ID=79467506
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114386531A (en) * | 2022-01-25 | 2022-04-22 | 山东力聚机器人科技股份有限公司 | Image identification method and device based on double-stage attention |
CN114494499A (en) * | 2022-01-26 | 2022-05-13 | 电子科技大学 | Sketch coloring method based on attention mechanism |
CN114581343A (en) * | 2022-05-05 | 2022-06-03 | 南京大学 | Image restoration method and device, electronic equipment and storage medium |
CN114693577A (en) * | 2022-04-20 | 2022-07-01 | 合肥工业大学 | Infrared polarization image fusion method based on Transformer |
CN114782291A (en) * | 2022-06-23 | 2022-07-22 | 中国科学院自动化研究所 | Training method and device of image generator, electronic equipment and readable storage medium |
CN114862699A (en) * | 2022-04-14 | 2022-08-05 | 中国科学院自动化研究所 | Face repairing method, device and storage medium based on generation countermeasure network |
CN115358954A (en) * | 2022-10-21 | 2022-11-18 | 电子科技大学 | Attention-guided feature compression method |
CN115471901A (en) * | 2022-11-03 | 2022-12-13 | 山东大学 | Multi-pose face frontization method and system based on generation of confrontation network |
CN115984106A (en) * | 2022-12-12 | 2023-04-18 | 武汉大学 | Line scanning image super-resolution method based on bilateral generation countermeasure network |
CN116051936A (en) * | 2023-03-23 | 2023-05-02 | 中国海洋大学 | Chlorophyll concentration ordered complement method based on space-time separation external attention |
CN116071275A (en) * | 2023-03-29 | 2023-05-05 | 天津大学 | Face image restoration method based on online knowledge distillation and pretraining priori |
CN117611753A (en) * | 2024-01-23 | 2024-02-27 | 吉林大学 | Facial shaping and repairing auxiliary system and method based on artificial intelligent reconstruction technology |
CN117974508A (en) * | 2024-03-28 | 2024-05-03 | 南昌航空大学 | Iris image restoration method for irregular occlusion based on generation countermeasure network |
CN117974832A (en) * | 2024-04-01 | 2024-05-03 | 南昌航空大学 | Multi-modal liver medical image expansion algorithm based on generation countermeasure network |
CN117994173A (en) * | 2024-04-07 | 2024-05-07 | 腾讯科技(深圳)有限公司 | Repair network training method, image processing method, device and electronic equipment |
CN118036701A (en) * | 2024-04-10 | 2024-05-14 | 南昌工程学院 | Ultraviolet image-based insulator corona discharge data enhancement method and system |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110689499A (en) * | 2019-09-27 | 2020-01-14 | 北京工业大学 | Face image restoration method based on dense expansion convolution self-coding countermeasure network |
CN111275638A (en) * | 2020-01-16 | 2020-06-12 | 湖南大学 | Face restoration method for generating confrontation network based on multi-channel attention selection |
CN112184582A (en) * | 2020-09-28 | 2021-01-05 | 中科人工智能创新技术研究院(青岛)有限公司 | Attention mechanism-based image completion method and device |
CN113112411A (en) * | 2020-01-13 | 2021-07-13 | 南京信息工程大学 | Human face image semantic restoration method based on multi-scale feature fusion |
CN117994173A (en) * | 2024-04-07 | 2024-05-07 | 腾讯科技(深圳)有限公司 | Repair network training method, image processing method, device and electronic equipment |
CN117994173B (en) * | 2024-04-07 | 2024-06-11 | 腾讯科技(深圳)有限公司 | Repair network training method, image processing method, device and electronic equipment |
CN118036701A (en) * | 2024-04-10 | 2024-05-14 | 南昌工程学院 | Ultraviolet image-based insulator corona discharge data enhancement method and system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113962893A (en) | Face image restoration method based on multi-scale local self-attention generation countermeasure network | |
US11450066B2 (en) | 3D reconstruction method based on deep learning | |
CN113240580B (en) | Lightweight image super-resolution reconstruction method based on multi-dimensional knowledge distillation | |
CN112819910B (en) | Hyperspectral image reconstruction method based on double-ghost attention machine mechanism network | |
CN113673590B (en) | Rain removing method, system and medium based on multi-scale hourglass dense connection network | |
CN113989129A (en) | Image restoration method based on gating and context attention mechanism | |
CN115018727A (en) | Multi-scale image restoration method, storage medium and terminal | |
CN114445292A (en) | Multi-stage progressive underwater image enhancement method | |
CN111833261A (en) | Image super-resolution restoration method for generating countermeasure network based on attention | |
CN113112416A (en) | Semantic-guided face image restoration method | |
CN117274760A (en) | Infrared and visible light image fusion method based on multi-scale mixed converter | |
CN114638768B (en) | Image rain removing method, system and equipment based on dynamic association learning network | |
CN114266957A (en) | Hyperspectral image super-resolution restoration method based on multi-degradation mode data augmentation | |
CN116797461A (en) | Binocular image super-resolution reconstruction method based on multistage attention-strengthening mechanism | |
Cherian et al. | A Novel AlphaSRGAN for Underwater Image Super Resolution. | |
CN112686822B (en) | Image completion method based on stack generation countermeasure network | |
CN113628143A (en) | Weighted fusion image defogging method and device based on multi-scale convolution | |
CN112634168A (en) | Image restoration method combined with edge information | |
CN116703750A (en) | Image defogging method and system based on edge attention and multi-order differential loss | |
CN114862699B (en) | Face repairing method, device and storage medium based on generation countermeasure network | |
CN116188272A (en) | Two-stage depth network image super-resolution reconstruction method suitable for multiple fuzzy cores | |
CN115660979A (en) | Attention mechanism-based double-discriminator image restoration method | |
CN115861108A (en) | Image restoration method based on wavelet self-attention generation countermeasure network | |
CN115100091A (en) | Conversion method and device for converting SAR image into optical image | |
CN114331894A (en) | Face image restoration method based on potential feature reconstruction and mask perception |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||