CN113962893A - Face image restoration method based on multi-scale local self-attention generative adversarial network


Info

Publication number
CN113962893A
Authority
CN
China
Prior art keywords
attention
image
channel
network
module
Prior art date
Legal status
Pending
Application number
CN202111253713.5A
Other languages
Chinese (zh)
Inventor
梁美彦 (Liang Meiyan)
Current Assignee
Shanxi University
Original Assignee
Shanxi University
Priority date
Filing date
Publication date
Application filed by Shanxi University
Priority to CN202111253713.5A
Publication of CN113962893A
Legal status: Pending

Classifications

    • G06T 5/00: Image enhancement or restoration
    • G06F 18/213: Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N 3/045: Combinations of networks
    • G06N 3/08: Learning methods
    • G06T 7/0012: Biomedical image inspection
    • G06T 2207/10004: Still image; Photographic image
    • G06T 2207/20081: Training; Learning
    • G06T 2207/20084: Artificial neural networks [ANN]
    • G06T 2207/30201: Face


Abstract

The invention relates to a face image restoration method based on a multi-scale local self-attention generative adversarial network, which comprises the following steps: acquiring face images with missing regions and the corresponding masks, and preprocessing them; constructing a multi-scale local self-attention generative adversarial network, and training and modeling it on a defective face image dataset to obtain a face restoration model; and restoring the defective face image under test with the trained multi-scale local self-attention generative adversarial network model. By adding a multi-scale structure and a dual-channel local self-attention module to the generator network, the invention effectively addresses the unstable training, low restoration accuracy and efficiency, lack of symmetry, and mode collapse that generative adversarial networks exhibit on the face restoration problem, and provides an efficient, accurate, and stable method for face restoration.

Description

Face image restoration method based on multi-scale local self-attention generative adversarial network
Technical Field
The invention belongs to the technical field of computer-based face image restoration, and particularly relates to a face image restoration method based on a multi-scale local self-attention generative adversarial network.
Background
Image restoration reconstructs the damaged regions of an image by technical means so that the restored regions are consistent with the surrounding features and the restored image carries the same semantic content as the original. The classic face image restoration algorithms are mainly diffusion-based algorithms and image-block (patch) matching algorithms. These classical algorithms are built on mathematical and physical models, so they require the input image to contain information similar to that of the missing region, such as similar pixels, structures, or image blocks, and they cannot generate new content. When a large area of the image is missing, they cannot reconstruct it effectively.
Image reconstruction methods based on deep generative adversarial networks restore a missing image by learning the distribution of the input images. Compared with traditional reconstruction methods, they can capture high-level semantic information of the image without requiring similar pixels or image blocks in the damaged image, and can generate missing regions that share the semantics of the original image, thereby restoring the image effectively. Such methods can therefore not only repair small missing regions but also reconstruct large missing areas from semantic content, making them an effective approach to face restoration.
At present, the main networks that perform image restoration with generative adversarial networks are the Context Encoder (CE) proposed by Pathak et al. and the Globally and Locally Consistent Image Completion network (GLCIC) proposed by Iizuka et al. Both architectures reconstruct images from semantic information, using reconstruction loss and adversarial loss to guide the generation process. However, the context encoder focuses mainly on the missing region and pastes the original image back over the non-missing region, which can leave a visible repair boundary between the missing and non-missing regions and degrade the integrity of the generated image. The GLCIC network controls the image generation process with a global discriminator and a local discriminator, which removes the repair boundary produced by CE; however, it does not focus on the content of the missing region itself, so the generated image tends to be blurred there.
Disclosure of Invention
The invention aims to solve the technical problems that existing GAN-based image restoration methods do not attend closely to the missing region and produce blurred content there, and provides a face image restoration method based on a multi-scale local self-attention generative adversarial network. The method adopts a multi-scale local self-attention module in the generator and focuses on the information of the missing region, so that high-precision face restoration can be addressed in a targeted manner; adding multi-scale image information also makes the training process more efficient and stable.
In order to solve the above technical problems, the invention adopts the following technical scheme:
a facial image restoration method based on a multi-scale local self-attention generation countermeasure network comprises the following steps:
Step one: acquire an original face image x and a corresponding binary defect mask M; construct a defective face image dataset {x_M | x_M = M ⊙ x} and the corresponding original image dataset {x}, and divide the defective face image dataset into a training set and a test set according to a preset ratio, where ⊙ denotes element-wise multiplication;
Step two: construct a multi-scale local self-attention generative adversarial network consisting of a generator network and a discriminator network, and embed a dual-channel local self-attention module at different scales of the generator network, where the dual-channel local self-attention module comprises a cross-attention channel and a spatial self-attention channel connected in parallel;
Step three: set the network model hyperparameters and train the multi-scale local self-attention generative adversarial network on the defective face image training set, using an Adam optimizer for the generator network and stochastic gradient descent (SGD) for the discriminator network, and optimizing the sum of several loss functions during adversarial training, to obtain the multi-scale local self-attention generative adversarial face restoration model;
Step four: test the trained model on the defective face image test set, and evaluate its restoration performance with the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) indexes.
Further, the generator network comprises an encoder module, a semantic feature repair module, and a decoder module, and the encoder and decoder modules are essentially symmetrical in structure;
The encoder module comprises several encoding feature extraction modules, each consisting of an encoding convolution layer, batch normalization, and a Leaky ReLU activation function. Each encoding convolution layer extracts features from the input defective image with convolution kernels of size k, scanning stride s, and padding of p pixels; batch normalization follows each convolution, and the result is activated by the nonlinear Leaky ReLU function. As the number of convolution layers increases, the extracted features gradually evolve from low-level color and texture features to high-level abstract features based on image semantics; the encoding operations compress the input defective image into feature maps of different scales;
The semantic feature repair module comprises several feature restoration modules, each consisting of a dilated convolution layer, batch normalization, and a Leaky ReLU activation function. Each dilated convolution layer uses 3 × 3 dilated convolution kernels, and the dilation rate of the t-th layer is 2^t, where t = 1, 2, …, T₀; the module performs semantic feature extraction and face image restoration on the compressed feature maps;
The decoder module consists of several decoding feature mapping modules, dual-channel local self-attention modules at m scales, several upsampling modules, and a nonlinear image equalization module. Each decoding feature mapping module consists of a decoding convolution layer, batch normalization, and a Leaky ReLU activation function; each upsampling module consists of a deconvolution layer, batch normalization, and a Leaky ReLU activation function; the nonlinear image equalization module consists of a decoding convolution layer followed by a Tanh activation function; and a dual-channel local self-attention module is placed before each upsampling module. The concrete connection is as follows: the first decoding feature mapping module is connected to the second decoding feature mapping module to extract the feature map of the corresponding scale; the second decoding feature mapping module is connected to the m-th-scale dual-channel local self-attention module, which repairs the missing information by focusing on the difference between the known and missing regions of the image; the m-th-scale dual-channel local self-attention module is connected to the 1st upsampling module, which upsamples the image through a deconvolution operation followed by batch normalization and Leaky ReLU activation; the 1st upsampling module is connected through the third decoding feature mapping module to the (m+1)-th-scale dual-channel local self-attention module, whose function is to focus once more, at the (m+1)-th scale, on the difference between the known and missing regions of the upsampled feature map so as to repair and adjust the missing information, thereby repairing the feature map at multiple scales; the (m+1)-th-scale dual-channel local self-attention module is connected to the 2nd upsampling module, and the 2nd upsampling module is connected through the fourth decoding feature mapping module to the nonlinear image equalization module, which converts the result into a three-channel RGB image, thus realizing effective reconstruction of the image.
Furthermore, the discriminator network comprises several feature discrimination modules, each consisting of a discrimination convolution layer, batch normalization, and a Leaky ReLU activation function. Each discrimination convolution layer extracts and compresses features of the reconstructed image with convolution kernels of size k' and scanning stride s'; a probability value is finally output to judge the restoration quality of the generated image, and the number of channels of the feature map output by each discrimination convolution layer of the discriminator network is at least double that of the corresponding convolution layer of the generator network.
Further, the feature map obtained before each dual-channel local self-attention module in the decoder is convolved to produce an RGB image of the corresponding scale, which is compared with the real image of the same scale through an L₂ reconstruction loss during restoration; comparing against real images at every scale progressively controls the generation of the face image and makes the training process more stable.
Further, the cross-attention channel of the dual-channel local self-attention module repairs the defective region of the image by focusing attention between the missing and non-missing regions, specifically:
(I) the input of each channel of the dual-channel local self-attention module is the feature map F before each deconvolution layer in the decoder, of size M₁ × M₂ × C, where M₁, M₂, and C are the height in pixels, the width in pixels, and the number of channels of the feature map F, respectively;
(II) the feature map F is divided into a defective region and a non-defective region according to the mask, the defective region being defined as the foreground F_f and the non-defective region as the background F_b;
(III) the foreground F_f and the background F_b are reshaped into one-dimensional vectors of size P_f × C and P_b' × C, where P_f = m₁ × m₂ and P_b' = (M₁ × M₂) − (m₁ × m₂); m₁ and m₂ are the height and width in pixels of the foreground F_f, and P_f and P_b' are the numbers of foreground and background pixels;
(IV) in the cross-attention channel, one-dimensional convolutions are applied to the reshaped foreground F_f and background F_b to obtain the transformation feature Q of the foreground F_f and the two transformation features K and V of the background F_b: Q = W_q F_f, K = W_k F_b, and V = W_v F_b, where W_q, W_k, and W_v are the feature transformation matrices of the cross-attention channel and are learnable parameters of the network;
(V) the element E_ij of the attention map E of the cross-attention channel can be expressed as:

E_ij = exp(Q_i K_j^T) / Σ_j exp(Q_i K_j^T)

where the subscripts i and j index the elements of the corresponding quantities and the superscript T denotes transposition;
(VI) the output of the cross-attention channel is:

Y_f = β₁ · pad(V^T E^T)

where β₁ is the weight assignment parameter of the cross-attention channel and a learnable parameter of the network, pad(·) denotes the zero-padding operation, V^T is the transpose of the background transformation feature V in the cross-attention channel, and E^T is the transpose of the attention map E in the cross-attention channel.
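As an illustration, the cross-attention channel could be realized along the following lines in PyTorch; the class name, the softmax normalization of E, the realization of W_q, W_k, and W_v as 1 × 1 convolutions, and the mask convention (M = 0 marks the missing region) are assumptions of this sketch, not the patent's reference implementation.

```python
import torch
import torch.nn as nn


class CrossAttentionChannel(nn.Module):
    """Sketch: the foreground (missing region) queries attend over the
    background (known region) keys and values."""

    def __init__(self, channels: int):
        super().__init__()
        # W_q, W_k, W_v realized as pointwise 1-D convolutions (an assumption).
        self.w_q = nn.Conv1d(channels, channels, kernel_size=1)
        self.w_k = nn.Conv1d(channels, channels, kernel_size=1)
        self.w_v = nn.Conv1d(channels, channels, kernel_size=1)
        self.beta = nn.Parameter(torch.zeros(1))   # weight assignment parameter beta_1

    def forward(self, feat: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        # feat: (B, C, H, W); mask: (B, 1, H, W) binary defect mask, 0 = missing
        b, c, h, w = feat.shape
        flat = feat.view(b, c, h * w)
        fg = mask.view(b, 1, h * w) < 0.5           # foreground = missing pixels
        out = torch.zeros_like(flat)
        for i in range(b):                          # per sample, since P_f varies
            f_f = flat[i:i + 1, :, fg[i, 0]]        # (1, C, P_f)
            f_b = flat[i:i + 1, :, ~fg[i, 0]]       # (1, C, P_b')
            q, k, v = self.w_q(f_f), self.w_k(f_b), self.w_v(f_b)
            attn = torch.softmax(q.transpose(1, 2) @ k, dim=-1)  # E: (1, P_f, P_b')
            y_f = v @ attn.transpose(1, 2)          # (1, C, P_f)
            # pad(.): scatter the attended foreground back into the full map
            out[i:i + 1, :, fg[i, 0]] = self.beta * y_f
        return out.view(b, c, h, w)
```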
Further, the spatial self-attention channel focuses attention within the missing region and captures the internal relations of facial image features to restore the image, specifically:
(i) the spatial self-attention channel reshapes the foreground F_f into a one-dimensional vector of size P_f × C and applies three one-dimensional convolutions to obtain the three transformation features Q', K', and V' of the foreground F_f: Q' = W'_q F_f, K' = W'_k F_f, and V' = W'_v F_f, where W'_q, W'_k, and W'_v are the feature transformation matrices of the spatial self-attention channel and are learnable parameters of the network;
(ii) the element E'_ij of the attention map E' of the spatial self-attention channel can be expressed as:

E'_ij = exp(Q'_i K'_j^T) / Σ_j exp(Q'_i K'_j^T)

where the subscripts i and j index the elements of the corresponding quantities and the superscript T denotes transposition;
(iii) the output of the spatial self-attention channel is:

Y'_f = β₂ · pad(V'^T E'^T)

where β₂ is the weight assignment parameter of the spatial self-attention channel and a learnable parameter of the network, pad(·) denotes the zero-padding operation, V'^T is the transpose of the foreground transformation feature V' in the spatial self-attention channel, and E'^T is the transpose of the attention map E' in the spatial self-attention channel;
(iv) the feature maps of the cross-attention channel and the spatial self-attention channel are fused to obtain the refined feature map Y, expressed as Y = conv(Y_f + Y'_f), where conv(·) denotes a 1 × 1 convolution operation.
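Building on the cross-attention sketch above, the spatial self-attention channel and the 1 × 1 fusion convolution might be sketched as follows, under the same assumptions:

```python
import torch
import torch.nn as nn


class SpatialSelfAttentionChannel(nn.Module):
    """Sketch: the foreground attends to itself to capture the internal
    relations of facial features inside the missing region."""

    def __init__(self, channels: int):
        super().__init__()
        self.w_q = nn.Conv1d(channels, channels, kernel_size=1)
        self.w_k = nn.Conv1d(channels, channels, kernel_size=1)
        self.w_v = nn.Conv1d(channels, channels, kernel_size=1)
        self.beta = nn.Parameter(torch.zeros(1))   # beta_2

    def forward(self, feat, mask):
        b, c, h, w = feat.shape
        flat = feat.view(b, c, h * w)
        fg = mask.view(b, 1, h * w) < 0.5
        out = torch.zeros_like(flat)
        for i in range(b):
            f_f = flat[i:i + 1, :, fg[i, 0]]        # (1, C, P_f)
            q, k, v = self.w_q(f_f), self.w_k(f_f), self.w_v(f_f)
            attn = torch.softmax(q.transpose(1, 2) @ k, dim=-1)  # E': (1, P_f, P_f)
            out[i:i + 1, :, fg[i, 0]] = self.beta * (v @ attn.transpose(1, 2))
        return out.view(b, c, h, w)


class DualChannelLocalSelfAttention(nn.Module):
    """Parallel cross-attention and spatial self-attention fused by a
    1 x 1 convolution: Y = conv(Y_f + Y'_f)."""

    def __init__(self, channels: int):
        super().__init__()
        self.cross = CrossAttentionChannel(channels)
        self.spatial = SpatialSelfAttentionChannel(channels)
        self.fuse = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, feat, mask):
        return self.fuse(self.cross(feat, mask) + self.spatial(feat, mask))
```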
Further, the loss function includes: the multi-scale reconstruction loss L_m, the adversarial loss of the reconstructed image L_adv, the perceptual loss L_perceptual, the style loss L_style, and the total variation loss L_TV, specifically:
The multi-scale reconstruction loss is defined as:

L_m = Σ_{i=1}^{m} λ_i ‖S_i − T_i‖₂

where x_M = M ⊙ x denotes the defective image, x the original image, M the binary mask, G(·) the generated image, S_i the RGB output image at the i-th scale extracted from the decoder, T_i the real image at the i-th scale, and λ_i the weight of each scale;
The adversarial loss L_adv of the reconstructed image is derived from the cost function of the adversarial training,

min_G max_D E_x[log D(x)] + E_{x_M}[log(1 − D(G(x_M)))]

which after transformation gives the adversarial loss of the reconstructed image as:

L_adv = E_{x_M}[log(1 − D(G(x_M)))]
The perceptual loss is expressed as:

L_perceptual = Σ_{j=1}^{N} (1 / (H_j W_j C_j)) ‖φ_j(x̂) − φ_j(x)‖₂
The style loss is expressed as:

L_style = Σ_{j=1}^{N} ‖G_j^φ(x̂) − G_j^φ(x)‖₂
The total variation loss is expressed as:

L_TV = Σ_{h,w,c} ( |x̂_{h+1,w,c} − x̂_{h,w,c}| + |x̂_{h,w+1,c} − x̂_{h,w,c}| )
The total loss is: L = α₁L_m + α₂L_adv + α₃L_perceptual + α₄L_style + α₅L_TV
where x̂ and x denote the restored face image and the real face image, φ denotes the VGG-16 network, φ_j(x̂) and φ_j(x) denote the layer-j feature maps of the restored and real images extracted by the VGG-16 network, H_j, W_j, and C_j are the height, width, and number of channels of the layer-j feature map, N is the number of layers in the VGG-16 feature extractor, D(·) denotes the discriminator's judgment of the bracketed image, E_x(·) denotes the expectation over the distribution, G_j^φ(·) denotes the Gram matrix of the VGG-16 layer-j feature map, ‖·‖₂ denotes the L₂ norm, x̂_{h,w,c} denotes the pixel value of the restored RGB face image at height h, width w, and channel c, and {α₁, …, α₅} are the weights of the individual losses in the total loss function.
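For illustration, the total loss could be assembled as below; the torchvision VGG-16 layer slices, the squared-L2 feature distances, and the helper names are assumptions of this sketch, with the α and λ weights left as placeholders:

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg16


class VGGFeatures(torch.nn.Module):
    """Frozen VGG-16 slices for the perceptual and style losses
    (the choice of N = 3 layers, up to relu1_2/relu2_2/relu3_3, is assumed)."""

    def __init__(self):
        super().__init__()
        feats = vgg16(weights="IMAGENET1K_V1").features.eval()
        self.slices = torch.nn.ModuleList([feats[:4], feats[4:9], feats[9:16]])
        for p in self.parameters():
            p.requires_grad = False

    def forward(self, x):
        outs = []
        for s in self.slices:
            x = s(x)
            outs.append(x)
        return outs


def gram(f):
    # Gram matrix of a feature map, normalized by its size.
    b, c, h, w = f.shape
    f = f.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)


def total_loss(restored, real, scale_outs, scale_reals, d_fake, vgg,
               alphas=(1.0, 1.0, 1.0, 1.0, 1.0), lambdas=(0.4, 0.6, 0.8)):
    # L_m: L2 reconstruction at each decoder scale, S_i against T_i.
    l_m = sum(l * F.mse_loss(s, t)
              for l, s, t in zip(lambdas, scale_outs, scale_reals))
    # L_adv: generator term E[log(1 - D(G(x_M)))], the saturating form above.
    l_adv = torch.log(1.0 - d_fake + 1e-8).mean()
    # L_perceptual and L_style on VGG-16 features.
    f_res, f_real = vgg(restored), vgg(real)
    l_per = sum(F.mse_loss(a, b) for a, b in zip(f_res, f_real))
    l_sty = sum(F.mse_loss(gram(a), gram(b)) for a, b in zip(f_res, f_real))
    # L_TV: total variation over the restored image.
    l_tv = (restored[:, :, 1:, :] - restored[:, :, :-1, :]).abs().mean() + \
           (restored[:, :, :, 1:] - restored[:, :, :, :-1]).abs().mean()
    a1, a2, a3, a4, a5 = alphas
    return a1 * l_m + a2 * l_adv + a3 * l_per + a4 * l_sty + a5 * l_tv
```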
Further, by optimizing the sum of the loss functions, the parameters θ_d, θ_g = argmin(L) of the multi-scale local self-attention generative adversarial network are obtained, yielding the restored face image x̂ = G(x_M; θ_g), where θ_d and θ_g are the parameters of the discriminator and generator networks.
The invention has the beneficial effects that:
1. By adding the dual-channel local self-attention module to the generator, the network captures the internal relations of facial image features through attention between the missing and non-missing regions and self-attention within the missing region, which improves learning efficiency, enables restoration of fine facial details, and provides an effective route to high-precision reconstruction of missing face images;
2. The method adds the multi-scale local self-attention mechanism at every scale of the image generation process and progressively controls the generation of the face image, so that the dual-channel local self-attention module takes effect at each scale and the training process is more stable;
3. The generator adopts skip connections, which strengthens the expression and repair of high-level semantic information of the image and avoids mode collapse;
4. The invention adopts a 'high-capacity' discriminator network, where 'high capacity' means that the number of channels of the feature map output by each discrimination convolution layer is at least double that of the corresponding convolution layer of the generator network. The high-capacity discriminator examines a large number of feature maps of the generated image, so small differences between the restored and original images are discriminated effectively and the precision of the restored image is improved.
Drawings
FIG. 1 is a schematic diagram of the overall framework of the invention for restoring defective face images;
FIG. 2 is a schematic diagram of the operation of the self-attention mechanism of the invention;
FIG. 3 is a schematic diagram of test results of the invention for restoring defective face images.
Detailed Description
The invention is described in detail below with reference to the figures and examples.
In this embodiment, a face image restoration method based on a multi-scale local self-attention generative adversarial network includes:
Step one: acquire original face images x and corresponding binary defect masks M; construct the defective face image dataset {x_M | x_M = M ⊙ x}, where ⊙ denotes element-wise multiplication, and the corresponding original image dataset {x}; preprocess the acquired defective face images, i.e., uniformly resize them to N₀ × N₀ pixels, where N₀ = 128 is the number of pixels in the width and height dimensions, and normalize them before they enter the network; divide the preprocessed face image data into a training set and a test set in the ratio 10:1. In this embodiment, a dataset of 22,000 distinct face images is divided 10:1, giving 20k training images and 2k test images;
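A minimal preprocessing and split sketch, assuming torchvision-style loading, a hypothetical data directory, and a [-1, 1] normalization range:

```python
from torch.utils.data import random_split
from torchvision import transforms
from torchvision.datasets import ImageFolder

N0 = 128  # images are uniformly resized to N0 x N0

preprocess = transforms.Compose([
    transforms.Resize((N0, N0)),
    transforms.ToTensor(),                       # scales pixels to [0, 1]
    transforms.Normalize([0.5] * 3, [0.5] * 3),  # normalization range assumed
])

# ImageFolder expects class subdirectories; the path is hypothetical.
dataset = ImageFolder("data/faces", transform=preprocess)
n_test = len(dataset) // 11                      # 10 : 1 train/test split
train_set, test_set = random_split(dataset, [len(dataset) - n_test, n_test])
# e.g. 22,000 images -> 20k training and 2k test images, as in this embodiment
```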
Step two: construct the multi-scale local self-attention generative adversarial network, consisting of a generator network and a discriminator network, and embed a dual-channel local self-attention module at different scales of the generator network; the dual-channel local self-attention module comprises a cross-attention channel and a spatial self-attention channel connected in parallel, as shown in FIG. 1. The generator network comprises an encoder module, a semantic feature repair module, and a decoder module, and the encoder and decoder modules are essentially symmetrical in structure;
The encoder module comprises 6 encoding feature extraction modules, each consisting of an encoding convolution layer, batch normalization, and a Leaky ReLU activation function. The convolution kernel size k, stride s, and feature-map padding p of the encoding convolution layers are {k, s, p} = {(5,1,1); (3,2,1); (3,1,1); (3,2,1); (3,1,1); (3,1,1)}, generating feature maps of sizes 128 × 128 × 64, 64 × 64 × 128, 64 × 64 × 128, 32 × 32 × 256, 32 × 32 × 256, and 32 × 32 × 256, respectively. Batch normalization follows each encoding convolution layer, and the result is activated by a Leaky ReLU function with slope 0.2. As the number of convolution layers increases, the extracted features gradually evolve from low-level color and texture features to high-level abstract features based on image semantics; the encoding operations compress the input defective image into feature maps of different scales;
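The encoder could be sketched as follows; note that reproducing the stated 128 × 128 first-layer output with a 5 × 5 kernel requires padding 2 rather than the listed 1, which is assumed here:

```python
import torch.nn as nn

def enc_block(c_in, c_out, k, s, p):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=k, stride=s, padding=p),
        nn.BatchNorm2d(c_out),
        nn.LeakyReLU(0.2),
    )

encoder = nn.Sequential(           # input: 3 x 128 x 128
    enc_block(3, 64, 5, 1, 2),     # -> 64  x 128 x 128 (padding 2 assumed)
    enc_block(64, 128, 3, 2, 1),   # -> 128 x 64 x 64
    enc_block(128, 128, 3, 1, 1),  # -> 128 x 64 x 64
    enc_block(128, 256, 3, 2, 1),  # -> 256 x 32 x 32
    enc_block(256, 256, 3, 1, 1),  # -> 256 x 32 x 32
    enc_block(256, 256, 3, 1, 1),  # -> 256 x 32 x 32
)
```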
The semantic feature repair module comprises 4 feature restoration modules, each consisting of a dilated convolution layer, batch normalization, and a Leaky ReLU activation function. Each dilated convolution layer uses 3 × 3 dilated convolution kernels with dilation rates of 2, 4, 8, and 16, respectively, and performs semantic feature extraction and face image restoration on the compressed feature maps;
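A sketch of the four dilated-convolution modules, assuming the 256-channel width of the encoder output and padding equal to the dilation rate so the 32 × 32 size is preserved:

```python
import torch.nn as nn

semantic_repair = nn.Sequential(*[
    nn.Sequential(
        # padding = dilation keeps the 32 x 32 spatial size with 3 x 3 kernels
        nn.Conv2d(256, 256, kernel_size=3, dilation=d, padding=d),
        nn.BatchNorm2d(256),
        nn.LeakyReLU(0.2),
    )
    for d in (2, 4, 8, 16)
])
```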
The decoder consists of 4 decoding feature mapping modules, dual-channel local self-attention modules at 2 scales, 2 upsampling modules, and 1 nonlinear image equalization module. Each decoding feature mapping module consists of a decoding convolution layer, batch normalization, and a Leaky ReLU activation function; each upsampling module consists of a deconvolution layer, batch normalization, and a Leaky ReLU activation function; the nonlinear image equalization module consists of a decoding convolution layer followed by a Tanh activation function; and a dual-channel local self-attention module is placed before each upsampling module. The concrete connection is as follows: the first decoding feature mapping module, with 3 × 3 convolution kernels, is connected to the second decoding feature mapping module to extract feature maps of the corresponding scale; the second decoding feature mapping module is connected to the first-scale dual-channel local self-attention module, which repairs the missing information of the 32 × 32 feature map by focusing on the difference between the known and missing regions of the image; the first-scale dual-channel local self-attention module is connected to the 1st upsampling module, which upsamples the image to 64 × 64 through a deconvolution with 4 × 4 kernels; the 1st upsampling module is connected through the third decoding feature mapping module, with 3 × 3 kernels, to the second-scale dual-channel local self-attention module, which focuses once more, at the second scale, on the difference between the known and missing regions of the upsampled feature map to repair and adjust the missing information of the 64 × 64 feature map, thereby repairing the feature map at multiple scales. The second-scale dual-channel local self-attention module is connected to the 2nd upsampling module, which upsamples the image again through a deconvolution with 4 × 4 kernels, restoring it to the original 128 × 128 size; the 2nd upsampling module is connected through the fourth decoding feature mapping module, with 3 × 3 kernels, to the nonlinear image equalization module, which converts the result into a three-channel RGB image, thus realizing effective reconstruction of the image.
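One upsampling module and the nonlinear image equalization module might be sketched as follows; the 64-channel input to the final layer is an assumption:

```python
import torch.nn as nn

def up_block(c_in, c_out):
    # 4 x 4 deconvolution with stride 2 doubles the spatial size: 32 -> 64 -> 128
    return nn.Sequential(
        nn.ConvTranspose2d(c_in, c_out, kernel_size=4, stride=2, padding=1),
        nn.BatchNorm2d(c_out),
        nn.LeakyReLU(0.2),
    )

to_rgb = nn.Sequential(                 # nonlinear image equalization module
    nn.Conv2d(64, 3, kernel_size=3, padding=1),
    nn.Tanh(),                          # three-channel RGB output in [-1, 1]
)
```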
The feature map obtained before each dual-channel local self-attention module in the decoder is convolved to produce an RGB image of the corresponding scale, which is compared with the real image of the same scale through an L₂ reconstruction loss during restoration; comparing against real images at every scale progressively controls the generation of the face image and makes the training process more stable.
The discriminator network comprises 6 feature discrimination modules, each consisting of a discrimination convolution layer, batch normalization, and a Leaky ReLU activation function. The first 5 discrimination convolution layers use 4 × 4 kernels with a 2 × 2 stride, and the numbers of channels of the generated feature maps, 128, 128, 256, 512, and 1024, are roughly 2-4 times those of the corresponding generator layers, giving 'high-capacity' feature maps. The output tensor after the 5th convolution is of size 4 × 4 × 1024; features are extracted from it once more with a 4 × 4 kernel and activated by a Sigmoid function, outputting a 1 × 1 × 1 probability value that represents how real the input image is. Batch normalization is added after the convolution layers of both the generator and discriminator networks to normalize the convolved feature maps and accelerate network convergence.
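The discriminator network of the embodiment could be sketched as follows, assuming a 128 × 128 input and the channel sequence stated above:

```python
import torch.nn as nn

def disc_block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=4, stride=2, padding=1),
        nn.BatchNorm2d(c_out),
        nn.LeakyReLU(0.2),
    )

discriminator = nn.Sequential(          # input: 3 x 128 x 128
    disc_block(3, 128),                 # -> 128 x 64 x 64
    disc_block(128, 128),               # -> 128 x 32 x 32
    disc_block(128, 256),               # -> 256 x 16 x 16
    disc_block(256, 512),               # -> 512 x 8 x 8
    disc_block(512, 1024),              # -> 1024 x 4 x 4
    nn.Conv2d(1024, 1, kernel_size=4),  # -> 1 x 1 x 1
    nn.Sigmoid(),                       # probability that the input image is real
)
```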
In this embodiment, a dual-channel local self-attention module is added before each deconvolution layer of the decoder module, as shown in FIG. 2. The dual-channel local self-attention module comprises a cross-attention channel and a spatial self-attention channel connected in parallel, and the generated feature map passes through both channels; the network model restores the image along two dimensions, the feature information of the known region and the self-attention within the unknown region, thereby achieving high-precision, high-efficiency reconstruction of the missing region of the face image.
The cross-attention channel restores the image by focusing attention between the missing and non-missing regions, specifically:
(I) the input of each channel of the dual-channel local self-attention module is the feature map F before each deconvolution layer in the decoder, of size M₁ × M₂ × C, where M₁, M₂, and C are the height in pixels, the width in pixels, and the number of channels of the feature map F, respectively;
(II) the feature map F is divided into a defective region and a non-defective region according to the mask, the defective region being defined as the foreground F_f and the non-defective region as the background F_b;
(III) the foreground F_f and the background F_b are reshaped into one-dimensional vectors of size P_f × C and P_b' × C, where P_f = m₁ × m₂ and P_b' = (M₁ × M₂) − (m₁ × m₂); m₁ and m₂ are the height and width in pixels of the foreground F_f, P_f and P_b' are the numbers of foreground and background pixels, and C is the number of channels;
(IV) in the cross-attention channel, one-dimensional convolutions are applied to the reshaped foreground F_f and background F_b to obtain the transformation feature Q of the foreground F_f and the two transformation features K and V of the background F_b: Q = W_q F_f, K = W_k F_b, and V = W_v F_b, where W_q, W_k, and W_v are the feature transformation matrices of the cross-attention channel and are learnable parameters of the network;
(V) the element E_ij of the attention map E of the cross-attention channel can be expressed as:

E_ij = exp(Q_i K_j^T) / Σ_j exp(Q_i K_j^T)

where the subscripts i and j index the elements of the corresponding quantities and the superscript T denotes transposition;
(VI) the output of the cross-attention channel is:

Y_f = β₁ · pad(V^T E^T)

where β₁ is the weight assignment parameter of the cross-attention channel and a learnable parameter of the network, pad(·) denotes the zero-padding operation, V^T is the transpose of the background transformation feature V in the cross-attention channel, and E^T is the transpose of the attention map E in the cross-attention channel.
The spatial self-attention channel focuses attention within the missing region and captures the internal relations of facial image features to restore the face image, specifically:
(i) the spatial self-attention channel reshapes the foreground F_f into a one-dimensional vector of size P_f × C and applies three one-dimensional convolutions to obtain the three transformation features Q', K', and V' of the foreground F_f: Q' = W'_q F_f, K' = W'_k F_f, and V' = W'_v F_f, where W'_q, W'_k, and W'_v are the feature transformation matrices of the spatial self-attention channel and are learnable parameters of the network;
(ii) the element E'_ij of the attention map E' of the spatial self-attention channel can be expressed as:

E'_ij = exp(Q'_i K'_j^T) / Σ_j exp(Q'_i K'_j^T)

where the subscripts i and j index the elements of the corresponding quantities and the superscript T denotes transposition;
(iii) the output of the spatial self-attention channel is:

Y'_f = β₂ · pad(V'^T E'^T)

where β₂ is the weight assignment parameter of the spatial self-attention channel and a learnable parameter of the network, pad(·) denotes the zero-padding operation, V'^T is the transpose of the foreground transformation feature V' in the spatial self-attention channel, and E'^T is the transpose of the attention map E' in the spatial self-attention channel;
(iv) the feature maps of the cross-attention channel and the spatial self-attention channel are fused to obtain the refined feature map Y, expressed as Y = conv(Y_f + Y'_f), where conv(·) denotes a 1 × 1 convolution operation.
Step three: set the network model hyperparameters, which include the initial learning rate γ, the optimization algorithms of the discriminator and generator networks, the batch size, and the number of iterations (epochs), with values γ = 0.001, batch size = 64, and epoch = 200. Train the multi-scale local self-attention generative adversarial network on the defective face image training set, using an Adam optimizer for the generator network and stochastic gradient descent (SGD) for the discriminator network. During adversarial training, the sum of several loss functions is optimized to obtain the network parameters θ_d, θ_g = argmin(L) and thus the restored face image x̂ = G(x_M; θ_g), where θ_d and θ_g are the parameters of the discriminator and generator networks; this yields the multi-scale local self-attention generative adversarial face restoration model;
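A minimal training-loop sketch under the embodiment's hyperparameters (γ = 0.001, batch size 64, 200 epochs), with Adam driving the generator and SGD the discriminator; Generator, Discriminator, train_loader, and scale_targets are hypothetical placeholders, and total_loss and VGGFeatures refer to the loss sketch given earlier:

```python
import torch

gen, disc = Generator(), Discriminator()        # hypothetical model classes
vgg = VGGFeatures()                             # from the loss sketch above
opt_g = torch.optim.Adam(gen.parameters(), lr=1e-3)   # Adam for the generator
opt_d = torch.optim.SGD(disc.parameters(), lr=1e-3)   # SGD for the discriminator
bce = torch.nn.BCELoss()

for epoch in range(200):
    for x, mask in train_loader:                # assumed DataLoader, batch size 64
        x_m = mask * x                          # defective image x_M = M ⊙ x
        restored, scale_outs = gen(x_m, mask)   # restored image + per-scale RGB outputs

        # Discriminator step (SGD): push real images to 1, restored images to 0.
        opt_d.zero_grad()
        p_real, p_fake = disc(x), disc(restored.detach())
        d_loss = bce(p_real, torch.ones_like(p_real)) + \
                 bce(p_fake, torch.zeros_like(p_fake))
        d_loss.backward()
        opt_d.step()

        # Generator step (Adam): minimize the weighted sum of the five losses.
        # scale_targets: hypothetical helper yielding the per-scale targets T_i.
        opt_g.zero_grad()
        g_loss = total_loss(restored, x, scale_outs, scale_targets(x),
                            disc(restored), vgg,
                            alphas=(100.0, 10.0, 1.0, 1.0, 1.0))
        g_loss.backward()
        opt_g.step()
```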
The loss function includes: the multi-scale reconstruction loss L_m, the adversarial loss of the reconstructed image L_adv, the perceptual loss L_perceptual, the style loss L_style, and the total variation loss L_TV, specifically:
The multi-scale reconstruction loss is defined as:

L_m = Σ_{i=1}^{m} λ_i ‖S_i − T_i‖₂

where x_M = M ⊙ x denotes the defective image, x the original image, M the binary mask, G(·) the generated image, S_i the RGB output image at the i-th scale extracted from the decoder, T_i the real image at the i-th scale, and λ_i the weight of each scale; in this embodiment the total number of scales is m = 3 and the corresponding weights are 0.4, 0.6, and 0.8;
The adversarial loss L_adv of the reconstructed image is derived from the cost function of the adversarial training,

min_G max_D E_x[log D(x)] + E_{x_M}[log(1 − D(G(x_M)))]

which after transformation gives the adversarial loss of the reconstructed image as:

L_adv = E_{x_M}[log(1 − D(G(x_M)))]
The perceptual loss is expressed as:

L_perceptual = Σ_{j=1}^{N} (1 / (H_j W_j C_j)) ‖φ_j(x̂) − φ_j(x)‖₂
The style loss is expressed as:

L_style = Σ_{j=1}^{N} ‖G_j^φ(x̂) − G_j^φ(x)‖₂
The total variation loss is expressed as:

L_TV = Σ_{h,w,c} ( |x̂_{h+1,w,c} − x̂_{h,w,c}| + |x̂_{h,w+1,c} − x̂_{h,w,c}| )
The total loss is: L = α₁L_m + α₂L_adv + α₃L_perceptual + α₄L_style + α₅L_TV
where x̂ and x denote the restored face image and the real face image, φ denotes the VGG-16 network, φ_j(x̂) and φ_j(x) denote the layer-j feature maps of the restored and real images extracted by the VGG-16 network, H_j, W_j, and C_j are the height, width, and number of channels of the layer-j feature map, N is the number of layers in the VGG-16 feature extractor, D(·) denotes the discriminator's judgment of the bracketed image, E_x(·) denotes the expectation over the distribution, G_j^φ(·) denotes the Gram matrix of the VGG-16 layer-j feature map, ‖·‖₂ denotes the L₂ norm, x̂_{h,w,c} denotes the pixel value of the restored RGB face image at height h, width w, and channel c, and {α₁, …, α₅} are the weights of the individual losses in the total loss function. In this embodiment, {α₁, …, α₅} are set to 100, 10, 1, 1, and 1.
Step four: test the trained multi-scale local self-attention generative adversarial face restoration model on the defective face image test set, and evaluate its restoration performance with the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) indexes.
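Evaluation could use the scikit-image reference implementations of these two metrics, assuming 8-bit RGB arrays:

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(restored: np.ndarray, original: np.ndarray):
    """restored/original: H x W x 3 uint8 arrays of the repaired and real images."""
    psnr = peak_signal_noise_ratio(original, restored, data_range=255)
    ssim = structural_similarity(original, restored, channel_axis=2, data_range=255)
    return psnr, ssim
```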
FIG. 3 shows the restoration results of the multi-scale local self-attention generative adversarial network model on the 2k-image face defect test set, where the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) reach 25.39 and 0.87, respectively. The method not only improves network learning efficiency but also restores fine facial details, demonstrating its excellent performance in repairing defective face images.

Claims (8)

1. A face image restoration method based on a multi-scale local self-attention generative adversarial network, characterized by comprising the following steps:
step one: acquire an original face image x and a corresponding binary defect mask M; construct a defective face image dataset {x_M | x_M = M ⊙ x} and the corresponding original image dataset {x}, and divide the defective face image dataset into a training set and a test set according to a preset ratio, where ⊙ denotes element-wise multiplication;
step two: construct a multi-scale local self-attention generative adversarial network consisting of a generator network and a discriminator network, and embed a dual-channel local self-attention module at different scales of the generator network, where the dual-channel local self-attention module comprises a cross-attention channel and a spatial self-attention channel connected in parallel;
step three: set the network model hyperparameters and train the multi-scale local self-attention generative adversarial network on the defective face image training set, using an Adam optimizer for the generator network and stochastic gradient descent (SGD) for the discriminator network, and optimizing the sum of several loss functions during adversarial training, to obtain the multi-scale local self-attention generative adversarial face restoration model;
step four: test the trained model on the defective face image test set, and evaluate its restoration performance with the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) indexes.
2. The face image restoration method based on a multi-scale local self-attention generative adversarial network according to claim 1, characterized in that: the generator network comprises an encoder module, a semantic feature repair module, and a decoder module, and the encoder and decoder modules are essentially symmetrical in structure;
the encoder module comprises several encoding feature extraction modules, each consisting of an encoding convolution layer, batch normalization, and a Leaky ReLU activation function; each encoding convolution layer extracts features from the input defective image with convolution kernels of size k, scanning stride s, and padding of p pixels, batch normalization follows each convolution, and the result is activated by the nonlinear Leaky ReLU function; as the number of convolution layers increases, the extracted features gradually evolve from low-level color and texture features to high-level abstract features based on image semantics, and the encoding operations compress the input defective image into feature maps of different scales;
the semantic feature repair module comprises several feature restoration modules, each consisting of a dilated convolution layer, batch normalization, and a Leaky ReLU activation function; each dilated convolution layer uses 3 × 3 dilated convolution kernels, and the dilation rate of the t-th layer is 2^t, where t = 1, 2, …, T₀; the module performs semantic feature extraction and face image restoration on the compressed feature maps;
the decoder module consists of several decoding feature mapping modules, dual-channel local self-attention modules at m scales, several upsampling modules, and a nonlinear image equalization module; each decoding feature mapping module consists of a decoding convolution layer, batch normalization, and a Leaky ReLU activation function; each upsampling module consists of a deconvolution layer, batch normalization, and a Leaky ReLU activation function; the nonlinear image equalization module consists of a decoding convolution layer followed by a Tanh activation function; a dual-channel local self-attention module is placed before each upsampling module; the concrete connection is as follows: the first decoding feature mapping module is connected to the second decoding feature mapping module to extract the feature map of the corresponding scale; the second decoding feature mapping module is connected to the m-th-scale dual-channel local self-attention module, which repairs the missing information by focusing on the difference between the known and missing regions of the image; the m-th-scale dual-channel local self-attention module is connected to the 1st upsampling module, which upsamples the image through a deconvolution operation followed by batch normalization and Leaky ReLU activation; the 1st upsampling module is connected through the third decoding feature mapping module to the (m+1)-th-scale dual-channel local self-attention module, whose function is to focus once more, at the (m+1)-th scale, on the difference between the known and missing regions of the upsampled feature map so as to repair and adjust the missing information, thereby repairing the feature map at multiple scales; the (m+1)-th-scale dual-channel local self-attention module is connected to the 2nd upsampling module, and the 2nd upsampling module is connected through the fourth decoding feature mapping module to the nonlinear image equalization module, which converts the result into a three-channel RGB image, thus realizing effective reconstruction of the image.
3. The face image restoration method based on a multi-scale local self-attention generative adversarial network according to claim 1, characterized in that: the discriminator network comprises several feature discrimination modules, each consisting of a discrimination convolution layer, batch normalization, and a Leaky ReLU activation function; each discrimination convolution layer extracts and compresses features of the reconstructed image with convolution kernels of size k' and scanning stride s'; a probability value is finally output to judge the restoration quality of the generated image, and the number of channels of the feature map output by each discrimination convolution layer of the discriminator network is at least double that of the corresponding convolution layer of the generator network.
4. The face image restoration method based on a multi-scale local self-attention generative adversarial network according to claim 2, characterized in that: the feature map obtained before each dual-channel local self-attention module in the decoder is convolved to produce an RGB image of the corresponding scale, which is compared with the real image of the same scale through an L₂ reconstruction loss during restoration; comparing against real images at every scale progressively controls the generation of the face image and makes the training process more stable.
5. The face image restoration method based on a multi-scale local self-attention generative adversarial network according to claim 1, characterized in that: the cross-attention channel of the dual-channel local self-attention module repairs the defective region of the image by focusing attention between the missing and non-missing regions, specifically:
(I) the input of each channel of the dual-channel local self-attention module is the feature map F before each deconvolution layer in the decoder, of size M₁ × M₂ × C, where M₁, M₂, and C are the height in pixels, the width in pixels, and the number of channels of the feature map F, respectively;
(II) the feature map F is divided into a defective region and a non-defective region according to the mask, the defective region being defined as the foreground F_f and the non-defective region as the background F_b;
(III) the foreground F_f and the background F_b are reshaped into one-dimensional vectors of size P_f × C and P_b' × C, where P_f = m₁ × m₂ and P_b' = (M₁ × M₂) − (m₁ × m₂); m₁ and m₂ are the height and width in pixels of the foreground F_f, and P_f and P_b' are the numbers of foreground and background pixels;
(IV) in the cross-attention channel, one-dimensional convolutions are applied to the reshaped foreground F_f and background F_b to obtain the transformation feature Q of the foreground F_f and the two transformation features K and V of the background F_b: Q = W_q F_f, K = W_k F_b, and V = W_v F_b, where W_q, W_k, and W_v are the feature transformation matrices of the cross-attention channel and are learnable parameters of the network;
(V) the element E_ij of the attention map E of the cross-attention channel can be expressed as:

E_ij = exp(Q_i K_j^T) / Σ_j exp(Q_i K_j^T)

where the subscripts i and j index the elements of the corresponding quantities and the superscript T denotes transposition;
(VI) the output of the cross-attention channel is:

Y_f = β₁ · pad(V^T E^T)

where β₁ is the weight assignment parameter of the cross-attention channel and a learnable parameter of the network, pad(·) denotes the zero-padding operation, V^T is the transpose of the background transformation feature V in the cross-attention channel, and E^T is the transpose of the attention map E in the cross-attention channel.
6. The face image restoration method based on a multi-scale local self-attention generative adversarial network according to claim 1, characterized in that: the spatial self-attention channel focuses attention within the missing region and captures the internal relations of facial image features to restore the image, specifically:
(i) the spatial self-attention channel reshapes the foreground F_f into a one-dimensional vector of size P_f × C and applies three one-dimensional convolutions to obtain the three transformation features Q', K', and V' of the foreground F_f: Q' = W'_q F_f, K' = W'_k F_f, and V' = W'_v F_f, where W'_q, W'_k, and W'_v are the feature transformation matrices of the spatial self-attention channel and are learnable parameters of the network;
(ii) the element E'_ij of the attention map E' of the spatial self-attention channel can be expressed as:

E'_ij = exp(Q'_i K'_j^T) / Σ_j exp(Q'_i K'_j^T)

where the subscripts i and j index the elements of the corresponding quantities and the superscript T denotes transposition;
(iii) the output of the spatial self-attention channel is:

Y'_f = β₂ · pad(V'^T E'^T)

where β₂ is the weight assignment parameter of the spatial self-attention channel and a learnable parameter of the network, pad(·) denotes the zero-padding operation, V'^T is the transpose of the foreground transformation feature V' in the spatial self-attention channel, and E'^T is the transpose of the attention map E' in the spatial self-attention channel;
(iv) the feature maps of the cross-attention channel and the spatial self-attention channel are fused to obtain the refined feature map Y, expressed as Y = conv(Y_f + Y'_f), where conv(·) denotes a 1 × 1 convolution operation.
7. The facial image restoration method based on the multi-scale local self-attention generation countermeasure network as claimed in claim 1, characterized in that: the loss function includes: multi-scale reconstruction loss function LmReconstructed image contrast loss function LadvThe perceptual loss function LperceptualStyle loss function LstyleAnd total variation loss function LTV(ii) a The method specifically comprises the following steps:
the multi-scale reconstruction loss function is defined as:
Figure FDA0003323197540000043
wherein: x is the number ofMX denotes the original image, M denotes a binary mask, x ☉ xMRepresenting a defect image, G (-) representing a generated image, SiRepresenting the RGB output image of the ith scale extracted from the decoder, TiRepresenting the true image of the same image at the ith scale, λiIs the weight of each scale;
loss-fighting function L for reconstructed imagesadvIs derived from a cost function in the confrontation training
Figure FDA0003323197540000044
And transforming to obtain the expression of the resistance loss function of the reconstructed image as follows:
Figure FDA0003323197540000051
the perceptual loss function expression is:
Figure FDA0003323197540000052
the style loss function expression is:
Figure FDA0003323197540000053
the total variation loss function expression is as follows:
Figure FDA0003323197540000054
the total loss function expression is: l ═ alpha1Lm2Ladv3Lperceptual4Lstyle5LTV
where $\hat{x}$ and $x$ denote the restored face image and the real face image respectively, $\phi$ denotes the VGG-16 network, $\phi_j(\hat{x})$ and $\phi_j(x)$ denote the $j$-th-layer feature maps of the restored image and the real image extracted by the VGG-16 network, $H_j$, $W_j$ and $C_j$ denote the height, width and number of channels of the feature map extracted by the VGG-16 network at layer $j$, $N$ is the number of layers in the VGG-16 feature extractor, $D(\cdot)$ denotes the discriminator's judgment of the image in parentheses, $E_x(\cdot)$ denotes the expectation of a distribution function, $G_j^{\phi}(\cdot)$ denotes the Gram matrix of the $j$-th-layer VGG-16 feature map, $\lVert \cdot \rVert_2$ denotes the $L_2$ norm, $\hat{x}_{h,w,c}$ denotes the pixel value of the restored RGB face image at height $h$, width $w$ and channel $c$, and $\{\alpha_1, \ldots, \alpha_5\}$ denote the weight of each loss in the total loss function.
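Under these definitions, the perceptual, style and total-variation terms can be sketched as below; passing the VGG-16 feature maps in as lists is an interface assumption, not something fixed by the claim.

```python
import torch

def gram_matrix(feat: torch.Tensor) -> torch.Tensor:
    # Gram matrix of a feature map, normalized by H_j * W_j * C_j.
    b, c, h, w = feat.shape
    f = feat.flatten(2)  # (B, C_j, H_j * W_j)
    return torch.bmm(f, f.transpose(1, 2)) / (c * h * w)

def perceptual_and_style_loss(feats_restored, feats_real):
    """feats_*: lists of VGG-16 feature maps phi_j(x_hat) and phi_j(x)."""
    l_perc, l_style = 0.0, 0.0
    for fr, fx in zip(feats_restored, feats_real):
        _, c, h, w = fr.shape
        l_perc += torch.norm(fr - fx, p=2) / (c * h * w)
        l_style += torch.norm(gram_matrix(fr) - gram_matrix(fx), p=2)
    return l_perc, l_style

def total_variation_loss(x_hat: torch.Tensor) -> torch.Tensor:
    # Penalizes differences between neighboring pixels along height and width.
    dh = (x_hat[:, :, 1:, :] - x_hat[:, :, :-1, :]).abs().sum()
    dw = (x_hat[:, :, :, 1:] - x_hat[:, :, :, :-1]).abs().sum()
    return dh + dw
```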
8. The face image restoration method based on the multi-scale local self-attention generation countermeasure network as claimed in claim 7, characterized in that the parameters of the multi-scale local self-attention generation countermeasure network are obtained by optimizing the total loss function, $\theta_d, \theta_g = \arg\min(L)$, and the restored face image is then obtained as $\hat{x} = G(x \odot M;\, \theta_g)$; where $\theta_d$ and $\theta_g$ are the parameters of the discriminator network and the generator network respectively.
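A minimal sketch of the alternating optimization that claim 8 describes, reusing the hypothetical loss functions from the sketches above; `generator`, `discriminator`, `vgg_features`, `loader` and the weights `alphas` are assumed to exist, and the optimizer choice is illustrative.

```python
import torch
import torch.nn.functional as F

g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

for x, mask in loader:                  # x: real face image, mask: binary M
    x_m = x * mask                      # defect image x ⊙ M
    x_hat = generator(x_m)              # restored face image G(x ⊙ M)

    # Discriminator step: drive theta_d toward the max_D side of the cost.
    d_loss = discriminator_loss(discriminator(x), discriminator(x_hat.detach()))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Generator step: drive theta_g toward argmin of the total loss L.
    # A single-scale L1 term stands in for L_m here; the multi-scale version
    # would sum over the decoder's per-scale outputs as sketched earlier.
    l_adv = generator_adversarial_loss(discriminator(x_hat))
    l_perc, l_style = perceptual_and_style_loss(vgg_features(x_hat), vgg_features(x))
    loss = (alphas[0] * F.l1_loss(x_hat, x)
            + alphas[1] * l_adv
            + alphas[2] * l_perc
            + alphas[3] * l_style
            + alphas[4] * total_variation_loss(x_hat))
    g_opt.zero_grad()
    loss.backward()
    g_opt.step()
```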
CN202111253713.5A 2021-10-27 2021-10-27 Face image restoration method based on multi-scale local self-attention generation countermeasure network Pending CN113962893A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111253713.5A CN113962893A (en) 2021-10-27 2021-10-27 Face image restoration method based on multi-scale local self-attention generation countermeasure network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111253713.5A CN113962893A (en) 2021-10-27 2021-10-27 Face image restoration method based on multi-scale local self-attention generation countermeasure network

Publications (1)

Publication Number Publication Date
CN113962893A true CN113962893A (en) 2022-01-21

Family

ID=79467506

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111253713.5A Pending CN113962893A (en) 2021-10-27 2021-10-27 Face image restoration method based on multi-scale local self-attention generation countermeasure network

Country Status (1)

Country Link
CN (1) CN113962893A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110689499A (en) * 2019-09-27 2020-01-14 北京工业大学 Face image restoration method based on dense expansion convolution self-coding countermeasure network
CN113112411A (en) * 2020-01-13 2021-07-13 南京信息工程大学 Human face image semantic restoration method based on multi-scale feature fusion
CN111275638A (en) * 2020-01-16 2020-06-12 湖南大学 Face restoration method for generating confrontation network based on multi-channel attention selection
CN112184582A (en) * 2020-09-28 2021-01-05 中科人工智能创新技术研究院(青岛)有限公司 Attention mechanism-based image completion method and device

Cited By (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114386531B (en) * 2022-01-25 2023-02-14 山东力聚机器人科技股份有限公司 Image identification method and device based on double-stage attention
CN114386531A (en) * 2022-01-25 2022-04-22 山东力聚机器人科技股份有限公司 Image identification method and device based on double-stage attention
CN114494499A (en) * 2022-01-26 2022-05-13 电子科技大学 Sketch coloring method based on attention mechanism
CN114862699A (en) * 2022-04-14 2022-08-05 中国科学院自动化研究所 Face repairing method, device and storage medium based on generation countermeasure network
CN114693577A (en) * 2022-04-20 2022-07-01 合肥工业大学 Infrared polarization image fusion method based on Transformer
CN114693577B (en) * 2022-04-20 2023-08-11 合肥工业大学 Infrared polarized image fusion method based on Transformer
CN114581343A (en) * 2022-05-05 2022-06-03 南京大学 Image restoration method and device, electronic equipment and storage medium
CN114782291B (en) * 2022-06-23 2022-09-06 中国科学院自动化研究所 Training method and device of image generator, electronic equipment and readable storage medium
CN114782291A (en) * 2022-06-23 2022-07-22 中国科学院自动化研究所 Training method and device of image generator, electronic equipment and readable storage medium
CN115358954B (en) * 2022-10-21 2022-12-23 电子科技大学 Attention-guided feature compression method
CN115358954A (en) * 2022-10-21 2022-11-18 电子科技大学 Attention-guided feature compression method
CN115471901A (en) * 2022-11-03 2022-12-13 山东大学 Multi-pose face frontization method and system based on generation of confrontation network
CN115984106A (en) * 2022-12-12 2023-04-18 武汉大学 Line scanning image super-resolution method based on bilateral generation countermeasure network
CN115984106B (en) * 2022-12-12 2024-04-02 武汉大学 Line scanning image super-resolution method based on bilateral generation countermeasure network
CN116051936A (en) * 2023-03-23 2023-05-02 中国海洋大学 Chlorophyll concentration ordered complement method based on space-time separation external attention
CN116051936B (en) * 2023-03-23 2023-06-20 中国海洋大学 Chlorophyll concentration ordered complement method based on space-time separation external attention
CN116071275B (en) * 2023-03-29 2023-06-09 天津大学 Face image restoration method based on online knowledge distillation and pretraining priori
CN116071275A (en) * 2023-03-29 2023-05-05 天津大学 Face image restoration method based on online knowledge distillation and pretraining priori
CN117611753A (en) * 2024-01-23 2024-02-27 吉林大学 Facial shaping and repairing auxiliary system and method based on artificial intelligent reconstruction technology
CN117611753B (en) * 2024-01-23 2024-03-22 吉林大学 Facial shaping and repairing auxiliary system and method based on artificial intelligent reconstruction technology
CN117974508A (en) * 2024-03-28 2024-05-03 南昌航空大学 Iris image restoration method for irregular occlusion based on generation countermeasure network
CN117974508B (en) * 2024-03-28 2024-06-07 南昌航空大学 Iris image restoration method for irregular occlusion based on generation countermeasure network
CN117974832A (en) * 2024-04-01 2024-05-03 南昌航空大学 Multi-modal liver medical image expansion algorithm based on generation countermeasure network
CN117974832B (en) * 2024-04-01 2024-06-07 南昌航空大学 Multi-modal liver medical image expansion algorithm based on generation countermeasure network
CN117994173A (en) * 2024-04-07 2024-05-07 腾讯科技(深圳)有限公司 Repair network training method, image processing method, device and electronic equipment
CN117994173B (en) * 2024-04-07 2024-06-11 腾讯科技(深圳)有限公司 Repair network training method, image processing method, device and electronic equipment
CN118036701A (en) * 2024-04-10 2024-05-14 南昌工程学院 Ultraviolet image-based insulator corona discharge data enhancement method and system

Similar Documents

Publication Publication Date Title
CN113962893A (en) Face image restoration method based on multi-scale local self-attention generation countermeasure network
US11450066B2 (en) 3D reconstruction method based on deep learning
CN113240580B (en) Lightweight image super-resolution reconstruction method based on multi-dimensional knowledge distillation
CN112819910B (en) Hyperspectral image reconstruction method based on double-ghost attention machine mechanism network
CN113673590B (en) Rain removing method, system and medium based on multi-scale hourglass dense connection network
CN113989129A (en) Image restoration method based on gating and context attention mechanism
CN115018727A (en) Multi-scale image restoration method, storage medium and terminal
CN114445292A (en) Multi-stage progressive underwater image enhancement method
CN111833261A (en) Image super-resolution restoration method for generating countermeasure network based on attention
CN113112416A (en) Semantic-guided face image restoration method
CN117274760A (en) Infrared and visible light image fusion method based on multi-scale mixed converter
CN114638768B (en) Image rain removing method, system and equipment based on dynamic association learning network
CN114266957A (en) Hyperspectral image super-resolution restoration method based on multi-degradation mode data augmentation
CN116797461A (en) Binocular image super-resolution reconstruction method based on multistage attention-strengthening mechanism
Cherian et al. A Novel AlphaSRGAN for Underwater Image Super Resolution.
CN112686822B (en) Image completion method based on stack generation countermeasure network
CN113628143A (en) Weighted fusion image defogging method and device based on multi-scale convolution
CN112634168A (en) Image restoration method combined with edge information
CN116703750A (en) Image defogging method and system based on edge attention and multi-order differential loss
CN114862699B (en) Face repairing method, device and storage medium based on generation countermeasure network
CN116188272A (en) Two-stage depth network image super-resolution reconstruction method suitable for multiple fuzzy cores
CN115660979A (en) Attention mechanism-based double-discriminator image restoration method
CN115861108A (en) Image restoration method based on wavelet self-attention generation countermeasure network
CN115100091A (en) Conversion method and device for converting SAR image into optical image
CN114331894A (en) Face image restoration method based on potential feature reconstruction and mask perception

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination