CN114022506A - Image restoration method with edge prior fusion multi-head attention mechanism - Google Patents


Info

Publication number
CN114022506A
CN114022506A (application CN202111356234.6A; granted as CN114022506B)
Authority
CN
China
Prior art keywords
image
edge
repairing
restoration
attention
Prior art date
Legal status
Granted
Application number
CN202111356234.6A
Other languages
Chinese (zh)
Other versions
CN114022506B (en)
Inventor
张加万 (Zhang Jiawan)
赵晨曦 (Zhao Chenxi)
李会彬 (Li Huibin)
Current Assignee
Tianjin University
Original Assignee
Tianjin University
Priority date
Filing date
Publication date
Application filed by Tianjin University filed Critical Tianjin University
Priority to CN202111356234.6A
Publication of CN114022506A
Application granted
Publication of CN114022506B
Status: Active


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/10 - Segmentation; Edge detection
    • G06T7/13 - Edge detection
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 - Classification techniques relating to the classification model based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 - Geometric image transformations in the plane of the image
    • G06T3/40 - Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4038 - Image mosaicing, e.g. composing plane images from plane sub-images
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20081 - Training; Learning
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T - CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 - Road transport of goods or passengers
    • Y02T10/10 - Internal combustion engine [ICE] based vehicles
    • Y02T10/40 - Engine management systems


Abstract

The invention relates to the technical field of image restoration and discloses an image restoration method with an edge-prior fusion multi-head attention mechanism, comprising the following steps. Step S1: acquire experimental data, which comprise a training set and a test set, preprocess the data, and extract the edge map of each preprocessed image. Step S2: construct an edge-prior fusion multi-head attention restoration model, which comprises an edge restoration model and an image restoration model; the edge restoration model takes the extracted edge map, the original image and the mask image as input and outputs the restored edge map, and the image restoration model is trained with the restored edge map and the defect image as input. The method fuses a multi-head attention mechanism: by extracting richer long-distance dependencies among pixels, it improves the image restoration effect.

Description

Image restoration method with edge prior fusion multi-head attention mechanism
Technical Field
The invention relates to the technical field of image restoration, in particular to an image restoration method with an edge prior fusion multi-head attention mechanism.
Background
In the information society, images are among the most important sources of information, and obtaining more complete and clearer images has become a hotspot in computer vision; related application fields include image restoration and super-resolution. Image inpainting is a technique that recovers the complete image from the remaining information in a damaged image. This is not a difficult task for the human eye, but it is rather challenging for computer vision. The technology has many practical uses, such as image restoration (removing photo scratches and text occlusion), photo editing (removing unwanted objects), and image coding and transmission (recovering image blocks whose content is lost to network packet loss during transmission). Image restoration has therefore been a very active research area in recent years.
At present, image inpainting based on generative adversarial networks has become mainstream: a network model can generate images that are close to the training data yet never existed, producing results realistic enough to pass for genuine.
Disclosure of Invention
Aiming at the defects in the prior art, the invention aims to provide an image restoration method with an edge-prior fusion multi-head attention mechanism.
In order to achieve the above purpose, the invention provides the following technical scheme:
An image restoration method with an edge-prior fusion multi-head attention mechanism comprises
step S1: acquiring experimental data and preprocessing the data, wherein the experimental data comprise a training set and a test set, and extracting the edge map of each preprocessed image;
step S2: constructing an edge-prior fusion multi-head attention restoration model, which comprises an edge restoration model and an image restoration model, wherein the edge restoration model takes the extracted edge map, the original image and the mask image as input and outputs the restored edge map, and the image restoration model is trained with the restored edge map and the defect image as input;
the image restoration model comprises an image restorer, which generates the restored picture after downsampling the restored edge image several times and applying several dilated-convolution-based residual blocks, one multi-head attention network and two deconvolutions;
step S3: evaluating the results of the edge-prior fusion multi-head attention restoration model on the test set.
In the present invention, the edge restoration model further includes an edge restorer, which downsamples the extracted edge map, the original image and the mask image and, after several dilated-convolution-based residual blocks and two deconvolutions, converts the feature map into a single-channel edge map.
In the present invention, further, the restoration method of the edge restoration model comprises:
step S20: obtaining the predicted edge restoration result of the edge restorer and, from it, the generation result of the edge restoration model, which keeps the image edges of the known region and fills in the edge portions to be repaired in the missing region, as follows:

C_p = G_e(M, C, I_{gray})

C^+ = C \cdot (1 - M) + C_p \cdot M

where C_p represents the predicted edge-restored image, G_e the edge restorer, M the mask image, C the edge map of the image to be restored, I_{gray} the grayscale map of the image to be restored, and C^+ the restored edge image generated by the edge restoration model.
In the invention, further, the restoration method of the edge restoration model also comprises
step S21: calculating the loss function of the edge restorer, which is a weighted sum of the generated-edge adversarial loss and the edge feature loss;
step S22: optimizing the generation result of the edge restoration model to obtain the restored edge image.
In the present invention, further, the restoration method of the image restoration model includes:
step S23: taking the tensor formed by splicing the restored edge image and the damaged image as input, obtaining the predicted restored image, and from it the restored image:

I_p = G_i(M, C^+, I_M)

I^+ = I \cdot (1 - M) + I_p \cdot M

where I_p is the predicted restored picture, I the real picture, G_i the image restorer, C^+ the restored edge image and I_M the defect map;
step S24: calculating the image restoration loss function and optimizing the restoration result of the image restoration model, wherein the image restoration loss function comprises the image adversarial loss, the style loss and the perceptual loss, jointly calculated as:

\mathcal{L}_{G_i} = \lambda_3 \mathcal{L}_{adv,i} + \lambda_4 \mathcal{L}_{style} + \lambda_5 \mathcal{L}_{perc} + \lambda_6 \mathcal{L}_{\ell_1}

where λ3, λ4, λ5 and λ6 are user-defined hyperparameters, \mathcal{L}_{adv,i} is the adversarial loss generated by the image restoration model, \mathcal{L}_{style} the style loss, \mathcal{L}_{perc} the perceptual loss and \mathcal{L}_{\ell_1} the ℓ1 reconstruction loss.
In the present invention, further, the image restorer generating the restored picture after several downsamplings of the restored edge image, several dilated-convolution-based residual blocks, one multi-head attention network and two deconvolutions includes:
step S2-1: obtaining several groups of query, key and value feature maps by passing the feature map produced by the convolutional layers and the residual network through different convolutions;
step S2-2: acquiring the reconstructed feature maps;
step S2-3: splicing the reconstructed feature maps along the channel dimension to obtain the combined multi-head attention result;
step S2-4: transforming the result back to the size of the original input feature through a convolutional network, and adding the reconstructed feature map to the original feature map to obtain the final restored-picture output.
In the present invention, further, the acquiring of the reconstructed feature maps in step S2-2 includes:
step S2-2-1: transposing the key feature maps and performing a dot-product operation between each query feature map and its transposed key feature map to obtain several groups of correlation attention matrices;
step S2-2-2: normalizing the correlation attention matrices;
step S2-2-3: matrix-multiplying each group's normalized correlation self-attention matrix with that group's value feature map to obtain the group's reconstructed feature map.
The step S2-3 of splicing the reconstructed feature maps along the channel dimension to obtain the combined multi-head attention result includes:
step S2-3-1: obtaining the attention result of the i-th head:

head_i = Attention(Q_i, K_i, V_i) = softmax(Q_i K_i^T / \sqrt{d_k}) V_i

where Q_i, K_i and V_i denote the query, key and value matrices of the i-th head's feature maps and d_k is the key dimension;
step S2-3-2: splicing the self-attention results of all heads and using the W^O matrix to fuse and project the multiple feature spaces back to the size of the original matrix, finally obtaining the combined multi-head self-attention result:

MultiHead = Concat(head_1, head_2, ..., head_h) W^O
The style loss calculation method comprises:

\mathcal{L}_{style} = \mathbb{E}_i\left[\left\| Gr_i(I_p) - Gr_i(I) \right\|_1\right]

where Gr_i(I_p) denotes the Gram matrix formed by the inner products of the predicted image's feature vectors, Gr_i(I) the Gram matrix of the real image's feature vectors, and c_i h_i w_i the dimensions of the activation feature that normalize each Gram matrix.
In the present invention, further, in step S2-2-3 each element of the value feature map is reconstructed as a weighted sum over the elements of the value feature map using that group's correlation attention matrix, the weights being the corresponding values of the correlation attention matrix.
Compared with the prior art, the invention has the beneficial effects that:
according to the method, the multi-head attention network capable of capturing the long-distance relationship among the richer pixel regions is added behind the last residual layer of the image restoration model, in order to enable the model to learn information in different subspaces, each head is used for processing different information by using a plurality of parallel repeated attention calculations, so that the characteristics of different parts can be processed, the richer long-distance dependency relationship is extracted, the multi-head self-attention network can learn the correlation matrixes of different modes, the method has an important effect on improving the restoration result, and the image restoration effect is improved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is an overall flowchart of an image inpainting method with edge prior fusion with a multi-head attention mechanism according to the present invention;
FIG. 2 is a flowchart of step S2 in the image inpainting method of the present invention in which the multi-head attention mechanism is fused with edge priors;
FIG. 3 is a flowchart of the steps S2-2 and S2-3 in the image inpainting method of the present invention with edge prior fusion multi-head attention mechanism;
FIG. 4 is a flowchart of an implementation of a method for repairing an edge repairing model in an image repairing method with edge prior fusion with a multi-head attention mechanism according to the present invention;
FIG. 5 is a schematic diagram of obtaining query, key, value feature maps in the image restoration method with edge prior fusion multi-head attention mechanism of the present invention;
FIG. 6 is a schematic flowchart of obtaining the correlation attention matrix in the image restoration method with edge prior fusion multi-head attention mechanism of the present invention;
FIG. 7 is a schematic flow chart of a reconstructed feature map in an image restoration method with edge prior fusion with a multi-head attention mechanism according to the present invention;
FIG. 8 is a network architecture diagram of a multi-head self-attention layer in the image restoration method with edge prior fusion of a multi-head attention mechanism according to the present invention;
FIG. 9 is a schematic diagram of an edge restoration model construction framework in the image restoration method with edge prior fusion with a multi-head attention mechanism according to the present invention;
FIG. 10 is a schematic diagram of a framework for constructing an edge image restoration model in an image restoration method with an edge prior fusion multi-head attention mechanism according to the present invention;
FIG. 11 is a schematic diagram of an experimental result of the image inpainting method of the edge prior fusion multi-head attention mechanism of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It will be understood that when an element is referred to as being "secured to" another element, it can be directly on the other element or intervening elements may also be present. When a component is referred to as being "connected" to another component, it can be directly connected to the other component or intervening components may also be present. When a component is referred to as being "disposed on" another component, it can be directly on the other component or intervening components may also be present. The terms "vertical," "horizontal," "left," "right," and the like as used herein are for illustrative purposes only.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
Referring to fig. 1, a preferred embodiment of the present invention provides an image restoration method with an edge-prior fusion multi-head attention mechanism, which includes
step S1: acquiring experimental data and preprocessing the data, wherein the experimental data comprise a training set and a test set, and extracting the edge map of each preprocessed image;
step S2: constructing an edge-prior fusion multi-head attention restoration model, which comprises an edge restoration model and an image restoration model, wherein the edge restoration model takes the extracted edge map, the original image and the mask image as input and outputs the restored edge image, and the image restoration model is trained with the restored edge image and the defect image as input;
the image restoration model comprises an image restorer, which generates the restored picture after downsampling the restored edge image several times and applying several dilated-convolution-based residual blocks, one multi-head attention network and two deconvolutions;
step S3: evaluating the results of the edge-prior fusion multi-head attention restoration model on the test set.
Specifically, under this scheme a certain number of intact images of the relevant type are first collected according to the experimental requirements to complete data collection; the data are then given preliminary processing by preprocessing techniques to obtain data meeting the standard, and the dataset is divided into a training set and a test set. The image restoration model is then built step by step according to the algorithm design; once built, the model is trained with the training set and its effect is tested and evaluated with the test set. By fusing the multi-head attention mechanism, the edge-prior fusion restoration model extracts richer long-distance dependencies among pixels and thereby improves the image restoration effect.
In the present invention, the images of the public CelebA dataset are used and resized to 256 × 256 for the experiments. Because the dataset is not pre-divided into training, validation and test sets, the first 180,000 pictures are selected for training the model and 4,000 pictures are selected as the test set for analyzing and comparing the experimental results. In addition, the mask images used when training the model are taken from an irregular mask dataset, whose irregular masks are divided into six groups according to the proportion of the whole image occupied by the missing region: 0-10%, 10-20%, 20-30%, 30-40%, 40-50% and 50-60%. Each group comprises 2,000 images, of which 1,000 masks represent the case where the image boundary is missing and the other 1,000 the case where the image boundary is intact.
For the images originally input from the training set, the edge map is extracted by Canny edge detection, which proceeds in four steps: Gaussian filtering; calculating gradient magnitudes and directions; non-maximum suppression; and edge detection with upper and lower thresholds. This yields the edge map of the original image; edge detection is then combined with the binary mask, and the repaired edge map is produced by the generative adversarial network. For the test task, 4,000 pictures are taken from the CelebA dataset as the test set and random missing-region masks simulate the missing areas; the hand-drawn mask images used for testing are divided into four groups, numbered 1 to 4 by missing-area ratio from small to large, each group containing 1,000 pictures, 4,000 in all.
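The edge extraction described above can be sketched in a few lines; this is a minimal illustration assuming OpenCV (whose Canny implementation performs the Gaussian filtering, gradient computation, non-maximum suppression and double thresholding internally), and the threshold values and function names are assumptions of this sketch rather than values from the patent:

import cv2
import numpy as np

def extract_edge_inputs(image_bgr, mask, low_thresh=100, high_thresh=200):
    # image_bgr: HxWx3 uint8 image; mask: HxW float array with 1 = missing pixel
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    # cv2.Canny applies smoothing, gradients, NMS and hysteresis thresholding
    edges = cv2.Canny(gray, low_thresh, high_thresh).astype(np.float32) / 255.0
    # edges inside the missing region are unknown and must be generated
    edges_masked = edges * (1.0 - mask)
    return gray.astype(np.float32) / 255.0, edges, edges_masked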
In the present invention, as shown in fig. 9, the edge restoration model includes an edge restorer, which downsamples the spliced extracted edge map, original image and mask image and, after several dilated-convolution-based residual blocks and two deconvolutions, converts the feature map into a single-channel edge map. Specifically, the feature map is converted into the single-channel edge map through 3 downsampling convolutions, 8 dilated-convolution-based residual blocks and 2 deconvolutions.
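A minimal PyTorch sketch of the edge restorer just described follows: three encoding convolutions (the first kept at stride 1, an assumption made so that the two stride-2 convolutions are undone by the two deconvolutions), eight dilated-convolution residual blocks, two deconvolutions, and a single-channel sigmoid output. Channel widths, kernel sizes and normalization layers are illustrative assumptions:

import torch
import torch.nn as nn

class DilatedResBlock(nn.Module):
    def __init__(self, ch, dilation=2):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=dilation, dilation=dilation),
            nn.InstanceNorm2d(ch), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.InstanceNorm2d(ch))

    def forward(self, x):
        return x + self.body(x)  # residual connection

class EdgeRestorer(nn.Module):
    def __init__(self):
        super().__init__()
        # input channels: mask + masked edge map + grayscale image
        layers = [nn.Conv2d(3, 64, 7, stride=1, padding=3), nn.ReLU(True),
                  nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(True),
                  nn.Conv2d(128, 256, 4, stride=2, padding=1), nn.ReLU(True)]
        layers += [DilatedResBlock(256) for _ in range(8)]
        layers += [nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1), nn.ReLU(True),
                   nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(True),
                   nn.Conv2d(64, 1, 7, padding=3), nn.Sigmoid()]  # 1-channel edge map
        self.net = nn.Sequential(*layers)

    def forward(self, mask, edge_masked, gray):
        return self.net(torch.cat([mask, edge_masked, gray], dim=1))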
Specifically, in the present invention, as shown in fig. 4, the restoration method of the edge restoration model is as follows:
step S20: the edge restorer takes as input a tensor formed by splicing the edge map, the mask image and the grayscale map of the image to be restored, and outputs the predicted edge restoration result; the generation result of the edge restoration model is obtained from that prediction by keeping the image edges of the known region and filling in the edge portions to be repaired in the missing region:

C_p = G_e(M, C, I_{gray})

C^+ = C \cdot (1 - M) + C_p \cdot M

where C_p represents the predicted edge-restored image, G_e the edge restorer, M the mask image, C the edge map of the image to be restored, I_{gray} the grayscale map of the image to be restored, and C^+ the restored edge image generated by the edge restoration model.
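The two compositing equations translate directly into tensor operations; a short sketch, reusing the hypothetical EdgeRestorer above, with all tensors of shape B×1×H×W and the mask equal to 1 on missing pixels:

# C_p = G_e(M, C, I_gray): predict edges everywhere
c_p = edge_restorer(mask, edge_masked, gray)
# C+ = C·(1-M) + C_p·M: keep known edges, fill only the missing region
c_plus = edge * (1.0 - mask) + c_p * mask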
In the present invention, further, the restoration method of the edge restoration model also includes:
step S21: calculating the loss function of the edge restorer, which is a weighted sum of the generated-edge adversarial loss and the edge feature loss;
step S22: optimizing the generation result of the edge restoration model according to the loss function to obtain the restored edge image finally output by the edge restoration model.
Specifically, the loss function of the edge restoration model is a mixed loss function whose purpose is to constrain the result judged by the edge discriminator; the mixture is a weighted sum of the generated-edge adversarial loss and the edge feature loss. The generated-edge adversarial loss is a kind of cross entropy and can be written as:

\mathcal{L}_{adv,e} = \mathbb{E}_{(C, I_{gray})}[\log D_e(C, I_{gray})] + \mathbb{E}_{I_{gray}}[\log(1 - D_e(C_p, I_{gray}))]

where \mathcal{L}_{adv,e} denotes the edge adversarial loss, the first expectation is taken over the real edge map together with the grayscale map, the second over the grayscale map, and D_e denotes the edge discriminator.
Secondly, the edge feature loss is a distance function defined on the feature layers of the edge discriminator; its main function is to compute the sum of the distances, at the discriminator's different layers, between the features of the generated edge and those of the Canny-detected edge, so the feature-loss formula is expressed as:

\mathcal{L}_{FM} = \mathbb{E}\left[\sum_{i=1}^{n} \frac{1}{N_i}\left\| D_e^{(i)}(C) - D_e^{(i)}(C_p)\right\|_1\right]

where \mathcal{L}_{FM} denotes the edge feature loss, n is the number of activation layers of the edge discriminator, N_i the number of elements in the i-th activation layer, and D_e^{(i)} the activation of the i-th layer.
Finally, the optimization goal of the edge model can be written as:

\min_{G_e} \max_{D_e} \mathcal{L}_{G_e} = \lambda_1 \mathcal{L}_{adv,e} + \lambda_2 \mathcal{L}_{FM}

where the minimum is taken over the edge restorer G_e, the maximum over the edge discriminator D_e, and λ1 and λ2 weight the two loss terms.
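A hedged sketch of the edge restorer's mixed objective, assuming a discriminator that returns both its logits and a list of intermediate activations; the weights lambda_adv and lambda_fm stand in for the λ1 and λ2 above, and their default values here are illustrative:

import torch
import torch.nn.functional as F

def edge_generator_loss(d_fake_logit, d_real_feats, d_fake_feats,
                        lambda_adv=1.0, lambda_fm=10.0):
    # non-saturating adversarial term: fool the edge discriminator
    adv = F.binary_cross_entropy_with_logits(
        d_fake_logit, torch.ones_like(d_fake_logit))
    # feature-matching term: L1 distance between discriminator activations
    # of the Canny edge map and of the generated edge map, summed over layers
    fm = sum(F.l1_loss(f_fake, f_real.detach())
             for f_fake, f_real in zip(d_fake_feats, d_real_feats))
    return lambda_adv * adv + lambda_fm * fm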
In the present invention, as shown in fig. 10, the image restoration model includes an image restorer, which generates the restored picture after downsampling the restored edge image several times and applying several dilated-convolution-based residual blocks, one multi-head attention network and two deconvolutions.
A convolutional neural network attends only to the pixel values of local regions when learning features, ignoring the influence that correlations among pixels of remote regions have on image generation and restoration; many attention-mechanism models have therefore been designed to better capture long-distance dependencies. One of them, the multi-head self-attention network, is an expanded structure of the self-attention network, which can effectively capture long-distance relationships among the pixels of an image. But the long-distance pixel relationships of each region are not just one group, and a single self-attention network is not enough to learn several long-distance relationships, so a multi-head attention network capable of capturing richer long-distance relationships among pixel regions is adopted. The multi-head self-attention network can learn correlation matrices of different modes, which plays a very important role in improving the restoration result.
Specifically, as shown in fig. 2, this scheme adds a multi-head self-attention layer network after the last residual layer; the specific scheme is as follows (a code sketch of the whole layer follows step S2-3-2 below):
step S2-1: obtaining several groups of query, key and value feature maps by passing the feature map produced by the convolutional layers and the residual network through different convolutions;
specifically, as shown in fig. 5, the query feature map has size B_g × W_f × H_f × C_q, where B_g is the batch size of the hidden variables input to the generator, W_f the width of the query feature map, H_f its height and C_q its channel dimension. The key feature map has size B_g × W_f × H_f × C_k, its other parameters the same as the query feature map's, with C_k the channel dimension of the key feature map. The value feature map has size B_g × W_f × H_f × C_v, its other parameters the same as those of key and query, with C_v the channel dimension of the value feature map.
step S2-2: acquiring the reconstructed feature maps; as shown in fig. 3, the specific method includes:
step S2-2-1: transposing the key feature maps and performing a dot-product operation between each query feature map and its transposed key feature map, as shown in fig. 6, to obtain several groups of correlation attention matrices;
step S2-2-2: normalizing the correlation attention matrices, the dot-product matrices being normalized by a method such as Softmax;
step S2-2-3: matrix-multiplying each group's normalized correlation self-attention matrix with that group's value feature map, as shown in fig. 7, to obtain the group's reconstructed feature map. Each element of the value feature map is reconstructed as a weighted sum over the elements of the value feature map using that group's correlation attention matrix, the weights being the corresponding values of the correlation attention matrix.
Further, after the reconstructed feature maps are obtained, step S2-3 is performed, as shown in fig. 8:
step S2-3: splicing the reconstructed feature maps along the channel dimension to obtain the combined multi-head attention result;
step S2-4: transforming the result back to the size of the original input feature through a convolutional network, and adding the reconstructed feature map to the original feature map to obtain the final restored-picture output.
In one embodiment provided by the present invention, the specific method for obtaining the combined multi-head attention result is as follows:
step S2-3-1: obtaining the attention result of the i-th head:

head_i = Attention(Q_i, K_i, V_i) = softmax(Q_i K_i^T / \sqrt{d_k}) V_i

where Q_i, K_i and V_i denote the query, key and value matrices of the i-th head's feature maps and d_k is the key dimension;
step S2-3-2: splicing the self-attention results of all heads and using the W^O matrix to fuse and project the multiple feature spaces back to the size of the original matrix, finally obtaining the combined multi-head self-attention result:

MultiHead = Concat(head_1, head_2, ..., head_h) W^O
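Steps S2-1 through S2-4 can be collected into one module; the following PyTorch sketch is an assumption-laden illustration (1×1 convolutions produce the query/key/value maps, scaled dot products and softmax form the correlation attention matrices, the heads are concatenated along the channel dimension, a 1×1 convolution plays the role of the W^O projection, and the result is added residually to the input feature map):

import torch
import torch.nn as nn

class MultiHeadSelfAttention2d(nn.Module):
    def __init__(self, channels=256, heads=4):
        super().__init__()
        assert channels % heads == 0
        self.heads, self.dk = heads, channels // heads
        self.to_q = nn.Conv2d(channels, channels, 1)
        self.to_k = nn.Conv2d(channels, channels, 1)
        self.to_v = nn.Conv2d(channels, channels, 1)
        self.proj = nn.Conv2d(channels, channels, 1)  # W^O projection

    def forward(self, x):
        b, c, h, w = x.shape
        # step S2-1: query/key/value maps via different convolutions,
        # split into heads and flattened over the spatial grid
        q = self.to_q(x).view(b, self.heads, self.dk, h * w)
        k = self.to_k(x).view(b, self.heads, self.dk, h * w)
        v = self.to_v(x).view(b, self.heads, self.dk, h * w)
        # steps S2-2-1/S2-2-2: dot products, scaled and softmax-normalized
        attn = torch.softmax(
            torch.einsum('bhdn,bhdm->bhnm', q, k) / self.dk ** 0.5, dim=-1)
        # step S2-2-3: weighted reconstruction of each head's value map
        out = torch.einsum('bhnm,bhdm->bhdn', attn, v)
        # step S2-3: concatenate the heads along the channel dimension
        out = out.reshape(b, c, h, w)
        # step S2-4: project back to the input size and add the residual
        return x + self.proj(out)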
In conclusion, although this scheme adds a multi-head self-attention layer network, the output feature size is unchanged, and the long-distance information processed by the several heads participates more fully, so the image restoration effect is improved.
In the present invention, further, the image processed by the edge restoration model is used by the image restoration model for repair; the specific restoration method includes:
step S23: taking the tensor formed by splicing the restored edge image and the damaged image as input, obtaining the predicted restored image, and from it the restored image:

I_p = G_i(M, C^+, I_M)

I^+ = I \cdot (1 - M) + I_p \cdot M

where I_p is the predicted restored picture, I the real picture, G_i the image restorer, C^+ the restored edge map and I_M the defect map.
Step S24: and calculating an image restoration loss function and optimizing the restoration result of the image restoration model, wherein the image restoration loss function comprises image confrontation loss, style loss and perception loss.
For example, the image adversarial loss \mathcal{L}_{adv,i} is analogous to the edge adversarial loss generated by the edge restoration model:

\mathcal{L}_{adv,i} = \mathbb{E}_{(I, C^+)}[\log D_i(I, C^+)] + \mathbb{E}_{C^+}[\log(1 - D_i(I_p, C^+))]

where D_i denotes the image discriminator.
furthermore, the first appearance of style loss was proposed in the image style migration task, and in a new improvement, the artifact problem existing in deconvolution was alleviated by introducing gram matrix (GramMatrix). Model adoption of the textA loss function based on the gram matrix style loss is presented. Loss function thereof
Figure BDA0003357238620000133
The expression is as follows:
Figure BDA0003357238620000134
wherein the content of the first and second substances,
Figure BDA0003357238620000135
representing the gray matrix, Gr, formed by the inner products of predicted image vectorsi(IM) A gram matrix representing the inner product of the vectors of the real image, cihiwiRepresenting the dimension of the activation feature. Four activation layers, relu-2, relu3-4, relu4-4 and relu5-2 in the VGG19 network are selected.
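A sketch of this Gram-matrix style loss, assuming the feature activations of the predicted and real images at the chosen VGG19 layers have already been extracted into two lists:

import torch
import torch.nn.functional as F

def gram_matrix(feat):
    # channel-by-channel inner products, normalized by c_i * h_i * w_i
    b, c, h, w = feat.shape
    f = feat.view(b, c, h * w)
    return torch.bmm(f, f.transpose(1, 2)) / (c * h * w)

def style_loss(feats_pred, feats_real):
    # L1 distance between the Gram matrices of I_p and I, summed over layers
    return sum(F.l1_loss(gram_matrix(p), gram_matrix(r))
               for p, r in zip(feats_pred, feats_real))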
In addition, the perceptual loss penalizes generated images that do not accord with human perception of the real image by defining a distance measure between pre-trained activation layers; \mathcal{L}_{perc} can be defined as:

\mathcal{L}_{perc} = \mathbb{E}\left[\sum_{i} w_i \left\| \phi_i(I) - \phi_i(I_p) \right\|_1\right]

where \phi_i corresponds to the five activation layers relu1-1, relu2-1, relu3-1, relu4-1 and relu5-1 of the pre-trained VGG19 network, and w_i are weight parameters (all set to 1 in this scheme).
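A hedged sketch of this perceptual loss with torchvision's pre-trained VGG19; the feature indices for relu1-1 through relu5-1 are an assumption of this sketch, and the weights w_i default to 1 as stated above:

import torch
import torch.nn.functional as F
from torchvision import models

class PerceptualLoss(torch.nn.Module):
    RELU_IDX = (1, 6, 11, 20, 29)  # relu1_1, relu2_1, relu3_1, relu4_1, relu5_1

    def __init__(self):
        super().__init__()
        vgg = models.vgg19(weights=models.VGG19_Weights.DEFAULT).features.eval()
        for p in vgg.parameters():
            p.requires_grad_(False)  # frozen, pre-trained feature extractor
        self.vgg = vgg

    def forward(self, pred, real, weights=(1, 1, 1, 1, 1)):
        loss, x, y = 0.0, pred, real
        for i, layer in enumerate(self.vgg):
            x, y = layer(x), layer(y)
            if i in self.RELU_IDX:
                loss = loss + weights[self.RELU_IDX.index(i)] * F.l1_loss(x, y)
            if i >= self.RELU_IDX[-1]:
                break
        return loss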
In summary, the loss function of the image restoration model combines several losses and can be jointly calculated as:

\mathcal{L}_{G_i} = \lambda_3 \mathcal{L}_{adv,i} + \lambda_4 \mathcal{L}_{style} + \lambda_5 \mathcal{L}_{perc} + \lambda_6 \mathcal{L}_{\ell_1}

where λ3, λ4, λ5 and λ6 are user-defined hyperparameters, \mathcal{L}_{adv,i} is the adversarial loss generated by the image restoration model, \mathcal{L}_{style} the style loss, \mathcal{L}_{perc} the perceptual loss and \mathcal{L}_{\ell_1} the ℓ1 reconstruction loss.
In the invention, further, after the training of the edge-prior fusion multi-head attention restoration model is completed, the restoration results of the model are tested and evaluated on the test set; this part is completed mainly on two 1080 Ti GPUs using the PyTorch learning framework. The quality and restoration effect of the model are evaluated by four indexes: peak signal-to-noise ratio (PSNR), structural similarity (SSIM), ℓ1 error and Fréchet inception distance (FID).
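Three of the four indexes can be computed per image pair with scikit-image and NumPy, as sketched below; FID is computed over the whole test set with a separate tool such as the pytorch-fid package, and the metric settings here are assumptions:

import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_pair(real, repaired):
    # real, repaired: HxWx3 float arrays scaled to [0, 1]
    psnr = peak_signal_noise_ratio(real, repaired, data_range=1.0)
    ssim = structural_similarity(real, repaired, channel_axis=-1, data_range=1.0)
    l1 = float(np.mean(np.abs(real - repaired)))
    return psnr, ssim, l1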
In addition, fig. 11 displays a restoration result of the edge-prior fusion multi-head attention restoration model. From left to right, the first image is the original; the second is the image to be restored, covered by the binary mask; the third is the image restored by the edge restoration model; and the fourth and fifth are result images restored by the image restoration model. The image restored by the model is very similar to the original; at some completely missing parts the restored content differs from the original, but not noticeably to human observation. The scheme thus has a good restoration effect and can reasonably repair the missing parts, and the results show that the network performs better than expected when a multi-head attention mechanism is fused for image restoration.
The above description is intended to describe in detail the preferred embodiments of the present invention, but the embodiments are not intended to limit the scope of the claims of the present invention, and all equivalent changes and modifications made within the technical spirit of the present invention should fall within the scope of the claims of the present invention.

Claims (10)

1. An image restoration method with edge prior fusion of a multi-head attention mechanism, characterized by comprising
step S1: acquiring experimental data and preprocessing the data, wherein the experimental data comprise a training set and a test set, and extracting the edge map of each preprocessed image;
step S2: constructing an edge-prior fusion multi-head attention restoration model, which comprises an edge restoration model and an image restoration model, wherein the edge restoration model takes the extracted edge map, the original image and the mask image as input and outputs the restored edge map, and the image restoration model is trained with the restored edge map and the defect image as input;
the image restoration model comprises an image restorer, which generates the restored picture after downsampling the restored edge image several times and applying several dilated-convolution-based residual blocks, one multi-head attention network and two deconvolutions;
step S3: evaluating the results of the edge-prior fusion multi-head attention restoration model on the test set.
2. The method of claim 1, wherein the edge restoration model comprises an edge restorer, which downsamples the extracted edge map, the original image and the mask image and, after several dilated-convolution-based residual blocks and two deconvolutions, converts the feature map into a single-channel edge map.
3. The image restoration method with edge prior fusion of a multi-head attention mechanism according to claim 2, wherein the restoration method of the edge restoration model comprises:
step S20: obtaining the predicted edge restoration result of the edge restorer and, from it, the generation result of the edge restoration model, which keeps the image edges of the known region and fills in the edge portions to be repaired in the missing region, as follows:

C_p = G_e(M, C, I_{gray})

C^+ = C \cdot (1 - M) + C_p \cdot M

where C_p represents the predicted edge-restored image, G_e the edge restorer, M the mask image, C the edge map of the image to be restored, I_{gray} the grayscale map of the image to be restored, and C^+ the restored edge image generated by the edge restoration model.
4. The image restoration method with edge prior fusion of a multi-head attention mechanism according to claim 3, wherein the restoration method of the edge restoration model further comprises
step S21: calculating the loss function of the edge restorer, which is a weighted sum of the generated-edge adversarial loss and the edge feature loss;
step S22: optimizing the generation result of the edge restoration model to obtain the restored edge image.
5. The image restoration method with edge prior fusion of a multi-head attention mechanism according to claim 1, wherein the restoration method of the image restoration model comprises:
step S23: taking the tensor formed by splicing the restored edge image and the damaged image as input, obtaining the predicted restored image, and from it the restored image:

I_p = G_i(M, C^+, I_M)

I^+ = I \cdot (1 - M) + I_p \cdot M

where I_p is the predicted restored picture, I the real picture, G_i the image restorer and C^+ the restored edge map;
step S24: calculating the image restoration loss function and optimizing the restoration result of the image restoration model, wherein the image restoration loss function comprises the image adversarial loss, the style loss and the perceptual loss, jointly calculated as:

\mathcal{L}_{G_i} = \lambda_3 \mathcal{L}_{adv,i} + \lambda_4 \mathcal{L}_{style} + \lambda_5 \mathcal{L}_{perc} + \lambda_6 \mathcal{L}_{\ell_1}

where λ3, λ4, λ5 and λ6 are user-defined hyperparameters, \mathcal{L}_{adv,i} is the adversarial loss generated by the image restoration model, \mathcal{L}_{style} the style loss, \mathcal{L}_{perc} the perceptual loss and \mathcal{L}_{\ell_1} the ℓ1 reconstruction loss.
6. The image restoration method with edge prior fusion of a multi-head attention mechanism according to claim 1, wherein the image restorer generating the restored picture after several downsamplings of the restored edge image, several dilated-convolution-based residual blocks, one multi-head attention network and two deconvolutions comprises:
step S2-1: obtaining several groups of query, key and value feature maps by passing the feature map produced by the convolutional layers and the residual network through different convolutions;
step S2-2: acquiring the reconstructed feature maps;
step S2-3: splicing the reconstructed feature maps along the channel dimension to obtain the combined multi-head attention result;
step S2-4: transforming the result back to the size of the original input feature through a convolutional network, and adding the reconstructed feature map to the original feature map to obtain the final restored-picture output.
7. The image restoration method with edge prior fusion of a multi-head attention mechanism according to claim 6, wherein the acquiring of the reconstructed feature maps in step S2-2 includes:
step S2-2-1: transposing the key feature maps and performing a dot-product operation between each query feature map and its transposed key feature map to obtain several groups of correlation attention matrices;
step S2-2-2: normalizing the correlation attention matrices;
step S2-2-3: matrix-multiplying each group's normalized correlation self-attention matrix with that group's value feature map to obtain the group's reconstructed feature map.
8. The image restoration method with edge prior fusion of a multi-head attention mechanism according to claim 6, wherein the step S2-3 of splicing the reconstructed feature maps along the channel dimension to obtain the combined multi-head attention result includes:
step S2-3-1: obtaining the attention result of the i-th head:

head_i = Attention(Q_i, K_i, V_i) = softmax(Q_i K_i^T / \sqrt{d_k}) V_i

where Q_i, K_i and V_i denote the query, key and value matrices of the i-th head's feature maps;
step S2-3-2: splicing the self-attention results of all heads and using the W^O matrix to fuse and project the multiple feature spaces back to the size of the original matrix, finally obtaining the combined multi-head self-attention result:

MultiHead = Concat(head_1, head_2, ..., head_h) W^O
9. The image restoration method with edge prior fusion of a multi-head attention mechanism according to claim 5, wherein the style loss is calculated by:

\mathcal{L}_{style} = \mathbb{E}_i\left[\left\| Gr_i(I_p) - Gr_i(I)\right\|_1\right]

where Gr_i(I_p) denotes the Gram matrix formed by the inner products of the predicted image's feature vectors, Gr_i(I) the Gram matrix of the real image's feature vectors, and c_i h_i w_i the dimensions of the activation feature that normalize each Gram matrix.
10. The image restoration method with edge prior fusion of a multi-head attention mechanism according to claim 7, wherein in step S2-2-3 each element of the value feature map is reconstructed as a weighted sum over the elements of the value feature map using that group's correlation attention matrix, the weights being the corresponding values of the correlation attention matrix.
CN202111356234.6A 2021-11-16 2021-11-16 Image restoration method for edge prior fusion multi-head attention mechanism Active CN114022506B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111356234.6A CN114022506B (en) 2021-11-16 2021-11-16 Image restoration method for edge prior fusion multi-head attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111356234.6A CN114022506B (en) 2021-11-16 2021-11-16 Image restoration method for edge prior fusion multi-head attention mechanism

Publications (2)

Publication Number Publication Date
CN114022506A true CN114022506A (en) 2022-02-08
CN114022506B CN114022506B (en) 2024-05-17

Family

ID=80065024

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111356234.6A Active CN114022506B (en) 2021-11-16 2021-11-16 Image restoration method for edge prior fusion multi-head attention mechanism

Country Status (1)

Country Link
CN (1) CN114022506B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116188875A (en) * 2023-03-29 2023-05-30 北京百度网讯科技有限公司 Image classification method, device, electronic equipment, medium and product
CN117351015A (en) * 2023-12-05 2024-01-05 中国海洋大学 Tamper detection method and system based on edge supervision and multi-domain cross correlation
CN117649365A (en) * 2023-11-16 2024-03-05 西南交通大学 Paper book graph digital restoration method based on convolutional neural network and diffusion model

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111127346A (en) * 2019-12-08 2020-05-08 复旦大学 Multi-level image restoration method based on partial-to-integral attention mechanism
CN113112411A (en) * 2020-01-13 2021-07-13 南京信息工程大学 Human face image semantic restoration method based on multi-scale feature fusion
CN113240613A (en) * 2021-06-07 2021-08-10 北京航空航天大学 Image restoration method based on edge information reconstruction
CN113379655A (en) * 2021-05-18 2021-09-10 电子科技大学 Image synthesis method for generating antagonistic network based on dynamic self-attention
US20210342983A1 (en) * 2020-04-29 2021-11-04 Adobe Inc. Iterative image inpainting with confidence feedback

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111127346A (en) * 2019-12-08 2020-05-08 复旦大学 Multi-level image restoration method based on partial-to-integral attention mechanism
CN113112411A (en) * 2020-01-13 2021-07-13 南京信息工程大学 Human face image semantic restoration method based on multi-scale feature fusion
US20210342983A1 (en) * 2020-04-29 2021-11-04 Adobe Inc. Iterative image inpainting with confidence feedback
CN113379655A (en) * 2021-05-18 2021-09-10 电子科技大学 Image synthesis method for generating antagonistic network based on dynamic self-attention
CN113240613A (en) * 2021-06-07 2021-08-10 北京航空航天大学 Image restoration method based on edge information reconstruction

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LI Ju; HUANG Wenpei: "Research on Image Inpainting Technology Based on Generative Adversarial Networks" (基于生成对抗网络的图像修复技术研究), Computer Applications and Software (计算机应用与软件), no. 12, 12 December 2019 (2019-12-12) *
SHAO Hang; WANG Yongxiong: "Generative High-Resolution Image Inpainting Based on Parallel Adversarial and Multi-Condition Fusion" (基于并行对抗与多条件融合的生成式高分辨率图像修复), Pattern Recognition and Artificial Intelligence (模式识别与人工智能), no. 04, 15 April 2020 (2020-04-15) *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116188875A (en) * 2023-03-29 2023-05-30 北京百度网讯科技有限公司 Image classification method, device, electronic equipment, medium and product
CN116188875B (en) * 2023-03-29 2024-03-01 北京百度网讯科技有限公司 Image classification method, device, electronic equipment, medium and product
CN117649365A (en) * 2023-11-16 2024-03-05 西南交通大学 Paper book graph digital restoration method based on convolutional neural network and diffusion model
CN117351015A (en) * 2023-12-05 2024-01-05 中国海洋大学 Tamper detection method and system based on edge supervision and multi-domain cross correlation
CN117351015B (en) * 2023-12-05 2024-03-19 中国海洋大学 Tamper detection method and system based on edge supervision and multi-domain cross correlation

Also Published As

Publication number Publication date
CN114022506B (en) 2024-05-17

Similar Documents

Publication Publication Date Title
CN113240580B (en) Lightweight image super-resolution reconstruction method based on multi-dimensional knowledge distillation
CN110119780B (en) Hyper-spectral image super-resolution reconstruction method based on generation countermeasure network
CN114022506B (en) Image restoration method for edge prior fusion multi-head attention mechanism
CN108648197B (en) Target candidate region extraction method based on image background mask
CN111062872A (en) Image super-resolution reconstruction method and system based on edge detection
CN111709902A (en) Infrared and visible light image fusion method based on self-attention mechanism
CN107977932A (en) It is a kind of based on can differentiate attribute constraint generation confrontation network face image super-resolution reconstruction method
CN111787187B (en) Method, system and terminal for repairing video by utilizing deep convolutional neural network
CN111784624B (en) Target detection method, device, equipment and computer readable storage medium
CN115018727A (en) Multi-scale image restoration method, storage medium and terminal
CN114897742B (en) Image restoration method with texture and structural features fused twice
CN112163998A (en) Single-image super-resolution analysis method matched with natural degradation conditions
CN116468645A (en) Antagonistic hyperspectral multispectral remote sensing fusion method
Ma et al. Multi-task interaction learning for spatiospectral image super-resolution
CN113298736A (en) Face image restoration method based on face pattern
Luo et al. Bi-GANs-ST for perceptual image super-resolution
CN112884657B (en) Face super-resolution reconstruction method and system
CN117197627B (en) Multi-mode image fusion method based on high-order degradation model
CN117291803B (en) PAMGAN lightweight facial super-resolution reconstruction method
CN116523985B (en) Structure and texture feature guided double-encoder image restoration method
CN111401209B (en) Action recognition method based on deep learning
Ren et al. A lightweight object detection network in low-light conditions based on depthwise separable pyramid network and attention mechanism on embedded platforms
CN114862699B (en) Face repairing method, device and storage medium based on generation countermeasure network
CN114764754B (en) Occlusion face restoration method based on geometric perception priori guidance
CN116071229A (en) Image super-resolution reconstruction method for wearable helmet

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant