CN114022506A - Image restoration method with edge prior fusion multi-head attention mechanism - Google Patents
- Publication number
- CN114022506A (application CN202111356234.6A)
- Authority
- CN
- China
- Prior art keywords
- image
- edge
- repairing
- restoration
- attention
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/13—Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4038—Image mosaicing, e.g. composing plane images from plane sub-images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention relates to the technical field of image restoration and discloses an image restoration method with an edge-prior fusion multi-head attention mechanism, comprising the following steps. Step S1: acquire and preprocess experimental data, the experimental data comprising a training set and a test set, and extract an edge map from each preprocessed image. Step S2: construct an edge-prior fusion multi-head attention restoration model comprising an edge restoration model and an image restoration model, wherein the edge restoration model takes the extracted edge map, the original image and a mask image as input and outputs a restored edge map, and the image restoration model is trained with the restored edge map and the defect image as input. The method fuses a multi-head attention mechanism and, by extracting richer long-distance dependencies between pixels, improves the image restoration effect.
Description
Technical Field
The invention relates to the technical field of image restoration, in particular to an image restoration method with an edge prior fusion multi-head attention mechanism.
Background
In the information society, images are among the most important sources of information. How to obtain more complete and clearer images has become a hotspot in the field of computer vision, with related applications including image restoration and super-resolution. Image inpainting refers to a technique for recovering a complete image from the remaining image information in a damaged image. This is not a difficult task for the human eye, but it is a rather challenging one for computer vision. The technique has many practical uses, such as image restoration (removing photo scratches and text occlusion), photo editing (removing unwanted objects), and image coding and transmission (recovering image blocks lost to network packet loss during transmission). Image restoration has therefore been a very popular research area in recent years.
At present, image restoration based on generative adversarial networks has become mainstream: a network model can generate images that are close to the training data yet do not actually exist, achieving results realistic enough to pass as genuine.
Disclosure of Invention
Aiming at the defects in the prior art, the invention provides an image restoration method with an edge-prior fusion multi-head attention mechanism.
In order to achieve the above purpose, the invention provides the following technical scheme:
an image restoration method with an edge-prior fusion multi-head attention mechanism comprises
step S1: acquiring and preprocessing experimental data, the experimental data comprising a training set and a test set, and extracting an edge map from each preprocessed image;
step S2: constructing an edge-prior fusion multi-head attention restoration model comprising an edge restoration model and an image restoration model, wherein the edge restoration model takes the extracted edge map, the original image and a mask image as input and outputs a restored edge map, and the image restoration model is trained with the restored edge map and the defect image as input;
the image restoration model comprises an image restorer, which generates the restored picture after several downsampling steps on the restored edge map, several dilated-convolution residual blocks, one multi-head attention network and two deconvolutions;
step S3: evaluating the edge-prior fusion multi-head attention restoration model on the test set.
In the present invention, the edge restoration model further includes an edge restorer, which downsamples the extracted edge map, original image and mask image, and converts the feature map into a single-channel edge map after several dilated-convolution residual blocks and two deconvolutions.
In the present invention, further, the restoration method of the edge restoration model comprises:
step S20: obtaining the predicted edge restoration result of the edge restorer and, from it, the generation result of the edge restoration model, retaining the image edges of the known region and filling in the edge portions to be restored in the missing region, as follows:
$$C_p = G_e(M, C, I_{gray})$$
$$C^{+} = C \cdot (1 - M) + C_p \cdot M$$
wherein $C_p$ denotes the predicted edge-restored image, $G_e$ the edge restorer, $M$ the mask image, $C$ the edge map of the image to be restored, $I_{gray}$ the grayscale map of the image to be restored, and $C^{+}$ the restored edge map generated by the edge restoration model (products are element-wise).
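The composition step above can be sketched directly in NumPy. This is a minimal illustration of the formula $C^{+} = C \cdot (1-M) + C_p \cdot M$, not the patent's implementation; the names `compose_edges`, `C`, `Cp`, `M` are illustrative.

```python
import numpy as np

def compose_edges(edge_map, pred_edge, mask):
    """Keep known-region edges, fill the masked region with predictions.

    Implements C+ = C * (1 - M) + C_p * M, where M is 1 in the
    missing region and 0 in the known region.
    """
    return edge_map * (1 - mask) + pred_edge * mask

# Toy example: a 2x2 edge map whose right column is masked out.
C = np.array([[1.0, 0.0],
              [0.0, 1.0]])   # edges of the damaged image
Cp = np.array([[0.0, 1.0],
               [1.0, 0.0]])  # edges predicted by the edge restorer
M = np.array([[0.0, 1.0],
              [0.0, 1.0]])   # 1 = missing pixel

C_plus = compose_edges(C, Cp, M)
# Known column is taken from C, masked column from Cp.
```

The known region is left untouched by construction, which is exactly why the model only needs to learn plausible edges inside the mask.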
In the present invention, further, the restoration method of the edge restoration model further comprises:
step S21: calculating the loss function of the edge restorer, which is a weighted sum of the edge adversarial loss and the edge feature-matching loss;
step S22: optimizing the generation result of the edge restoration model to obtain the restored edge map.
In the present invention, further, the image restoration method includes:
step S23: obtaining a predicted repaired image by taking the tensor of the spliced repaired edge image and the damaged image as input, and obtaining the repaired image according to the predicted repaired image:
Ip=Gi(M,C+IM)
I+=I·(1-M)+Ip·M
wherein, IpFor predicted restored pictures, I is the real picture, GiAs an image inpainting device C+To repair edge images, IM;
step S24: calculating the image restoration loss function and optimizing the restoration result of the image restoration model, wherein the image restoration loss comprises the image adversarial loss, the style loss, the perceptual loss and an $\ell_1$ reconstruction term, combined as a weighted sum:
$$\mathcal{L}_{G_i} = \lambda_3 \mathcal{L}_{\ell_1} + \lambda_4 \mathcal{L}_{adv} + \lambda_5 \mathcal{L}_{perc} + \lambda_6 \mathcal{L}_{style}$$
wherein $\lambda_3$, $\lambda_4$, $\lambda_5$ and $\lambda_6$ are user-defined hyperparameters, $\mathcal{L}_{adv}$ is the adversarial loss generated by the image restoration model, $\mathcal{L}_{style}$ is the style loss, and $\mathcal{L}_{perc}$ is the perceptual loss.
In the present invention, further, the image restorer generating the restored picture after several downsampling steps on the restored edge map, several dilated-convolution residual blocks, one multi-head attention network and two deconvolutions includes: step S2-1: obtaining several groups of query, key and value feature maps by applying different convolutional transformations to the feature map produced by the convolutional layers and the residual network;
step S2-2: acquiring a reconstructed characteristic map;
step S2-3: splicing the reconstructed feature maps according to the channel dimensions to obtain a plurality of attention combination results;
step S2-4: projecting the combined result back to the size of the original input feature map through a convolutional transformation, and adding the reconstructed feature map to the original feature map to obtain the final output restored picture.
In the present invention, further, the acquiring of the reconstructed feature map in step S2-2 includes:
step S2-2-1: transposing the key feature maps, and performing a dot-product operation between each query feature map and its transposed key feature map to obtain several groups of correlation attention matrices;
step S2-2-2: normalizing each correlation attention matrix;
step S2-2-3: matrix-multiplying each group's normalized self-attention matrix with that group's value feature map to obtain the group's reconstructed feature map.
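Steps S2-2-1 through S2-2-3 amount to one group of dot-product attention over flattened feature maps. A minimal NumPy sketch, with illustrative sizes and softmax as the normalization (the patent names "Softmax and the like"; no scaling factor is applied here since these steps do not mention one):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def reconstruct(query, key, value):
    """One attention group, as in steps S2-2-1..S2-2-3.

    query, key: (N, d_k) flattened feature maps; value: (N, d_v).
    """
    attn = query @ key.T           # S2-2-1: dot products -> correlation attention matrix
    attn = softmax(attn, axis=-1)  # S2-2-2: normalize each row
    return attn @ value, attn      # S2-2-3: weighted reconstruction of the value map

rng = np.random.default_rng(0)
N, dk, dv = 6, 4, 5                # illustrative sizes, not from the patent
Q = rng.normal(size=(N, dk))
K = rng.normal(size=(N, dk))
V = rng.normal(size=(N, dv))
out, attn = reconstruct(Q, K, V)
```

Each row of `attn` sums to 1, so each output position is a convex combination of all value positions — the "long-distance dependence" the text refers to.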
The step S2-3 of splicing the reconstructed feature maps along the channel dimension to obtain the multi-head attention combination result includes:
step S2-3-1: obtaining the attention result of the i-th head:
$$head_i = Attention(Q_i, K_i, V_i) = softmax\!\left(\frac{Q_i K_i^{T}}{\sqrt{d_k}}\right) V_i$$
wherein $Q_i$, $K_i$ and $V_i$ denote the query, key and value matrices of the i-th head's feature maps, and $d_k$ is the key dimension;
step S2-3-2: concatenating the self-attention results of all heads, fusing and projecting the multiple feature spaces back to the original matrix size with a learned matrix $W^{O}$, finally obtaining the multi-head self-attention combination result:
$$MultiHead = Concat(head_1, head_2, \ldots, head_h)\, W^{O}$$
The style loss is calculated as:
$$\mathcal{L}_{style} = \mathbb{E}_i\!\left[\,\left\| Gr_i(I_p) - Gr_i(I_M) \right\|_1 \right]$$
wherein $Gr_i(I_p)$ denotes the Gram matrix formed by the inner products of the predicted image's activation vectors at layer $i$, $Gr_i(I_M)$ the Gram matrix of the real image's activation vectors, and $c_i h_i w_i$ the dimension of the activation feature, used to normalize each Gram matrix.
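A Gram matrix and the resulting style loss can be sketched as follows. This is an illustrative NumPy version under the assumption that activations are given as `(channels, height, width)` arrays; the per-layer L1 distance is averaged here for scale stability, which is a common implementation choice rather than something the patent specifies.

```python
import numpy as np

def gram(features):
    """Gram matrix of one activation map.

    features: (c, h, w) activation; returns a (c, c) matrix of channel
    inner products, normalized by c*h*w as in the style-loss formula.
    """
    c, h, w = features.shape
    f = features.reshape(c, h * w)
    return (f @ f.T) / (c * h * w)

def style_loss(pred_feats, real_feats):
    """Mean absolute difference between Gram matrices, summed over layers."""
    return sum(np.abs(gram(p) - gram(r)).mean()
               for p, r in zip(pred_feats, real_feats))

rng = np.random.default_rng(1)
# Two illustrative activation layers with different channel counts.
pred = [rng.normal(size=(3, 4, 4)), rng.normal(size=(5, 2, 2))]
real = [rng.normal(size=(3, 4, 4)), rng.normal(size=(5, 2, 2))]
loss = style_loss(pred, real)
```

Because the Gram matrix discards spatial arrangement and keeps only channel correlations, this loss penalizes texture/style mismatch rather than pixel-wise error.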
In the present invention, further, in step S2-2-3, each element of the value feature map is reconstructed as a weighted sum over all value-feature-map elements, the weight of each contributing element being the corresponding entry of that group's correlation attention matrix.
Compared with the prior art, the invention has the beneficial effects that:
according to the method, the multi-head attention network capable of capturing the long-distance relationship among the richer pixel regions is added behind the last residual layer of the image restoration model, in order to enable the model to learn information in different subspaces, each head is used for processing different information by using a plurality of parallel repeated attention calculations, so that the characteristics of different parts can be processed, the richer long-distance dependency relationship is extracted, the multi-head self-attention network can learn the correlation matrixes of different modes, the method has an important effect on improving the restoration result, and the image restoration effect is improved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is an overall flowchart of an image inpainting method with edge prior fusion with a multi-head attention mechanism according to the present invention;
FIG. 2 is a flowchart of step S2 in the image inpainting method of the present invention in which the multi-head attention mechanism is fused with edge priors;
FIG. 3 is a flowchart of the steps S2-2 and S2-3 in the image inpainting method of the present invention with edge prior fusion multi-head attention mechanism;
FIG. 4 is a flowchart of an implementation of a method for repairing an edge repairing model in an image repairing method with edge prior fusion with a multi-head attention mechanism according to the present invention;
FIG. 5 is a schematic diagram of obtaining query, key, value feature maps in the image restoration method with edge prior fusion multi-head attention mechanism of the present invention;
FIG. 6 is a schematic diagram of a flow chart for obtaining a correlation attention matrix in an image restoration method with edge prior fusion multi-head attention mechanism according to the present invention;
FIG. 7 is a schematic flow chart of a reconstructed feature map in an image restoration method with edge prior fusion with a multi-head attention mechanism according to the present invention;
FIG. 8 is a network architecture diagram of a multi-head self-attention layer in the image restoration method with edge prior fusion of a multi-head attention mechanism according to the present invention;
FIG. 9 is a schematic diagram of an edge restoration model construction framework in the image restoration method with edge prior fusion with a multi-head attention mechanism according to the present invention;
FIG. 10 is a schematic diagram of a framework for constructing an edge image restoration model in an image restoration method with an edge prior fusion multi-head attention mechanism according to the present invention;
FIG. 11 is a schematic diagram of an experimental result of the image inpainting method of the edge prior fusion multi-head attention mechanism of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It will be understood that when an element is referred to as being "secured to" another element, it can be directly on the other element or intervening elements may also be present. When a component is referred to as being "connected" to another component, it can be directly connected to the other component or intervening components may also be present. When a component is referred to as being "disposed on" another component, it can be directly on the other component or intervening components may also be present. The terms "vertical," "horizontal," "left," "right," and the like as used herein are for illustrative purposes only.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
Referring to fig. 1, a preferred embodiment of the present invention provides an image restoration method with an edge-prior fusion multi-head attention mechanism, which comprises
step S1: acquiring and preprocessing experimental data, the experimental data comprising a training set and a test set, and extracting an edge map from each preprocessed image;
step S2: constructing an edge-prior fusion multi-head attention restoration model comprising an edge restoration model and an image restoration model, wherein the edge restoration model takes the extracted edge map, the original image and a mask image as input and outputs a restored edge map, and the image restoration model is trained with the restored edge map and the defect image as input;
the image restoration model comprises an image restorer, which generates the restored picture after several downsampling steps on the restored edge map, several dilated-convolution residual blocks, one multi-head attention network and two deconvolutions;
step S3: evaluating the edge-prior fusion multi-head attention restoration model on the test set.
Specifically, according to the scheme, a certain number of intact images of the relevant kind are collected according to the experimental requirements to complete data collection; the data are then given preliminary processing with preprocessing techniques to obtain data meeting the standard, and the dataset is divided into a training set and a test set. The image restoration model is then built step by step according to the algorithm design; after the model is built, it is trained on the training set, and its effect is tested and evaluated on the test set. By fusing a multi-head attention mechanism into the edge-prior restoration model, richer long-distance dependencies between pixels are extracted, improving the image restoration effect.
In the present invention, images from the public CelebA dataset are resized to 256 × 256 and used in the experiments. Because the dataset is not pre-divided into training, validation and test sets, the first 180,000 pictures are selected for training the model, and 4,000 pictures are selected as the test set for analyzing and comparing the experimental results. In addition, when the model is trained, the mask images used are taken from an irregular-mask dataset; the irregular mask images in that dataset are divided into six groups according to the proportion of the missing-region area to the whole image — 0–10%, 10–20%, 20–30%, 30–40%, 40–50% and 50–60%. Each group comprises 2,000 images, of which 1,000 mask images represent the case where the image boundary is missing, and the other 1,000 represent the case where the image boundary is intact.
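The six-way grouping by missing-area ratio can be sketched as a small helper. This is an illustrative function (the name `mask_group` is not from the patent); it assumes a binary mask where 1 marks a missing pixel.

```python
import numpy as np

def mask_group(mask):
    """Bucket a binary mask (1 = missing) into the six ratio groups
    0-10%, 10-20%, ..., 50-60% described above.

    Returns a group index 0..5, or None if the missing-area ratio
    falls outside the 0-60% range used by the dataset.
    """
    ratio = mask.mean()            # fraction of missing pixels
    if ratio > 0.6:
        return None
    return min(int(ratio * 10), 5)

m = np.zeros((256, 256))
m[:64, :64] = 1                    # 64*64 / 256*256 = 6.25% missing
```

`mask_group(m)` falls into group 0 (the 0–10% bucket), which is how masks of a known difficulty level are selected for evaluation.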
For the original input images of the training set, the edge map is extracted with Canny edge detection, which proceeds in four steps: Gaussian filtering; computing gradient magnitudes and directions; non-maximum suppression; and edge tracking with upper and lower (hysteresis) thresholds. This yields the edge map of the original image; edge detection is then performed with the binary mask, and the restored edge map is generated by the generative adversarial network. For the test task, 4,000 pictures are taken from the CelebA dataset as the test set, and random missing-mask maps are used to simulate the missing regions; the hand-drawn mask maps used for testing are divided into four groups, 1 to 4, according to missing-area ratio from small to large, each group containing 1,000 pictures, 4,000 in total.
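The patent uses full Canny edge detection (Gaussian filtering, gradients, non-maximum suppression, hysteresis thresholding). As a deliberately simplified sketch of where an edge map comes from, the following NumPy code keeps only the gradient-magnitude step; it is not Canny and is labeled as such, and the function name and threshold are illustrative.

```python
import numpy as np

def simple_edges(gray, threshold=0.25):
    """Simplified edge extraction: gradient-magnitude threshold only.

    Unlike full Canny, there is no Gaussian smoothing, non-maximum
    suppression, or hysteresis -- this just shows the shape of the
    binary edge map the edge restorer consumes.
    """
    gy, gx = np.gradient(gray.astype(float))  # gradients along rows, cols
    mag = np.hypot(gx, gy)                    # gradient magnitude
    return (mag > threshold).astype(np.uint8) # binary edge map

img = np.zeros((8, 8))
img[:, 4:] = 1.0                  # vertical step edge around column 4
edges = simple_edges(img)
```

The output is a single-channel binary map of the same size as the input, matching the single-channel edge map described for the edge restorer.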
In the present invention, as shown in fig. 9, the edge restoration model includes an edge restorer, which downsamples the extracted edge map, the original image and the mask image, and converts the feature map into a single-channel edge map after several dilated-convolution residual blocks and two deconvolutions. Specifically, the feature map passes through 3 downsampling steps, 8 dilated-convolution residual blocks and 2 deconvolutions before being converted into the single-channel edge map.
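The spatial bookkeeping of that pipeline can be traced with a few lines of plain Python. Note one assumption made here: the patent counts three sampling steps but only two deconvolutions, so for the output to return to full resolution the first sampling layer is presumably a stride-1 convolution (as in EdgeConnect-style generators) with only the last two strided — that choice, and all layer names below, are illustrative.

```python
def edge_restorer_shapes(h, w):
    """Track spatial size through the edge restorer described above:
    one assumed stride-1 convolution, two stride-2 downsampling
    convolutions, 8 dilated-convolution residual blocks
    (size-preserving), then 2 stride-2 deconvolutions back up.
    Channel counts are omitted; only spatial dims are tracked.
    """
    trace = [("input", h, w)]
    trace.append(("conv_stride1", h, w))          # assumed stride-1 first layer
    for i in range(2):                            # strided downsampling
        h, w = h // 2, w // 2
        trace.append((f"down{i+1}_stride2", h, w))
    trace.append(("dilated_residual_x8", h, w))   # 8 blocks, size unchanged
    for i in range(2):                            # 2 deconvolutions
        h, w = h * 2, w * 2
        trace.append((f"deconv{i+1}_stride2", h, w))
    trace.append(("to_edge_map_1ch", h, w))       # single-channel output
    return trace

shapes = edge_restorer_shapes(256, 256)
```

Running this for the 256 × 256 experimental images confirms the dilated residual blocks operate at 64 × 64 and the single-channel edge map comes out at the input resolution.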
Specifically, in the present invention, as shown in fig. 4, the repairing method of the edge repairing model is as follows:
step S20: the edge restorer concatenates the edge map, the mask image and the grayscale map of the image to be restored into one tensor as input, obtaining the predicted edge restoration result; the generation result of the edge restoration model is then obtained from the prediction, retaining the image edges of the known region and filling in the edge portions to be restored in the missing region, as follows:
$$C_p = G_e(M, C, I_{gray})$$
$$C^{+} = C \cdot (1 - M) + C_p \cdot M$$
wherein $C_p$ denotes the predicted edge-restored image, $G_e$ the edge restorer, $M$ the mask image, $C$ the edge map of the image to be restored, $I_{gray}$ the grayscale map of the image to be restored, and $C^{+}$ the restored edge map generated by the edge restoration model.
In the present invention, further, the method for repairing the edge repairing model further includes:
step S21: calculating the loss function of the edge restorer, which is a weighted sum of the edge adversarial loss and the edge feature-matching loss;
step S22: and optimizing the generation result of the edge repairing model according to the loss function to obtain a repaired edge image finally output by the edge repairing model.
Specifically, the loss function of the edge restoration model is a mixed loss function, whose purpose is to constrain the result judged by the edge discriminator; the mixed loss is a weighted sum of the edge adversarial loss and the edge feature-matching loss. The edge adversarial loss is a form of cross entropy and can be written as:
$$\mathcal{L}_{adv,e} = \mathbb{E}_{(C, I_{gray})}\left[\log D_e(C, I_{gray})\right] + \mathbb{E}_{I_{gray}}\left[\log\left(1 - D_e(C_p, I_{gray})\right)\right]$$
wherein $\mathcal{L}_{adv,e}$ denotes the edge adversarial loss, the first expectation is taken over the real edge map and grayscale map, the second over the grayscale map and the predicted edge map, and $D_e$ denotes the edge discriminator.
Secondly, the edge feature-matching loss is a distance function defined on the feature layers of the edge discriminator. Its main function is to compute the sum of distances between the features of the generated edges and of the Canny-detected edges, extracted at the discriminator's different layers, so the feature loss can be expressed as:
$$\mathcal{L}_{FM} = \mathbb{E}\left[\sum_{i=1}^{n} \frac{1}{N_i}\left\| D_e^{(i)}(C) - D_e^{(i)}(C_p) \right\|_1\right]$$
wherein $\mathcal{L}_{FM}$ denotes the edge feature-matching loss, $n$ the number of activation layers of the edge discriminator, and $N_i$ the number of elements in the $i$-th activation layer.
Finally, the optimization objective of the edge model can be written as:
$$\min_{G_e} \max_{D_e} \mathcal{L}_{G_e} = \min_{G_e}\left(\lambda_1 \max_{D_e}\left(\mathcal{L}_{adv,e}\right) + \lambda_2 \mathcal{L}_{FM}\right)$$
wherein minimization is over the edge restorer $G_e$, maximization is over the edge discriminator $D_e$, and $\lambda_1$, $\lambda_2$ weight the two terms.
In the present invention, as shown in fig. 10, the image restoration model includes an image restorer, which generates the restored picture after several downsampling steps on the restored edge map, several dilated-convolution residual blocks, one multi-head attention network and two deconvolutions.
A convolutional neural network focuses only on the pixel values of local regions when learning features, ignoring the influence that correlations between pixels in distant regions have on image generation and restoration; many attention-mechanism models have therefore been designed to better capture long-distance dependencies. The multi-head self-attention network is an extension of the self-attention network, which can effectively capture long-distance relations between pixels in an image. But the long-distance pixel relations of each region are not just one group, and a single self-attention network is not enough to learn multiple long-distance relations, so a multi-head attention network capable of capturing richer long-distance relations between pixel regions is adopted. The multi-head self-attention network can learn correlation matrices of different modes, which plays a very important role in improving the restoration result.
Specifically, as shown in fig. 2, in the present scheme, a multi-head self-attention layer network is added after the last residual layer, and the specific scheme is as follows:
step S2-1: obtaining several groups of query, key and value feature maps by applying different convolutional transformations to the feature map produced by the convolutional layers and the residual network;
specifically, as shown in FIG. 5, the size of the query feature map is Bg×Wf×Hf×CqIn which B isgFor hidden variable batches of generator inputs, WfIs the width, H, of the query profilefIs the height, C, of the query profileqIs the channel dimension of the query feature map, and key is the size of the feature map Bg×Wf×Hf×CkSeveral other parameters are the same as the query profile, CkIs the channel dimension of the key profile. Value feature map size Bg×Wf×Hf×CvOther parameters are the same as key and query, CvIs the channel dimension of the feature map.
Step S2-2: acquiring a reconstructed feature map, as shown in fig. 3, the specific method includes:
step S2-2-1: the key feature maps are transposed, and a grouped dot-product operation is performed between the query feature maps and the transposed key feature maps, as shown in fig. 6, obtaining several groups of correlation attention matrices;
step S2-2-2: normalizing the correlation attention matrices, wherein the dot-product matrix is normalized with a method such as Softmax.
Step S2-2-3: the normalized self-attention matrix of each group of correlations is matrix-multiplied with the value feature map of that group, as shown in fig. 7, to obtain the reconstructed feature map of the group. Each element of the reconstructed feature map is a weighted reconstruction of the pixels of the value feature map using the group's correlation attention matrix, where the weight of each contributing element is the corresponding pixel value in the correlation attention matrix.
Further, after the reconstructed feature map is obtained, step S2-3 is performed, as shown in fig. 8,
step S2-3: splicing the reconstructed feature maps along the channel dimension to obtain the multi-head attention combined result.
Step S2-4: the combined result is transformed back to the size of the original input feature map through a convolutional network, and the reconstructed feature map is added to the original feature map to obtain the final output restored picture result.
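Steps S2-1 through S2-4 can be sketched in NumPy as follows. The shapes, the flattening of spatial positions into rows, and the equal per-head channel split are illustrative assumptions for clarity, not the patent's exact implementation (which uses learned convolutions rather than plain matrix projections):

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax used to normalize the dot-product matrix (step S2-2-2)
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(feat, w_q, w_k, w_v, w_o, num_heads):
    """feat: (N, C) array whose rows are flattened spatial positions of the feature map.
    w_q, w_k, w_v, w_o: (C, C) projection matrices (standing in for 1x1 convolutions)."""
    n, c = feat.shape
    d = c // num_heads                          # assumed equal channel split per head
    q, k, v = feat @ w_q, feat @ w_k, feat @ w_v  # step S2-1: query/key/value projections
    heads = []
    for i in range(num_heads):
        qi, ki, vi = (m[:, i * d:(i + 1) * d] for m in (q, k, v))
        attn = softmax(qi @ ki.T)               # steps S2-2-1/2: dot product with transposed keys, softmax
        heads.append(attn @ vi)                 # step S2-2-3: weighted reconstruction with the value map
    combined = np.concatenate(heads, axis=1)    # step S2-3: splice heads along the channel dimension
    return feat + combined @ w_o                # step S2-4: project back and add to the original features
```

Note that the output has the same shape as the input feature map, matching the scheme's statement that adding the multi-head self-attention layer does not change the output feature size.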
In one embodiment provided by the present invention, a specific method for obtaining a plurality of attention combination results is as follows:
step S2-3-1: obtain the self-attention result of the i-th head:

head_i = Attention(Q_i, K_i, V_i) = Softmax(Q_i · K_i^T) · V_i

wherein Q_i, K_i and V_i denote the query, key and value feature-map matrices of the i-th head;
step S2-3-2: splice the self-attention results of all heads, use the W^o matrix to fuse and project the multiple feature spaces back to the size of the original matrix, and finally obtain the multi-head self-attention combined result:

MultiHead = Concat(head_1, head_2, ..., head_h) · W^o
wherein head_i denotes the self-attention result of the i-th head, h is the number of heads, and W^o is the output projection matrix.
In conclusion, although this scheme adds a multi-head self-attention layer network, the output feature size is unchanged, while the long-range information processed by the multiple heads participates more fully, thereby improving the image restoration effect.
In the present invention, further, the image processed by the edge restoration model is used by the image restoration model for restoration; the specific restoration method comprises:
step S23: obtaining a predicted repaired image by taking the tensor of the spliced repaired edge image and the damaged image as input, and obtaining the repaired image according to the predicted repaired image:
I_p = G_i(M, C^+, I_M)

I^+ = I · (1 - M) + I_p · M

wherein I_p is the predicted restored image, I is the real image, G_i is the image restorer, C^+ is the restored edge map, and I_M is the defect map.
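The masked composition above keeps the known pixels of the real image and fills only the masked region from the prediction. A tiny hypothetical example (toy 2×2 single-channel arrays, not the scheme's actual data):

```python
import numpy as np

# Hypothetical single-channel example; the mask M is 1 in the missing region.
I  = np.array([[0.8, 0.2], [0.5, 0.9]])   # real image
Ip = np.array([[0.1, 0.3], [0.6, 0.4]])   # predicted restored image
M  = np.array([[0.0, 1.0], [0.0, 1.0]])   # binary mask

# I^+ = I * (1 - M) + Ip * M: known pixels come from I, holes from Ip
I_plus = I * (1 - M) + Ip * M
```

Here the unmasked column keeps the real pixels (0.8, 0.5) while the masked column takes the predicted pixels (0.3, 0.4).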
Step S24: calculate the image restoration loss function and optimize the restoration result of the image restoration model, wherein the image restoration loss function comprises image adversarial loss, style loss and perceptual loss.
For example, the image adversarial loss takes the same form as the adversarial loss generated by the edge restoration model.
Furthermore, the style loss was first proposed in the image style transfer task; a later improvement alleviated the artifact problem of deconvolution by introducing the Gram matrix. This scheme adopts a style loss function based on the Gram matrix, whose loss function L_style is expressed as:

L_style = E_i [ (1 / (c_i h_i w_i)) · || Gr_i(I^+) - Gr_i(I) ||_1 ]

wherein Gr_i(I^+) represents the Gram matrix formed by the inner products of the predicted-image feature vectors at the i-th activation layer, Gr_i(I) represents the Gram matrix formed by the inner products of the real-image feature vectors, and c_i h_i w_i represents the dimensions of the activation feature. Four activation layers of the VGG19 network are selected: relu2-2, relu3-4, relu4-4 and relu5-2.
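As an illustration of the Gram-matrix style loss described above, a minimal NumPy sketch. In the actual scheme the per-layer features come from the selected VGG19 activation layers; here plain arrays stand in for them, and the c·h·w normalization follows the formula in the text:

```python
import numpy as np

def gram_matrix(feat):
    """feat: (C, H, W) activation map; returns the C x C Gram matrix
    of channel inner products, normalized by c*h*w."""
    c, h, w = feat.shape
    f = feat.reshape(c, h * w)
    return (f @ f.T) / (c * h * w)

def style_loss(pred_feats, real_feats):
    """Mean L1 distance between Gram matrices over the selected layers.
    pred_feats/real_feats: lists of (C, H, W) arrays, one per layer."""
    return float(np.mean([np.abs(gram_matrix(p) - gram_matrix(r)).mean()
                          for p, r in zip(pred_feats, real_feats)]))
```

The Gram matrix discards spatial arrangement and keeps only channel co-activation statistics, which is why this loss captures texture/style rather than pixel-wise agreement.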
In addition, the perceptual loss penalizes generated images that do not match the perceptual appearance of the real image by defining a distance measure between pre-trained activation layers. L_perc can be defined as:

L_perc = E [ Σ_i w_i · || φ_i(I) - φ_i(I^+) ||_1 ]

wherein φ_i corresponds to the 5 activation layers of the pre-trained VGG19 network: relu1-1, relu2-1, relu3-1, relu4-1 and relu5-1, and w_i represents the weight parameters (all set to 1 in this scheme).
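A corresponding sketch of the perceptual loss, again with plain arrays standing in for the VGG19 activations φ_i; the equal weights w_i = 1 match the scheme's stated choice:

```python
import numpy as np

def perceptual_loss(pred_feats, real_feats, weights=None):
    """Weighted L1 distance between pre-trained-network activations of the
    predicted and real images. pred_feats/real_feats: lists of arrays,
    one per selected activation layer."""
    if weights is None:
        weights = [1.0] * len(pred_feats)   # the scheme sets all w_i to 1
    return float(sum(w * np.abs(p - r).mean()
                     for w, p, r in zip(weights, pred_feats, real_feats)))
```

Unlike the style loss, this compares activations directly (preserving spatial layout), so it penalizes perceptually implausible structure rather than texture statistics.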
In summary, the loss function of the image restoration model combines multiple losses, which can be jointly calculated as:

L_i = λ3 · L_l1 + λ4 · L_adv^i + λ5 · L_style + λ6 · L_perc

wherein λ3, λ4, λ5 and λ6 are user-defined hyperparameters, L_l1 is the ℓ1 reconstruction loss, L_adv^i is the adversarial loss generated by the image restoration model, L_style is the style loss, and L_perc is the perceptual loss.
In the invention, further, after the training of the edge-prior-fusion multi-head attention restoration model is completed, the restoration result of the model is tested and evaluated on the test set; this part is implemented with the PyTorch framework on two 1080 Ti GPUs. The quality and restoration effect of the model are evaluated with four metrics: peak signal-to-noise ratio (PSNR), structural similarity (SSIM), ℓ1 error and Fréchet Inception Distance (FID).
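Of the four metrics, PSNR and the ℓ1 error are straightforward to compute directly from pixel values; a minimal NumPy sketch follows (SSIM needs windowed local statistics and FID needs an Inception feature extractor, so both are omitted here):

```python
import numpy as np

def psnr(img1, img2, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two images with values in [0, max_val]."""
    mse = np.mean((img1 - img2) ** 2)
    if mse == 0:
        return float('inf')   # identical images
    return float(10 * np.log10(max_val ** 2 / mse))

def l1_error(img1, img2):
    """Mean absolute pixel difference (the l1 error metric)."""
    return float(np.abs(img1 - img2).mean())
```

Higher PSNR and lower ℓ1 error both indicate a restored image closer to the original.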
In addition, as shown in fig. 11, the restoration results of the edge-prior-fusion multi-head attention restoration model are displayed. From left to right, the first image is the original image, the second is the image to be restored covered by the binary mask, the third is the image restored by the edge restoration model, and the fourth and fifth are the result images restored by the image restoration model. The image restored by the edge-prior-fusion multi-head attention restoration model is very similar to the original image; at some completely missing parts the restored image differs from the original, but the difference is not noticeable to human observation. The scheme achieves a good restoration effect and can reasonably restore the missing parts. The results show that the network fusing a multi-head attention mechanism performs better than expected for image restoration.
The above description is intended to describe in detail the preferred embodiments of the present invention, but the embodiments are not intended to limit the scope of the claims of the present invention, and all equivalent changes and modifications made within the technical spirit of the present invention should fall within the scope of the claims of the present invention.
Claims (10)
1. An image restoration method with edge prior fusion of a multi-head attention mechanism is characterized by comprising
Step S1: acquiring experimental data and preprocessing the data, wherein the experimental data comprises a training set and a testing set, and extracting an edge image of an image from a preprocessed image;
step S2: constructing an edge first-fusion multi-attention mechanism repairing model, wherein the edge first-fusion multi-attention mechanism repairing model comprises an edge repairing model and an image repairing model, the edge repairing model takes an extracted edge image, an original image and a mask image as input and outputs a repaired edge image, and the image repairing model takes the repaired edge image and a defect image as input for training;
the image restoration model comprises an image restoration device, wherein the image restoration device generates a restoration picture after sampling the restored edge image for multiple times, performing residual convolution based on expansion convolution for multiple times, performing one-time multi-head attention network and two-time deconvolution;
step S3: and evaluating the result of the edge first-fusion multi-attention mechanism repairing model through the test set.
2. The method of claim 1, wherein the edge restoration model comprises an edge restorer, and the edge restorer samples the extracted edge map, the original image and the mask image, and converts the feature map into a single-channel edge map after performing dilation convolution-based residual and two deconvolution.
3. The image restoration method based on the edge prior fusion multi-head attention mechanism according to claim 2, wherein the restoration method of the edge restoration model comprises:
step S20: obtaining a predicted edge repairing result of the edge repairing device, obtaining a generation result of an edge repairing model according to the predicted edge repairing result, reserving the image edge of the existing region, and filling the edge part needing repairing in the missing region, as follows:
C_p = G_e(M, C, I_gray)
C^+ = C · (1 - M) + C_p · M
wherein C_p represents the predicted edge-restored image, G_e represents the edge restorer, M represents the mask image, C represents the edge map of the image to be restored, I_gray represents the grayscale map of the image to be restored, and C^+ represents the restored edge image generated by the edge restoration model.
4. The method for repairing an image with an edge a priori fused with a multi-head attention mechanism according to claim 3, wherein the method for repairing the edge repairing model further comprises
Step S21, calculating a loss function of the edge restorer, wherein the loss function is a weighted sum of the generated edge confrontation loss and the edge characteristic loss;
step S22: and optimizing the generation result of the edge repairing model to obtain a repaired edge image.
5. The image inpainting method of an edge prior fusion multi-head attention mechanism according to claim 1, wherein the image inpainting model inpainting method comprises:
step S23: obtaining a predicted repaired image by taking the tensor of the spliced repaired edge image and the damaged image as input, and obtaining the repaired image according to the predicted repaired image:
I_p = G_i(M, C^+, I_M)
I^+ = I · (1 - M) + I_p · M
wherein I_p is the predicted restored image, I is the real image, G_i is the image restorer, and C^+ is the restored edge map;
step S24: calculating an image restoration loss function and optimizing the restoration result of the image restoration model, wherein the image restoration loss function comprises image adversarial loss, style loss and perceptual loss.
6. The method for repairing an image with an edge a priori fused with a multi-head attention mechanism according to claim 1, wherein the image repairing device generates the repaired image after sampling the repaired edge image for a plurality of times, performing a plurality of residual convolutions based on dilation convolution, performing a multi-head attention network once and performing deconvolution twice comprises:
step S2-1: obtaining several groups of query, key and value feature maps by applying different convolution transformations to the feature map obtained through the convolutional layers and the residual network;
step S2-2: acquiring a reconstructed characteristic map;
step S2-3: splicing the reconstructed feature maps according to the channel dimensions to obtain a plurality of attention combination results;
step S2-4: transforming the combined result back to the size of the original input feature map through a convolutional network, and adding the reconstructed feature map to the original feature map to obtain the final output restored picture result.
7. The image inpainting method of an edge prior fusion multi-head attention mechanism according to claim 6, wherein the obtaining of the reconstructed feature map in step S2-2 includes:
step S2-2-1: transposing the key feature maps, and performing a grouped dot-product operation between the query feature maps and the transposed key feature maps to obtain several groups of correlation attention matrices;
step S2-2-2: normalizing the correlation attention matrices;
step S2-2-3: matrix-multiplying the normalized self-attention matrix of each group of correlations with the value feature map of that group to obtain the reconstructed feature map of the group.
8. The image inpainting method of an edge prior fusion multi-head attention mechanism according to claim 6, wherein the step S2-3 of stitching the reconstructed feature maps according to the channel dimensions to obtain a plurality of attention combination results includes:
step S2-3-1: obtaining the self-attention result of the i-th head:

head_i = Attention(Q_i, K_i, V_i) = Softmax(Q_i · K_i^T) · V_i

wherein Q_i, K_i and V_i denote the query, key and value feature-map matrices of the i-th head;

step S2-3-2: splicing the self-attention results of the individual heads, using the W^o matrix to fuse and project the multiple feature spaces back to the size of the original matrix, and finally obtaining the multi-head self-attention combined result:

MultiHead = Concat(head_1, head_2, ..., head_h) · W^o
9. the image inpainting method of an edge prior fusion multi-head attention mechanism according to claim 5, wherein the style loss is calculated by:
10. The image restoration method of an edge-prior-fusion multi-head attention mechanism according to claim 7, wherein in step S2-2-3 each element of the value feature map is weighted and reconstructed from the pixels of the value feature map using the corresponding group of correlation attention matrices, and the weights used in the weighted reconstruction are the pixel values at the corresponding positions of the correlation attention matrix.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111356234.6A CN114022506B (en) | 2021-11-16 | 2021-11-16 | Image restoration method for edge prior fusion multi-head attention mechanism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114022506A true CN114022506A (en) | 2022-02-08 |
CN114022506B CN114022506B (en) | 2024-05-17 |
Family
ID=80065024
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111356234.6A Active CN114022506B (en) | 2021-11-16 | 2021-11-16 | Image restoration method for edge prior fusion multi-head attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114022506B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116188875A (en) * | 2023-03-29 | 2023-05-30 | 北京百度网讯科技有限公司 | Image classification method, device, electronic equipment, medium and product |
CN117351015A (en) * | 2023-12-05 | 2024-01-05 | 中国海洋大学 | Tamper detection method and system based on edge supervision and multi-domain cross correlation |
CN117649365A (en) * | 2023-11-16 | 2024-03-05 | 西南交通大学 | Paper book graph digital restoration method based on convolutional neural network and diffusion model |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111127346A (en) * | 2019-12-08 | 2020-05-08 | 复旦大学 | Multi-level image restoration method based on partial-to-integral attention mechanism |
CN113112411A (en) * | 2020-01-13 | 2021-07-13 | 南京信息工程大学 | Human face image semantic restoration method based on multi-scale feature fusion |
CN113240613A (en) * | 2021-06-07 | 2021-08-10 | 北京航空航天大学 | Image restoration method based on edge information reconstruction |
CN113379655A (en) * | 2021-05-18 | 2021-09-10 | 电子科技大学 | Image synthesis method for generating antagonistic network based on dynamic self-attention |
US20210342983A1 (en) * | 2020-04-29 | 2021-11-04 | Adobe Inc. | Iterative image inpainting with confidence feedback |
Non-Patent Citations (2)
Title |
---|
LI Ju; HUANG Wenpei: "Research on Image Inpainting Technology Based on Generative Adversarial Networks", Computer Applications and Software, no. 12, 12 December 2019 (2019-12-12) *
SHAO Hang; WANG Yongxiong: "Generative High-Resolution Image Inpainting Based on Parallel Adversarial and Multi-Condition Fusion", Pattern Recognition and Artificial Intelligence, no. 04, 15 April 2020 (2020-04-15) *
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116188875A (en) * | 2023-03-29 | 2023-05-30 | 北京百度网讯科技有限公司 | Image classification method, device, electronic equipment, medium and product |
CN116188875B (en) * | 2023-03-29 | 2024-03-01 | 北京百度网讯科技有限公司 | Image classification method, device, electronic equipment, medium and product |
CN117649365A (en) * | 2023-11-16 | 2024-03-05 | 西南交通大学 | Paper book graph digital restoration method based on convolutional neural network and diffusion model |
CN117351015A (en) * | 2023-12-05 | 2024-01-05 | 中国海洋大学 | Tamper detection method and system based on edge supervision and multi-domain cross correlation |
CN117351015B (en) * | 2023-12-05 | 2024-03-19 | 中国海洋大学 | Tamper detection method and system based on edge supervision and multi-domain cross correlation |
Also Published As
Publication number | Publication date |
---|---|
CN114022506B (en) | 2024-05-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113240580B (en) | Lightweight image super-resolution reconstruction method based on multi-dimensional knowledge distillation | |
CN110119780B (en) | Hyper-spectral image super-resolution reconstruction method based on generation countermeasure network | |
CN114022506B (en) | Image restoration method for edge prior fusion multi-head attention mechanism | |
CN108648197B (en) | Target candidate region extraction method based on image background mask | |
CN111062872A (en) | Image super-resolution reconstruction method and system based on edge detection | |
CN111709902A (en) | Infrared and visible light image fusion method based on self-attention mechanism | |
CN107977932A (en) | It is a kind of based on can differentiate attribute constraint generation confrontation network face image super-resolution reconstruction method | |
CN111787187B (en) | Method, system and terminal for repairing video by utilizing deep convolutional neural network | |
CN111784624B (en) | Target detection method, device, equipment and computer readable storage medium | |
CN115018727A (en) | Multi-scale image restoration method, storage medium and terminal | |
CN114897742B (en) | Image restoration method with texture and structural features fused twice | |
CN112163998A (en) | Single-image super-resolution analysis method matched with natural degradation conditions | |
CN116468645A (en) | Antagonistic hyperspectral multispectral remote sensing fusion method | |
Ma et al. | Multi-task interaction learning for spatiospectral image super-resolution | |
CN113298736A (en) | Face image restoration method based on face pattern | |
Luo et al. | Bi-GANs-ST for perceptual image super-resolution | |
CN112884657B (en) | Face super-resolution reconstruction method and system | |
CN117197627B (en) | Multi-mode image fusion method based on high-order degradation model | |
CN117291803B (en) | PAMGAN lightweight facial super-resolution reconstruction method | |
CN116523985B (en) | Structure and texture feature guided double-encoder image restoration method | |
CN111401209B (en) | Action recognition method based on deep learning | |
Ren et al. | A lightweight object detection network in low-light conditions based on depthwise separable pyramid network and attention mechanism on embedded platforms | |
CN114862699B (en) | Face repairing method, device and storage medium based on generation countermeasure network | |
CN114764754B (en) | Occlusion face restoration method based on geometric perception priori guidance | |
CN116071229A (en) | Image super-resolution reconstruction method for wearable helmet |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||