CN114022506A - Image restoration method with edge prior fusion multi-head attention mechanism - Google Patents
- Publication number
- CN114022506A (application CN202111356234.6A)
- Authority
- CN
- China
- Prior art keywords
- image
- edge
- repairing
- restoration
- attention
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/13—Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4038—Image mosaicing, e.g. composing plane images from plane sub-images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention relates to the technical field of image restoration and discloses an image restoration method with an edge-prior fusion multi-head attention mechanism, comprising the following steps. Step S1: acquire and preprocess experimental data, the experimental data comprising a training set and a test set, and extract an edge map from each preprocessed image. Step S2: construct an edge-prior fusion multi-head attention restoration model comprising an edge restoration model and an image restoration model, wherein the edge restoration model takes the extracted edge map, the original image and a mask image as input and outputs a restored edge map, and the image restoration model is trained with the restored edge map and the defect image as input. The method fuses a multi-head attention mechanism and, by extracting richer long-distance dependencies between pixels, improves the image restoration effect.
Description
Technical Field
The invention relates to the technical field of image restoration, in particular to an image restoration method with an edge prior fusion multi-head attention mechanism.
Background
In the information society, images are among the most important sources of information. How to obtain more complete and clearer images has become a hotspot in the field of computer vision, with related applications including image restoration and super-resolution. Image inpainting refers to a technique for recovering a complete image from the remaining image information in a damaged image. This is not a difficult task for the human eye, but it is a rather challenging one for computer vision. The technique has many practical uses, such as image restoration (removing photo scratches and text occlusion), photo editing (removing unwanted objects), and image coding and transmission (recovering image blocks lost to network packet loss during transmission). Image restoration has therefore been a very popular research area in recent years.
At present, image restoration based on generative adversarial networks has become mainstream: a network model can generate images that are close to the training data yet do not actually exist, achieving results realistic enough to pass as genuine.
Disclosure of Invention
Aiming at the defects in the prior art, the invention provides an image restoration method with an edge-prior fusion multi-head attention mechanism.
In order to achieve the above purpose, the invention provides the following technical scheme:
an image restoration method with an edge-prior fusion multi-head attention mechanism comprises
step S1: acquiring and preprocessing experimental data, the experimental data comprising a training set and a test set, and extracting an edge map from each preprocessed image;
step S2: constructing an edge-prior fusion multi-head attention restoration model comprising an edge restoration model and an image restoration model, wherein the edge restoration model takes the extracted edge map, the original image and a mask image as input and outputs a restored edge map, and the image restoration model is trained with the restored edge map and the defect image as input;
the image restoration model comprises an image restorer, which generates the restored picture after several downsampling steps on the restored edge map, several dilated-convolution residual blocks, one multi-head attention network and two deconvolutions;
step S3: evaluating the edge-prior fusion multi-head attention restoration model on the test set.
In the present invention, the edge restoration model further includes an edge restorer, which downsamples the extracted edge map, original image and mask image, and converts the feature map into a single-channel edge map after several dilated-convolution residual blocks and two deconvolutions.
In the present invention, further, the restoration method of the edge restoration model comprises:
step S20: obtaining the predicted edge restoration result of the edge restorer and, from it, the generation result of the edge restoration model, retaining the image edges of the known region and filling in the edge portions to be restored in the missing region, as follows:
$$C_p = G_e(M, C, I_{gray})$$
$$C^{+} = C \cdot (1 - M) + C_p \cdot M$$
wherein $C_p$ denotes the predicted edge-restored image, $G_e$ the edge restorer, $M$ the mask image, $C$ the edge map of the image to be restored, $I_{gray}$ the grayscale map of the image to be restored, and $C^{+}$ the restored edge map generated by the edge restoration model (products are element-wise).
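The composition step above can be sketched directly in NumPy. This is a minimal illustration of the formula $C^{+} = C \cdot (1-M) + C_p \cdot M$, not the patent's implementation; the names `compose_edges`, `C`, `Cp`, `M` are illustrative.

```python
import numpy as np

def compose_edges(edge_map, pred_edge, mask):
    """Keep known-region edges, fill the masked region with predictions.

    Implements C+ = C * (1 - M) + C_p * M, where M is 1 in the
    missing region and 0 in the known region.
    """
    return edge_map * (1 - mask) + pred_edge * mask

# Toy example: a 2x2 edge map whose right column is masked out.
C = np.array([[1.0, 0.0],
              [0.0, 1.0]])   # edges of the damaged image
Cp = np.array([[0.0, 1.0],
               [1.0, 0.0]])  # edges predicted by the edge restorer
M = np.array([[0.0, 1.0],
              [0.0, 1.0]])   # 1 = missing pixel

C_plus = compose_edges(C, Cp, M)
# Known column is taken from C, masked column from Cp.
```

The known region is left untouched by construction, which is exactly why the model only needs to learn plausible edges inside the mask.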
In the present invention, further, the restoration method of the edge restoration model further comprises:
step S21: calculating the loss function of the edge restorer, which is a weighted sum of the edge adversarial loss and the edge feature-matching loss;
step S22: optimizing the generation result of the edge restoration model to obtain the restored edge map.
In the present invention, further, the image restoration method includes:
step S23: obtaining a predicted repaired image by taking the tensor of the spliced repaired edge image and the damaged image as input, and obtaining the repaired image according to the predicted repaired image:
Ip=Gi(M,C+IM)
I+=I·(1-M)+Ip·M
wherein, IpFor predicted restored pictures, I is the real picture, GiAs an image inpainting device C+To repair edge images, IM;
step S24: calculating the image restoration loss function and optimizing the restoration result of the image restoration model, wherein the image restoration loss comprises the image adversarial loss, the style loss, the perceptual loss and an $\ell_1$ reconstruction term, combined as a weighted sum:
$$\mathcal{L}_{G_i} = \lambda_3 \mathcal{L}_{\ell_1} + \lambda_4 \mathcal{L}_{adv} + \lambda_5 \mathcal{L}_{perc} + \lambda_6 \mathcal{L}_{style}$$
wherein $\lambda_3$, $\lambda_4$, $\lambda_5$ and $\lambda_6$ are user-defined hyperparameters, $\mathcal{L}_{adv}$ is the adversarial loss generated by the image restoration model, $\mathcal{L}_{style}$ is the style loss, and $\mathcal{L}_{perc}$ is the perceptual loss.
In the present invention, further, the image restorer generating the restored picture after several downsampling steps on the restored edge map, several dilated-convolution residual blocks, one multi-head attention network and two deconvolutions includes: step S2-1: obtaining several groups of query, key and value feature maps by applying different convolutional transformations to the feature map produced by the convolutional layers and the residual network;
step S2-2: acquiring a reconstructed characteristic map;
step S2-3: splicing the reconstructed feature maps according to the channel dimensions to obtain a plurality of attention combination results;
step S2-4: projecting the combined result back to the size of the original input feature map through a convolutional transformation, and adding the reconstructed feature map to the original feature map to obtain the final output restored picture.
In the present invention, further, the acquiring of the reconstructed feature map in step S2-2 includes:
step S2-2-1: transposing the key feature maps, and performing a dot-product operation between each query feature map and its transposed key feature map to obtain several groups of correlation attention matrices;
step S2-2-2: normalizing each correlation attention matrix;
step S2-2-3: matrix-multiplying each group's normalized self-attention matrix with that group's value feature map to obtain the group's reconstructed feature map.
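Steps S2-2-1 through S2-2-3 amount to one group of dot-product attention over flattened feature maps. A minimal NumPy sketch, with illustrative sizes and softmax as the normalization (the patent names "Softmax and the like"; no scaling factor is applied here since these steps do not mention one):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def reconstruct(query, key, value):
    """One attention group, as in steps S2-2-1..S2-2-3.

    query, key: (N, d_k) flattened feature maps; value: (N, d_v).
    """
    attn = query @ key.T           # S2-2-1: dot products -> correlation attention matrix
    attn = softmax(attn, axis=-1)  # S2-2-2: normalize each row
    return attn @ value, attn      # S2-2-3: weighted reconstruction of the value map

rng = np.random.default_rng(0)
N, dk, dv = 6, 4, 5                # illustrative sizes, not from the patent
Q = rng.normal(size=(N, dk))
K = rng.normal(size=(N, dk))
V = rng.normal(size=(N, dv))
out, attn = reconstruct(Q, K, V)
```

Each row of `attn` sums to 1, so each output position is a convex combination of all value positions — the "long-distance dependence" the text refers to.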
The step S2-3 of splicing the reconstructed feature maps along the channel dimension to obtain the multi-head attention combination result includes:
step S2-3-1: obtaining the attention result of the i-th head:
$$head_i = Attention(Q_i, K_i, V_i) = softmax\!\left(\frac{Q_i K_i^{T}}{\sqrt{d_k}}\right) V_i$$
wherein $Q_i$, $K_i$ and $V_i$ denote the query, key and value matrices of the i-th head's feature maps, and $d_k$ is the key dimension;
step S2-3-2: concatenating the self-attention results of all heads, fusing and projecting the multiple feature spaces back to the original matrix size with a learned matrix $W^{O}$, finally obtaining the multi-head self-attention combination result:
$$MultiHead = Concat(head_1, head_2, \ldots, head_h)\, W^{O}$$
The style loss is calculated as:
$$\mathcal{L}_{style} = \mathbb{E}_i\!\left[\,\left\| Gr_i(I_p) - Gr_i(I_M) \right\|_1 \right]$$
wherein $Gr_i(I_p)$ denotes the Gram matrix formed by the inner products of the predicted image's activation vectors at layer $i$, $Gr_i(I_M)$ the Gram matrix of the real image's activation vectors, and $c_i h_i w_i$ the dimension of the activation feature, used to normalize each Gram matrix.
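A Gram matrix and the resulting style loss can be sketched as follows. This is an illustrative NumPy version under the assumption that activations are given as `(channels, height, width)` arrays; the per-layer L1 distance is averaged here for scale stability, which is a common implementation choice rather than something the patent specifies.

```python
import numpy as np

def gram(features):
    """Gram matrix of one activation map.

    features: (c, h, w) activation; returns a (c, c) matrix of channel
    inner products, normalized by c*h*w as in the style-loss formula.
    """
    c, h, w = features.shape
    f = features.reshape(c, h * w)
    return (f @ f.T) / (c * h * w)

def style_loss(pred_feats, real_feats):
    """Mean absolute difference between Gram matrices, summed over layers."""
    return sum(np.abs(gram(p) - gram(r)).mean()
               for p, r in zip(pred_feats, real_feats))

rng = np.random.default_rng(1)
# Two illustrative activation layers with different channel counts.
pred = [rng.normal(size=(3, 4, 4)), rng.normal(size=(5, 2, 2))]
real = [rng.normal(size=(3, 4, 4)), rng.normal(size=(5, 2, 2))]
loss = style_loss(pred, real)
```

Because the Gram matrix discards spatial arrangement and keeps only channel correlations, this loss penalizes texture/style mismatch rather than pixel-wise error.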
In the present invention, further, in step S2-2-3, each element of the value feature map is reconstructed as a weighted sum over all value-feature-map elements, the weight of each contributing element being the corresponding entry of that group's correlation attention matrix.
Compared with the prior art, the invention has the beneficial effects that:
according to the method, the multi-head attention network capable of capturing the long-distance relationship among the richer pixel regions is added behind the last residual layer of the image restoration model, in order to enable the model to learn information in different subspaces, each head is used for processing different information by using a plurality of parallel repeated attention calculations, so that the characteristics of different parts can be processed, the richer long-distance dependency relationship is extracted, the multi-head self-attention network can learn the correlation matrixes of different modes, the method has an important effect on improving the restoration result, and the image restoration effect is improved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is an overall flowchart of an image inpainting method with edge prior fusion with a multi-head attention mechanism according to the present invention;
FIG. 2 is a flowchart of step S2 in the image inpainting method of the present invention in which the multi-head attention mechanism is fused with edge priors;
FIG. 3 is a flowchart of the steps S2-2 and S2-3 in the image inpainting method of the present invention with edge prior fusion multi-head attention mechanism;
FIG. 4 is a flowchart of an implementation of a method for repairing an edge repairing model in an image repairing method with edge prior fusion with a multi-head attention mechanism according to the present invention;
FIG. 5 is a schematic diagram of obtaining query, key, value feature maps in the image restoration method with edge prior fusion multi-head attention mechanism of the present invention;
FIG. 6 is a schematic diagram of a flow chart for obtaining a correlation attention matrix in an image restoration method with edge prior fusion multi-head attention mechanism according to the present invention;
FIG. 7 is a schematic flow chart of a reconstructed feature map in an image restoration method with edge prior fusion with a multi-head attention mechanism according to the present invention;
FIG. 8 is a network architecture diagram of a multi-head self-attention layer in the image restoration method with edge prior fusion of a multi-head attention mechanism according to the present invention;
FIG. 9 is a schematic diagram of an edge restoration model construction framework in the image restoration method with edge prior fusion with a multi-head attention mechanism according to the present invention;
FIG. 10 is a schematic diagram of a framework for constructing an edge image restoration model in an image restoration method with an edge prior fusion multi-head attention mechanism according to the present invention;
FIG. 11 is a schematic diagram of an experimental result of the image inpainting method of the edge prior fusion multi-head attention mechanism of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It will be understood that when an element is referred to as being "secured to" another element, it can be directly on the other element or intervening elements may also be present. When a component is referred to as being "connected" to another component, it can be directly connected to the other component or intervening components may also be present. When a component is referred to as being "disposed on" another component, it can be directly on the other component or intervening components may also be present. The terms "vertical," "horizontal," "left," "right," and the like as used herein are for illustrative purposes only.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
Referring to fig. 1, a preferred embodiment of the present invention provides an image restoration method with an edge-prior fusion multi-head attention mechanism, which comprises
step S1: acquiring and preprocessing experimental data, the experimental data comprising a training set and a test set, and extracting an edge map from each preprocessed image;
step S2: constructing an edge-prior fusion multi-head attention restoration model comprising an edge restoration model and an image restoration model, wherein the edge restoration model takes the extracted edge map, the original image and a mask image as input and outputs a restored edge map, and the image restoration model is trained with the restored edge map and the defect image as input;
the image restoration model comprises an image restorer, which generates the restored picture after several downsampling steps on the restored edge map, several dilated-convolution residual blocks, one multi-head attention network and two deconvolutions;
step S3: evaluating the edge-prior fusion multi-head attention restoration model on the test set.
Specifically, according to the scheme, a certain number of intact images of the relevant kind are collected according to the experimental requirements to complete data collection; the data are then given preliminary processing with preprocessing techniques to obtain data meeting the standard, and the dataset is divided into a training set and a test set. The image restoration model is then built step by step according to the algorithm design; after the model is built, it is trained on the training set, and its effect is tested and evaluated on the test set. By fusing a multi-head attention mechanism into the edge-prior restoration model, richer long-distance dependencies between pixels are extracted, improving the image restoration effect.
In the present invention, images from the public CelebA dataset are resized to 256 × 256 and used in the experiments. Because the dataset is not pre-divided into training, validation and test sets, the first 180,000 pictures are selected for training the model, and 4,000 pictures are selected as the test set for analyzing and comparing the experimental results. In addition, when the model is trained, the mask images used are taken from an irregular-mask dataset; the irregular mask images in that dataset are divided into six groups according to the proportion of the missing-region area to the whole image — 0–10%, 10–20%, 20–30%, 30–40%, 40–50% and 50–60%. Each group comprises 2,000 images, of which 1,000 mask images represent the case where the image boundary is missing, and the other 1,000 represent the case where the image boundary is intact.
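The six-way grouping by missing-area ratio can be sketched as a small helper. This is an illustrative function (the name `mask_group` is not from the patent); it assumes a binary mask where 1 marks a missing pixel.

```python
import numpy as np

def mask_group(mask):
    """Bucket a binary mask (1 = missing) into the six ratio groups
    0-10%, 10-20%, ..., 50-60% described above.

    Returns a group index 0..5, or None if the missing-area ratio
    falls outside the 0-60% range used by the dataset.
    """
    ratio = mask.mean()            # fraction of missing pixels
    if ratio > 0.6:
        return None
    return min(int(ratio * 10), 5)

m = np.zeros((256, 256))
m[:64, :64] = 1                    # 64*64 / 256*256 = 6.25% missing
```

`mask_group(m)` falls into group 0 (the 0–10% bucket), which is how masks of a known difficulty level are selected for evaluation.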
For the original input images of the training set, the edge map is extracted with Canny edge detection, which proceeds in four steps: Gaussian filtering; computing gradient magnitudes and directions; non-maximum suppression; and edge tracking with upper and lower (hysteresis) thresholds. This yields the edge map of the original image; edge detection is then performed with the binary mask, and the restored edge map is generated by the generative adversarial network. For the test task, 4,000 pictures are taken from the CelebA dataset as the test set, and random missing-mask maps are used to simulate the missing regions; the hand-drawn mask maps used for testing are divided into four groups, 1 to 4, according to missing-area ratio from small to large, each group containing 1,000 pictures, 4,000 in total.
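The patent uses full Canny edge detection (Gaussian filtering, gradients, non-maximum suppression, hysteresis thresholding). As a deliberately simplified sketch of where an edge map comes from, the following NumPy code keeps only the gradient-magnitude step; it is not Canny and is labeled as such, and the function name and threshold are illustrative.

```python
import numpy as np

def simple_edges(gray, threshold=0.25):
    """Simplified edge extraction: gradient-magnitude threshold only.

    Unlike full Canny, there is no Gaussian smoothing, non-maximum
    suppression, or hysteresis -- this just shows the shape of the
    binary edge map the edge restorer consumes.
    """
    gy, gx = np.gradient(gray.astype(float))  # gradients along rows, cols
    mag = np.hypot(gx, gy)                    # gradient magnitude
    return (mag > threshold).astype(np.uint8) # binary edge map

img = np.zeros((8, 8))
img[:, 4:] = 1.0                  # vertical step edge around column 4
edges = simple_edges(img)
```

The output is a single-channel binary map of the same size as the input, matching the single-channel edge map described for the edge restorer.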
In the present invention, as shown in fig. 9, the edge restoration model includes an edge restorer, which downsamples the extracted edge map, the original image and the mask image, and converts the feature map into a single-channel edge map after several dilated-convolution residual blocks and two deconvolutions. Specifically, the feature map passes through 3 downsampling steps, 8 dilated-convolution residual blocks and 2 deconvolutions before being converted into the single-channel edge map.
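The spatial bookkeeping of that pipeline can be traced with a few lines of plain Python. Note one assumption made here: the patent counts three sampling steps but only two deconvolutions, so for the output to return to full resolution the first sampling layer is presumably a stride-1 convolution (as in EdgeConnect-style generators) with only the last two strided — that choice, and all layer names below, are illustrative.

```python
def edge_restorer_shapes(h, w):
    """Track spatial size through the edge restorer described above:
    one assumed stride-1 convolution, two stride-2 downsampling
    convolutions, 8 dilated-convolution residual blocks
    (size-preserving), then 2 stride-2 deconvolutions back up.
    Channel counts are omitted; only spatial dims are tracked.
    """
    trace = [("input", h, w)]
    trace.append(("conv_stride1", h, w))          # assumed stride-1 first layer
    for i in range(2):                            # strided downsampling
        h, w = h // 2, w // 2
        trace.append((f"down{i+1}_stride2", h, w))
    trace.append(("dilated_residual_x8", h, w))   # 8 blocks, size unchanged
    for i in range(2):                            # 2 deconvolutions
        h, w = h * 2, w * 2
        trace.append((f"deconv{i+1}_stride2", h, w))
    trace.append(("to_edge_map_1ch", h, w))       # single-channel output
    return trace

shapes = edge_restorer_shapes(256, 256)
```

Running this for the 256 × 256 experimental images confirms the dilated residual blocks operate at 64 × 64 and the single-channel edge map comes out at the input resolution.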
Specifically, in the present invention, as shown in fig. 4, the repairing method of the edge repairing model is as follows:
step S20: the edge restorer concatenates the edge map, the mask image and the grayscale map of the image to be restored into one tensor as input, obtaining the predicted edge restoration result; the generation result of the edge restoration model is then obtained from the prediction, retaining the image edges of the known region and filling in the edge portions to be restored in the missing region, as follows:
$$C_p = G_e(M, C, I_{gray})$$
$$C^{+} = C \cdot (1 - M) + C_p \cdot M$$
wherein $C_p$ denotes the predicted edge-restored image, $G_e$ the edge restorer, $M$ the mask image, $C$ the edge map of the image to be restored, $I_{gray}$ the grayscale map of the image to be restored, and $C^{+}$ the restored edge map generated by the edge restoration model.
In the present invention, further, the method for repairing the edge repairing model further includes:
step S21: calculating the loss function of the edge restorer, which is a weighted sum of the edge adversarial loss and the edge feature-matching loss;
step S22: and optimizing the generation result of the edge repairing model according to the loss function to obtain a repaired edge image finally output by the edge repairing model.
Specifically, the loss function of the edge restoration model is a mixed loss function, whose purpose is to constrain the result judged by the edge discriminator; the mixed loss is a weighted sum of the edge adversarial loss and the edge feature-matching loss. The edge adversarial loss is a form of cross entropy and can be written as:
$$\mathcal{L}_{adv,e} = \mathbb{E}_{(C, I_{gray})}\left[\log D_e(C, I_{gray})\right] + \mathbb{E}_{I_{gray}}\left[\log\left(1 - D_e(C_p, I_{gray})\right)\right]$$
wherein $\mathcal{L}_{adv,e}$ denotes the edge adversarial loss, the first expectation is taken over the real edge map and grayscale map, the second over the grayscale map and the predicted edge map, and $D_e$ denotes the edge discriminator.
Secondly, the edge feature-matching loss is a distance function defined on the feature layers of the edge discriminator. Its main function is to compute the sum of distances between the features of the generated edges and of the Canny-detected edges, extracted at the discriminator's different layers, so the feature loss can be expressed as:
$$\mathcal{L}_{FM} = \mathbb{E}\left[\sum_{i=1}^{n} \frac{1}{N_i}\left\| D_e^{(i)}(C) - D_e^{(i)}(C_p) \right\|_1\right]$$
wherein $\mathcal{L}_{FM}$ denotes the edge feature-matching loss, $n$ the number of activation layers of the edge discriminator, and $N_i$ the number of elements in the $i$-th activation layer.
Finally, the optimization objective of the edge model can be written as:
$$\min_{G_e} \max_{D_e} \mathcal{L}_{G_e} = \min_{G_e}\left(\lambda_1 \max_{D_e}\left(\mathcal{L}_{adv,e}\right) + \lambda_2 \mathcal{L}_{FM}\right)$$
wherein minimization is over the edge restorer $G_e$, maximization is over the edge discriminator $D_e$, and $\lambda_1$, $\lambda_2$ weight the two terms.
In the present invention, as shown in fig. 10, the image restoration model includes an image restorer, which generates the restored picture after several downsampling steps on the restored edge map, several dilated-convolution residual blocks, one multi-head attention network and two deconvolutions.
A convolutional neural network focuses only on the pixel values of local regions when learning features, ignoring the influence that correlations between pixels in distant regions have on image generation and restoration; many attention-mechanism models have therefore been designed to better capture long-distance dependencies. The multi-head self-attention network is an extension of the self-attention network, which can effectively capture long-distance relations between pixels in an image. But the long-distance pixel relations of each region are not just one group, and a single self-attention network is not enough to learn multiple long-distance relations, so a multi-head attention network capable of capturing richer long-distance relations between pixel regions is adopted. The multi-head self-attention network can learn correlation matrices of different modes, which plays a very important role in improving the restoration result.
Specifically, as shown in fig. 2, in the present scheme, a multi-head self-attention layer network is added after the last residual layer, and the specific scheme is as follows:
step S2-1: obtaining several groups of query, key and value feature maps by applying different convolutional transformations to the feature map produced by the convolutional layers and the residual network;
specifically, as shown in FIG. 5, the size of the query feature map is Bg×Wf×Hf×CqIn which B isgFor hidden variable batches of generator inputs, WfIs the width, H, of the query profilefIs the height, C, of the query profileqIs the channel dimension of the query feature map, and key is the size of the feature map Bg×Wf×Hf×CkSeveral other parameters are the same as the query profile, CkIs the channel dimension of the key profile. Value feature map size Bg×Wf×Hf×CvOther parameters are the same as key and query, CvIs the channel dimension of the feature map.
Step S2-2: acquiring a reconstructed feature map, as shown in fig. 3, the specific method includes:
step S2-2-1: the key feature maps are transposed, and a grouped dot-product operation is performed between the query feature maps and the transposed key feature maps, as shown in fig. 6, obtaining several groups of correlation attention matrices;
step S2-2-2: normalizing the correlation attention matrices, wherein the dot-product matrix is normalized with a method such as Softmax.
Step S2-2-3: the normalized self-attention matrix of each group of correlations is matrix-multiplied with the value feature map of that group, as shown in fig. 7, to obtain the reconstructed feature map of the group. Each element of the reconstructed feature map is a weighted reconstruction of the pixels of the value feature map using the group's correlation attention matrix, where the weight of each contributing element is the corresponding pixel value in the correlation attention matrix.
Further, after the reconstructed feature map is obtained, step S2-3 is performed, as shown in fig. 8,
step S2-3: splicing the reconstructed feature maps along the channel dimension to obtain the multi-head attention combined result.
Step S2-4: the combined result is transformed back to the size of the original input feature map through a convolutional network, and the reconstructed feature map is added to the original feature map to obtain the final output restored picture result.
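Steps S2-1 through S2-4 can be sketched in NumPy as follows. The shapes, the flattening of spatial positions into rows, and the equal per-head channel split are illustrative assumptions for clarity, not the patent's exact implementation (which uses learned convolutions rather than plain matrix projections):

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax used to normalize the dot-product matrix (step S2-2-2)
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(feat, w_q, w_k, w_v, w_o, num_heads):
    """feat: (N, C) array whose rows are flattened spatial positions of the feature map.
    w_q, w_k, w_v, w_o: (C, C) projection matrices (standing in for 1x1 convolutions)."""
    n, c = feat.shape
    d = c // num_heads                          # assumed equal channel split per head
    q, k, v = feat @ w_q, feat @ w_k, feat @ w_v  # step S2-1: query/key/value projections
    heads = []
    for i in range(num_heads):
        qi, ki, vi = (m[:, i * d:(i + 1) * d] for m in (q, k, v))
        attn = softmax(qi @ ki.T)               # steps S2-2-1/2: dot product with transposed keys, softmax
        heads.append(attn @ vi)                 # step S2-2-3: weighted reconstruction with the value map
    combined = np.concatenate(heads, axis=1)    # step S2-3: splice heads along the channel dimension
    return feat + combined @ w_o                # step S2-4: project back and add to the original features
```

Note that the output has the same shape as the input feature map, matching the scheme's statement that adding the multi-head self-attention layer does not change the output feature size.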
In one embodiment provided by the present invention, a specific method for obtaining a plurality of attention combination results is as follows:
step S2-3-1: obtain the self-attention result of the i-th head:

head_i = Attention(Q_i, K_i, V_i) = Softmax(Q_i · K_i^T) · V_i

wherein Q_i, K_i and V_i denote the query, key and value feature-map matrices of the i-th head;
step S2-3-2: splice the self-attention results of all heads, use the W^o matrix to fuse and project the multiple feature spaces back to the size of the original matrix, and finally obtain the multi-head self-attention combined result:

MultiHead = Concat(head_1, head_2, ..., head_h) · W^o
wherein head_i denotes the self-attention result of the i-th head, h is the number of heads, and W^o is the output projection matrix.
In conclusion, although this scheme adds a multi-head self-attention layer network, the output feature size is unchanged, while the long-range information processed by the multiple heads participates more fully, thereby improving the image restoration effect.
In the present invention, further, the image processed by the edge restoration model is used by the image restoration model for restoration; the specific restoration method comprises:
step S23: obtaining a predicted repaired image by taking the tensor of the spliced repaired edge image and the damaged image as input, and obtaining the repaired image according to the predicted repaired image:
I_p = G_i(M, C^+, I_M)

I^+ = I · (1 - M) + I_p · M

wherein I_p is the predicted restored image, I is the real image, G_i is the image restorer, C^+ is the restored edge map, and I_M is the defect map.
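The masked composition above keeps the known pixels of the real image and fills only the masked region from the prediction. A tiny hypothetical example (toy 2×2 single-channel arrays, not the scheme's actual data):

```python
import numpy as np

# Hypothetical single-channel example; the mask M is 1 in the missing region.
I  = np.array([[0.8, 0.2], [0.5, 0.9]])   # real image
Ip = np.array([[0.1, 0.3], [0.6, 0.4]])   # predicted restored image
M  = np.array([[0.0, 1.0], [0.0, 1.0]])   # binary mask

# I^+ = I * (1 - M) + Ip * M: known pixels come from I, holes from Ip
I_plus = I * (1 - M) + Ip * M
```

Here the unmasked column keeps the real pixels (0.8, 0.5) while the masked column takes the predicted pixels (0.3, 0.4).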
Step S24: calculate the image restoration loss function and optimize the restoration result of the image restoration model, wherein the image restoration loss function comprises image adversarial loss, style loss and perceptual loss.
For example, the image adversarial loss takes the same form as the adversarial loss generated by the edge restoration model.
Furthermore, the style loss was first proposed in the image style transfer task; a later improvement alleviated the artifact problem of deconvolution by introducing the Gram matrix. This scheme adopts a style loss function based on the Gram matrix, whose loss function L_style is expressed as:

L_style = E_i [ (1 / (c_i h_i w_i)) · || Gr_i(I^+) - Gr_i(I) ||_1 ]

wherein Gr_i(I^+) represents the Gram matrix formed by the inner products of the predicted-image feature vectors at the i-th activation layer, Gr_i(I) represents the Gram matrix formed by the inner products of the real-image feature vectors, and c_i h_i w_i represents the dimensions of the activation feature. Four activation layers of the VGG19 network are selected: relu2-2, relu3-4, relu4-4 and relu5-2.
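As an illustration of the Gram-matrix style loss described above, a minimal NumPy sketch. In the actual scheme the per-layer features come from the selected VGG19 activation layers; here plain arrays stand in for them, and the c·h·w normalization follows the formula in the text:

```python
import numpy as np

def gram_matrix(feat):
    """feat: (C, H, W) activation map; returns the C x C Gram matrix
    of channel inner products, normalized by c*h*w."""
    c, h, w = feat.shape
    f = feat.reshape(c, h * w)
    return (f @ f.T) / (c * h * w)

def style_loss(pred_feats, real_feats):
    """Mean L1 distance between Gram matrices over the selected layers.
    pred_feats/real_feats: lists of (C, H, W) arrays, one per layer."""
    return float(np.mean([np.abs(gram_matrix(p) - gram_matrix(r)).mean()
                          for p, r in zip(pred_feats, real_feats)]))
```

The Gram matrix discards spatial arrangement and keeps only channel co-activation statistics, which is why this loss captures texture/style rather than pixel-wise agreement.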
In addition, the perceptual loss penalizes generated images that do not match the perceptual appearance of the real image by defining a distance measure between pre-trained activation layers. L_perc can be defined as:

L_perc = E [ Σ_i w_i · || φ_i(I) - φ_i(I^+) ||_1 ]

wherein φ_i corresponds to the 5 activation layers of the pre-trained VGG19 network: relu1-1, relu2-1, relu3-1, relu4-1 and relu5-1, and w_i represents the weight parameters (all set to 1 in this scheme).
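A corresponding sketch of the perceptual loss, again with plain arrays standing in for the VGG19 activations φ_i; the equal weights w_i = 1 match the scheme's stated choice:

```python
import numpy as np

def perceptual_loss(pred_feats, real_feats, weights=None):
    """Weighted L1 distance between pre-trained-network activations of the
    predicted and real images. pred_feats/real_feats: lists of arrays,
    one per selected activation layer."""
    if weights is None:
        weights = [1.0] * len(pred_feats)   # the scheme sets all w_i to 1
    return float(sum(w * np.abs(p - r).mean()
                     for w, p, r in zip(weights, pred_feats, real_feats)))
```

Unlike the style loss, this compares activations directly (preserving spatial layout), so it penalizes perceptually implausible structure rather than texture statistics.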
In summary, the loss function of the image restoration model combines multiple losses, which can be jointly calculated as:

L_i = λ3 · L_l1 + λ4 · L_adv^i + λ5 · L_style + λ6 · L_perc

wherein λ3, λ4, λ5 and λ6 are user-defined hyperparameters, L_l1 is the ℓ1 reconstruction loss, L_adv^i is the adversarial loss generated by the image restoration model, L_style is the style loss, and L_perc is the perceptual loss.
In the invention, further, after the training of the edge-prior-fusion multi-head attention restoration model is completed, the restoration result of the model is tested and evaluated on the test set; this part is implemented with the PyTorch framework on two 1080 Ti GPUs. The quality and restoration effect of the model are evaluated with four metrics: peak signal-to-noise ratio (PSNR), structural similarity (SSIM), ℓ1 error and Fréchet Inception Distance (FID).
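Of the four metrics, PSNR and the ℓ1 error are straightforward to compute directly from pixel values; a minimal NumPy sketch follows (SSIM needs windowed local statistics and FID needs an Inception feature extractor, so both are omitted here):

```python
import numpy as np

def psnr(img1, img2, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two images with values in [0, max_val]."""
    mse = np.mean((img1 - img2) ** 2)
    if mse == 0:
        return float('inf')   # identical images
    return float(10 * np.log10(max_val ** 2 / mse))

def l1_error(img1, img2):
    """Mean absolute pixel difference (the l1 error metric)."""
    return float(np.abs(img1 - img2).mean())
```

Higher PSNR and lower ℓ1 error both indicate a restored image closer to the original.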
In addition, as shown in fig. 11, the restoration results of the edge-prior-fusion multi-head attention restoration model are displayed. From left to right, the first image is the original image, the second is the image to be restored covered by the binary mask, the third is the image restored by the edge restoration model, and the fourth and fifth are the result images restored by the image restoration model. The image restored by the edge-prior-fusion multi-head attention restoration model is very similar to the original image; at some completely missing parts the restored image differs from the original, but the difference is not noticeable to human observation. The scheme achieves a good restoration effect and can reasonably restore the missing parts. The results show that the network fusing a multi-head attention mechanism performs better than expected for image restoration.
The above description is intended to describe in detail the preferred embodiments of the present invention, but the embodiments are not intended to limit the scope of the claims of the present invention, and all equivalent changes and modifications made within the technical spirit of the present invention should fall within the scope of the claims of the present invention.
Claims (10)
1. An image restoration method with edge prior fusion of a multi-head attention mechanism is characterized by comprising
Step S1: acquiring experimental data and preprocessing the data, wherein the experimental data comprises a training set and a testing set, and extracting an edge image of an image from a preprocessed image;
step S2: constructing an edge first-fusion multi-attention mechanism repairing model, wherein the edge first-fusion multi-attention mechanism repairing model comprises an edge repairing model and an image repairing model, the edge repairing model takes an extracted edge image, an original image and a mask image as input and outputs a repaired edge image, and the image repairing model takes the repaired edge image and a defect image as input for training;
the image restoration model comprises an image restoration device, wherein the image restoration device generates a restoration picture after sampling the restored edge image for multiple times, performing residual convolution based on expansion convolution for multiple times, performing one-time multi-head attention network and two-time deconvolution;
step S3: and evaluating the result of the edge first-fusion multi-attention mechanism repairing model through the test set.
2. The method of claim 1, wherein the edge restoration model comprises an edge restorer, and the edge restorer samples the extracted edge map, the original image and the mask image, and converts the feature map into a single-channel edge map after performing dilation convolution-based residual and two deconvolution.
3. The image restoration method based on the edge prior fusion multi-head attention mechanism according to claim 2, wherein the restoration method of the edge restoration model comprises:
step S20: obtaining a predicted edge repairing result of the edge repairing device, obtaining a generation result of an edge repairing model according to the predicted edge repairing result, reserving the image edge of the existing region, and filling the edge part needing repairing in the missing region, as follows:
C_p = G_e(M, C, I_gray)
C^+ = C · (1 - M) + C_p · M
wherein C_p represents the predicted edge-restored image, G_e represents the edge restorer, M represents the mask image, C represents the edge map of the image to be restored, I_gray represents the grayscale map of the image to be restored, and C^+ represents the restored edge image generated by the edge restoration model.
4. The method for repairing an image with an edge a priori fused with a multi-head attention mechanism according to claim 3, wherein the method for repairing the edge repairing model further comprises
Step S21, calculating a loss function of the edge restorer, wherein the loss function is a weighted sum of the generated edge confrontation loss and the edge characteristic loss;
step S22: and optimizing the generation result of the edge repairing model to obtain a repaired edge image.
5. The image inpainting method of an edge prior fusion multi-head attention mechanism according to claim 1, wherein the image inpainting model inpainting method comprises:
step S23: obtaining a predicted repaired image by taking the tensor of the spliced repaired edge image and the damaged image as input, and obtaining the repaired image according to the predicted repaired image:
I_p = G_i(M, C^+, I_M)
I^+ = I · (1 - M) + I_p · M
wherein I_p is the predicted restored image, I is the real image, G_i is the image restorer, and C^+ is the restored edge map;
step S24: calculating an image restoration loss function and optimizing the restoration result of the image restoration model, wherein the image restoration loss function comprises image adversarial loss, style loss and perceptual loss.
6. The method for repairing an image with an edge a priori fused with a multi-head attention mechanism according to claim 1, wherein the image repairing device generates the repaired image after sampling the repaired edge image for a plurality of times, performing a plurality of residual convolutions based on dilation convolution, performing a multi-head attention network once and performing deconvolution twice comprises:
step S2-1: obtaining several groups of query, key and value feature maps by applying different convolution transformations to the feature map obtained through the convolutional layers and the residual network;
step S2-2: acquiring a reconstructed characteristic map;
step S2-3: splicing the reconstructed feature maps according to the channel dimensions to obtain a plurality of attention combination results;
step S2-4: transforming the combined result back to the size of the original input feature map through a convolutional network, and adding the reconstructed feature map to the original feature map to obtain the final output restored picture result.
7. The image inpainting method of an edge prior fusion multi-head attention mechanism according to claim 6, wherein the obtaining of the reconstructed feature map in step S2-2 includes:
step S2-2-1: transposing the key feature maps, and performing a grouped dot-product operation between the query feature maps and the transposed key feature maps to obtain several groups of correlation attention matrices;
step S2-2-2: normalizing the correlation attention matrices;
step S2-2-3: matrix-multiplying the normalized self-attention matrix of each group of correlations with the value feature map of that group to obtain the reconstructed feature map of the group.
8. The image inpainting method of an edge prior fusion multi-head attention mechanism according to claim 6, wherein the step S2-3 of stitching the reconstructed feature maps according to the channel dimensions to obtain a plurality of attention combination results includes:
step S2-3-1: obtaining the self-attention result of the i-th head:

head_i = Attention(Q_i, K_i, V_i) = Softmax(Q_i · K_i^T) · V_i

wherein Q_i, K_i and V_i denote the query, key and value feature-map matrices of the i-th head;

step S2-3-2: splicing the self-attention results of the individual heads, using the W^o matrix to fuse and project the multiple feature spaces back to the size of the original matrix, and finally obtaining the multi-head self-attention combined result:

MultiHead = Concat(head_1, head_2, ..., head_h) · W^o
9. the image inpainting method of an edge prior fusion multi-head attention mechanism according to claim 5, wherein the style loss is calculated by:
10. The image restoration method of an edge-prior-fusion multi-head attention mechanism according to claim 7, wherein in step S2-2-3 each element of the value feature map is weighted and reconstructed from the pixels of the value feature map using the corresponding group of correlation attention matrices, and the weights used in the weighted reconstruction are the pixel values at the corresponding positions of the correlation attention matrix.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111356234.6A CN114022506B (en) | 2021-11-16 | 2021-11-16 | Image restoration method for edge prior fusion multi-head attention mechanism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114022506A true CN114022506A (en) | 2022-02-08 |
CN114022506B CN114022506B (en) | 2024-05-17 |
Family
ID=80065024
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111356234.6A Active CN114022506B (en) | 2021-11-16 | 2021-11-16 | Image restoration method for edge prior fusion multi-head attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114022506B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116188875A (en) * | 2023-03-29 | 2023-05-30 | 北京百度网讯科技有限公司 | Image classification method, device, electronic equipment, medium and product |
CN117351015A (en) * | 2023-12-05 | 2024-01-05 | 中国海洋大学 | Tamper detection method and system based on edge supervision and multi-domain cross correlation |
CN117649365A (en) * | 2023-11-16 | 2024-03-05 | 西南交通大学 | Paper book graph digital restoration method based on convolutional neural network and diffusion model |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111127346A (en) * | 2019-12-08 | 2020-05-08 | 复旦大学 | Multi-level image restoration method based on partial-to-integral attention mechanism |
CN113112411A (en) * | 2020-01-13 | 2021-07-13 | 南京信息工程大学 | Human face image semantic restoration method based on multi-scale feature fusion |
CN113240613A (en) * | 2021-06-07 | 2021-08-10 | 北京航空航天大学 | Image restoration method based on edge information reconstruction |
CN113379655A (en) * | 2021-05-18 | 2021-09-10 | 电子科技大学 | Image synthesis method for generating antagonistic network based on dynamic self-attention |
US20210342983A1 (en) * | 2020-04-29 | 2021-11-04 | Adobe Inc. | Iterative image inpainting with confidence feedback |
Non-Patent Citations (2)
Title |
---|
LI Ju; HUANG Wenpei: "Research on Image Inpainting Technology Based on Generative Adversarial Networks", Computer Applications and Software, no. 12, 12 December 2019 (2019-12-12) *
SHAO Hang; WANG Yongxiong: "Generative High-Resolution Image Inpainting Based on Parallel Adversarial and Multi-Condition Fusion", Pattern Recognition and Artificial Intelligence, no. 04, 15 April 2020 (2020-04-15) *
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116188875A (en) * | 2023-03-29 | 2023-05-30 | 北京百度网讯科技有限公司 | Image classification method, device, electronic equipment, medium and product |
CN116188875B (en) * | 2023-03-29 | 2024-03-01 | 北京百度网讯科技有限公司 | Image classification method, device, electronic equipment, medium and product |
CN117649365A (en) * | 2023-11-16 | 2024-03-05 | 西南交通大学 | Paper book graph digital restoration method based on convolutional neural network and diffusion model |
CN117351015A (en) * | 2023-12-05 | 2024-01-05 | 中国海洋大学 | Tamper detection method and system based on edge supervision and multi-domain cross correlation |
CN117351015B (en) * | 2023-12-05 | 2024-03-19 | 中国海洋大学 | Tamper detection method and system based on edge supervision and multi-domain cross correlation |
Also Published As
Publication number | Publication date |
---|---|
CN114022506B (en) | 2024-05-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113240580B (en) | Lightweight image super-resolution reconstruction method based on multi-dimensional knowledge distillation | |
CN110119780B (en) | Hyper-spectral image super-resolution reconstruction method based on generation countermeasure network | |
CN114022506B (en) | Image restoration method for edge prior fusion multi-head attention mechanism | |
CN108648197B (en) | Target candidate region extraction method based on image background mask | |
CN111062872A (en) | Image super-resolution reconstruction method and system based on edge detection | |
CN111709902A (en) | Infrared and visible light image fusion method based on self-attention mechanism | |
CN107977932A (en) | It is a kind of based on can differentiate attribute constraint generation confrontation network face image super-resolution reconstruction method | |
CN111787187B (en) | Method, system and terminal for repairing video by utilizing deep convolutional neural network | |
CN111784624B (en) | Target detection method, device, equipment and computer readable storage medium | |
CN115018727A (en) | Multi-scale image restoration method, storage medium and terminal | |
CN114897742B (en) | Image restoration method with texture and structural features fused twice | |
CN112163998A (en) | Single-image super-resolution analysis method matched with natural degradation conditions | |
CN116468645A (en) | Antagonistic hyperspectral multispectral remote sensing fusion method | |
Ma et al. | Multi-task interaction learning for spatiospectral image super-resolution | |
CN113298736A (en) | Face image restoration method based on face pattern | |
Luo et al. | Bi-GANs-ST for perceptual image super-resolution | |
CN112884657B (en) | Face super-resolution reconstruction method and system | |
CN117197627B (en) | Multi-mode image fusion method based on high-order degradation model | |
CN117291803B (en) | PAMGAN lightweight facial super-resolution reconstruction method | |
CN116523985B (en) | Structure and texture feature guided double-encoder image restoration method | |
CN111401209B (en) | Action recognition method based on deep learning | |
Ren et al. | A lightweight object detection network in low-light conditions based on depthwise separable pyramid network and attention mechanism on embedded platforms | |
CN114862699B (en) | Face repairing method, device and storage medium based on generation countermeasure network | |
CN114764754B (en) | Occlusion face restoration method based on geometric perception priori guidance | |
CN116071229A (en) | Image super-resolution reconstruction method for wearable helmet |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||