CN112102170B - Super-resolution reconstruction network training method and system, image generation method and system, terminal and medium - Google Patents


Info

Publication number
CN112102170B
CN112102170B (application CN202010987451.4A)
Authority
CN
China
Prior art keywords
face image
network
sharp
pseudo
super
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010987451.4A
Other languages
Chinese (zh)
Other versions
CN112102170A (en)
Inventor
王猛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Unisinsight Technology Co Ltd
Original Assignee
Chongqing Unisinsight Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Unisinsight Technology Co Ltd filed Critical Chongqing Unisinsight Technology Co Ltd
Priority to CN202010987451.4A priority Critical patent/CN112102170B/en
Publication of CN112102170A publication Critical patent/CN112102170A/en
Application granted granted Critical
Publication of CN112102170B publication Critical patent/CN112102170B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4053 Scaling of whole images or parts thereof based on super-resolution, i.e. the output image resolution being higher than the sensor resolution

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a super-resolution reconstruction network training method, an image generation method, a system, a terminal and a medium. The training method acquires a first sharp face image and a blurred face image respectively, generates a second sharp face image from the first sharp face image, and passes the blurred face image through a first generation network to generate a first pseudo-sharp face image; the second sharp face image passes sequentially through a second generation network and the first generation network to generate a second pseudo-sharp face image, and the super-resolution reconstruction network is trained on the first sharp face image, the first pseudo-sharp face image and the second pseudo-sharp face image. The invention also provides a super-resolution reconstructed image generation method, a training and image generation system, a terminal and a medium, which lower the training threshold of the super-resolution reconstruction network and reduce obstacles to its training.

Description

Super-resolution reconstruction network training method and system, image generation method and system, terminal and medium
Technical Field
The invention relates to the technical field of computers, in particular to a super-resolution reconstruction network training and image generation method, a super-resolution reconstruction network training and image generation system, a terminal and a medium.
Background
The image super-resolution reconstruction technology is a technology for enriching low-resolution pixels and details and improving image expressive force by using a specific method.
In related-art face image super-resolution reconstruction, a blurred low-resolution face image is usually obtained from a sharp high-resolution face image by a downsampling method such as interpolation, which is computationally expensive, and the super-resolution reconstruction network is trained on the downsampled blurred low-resolution face image and the corresponding sharp high-resolution face image. Pairs of sharp high-resolution face images and corresponding blurred low-resolution face images from real scenes are difficult to obtain, which hinders training of the super-resolution reconstruction network.
Disclosure of Invention
In view of the above shortcomings of the prior art, an object of the present invention is to provide a super-resolution reconstruction network training method, an image generation method, a system, a terminal and a medium, which solve the technical problems that obtaining a blurred low-resolution face image from a sharp high-resolution face image by a downsampling method such as interpolation is computationally expensive, that image pairs of a sharp high-resolution face and a corresponding blurred low-resolution face from real scenes are difficult to obtain, and that training of the super-resolution reconstruction network is thereby hindered.
The invention provides a super-resolution reconstruction network training method, which comprises the following steps:
acquiring a first sharp face image I_chr and a blurred face image I_lr respectively, wherein the resolution of the first sharp face image I_chr is higher than that of the blurred face image I_lr;
generating a second sharp face image I_clr from the first sharp face image I_chr, wherein the resolution of the first sharp face image I_chr is higher than that of the second sharp face image I_clr;
passing the blurred face image I_lr through a first generation network G_A to generate a first pseudo-sharp face image I_flr;
passing the second sharp face image I_clr sequentially through a second generation network G_B and the first generation network G_A to generate a second pseudo-sharp face image I_fclr, wherein the second generation network G_B is the inverse process of the first generation network G_A;
and training the super-resolution reconstruction network according to the first sharp face image I_chr, the first pseudo-sharp face image I_flr and the second pseudo-sharp face image I_fclr.
Optionally, the training end condition of the super-resolution reconstruction network includes:
inputting the first pseudo-sharp face image I_flr and the second pseudo-sharp face image I_fclr into the super-resolution reconstruction network to generate a reconstructed first pseudo-sharp face image I_fhr and a reconstructed second pseudo-sharp face image I_fchr, inputting the reconstructed first pseudo-sharp face image I_fhr and the reconstructed second pseudo-sharp face image I_fchr into a third discrimination network D_C to judge whether each was generated from a blurred picture, and the third discrimination network D_C failing to discriminate;
the similarity between the reconstructed second pseudo-sharp face image I_fchr and the first sharp face image I_chr being greater than a preset similarity threshold.
Optionally, before training the super-resolution reconstruction network according to the first sharp face image I_chr, the first pseudo-sharp face image I_flr and the second pseudo-sharp face image I_fclr, the method further comprises: acquiring a face component network;
inputting the first pseudo-sharp face image I_flr and the second pseudo-sharp face image I_fclr into the face component network to generate a face fusion component;
fusing the face fusion component with the first pseudo-sharp face image I_flr and the second pseudo-sharp face image I_fclr to generate four-channel images, and inputting the four-channel images into the super-resolution reconstruction network.
Optionally, the method further includes:
inputting the four-channel images into the super-resolution reconstruction network to generate the reconstructed first pseudo-sharp face image I_fhr and the reconstructed second pseudo-sharp face image I_fchr respectively.
Optionally, acquiring the first sharp face image I_chr and the blurred face image I_lr respectively comprises:
acquiring a plurality of face images of a real scene;
dividing the real-scene face images into real-scene sharp face images and real-scene blurred face images;
normalizing the real-scene sharp face images to a first resolution to generate the first sharp face image I_chr;
normalizing the real-scene blurred face images to a second resolution to generate the blurred face image I_lr.
Optionally, the second generation network G_B is obtained by training in the following manner:
obtaining a training sample, wherein the training sample comprises a sample second sharp face image I_yclr and a sample blurred face image I_ylr;
passing the sample second sharp face image I_yclr through a second generation network G_B to generate a sample second pseudo-sharp face image I_yfclr;
inputting the sample second pseudo-sharp face image I_yfclr and the sample blurred face image I_ylr into a second discrimination network D_A;
adjusting the second generation network G_B until the second discrimination network D_A fails to discriminate, wherein the second discrimination network D_A is used for discriminating whether the sample second pseudo-sharp face image I_yfclr and the sample blurred face image I_ylr are real blurred pictures.
Optionally, the sample second sharp face image I_yclr comprises sample sharp face images I_yclr from multiple scenes;
the sample blurred face image I_ylr comprises sample blurred face images I_ylr from multiple scenes.
Optionally, at least one of the following is also included:
the first generation network and the second generation network comprise a first loss function, the first loss function being used to strengthen the ability of the first generation network to convert the blurred face image I_lr into the first pseudo-sharp face image I_flr, and the first loss function comprising a cycle consistency loss;
obtaining a feature extraction network, inputting the reconstructed second pseudo-sharp face image I_fchr and the first sharp face image I_chr into the feature extraction network respectively to obtain a reconstructed face feature and a real face feature, obtaining the cosine similarity of the reconstructed face feature and the real face feature, and taking the cosine similarity as a second loss function, the second loss function being used for training the super-resolution reconstruction network.
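The cycle consistency term of the first loss function penalizes the difference between an image and its round trip through both generators. The following is a minimal numpy sketch of the standard L1 cycle consistency loss, not the patent's implementation; the two generators are stubbed as toy invertible transforms.

```python
import numpy as np

def cycle_consistency_loss(x, g_ab, g_ba):
    """L1 distance between x and g_ba(g_ab(x)): the round trip
    blurred -> pseudo-sharp -> re-blurred should reproduce x."""
    reconstructed = g_ba(g_ab(x))
    return float(np.mean(np.abs(x - reconstructed)))

# Toy "networks" (placeholders, not real generators):
g_a = lambda x: x * 2.0   # stands in for blurred -> sharp
g_b = lambda x: x / 2.0   # stands in for sharp -> blurred

img = np.ones((4, 4)) * 0.5
loss = cycle_consistency_loss(img, g_a, g_b)  # perfect round trip -> 0.0
```

When the two generators invert each other exactly the loss is zero; any mismatch in the round trip raises it, which is what drives G_A and G_B toward being inverse processes.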
The invention also provides a super-resolution reconstruction image generation method, which comprises the following steps:
acquiring a frontal image of a target face, and inputting the frontal image of the target face into a trained super-resolution reconstruction network to generate a super-resolution reconstructed image;
the training method of the super-resolution reconstruction network comprises the following steps:
acquiring a plurality of face images of a real scene;
dividing the real-scene face images into real-scene sharp face images and real-scene blurred face images;
normalizing the real-scene sharp face images to generate a first sharp face image I_chr;
normalizing the real-scene blurred face images to generate a blurred face image I_lr, wherein the resolution of the first sharp face image I_chr is higher than that of the blurred face image I_lr;
generating a second sharp face image I_clr from the first sharp face image I_chr, wherein the resolution of the first sharp face image I_chr is higher than that of the second sharp face image I_clr;
passing the blurred face image I_lr through a first generation network G_A to generate a first pseudo-sharp face image I_flr;
passing the second sharp face image I_clr sequentially through a second generation network G_B and the first generation network G_A to generate a second pseudo-sharp face image I_fclr, wherein the second generation network G_B is the inverse process of the first generation network G_A;
and training the super-resolution reconstruction network according to the first sharp face image I_chr, the first pseudo-sharp face image I_flr and the second pseudo-sharp face image I_fclr.
The invention also provides a super-resolution reconstruction network training system, which comprises:
an acquisition module for acquiring a first sharp face image I_chr and a blurred face image I_lr respectively, wherein the resolution of the first sharp face image I_chr is higher than that of the blurred face image I_lr;
a sharp face image generation module for generating a second sharp face image I_clr from the first sharp face image I_chr, wherein the resolution of the first sharp face image I_chr is higher than that of the second sharp face image I_clr;
a first pseudo-sharp face image generation module for passing the blurred face image I_lr through a first generation network G_A to generate a first pseudo-sharp face image I_flr;
a second pseudo-sharp face image generation module for passing the second sharp face image I_clr sequentially through a second generation network G_B and the first generation network G_A to generate a second pseudo-sharp face image I_fclr, wherein the second generation network G_B is the inverse process of the first generation network G_A;
a training module for training the super-resolution reconstruction network according to the first sharp face image I_chr, the first pseudo-sharp face image I_flr and the second pseudo-sharp face image I_fclr.
The invention also provides a terminal, which comprises a processor, a memory and a communication bus;
the communication bus is used for connecting the processor and the memory;
the processor is configured to execute the computer program stored in the memory to implement the super-resolution reconstruction network training method according to one or more of the above embodiments.
The present invention also provides a computer-readable storage medium, having stored thereon a computer program,
the computer program is for causing the computer to perform the super-resolution reconstruction network training method as described in any one of the above embodiments.
As described above, the super-resolution reconstruction network training method, image generation method, system, terminal and medium provided by the invention:
acquire a first sharp face image I_chr and a blurred face image I_lr respectively, the resolution of the first sharp face image I_chr being higher than that of the blurred face image I_lr; generate a second sharp face image I_clr from the first sharp face image I_chr, the resolution of the first sharp face image I_chr being higher than that of the second sharp face image I_clr; pass the blurred face image I_lr through a first generation network G_A to generate a first pseudo-sharp face image I_flr; pass the second sharp face image I_clr sequentially through a second generation network G_B and the first generation network G_A to generate a second pseudo-sharp face image I_fclr, the second generation network G_B being the inverse process of the first generation network G_A; and train the super-resolution reconstruction network according to the first sharp face image I_chr, the first pseudo-sharp face image I_flr and the second pseudo-sharp face image I_fclr. This solves the technical problems that obtaining a blurred low-resolution face image from a sharp high-resolution face image by a downsampling method such as interpolation is computationally expensive, and that image pairs of a sharp high-resolution face and a corresponding blurred low-resolution face from real scenes are difficult to obtain, which hinders training of the super-resolution reconstruction network.
Drawings
Fig. 1 is a schematic flow chart of a super-resolution reconstruction network training method according to an embodiment of the present invention;
fig. 2 is a schematic diagram of a face component network according to an embodiment of the present invention;
fig. 3 is another schematic flow chart of a super-resolution reconstruction network training method according to an embodiment of the present invention;
FIG. 4-1 is a blurred picture in a monitored scene according to an embodiment of the present invention;
FIG. 4-2 is a picture generated by the first generation network from FIG. 4-1 according to an embodiment of the present invention;
fig. 4-3 are clear pictures in a monitoring scene according to a first embodiment of the present invention;
fig. 5 shows a real-scene second-resolution blurred face image and a first pseudo-sharp face image generated by the first generating network G _ a according to the first embodiment of the present invention;
fig. 6 is a schematic diagram of a face fusion component according to an embodiment of the present invention;
fig. 7 is a schematic diagram illustrating an image effect output by the pre-trained super-resolution reconstruction network according to an embodiment of the present invention;
fig. 8-1 is a super-resolved picture output by the super-resolution reconstruction network according to a first embodiment of the present invention;
FIG. 8-2 is a diagram of an input blurred face image according to a first embodiment of the present invention;
fig. 9-1 is a comparison of a super-resolved image produced by the super-resolution reconstruction network according to an embodiment of the present invention with an image obtained by an upsampling method;
fig. 9-2 is another comparison of a super-resolved image produced by the super-resolution reconstruction network according to the first embodiment of the present invention with an image obtained by an upsampling method;
FIG. 10-1 is an example of a super-resolution result provided by an embodiment of the present invention;
FIG. 10-2 is another example of a super-resolution result provided by the first embodiment of the present invention;
fig. 11 is a specific flowchart of a super-resolution reconstructed image generation method according to a second embodiment of the present invention;
fig. 12 is a schematic structural diagram of a super-resolution reconstruction network training system according to a third embodiment of the present invention;
fig. 13 is a schematic structural diagram of a terminal according to a fourth embodiment of the present invention;
fig. 14 is a diagram illustrating a structure of a first generation network according to an embodiment of the present invention;
fig. 15 is a schematic structural diagram of a first discrimination network according to another embodiment of the present invention;
fig. 16 is a schematic structural diagram of ConvBlock in the embodiment of the present invention.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict.
It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention, and the components related to the present invention are only shown in the drawings rather than drawn according to the number, shape and size of the components in actual implementation, and the type, quantity and proportion of the components in actual implementation may be changed freely, and the layout of the components may be more complicated.
Embodiment One
Referring to fig. 1, an embodiment of the present invention provides a super-resolution reconstruction network training method, including:
s101: respectively acquiring first clear face images IchrAnd blurred face image Ilr
S102: according to the first clear face image IchrGenerating a second sharp face image Iclr
S103: blurred face image IlrGeneration of a first pseudo-sharp face image I by means of a first generation network G _ Aflr
S104: second sharp face image IclrGenerating a second pseudo-sharp face image I sequentially through a second generation network G _ B and a first generation network G _ Afclr
S105: according to the first clear face image IchrThe first pseudo-clear face image IflrAnd a second pseudo-sharp face image IfclrAnd training the super-resolution reconstruction network.
It should be noted that the second generation network G_B is the inverse process of the first generation network G_A.
It should be noted that the resolution of the first sharp face image I_chr is higher than that of the blurred face image I_lr.
It should be noted that the resolution of the first sharp face image I_chr is higher than that of the second sharp face image I_clr.
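The data flow of steps S101 to S105 can be sketched in a few lines. The following is a minimal numpy illustration, not the patent's implementation: the generators G_A and G_B are stubbed as identity functions, and the 112x112 and 56x56 resolutions are the example values given later in this embodiment.

```python
import numpy as np

# Placeholder "networks": identity functions standing in for the trained
# generators (hypothetical stand-ins, not the patent's actual models).
G_A = lambda x: x          # blurred -> pseudo-sharp generator
G_B = lambda x: x          # sharp -> pseudo-blurred generator (inverse of G_A)

def downsample2x(img):
    """2x2 average pooling: derive the lower-resolution second sharp image (S102)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

# S101: first sharp face image I_chr (112x112) and blurred face image I_lr (56x56)
I_chr = np.random.rand(112, 112)
I_lr = np.random.rand(56, 56)

# S102: second sharp face image I_clr at the blurred image's resolution
I_clr = downsample2x(I_chr)

# S103: first pseudo-sharp face image I_flr = G_A(I_lr)
I_flr = G_A(I_lr)

# S104: second pseudo-sharp face image I_fclr = G_A(G_B(I_clr))
I_fclr = G_A(G_B(I_clr))

# S105 would then train the SR network on (I_chr, I_flr, I_fclr).
```

The point of the sketch is the shapes: both pseudo-sharp inputs to the SR network share the blurred image's resolution, while I_chr supplies the high-resolution target.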
In some embodiments, the super-resolution reconstruction network is generated based on an SRGAN network model.
In some embodiments, the training end condition of the super-resolution reconstruction network includes:
inputting the first pseudo-sharp face image I_flr and the second pseudo-sharp face image I_fclr into the super-resolution reconstruction network to generate a reconstructed first pseudo-sharp face image I_fhr and a reconstructed second pseudo-sharp face image I_fchr, inputting the reconstructed first pseudo-sharp face image I_fhr and the reconstructed second pseudo-sharp face image I_fchr into a third discrimination network D_C to judge whether each was generated from a blurred picture, and the third discrimination network D_C failing to discriminate;
the similarity between the reconstructed second pseudo-sharp face image I_fchr and the first sharp face image I_chr being greater than a preset similarity threshold.
In some embodiments, before training the super-resolution reconstruction network according to the first sharp face image I_chr, the first pseudo-sharp face image I_flr and the second pseudo-sharp face image I_fclr, the method further comprises:
acquiring a face component network;
inputting the first pseudo-sharp face image I_flr and the second pseudo-sharp face image I_fclr into the face component network to generate a face fusion component;
fusing the face fusion component with the first pseudo-sharp face image I_flr and the second pseudo-sharp face image I_fclr to generate four-channel images, and inputting the four-channel images into the super-resolution reconstruction network.
In some embodiments, the acquisition of the face component network includes, but is not limited to, the following methods:
acquiring face component training data and training a face component model;
generating face components;
and fusing the generated face components onto the facial features of the generated face to obtain the fused face component network.
In some embodiments, the face component training data comprises an open source face component data set. The face component training data may also be other face component data sets collected by those skilled in the art, and is not limited herein.
In some embodiments, the face component includes, but is not limited to, at least one of: background, skin, hair, left and right eyebrows, left and right eyes, nose, upper and lower lips, teeth, and the like.
In some embodiments, referring to fig. 2, fig. 2 illustrates a network of face components.
In some embodiments, the training of the super-resolution reconstruction network is aided by a face component network to reduce loss of detail.
In some embodiments, the super-resolution reconstruction network training method further includes:
inputting the four-channel images into the super-resolution reconstruction network to generate the reconstructed first pseudo-sharp face image I_fhr and the reconstructed second pseudo-sharp face image I_fchr respectively.
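The four-channel fusion described above amounts to stacking the face-component map onto the face image along the channel axis. A minimal numpy sketch follows, assuming a 3-channel face image and a single-channel fused component map; the channel layout is an assumption for illustration, not stated in the patent.

```python
import numpy as np

def fuse_four_channel(image_rgb, component_map):
    """Concatenate a 3-channel face image with a 1-channel face-component
    map along the channel axis, producing a 4-channel SR-network input."""
    assert image_rgb.shape[:2] == component_map.shape[:2]
    return np.concatenate([image_rgb, component_map[..., None]], axis=-1)

face = np.random.rand(56, 56, 3)      # pseudo-sharp face image
components = np.random.rand(56, 56)   # fused face-component map
four_channel = fuse_four_channel(face, components)  # shape (56, 56, 4)
```

Feeding the component map as an extra channel lets the reconstruction network see where eyes, lips and other components lie, which is how the face component network helps reduce loss of detail.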
In some embodiments, if the third discrimination network D_C fails to discriminate, it means that for the reconstructed first pseudo-sharp face image I_fhr and the reconstructed second pseudo-sharp face image I_fchr generated by the super-resolution reconstruction network, it cannot be distinguished which was generated from the blurred image and which is the original sharp image; that is, the sharpness of the pictures generated by the super-resolution reconstruction network meets the requirement.
In some embodiments, the first sharp face image I_chr and the blurred face image I_lr are generated from acquired frontal face images by sharpness classification and normalization: sharp frontal face images are normalized to a first resolution, and blurred frontal face images are normalized to a second resolution. The first sharp face image I_chr and the blurred face image I_lr may be frontal face images of the same person or of different persons.
In some embodiments, acquiring the first sharp face image I_chr and the blurred face image I_lr respectively comprises:
acquiring a plurality of face images of a real scene;
dividing the real-scene face images into real-scene sharp face images and real-scene blurred face images;
normalizing the real-scene sharp face images to the first resolution to generate the first sharp face image I_chr;
normalizing the real-scene blurred face images to the second resolution to generate the blurred face image I_lr.
It should be noted that the first resolution is greater than the second resolution.
In some embodiments, the acquired face images are all frontal face images; that is, the sharp face images, the blurred face images, the first sharp face image I_chr, the second sharp face image I_clr, the blurred face image I_lr, and all images generated from them are frontal face images.
In some embodiments, the face images may be captured from the same scene or from multiple scenes. It should be noted that the face image may be a face image of the same person or a face image of a different person, and is not limited herein.
In some embodiments, a plurality of face front images shot by one or some monitoring cameras can be randomly collected as the face images.
Alternatively, the second resolution may be 56 × 56, the first resolution may be 112 × 112, and the first and second resolutions may also be other resolutions preset by those skilled in the art, which is not limited herein.
In some embodiments, the second sharp face image I_clr and the blurred face image I_lr have the same resolution.
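Normalizing real-scene captures of arbitrary sizes to the two fixed resolutions can be sketched as follows. This is an illustrative nearest-neighbour resize in numpy, not the patent's normalization method; any standard resize would serve, and the input sizes are hypothetical.

```python
import numpy as np

def normalize_resolution(img, size):
    """Nearest-neighbour resize of a 2-D image to size x size,
    a simple stand-in for the normalization step."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size   # source row index for each output row
    cols = np.arange(size) * w // size   # source column index for each output column
    return img[rows][:, cols]

sharp_capture = np.random.rand(300, 240)   # real-scene sharp face crop (example size)
blurred_capture = np.random.rand(80, 60)   # real-scene blurred face crop (example size)

I_chr = normalize_resolution(sharp_capture, 112)   # first resolution, 112x112
I_lr = normalize_resolution(blurred_capture, 56)   # second resolution, 56x56
```

The essential property is only that every sharp sample ends up at the first resolution and every blurred sample at the second, so the networks see consistent input shapes.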
In some embodiments, training the super-resolution reconstruction network according to the first sharp face image I_chr, the first pseudo-sharp face image I_flr and the second pseudo-sharp face image I_fclr comprises:
inputting the second pseudo-sharp face image I_fclr into the super-resolution reconstruction network to generate a reconstructed second pseudo-sharp face image I_fchr;
training the super-resolution reconstruction network a first time until the similarity between the reconstructed second pseudo-sharp face image I_fchr and the first sharp face image I_chr is greater than a preset similarity threshold;
inputting the first pseudo-sharp face image I_flr into the super-resolution reconstruction network after the first training to generate a new reconstructed first pseudo-sharp face image I_fhr;
re-inputting the second pseudo-sharp face image I_fclr into the super-resolution reconstruction network after the first training to generate a new reconstructed second pseudo-sharp face image I_fchr;
training the super-resolution reconstruction network a second time until it cannot be judged whether the new reconstructed first pseudo-sharp face image I_fhr and the new reconstructed second pseudo-sharp face image I_fchr were generated from blurred pictures;
thereby completing the training of the super-resolution reconstruction network.
That is, for the reconstructed first pseudo-sharp face image I_fhr and the reconstructed second pseudo-sharp face image I_fchr generated from the first pseudo-sharp face image I_flr and the second pseudo-sharp face image I_fclr, the trained super-resolution reconstruction network satisfies two conditions: on the one hand, it cannot be judged whether the reconstructed first pseudo-sharp face image I_fhr and the reconstructed second pseudo-sharp face image I_fchr were generated from blurred pictures; on the other hand, the similarity between the reconstructed second pseudo-sharp face image I_fchr and the first sharp face image I_chr is greater than the preset similarity threshold. This improves the sharpness and accuracy of the sharp pictures generated by the super-resolution reconstruction network, and improves the authenticity of the pictures the network outputs.
It should be noted that the preset similarity threshold may be set by a person skilled in the art as needed.
In some embodiments, the similarity between the reconstructed second pseudo-sharp face image I_fchr and the first sharp face image I_chr comprises the cosine similarity between the reconstructed second pseudo-sharp face image I_fchr and the first sharp face image I_chr.
In some embodiments, the facial features of the reconstructed second pseudo-sharp face image I_fchr and of the first sharp face image I_chr are extracted separately and compared to obtain the cosine similarity between the two images.
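The cosine similarity between two extracted feature vectors is a direct computation. A minimal numpy sketch with hypothetical toy feature vectors (real systems would use the feature extraction network's outputs):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors, as used to compare
    the reconstructed face feature with the real face feature."""
    a, b = np.ravel(a), np.ravel(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

real_feat = np.array([0.2, 0.8, 0.1, 0.5])   # feature of I_chr (toy values)
recon_feat = np.array([0.2, 0.8, 0.1, 0.5])  # feature of I_fchr (toy values)

sim = cosine_similarity(real_feat, recon_feat)  # identical vectors -> 1.0
passes = sim > 0.9  # compare against a preset similarity threshold (assumed value)
```

Comparing features rather than raw pixels makes the check robust to small pixel-level differences while still requiring the reconstruction to preserve identity.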
In some embodiments, the second generation network G_B is trained by:
obtaining a training sample, wherein the training sample comprises a sample second sharp face image I_yclr and a sample blurred face image I_ylr;
passing the sample second sharp face image I_yclr through the second generation network G_B to generate a sample second pseudo-sharp face image I_yfclr;
inputting the sample second pseudo-sharp face image I_yfclr and the sample blurred face image I_ylr into a second discrimination network D_A;
and adjusting the second generation network G_B until the second discrimination network D_A fails to discriminate.
It should be noted that the second discrimination network D_A is used to discriminate which of the sample second pseudo-sharp face image I_yfclr and the sample blurred face image I_ylr is a real blurred picture and which was generated by the generation network.
In some embodiments, the second discrimination network D_A is used to discriminate which of the sample second pseudo-sharp face image I_yfclr and the sample blurred face image I_ylr was generated by the generation network and which is an original image; if the second discrimination network D_A fails to discriminate, that is, it cannot judge which image was generated by the generation network, the training of the second generation network G_B is complete.
In some embodiments, the training formulas of the second generation network G _ B and the second discrimination network D _ a include:
min_{G_B} max_{D_A} E_{I_ylr ~ P_i}[log D_A(I_ylr)] + E_{I_yclr ~ P_c}[log(1 - D_A(G_B(I_yclr)))]
wherein min_G denotes taking the minimum over the generation network, max_D denotes taking the maximum over the discrimination network, E denotes the expectation, G_B(I_yclr) denotes the pseudo-blurred face image obtained from the sample second sharp face image I_yclr via the second generation network G_B, D_A(G_B(I_yclr)) denotes the discrimination result of the second discrimination network D_A on the generated pseudo-blurred face image, D_A(I_ylr) denotes the discrimination result of the second discrimination network D_A on the sample blurred face image I_ylr, P_c denotes the data distribution of sharp face images, P_i denotes the data distribution of blurred face images, I_ylr ~ P_i denotes that I_ylr follows the distribution P_i, and I_yclr ~ P_c denotes that I_yclr follows the distribution P_c.
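The value of this minimax objective for a batch of discriminator outputs can be evaluated numerically. The following numpy sketch uses hypothetical discriminator scores in (0, 1); the numbers are illustrative only.

```python
import numpy as np

def gan_value(d_real, d_fake):
    """Value of E[log D_A(I_ylr)] + E[log(1 - D_A(G_B(I_yclr)))]
    for batches of discriminator outputs in (0, 1)."""
    return float(np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake)))

# Toy scores: D_A on real blurred images vs. on G_B-generated images.
d_real = np.array([0.9, 0.8, 0.95])   # confident these are real blurred images
d_fake = np.array([0.1, 0.2, 0.05])   # confident these are generated

v = gan_value(d_real, d_fake)                       # strong discriminator: near 0
v_fooled = gan_value(np.full(3, 0.5), np.full(3, 0.5))  # fooled: 2 * log(0.5)
```

A well-trained discriminator keeps the value high; as G_B improves, D_A's outputs drift toward 0.5 and the value falls toward 2 log(0.5), which is the "fails to discriminate" stopping condition described above.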
In some embodiments, the first generation network G_A is trained by:
obtaining a training sample, wherein the training sample comprises a sample second clear face image I_yclr and a sample blurred face image I_ylr of a real scene at the second resolution;
inputting the sample blurred face image I_ylr into the first generation network G_A to generate a sample first pseudo-sharp face image I_yflr;
and adjusting the first generation network G_A until the first discrimination network D_B fails to discriminate.
It should be noted that the first discrimination network D_B is used for discriminating whether the sample first pseudo-sharp face image I_yflr and the sample second clear face image I_yclr were generated from a blurred picture.
In some embodiments, the first discrimination network D_B is used to discriminate which of the sample first pseudo-sharp face image I_yflr and the sample second clear face image I_yclr is the image generated by the generation network and which is the original image. If the first discrimination network D_B fails, that is, when it cannot determine which image was generated by the generation network, the training of the first generation network G_A is completed.
The training formulas of the first generation network G_A and the first discrimination network D_B are similar to those of the second generation network G_B and the second discrimination network D_A, and are not repeated herein.
It should be noted that the sample second clear face image I_yclr and the sample blurred face image I_ylr can be obtained in the same way as the second clear face image I_clr and the blurred face image I_lr. For example, face frontal images collected by a plurality of monitoring cameras are obtained and divided into clear pictures and blurred pictures according to their degree of sharpness; the clear pictures are normalized to obtain the sample first clear face image I_ychr, the sample second clear face image I_yclr is then generated from the sample first clear face image I_ychr, and the blurred pictures are normalized to obtain the sample blurred face image I_ylr.
In some embodiments, the sample second clear face image I_yclr and the sample blurred face image I_ylr, and the second clear face image I_clr and the blurred face image I_lr, may come from the same set of image samples.
In some embodiments, the first generation network G_A and the corresponding first discrimination network D_B form a rectification network that can convert a blurred face image into a first pseudo-sharp face image.
In some embodiments, the sample second clear face image I_yclr comprises multiple sample second clear face images I_yclr from multiple scenes, and the sample blurred face image I_ylr comprises multiple sample blurred face images I_ylr from multiple scenes.
In some embodiments, when training the second generation network G _ B and the first generation network G _ a, training samples in the same scene are used for training, and when the training standard is met, training samples in other scenes are obtained to train the second generation network G _ B and the first generation network G _ a, so that the second generation network G _ B and the first generation network G _ a can support image output in multiple scenes.
For example, in an initial state, the second generation network G_B and the first generation network G_A are trained with training samples acquired at a subway entrance; after the standard is met, the two networks are further fine-tuned with monitoring images acquired in scenes such as shopping malls, corridors and elevators, so that the second generation network G_B and the first generation network G_A have a good output effect on face images from a plurality of scenes.
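The curriculum above can be sketched as a simple training plan (scene names and phase labels are illustrative; the patent does not specify a convergence criterion):

```python
def progressive_training_plan(scenes):
    """Sketch of the curriculum described above: full training on the
    first scene, then fine-tuning passes on each remaining scene once
    the training standard is met."""
    if not scenes:
        return []
    return [("train", scenes[0])] + [("finetune", s) for s in scenes[1:]]

plan = progressive_training_plan(["subway entrance", "mall", "corridor"])
```

Each (phase, scene) pair stands for one pass over that scene's samples through G_B and G_A.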
In some embodiments, referring to fig. 14, fig. 14 is a schematic structural diagram of the first generation network G_A. After a three-channel input image normalized to 56 × 56 is fed into the first generation network G_A, the three-channel output image generated by the network is clearer than the input. The input three-channel image includes, but is not limited to, a blurred face image. Specifically, taking a blurred face image normalized to 56 × 56 as an example: the image is passed through one conv layer to produce a feature map of specification 28 × 28 × 64; after six A1 blocks, the processed feature map is passed through a further conv layer to produce a feature map of specification 14 × 14 × 128; this feature map is processed by three B1 blocks, and the facial features extracted at each resolution are combined through skip connections with the outputs of the subsequent B2 and Upsample stages before the final output is produced. Finally, a 56 × 56 × 3 first pseudo-sharp face image is generated that is sharper than the 56 × 56 × 3 blurred input.
In some embodiments, in fig. 14 each conv layer halves the spatial dimensions and each Upsample layer doubles them; the Upsample layers may employ deconvolution layers.
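The stated halving/doubling behaviour can be sanity-checked with a short sketch (assumed structure: two stride-2 convs followed by two Upsample stages, matching fig. 14; the function name is illustrative):

```python
def trace_shapes(size=56):
    """Trace the spatial size through the generator of fig. 14:
    two convs (each halves H and W) followed by two Upsample
    stages (each doubles H and W)."""
    shapes = [size]                  # 56 : input
    shapes.append(shapes[-1] // 2)   # 28 : after the first conv
    shapes.append(shapes[-1] // 2)   # 14 : after the second conv
    shapes.append(shapes[-1] * 2)    # 28 : after the first Upsample
    shapes.append(shapes[-1] * 2)    # 56 : after the second Upsample
    return shapes
```

The round trip returns to the input size, which is why the first pseudo-sharp output has the same 56 × 56 resolution as the blurred input.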
Referring to fig. 15, fig. 15 is a schematic structural diagram of the first discrimination network D_B: the 56 × 56 × 3 first pseudo-sharp face image generated in fig. 14 is input into the first discrimination network, which outputs a discrimination result used to determine whether the first pseudo-sharp face image is a sharp image.
It should be noted that 56 × 56 is only an exemplary size used in the present embodiment; those skilled in the art may also use images of other specifications, with a processing procedure similar to that shown in figs. 14 and 15, and no limitation is intended herein.
In some embodiments, the ConvBlock structure is shown with reference to FIG. 16.
In some embodiments, the second generation network has a similar structure to the first generation network shown in fig. 14, and is not described herein again.
In some embodiments, the second decision network has a similar structure to the first decision network shown in fig. 15, and is not described herein again.
In some embodiments, the super-resolution reconstruction network training method further comprises at least one of:
the first and second generation networks comprise a first loss function, the first loss function being used to strengthen the effect of the first and second generation networks in converting the blurred face image I_lr into the first pseudo-sharp face image I_flr, wherein the first loss function comprises a cycle consistency loss;
obtaining a feature extraction network, inputting the reconstructed second pseudo-sharp face image I_fchr and the first clear face image I_hr into the feature extraction network respectively, obtaining the reconstructed face feature and the real face feature respectively, computing the cosine similarity of the reconstructed face feature and the real face feature, and using the cosine similarity as a second loss function, the second loss function being used for training the super-resolution reconstruction network.
The super-resolution reconstruction network training method provided by the embodiment of the present invention is exemplarily described below by a specific embodiment, and the specific super-resolution reconstruction network training method includes:
s301: training samples are obtained.
In some embodiments, the training samples may be obtained by:
selecting a plurality of face images of real scenes from a plurality of scenes, and dividing the selected face images of the real scenes into clear face images of the real scenes and fuzzy face images of the real scenes according to the definition;
normalizing the clear face image of each real scene to 112 × 112, and recording as a clear HR;
normalizing the clear face image of each real scene to 56 × 56, and recording as clear LR;
normalizing the fuzzy face image of each real scene to 56 × 56, and recording as fuzzy LR;
selecting at least one image from the clear LRs as the sample second clear face image I_yclr;
selecting at least one image from the blurred LRs as the sample blurred face image I_ylr.
The training sample thus also comprises the sample second clear face image I_yclr and the sample blurred face image I_ylr, which are used to train the first and second generation networks.
Wherein, the scene includes but is not limited to subway entrance, district, market, etc.
It should be noted that the above normalized resolutions are only exemplary; those skilled in the art can set the resolution of each category of picture as needed. That is, the resolutions of the clear HR, the clear LR and the blur LR can all be set as needed.
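A minimal sketch of this sample-preparation step follows. The sharpness measure `sharpness_fn` (for example the variance of a Laplacian response) and the `threshold` are assumptions — the patent only says the split is made according to the degree of sharpness:

```python
def split_and_normalize(images, sharpness_fn, threshold=100.0):
    """Route real-scene face crops into the three categories above:
    clear HR (112 x 112), clear LR (56 x 56) and blur LR (56 x 56).
    Returns (image, target_size) pairs; the actual resizing is left
    to whatever image library is in use."""
    clear_hr, clear_lr, blur_lr = [], [], []
    for img in images:
        if sharpness_fn(img) >= threshold:
            clear_hr.append((img, (112, 112)))  # clear HR
            clear_lr.append((img, (56, 56)))    # clear LR
        else:
            blur_lr.append((img, (56, 56)))     # blur LR
    return clear_hr, clear_lr, blur_lr
```

Note that every clear picture contributes to both the clear HR and the clear LR categories, which is what later provides the paired I_ychr / I_yclr samples.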
In some embodiments, the training samples may be obtained by:
acquiring a face front image of a real scene under a certain monitoring scene, and dividing the face front image of the real scene into a clear face image of the real scene and a fuzzy face image of the real scene according to the definition;
normalizing the clear face image of the real scene to 112 × 112 as the sample first clear face image I_ychr, and down-sampling the sample first clear face image I_ychr to 56 × 56 as the sample second clear face image I_yclr;
normalizing the blurred face image of the real scene to 56 × 56 as the sample blurred face image I_ylr.
In some embodiments, the facial images are all frontal facial images.
S302, acquiring a first generation network and a second generation network.
In some embodiments, the accuracy of the image generated by the first generation network G_A may be improved through the first discrimination network D_B, the first generation network G_A being a generation network in an adversarial relationship with the first discrimination network D_B; the accuracy of the image generated by the second generation network G_B may be improved through the second discrimination network D_A, the second generation network G_B being a generation network in an adversarial relationship with the second discrimination network D_A; and the first generation network G_A is the inverse process of the second generation network G_B.
In some embodiments, the first generation network G_A and the first discrimination network D_B form the rectification network.
In some embodiments, the training method of the first and second generation networks is as follows:
the sample second clear face image I_yclr is passed through the second generation network G_B (the inverse process of the first generation network G_A) to generate the sample second pseudo-sharp face image I_yfclr, and the second discrimination network D_A judges which of the sample second pseudo-sharp face image I_yfclr and the sample blurred face image I_ylr is real and which is generated. The training formula is as follows:

$$\min_{G\_B}\max_{D\_A}\;\mathbb{E}_{I_{ylr}\sim P_i}\left[\log D\_A(I_{ylr})\right]+\mathbb{E}_{I_{yclr}\sim P_c}\left[\log\left(1-D\_A(G\_B(I_{yclr}))\right)\right]$$

wherein min_{G_B} indicates that the generation network takes the minimum value, max_{D_A} indicates that the discrimination network takes the maximum value, E indicates expectation, G_B(I_yclr) represents the pseudo-blurred face image obtained from the sample second clear face image I_yclr via the second generation network G_B, D_A(G_B(I_yclr)) represents the discrimination result of the generated pseudo-blurred face image by the second discrimination network D_A, D_A(I_ylr) represents the discrimination result of the sample blurred face image I_ylr by the second discrimination network D_A, P_c represents the data distribution of clear face images, P_i represents the data distribution of blurred face images, I_ylr ~ P_i indicates that I_ylr obeys the distribution P_i, and I_yclr ~ P_c indicates that I_yclr obeys the distribution P_c. The sample blurred face image I_ylr is then fed into the first generation network G_A to generate the sample first pseudo-sharp face image I_yflr, and the first discrimination network D_B judges which of the generated sample first pseudo-sharp face image I_yflr and the sample second clear face image I_yclr is real and which is generated. The training formula is similar to the above and is not repeated herein.
In some embodiments, when the first and second discrimination networks can no longer distinguish real pictures from generated ones, the training of the first and second generation networks is completed.
In some embodiments, the first and second generation networks comprise a first loss function for strengthening the effect of the first generation network G_A in converting a blurred picture into a sharp picture: the sample second clear face image I_yclr should be restored after passing sequentially through the second generation network G_B and the first generation network G_A, i.e. G_A(G_B(I_yclr)) ≈ I_yclr.
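The round-trip requirement G_A(G_B(I_yclr)) ≈ I_yclr is typically enforced with a pixel-wise cycle-consistency term. A minimal sketch over flat pixel lists follows; the L1 form is an assumption — the patent only requires that the round trip recovers the input:

```python
def cycle_consistency_l1(original, reconstructed):
    """Mean absolute pixel error between I_yclr and G_A(G_B(I_yclr)),
    both given as flat pixel lists.  Driving this toward zero enforces
    the round-trip constraint described above."""
    assert len(original) == len(reconstructed)
    return sum(abs(a - b) for a, b in zip(original, reconstructed)) / len(original)
```

The term is added to the adversarial objective so that G_B and G_A cannot satisfy their discriminators with images unrelated to the input.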
In some embodiments, the sample second clear face image I_yclr and the sample blurred face image I_ylr are all face images collected in the same scene.
In some embodiments, the training method for the first and second generation networks further includes:
the sample second clear face image I_yclr comprises multiple sample second clear face images I_yclr from multiple scenes;
the sample blurred face image I_ylr comprises multiple sample blurred face images I_ylr from multiple scenes.
That is, face frontal image data collected in different scenes are used as training samples to adjust the first and second generation networks, so that the first and second generation networks can stably and accurately output clear pictures for samples from different scenes. Referring to figs. 4-1, 4-2 and 4-3: fig. 4-1 is a blurred picture from a monitored scene, fig. 4-2 is the picture generated by the first generation network G_A from fig. 4-1, and fig. 4-3 is a clear picture from the monitored scene.
And S303, acquiring a face component network.
In some embodiments, the generation of the face component network includes, but is not limited to, the following methods:
acquiring face component training data, wherein the face component training data comprises an open-source face component data set;
training a face component model according to the face component training data to generate a face component, wherein the face component comprises but is not limited to at least one of the following: background, skin, hair, left and right eyebrows, left and right eyes, nose, upper and lower lips, and teeth;
and fusing the generated components on the facial features to obtain a fused facial component network.
In some embodiments, the generated face component network may be as shown in FIG. 2.
S304: respectively acquiring the second clear face image I_clr and the blurred face image I_lr.
In some embodiments, the second clear face image I_clr and the blurred face image I_lr may be the same images as the sample second clear face image I_yclr and the sample blurred face image I_ylr.
In some embodiments, the second clear face image I_clr and the blurred face image I_lr are obtained in a manner similar to the sample second clear face image I_yclr and the sample blurred face image I_ylr, which is not repeated herein.
S305: the blurred face image I_lr generates the first pseudo-sharp face image I_flr through the first generation network G_A.
In some embodiments, referring to fig. 5, the left picture is the blurred face image I_lr and the right picture is the first pseudo-sharp face image I_flr generated through the first generation network G_A. The first pseudo-sharp face image I_flr has a down-sampled style.
S306: the second clear face image I_clr generates the second pseudo-sharp face image I_fclr sequentially through the second generation network G_B and the first generation network G_A.
It should be noted that the second generation network G_B is the inverse process of the first generation network G_A.
S307: the first pseudo-sharp face image I_flr and the second pseudo-sharp face image I_fclr generate corresponding face fusion components through the face component network.
Referring to fig. 6, fig. 6 is a schematic diagram of the first pseudo-sharp face image I_flr and the second pseudo-sharp face image I_fclr generating the corresponding face fusion components through the face component network.
S308: the face fusion components are fused with the first pseudo-sharp face image I_flr and the second pseudo-sharp face image I_fclr to generate four-channel (R, G, B, A) inputs;
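A minimal sketch of this fusion step, assuming pixels are (r, g, b) tuples and the face fusion component supplies one alpha value per pixel (the data layout is an assumption; the patent does not fix a representation):

```python
def fuse_rgba(rgb_image, component_mask):
    """Fuse a pseudo-sharp RGB image with its face-component map as a
    fourth (A) channel, giving the four-channel (R, G, B, A) input fed
    to the SR network.  rgb_image: list of (r, g, b) tuples;
    component_mask: one component value per pixel."""
    assert len(rgb_image) == len(component_mask)
    return [(r, g, b, a) for (r, g, b), a in zip(rgb_image, component_mask)]
```

Carrying the component map as an extra channel lets the SR network condition on facial structure (eyes, lips, hair, etc.) instead of raw pixels alone.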
S309: the fused four-channel inputs are respectively fed into a pre-trained super-resolution reconstruction network (SR network) to generate the corresponding reconstructed first pseudo-sharp face image I_fhr and reconstructed second pseudo-sharp face image I_fchr.
In some embodiments, the pre-training mode of the super-resolution reconstruction network is as follows:
training the super-resolution reconstruction network so that the reconstructed second pseudo-sharp face image I_fchr approaches the first clear face image I_hr at the pixel level.
In some embodiments, making the reconstructed second pseudo-sharp face image I_fchr approach the first clear face image I_hr at the pixel level means computing the MSE (mean squared error) between the reconstructed second pseudo-sharp face image I_fchr and the first clear face image I_hr, so that the two are close in pixels, i.e. I_fchr approaches the real sharp picture I_hr. The corresponding pixel loss is:

‖I_hr − I_fchr‖.
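A short sketch of the pixel objective (function names are illustrative): the MSE is what is minimized during pre-training, and the PSNR commonly reported for super-resolution results is derived directly from it.

```python
import math

def mse(img_a, img_b):
    """Mean squared pixel error between I_hr and I_fchr (flat lists)."""
    return sum((a - b) ** 2 for a, b in zip(img_a, img_b)) / len(img_a)

def psnr(img_a, img_b, peak=255.0):
    """Peak signal-to-noise ratio derived from the MSE above; higher
    means the reconstruction is closer to the reference."""
    m = mse(img_a, img_b)
    return float("inf") if m == 0 else 10.0 * math.log10(peak * peak / m)
```

Minimizing the MSE between I_fchr and I_hr is equivalent to maximizing the PSNR of the reconstruction.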
referring to fig. 7, fig. 7 is a schematic diagram illustrating an image effect output by the super-resolution reconstruction network after pre-training, where the first row of images is an original input image, and the second row of images is a reconstructed image output by the super-resolution reconstruction network.
S310: the reconstructed first pseudo-sharp face image I_fhr and the reconstructed second pseudo-sharp face image I_fchr are input into the third discrimination network D_C for discrimination.
S311: obtaining a feature extraction network, inputting the reconstructed second pseudo-sharp face image I_fchr and the first clear face image I_hr into the feature extraction network respectively, obtaining the reconstructed face feature and the real face feature respectively, computing the cosine similarity of the reconstructed face feature and the real face feature, and using the cosine similarity as a second loss function.
The second loss function is used for training the super-resolution reconstruction network.
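The feature-level criterion above reduces to an ordinary cosine similarity between the two feature vectors; a minimal sketch (turning the similarity into a minimizable loss, e.g. 1 − similarity, is an assumption — the patent only names the similarity itself):

```python
import math

def cosine_similarity(feat_a, feat_b):
    """Cosine similarity between the reconstructed face feature and the
    real face feature produced by the feature extraction network."""
    dot = sum(a * b for a, b in zip(feat_a, feat_b))
    norm_a = math.sqrt(sum(a * a for a in feat_a))
    norm_b = math.sqrt(sum(b * b for b in feat_b))
    return dot / (norm_a * norm_b)
```

Because face-recognition features compare identities by angle rather than magnitude, driving this similarity toward 1 pushes the reconstruction to preserve the identity of the real face, not just its pixels.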
S312: and jointly training the super-resolution reconstruction network (SR network) and the first generation network G _ A through the first loss function and the second loss function.
Referring to figs. 8-1 and 8-2, fig. 8-1 is a picture super-resolved by the super-resolution reconstruction network, and fig. 8-2 is the input real-scene low-resolution blurred face image.
Referring to figs. 9-1 and 9-2, each compares pictures super-resolved by the super-resolution reconstruction network with pictures obtained by an up-sampling method: the first row in each of figs. 9-1 and 9-2 shows pictures super-resolved by the super-resolution reconstruction network, and the second row shows pictures obtained by the up-sampling method.
Referring to figs. 10-1 and 10-2, figs. 10-1 and 10-2 provide examples of super-resolution results produced from real-scene face frontal images by a super-resolution reconstruction network trained with the training method provided by this embodiment.
The embodiment of the invention respectively obtains the first clear face image I_chr and the blurred face image I_lr, the resolution of the first clear face image I_chr being higher than that of the blurred face image I_lr; generates the second clear face image I_clr from the first clear face image I_chr, the resolution of the first clear face image I_chr being higher than that of the second clear face image I_clr; generates the first pseudo-sharp face image I_flr from the blurred face image I_lr through the first generation network G_A; generates the second pseudo-sharp face image I_fclr from the second clear face image I_clr sequentially through the second generation network G_B and the first generation network G_A, the second generation network G_B being the inverse process of the first generation network G_A; and trains the super-resolution reconstruction network according to the first clear face image I_chr, the first pseudo-sharp face image I_flr and the second pseudo-sharp face image I_fclr. This solves the technical problems that blurred low-resolution face images obtained by down-sampling methods such as interpolation require a large amount of computation, and that image pairs of a clear high-resolution face and the corresponding blurred low-resolution face are difficult to obtain in real scenes, which hinders the training of the super-resolution reconstruction network. Because the second clear face image and the blurred face image only need to be obtained from the clear and blurred images of real scenes, they are easy to acquire, and the first and second pseudo-sharp face images are simple to generate, which lowers the training threshold of the super-resolution reconstruction network and reduces the obstacles to its training.
Optionally, the face fusion components are generated from the first pseudo-sharp face image I_flr and the second pseudo-sharp face image I_fclr based on the face component network, and the face fusion components are fused with the first pseudo-sharp face image I_flr and the second pseudo-sharp face image I_fclr to generate four-channel images that are input into the super-resolution reconstruction network. The face fusion components reduce the loss of detail, so that the reconstructed image generated by the super-resolution reconstruction network is more realistic and excessive loss of detail is prevented. The method can remove various kinds of noise from face pictures in the monitoring field, making monitored faces clear.
Optionally, the training end condition includes that the similarity between the reconstructed second pseudo-sharp face image I_fchr and the first clear face image I_chr is greater than a preset similarity threshold, which makes the reconstructed image generated by the super-resolution reconstruction network more realistic.
Optionally, the first and second generation networks are trained first with training samples from the same scene and then with training samples from different scenes, so that training samples from different scenes are converted into pictures in the same domain after being processed by the first and second generation networks. Training the super-resolution reconstruction network with the converted first pseudo-sharp face image I_flr and second pseudo-sharp face image I_fclr allows a large clear image to be super-resolved from monitored faces in a rich variety of scenes without affecting authenticity, so the super-resolution reconstruction network is applicable to multiple scenes and has better expandability.
Optionally, the first and second generation networks and the super-resolution reconstruction network are trained through the first and second loss functions, which improves the authenticity of the super-resolved face and prevents excessive loss of detail while the resolution is improved.
Example two
Referring to fig. 11, the present embodiment provides a super-resolution reconstructed image generating method, including:
and acquiring a front image of the target face, and inputting the front image of the target face into the super-resolution reconstruction network after training to generate a super-resolution reconstruction image.
The training method of the super-resolution reconstruction network comprises the following steps:
acquiring a plurality of face images of a real scene;
dividing a real scene face image into a real scene clear face image and a real scene fuzzy face image;
normalizing the clear face image of the real scene to generate the first clear face image I_chr;
normalizing the blurred face image of the real scene to generate the blurred face image I_lr.
It should be noted that the resolution of the first clear face image I_chr is higher than that of the blurred face image I_lr.
Generating the second clear face image I_clr from the first clear face image I_chr.
It should be noted that the resolution of the first clear face image I_chr is higher than that of the second clear face image I_clr.
The blurred face image I_lr generates the first pseudo-sharp face image I_flr through the first generation network G_A.
The second clear face image I_clr generates the second pseudo-sharp face image I_fclr sequentially through the second generation network G_B and the first generation network G_A.
It should be noted that the second generation network G_B is the inverse process of the first generation network G_A.
Training the super-resolution reconstruction network according to the first clear face image I_chr, the first pseudo-sharp face image I_flr and the second pseudo-sharp face image I_fclr.
It should be noted that the target face front image is a low-resolution face front image to be reconstructed.
In some embodiments, the target face frontal image may be a low resolution blurred face frontal image acquired under multiple scenes.
The super-resolution reconstructed image generation method is further described below by a specific embodiment, and referring to fig. 11, the specific super-resolution reconstructed image generation method includes:
s1101: acquiring a plurality of face images of a real scene;
s1102: dividing a real scene face image into a real scene clear face image and a real scene fuzzy face image;
s1103: normalizing the clear face image of the real scene to generate the first clear face image I_chr;
S1104: normalizing the blurred face image of the real scene to generate the blurred face image I_lr.
It should be noted that the resolution of the first clear face image I_chr is higher than that of the blurred face image I_lr.
S1105: generating the second clear face image I_clr from the first clear face image I_chr.
It should be noted that the resolution of the first clear face image I_chr is higher than that of the second clear face image I_clr.
S1106: the blurred face image I_lr generates the first pseudo-sharp face image I_flr through the first generation network G_A.
S1107: the second clear face image I_clr generates the second pseudo-sharp face image I_fclr sequentially through the second generation network G_B and the first generation network G_A.
It should be noted that the second generation network G_B is the inverse process of the first generation network G_A.
S1108: training the super-resolution reconstruction network according to the first clear face image I_chr, the first pseudo-sharp face image I_flr and the second pseudo-sharp face image I_fclr.
S1109: judging whether the training is finished, if the training is finished, executing the step S1110, and if the training is not finished, executing the step S1107;
s1110, generating a super-resolution reconstruction network after training is finished;
s1111: acquiring a front image of a target face;
s1112: and inputting the front image of the target face into the super-resolution reconstruction network after training to generate a super-resolution reconstruction image.
In some embodiments, the super-resolution reconstructed image can be seen in FIGS. 10-1 and 10-2.
The embodiment provides a super-resolution reconstruction image generation method, wherein a target face front image is acquired and input into a super-resolution reconstruction network after training is finished according to any one of the above embodiments, and the target face front image is input into the super-resolution reconstruction network to generate a super-resolution reconstruction image, so that a clearer and more accurate reconstruction image can be obtained, and the quality of a super-resolution face image reconstruction result is improved.
EXAMPLE III
Referring to fig. 12, the present embodiment provides a super-resolution reconstruction network training system 1200, which includes:
an obtaining module 1201, configured to respectively obtain the first clear face image I_chr and the blurred face image I_lr, the resolution of the first clear face image I_chr being higher than that of the blurred face image I_lr;
a clear face image generation module 1202, configured to generate the second clear face image I_clr from the first clear face image I_chr, the resolution of the first clear face image I_chr being higher than that of the second clear face image I_clr;
a first pseudo-sharp face image generation module 1203, configured to generate the first pseudo-sharp face image I_flr from the blurred face image I_lr through the first generation network G_A;
a second pseudo-sharp face image generation module 1204, configured to generate the second pseudo-sharp face image I_fclr from the second clear face image I_clr sequentially through the second generation network G_B and the first generation network G_A, the second generation network G_B being the inverse process of the first generation network G_A;
a training module 1205, configured to train the super-resolution reconstruction network according to the first clear face image I_chr, the first pseudo-sharp face image I_flr and the second pseudo-sharp face image I_fclr.
In this embodiment, the super-resolution reconstruction network training system is substantially provided with a plurality of modules for executing the super-resolution reconstruction network training method of the first embodiment; the specific functions and technical effects are as described in the first embodiment and are not repeated herein.
Example four
Referring to fig. 13, an embodiment of the present invention further provides a terminal 1300, including a processor 1301, a memory 1302, and a communication bus 1303;
communication bus 1303 is used to connect processor 1301 and memory 1302;
the processor 1301 is configured to execute a computer program stored in the memory 1302 to implement the super-resolution reconstruction network training method as described in the first embodiment above.
An embodiment of the present invention also provides a computer-readable storage medium on which a computer program is stored, the computer program being used for causing a computer to execute the super-resolution reconstruction network training method according to any one of the above embodiments.
Embodiments of the present application also provide a non-transitory readable storage medium, where one or more modules (programs) are stored in the storage medium, and when the one or more modules are applied to a device, the device may execute instructions (instructions) included in an embodiment of the present application.
It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including object oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The foregoing embodiments are merely illustrative of the principles and utilities of the present invention and are not intended to limit the invention. Any person skilled in the art may modify or change the above embodiments without departing from the spirit and scope of the present invention. Accordingly, all equivalent modifications or changes made by those skilled in the art without departing from the spirit and technical concept disclosed herein are intended to be covered by the claims of the present invention.

Claims (12)

1. A super-resolution reconstruction network training method is characterized by comprising the following steps:
respectively acquiring a first sharp face image I_chr and a blurred face image I_lr, wherein the resolution of the first sharp face image I_chr is higher than the resolution of the blurred face image I_lr;
generating a second sharp face image I_clr from the first sharp face image I_chr, wherein the resolution of the first sharp face image I_chr is higher than the resolution of the second sharp face image I_clr;
passing the blurred face image I_lr through a first generation network G_A to generate a first pseudo-sharp face image I_flr;
passing the second sharp face image I_clr sequentially through a second generation network G_B and the first generation network G_A to generate a second pseudo-sharp face image I_fclr, wherein the second generation network G_B is the inverse process of the first generation network G_A;
training the super-resolution reconstruction network according to the first sharp face image I_chr, the first pseudo-sharp face image I_flr, and the second pseudo-sharp face image I_fclr.
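The two pseudo-sharp branches of claim 1 can be sketched as data flow through the paired generators. This is a minimal illustration, not the patent's implementation: `g_a` and `g_b` are hypothetical stand-ins for the convolutional generation networks G_A (blurred to sharp) and G_B (sharp to blurred), and the array shapes are assumed.

```python
import numpy as np

def g_a(img):
    """Stand-in for the first generation network G_A: blurred -> pseudo-sharp."""
    return np.clip(img * 1.1, 0.0, 1.0)  # placeholder "sharpening" transform

def g_b(img):
    """Stand-in for the second generation network G_B, the inverse process: sharp -> blurred."""
    return np.clip(img * 0.9, 0.0, 1.0)  # placeholder "blurring" transform

# Toy channels-first inputs standing in for I_lr (blurred) and I_clr (downscaled sharp).
i_lr = np.random.default_rng(0).random((3, 32, 32))
i_clr = np.random.default_rng(1).random((3, 32, 32))

# The two branches named in claim 1:
i_flr = g_a(i_lr)          # I_flr = G_A(I_lr)
i_fclr = g_a(g_b(i_clr))   # I_fclr = G_A(G_B(I_clr)): sharp -> blurred -> pseudo-sharp

assert i_flr.shape == i_lr.shape and i_fclr.shape == i_clr.shape
```

Both pseudo-sharp outputs, together with I_chr, then feed the super-resolution reconstruction network during training.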
2. The super-resolution reconstruction network training method according to claim 1, wherein the training end condition of the super-resolution reconstruction network comprises:
inputting the first pseudo-sharp face image I_flr and the second pseudo-sharp face image I_fclr into the super-resolution reconstruction network to generate a reconstructed first pseudo-sharp face image I_fhr and a reconstructed second pseudo-sharp face image I_fchr, and inputting the reconstructed first pseudo-sharp face image I_fhr and the reconstructed second pseudo-sharp face image I_fchr into a third discrimination network D_C, which judges whether they are generated blurred pictures, wherein the third discrimination network D_C fails to discriminate; and
the similarity between the reconstructed second pseudo-sharp face image I_fchr and the first sharp face image I_chr is greater than a preset similarity threshold.
3. The super-resolution reconstruction network training method according to claim 1, wherein before training the super-resolution reconstruction network according to the first sharp face image I_chr, the first pseudo-sharp face image I_flr, and the second pseudo-sharp face image I_fclr, the method further comprises:
acquiring a face component network;
inputting the first pseudo-sharp face image I_flr and the second pseudo-sharp face image I_fclr into the face component network to generate a face fusion component;
fusing the face fusion component with the first pseudo-sharp face image I_flr and the second pseudo-sharp face image I_fclr to generate four-channel images, and inputting the four-channel images into the super-resolution reconstruction network.
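The four-channel fusion of claim 3 amounts to stacking a single-channel face fusion component onto a three-channel face image along the channel axis. A minimal sketch, assuming channels-first layout and a single-channel component map (the actual face component network and its output format are not specified in this claim):

```python
import numpy as np

def fuse_component(rgb_face, component_map):
    """Concatenate a single-channel face fusion component with a 3-channel face
    image to form the four-channel input described in claim 3 (channels-first)."""
    assert rgb_face.shape[1:] == component_map.shape  # spatial sizes must match
    return np.concatenate([rgb_face, component_map[None, ...]], axis=0)

face = np.zeros((3, 64, 64))   # a pseudo-sharp face image, 3 channels
component = np.ones((64, 64))  # hypothetical face-component map from the face component network
four_channel = fuse_component(face, component)
assert four_channel.shape == (4, 64, 64)
```

The extra channel gives the super-resolution reconstruction network an explicit prior on facial structure alongside the raw pixels.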
4. The super-resolution reconstruction network training method according to claim 3, further comprising:
inputting the four-channel images into the super-resolution reconstruction network to respectively generate the reconstructed first pseudo-sharp face image I_fhr and the reconstructed second pseudo-sharp face image I_fchr.
5. The super-resolution reconstruction network training method according to claim 1, wherein respectively acquiring the first sharp face image I_chr and the blurred face image I_lr comprises:
acquiring a plurality of real-scene face images;
dividing the real-scene face images into real-scene sharp face images and real-scene blurred face images;
normalizing the real-scene sharp face images to a first resolution to generate the first sharp face image I_chr;
normalizing the real-scene blurred face images to a second resolution to generate the blurred face image I_lr.
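The normalization steps of claim 5 can be sketched as resizing each image to its target resolution. The claim does not fix the resolutions or the resampling method, so the 128×128 / 32×32 sizes and the nearest-neighbour resize below are assumptions for illustration (a real pipeline would likely use bicubic resampling plus pixel-value normalization):

```python
import numpy as np

def normalize_resolution(img, out_h, out_w):
    """Nearest-neighbour resize of a channels-first image to a fixed resolution;
    a stand-in for the resolution-normalization step of claim 5."""
    h, w = img.shape[-2], img.shape[-1]
    rows = np.arange(out_h) * h // out_h   # source row for each output row
    cols = np.arange(out_w) * w // out_w   # source column for each output column
    return img[..., rows[:, None], cols[None, :]]

sharp = np.random.default_rng(2).random((3, 100, 80))
blurred = np.random.default_rng(3).random((3, 30, 20))
i_chr = normalize_resolution(sharp, 128, 128)  # first resolution (assumed 128x128)
i_lr = normalize_resolution(blurred, 32, 32)   # second resolution (assumed 32x32)
assert i_chr.shape == (3, 128, 128) and i_lr.shape == (3, 32, 32)
```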
6. The super-resolution reconstruction network training method according to any one of claims 1 to 5, wherein the second generation network G_B is trained by:
obtaining a training sample, wherein the training sample comprises a sample second sharp face image I_yclr and a sample blurred face image I_ylr;
passing the sample second sharp face image I_yclr through the second generation network G_B to generate a sample second pseudo-sharp face image I_yfclr;
inputting the sample second pseudo-sharp face image I_yfclr and the sample blurred face image I_ylr into a second discrimination network D_A;
adjusting the second generation network G_B until the second discrimination network D_A fails to discriminate, wherein the second discrimination network D_A is used for judging whether the sample second pseudo-sharp face image I_yfclr and the sample blurred face image I_ylr are truly blurred pictures.
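Claim 6's stopping condition, "the second discrimination network D_A fails to discriminate", can be read as the discriminator assigning statistically indistinguishable scores to generated and truly blurred faces. A minimal sketch of such a convergence check; the score arrays and the tolerance `tol` are assumed, since the claim does not quantify the failure criterion:

```python
import numpy as np

def discriminator_fails(d_scores_fake, d_scores_real, tol=0.1):
    """Treat D_A as having 'failed to discriminate' once its mean score on
    generated blurred faces is within `tol` (an assumed hyper-parameter)
    of its mean score on truly blurred faces."""
    gap = abs(float(np.mean(d_scores_fake)) - float(np.mean(d_scores_real)))
    return gap < tol

# Early in training, D_A clearly separates the two populations ...
assert not discriminator_fails(np.array([0.1, 0.2]), np.array([0.9, 0.8]))
# ... at convergence both populations score near 0.5, and G_B can stop adjusting.
assert discriminator_fails(np.array([0.49, 0.52]), np.array([0.51, 0.48]))
```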
7. The super-resolution reconstruction network training method according to claim 6, wherein:
the sample second sharp face image I_yclr comprises multi-sample second sharp face images I_yclr of multiple scenes; and
the sample blurred face image I_ylr comprises multi-sample blurred face images I_ylr of multiple scenes.
8. The super-resolution reconstruction network training method according to any one of claims 1 to 5, further comprising at least one of:
the first generation network and the second generation network comprise a first loss function, wherein the first loss function is used for enhancing the capability of the first generation network to convert the blurred face image I_lr into the first pseudo-sharp face image I_flr, and the first loss function comprises a cycle consistency loss;
obtaining a feature extraction network, inputting the reconstructed second pseudo-sharp face image I_fchr and the first sharp face image I_chr respectively into the feature extraction network to respectively obtain reconstructed face features and real face features, obtaining the cosine similarity of the reconstructed face features and the real face features, and taking the cosine similarity as a second loss function, wherein the second loss function is used for training the super-resolution reconstruction network.
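The two losses of claim 8 have standard closed forms: cycle consistency is typically an L1 distance between an image and its round-trip reconstruction, and the feature-space term compares face embeddings by cosine similarity. A sketch under those assumptions (the claim does not fix the exact norm, and the feature vectors here are toy stand-ins for the feature extraction network's output):

```python
import numpy as np

def cycle_consistency_loss(original, reconstructed):
    """L1 cycle consistency: G_A(G_B(x)) should return to x."""
    return float(np.mean(np.abs(original - reconstructed)))

def cosine_similarity_loss(feat_reconstructed, feat_real):
    """Second loss of claim 8: cosine similarity between face features of the
    reconstructed image I_fchr and the sharp reference; 1 - cos is minimized."""
    cos = float(np.dot(feat_reconstructed, feat_real) /
                (np.linalg.norm(feat_reconstructed) * np.linalg.norm(feat_real)))
    return 1.0 - cos

x = np.ones((3, 8, 8))
assert cycle_consistency_loss(x, x) == 0.0             # perfect round trip
f = np.array([1.0, 2.0, 3.0])
assert abs(cosine_similarity_loss(f, 2.0 * f)) < 1e-9  # same direction => zero loss
```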
9. A super-resolution reconstruction image generation method is characterized by comprising the following steps:
acquiring a front image of a target face, and inputting the front image of the target face into a super-resolution reconstruction network after training to generate a super-resolution reconstruction image;
the training method of the super-resolution reconstruction network comprises the following steps:
acquiring a plurality of real-scene face images;
dividing the real-scene face images into real-scene sharp face images and real-scene blurred face images;
normalizing the real-scene sharp face images to generate a first sharp face image I_chr;
normalizing the real-scene blurred face images to generate a blurred face image I_lr, wherein the resolution of the first sharp face image I_chr is higher than the resolution of the blurred face image I_lr;
generating a second sharp face image I_clr from the first sharp face image I_chr, wherein the resolution of the first sharp face image I_chr is higher than the resolution of the second sharp face image I_clr;
passing the blurred face image I_lr through a first generation network G_A to generate a first pseudo-sharp face image I_flr;
passing the second sharp face image I_clr sequentially through a second generation network G_B and the first generation network G_A to generate a second pseudo-sharp face image I_fclr, wherein the second generation network G_B is the inverse process of the first generation network G_A;
training the super-resolution reconstruction network according to the first sharp face image I_chr, the first pseudo-sharp face image I_flr, and the second pseudo-sharp face image I_fclr.
10. A super-resolution reconstruction network training system is characterized by comprising:
an acquisition module, configured to respectively acquire a first sharp face image I_chr and a blurred face image I_lr, wherein the resolution of the first sharp face image I_chr is higher than the resolution of the blurred face image I_lr;
a sharp face image generation module, configured to generate a second sharp face image I_clr from the first sharp face image I_chr, wherein the resolution of the first sharp face image I_chr is higher than the resolution of the second sharp face image I_clr;
a first pseudo-sharp face image generation module, configured to pass the blurred face image I_lr through a first generation network G_A to generate a first pseudo-sharp face image I_flr;
a second pseudo-sharp face image generation module, configured to pass the second sharp face image I_clr sequentially through a second generation network G_B and the first generation network G_A to generate a second pseudo-sharp face image I_fclr, wherein the second generation network G_B is the inverse process of the first generation network G_A; and
a training module, configured to train the super-resolution reconstruction network according to the first sharp face image I_chr, the first pseudo-sharp face image I_flr, and the second pseudo-sharp face image I_fclr.
11. A terminal comprising a processor, a memory, and a communication bus;
the communication bus is used for connecting the processor and the memory;
the processor is configured to execute a computer program stored in the memory to implement the super-resolution reconstruction network training method according to any one of claims 1 to 8.
12. A computer-readable storage medium, having stored thereon a computer program,
the computer program is for causing the computer to perform the super resolution reconstruction network training method of any one of claims 1 to 8.
CN202010987451.4A 2020-09-18 2020-09-18 Super-resolution reconstruction network training method, super-resolution reconstruction network training system, super-resolution reconstruction network training image generation method, super-resolution reconstruction network training system, terminal and medium Active CN112102170B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010987451.4A CN112102170B (en) 2020-09-18 2020-09-18 Super-resolution reconstruction network training method, super-resolution reconstruction network training system, super-resolution reconstruction network training image generation method, super-resolution reconstruction network training system, terminal and medium

Publications (2)

Publication Number Publication Date
CN112102170A CN112102170A (en) 2020-12-18
CN112102170B true CN112102170B (en) 2021-05-18

Family

ID=73759865

Country Status (1)

Country Link
CN (1) CN112102170B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101799919A (en) * 2010-04-08 2010-08-11 西安交通大学 Front face image super-resolution rebuilding method based on PCA alignment
CN101872472A (en) * 2010-06-02 2010-10-27 中国科学院自动化研究所 Method for super-resolution reconstruction of facial image on basis of sample learning
CN102968775A (en) * 2012-11-02 2013-03-13 清华大学 Low-resolution face image rebuilding method based on super-resolution rebuilding technology
CN107154023A (en) * 2017-05-17 2017-09-12 电子科技大学 Face super-resolution reconstruction method based on generation confrontation network and sub-pix convolution
CN110211045A (en) * 2019-05-29 2019-09-06 电子科技大学 Super-resolution face image method based on SRGAN network

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101477684B (en) * 2008-12-11 2010-11-10 西安交通大学 Process for reconstructing human face image super-resolution by position image block
CN101615290B (en) * 2009-07-29 2012-09-05 西安交通大学 Face image super-resolution reconstructing method based on canonical correlation analysis
CN101697197B (en) * 2009-10-20 2012-05-23 西安交通大学 Method for recognizing human face based on typical correlation analysis spatial super-resolution
US9928406B2 (en) * 2012-10-01 2018-03-27 The Regents Of The University Of California Unified face representation for individual recognition in surveillance videos and vehicle logo super-resolution system

Legal Events

PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant