CN111369440A - Model training method, image super-resolution processing method, device, terminal and storage medium


Info

Publication number
CN111369440A
Authority
CN
China
Prior art keywords
resolution
image
neural network
network model
super
Prior art date
Legal status
Granted
Application number
CN202010141266.3A
Other languages
Chinese (zh)
Other versions
CN111369440B (en)
Inventor
陈伟民 (Chen Weimin)
袁燚 (Yuan Yi)
范长杰 (Fan Changjie)
胡志鹏 (Hu Zhipeng)
Current Assignee
Netease Hangzhou Network Co Ltd
Original Assignee
Netease Hangzhou Network Co Ltd
Priority date
Filing date
Publication date
Application filed by Netease Hangzhou Network Co Ltd filed Critical Netease Hangzhou Network Co Ltd
Priority to CN202010141266.3A
Publication of CN111369440A
Application granted
Publication of CN111369440B
Legal status: Active

Classifications

    • G06T 3/4053: Scaling of whole images or parts thereof based on super-resolution, i.e. the output image resolution being higher than the sensor resolution (under G06T 3/00 Geometric image transformations in the plane of the image; G06T 3/40 Scaling of whole images or parts thereof)
    • G06F 18/253: Fusion techniques of extracted features (under G06F 18/00 Pattern recognition; G06F 18/25 Fusion techniques)
    • G06N 3/045: Combinations of networks (under G06N 3/00 Computing arrangements based on biological models; G06N 3/04 Architecture, e.g. interconnection topology)
    • G06N 3/08: Learning methods (under G06N 3/02 Neural networks)


Abstract

The application provides a model training method, an image super-resolution processing method, a model training device, an image super-resolution processing device, a terminal and a storage medium, and relates to the technical field of model training. The method comprises the following steps: down-sampling the original high-resolution image corresponding to a sample low-resolution image to obtain high-resolution images at a plurality of resolutions, including the original high-resolution image; performing feature extraction on the sample low-resolution image with a plurality of feature extraction branches, respectively, to obtain image features of a plurality of levels; fusing the image features of the plurality of levels with a feature fusion module to obtain the fusion features of the sample low-resolution image; reconstructing the fusion features with a plurality of reconstruction branches to obtain super-resolution images at a plurality of resolutions; and training the neural network model according to the high-resolution images at the plurality of resolutions and the corresponding super-resolution images. When the trained neural network model is used to restore low-resolution images, the generated super-resolution images contain richer semantic information and have higher definition.

Description

Model training method, image super-resolution processing method, device, terminal and storage medium
Technical Field
The invention relates to the technical field of model training, in particular to a method, a device, a terminal and a storage medium for model training and image super-resolution processing.
Background
Image resolution refers to the amount of information stored in an image, i.e. the number of pixels per inch of the image. Low-resolution images are less sharp and contain fewer features. Restoring a low-resolution image to a super-resolution image improves the definition of the image and makes the details it contains more realistic.
In the related art, a plurality of feature extraction blocks are arranged and connected in series, so that image features of different levels are extracted by the serially connected feature extraction blocks, and a super-resolution image is generated from the image features of the different levels.
However, because the image features are extracted by a plurality of serially connected feature extraction blocks and the super-resolution image is generated directly from them, the generated super-resolution image contains many noises and artifacts and has poor definition.
Disclosure of Invention
The present invention aims to provide a model training method, an image super-resolution processing method, a device, a terminal and a storage medium, so as to solve the problem in the related art that, because image features are extracted by a plurality of serially connected feature extraction blocks and a super-resolution image is generated from them directly, the generated super-resolution image contains many noises and artifacts and therefore has poor definition.
In order to achieve the above purpose, the embodiment of the present invention adopts the following technical solutions:
in a first aspect, an embodiment of the present invention provides a model training method, where the model training method is applied to a neural network model, and the neural network model includes: the device comprises a feature extraction module and a reconstruction module, wherein the feature extraction module comprises: the system comprises a plurality of feature extraction branches and a feature fusion module, wherein different feature extraction branches correspond to image features of different levels; the reconstruction module includes: the system comprises a plurality of reconstruction branches, wherein different reconstruction branches correspond to different resolutions, and the input of a reconstruction branch of a next resolution is the output of a reconstruction branch of a previous resolution, wherein the next resolution is greater than the previous resolution; the method comprises the following steps:
performing down-sampling on an original high-resolution image corresponding to an input sample low-resolution image at least once to obtain high-resolution images at a plurality of resolutions, including the original high-resolution image;
respectively adopting the plurality of feature extraction branches to perform feature extraction on the sample low-resolution image to obtain a plurality of levels of image features;
performing fusion processing on the image features of the multiple layers by adopting the feature fusion module to obtain fusion features of the sample low-resolution image;
respectively adopting the plurality of reconstruction branches to reconstruct the fusion features to obtain super-resolution images with a plurality of resolutions; the image output by the last reconstruction branch is a target super-resolution image corresponding to the sample low-resolution image;
and training the neural network model according to the high-resolution images with the plurality of resolutions and the corresponding super-resolution images.
Further, the training the neural network model according to the high resolution images of the plurality of resolutions and the corresponding super-resolution images includes:
determining a loss function value of the initial neural network model according to the high-resolution images with the plurality of resolutions and the corresponding super-resolution images;
and adjusting parameters of the neural network model according to the loss function value until the adjusted loss function value of the neural network model converges.
Further, the determining a loss function value of the neural network model from the high resolution images of the plurality of resolutions and the corresponding super resolution images comprises:
determining a pixel loss value of the neural network model according to the high-resolution images with the plurality of resolutions and the corresponding super-resolution images;
determining a perception loss value of the neural network model according to the high-resolution images with the plurality of resolutions and a feature map output by a preset layer of the corresponding super-resolution image in a pre-training model;
determining a countermeasure loss value of the neural network model according to the original high-resolution image and the target super-resolution image;
determining a loss function value of the neural network model according to the pixel loss value, the perceptual loss value and the antagonistic loss value.
Further, the determining a countermeasure loss value of the initial neural network model from the original high resolution image and the target super resolution image comprises:
determining the probability of the original high-resolution image to be true to the target super-resolution image and the probability of the target super-resolution image to be false to the original high-resolution image by adopting a discriminator;
determining the countermeasure loss value based on the probability of truth and the probability of falseness.
Further, the determining a loss function value of the neural network model from the pixel loss value, the perceptual loss value, and the antagonistic loss value includes:
and determining a loss function value of the neural network model by adopting a preset weighting algorithm according to the pixel loss value, the perception loss value and the confrontation loss value.
Further, the adjusting the parameters of the neural network model according to the loss function value until the adjusted loss function value of the neural network model converges includes:
and adjusting parameters of the neural network model by adopting a preset gradient descent method according to the loss function value until the adjusted loss function value of the neural network model is converged.
In a second aspect, an embodiment of the present application further provides an image super-resolution processing method, where the method is applied to a neural network model obtained by the training method in any one of the first aspects, and the image super-resolution processing method includes:
acquiring an input low-resolution image;
and carrying out super-resolution processing on the low-resolution image by adopting the neural network model to obtain a target super-resolution image corresponding to the low-resolution image.
In a third aspect, an embodiment of the present application further provides a model training apparatus, where the model training apparatus is applied to a neural network model, and the neural network model includes: the device comprises a feature extraction module and a reconstruction module, wherein the feature extraction module comprises: the system comprises a plurality of feature extraction branches and a feature fusion module, wherein different feature extraction branches correspond to image features of different levels; the reconstruction module includes: the system comprises a plurality of reconstruction branches, wherein different reconstruction branches correspond to different resolutions, and the input of a reconstruction branch of a next resolution is the output of a reconstruction branch of a previous resolution, wherein the next resolution is greater than the previous resolution; the device comprises:
the down-sampling module is used for performing down-sampling on an original high-resolution image corresponding to an input sample low-resolution image at least once to obtain high-resolution images at a plurality of resolutions, including the original high-resolution image;
the extraction module is used for respectively adopting the plurality of characteristic extraction branches to extract the characteristics of the sample low-resolution image to obtain the image characteristics of a plurality of layers;
the fusion module is used for performing fusion processing on the image features of the multiple layers by adopting the feature fusion module to obtain fusion features of the sample low-resolution images;
the reconstruction processing module is used for respectively adopting the plurality of reconstruction branches to reconstruct the fusion features to obtain super-resolution images with a plurality of resolutions; the image output by the last reconstruction branch is a target super-resolution image corresponding to the sample low-resolution image;
and the training module is used for training the neural network model according to the high-resolution images with the plurality of resolutions and the corresponding super-resolution images.
Further, the training module is further configured to determine a loss function value of the initial neural network model according to the high-resolution images of the plurality of resolutions and the corresponding super-resolution images; and adjusting parameters of the neural network model according to the loss function value until the adjusted loss function value of the neural network model converges.
Further, the training module is further configured to determine a pixel loss value of the neural network model according to the high resolution images of the plurality of resolutions and the corresponding super-resolution images; determining a perception loss value of the neural network model according to the high-resolution images with the plurality of resolutions and a feature map output by a preset layer of the corresponding super-resolution image in a pre-training model; determining a countermeasure loss value of the neural network model according to the original high-resolution image and the target super-resolution image; determining a loss function value of the neural network model according to the pixel loss value, the perceptual loss value and the antagonistic loss value.
Further, the training module is further configured to determine, by using a discriminator, a probability that the original high-resolution image is true to the target super-resolution image, and a probability that the target super-resolution image is false to the original high-resolution image; determining the countermeasure loss value based on the probability of truth and the probability of falseness.
Further, the training module is further configured to determine a loss function value of the neural network model by using a preset weighting algorithm according to the pixel loss value, the perception loss value, and the confrontation loss value.
Further, the training module is further configured to adjust parameters of the neural network model by using a preset gradient descent method according to the loss function value until the adjusted loss function value of the neural network model converges.
In a fourth aspect, an embodiment of the present application further provides an image super-resolution processing apparatus, where the apparatus is applied to a neural network model obtained by the training method in any one of the first aspect, and the image super-resolution processing apparatus includes:
the acquisition module is used for acquiring an input low-resolution image;
and the processing module is used for carrying out super-resolution processing on the low-resolution image by adopting the neural network model to obtain a target super-resolution image corresponding to the low-resolution image.
In a fifth aspect, an embodiment of the present application further provides a terminal, including: a memory and a processor, wherein the memory stores a computer program executable by the processor, and the processor, when executing the computer program, implements any of the methods provided in the first aspect and the second aspect.
In a sixth aspect, an embodiment of the present application further provides a storage medium, where a computer program is stored on the storage medium, and when the computer program is read and executed, the computer program implements any one of the methods provided in the first aspect and the second aspect.
The beneficial effects of this application are: the embodiment of the invention provides a model training method, which performs down-sampling on the original high-resolution image corresponding to an input sample low-resolution image at least once to obtain high-resolution images at a plurality of resolutions, including the original high-resolution image; performs feature extraction on the sample low-resolution image with the plurality of feature extraction branches, respectively, to obtain image features of a plurality of levels; fuses the image features of the plurality of levels with the feature fusion module to obtain the fusion features of the sample low-resolution image; reconstructs the fusion features with the plurality of reconstruction branches, respectively, to obtain super-resolution images at a plurality of resolutions; and trains the neural network model according to the high-resolution images at the plurality of resolutions and the corresponding super-resolution images. Because high-resolution images at multiple resolutions are obtained, super-resolution images at multiple resolutions are generated through the multiple feature extraction branches and the multiple reconstruction branches, and the model is trained on these pairs, the trained model extracts more image features when restoring a low-resolution image, so that the generated super-resolution image contains richer semantic information and has higher definition.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
Fig. 1 is a schematic structural flow diagram of a generator of a neural network model according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a model training method according to an embodiment of the present invention;
FIG. 3 is a schematic flow chart of a model training method according to an embodiment of the present invention;
FIG. 4 is a schematic flow chart of a model training method according to an embodiment of the present invention;
FIG. 5 is a schematic flow chart of a model training method according to an embodiment of the present invention;
fig. 6 is a schematic flowchart of an image super-resolution processing method according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a model training apparatus according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of an image super-resolution processing apparatus according to an embodiment of the present invention;
fig. 9 is a schematic structural diagram of a terminal according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention.
In the model training method provided in the embodiment of the present invention, the execution subject may be a server or a terminal, for example, an individual computer such as a desktop computer, a notebook computer, and a tablet computer, which is not limited in the embodiment of the present invention.
The model training method provided by the application is exemplified by a plurality of examples by taking a terminal as an execution subject.
Fig. 1 is a schematic structural flow diagram of a generator of a neural network model according to an embodiment of the present invention. As shown in fig. 1, the neural network model may include a feature extraction module 10 and a reconstruction module 20. The feature extraction module 10 includes a plurality of feature extraction branches and a feature fusion module 11, wherein different feature extraction branches correspond to image features of different levels; the reconstruction module 20 includes a plurality of reconstruction branches, wherein different reconstruction branches correspond to different resolutions, and the input of the reconstruction branch of the next resolution is the output of the reconstruction branch of the previous resolution, the next resolution being higher than the previous resolution.
In the feature extraction module 10, the number of the plurality of feature extraction branches may be N; in the reconstruction module 20, the number of the plurality of reconstruction branches may also be N; each reconstruction branch can output one super-resolution image, and the N reconstruction branches can output N super-resolution images. The resolution ratios corresponding to the N super-resolution images are different.
In the embodiment of the present invention, the first reconstruction branch of the plurality of reconstruction branches may include only a convolutional layer 21, while each non-first reconstruction branch may include a sampling (up-sampling) layer 22 and a convolutional layer 21; the numbers of convolutional layers 21 and sampling layers 22 are not particularly limited.
In the embodiment of the present invention, since the first reconstruction branch includes only the convolutional layer 21, the resolution of the super-resolution image output by the first reconstruction branch is similar to that of the sample low-resolution image.
In addition, for the non-first reconstruction branch, the output of the sampling layer 22 in the reconstruction branch of the previous resolution may be used as the input of the sampling layer 22 in the reconstruction branch of the next resolution, so that the resolution of the super-resolution image output by the reconstruction branch of the next resolution is greater than the resolution of the super-resolution image output by the reconstruction branch of the previous resolution. The resolution ratios of the N super-resolution images output by the N reconstruction branches are sequentially increased, and the resolution ratio of the super-resolution image output by the last reconstruction branch is the highest. After the neural network model is trained, when a low-resolution image is input, the super-resolution image output by the last reconstruction branch can be used as the super-resolution image output by the neural network model.
It should be noted that both the feature extraction module 10 and the reconstruction module 20 may belong to a generator of a neural network model.
Fig. 2 is a schematic flowchart of a model training method according to an embodiment of the present invention, where the model training method may be implemented by software and/or hardware. As shown in fig. 2, the method may include:
S101, performing down-sampling on the original high-resolution image corresponding to the input sample low-resolution image at least once to obtain high-resolution images at a plurality of resolutions, including the original high-resolution image.
The sample low-resolution image and the original high-resolution image are images with different resolutions of the same image.
In addition, the sample low resolution image and the original high resolution image may each include: the pixel information of the color channel, i.e. each pixel point in the sample low-resolution image and the original high-resolution image, can be represented by RGB (red, green, blue) values.
In some embodiments, the terminal may perform down-sampling on the original high resolution image corresponding to the sample low resolution image for N-1 times, so that N-1 high resolution images with different resolutions may be obtained. The original high resolution image and the N-1 high resolution images of different resolutions form N high resolution images of different resolutions, i.e., N labels.
It is to be noted that the sample low-resolution image can be denoted by $I^{LR}$ and the original high-resolution image by $I^{HR}$. The N high-resolution images of different resolutions, i.e. the N labels, can then be written as

$\{ I^{HR}_1, I^{HR}_2, \ldots, I^{HR}_{N-1}, I^{HR} \}$,

where $I^{HR}_1, \ldots, I^{HR}_{N-1}$ are the N-1 high-resolution images of different resolutions obtained through down-sampling.
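For illustration, the following is a minimal sketch (in PyTorch, which this description later mentions as a possible framework) of building the N labels by repeated down-sampling. The bicubic interpolation, the factor-of-2 pyramid and the name build_hr_pyramid are assumptions, not taken from the patent text.

    import torch
    import torch.nn.functional as F

    def build_hr_pyramid(hr: torch.Tensor, num_branches: int) -> list:
        """hr: (B, 3, H, W) original high-resolution batch. Returns num_branches
        label images ordered from lowest resolution to the original resolution."""
        labels = [hr]
        for k in range(1, num_branches):
            # assumed factor-of-2 pyramid; each label is a down-sampled copy of hr
            labels.append(F.interpolate(hr, scale_factor=1.0 / (2 ** k),
                                        mode='bicubic', align_corners=False))
        return labels[::-1]  # lowest resolution first, original HR image last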
And S102, respectively adopting a plurality of feature extraction branches to perform feature extraction on the sample low-resolution image to obtain image features of a plurality of layers.
Wherein each feature extraction branch may include: a plurality of convolutional layers and void (i.e., dilated) convolutional layers. Each convolutional layer and void convolutional layer has a corresponding convolution kernel. Within the same feature extraction branch, the convolution kernels may all have the same size, while different feature extraction branches may use convolution kernels of different sizes.
In addition, the convolution kernels of different feature extraction branches have different sizes, so that the feature extraction branches can extract image features of different levels, for example, the feature extraction branch with a smaller convolution kernel can extract image features of a smaller range in a sample low-resolution image, such as detail features and texture features; feature extraction branches with larger convolution kernels can extract a larger range of image features, e.g., object location features, in the sample low resolution image.
In some embodiments, the number of the feature extraction branches may be N. Within a feature extraction branch, the convolutional layers and the void convolutional layers may be interspersed; as shown in fig. 1, each feature extraction branch may sequentially include: four convolutional layers 12, three void convolutional layers 13 and one convolutional layer 12.
In the embodiment of the present invention, the number of the feature extraction branches is not specifically limited, and the sizes of convolution kernels of the convolution layer and the void convolution layer in each feature extraction branch may be set according to actual requirements.
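As a concrete illustration of one branch, the sketch below follows the four-conv / three-void-conv / one-conv layout of fig. 1. The channel width, the LeakyReLU activation and the dilation rate are assumptions; only the layer layout comes from the description above.

    import torch.nn as nn

    class FeatureBranch(nn.Module):
        """One feature extraction branch: four conv layers, three void (dilated)
        conv layers, then one final conv layer, as in fig. 1."""
        def __init__(self, in_ch=3, ch=64, k=3, dilation=2):
            super().__init__()
            pad, dpad = k // 2, dilation * (k // 2)  # keep spatial size constant
            layers = []
            for _ in range(4):  # four ordinary convolutional layers 12
                layers += [nn.Conv2d(in_ch, ch, k, padding=pad), nn.LeakyReLU(0.2, True)]
                in_ch = ch
            for _ in range(3):  # three void convolutional layers 13
                layers += [nn.Conv2d(ch, ch, k, padding=dpad, dilation=dilation),
                           nn.LeakyReLU(0.2, True)]
            layers += [nn.Conv2d(ch, ch, k, padding=pad)]  # final conv layer 12
            self.body = nn.Sequential(*layers)

        def forward(self, x):
            return self.body(x)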
And S103, adopting a feature fusion module to perform fusion processing on the image features of the multiple layers to obtain fusion features of the sample low-resolution images.
In a possible implementation manner, at the end of each feature extraction branch, the features output by the feature extraction branches are concatenated along the channel dimension and sent to a residual block, and the residual block can fuse the image features of the multiple levels to obtain the fusion feature of the sample low-resolution image.
The feature fusion module 11 may be a residual block comprising a plurality of convolutional layers. As shown in fig. 1, the feature fusion module 11 may include 3 convolutional layers: the kernel size of the first convolutional layer may be 1, and the second and third convolutional layers may both use kernels of size 3. The output of the first convolutional layer is input to the second convolutional layer, the output of the second convolutional layer is input to the third convolutional layer, and the fusion feature of the sample low-resolution image is determined by summing the output of the first convolutional layer and the output of the third convolutional layer.
Of course, the terminal may also adopt other modules capable of performing feature fusion to fuse the image features of multiple layers, which is not specifically limited in the embodiment of the present invention.
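A minimal sketch of such a residual fusion block, assuming the branch outputs have been concatenated along the channel dimension before entering the block; the channel counts are illustrative.

    import torch
    import torch.nn as nn

    class FusionResidualBlock(nn.Module):
        """Residual block of the feature fusion module 11: a 1x1 conv followed by
        two 3x3 convs; the fusion feature is the sum of the first conv's output
        and the third conv's output, as described above."""
        def __init__(self, in_ch, ch=64):
            super().__init__()
            self.conv1 = nn.Conv2d(in_ch, ch, kernel_size=1)
            self.conv2 = nn.Conv2d(ch, ch, kernel_size=3, padding=1)
            self.conv3 = nn.Conv2d(ch, ch, kernel_size=3, padding=1)

        def forward(self, x):  # x: concatenated multi-level image features
            y1 = self.conv1(x)
            y3 = self.conv3(self.conv2(y1))
            return y1 + y3     # fusion feature of the sample low-resolution image

For example, with the N branch outputs collected in a list feats, the input would be torch.cat(feats, dim=1).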
And S104, respectively adopting a plurality of reconstruction branches to reconstruct the fusion features to obtain super-resolution images with a plurality of resolutions.
And the image output by the last reconstruction branch is a target super-resolution image corresponding to the sample low-resolution image. Each reconstruction branch may output a super-resolution image.
In addition, the target super-resolution image can be denoted by $I^{SR}$. When the number of reconstruction branches is N, the super-resolution images output by the branches other than the last one can be written as $I^{SR}_1, \ldots, I^{SR}_{N-1}$, so that the super-resolution images at the multiple resolutions can be represented as

$\{ I^{SR}_1, I^{SR}_2, \ldots, I^{SR}_{N-1}, I^{SR} \}$.
It should be noted that the number of high resolution images, the number of extraction branches, and the number of reconstruction branches may be the same, and the number of high resolution images and the number of super resolution images may be the same.
In the embodiment of the present invention, the number of reconstruction branches is not specifically limited, and the sizes of convolution kernels of the sampling layer and the convolution layer in each reconstruction branch may be set according to actual requirements.
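The chaining of the reconstruction branches can be sketched as follows; the x2 up-sampling factor, the nearest-neighbor mode and the channel counts are assumptions.

    import torch.nn as nn

    class ReconstructionModule(nn.Module):
        """N chained reconstruction branches: the first is a single conv layer 21;
        each later branch is a sampling layer 22 plus a conv layer 21, and the
        sampling-layer output of one branch feeds the next branch."""
        def __init__(self, ch=64, out_ch=3, num_branches=3):
            super().__init__()
            self.first_conv = nn.Conv2d(ch, out_ch, 3, padding=1)  # first branch
            self.samplers = nn.ModuleList(
                nn.Sequential(nn.Upsample(scale_factor=2, mode='nearest'),
                              nn.Conv2d(ch, ch, 3, padding=1))
                for _ in range(num_branches - 1))
            self.out_convs = nn.ModuleList(
                nn.Conv2d(ch, out_ch, 3, padding=1) for _ in range(num_branches - 1))

        def forward(self, fused):
            outputs = [self.first_conv(fused)]   # lowest-resolution SR image
            feat = fused
            for sampler, conv in zip(self.samplers, self.out_convs):
                feat = sampler(feat)             # chained sampling-layer output
                outputs.append(conv(feat))       # SR image at this resolution
            return outputs                       # last element: target SR image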
And S105, training the neural network model according to the high-resolution images with the multiple resolutions and the corresponding super-resolution images.
In a possible implementation manner, the terminal may calculate optimization objectives from the high-resolution images at the multiple resolutions and the corresponding super-resolution images, and train the neural network model with these optimization objectives.
In summary, an embodiment of the present invention provides a model training method, which performs down-sampling on the original high-resolution image corresponding to an input sample low-resolution image at least once to obtain high-resolution images at a plurality of resolutions, including the original high-resolution image; performs feature extraction on the sample low-resolution image with the plurality of feature extraction branches, respectively, to obtain image features of a plurality of levels; fuses the image features of the plurality of levels with the feature fusion module to obtain the fusion features of the sample low-resolution image; reconstructs the fusion features with the plurality of reconstruction branches, respectively, to obtain super-resolution images at a plurality of resolutions; and trains the neural network model according to the high-resolution images at the plurality of resolutions and the corresponding super-resolution images. Because high-resolution images at multiple resolutions are obtained, super-resolution images at multiple resolutions are generated through the multiple feature extraction branches and the multiple reconstruction branches, and the model is trained on these pairs, the trained model extracts more image features when restoring a low-resolution image, so that the generated super-resolution image contains richer semantic information and has higher definition.
Optionally, fig. 3 is a schematic flow chart of a model training method according to an embodiment of the present invention, as shown in fig. 3, in step S105, the method may further include:
S201, determining a loss function value of the initial neural network model according to the high-resolution images at the multiple resolutions and the corresponding super-resolution images.
Wherein the high resolution images of the plurality of resolutions may include an original high resolution image and the corresponding super resolution image may include a target super resolution image.
In one possible embodiment, the terminal may determine a first loss value from the original high resolution image and the target super resolution image, determine a plurality of second loss values from the high resolution images of the plurality of resolutions and the corresponding super resolution images, and then determine the loss function value from the first loss value and the second loss value.
For example, when the high-resolution images at multiple resolutions are $\{ I^{HR}_1, \ldots, I^{HR}_{N-1}, I^{HR} \}$ and the super-resolution images at multiple resolutions are $\{ I^{SR}_1, \ldots, I^{SR}_{N-1}, I^{SR} \}$, where the original high-resolution image is $I^{HR}$ and the target super-resolution image is $I^{SR}$, the terminal can determine the first loss value from $I^{HR}$ and $I^{SR}$, and determine the second loss values from each pair $I^{HR}_k$ and $I^{SR}_k$ ($k = 1, \ldots, N-1$).
S202, adjusting parameters of the neural network model according to the loss function value until the adjusted loss function value of the neural network model is converged.
The neural network model may include a generator and a discriminator, wherein the generator comprises the feature extraction module and the reconstruction module.
In the embodiment of the invention, the terminal can adjust the parameters of the generator and the discriminator according to the loss function value until the adjusted loss function value of the neural network model is converged, so as to obtain the trained neural network model. The low-resolution image is input into the neural network model, and the neural network model can output a high-resolution image which comprises more detail information and is higher in definition.
Optionally, fig. 4 is a schematic flowchart of a model training method provided in an embodiment of the present invention, as shown in fig. 4, where S202 may further include:
S301, determining a pixel loss value of the neural network model according to the high-resolution images at the multiple resolutions and the corresponding super-resolution images.
In some embodiments, the terminal may calculate a similarity between the high-resolution image of each resolution and the corresponding super-resolution image by using a preset similarity calculation formula to obtain a plurality of similarities, and then may determine a pixel loss value of the neural network model according to the plurality of similarities.
The terminal can sum the plurality of similarities to obtain the pixel loss value of the neural network model.
It should be noted that the similarity calculation formula may be a pixel-wise distance such as

$\ell_{pix}\big(I^{HR}_1, I^{SR}_1\big) = \big\| I^{SR}_1 - I^{HR}_1 \big\|_1$,

where $I^{HR}_1$ is one of the high-resolution images at the multiple resolutions and $I^{SR}_1$ is the super-resolution image corresponding to $I^{HR}_1$; this determines the similarity between one high-resolution image and its corresponding super-resolution image. Similarly, the similarity between each high-resolution image and its corresponding super-resolution image is calculated to obtain a plurality of similarities, and these are summed to obtain the pixel loss value of the neural network model, which can be denoted by $\mathcal{L}_{pix}$.

In the embodiment of the invention, the smaller $\mathcal{L}_{pix}$ is, the more similar the high-resolution images and the corresponding super-resolution images are.
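A one-function sketch of this summed pixel loss, assuming the $L_1$ distance shown above:

    import torch

    def pixel_loss(sr_list, hr_list):
        """Sum of per-resolution L1 distances between each super-resolution image
        and its high-resolution label (both ordered lowest to highest resolution)."""
        return sum(torch.mean(torch.abs(sr - hr)) for sr, hr in zip(sr_list, hr_list))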
S302, determining a perception loss value of the neural network model according to the high-resolution images with multiple resolutions and a feature map output by a preset layer of the corresponding super-resolution images in the pre-training model.
The feature graph output by the preset layer in the pre-training model may include: feature maps of a plurality of high-resolution images and feature maps of corresponding super-resolution images. The pre-training model may be VGG (Visual Geometry Group Network) -19.
In a possible implementation manner, the terminal may calculate a perceptual loss value between the feature map of each high-resolution image and the feature map of the corresponding super-resolution image by using a preset perceptual loss formula to obtain a plurality of perceptual loss values, and then may determine the perceptual loss value of the neural network model according to the plurality of perceptual loss values.
The terminal can sum the multiple perceptual loss values, thereby obtaining the perceptual loss value of the neural network model.
In the embodiment of the present invention, the feature map output by the preset layer may be the feature map output in the pre-training model after the i-th convolutional layer and the j-th activation layer, and the perceptual loss between one pair of images may be

$\ell_{per}\big(I^{HR}_1, I^{SR}_1\big) = \big\| \phi_{i,j}\big(I^{SR}_1\big) - \phi_{i,j}\big(I^{HR}_1\big) \big\|$,

where $\phi_{i,j}(I^{SR}_1)$ denotes the feature map of the super-resolution image output after the i-th convolutional layer and the j-th activation layer, and $\phi_{i,j}(I^{HR}_1)$ denotes the feature map of the high-resolution image output after the i-th convolutional layer and the j-th activation layer.

Similarly, a perceptual loss value is calculated between the feature map of each high-resolution image and the feature map of the corresponding super-resolution image to obtain a plurality of perceptual loss values, and these are summed to obtain the perceptual loss value of the neural network model, which can be denoted by $\mathcal{L}_{per}$.

In the embodiment of the invention, the smaller $\mathcal{L}_{per}$ is, the more similar the high-resolution images and the corresponding super-resolution images are.
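A sketch of this summed perceptual loss with a fixed VGG-19 feature extractor; the layer cut-off (index 35, i.e. conv5_4), the squared-error distance and the omission of input normalization are assumptions made for brevity.

    import torch
    import torchvision.models as models

    class PerceptualLoss(torch.nn.Module):
        """Compares phi_{i,j} feature maps of each SR image and its HR label,
        then sums the per-pair losses."""
        def __init__(self, layer_index=35):
            super().__init__()
            vgg = models.vgg19(pretrained=True).features[:layer_index]
            for p in vgg.parameters():
                p.requires_grad = False  # the pre-training model stays fixed
            self.phi = vgg.eval()

        def forward(self, sr_list, hr_list):
            return sum(torch.mean((self.phi(sr) - self.phi(hr)) ** 2)
                       for sr, hr in zip(sr_list, hr_list))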
And S303, determining the confrontation loss value of the neural network model according to the original high-resolution image and the target super-resolution image.
In some embodiments, the terminal may input the original high resolution image and the target super resolution image into the discriminator, the discriminator may output probability information, and the terminal may determine the countermeasure loss value of the neural network model according to the probability information by using a preset countermeasure loss value calculation formula.
Note that the countermeasure loss value is the first loss value in S201.
S304, determining a loss function value of the neural network model according to the pixel loss value, the perception loss value and the confrontation loss value.
In the embodiment of the present invention, the terminal may determine the loss function value of the neural network model according to the pixel loss value, the perceptual loss value, and the antagonistic loss value by using a preset loss function value calculation formula. The loss function value of the neural network model may be used to indicate whether model training is complete, and parameters of the neural network model may also be optimized based on the loss function value.
Optionally, fig. 5 is a schematic flow chart of a model training method according to an embodiment of the present invention, as shown in fig. 5, in step S303, the method may further include:
S401, determining, by using a discriminator, the probability that the original high-resolution image is more real than the target super-resolution image and the probability that the target super-resolution image is more fake than the original high-resolution image.
The network structure of the discriminator may be VGG-13.
It should be noted that determining the probability that the original high-resolution image is more real than the target super-resolution image and the probability that the target super-resolution image is more fake than the original high-resolution image can increase the speed and stability of the model training process.
And S402, determining a confrontation loss value according to the true probability and the false probability.
Wherein, the confrontation loss value can measure the generating capability of the generator and the judging capability of the discriminator.
In addition, the terminal can adopt a preset countermeasure loss value calculation formula to determine the countermeasure loss value according to the real probability and the false probability.
In some embodiments, the predetermined countermeasure loss value calculation formula can be expressed as

$\mathcal{L}_{adv} = -\log D\big(I^{HR}, I^{SR}\big) - \log\Big(1 - D\big(I^{SR}, I^{HR}\big)\Big)$,

where $\mathcal{L}_{adv}$ is the countermeasure loss value, $D(I^{HR}, I^{SR})$ can represent the probability that the original high-resolution image is more real than the target super-resolution image, and $1 - D(I^{SR}, I^{HR})$ can represent the probability that the target super-resolution image is more fake than the original high-resolution image.

It should be noted that when $\mathcal{L}_{adv}$ converges, the discriminator can hardly distinguish the target super-resolution image generated by the generator from the original high-resolution image, and the generator and the discriminator reach an equilibrium state.
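A sketch of this countermeasure loss in PyTorch; treating d_hr and d_sr as raw discriminator logits and forming the relative probabilities with a sigmoid over mean-centred logits are assumptions consistent with the description above.

    import torch

    def adversarial_loss(d_hr: torch.Tensor, d_sr: torch.Tensor) -> torch.Tensor:
        """d_hr, d_sr: discriminator logits for the original high-resolution image
        and the target super-resolution image, respectively."""
        real_truer = torch.sigmoid(d_hr - d_sr.mean())         # P(HR more real than SR)
        fake_falser = 1.0 - torch.sigmoid(d_sr - d_hr.mean())  # P(SR more fake than HR)
        eps = 1e-8                                             # numerical stability
        return -(torch.log(real_truer + eps).mean()
                 + torch.log(fake_falser + eps).mean())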
Optionally, the process of S304 may include: and determining a loss function value of the neural network model by adopting a preset weighting algorithm according to the pixel loss value, the perception loss value and the confrontation loss value.
The terminal can calculate the weighted sum of the pixel loss value, the perception loss value and the confrontation loss value through a preset weighting algorithm, and accordingly the loss function value of the neural network model is determined.
In some embodiments, the preset loss function value calculation formula may be:

$\mathcal{L} = \mathcal{L}_{pix} + \lambda\,\mathcal{L}_{per} + \eta\,\mathcal{L}_{adv}$,

where $\mathcal{L}$ is the loss function value of the neural network model, $\mathcal{L}_{pix}$ is the pixel loss value of the neural network model, $\mathcal{L}_{per}$ is the perceptual loss value of the neural network model, $\mathcal{L}_{adv}$ is the countermeasure loss value of the neural network model, and $\lambda$ and $\eta$ are weight parameters. The larger a weight parameter is, the larger the gradients of the parameters related to the corresponding loss term are during training, and the more that loss term influences the super-resolution images generated by the neural network model.
Optionally, the step S202 may include: and adjusting parameters of the neural network model by adopting a preset gradient descent method according to the loss function value until the adjusted loss function value of the neural network model is converged.
In the embodiment of the present invention, the terminal may use the chain rule of differentiation to calculate, from the loss function value, the gradient of the loss with respect to each parameter of the generator and the discriminator, so as to optimize the parameters of the generator and the discriminator and thereby reduce the loss.
For model training, the model can be trained by using PyTorch (a deep learning framework) and selecting a stochastic gradient descent method, so that a neural network model with good performance can be obtained.
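Putting the pieces together, one training iteration might look like the sketch below. Here generator, discriminator, loader and the weights lam and eta are placeholders, perceptual is an instance of the PerceptualLoss sketch above, and the other helpers are the earlier sketches; none of these names come from the patent.

    import torch

    g_opt = torch.optim.SGD(generator.parameters(), lr=1e-4)  # stochastic gradient descent
    d_opt = torch.optim.SGD(discriminator.parameters(), lr=1e-4)

    for lr_img, hr_img in loader:            # sample LR image and original HR image
        hr_labels = build_hr_pyramid(hr_img, num_branches=3)  # the N labels
        sr_list = generator(lr_img)          # SR images; last one is the target SR

        # generator update: weighted pixel + perceptual + countermeasure losses
        d_hr, d_sr = discriminator(hr_img), discriminator(sr_list[-1])
        loss_g = (pixel_loss(sr_list, hr_labels)
                  + lam * perceptual(sr_list, hr_labels)
                  + eta * adversarial_loss(d_sr, d_hr))  # arguments swapped for G
        g_opt.zero_grad(); loss_g.backward(); g_opt.step()

        # discriminator update on the original HR and the detached target SR image
        d_hr, d_sr = discriminator(hr_img), discriminator(sr_list[-1].detach())
        loss_d = adversarial_loss(d_hr, d_sr)
        d_opt.zero_grad(); loss_d.backward(); d_opt.step()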
In summary, in the embodiment of the present invention, the neural network model integrates a hierarchical feature extraction module and a hierarchically guided reconstruction module, extracts and analyzes multi-scale features, attends to both local texture and global semantics, and generates a reasonable and natural super-resolution image step by step under multiple supervision signals, so that the definition of the generated super-resolution image is significantly improved.
Fig. 6 is a schematic flowchart of an image super-resolution processing method according to an embodiment of the present invention, and as shown in fig. 6, the method may include:
S501, acquiring an input low-resolution image.
In an implementation of the invention, the low resolution image may comprise: pixel information of a color channel.
S502, carrying out super-resolution processing on the low-resolution image by adopting a neural network model to obtain a target super-resolution image corresponding to the low-resolution image.
The neural network model may be the neural network model shown in any one of fig. 1 to 5.
It should be noted that the super-resolution processing is performed on the low-resolution image through the neural network model, so that the obtained target super-resolution image can contain more feature information, and the target super-resolution image is clearer, more reasonable and more natural.
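A minimal inference sketch for S501-S502, assuming generator is the trained model and lr_image is a (3, h, w) RGB tensor:

    import torch

    generator.eval()                                 # trained neural network model
    with torch.no_grad():
        sr_list = generator(lr_image.unsqueeze(0))   # add a batch dimension
        target_sr = sr_list[-1].squeeze(0)           # the last branch's output is
                                                     # the target super-resolution image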
Fig. 7 is a schematic structural diagram of a model training apparatus according to an embodiment of the present invention. As shown in fig. 7, the model training apparatus is applied to a neural network model, and the neural network model includes: a feature extraction module and a reconstruction module, wherein the feature extraction module comprises: a plurality of feature extraction branches and a feature fusion module, wherein different feature extraction branches correspond to image features of different levels; the reconstruction module includes: a plurality of reconstruction branches, wherein different reconstruction branches correspond to different resolutions, and the input of the reconstruction branch of the next resolution is the output of the reconstruction branch of the previous resolution, the next resolution being higher than the previous resolution; the apparatus may include:
a down-sampling module 701, configured to down-sample the original high-resolution image corresponding to an input sample low-resolution image at least once to obtain high-resolution images at a plurality of resolutions, including the original high-resolution image;
an extracting module 702, configured to perform feature extraction on the sample low-resolution image by using a plurality of feature extraction branches, respectively, to obtain image features of multiple levels;
the fusion module 703 is configured to perform fusion processing on the image features of multiple layers by using the feature fusion module to obtain fusion features of the sample low-resolution image;
a reconstruction processing module 704, configured to respectively adopt multiple reconstruction branches to reconstruct the fusion features, so as to obtain super-resolution images with multiple resolutions; the image output by the last reconstruction branch is a target super-resolution image corresponding to the sample low-resolution image;
the training module 705 is configured to train the neural network model according to the high resolution images with multiple resolutions and the corresponding super-resolution images.
Optionally, the training module 705 is further configured to determine a loss function value of the initial neural network model according to the high-resolution images with multiple resolutions and the corresponding super-resolution images; and adjusting parameters of the neural network model according to the loss function value until the adjusted loss function value of the neural network model converges.
Optionally, the training module 705 is further configured to determine a pixel loss value of the neural network model according to the high-resolution images at the multiple resolutions and the corresponding super-resolution images; determine a perceptual loss value of the neural network model according to the feature maps output by a preset layer of a pre-training model for the high-resolution images at the multiple resolutions and the corresponding super-resolution images; determine a countermeasure loss value of the neural network model according to the original high-resolution image and the target super-resolution image; and determine a loss function value of the neural network model according to the pixel loss value, the perceptual loss value and the countermeasure loss value.
Optionally, the training module 705 is further configured to determine, by using a discriminator, a probability that the original high-resolution image is true to the target super-resolution image, and a probability that the target super-resolution image is false to the original high-resolution image; and determining the confrontation loss value according to the true probability and the false probability.
Optionally, the training module 705 is further configured to determine a loss function value of the neural network model by using a preset weighting algorithm according to the pixel loss value, the perceptual loss value, and the countermeasure loss value.
Optionally, the training module 705 is further configured to adjust parameters of the neural network model by using a preset gradient descent method according to the loss function value until the adjusted loss function value of the neural network model converges.
FIG. 8 is a schematic structural diagram of an image super-resolution processing apparatus according to an embodiment of the present invention; the device is applied to a neural network model obtained by the training method in any one of fig. 2 to 5, and the image super-resolution processing device comprises:
an obtaining module 801, configured to obtain an input low-resolution image;
the processing module 802 is configured to perform super-resolution processing on the low-resolution image by using a neural network model to obtain a target super-resolution image corresponding to the low-resolution image.
The above-mentioned apparatus is used for executing the method provided by the foregoing embodiment, and the implementation principle and technical effect are similar, which are not described herein again.
The above modules may be one or more integrated circuits configured to implement the above methods, for example: one or more Application Specific Integrated Circuits (ASICs), one or more digital signal processors (DSPs), or one or more Field Programmable Gate Arrays (FPGAs), among others. As another example, when one of the above modules is implemented in the form of program code scheduled by a processing element, the processing element may be a general-purpose processor, such as a Central Processing Unit (CPU) or another processor capable of calling program code. As another example, these modules may be integrated together and implemented in the form of a system-on-a-chip (SoC).
Fig. 9 is a schematic structural diagram of a terminal according to an embodiment of the present application, where the terminal may include: memory 901, processor 902. The memory 901 is used for storing programs, and the processor 902 calls the programs stored in the memory 901 to execute the method embodiments of fig. 2 to 6. The specific implementation and technical effects are similar, and are not described herein again.
Optionally, the invention also provides a program product, for example a computer-readable storage medium, comprising a program which, when executed by a processor, carries out the above-mentioned method embodiments.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
The integrated unit implemented in the form of a software functional unit may be stored in a computer readable storage medium. The software functional unit is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer, a server, or a network device) or a processor (processor) to execute some steps of the methods according to the embodiments of the present invention. And the aforementioned storage medium includes: a U disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (11)

1. A model training method is applied to a neural network model, and the neural network model comprises the following steps: the device comprises a feature extraction module and a reconstruction module, wherein the feature extraction module comprises: the system comprises a plurality of feature extraction branches and a feature fusion module, wherein different feature extraction branches correspond to image features of different levels; the reconstruction module includes: the system comprises a plurality of reconstruction branches, wherein different reconstruction branches correspond to different resolutions, and the input of a reconstruction branch of a next resolution is the output of a reconstruction branch of a previous resolution, wherein the next resolution is greater than the previous resolution; the method comprises the following steps:
performing down-sampling on an original high-resolution image corresponding to an input sample low-resolution image at least once to obtain high-resolution images at a plurality of resolutions, including the original high-resolution image;
respectively adopting the plurality of feature extraction branches to perform feature extraction on the sample low-resolution image to obtain a plurality of levels of image features;
performing fusion processing on the image features of the multiple layers by adopting the feature fusion module to obtain fusion features of the sample low-resolution image;
respectively adopting the plurality of reconstruction branches to reconstruct the fusion features to obtain super-resolution images with a plurality of resolutions; the image output by the last reconstruction branch is a target super-resolution image corresponding to the sample low-resolution image;
and training the neural network model according to the high-resolution images with the plurality of resolutions and the corresponding super-resolution images.
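The following PyTorch sketch is one possible reading of claim 1, for illustration only. The branch count, channel widths, bicubic down-sampling, two 2x pixel-shuffle reconstruction steps, and all names (MultiBranchSRNet, conv_branch, hr_pyramid) are assumptions not fixed by the claim, which specifies only the branch/fusion/cascade structure.

import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_branch(depth, ch=64):
    # A stack of conv+ReLU layers; deeper stacks have larger receptive
    # fields, so each branch extracts image features of a different level.
    layers = []
    for i in range(depth):
        layers += [nn.Conv2d(3 if i == 0 else ch, ch, 3, padding=1),
                   nn.ReLU(inplace=True)]
    return nn.Sequential(*layers)

class MultiBranchSRNet(nn.Module):
    def __init__(self, ch=64, num_branches=3, num_scales=2):
        super().__init__()
        # Feature-extraction branches of different depths, one per level.
        self.branches = nn.ModuleList(
            conv_branch(d + 1, ch) for d in range(num_branches))
        # Feature fusion: concatenate branch outputs, mix with a 1x1 conv.
        self.fuse = nn.Conv2d(num_branches * ch, ch, 1)
        # Cascaded reconstruction branches: the input of each resolution is
        # the output of the previous, lower resolution.
        self.recon = nn.ModuleList(
            nn.Sequential(nn.Conv2d(ch, 4 * ch, 3, padding=1),
                          nn.PixelShuffle(2),
                          nn.Conv2d(ch, ch, 3, padding=1),
                          nn.ReLU(inplace=True))
            for _ in range(num_scales))
        self.heads = nn.ModuleList(
            nn.Conv2d(ch, 3, 3, padding=1) for _ in range(num_scales))

    def forward(self, lr_image):
        feats = [b(lr_image) for b in self.branches]  # multi-level features
        x = self.fuse(torch.cat(feats, dim=1))        # fused feature
        sr_images = []
        for up, head in zip(self.recon, self.heads):
            x = up(x)                                 # next resolution
            sr_images.append(head(x))                 # SR image at this scale
        return sr_images                              # last = target SR image

def hr_pyramid(hr_image, num_scales=2):
    # Down-sample the original HR image at least once to obtain HR targets
    # at several resolutions, the original included (bicubic is assumed).
    pyramid = [hr_image]
    for _ in range(num_scales - 1):
        pyramid.insert(0, F.interpolate(pyramid[0], scale_factor=0.5,
                                        mode='bicubic', align_corners=False))
    return pyramid  # lowest resolution first, original HR last

Each entry of sr_images would then be trained against the hr_pyramid entry at the same resolution, which is what the final training step of claim 1 requires.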
2. The method of claim 1, wherein training the neural network model according to the high-resolution images at the plurality of resolutions and the corresponding super-resolution images comprises:
determining a loss function value of the neural network model according to the high-resolution images at the plurality of resolutions and the corresponding super-resolution images; and
adjusting parameters of the neural network model according to the loss function value until the loss function value of the adjusted neural network model converges.
3. The method of claim 2, wherein determining the loss function value of the neural network model according to the high-resolution images at the plurality of resolutions and the corresponding super-resolution images comprises:
determining a pixel loss value of the neural network model according to the high-resolution images at the plurality of resolutions and the corresponding super-resolution images;
determining a perceptual loss value of the neural network model according to the feature maps output at a preset layer of a pre-trained model for the high-resolution images at the plurality of resolutions and the corresponding super-resolution images;
determining an adversarial loss value of the neural network model according to the original high-resolution image and the target super-resolution image; and
determining the loss function value of the neural network model according to the pixel loss value, the perceptual loss value, and the adversarial loss value.
4. The method of claim 3, wherein determining the adversarial loss value of the neural network model according to the original high-resolution image and the target super-resolution image comprises:
determining, with a discriminator, the probability that the original high-resolution image is true relative to the target super-resolution image and the probability that the target super-resolution image is false relative to the original high-resolution image; and
determining the adversarial loss value according to the probability of being true and the probability of being false.
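As a concrete stand-in for the discriminator of claim 4, the sketch below uses a small strided-convolution classifier; the architecture is an assumption, as is reading sigmoid(logit) as the true/false probability.

import torch.nn as nn

def make_discriminator(ch=64):
    # Strided convolutions shrink the image; the final linear layer emits
    # one raw logit, and sigmoid(logit) is read as the probability that
    # the input is a genuine high-resolution image.
    return nn.Sequential(
        nn.Conv2d(3, ch, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
        nn.Conv2d(ch, 2 * ch, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
        nn.Conv2d(2 * ch, 4 * ch, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(4 * ch, 1))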
5. The method of claim 3, wherein determining the loss function value of the neural network model according to the pixel loss value, the perceptual loss value, and the adversarial loss value comprises:
determining the loss function value of the neural network model from the pixel loss value, the perceptual loss value, and the adversarial loss value with a preset weighting algorithm.
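A hedged sketch of the three-term loss of claims 3 to 5, reusing the network and discriminator sketched above. L1 for the pixel term, VGG19 features as the pre-trained model's preset layer, the plain binary-cross-entropy GAN formulation, and the weights are all assumptions; the claims fix only the weighted three-term structure.

import torch
import torch.nn.functional as F
from torchvision.models import vgg19, VGG19_Weights

# Fixed pre-trained model whose preset-layer feature maps define the
# perceptual loss; VGG19 truncated at its last ReLU is a common choice.
vgg_features = vgg19(weights=VGG19_Weights.DEFAULT).features[:36].eval()
for p in vgg_features.parameters():
    p.requires_grad_(False)

def pixel_loss(sr_images, hr_images):
    # L1 distance, summed over every resolution in the pyramid.
    return sum(F.l1_loss(sr, hr) for sr, hr in zip(sr_images, hr_images))

def perceptual_loss(sr_images, hr_images):
    # Distance between feature maps of the pre-trained model (claim 3).
    return sum(F.l1_loss(vgg_features(sr), vgg_features(hr))
               for sr, hr in zip(sr_images, hr_images))

def adversarial_loss(discriminator, hr_image, sr_image):
    # Claim 4: the original HR image is scored as true and the target SR
    # image as false; both probabilities enter the loss value.
    real_logit = discriminator(hr_image)
    fake_logit = discriminator(sr_image)
    return (F.binary_cross_entropy_with_logits(
                real_logit, torch.ones_like(real_logit))
            + F.binary_cross_entropy_with_logits(
                fake_logit, torch.zeros_like(fake_logit)))

def total_loss(sr_images, hr_images, discriminator,
               w_pix=1.0, w_per=6e-3, w_adv=1e-3):
    # Claim 5: preset weighted combination of the three loss values.
    return (w_pix * pixel_loss(sr_images, hr_images)
            + w_per * perceptual_loss(sr_images, hr_images)
            + w_adv * adversarial_loss(discriminator,
                                       hr_images[-1], sr_images[-1]))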
6. The method of claim 2, wherein adjusting the parameters of the neural network model according to the loss function value until the loss function value of the adjusted neural network model converges comprises:
adjusting the parameters of the neural network model with a preset gradient descent method according to the loss function value until the loss function value of the adjusted neural network model converges.
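The training loop of claims 2 and 6 might then look as follows. Adam stands in for the "preset gradient descent method"; the epoch-level convergence test, the data loader yielding (LR, HR) tensor pairs, and every hyper-parameter are assumptions, and the discriminator's own update step is omitted for brevity.

import torch

def train(model, discriminator, loader, num_epochs=100, step=1e-4, tol=1e-4):
    optimizer = torch.optim.Adam(model.parameters(), lr=step)
    previous = float('inf')
    for _ in range(num_epochs):
        running = 0.0
        for lr_image, hr_image in loader:
            hr_images = hr_pyramid(hr_image)   # multi-resolution HR targets
            sr_images = model(lr_image)        # multi-resolution SR outputs
            loss = total_loss(sr_images, hr_images, discriminator)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()                   # adjust the model parameters
            running += loss.item()
        running /= len(loader)
        if abs(previous - running) < tol:      # loss function value converged
            break
        previous = running
    return model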
7. An image super-resolution processing method, applied to a neural network model trained by the model training method of any one of claims 1 to 6, the image super-resolution processing method comprising:
acquiring an input low-resolution image; and
performing super-resolution processing on the low-resolution image with the neural network model to obtain a target super-resolution image corresponding to the low-resolution image.
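At inference time (claim 7), only the trained generator is needed; a minimal sketch, reusing the model class assumed above:

import torch

@torch.no_grad()
def super_resolve(model, lr_image):
    # The output of the last reconstruction branch is the target
    # super-resolution image for the input low-resolution image.
    model.eval()
    sr_images = model(lr_image.unsqueeze(0))  # add a batch dimension
    return sr_images[-1].squeeze(0)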
8. A model training apparatus, applied to a neural network model, wherein the neural network model comprises a feature extraction module and a reconstruction module; the feature extraction module comprises a plurality of feature extraction branches and a feature fusion module, different feature extraction branches corresponding to image features of different levels; the reconstruction module comprises a plurality of reconstruction branches, different reconstruction branches corresponding to different resolutions, wherein the input of the reconstruction branch of a next resolution is the output of the reconstruction branch of a previous resolution, the next resolution being greater than the previous resolution; the apparatus comprising:
a down-sampling module, configured to down-sample an original high-resolution image corresponding to an input sample low-resolution image at least once to obtain high-resolution images at a plurality of resolutions, including the original high resolution;
an extraction module, configured to perform feature extraction on the sample low-resolution image with the plurality of feature extraction branches, respectively, to obtain image features of a plurality of levels;
a fusion module, configured to fuse the image features of the plurality of levels with the feature fusion module to obtain a fusion feature of the sample low-resolution image;
a reconstruction processing module, configured to reconstruct the fusion feature with the plurality of reconstruction branches, respectively, to obtain super-resolution images at a plurality of resolutions, wherein the image output by the last reconstruction branch is the target super-resolution image corresponding to the sample low-resolution image; and
a training module, configured to train the neural network model according to the high-resolution images at the plurality of resolutions and the corresponding super-resolution images.
9. An image super-resolution processing apparatus, applied to a neural network model trained by the model training method of any one of claims 1 to 6, the image super-resolution processing apparatus comprising:
an acquisition module, configured to acquire an input low-resolution image; and
a processing module, configured to perform super-resolution processing on the low-resolution image with the neural network model to obtain a target super-resolution image corresponding to the low-resolution image.
10. A terminal, comprising a processor and a memory storing a computer program executable by the processor, wherein the processor, when executing the computer program, implements the method of any one of claims 1 to 7.
11. A storage medium having stored thereon a computer program which, when read and executed, implements the method of any one of claims 1 to 7.
CN202010141266.3A 2020-03-03 2020-03-03 Model training and image super-resolution processing method, device, terminal and storage medium Active CN111369440B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010141266.3A CN111369440B (en) 2020-03-03 2020-03-03 Model training and image super-resolution processing method, device, terminal and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010141266.3A CN111369440B (en) 2020-03-03 2020-03-03 Model training and image super-resolution processing method, device, terminal and storage medium

Publications (2)

Publication Number Publication Date
CN111369440A true CN111369440A (en) 2020-07-03
CN111369440B CN111369440B (en) 2024-01-30

Family

ID=71211172

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010141266.3A Active CN111369440B (en) 2020-03-03 2020-03-03 Model training and image super-resolution processing method, device, terminal and storage medium

Country Status (1)

Country Link
CN (1) CN111369440B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108428212A (en) * 2018-01-30 2018-08-21 中山大学 Image magnification method based on dual Laplacian pyramid convolutional neural networks
US20190370608A1 (en) * 2018-05-31 2019-12-05 Seoul National University R&Db Foundation Apparatus and method for training facial locality super resolution deep neural network
CN110660020A (en) * 2019-08-15 2020-01-07 天津中科智能识别产业技术研究院有限公司 Image super-resolution method using a generative adversarial network with fused mutual information
CN110647934A (en) * 2019-09-20 2020-01-03 北京百度网讯科技有限公司 Training method and device for video super-resolution reconstruction model and electronic equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
NI Shenlong; ZENG Jiexian; ZHOU Shijian: "Research on Super-Resolution Reconstruction of Whole License Plate Images" *

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111861888A (en) * 2020-07-27 2020-10-30 上海商汤智能科技有限公司 Image processing method, image processing device, electronic equipment and storage medium
CN111933086A (en) * 2020-08-19 2020-11-13 惠科股份有限公司 Display device and resolution reduction method thereof
CN112037129A (en) * 2020-08-26 2020-12-04 广州视源电子科技股份有限公司 Image super-resolution reconstruction method, device, equipment and storage medium
CN112037129B (en) * 2020-08-26 2024-04-19 广州视源电子科技股份有限公司 Image super-resolution reconstruction method, device, equipment and storage medium
CN112084908A (en) * 2020-08-28 2020-12-15 广州汽车集团股份有限公司 Image processing method and system and storage medium
CN111968064A (en) * 2020-10-22 2020-11-20 成都睿沿科技有限公司 Image processing method and device, electronic equipment and storage medium
CN112734642B (en) * 2021-01-12 2023-03-10 武汉工程大学 Remote sensing satellite super-resolution method and device of multi-scale texture transfer residual error network
CN112734642A (en) * 2021-01-12 2021-04-30 武汉工程大学 Remote sensing satellite super-resolution method and device of multi-scale texture transfer residual error network
CN112862681B (en) * 2021-01-29 2023-04-14 中国科学院深圳先进技术研究院 Super-resolution method, device, terminal equipment and storage medium
CN112784857B (en) * 2021-01-29 2022-11-04 北京三快在线科技有限公司 Model training and image processing method and device
CN112862681A (en) * 2021-01-29 2021-05-28 中国科学院深圳先进技术研究院 Super-resolution method, device, terminal equipment and storage medium
CN112784857A (en) * 2021-01-29 2021-05-11 北京三快在线科技有限公司 Model training and image processing method and device
CN113191495A (en) * 2021-03-26 2021-07-30 网易(杭州)网络有限公司 Training method and device for hyper-resolution model and face recognition method and device, medium and electronic equipment
CN113269676A (en) * 2021-05-19 2021-08-17 北京航空航天大学 Panoramic image processing method and device
CN113269676B (en) * 2021-05-19 2023-01-10 北京航空航天大学 Panoramic image processing method and device
CN113298092A (en) * 2021-05-28 2021-08-24 有米科技股份有限公司 Neural network training method and device for extracting multi-level image contour information
CN113920013A (en) * 2021-10-14 2022-01-11 中国科学院深圳先进技术研究院 Small image multi-target detection method based on super-resolution
CN116385267A (en) * 2023-03-29 2023-07-04 腾讯科技(深圳)有限公司 Image processing method, apparatus, program product, computer device, and storage medium
CN116071478A (en) * 2023-04-06 2023-05-05 腾讯科技(深圳)有限公司 Training method of image reconstruction model and virtual scene rendering method
CN116071478B (en) * 2023-04-06 2023-06-30 腾讯科技(深圳)有限公司 Training method of image reconstruction model and virtual scene rendering method

Also Published As

Publication number Publication date
CN111369440B (en) 2024-01-30

Similar Documents

Publication Publication Date Title
CN111369440A (en) Model training method, image super-resolution processing method, device, terminal and storage medium
US10943145B2 (en) Image processing methods and apparatus, and electronic devices
CN108876792B (en) Semantic segmentation method, device and system and storage medium
CN110992270A (en) Multi-scale residual attention network image super-resolution reconstruction method based on attention
CN111476719B (en) Image processing method, device, computer equipment and storage medium
CN113240580A (en) Lightweight image super-resolution reconstruction method based on multi-dimensional knowledge distillation
CN110782395B (en) Image processing method and device, electronic equipment and computer readable storage medium
CN111626932B (en) Super-resolution reconstruction method and device for image
CN113159143B (en) Infrared and visible light image fusion method and device based on jump connection convolution layer
CN113674191B (en) Weak light image enhancement method and device based on conditional countermeasure network
CN112419153A (en) Image super-resolution reconstruction method and device, computer equipment and storage medium
CN114333074B (en) Human body posture estimation method based on dynamic lightweight high-resolution network
CN113468996A (en) Camouflage object detection method based on edge refinement
Chen et al. MICU: Image super-resolution via multi-level information compensation and U-net
CN112488923A (en) Image super-resolution reconstruction method and device, storage medium and electronic equipment
CN112330684A (en) Object segmentation method and device, computer equipment and storage medium
CN115170915A (en) Infrared and visible light image fusion method based on end-to-end attention network
CN116152591A (en) Model training method, infrared small target detection method and device and electronic equipment
CN115439325A (en) Low-resolution hyperspectral image processing method and device and computer program product
CN115601820A (en) Face fake image detection method, device, terminal and storage medium
CN115375548A (en) Super-resolution remote sensing image generation method, system, equipment and medium
CN114049491A (en) Fingerprint segmentation model training method, fingerprint segmentation device, fingerprint segmentation equipment and fingerprint segmentation medium
CN111223046B (en) Image super-resolution reconstruction method and device
CN113538254A (en) Image restoration method and device, electronic equipment and computer readable storage medium
CN116758092A (en) Image segmentation method, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant