CN111383200B - CFA image demosaicing method based on a generative adversarial neural network


Info

Publication number: CN111383200B (granted publication of application CN202010239207.XA; earlier publication CN111383200A)
Authority: CN (China)
Original language: Chinese (zh)
Inventors: 罗静蕊, 王婕
Applicant and assignee: Xian University of Technology
Legal status: Active

Classifications

    • G06T5/00 Image enhancement or restoration
    • G06T5/77 Retouching; Inpainting; Scratch removal
    • G06T2207/10024 Color image
    • G06T2207/20021 Dividing image into blocks, subimages or windows
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • Y02T10/40 Engine management systems


Abstract

The invention discloses a CFA image demosaicing method based on a generative adversarial neural network, which comprises the following steps: step 1, constructing a training data set and preprocessing the data to obtain a sufficiently large training data set; step 2, constructing a generative adversarial network (GAN); step 3, performing a series of pixel operations on the training data set obtained in step 1 and using the result as the input of the GAN built in step 2; step 4, setting the loss function and hyperparameters of the GAN built in step 2, and selecting a network optimization algorithm to optimize the loss function; step 5, training the constructed generative adversarial network GAN; and step 6, performing a test experiment on the network model trained in step 5, measuring the demosaiced images with the color peak signal-to-noise ratio and the structural similarity index to illustrate the network performance. The method aims to reduce artifacts in the demosaiced image and to better recover the texture information of high-frequency parts (corners and edges) of the image.

Description

CFA image demosaicing method based on a generative adversarial neural network
Technical Field
The invention belongs to the technical field of image processing methods, and particularly relates to a CFA image demosaicing method based on a generative adversarial neural network.
Background
Images are one of the most important means for humans to acquire, express and communicate information. Digital cameras have become mainstream imaging devices and are widely applied in fields such as intelligent transportation, medical imaging and remote sensing. The digital color image is the most widely used in daily life; each of its pixels has three color values: red, green and blue. To obtain accurate color information, a camera would need three sensors to receive the red, green and blue components of each pixel separately and then synthesize the three components into one color image. The alignment of three color sensors affects the subsequent color synthesis, and three-sensor cameras are expensive and relatively bulky, so most digital cameras instead place a color filter array (CFA) in front of the sensor element, sampling only one color component at each pixel. Among CFA layouts, the Bayer pattern is one of the most popular, and many demosaicing algorithms are designed for it. To recover a high-quality color image from a CFA image, the demosaicing technique is essential.
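For illustration, the following sketch (not part of the patent) shows how a single-sensor camera with a Bayer CFA keeps one color sample per pixel; the GRBG phase chosen here is an assumption, since the text does not fix it:

```python
# Illustrative sketch of Bayer-CFA acquisition; the GRBG phase is an assumption.
import numpy as np

def bayer_mosaic(rgb):
    """Keep one color sample per pixel according to a GRBG Bayer layout."""
    h, w, _ = rgb.shape
    cfa = np.zeros((h, w), dtype=rgb.dtype)
    cfa[0::2, 0::2] = rgb[0::2, 0::2, 1]  # green (on red rows)
    cfa[0::2, 1::2] = rgb[0::2, 1::2, 0]  # red
    cfa[1::2, 0::2] = rgb[1::2, 0::2, 2]  # blue
    cfa[1::2, 1::2] = rgb[1::2, 1::2, 1]  # green (on blue rows)
    return cfa

rgb = np.random.rand(4, 4, 3).astype(np.float32)
print(bayer_mosaic(rgb).shape)  # (4, 4): a single color value per pixel
```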
CFA image demosaicing is essentially an ill-posed inverse problem. Many demosaicing algorithms exist; the simplest and most widely used are image interpolation algorithms, above all the linear interpolation methods, which are simple and fast to implement but have inherent drawbacks such as artifacts and blurred details at edges. Nearest-neighbor interpolation is a comparatively fast method that fills in missing values by copying the spatially nearest pixel value; since it ignores the correlation between color and space, it produces considerable false color and aliasing. Bilinear interpolation estimates the non-captured pixels from neighboring pixels, but the reconstructed image exhibits many artifacts and color distortions. Bicubic interpolation is a grayscale image magnification algorithm that can be used for demosaicing by interpolating each color component independently, but its output also tends to contain artifacts. Such conventional demosaicing methods can achieve high accuracy in large smooth areas of approximately uniform color and gradually changing brightness, but they still struggle at the edges and corners of the image. Overall, these methods have two problems. First, interpolating with hand-crafted filters causes spatial deviation in each color channel where the high-frequency signal changes (such as at edges and corners), so interpolating neighboring values produces color artifacts or zipper effects. Second, some conventional demosaicing methods ignore the correlation between different color channels and do not take all of the image information in the CFA into account; to obtain a smoother image, the correlation between channels must be considered.
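As a concrete illustration of the channel-independent prior art criticized above (a hedged sketch, not the invention): bilinear demosaicing fills in each sparse color plane separately by convolving with a fixed kernel, which is exactly the processing that produces false color and zipper artifacts:

```python
# Sketch of the bilinear demosaicing baseline with fixed kernels; masks mark
# where each color was sampled. This is prior art, not the patent's method.
import numpy as np
from scipy.ndimage import convolve

K_G  = np.array([[0, 1, 0], [1, 4, 1], [0, 1, 0]], np.float32) / 4.0  # half-density green
K_RB = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], np.float32) / 4.0  # quarter-density red/blue

def bilinear_demosaic(cfa, masks):
    """cfa: (H, W) mosaic; masks: (H, W, 3) binary sampling masks for R, G, B."""
    out = np.zeros(masks.shape, np.float32)
    for c, k in zip(range(3), (K_RB, K_G, K_RB)):
        out[..., c] = convolve(cfa * masks[..., c], k, mode='mirror')
    return out
```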
Disclosure of Invention
The invention aims to provide a CFA image demosaicing method based on a generative adversarial neural network, which reduces artifacts in the demosaiced image and better recovers the texture information of high-frequency parts (corners and edges) of the image, so as to obtain images closer to the ground truth.
The technical scheme adopted by the invention is a CFA image demosaicing method based on a generative adversarial neural network, implemented according to the following steps:
step 1, constructing a training data set and preprocessing the data to obtain a sufficiently large training data set;
step 2, constructing a generative adversarial network (GAN), wherein the GAN comprises two parts, a generator and a discriminator, using a U-Net model for the generator and a dense residual network for the discriminator;
step 3, performing a series of pixel operations on the training data set obtained in step 1 and using the result as the input of the GAN network built in step 2;
step 4, setting the loss function and hyperparameters of the GAN network built in step 2, and selecting a network optimization algorithm to optimize the loss function;
step 5, training the constructed generative adversarial network GAN on the training data set according to the hyperparameters, the loss function and the network optimization algorithm set in step 4, to obtain a trained network model corresponding to the training data set;
and step 6, performing a test experiment on the network model trained in step 5, measuring the demosaiced images with the color peak signal-to-noise ratio and the structural similarity index to illustrate the network performance.
The present invention is also characterized in that,
the step 1 specifically comprises the following steps:
step 1.1, randomly selecting a number of color images from an existing database to form a color image data set, and performing a downsampling operation on each color image with a filter; the downsampled images are the color filter array images, all CFA images form a CFA image data set, and the color image data set and the CFA image data set together form the training data set;
step 1.2, preprocessing the training data set obtained in step 1.1: scale each image in the training data set by factors of 0.7, 0.8, 0.9 and 1, and select a sliding window of suitable scale, according to the image size, to perform a translation operation, i.e. to divide the images into patches, obtaining a number of small patch images and thereby improving network training performance; an augmentation operation of 90° rotation, 180° rotation, 270° rotation and up-down flipping is then performed on each patch image, resulting in a sufficiently large training data set.
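The preprocessing of step 1.2 can be sketched as follows; the patch size of 64 and the stride are placeholders, since the patent does not state the window scale:

```python
# Hedged sketch of step 1.2: multi-scale resizing, sliding-window patch
# extraction, and rotation/flip augmentation. Patch size 64 and stride 64
# are placeholder values.
import numpy as np

SCALES = (0.7, 0.8, 0.9, 1.0)  # each image is first rescaled by these factors

def augment(patch):
    """Return the patch plus its 90/180/270-degree rotations and up-down flip."""
    return [patch,
            np.rot90(patch, 1), np.rot90(patch, 2), np.rot90(patch, 3),
            np.flipud(patch)]

def extract_patches(img, size=64, stride=64):
    patches = []
    for y in range(0, img.shape[0] - size + 1, stride):
        for x in range(0, img.shape[1] - size + 1, stride):
            patches.extend(augment(img[y:y + size, x:x + size]))
    return patches
```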
The step 2 specifically comprises the following steps:
step 2.1, constructing a generator part in GAN, and setting parameters of each layer of the network by adopting a U-Net model;
step 2.2, constructing the discriminator part of the GAN and setting the parameters of each layer of the network, adopting a dense residual network model.
In step 2.1, the generator adopts a U-Net model whose structure is, in sequence: input layer - 1st convolution layer - 1st PReLU activation function layer - 2nd convolution layer - 1st batch normalization operation layer - 2nd PReLU activation function layer - 3rd convolution layer - 2nd batch normalization operation layer - 3rd PReLU activation function layer - 4th convolution layer - 3rd batch normalization operation layer - 4th PReLU activation function layer - 5th convolution layer - 4th batch normalization operation layer - 5th PReLU activation function layer - 6th convolution layer - 5th batch normalization operation layer - 6th PReLU activation function layer - 7th convolution layer - 6th batch normalization operation layer - 7th PReLU activation function layer - 8th convolution layer - 7th batch normalization operation layer - 8th PReLU activation function layer - 1st deconvolution layer - 8th batch normalization operation layer - 9th PReLU activation function layer - 2nd deconvolution layer - 9th batch normalization operation layer - 10th PReLU activation function layer - 3rd deconvolution layer - 10th batch normalization operation layer - 11th PReLU activation function layer - 4th deconvolution layer - 11th batch normalization operation layer - 12th PReLU activation function layer - 5th deconvolution layer - 12th batch normalization operation layer - 13th PReLU activation function layer - 6th deconvolution layer - 13th batch normalization operation layer - 14th PReLU activation function layer - 7th deconvolution layer - 14th batch normalization operation layer - 8th deconvolution layer - 1st Tanh activation function layer - output layer;
wherein the input layer receives the four-channel compressed CFA image and the output layer produces the output image; in this structure, the 1st convolution layer output is connected with the 14th batch normalization operation layer output, the 1st batch normalization operation layer output with the 13th, the 2nd with the 12th, the 3rd with the 11th, the 4th with the 10th, the 5th with the 9th, and the 6th with the 8th, forming symmetrical skip connections.
In step 2.1, parameters of each layer of the generator structure in the constructed GAN model are set as follows:
The input layer has 4 input channels. For the convolution layers: the 1st uses 3×3 kernels with stride 1×1 and 16 feature maps; the 2nd to 8th all use 4×4 kernels with stride 2×2, with 32, 64, 128, 256, 256, 512 and 512 feature maps respectively. For the deconvolution layers: the 1st to 7th all use 4×4 kernels with stride 2×2, with 512, 512, 256, 256, 128, 64 and 32 feature maps respectively; the 8th uses 3×3 kernels with stride 1×1 and 3 feature maps;
For all PReLU activation function layers in the generator, the function is defined as:

PReLU(x_p) = max(0, x_p) + κ · min(0, x_p)    (1)

where κ is a positive constant, κ ∈ (0, 1), set here to 0.1, and x_p denotes the input vector of the PReLU activation function layer;
for the Tanh activation function layer in the generator, its function is defined as:
Tanh(x_t) = (e^{x_t} − e^{−x_t}) / (e^{x_t} + e^{−x_t})    (2)

where x_t denotes the input vector of the Tanh activation function layer.
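A compact Keras sketch of this generator is given below. Kernel sizes, strides and feature-map counts follow the text; padding='same', the 128×128 input size, concatenation as the form of the symmetric connections, and LeakyReLU with slope 0.1 (equivalent to the fixed-κ PReLU of equation (1)) are assumptions of the sketch:

```python
# Hedged sketch of the U-Net generator of step 2.1 (not the patent's code).
import tensorflow as tf
from tensorflow.keras import layers

def build_generator(h=128, w=128):
    inp = layers.Input((h, w, 4))                     # four-channel compressed CFA
    x = layers.Conv2D(16, 3, strides=1, padding='same')(inp)   # 1st conv (no BN)
    skips = [x]
    x = layers.LeakyReLU(0.1)(x)                      # PReLU with fixed kappa = 0.1
    for f in (32, 64, 128, 256, 256, 512):            # convs 2-7: 4x4, stride 2, BN
        x = layers.Conv2D(f, 4, strides=2, padding='same')(x)
        x = layers.BatchNormalization()(x)
        skips.append(x)
        x = layers.LeakyReLU(0.1)(x)
    x = layers.Conv2D(512, 4, strides=2, padding='same')(x)    # 8th conv (bottleneck)
    x = layers.BatchNormalization()(x)
    x = layers.LeakyReLU(0.1)(x)
    # decoder; the text has no activation after the 14th BN, kept here for loop brevity
    for f, skip in zip((512, 512, 256, 256, 128, 64, 32), reversed(skips)):
        x = layers.Conv2DTranspose(f, 4, strides=2, padding='same')(x)
        x = layers.BatchNormalization()(x)
        x = layers.Concatenate()([x, skip])           # symmetric connection
        x = layers.LeakyReLU(0.1)(x)
    x = layers.Conv2DTranspose(3, 3, strides=1, padding='same')(x)  # 8th deconv
    return tf.keras.Model(inp, layers.Activation('tanh')(x))

generator = build_generator()
```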
In step 2.2, the discriminator adopts a dense residual network model whose structure is, in sequence: input layer - 1st dense residual block - 2nd dense residual block - 3rd dense residual block - 4th dense residual block - 5th dense residual block - convolution layer - Sigmoid activation function layer - output layer;
the input layer receives the output image of the generator and the corresponding real image; the output layer gives the judgment result of the discriminator, 0 or 1; the outputs of the 1st, 2nd, 3rd and 4th dense residual blocks are each connected with the output of the 5th dense residual block, forming long skip connections among the dense residual blocks;
the structure of all dense residual blocks in the discriminator is: the 1 st PReLU activation function layer-the 1 st convolution layer-the 1 st batch normalization operation layer-the 2 nd PReLU activation function layer-the 2 nd convolution layer-the 2 nd batch normalization operation layer-the 3 rd PReLU activation function layer-the 3 rd convolution layer-the 3 rd batch normalization operation layer;
The input of the 1st PReLU activation function layer is connected with the outputs of the 1st and 2nd batch normalization operation layers; the output of the 1st batch normalization operation layer is connected with the outputs of the 2nd and 3rd batch normalization operation layers; and the output of the 2nd batch normalization operation layer is connected with the output of the 3rd, finally forming the dense connections and long skip connections of a dense residual block;
the parameters of each layer of the discriminator structure in the constructed GAN model are set as follows:
for all PReLU activation function layers in the discriminator, the function definition is as shown in equation (1), and the parameter settings are consistent with those in equation (1).
For the Sigmoid activation function layer in the discriminator, its function is defined as:
Sigmoid(x_s) = 1 / (1 + e^{−x_s})    (3)

where x_s denotes the input vector of the Sigmoid activation function layer. All convolution layers of the dense residual blocks in the discriminator use 3×3 kernels with stride 1×1 and 64 feature maps.
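The following Keras sketch illustrates one way to realize this discriminator. The dense and long skip connections are implemented with concatenation, and the global pooling to a single 0/1 score is an assumption, since the text only specifies a convolution layer and Sigmoid before the output:

```python
# Hedged sketch of the dense-residual discriminator of step 2.2.
import tensorflow as tf
from tensorflow.keras import layers

def dense_residual_block(x):
    feats = [x]
    for _ in range(3):                                # three PReLU-conv-BN stages
        y = feats[0] if len(feats) == 1 else layers.Concatenate()(feats)
        y = layers.LeakyReLU(0.1)(y)                  # PReLU with fixed kappa = 0.1
        y = layers.Conv2D(64, 3, padding='same')(y)   # 3x3, stride 1, 64 maps
        y = layers.BatchNormalization()(y)
        feats.append(y)
    return layers.Concatenate()(feats)                # dense + long skips in the block

def build_discriminator(h=128, w=128):
    inp = layers.Input((h, w, 3))                     # generated or real image
    x, block_outs = inp, []
    for _ in range(5):
        x = dense_residual_block(x)
        block_outs.append(x)
    x = layers.Concatenate()(block_outs)              # blocks 1-4 joined to block 5
    x = layers.Conv2D(1, 3, padding='same')(x)
    x = layers.GlobalAveragePooling2D()(x)            # assumption: reduce to a scalar
    return tf.keras.Model(inp, layers.Activation('sigmoid')(x))

discriminator = build_discriminator()
```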
The step 3 specifically comprises the following steps:
Take the CFA image data set in the training data set of step 1 as the input CFA images and perform a series of pixel operations on each input CFA image: extract the red, green and blue color components of the CFA image to form a three-channel CFA image; further separate the green component into two channels, forming a four-channel CFA image; finally, perform pixel compression on the four-channel image, extracting only the non-zero pixel values. The resulting four-channel compressed CFA image is used as the input of the generator, whose output is the interpolated output image; the output image and the real image are input to the discriminator, and the output of the discriminator is fed back to the generator.
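A sketch of these pixel operations, assuming the same GRBG Bayer phase as above: the mosaic is split into R, G1, G2 and B planes and compressed by keeping only the sampled, non-zero positions, yielding a four-channel image at half the spatial resolution:

```python
# Hedged sketch of step 3: packing a Bayer CFA into the four-channel
# compressed input of the generator (GRBG phase assumed).
import numpy as np

def pack_cfa(cfa):
    """cfa: (H, W) Bayer mosaic -> (H/2, W/2, 4) compressed CFA image."""
    g1 = cfa[0::2, 0::2]   # green samples on red rows
    r  = cfa[0::2, 1::2]   # red samples
    b  = cfa[1::2, 0::2]   # blue samples
    g2 = cfa[1::2, 1::2]   # green samples on blue rows
    return np.stack([r, g1, g2, b], axis=-1)

cfa = np.random.rand(128, 128).astype(np.float32)
print(pack_cfa(cfa).shape)   # (64, 64, 4)
```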
The step 4 specifically comprises the following steps:
Step 4.1, set the loss function of the GAN network. Let y_i denote the four-channel compressed CFA images obtained in step 3 and x_i the corresponding real images, where i = 1, 2, …, N and N is the number of images. For the generator part, the adversarial loss function L_a is expressed as:

L_a = (1/N) Σ_{i=1}^{N} −log D(G(y_i))    (4)

where G(y_i) denotes the output image produced by the generator from the four-channel compressed CFA image y_i, and D(G(y_i)) the output result of the discriminator for the output image G(y_i). In addition, a VGG feature perception loss L_p is used, defined as:

L_p = (1/N) Σ_{i=1}^{N} ‖F[G(y_i)] − F(x_i)‖²    (5)

where F[G(y_i)] denotes the feature mapping matrix of the generator output G(y_i) extracted at the conv2 layer of a pretrained VGG network, and F(x_i) the feature mapping matrix of the real image x_i extracted at the same layer. A reconstruction loss function L_r is additionally used, defined as:

L_r = (1/N) Σ_{i=1}^{N} ‖G(y_i) − x_i‖_1 + λ·L_tv    (6)

where L_tv denotes the total variation term and λ the regularization weight; the L_r function removes detail artifacts while preserving the details in the picture. Finally, the three losses L_a, L_p and L_r are combined to jointly constrain the proposed generator, and the loss function of the generator is defined as:

L_G = αL_a + βL_p + γL_r    (7)

where α, β and γ denote positive weights; the parameters are set to α = 0.5, β = 1, γ = 1 and λ = 10⁻⁵. After the generator parameters are updated according to the above formula, the discriminator is updated; its loss function is:

L_D = −(1/N) Σ_{i=1}^{N} [log D(x_i) + log(1 − D(G(y_i)))]    (8)

where D(x_i) denotes the output result of the discriminator for the real image x_i, and D(G(y_i)) the output result of the discriminator for the output image G(y_i);

the total loss of the network is the sum of the generator loss and the discriminator loss, i.e. L_G + L_D.
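In TensorFlow the four losses can be sketched as below. Equations (4)-(6) and (8) appear in the source only as images, so the standard GAN, perceptual, L1-plus-TV and cross-entropy forms used here are reconstructions, not verbatim patent formulas:

```python
# Hedged sketch of the losses of step 4.1 with the weights alpha=0.5, beta=1,
# gamma=1, lambda=1e-5 from the text; the exact equation forms are assumed.
import tensorflow as tf

EPS = 1e-8  # numerical safety inside the logarithms

def generator_loss(d_fake, vgg_fake, vgg_real, fake, real,
                   alpha=0.5, beta=1.0, gamma=1.0, lam=1e-5):
    l_a = -tf.reduce_mean(tf.math.log(d_fake + EPS))             # adversarial, eq. (4)
    l_p = tf.reduce_mean(tf.square(vgg_fake - vgg_real))         # VGG perceptual, eq. (5)
    l_r = (tf.reduce_mean(tf.abs(fake - real))                   # reconstruction, eq. (6)
           + lam * tf.reduce_mean(tf.image.total_variation(fake)))
    return alpha * l_a + beta * l_p + gamma * l_r                # L_G, eq. (7)

def discriminator_loss(d_real, d_fake):                          # L_D, eq. (8)
    return -tf.reduce_mean(tf.math.log(d_real + EPS)
                           + tf.math.log(1.0 - d_fake + EPS))
```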
Step 4.2, set the hyperparameters of the GAN network, specifically including the input batch size, the learning rate and the number of iterations of the network;

the input batch size is set to 64, the number of iterations to 200 and the initial learning rate to 0.01, with the learning rate reduced to 1/10 of its value every 40 iterations;

step 4.3, set the optimization algorithm of the GAN network; the adaptive moment estimation algorithm is used to optimize the loss function of step 4.1.
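Putting steps 4.2 and 4.3 together, one training step might look as follows. It reuses the generator, discriminator and loss sketches above; steps_per_epoch is a placeholder, and taking VGG16's block1_conv2 as the "conv2" feature layer is an assumption:

```python
# Hedged sketch of the optimization of steps 4.2-4.3: Adam (adaptive moment
# estimation), batch size 64, initial learning rate 0.01 divided by 10 every
# 40 epochs (iterations are read here as epochs).
import tensorflow as tf

steps_per_epoch = 100                                  # placeholder value
boundaries = [40 * steps_per_epoch * k for k in (1, 2, 3, 4)]
values = [0.01 / 10 ** k for k in range(5)]
lr = tf.keras.optimizers.schedules.PiecewiseConstantDecay(boundaries, values)
g_opt, d_opt = tf.keras.optimizers.Adam(lr), tf.keras.optimizers.Adam(lr)

base = tf.keras.applications.VGG16(include_top=False, weights='imagenet')
vgg = tf.keras.Model(base.input, base.get_layer('block1_conv2').output)

@tf.function
def train_step(cfa4, real):
    with tf.GradientTape() as gt, tf.GradientTape() as dt:
        fake = generator(cfa4, training=True)
        d_real = discriminator(real, training=True)
        d_fake = discriminator(fake, training=True)
        g_loss = generator_loss(d_fake, vgg(fake), vgg(real), fake, real)
        d_loss = discriminator_loss(d_real, d_fake)
    g_opt.apply_gradients(zip(gt.gradient(g_loss, generator.trainable_variables),
                              generator.trainable_variables))
    d_opt.apply_gradients(zip(dt.gradient(d_loss, discriminator.trainable_variables),
                              discriminator.trainable_variables))
    return g_loss, d_loss
```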
The step 6 specifically comprises the following steps:
step 6.1, select a test color image that is not in the network training set and perform the downsampling operation on it with the filter; the resulting downsampled image is the test CFA image;

step 6.2, input the test CFA image obtained in step 6.1 into the network model trained in step 5, obtaining the output image of the network, i.e. the test demosaiced image;

step 6.3, to illustrate the network performance, measure the test demosaiced image obtained in step 6.2 with the color peak signal-to-noise ratio and the structural similarity index. The criterion for the CPSNR index is that a larger value, above 20 dB, indicates a better demosaicing effect; the SSIM index ranges over [0, 1], and a larger value indicates that the image structure is closer to the original image.
The beneficial effects of the invention are as follows: the constructed network framework is trained and the trained network then completes image demosaicing directly, without any manual adjustment of network parameters.
Drawings
FIG. 1 is a flow chart of the CFA image demosaicing method based on a generative adversarial neural network according to the invention;
FIG. 2 is the model constructed in the CFA image demosaicing method based on a generative adversarial neural network according to the invention;
FIG. 3 is a schematic diagram of the structure of a generator in the GAN model constructed in the method of the present invention;
FIG. 4 is a schematic diagram of the structure of a discriminator in a GAN model constructed in the method of the invention;
FIG. 5 is a simulation diagram in a simulation experiment of the present invention;
in fig. 5, fig. 5 (a) is a test color image used in the simulation experiment of the present invention, fig. 5 (b) is a corresponding enlarged area image at the square in fig. 5 (a), fig. 5 (c) is a simulation result image using the nearest neighbor interpolation method, fig. 5 (d) is a corresponding enlarged area image at the square in fig. 5 (c), fig. 5 (e) is a simulation result image using the bilinear interpolation method, fig. 5 (f) is a corresponding enlarged area image at the square in fig. 5 (e), fig. 5 (g) is a simulation result of the method of the present invention, and fig. 5 (h) is a corresponding enlarged area image at the square in fig. 5 (g).
In the figure, 1. Input CFA image, 2. Three-channel CFA image (red, green, blue color components in top-down order in the figure), 3. Four-channel CFA image (red, green, blue color components in top-down order in the figure), 4. Four-channel compressed CFA image (red, green, blue color components in top-down order in the figure), 5. Output image (red, green, blue color components in top-down order in the figure), 6. Real image (red, green, blue color components in top-down order in the figure), 7. Generator, 8. Discriminator, 9. Convolution layer, 10. Deconvolution layer, 11.PReLU activation function layer, 12. Batch Normalization (BN) operation layer, 13.Tanh activation function layer, 14. Dense residual block, 15.Sigmoid activation function layer.
Detailed Description
The invention will be described in detail below with reference to the drawings and the detailed description.
The invention discloses a CFA image demosaicing method based on a generative adversarial neural network which, as shown in fig. 1, is implemented according to the following steps:
step 1, constructing a training data set and preprocessing the data to obtain a sufficiently large training data set;
the step 1 specifically comprises the following steps:
step 1.1, randomly selecting 400 color images from an existing database to form a color image data set, and performing a downsampling operation on each color image with a filter; the downsampled images are the Color Filter Array (CFA) images, all CFA images form a CFA image data set, and the color image data set and the CFA image data set together form the training data set;
step 1.2, preprocessing the training data set obtained in step 1.1: scale each image in the training data set by factors of 0.7, 0.8, 0.9 and 1, and select a sliding window of suitable scale, according to the image size, to perform a translation operation, i.e. to divide the images into patches, obtaining a number of small patch images and thereby improving network training performance; an augmentation operation of 90° rotation, 180° rotation, 270° rotation and up-down flipping is then performed on each patch image, resulting in a sufficiently large training data set.
Step 2, constructing a generative adversarial network (GAN), wherein the GAN comprises two parts, a generator and a discriminator, using a U-Net model for the generator and a dense residual network for the discriminator;
the step 2 specifically comprises the following steps:
step 2.1, constructing a generator part in GAN, and setting parameters of each layer of the network by adopting a U-Net model;
As shown in fig. 3, in step 2.1 the generator adopts a U-Net model whose structure is, in sequence: input layer - 1st convolution layer 9 - 1st PReLU activation function layer 11 - 2nd convolution layer 9 - 1st batch normalization operation layer 12 - 2nd PReLU activation function layer 11 - 3rd convolution layer 9 - 2nd batch normalization operation layer 12 - 3rd PReLU activation function layer 11 - 4th convolution layer 9 - 3rd batch normalization operation layer 12 - 4th PReLU activation function layer 11 - 5th convolution layer 9 - 4th batch normalization operation layer 12 - 5th PReLU activation function layer 11 - 6th convolution layer 9 - 5th batch normalization operation layer 12 - 6th PReLU activation function layer 11 - 7th convolution layer 9 - 6th batch normalization operation layer 12 - 7th PReLU activation function layer 11 - 8th convolution layer 9 - 7th batch normalization operation layer 12 - 8th PReLU activation function layer 11 - 1st deconvolution layer 10 - 8th batch normalization operation layer 12 - 9th PReLU activation function layer 11 - 2nd deconvolution layer 10 - 9th batch normalization operation layer 12 - 10th PReLU activation function layer 11 - 3rd deconvolution layer 10 - 10th batch normalization operation layer 12 - 11th PReLU activation function layer 11 - 4th deconvolution layer 10 - 11th batch normalization operation layer 12 - 12th PReLU activation function layer 11 - 5th deconvolution layer 10 - 12th batch normalization operation layer 12 - 13th PReLU activation function layer 11 - 6th deconvolution layer 10 - 13th batch normalization operation layer 12 - 14th PReLU activation function layer 11 - 7th deconvolution layer 10 - 14th batch normalization operation layer 12 - 8th deconvolution layer 10 - 1st Tanh activation function layer 13 - output layer;
wherein the input layer receives the four-channel compressed CFA image and the output layer produces the output image; in this structure, the 1st convolution layer output is connected with the 14th batch normalization operation layer output, the 1st batch normalization operation layer output with the 13th, the 2nd with the 12th, the 3rd with the 11th, the 4th with the 10th, the 5th with the 9th, and the 6th with the 8th, forming symmetrical skip connections.
In step 2.1, parameters of each layer of the generator structure in the constructed GAN model are set as follows:
The input layer has 4 input channels. For the convolution layers: the 1st uses 3×3 kernels with stride 1×1 and 16 feature maps; the 2nd to 8th all use 4×4 kernels with stride 2×2, with 32, 64, 128, 256, 256, 512 and 512 feature maps respectively. For the deconvolution layers: the 1st to 7th all use 4×4 kernels with stride 2×2, with 512, 512, 256, 256, 128, 64 and 32 feature maps respectively; the 8th uses 3×3 kernels with stride 1×1 and 3 feature maps;
For all PReLU activation function layers in the generator, the function is defined as:

PReLU(x_p) = max(0, x_p) + κ · min(0, x_p)    (1)

where κ is a positive constant, κ ∈ (0, 1), set here to 0.1, and x_p denotes the input vector of the PReLU activation function layer;
for the Tanh activation function layer in the generator, its function is defined as:
Tanh(x_t) = (e^{x_t} − e^{−x_t}) / (e^{x_t} + e^{−x_t})    (2)

where x_t denotes the input vector of the Tanh activation function layer;
step 2.2, constructing the discriminator part of the GAN and setting the parameters of each layer of the network, adopting a dense residual network model;
As shown in fig. 4, in step 2.2 the discriminator adopts a dense residual network model whose structure is, in sequence: input layer - 1st dense residual block 14 - 2nd dense residual block 14 - 3rd dense residual block 14 - 4th dense residual block 14 - 5th dense residual block 14 - convolution layer - Sigmoid activation function layer 15 - output layer;
the input layer receives the output image of the generator and the corresponding real image; the output layer gives the judgment result of the discriminator, 0 or 1; the outputs of the 1st, 2nd, 3rd and 4th dense residual blocks are each connected with the output of the 5th dense residual block, forming long skip connections among the dense residual blocks;
the structure of all dense residual blocks in the discriminator is: the 1 st PReLU activation function layer-the 1 st convolution layer-the 1 st batch normalization operation layer-the 2 nd PReLU activation function layer-the 2 nd convolution layer-the 2 nd batch normalization operation layer-the 3 rd PReLU activation function layer-the 3 rd convolution layer-the 3 rd batch normalization operation layer;
The input of the 1st PReLU activation function layer is connected with the outputs of the 1st and 2nd batch normalization operation layers; the output of the 1st batch normalization operation layer is connected with the outputs of the 2nd and 3rd batch normalization operation layers; and the output of the 2nd batch normalization operation layer is connected with the output of the 3rd, finally forming the dense connections and long skip connections of a dense residual block;
the parameters of each layer of the discriminator structure in the constructed GAN model are set as follows:
for all PReLU activation function layers in the discriminator, the function definition is as shown in equation (1), and the parameter settings are consistent with those in equation (1).
For the Sigmoid activation function layer in the discriminator, its function is defined as:
Sigmoid(x_s) = 1 / (1 + e^{−x_s})    (3)

where x_s denotes the input vector of the Sigmoid activation function layer. All convolution layers of the dense residual blocks in the discriminator use 3×3 kernels with stride 1×1 and 64 feature maps.
Step 3, performing a series of pixel operations on the training data set obtained in step 1 and using the result as the input of the GAN network built in step 2;
The step 3 specifically comprises the following steps:
As shown in fig. 2, the CFA image data set in the training data set of step 1 is taken as the input CFA image 1 and a series of pixel operations is performed on it: the red, green and blue color components of the CFA image are extracted to form a three-channel CFA image 2 (red, green and blue color components from top to bottom in the figure); the green component is then further separated into two channels, forming a four-channel CFA image 3 (red, green and blue color components from top to bottom in the figure); finally, the four-channel image is pixel-compressed, extracting only the non-zero pixel values. The resulting four-channel compressed CFA image 4 (red, green and blue color components from top to bottom in the figure) is used as the input of the generator 7, whose output is the interpolated output image 5 (red, green and blue color components from top to bottom in the figure); the output image 5 and the real image 6 are input to the discriminator 8, and the output of the discriminator 8 is fed back to the generator 7.
Step 4, setting the loss function and hyperparameters of the GAN network built in step 2, and selecting a network optimization algorithm to optimize the loss function;
The step 4 specifically comprises the following steps:
Step 4.1, set the loss function of the GAN network. Let y_i denote the four-channel compressed CFA images obtained in step 3 and x_i the corresponding real images, where i = 1, 2, …, N and N is the number of images. For the generator part, the adversarial loss function L_a is expressed as:

L_a = (1/N) Σ_{i=1}^{N} −log D(G(y_i))    (4)

where G(y_i) denotes the output image produced by the generator from the four-channel compressed CFA image y_i, and D(G(y_i)) the output result of the discriminator for the output image G(y_i). In addition, a VGG feature perception loss L_p is used, defined as:

L_p = (1/N) Σ_{i=1}^{N} ‖F[G(y_i)] − F(x_i)‖²    (5)

where F[G(y_i)] denotes the feature mapping matrix of the generator output G(y_i) extracted at the conv2 layer of a pretrained VGG network, and F(x_i) the feature mapping matrix of the real image x_i extracted at the same layer; this term helps detail recovery. A reconstruction loss function L_r is additionally used, defined as:

L_r = (1/N) Σ_{i=1}^{N} ‖G(y_i) − x_i‖_1 + λ·L_tv    (6)

where L_tv denotes the total variation term and λ the regularization weight; the L_r function removes detail artifacts while preserving the details in the picture. Finally, the three losses L_a, L_p and L_r are combined to jointly constrain the proposed generator, and the loss function of the generator is defined as:

L_G = αL_a + βL_p + γL_r    (7)

where α, β and γ denote positive weights; the parameters are set to α = 0.5, β = 1, γ = 1 and λ = 10⁻⁵. After the generator parameters are updated according to the above formula, the discriminator is updated; its loss function is:

L_D = −(1/N) Σ_{i=1}^{N} [log D(x_i) + log(1 − D(G(y_i)))]    (8)

where D(x_i) denotes the output result of the discriminator for the real image x_i, and D(G(y_i)) the output result of the discriminator for the output image G(y_i);

the total loss of the network is the sum of the generator loss and the discriminator loss, i.e. L_G + L_D.
Step 4.2, set the hyperparameters of the GAN network, specifically including the input batch size, the learning rate and the number of iterations of the network;

the input batch size is set to 64, the number of iterations to 200 and the initial learning rate to 0.01, with the learning rate reduced to 1/10 of its value every 40 iterations;

step 4.3, set the optimization algorithm of the GAN network; the adaptive moment estimation algorithm is used to optimize the loss function of step 4.1.
Step 5, training the constructed generative adversarial network GAN on the training data set according to the hyperparameters, the loss function and the network optimization algorithm set in step 4, to obtain a trained network model corresponding to the training data set;
Step 6, performing a test experiment on the network model trained in step 5, measuring the demosaiced images with the color peak signal-to-noise ratio and the structural similarity index to illustrate the network performance;
The step 6 specifically comprises the following steps:
step 6.1, select a test color image that is not in the network training set and perform the downsampling operation on it with the filter; the resulting downsampled image is the test CFA image;

step 6.2, input the test CFA image obtained in step 6.1 into the network model trained in step 5, obtaining the output image of the network, i.e. the test demosaiced image;

step 6.3, to illustrate the network performance, measure the test demosaiced image obtained in step 6.2 with the color peak signal-to-noise ratio (CPSNR) and the Structural Similarity Index (SSIM). The criterion for the CPSNR index is that a larger value, above 20 dB, indicates a better demosaicing effect; the SSIM index ranges over [0, 1], and a larger value indicates that the image structure is closer to the original image.
The effects of the present invention will be further described with reference to simulation experiments.
1. Simulation conditions:
The simulation experiments of the invention were trained in a TensorFlow environment on a computer equipped with an Nvidia MX250 GPU and an Intel Core i5-8265U CPU.
2. Simulation content and result analysis:
Fig. 5 is the simulation diagram of the invention: fig. 5(a) is the test color image used in the simulation experiment, fig. 5(b) the enlarged area at the square in fig. 5(a), fig. 5(c) the simulation result of the nearest-neighbor interpolation method, fig. 5(d) the enlarged area at the square in fig. 5(c), fig. 5(e) the simulation result of the bilinear interpolation method, fig. 5(f) the enlarged area at the square in fig. 5(e), fig. 5(g) the simulation result of the method of the invention, and fig. 5(h) the enlarged area at the square in fig. 5(g). It can be seen that the invention better identifies the high-frequency features (edges and corners) of the input signal, effectively restores the texture and edge information of the image, and suppresses artifacts to some extent, so the invention is applicable in practice.

Claims (3)

1. A CFA image demosaicing method based on a generative adversarial neural network, characterized by comprising the following steps:
step 1, constructing a training data set and preprocessing the data to obtain a sufficiently large training data set;
step 2, constructing a generative adversarial network (GAN), wherein the GAN comprises two parts, a generator and a discriminator, using a U-Net model for the generator and a dense residual network for the discriminator;
step 3, performing a series of pixel operations on the training data set obtained in step 1 and using the result as the input of the GAN network built in step 2;
step 4, setting the loss function and hyperparameters of the GAN network built in step 2, and selecting a network optimization algorithm to optimize the loss function;
step 5, training the constructed generative adversarial network GAN on the training data set according to the hyperparameters, the loss function and the network optimization algorithm set in step 4, to obtain a trained network model corresponding to the training data set;
step 6, performing a test experiment on the network model trained in step 5, measuring the demosaiced images with the color peak signal-to-noise ratio and the structural similarity index to illustrate the network performance;
The step 1 specifically comprises the following steps:
step 1.1, randomly selecting a number of color images from an existing database to form a color image data set, and performing a downsampling operation on each color image with a filter; the downsampled images are the color filter array images, all CFA images form a CFA image data set, and the color image data set and the CFA image data set together form the training data set;
step 1.2, preprocessing the training data set obtained in step 1.1: scale each image in the training data set by factors of 0.7, 0.8, 0.9 and 1, and select a sliding window of suitable scale, according to the image size, to perform a translation operation, i.e. to divide the images into patches, obtaining a number of small patch images and thereby improving network training performance; an augmentation operation of 90° rotation, 180° rotation, 270° rotation and up-down flipping is then performed on each patch image, thereby obtaining a sufficiently large training data set;
the step 2 specifically comprises the following steps:
step 2.1, constructing a generator part in GAN, and setting parameters of each layer of the network by adopting a U-Net model;
step 2.2, constructing the discriminator part of the GAN and setting the parameters of each layer of the network, adopting a dense residual network model;
In step 2.1, the generator adopts a U-Net model whose structure is, in sequence: input layer - 1st convolution layer - 1st PReLU activation function layer - 2nd convolution layer - 1st batch normalization operation layer - 2nd PReLU activation function layer - 3rd convolution layer - 2nd batch normalization operation layer - 3rd PReLU activation function layer - 4th convolution layer - 3rd batch normalization operation layer - 4th PReLU activation function layer - 5th convolution layer - 4th batch normalization operation layer - 5th PReLU activation function layer - 6th convolution layer - 5th batch normalization operation layer - 6th PReLU activation function layer - 7th convolution layer - 6th batch normalization operation layer - 7th PReLU activation function layer - 8th convolution layer - 7th batch normalization operation layer - 8th PReLU activation function layer - 1st deconvolution layer - 8th batch normalization operation layer - 9th PReLU activation function layer - 2nd deconvolution layer - 9th batch normalization operation layer - 10th PReLU activation function layer - 3rd deconvolution layer - 10th batch normalization operation layer - 11th PReLU activation function layer - 4th deconvolution layer - 11th batch normalization operation layer - 12th PReLU activation function layer - 5th deconvolution layer - 12th batch normalization operation layer - 13th PReLU activation function layer - 6th deconvolution layer - 13th batch normalization operation layer - 14th PReLU activation function layer - 7th deconvolution layer - 14th batch normalization operation layer - 8th deconvolution layer - 1st Tanh activation function layer - output layer;
wherein the input layer receives the four-channel compressed CFA image and the output layer produces the output image; in this structure, the 1st convolution layer output is connected with the 14th batch normalization operation layer output, the 1st batch normalization operation layer output with the 13th, the 2nd with the 12th, the 3rd with the 11th, the 4th with the 10th, the 5th with the 9th, and the 6th with the 8th, forming symmetrical skip connections;
in step 2.1, parameters of each layer of the generator structure in the constructed GAN model are set as follows:
The input layer has 4 input channels. For the convolution layers: the 1st uses 3×3 kernels with stride 1×1 and 16 feature maps; the 2nd to 8th all use 4×4 kernels with stride 2×2, with 32, 64, 128, 256, 256, 512 and 512 feature maps respectively. For the deconvolution layers: the 1st to 7th all use 4×4 kernels with stride 2×2, with 512, 512, 256, 256, 128, 64 and 32 feature maps respectively; the 8th uses 3×3 kernels with stride 1×1 and 3 feature maps;
For all PReLU activation function layers in the generator, the function is defined as:

PReLU(x_p) = max(0, x_p) + κ · min(0, x_p)    (1)

where κ is a positive constant, κ ∈ (0, 1), set here to 0.1, and x_p denotes the input vector of the PReLU activation function layer;
for the Tanh activation function layer in the generator, its function is defined as:
Tanh(x_t) = (e^{x_t} − e^{−x_t}) / (e^{x_t} + e^{−x_t})    (2)

where x_t denotes the input vector of the Tanh activation function layer;
in step 2.2, the discriminator adopts a dense residual network model whose structure is, in sequence: input layer - 1st dense residual block - 2nd dense residual block - 3rd dense residual block - 4th dense residual block - 5th dense residual block - convolution layer - Sigmoid activation function layer - output layer;
the input layer receives the output image of the generator and the corresponding real image; the output layer gives the judgment result of the discriminator, 0 or 1; the outputs of the 1st, 2nd, 3rd and 4th dense residual blocks are each connected with the output of the 5th dense residual block, forming long skip connections among the dense residual blocks;
the structure of all dense residual blocks in the discriminator is: the 1 st PReLU activation function layer-the 1 st convolution layer-the 1 st batch normalization operation layer-the 2 nd PReLU activation function layer-the 2 nd convolution layer-the 2 nd batch normalization operation layer-the 3 rd PReLU activation function layer-the 3 rd convolution layer-the 3 rd batch normalization operation layer;
The input of the 1st PReLU activation function layer is connected with the outputs of the 1st and 2nd batch normalization operation layers; the output of the 1st batch normalization operation layer is connected with the outputs of the 2nd and 3rd batch normalization operation layers; and the output of the 2nd batch normalization operation layer is connected with the output of the 3rd, finally forming the dense connections and long skip connections of a dense residual block;
the parameters of each layer of the discriminator structure in the constructed GAN model are set as follows:
for all PReLU activation function layers in the discriminator, the function definition is as shown in formula (1), and the parameter setting is consistent with that in formula (1);
for the Sigmoid activation function layer in the discriminator, its function is defined as:
Sigmoid(x_s) = 1 / (1 + e^{−x_s})    (3)

where x_s denotes the input vector of the Sigmoid activation function layer; all convolution layers of the dense residual blocks in the discriminator use 3×3 kernels with stride 1×1 and 64 feature maps;
the step 3 specifically comprises the following steps:
Take the CFA image data set in the training data set of step 1 as the input CFA images and perform a series of pixel operations on each input CFA image: extract the red, green and blue color components of the CFA image to form a three-channel CFA image; further separate the green component into two channels, forming a four-channel CFA image; finally, perform pixel compression on the four-channel image, extracting only the non-zero pixel values. The resulting four-channel compressed CFA image is used as the input of the generator, whose output is the interpolated output image; the output image and the real image are input to the discriminator, and the output of the discriminator is fed back to the generator.
2. The method for demosaicing CFA images based on a generated antagonistic neural network according to claim 1, wherein said step 4 specifically comprises the steps of:
step 4.1, setting the loss function of the GAN network: let y_i denote the four-channel compressed CFA images obtained in step 3 and x_i the corresponding real images, where i = 1, 2, …, N and N represents the number of images; for the generator part, the adversarial loss function L_a is expressed as:
L_a = \frac{1}{N} \sum_{i=1}^{N} -\log D\big(G(y_i)\big)    (4)
wherein G(y_i) represents the output image obtained by passing the four-channel compressed CFA image y_i through the generator, and D(G(y_i)) represents the output of the discriminator for the output image G(y_i); in addition, a VGG feature perception loss function L_p is used, defined as:
L_p = \frac{1}{N} \sum_{i=1}^{N} \big\| F[G(y_i)] - F(x_i) \big\|^2    (5)
wherein F[G(y_i)] represents the feature map of the generator output image G(y_i) extracted from the conv2 layer of a pretrained VGG network, and F(x_i) represents the feature map of the real image x_i extracted from the conv2 layer of the pretrained VGG network; in addition, a reconstruction loss function L_r is used, defined as:
L_r = \frac{1}{N} \sum_{i=1}^{N} \Big( \big\| G(y_i) - x_i \big\|^2 + \lambda\, L_{TV}\big(G(y_i)\big) \Big)    (6)

wherein L_{TV}(\cdot) represents the total variation and \lambda represents the regularization weight; the L_r function suppresses detail artifacts while preserving the details in the picture; finally, the three losses L_a, L_p and L_r are combined to jointly constrain the proposed generator, and the loss function of the generator is defined as:
L_G = \alpha L_a + \beta L_p + \gamma L_r    (7)
wherein \alpha, \beta and \gamma represent positive weights; the parameters are set to \alpha = 0.5, \beta = 1, \gamma = 1 and \lambda = 10^{-5}; after the parameters of the generator have been updated according to the above formula, the discriminator is updated, and the loss function of the discriminator is:
L_D = \frac{1}{N} \sum_{i=1}^{N} \Big( -\log D(x_i) - \log\big(1 - D(G(y_i))\big) \Big)    (8)
wherein D(x_i) represents the output of the discriminator for the real image x_i, and D(G(y_i)) represents the output of the discriminator for the output image G(y_i);
the total loss of the GAN network is the sum of the loss function of the generator and the loss function of the discriminator, i.e. L_G + L_D;
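As an illustration, here is a minimal PyTorch sketch of the combined generator loss following the reconstructed formulas (4)-(7); the discriminator D and the VGG feature extractor vgg_feat are assumed callables, and the anisotropic form of the total variation is an assumption:

```python
import torch

def total_variation(img):
    """Anisotropic total variation of a batch of images (N, C, H, W)."""
    tv_h = (img[:, :, 1:, :] - img[:, :, :-1, :]).abs().mean()
    tv_w = (img[:, :, :, 1:] - img[:, :, :, :-1]).abs().mean()
    return tv_h + tv_w

def generator_loss(D, vgg_feat, fake, real,
                   alpha=0.5, beta=1.0, gamma=1.0, lam=1e-5):
    eps = 1e-8                                              # numerical safety for log
    l_a = -torch.log(D(fake) + eps).mean()                  # adversarial loss, Eq. (4)
    l_p = (vgg_feat(fake) - vgg_feat(real)).pow(2).mean()   # perceptual loss, Eq. (5)
    l_r = (fake - real).pow(2).mean() \
          + lam * total_variation(fake)                     # reconstruction loss, Eq. (6)
    return alpha * l_a + beta * l_p + gamma * l_r           # generator loss, Eq. (7)
```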
step 4.2, setting the hyperparameters of the GAN network, specifically including the input batch size, the learning rate and the number of iterations of the network;
the input batch size is set to 64, the number of iterations to 200, and the initial learning rate to 0.01, with the learning rate reduced to 1/10 of its value after every 40 iterations;
step 4.3, setting the optimization algorithm of the GAN network: the adaptive moment estimation (Adam) algorithm is used to optimize the loss functions of step 4.1.
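As an illustration of this training configuration, here is a short PyTorch sketch; the placeholder modules stand in for the generator and discriminator of step 2, and Adam's default moment parameters are assumed since the claim only fixes the learning rate and its schedule:

```python
import torch
import torch.nn as nn

# Placeholder modules; the real generator and discriminator come from step 2.
G = nn.Sequential(nn.Conv2d(4, 3, 3, padding=1))
D = nn.Sequential(nn.Conv2d(3, 1, 3, padding=1), nn.Sigmoid())

opt_g = torch.optim.Adam(G.parameters(), lr=0.01)
opt_d = torch.optim.Adam(D.parameters(), lr=0.01)
# Reduce the learning rate to 1/10 of its value every 40 iterations.
sched_g = torch.optim.lr_scheduler.StepLR(opt_g, step_size=40, gamma=0.1)
sched_d = torch.optim.lr_scheduler.StepLR(opt_d, step_size=40, gamma=0.1)
```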
3. The method for demosaicing CFA images based on a generated antagonistic neural network according to claim 1, wherein said step 6 specifically comprises the steps of:
step 6.1, selecting a test color image that is not in the network training set, and downsampling it with a filter; the resulting downsampled image is the test CFA image;
step 6.2, inputting the test CFA image obtained in step 6.1 into the trained network model obtained in step 4 to obtain the output image of the network, namely the test demosaiced image;
step 6.3, in order to evaluate network performance, measuring the test demosaiced image obtained in step 6.2 with the color peak signal-to-noise ratio (CPSNR) and the structural similarity (SSIM) index; for the CPSNR index, a value above 20 dB indicates an acceptable result, and larger values indicate a better demosaicing effect; the SSIM index takes values in the range [0,1], and a larger value indicates that the image structure is closer to the original image.
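As an illustration, here is a minimal NumPy sketch of the CPSNR measurement computed jointly over the three color channels; the 8-bit peak value of 255 is an assumption:

```python
import numpy as np

def cpsnr(reference, test, peak=255.0):
    """Color PSNR in dB over all RGB channels of (H, W, 3) arrays."""
    diff = reference.astype(np.float64) - test.astype(np.float64)
    mse = np.mean(diff ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```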
CN202010239207.XA 2020-03-30 2020-03-30 CFA image demosaicing method based on generated antagonistic neural network Active CN111383200B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010239207.XA CN111383200B (en) 2020-03-30 2020-03-30 CFA image demosaicing method based on generated antagonistic neural network


Publications (2)

Publication Number Publication Date
CN111383200A CN111383200A (en) 2020-07-07
CN111383200B true CN111383200B (en) 2023-05-23

Family

ID=71220039

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010239207.XA Active CN111383200B (en) 2020-03-30 2020-03-30 CFA image demosaicing method based on generated antagonistic neural network

Country Status (1)

Country Link
CN (1) CN111383200B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112132104B (en) * 2020-10-09 2021-08-03 哈尔滨工业大学 ISAR ship target image domain enhancement identification method based on loop generation countermeasure network
CN112308119B (en) * 2020-10-15 2021-11-05 中国医学科学院北京协和医院 Immunofluorescence classification method and device for glomerulonephritis
CN112508929A (en) * 2020-12-16 2021-03-16 奥比中光科技集团股份有限公司 Method and device for training generation of confrontation network
CN113099121B (en) * 2021-04-15 2022-05-06 西安交通大学 ISP implementation method based on weak supervised learning
CN116128735B (en) * 2023-04-17 2023-06-20 中国工程物理研究院电子工程研究所 Multispectral image demosaicing structure and method based on densely connected residual error network

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11024009B2 (en) * 2016-09-15 2021-06-01 Twitter, Inc. Super resolution using a generative adversarial network
CN110473154B (en) * 2019-07-31 2021-11-16 西安理工大学 Image denoising method based on generation countermeasure network
CN110570353B (en) * 2019-08-27 2023-05-12 天津大学 Super-resolution reconstruction method for generating single image of countermeasure network by dense connection


Similar Documents

Publication Publication Date Title
CN111383200B (en) CFA image demosaicing method based on generated antagonistic neural network
CN111311490B (en) Video super-resolution reconstruction method based on multi-frame fusion optical flow
CN111739078B (en) Monocular unsupervised depth estimation method based on context attention mechanism
CN111080724B (en) Fusion method of infrared light and visible light
CN109671023B (en) Face image super-resolution secondary reconstruction method
JP4214409B2 (en) High resolution color image generation method, high resolution color image generation apparatus, and high resolution color image generation program
US7373019B2 (en) System and method for providing multi-sensor super-resolution
US20090252411A1 (en) Interpolation system and method
US8345971B2 (en) Method and system for spatial-temporal denoising and demosaicking for noisy color filter array videos
Zhou et al. Deep residual network for joint demosaicing and super-resolution
US20100067820A1 (en) Image processing apparatus and storage medium storing image processing program
CN108288256B (en) Multispectral mosaic image restoration method
WO2008150342A1 (en) Noise reduced color image using panchromatic image
CN110211038A (en) Super resolution ratio reconstruction method based on dirac residual error deep neural network
EP1110414A1 (en) Method and apparatus for synthesizing high-resolution imagery using one high-resolution camera and a lower resolution camera
CN114170286B (en) Monocular depth estimation method based on unsupervised deep learning
CN111626927A (en) Binocular image super-resolution method, system and device adopting parallax constraint
CN104036468A (en) Super-resolution reconstruction method for single-frame images on basis of pre-amplification non-negative neighbor embedding
CN113610912B (en) System and method for estimating monocular depth of low-resolution image in three-dimensional scene reconstruction
CN103313068A (en) White balance corrected image processing method and device based on gray edge constraint gray world
Paul et al. Maximum accurate medical image demosaicing using WRGB based Newton Gregory interpolation method
CN111696167A (en) Single image super-resolution reconstruction method guided by self-example learning
US11669939B1 (en) Burst deblurring with kernel estimation networks
CN104574320B (en) A kind of image super-resolution restored method based on sparse coding coefficients match
CN109658358B (en) Rapid Bayer color reconstruction method based on multi-guide filtering

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant