CN108171320B - Image domain conversion network and conversion method based on generative adversarial network


Info

Publication number: CN108171320B (application CN201711273921.5A)
Authority: CN (China)
Prior art keywords: network, image, input, loss, true
Legal status: Expired - Fee Related
Application number: CN201711273921.5A
Other versions: CN108171320A (in Chinese)
Inventors: 肖锋 (Xiao Feng), 白猛猛 (Bai Mengmeng), 冯飞 (Feng Fei)
Assignee (original and current): Xian Technological University
Application filed by Xian Technological University; priority to CN201711273921.5A
Publication of application CN108171320A; application granted; publication of grant CN108171320B


Classifications

    • G06N 3/045: Combinations of networks (neural network architectures)
    • G06F 18/214: Generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F 18/22: Matching criteria, e.g. proximity measures
    • G06N 3/048: Activation functions

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an image domain conversion network and a conversion method based on a generative adversarial network, comprising a U-shaped generation network, a true/false discrimination network and a pairing discrimination network. The image domain conversion process mainly comprises the following steps: 1) training the U-shaped generation network and establishing a network model of the U-shaped generation network; 2) normalizing the image to be converted and inputting it into the network model established in step 1) to complete the image domain conversion of the image to be converted. The invention can realize the image domain conversion task for a local region within an image, with high conversion quality in the local region, strong network discrimination ability and strong conversion stability, thereby greatly improving the authenticity of the generated image.

Description

Image domain conversion network and conversion method based on generative adversarial network
Technical Field
The invention relates to the technical field of image domain conversion, in particular to an image domain conversion network and a conversion method based on a generative adversarial network.
Background
Image domain conversion is an important research direction in computer vision with wide application prospects. The emergence of the generative adversarial network (GAN) has produced remarkable results in the field of image generation, which also provides a new approach to image domain conversion: an image is input, and the generation network outputs an image in the target domain, with training driven by the game between the generation network and the discrimination network. The GAN as originally proposed was an unsupervised learning method: through the game between the generation network and the discrimination network it gradually learns the data distribution of the training set, so that the generation network can take a random value as input and generate data according to the learned distribution; its earliest application was image generation. Subsequently, the Conditional GAN added artificial conditions to the GAN input, so that the generated data is no longer random but varies according to the input condition.
The Conditional GAN, an improvement on the original GAN that is still actively studied, can generate specific image data for a specific input rather than generating random image data from random input. It makes image domain conversion with a generative adversarial network possible: within its framework an original-domain image is taken as input, and after training the target-domain image can be output. Image domain conversion GANs implemented under this framework include: (1) pix2pix GAN, a supervised method built on a generation network and an adversarial discrimination network, which addresses the conversion task for the whole image; (2) Cycle GAN, an unsupervised method that uses two generation networks and two adversarial discrimination networks and trains them cyclically with a cycle-consistency loss. Although the unsupervised method does not require one-to-one paired training data, its conversion quality is inferior to the supervised pix2pix network, and it still targets whole-image conversion; among existing image domain conversion methods there is no dedicated GAN for the domain conversion task of a local region within an image.
Disclosure of Invention
The invention aims to provide an image domain conversion network and a conversion method based on a generative adversarial network, which can realize the image domain conversion task for a local region within an image, with high conversion quality in the local region, strong network discrimination ability and strong conversion stability, thereby greatly improving the authenticity of the generated image.
The technical scheme adopted by the invention is as follows:
an image domain conversion network based on a generative adversarial network comprises a U-shaped generation network, a true/false discrimination network and a pairing discrimination network. The U-shaped generation network comprises an encoding network and a decoding network; the input end of the encoding network receives the Input image Input, the output end of the encoding network is connected to the input end of the decoding network, and the output end of the decoding network outputs the network generated image Output. Real target domain images paired one-to-one with the Input images are set as target domain images target. The network generated image Output is fed, as a training negative sample of the true/false discrimination network, into its negative sample input end, and the target domain image target is fed, as a training positive sample, into its positive sample input end; the value output by the true/false discrimination network is fed back, as a true/false loss value, to the true/false loss input end of the decoding network. The network generated image Output together with the corresponding Input image Input is fed, as a training negative sample, into the negative sample input end of the pairing discrimination network, and the target domain image target together with the corresponding Input image Input is fed, as a training positive sample, into its positive sample input end; the value output by the pairing discrimination network is fed back, as a pairing loss value, to the pairing loss input end of the decoding network. The structural similarity value between the network generated image Output and the target domain image target is fed back, as a compensation loss value, to the compensation loss input end of the decoding network.
The encoding network comprises an eight-layer convolutional network; each layer has a 3 × 3 convolution kernel with stride 2 × 2 and comprises a convolution layer, a Batch Normalization layer and a Leaky ReLU activation layer, the α parameter of the Leaky ReLU activation layer being 0.2. The decoding network comprises an eight-layer deconvolution network; each layer has a 3 × 3 deconvolution kernel with stride 2 × 2 and comprises a deconvolution layer, a Batch Normalization layer and an activation layer; the activation layers of the first to seventh deconvolution layers are ReLU layers, and the activation layer of the eighth deconvolution layer is a tanh layer.
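For illustration only, the eight-layer encoding and decoding networks could be assembled as in the following minimal PyTorch sketch; it is not part of the patent, and the intermediate channel widths are assumptions (only the 3 × 3 kernels, 2 × 2 strides, activation choices and the final 1 × 1 × 1024 feature size come from the text):

```python
import torch.nn as nn

def enc_layer(in_ch, out_ch):
    # Conv 3x3, stride 2, then Batch Normalization and Leaky ReLU (alpha = 0.2).
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=2, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.LeakyReLU(0.2),
    )

def dec_layer(in_ch, out_ch, last=False):
    # Deconv 3x3, stride 2, Batch Normalization; ReLU for layers 1-7, tanh for layer 8.
    return nn.Sequential(
        nn.ConvTranspose2d(in_ch, out_ch, kernel_size=3, stride=2,
                           padding=1, output_padding=1),
        nn.BatchNorm2d(out_ch),
        nn.Tanh() if last else nn.ReLU(),
    )

# Assumed channel progression: a 256x256x3 input is halved eight times to 1x1x1024.
widths = [3, 64, 128, 256, 512, 512, 512, 512, 1024]
encoder = nn.ModuleList([enc_layer(a, b) for a, b in zip(widths, widths[1:])])
```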
The true/false discrimination network comprises multiple sequentially cascaded true/false discrimination convolutional layers; each layer comprises a convolution layer, a Batch Normalization layer and an activation layer; the activation layer of the last true/false discrimination convolutional layer uses a Sigmoid activation function, and the activation layers of the remaining true/false discrimination convolutional layers use the ReLU function.
The pairing discrimination network comprises a Concat layer and multiple sequentially cascaded pairing discrimination convolutional layers; each layer comprises a convolution layer, a Batch Normalization layer and an activation layer; the activation layer of the last pairing discrimination convolutional layer uses a Sigmoid activation function, and the activation layers of the remaining pairing discrimination convolutional layers use the ReLU function.
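As an illustration of the two discriminators, the following sketch (PyTorch; layer count and channel widths are assumptions, while the Concat input, the Batch Normalization layers, the ReLU/Sigmoid activation choices and the 30 × 30 × 1 patch output described in this document come from the text) shows how the pairing discriminator concatenates its two 256 × 256 × 3 inputs into a 256 × 256 × 6 tensor:

```python
import torch
import torch.nn as nn

def d_layer(in_ch, out_ch, stride=2):
    # Conv 3x3 + Batch Normalization + ReLU; stride 2 except where noted.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(),
    )

class PairingDiscriminator(nn.Module):
    """D2-net: judges whether two images form a genuine (Input, target) pair."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            d_layer(6, 64),            # 256 -> 128
            d_layer(64, 128),          # 128 -> 64
            d_layer(128, 256),         #  64 -> 32  (a 32x32x256 feature map)
            nn.Conv2d(256, 1, 3, stride=1, padding=0),  # 32 -> 30, stride 1
            nn.Sigmoid(),              # per-patch score in (0, 1)
        )

    def forward(self, a, b):
        return self.body(torch.cat([a, b], dim=1))  # the Concat layer

# The true/false discriminator D1-net is identical in spirit, except that its
# first layer takes a single 3-channel image instead of the concatenated pair.
```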
An image domain conversion method based on a generative adversarial network comprises the following steps:
1) training the U-shaped generation network, and establishing a network model of the U-shaped generation network; the method specifically comprises the following steps:
A. collecting a training image set of the domain to be converted, the set comprising one-to-one paired original domain images and target domain images; normalizing the original domain images in the training image set, the normalized images being the Input images during network training, and the target domain images in the training image set being the target domain images corresponding to the Input images;
B. converting the Input image Input obtained in step A, through the U-shaped generation network, into the network generated image Output of the training network;
C. training the multi-adversarial discrimination networks using the Input image Input, the target domain image target and the network generated image Output obtained in steps A and B: this comprises the training of the true/false discrimination network and the training of the pairing discrimination network, wherein the training of the true/false discrimination network comprises the following steps:
c11: initializing the network weights of the true/false discrimination network by a random initialization method;
c12: taking the network generated image Output as a negative sample and the target domain image target corresponding to the Input image Input as a positive sample, training the true/false discrimination network, and updating its network weights using the cross-entropy loss function and the Adam optimization algorithm;
the training of the pairing discrimination network comprises the following steps:
c21: initializing the network weights of the pairing discrimination network by a random initialization method;
c22: taking the network generated image Output with the corresponding Input image Input as a negative sample and the Input image Input with the corresponding target domain image target as a positive sample, training the pairing discrimination network, and updating its network weights using the cross-entropy loss function and the Adam optimization algorithm;
D. repeating step C; after the multi-adversarial discrimination networks have been trained twice, fixing the network weights of the true/false discrimination network and the pairing discrimination network;
E. training the U-shaped generation network by using the multi-adversarial discrimination networks obtained after the training in step D, specifically comprising the following steps:
e1: initializing the network weights of the U-shaped generation network by the Xavier random initialization method;
e2: inputting the network generated image Output into the true/false discrimination network, which outputs a true/false loss value; feeding the output true/false loss value back to the decoding network in the U-shaped generation network for updating the network weights. The true/false discrimination network outputs a 30 × 30 × 1 image used to return the loss value measuring how close the network generated image Output is to the real image; the value of each pixel of the output image ranges from 0 to 1, where a value closer to 1 indicates that, within the receptive field of that pixel, the network generated image Output is closer to the real image, and a value closer to 0 indicates that it is farther from the real image;
e3: inputting the Input image Input and the corresponding network generated image Output into the pairing discrimination network, which outputs a pairing loss value; feeding the output pairing loss value back to the decoding network in the U-shaped generation network for updating the network weights. The pairing discrimination network outputs a 30 × 30 × 1 image used to return the loss value measuring whether the pair (Input image Input, network generated image Output) matches like the pair (Input image Input, target domain image target); the value of each pixel of the output image ranges from 0 to 1, where a value closer to 1 indicates that the Input image Input and the network generated image Output are better matched, and a value closer to 0 indicates that they are not matched;
e4: calculating the structural similarity value between the network generated image Output and the target domain image target, and feeding the calculated structural similarity value back as a loss to the decoding network in the U-shaped generation network for updating the network weights. The structural similarity value comprises the result of an SSIM loss function and the result of L1 regularization; the SSIM loss function is derived from the SSIM algorithm, whose output value SSIM(x, y) represents the structural similarity between an input image x and a target domain image y; SSIM(x, y) ranges from -1 to 1, a value closer to 1 indicating higher similarity between the two images, and SSIM(x, y) equals 1 when the input image x and the target domain image y are identical;
the calculation formula of the output value of the SSIM algorithm is as follows:
$$\mathrm{SSIM}(x,y)=\frac{(2\mu_x\mu_y+c_1)(2\sigma_{xy}+c_2)}{(\mu_x^2+\mu_y^2+c_1)(\sigma_x^2+\sigma_y^2+c_2)}\qquad(1)$$

in formula (1), x is the Input image Input and y is the target domain image target corresponding to the Input image Input; $\mu_x$ is the mean of x, $\mu_y$ is the mean of y, $\sigma_x^2$ is the variance of x, $\sigma_y^2$ is the variance of y, and $\sigma_{xy}$ is the covariance of x and y; $c_1=(k_1 L)^2$ and $c_2=(k_2 L)^2$ are constants for maintaining stability, L is the dynamic range of the pixel values, $k_1=0.01$ and $k_2=0.03$;
F. Steps C to E constitute one round of weight training for the U-shaped generation network; repeating steps C to E, and after two rounds of weight training the training of the U-shaped generation network is complete, the resulting generation network being the network model of the U-shaped generation network;
2) normalizing the image to be converted and inputting it into the network model established in step 1) to complete the image domain conversion of the image to be converted: the normalized image is input as the Input image into the network model established in step 1), the encoding network extracts the high-dimensional features of the Input image Input, and the decoding network outputs the network generated image Output, which is the target domain image after image domain conversion.
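A minimal sketch of inference step 2) might look as follows (PyTorch; the scaling to [-1, 1] to match the tanh output layer and the function names are assumptions, not from the patent):

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def convert_image(generator, image):
    # image: float tensor of shape (3, H, W) with values in [0, 255].
    x = image.unsqueeze(0) / 127.5 - 1.0          # normalize to [-1, 1]
    x = F.interpolate(x, size=(256, 256), mode="bilinear", align_corners=False)
    y = generator(x)                              # encode features, decode Output
    return ((y.squeeze(0) + 1.0) * 127.5).clamp(0, 255)  # back to [0, 255]
```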
The overall loss function of the U-shaped generation network in the step E is as follows:
$$L_{GAN}(G,D_1,D_2)=L_{D_1}+\lambda_1 L_{D_2}+\lambda_2 L_{ssim}+\lambda_3 L_1\qquad(2)$$
the overall loss function to be optimized in the overall generative countermeasure network is:
$$G^{*}=\arg\min_{G}\max_{D_1}\max_{D_2}\left(L_{GAN}(G,D_1,D_2)+L_{D_1}+L_{D_2}\right)\qquad(3)$$
in formulas (2) and (3),

$$L_{D_1}=E_{y}\left[\log D_1(y)\right]+E_{x}\left[\log\left(1-D_1(G(x))\right)\right]$$

represents the true/false loss output by the true/false discrimination network,

$$L_{D_2}=E_{x,y}\left[\log D_2(x,y)\right]+E_{x}\left[\log\left(1-D_2(x,G(x))\right)\right]$$

represents the pairing loss output by the pairing discrimination network,

$$L_{ssim}=1-\mathrm{SSIM}(G(x),y)$$

represents the SSIM loss calculated by the SSIM loss function, and

$$L_1=E_{x,y}\left[\left\lVert y-G(x)\right\rVert_1\right]$$

represents the L1 regular-term loss, where x represents the Input image Input and y represents the target domain image target corresponding to the Input image Input; $\lambda_1$ is the weight parameter of the pairing loss in the overall loss of the generation network, $\lambda_2$ is the weight parameter of the SSIM loss in the overall loss of the U-shaped generation network, and $\lambda_3$ is the weight parameter of the L1 regular term in the overall loss of the U-shaped generation network;
in the initial training stage of the U-shaped generation network, the ratio of the true/false loss, the pairing loss, the SSIM loss and the L1 regular-term loss is 1:1:4:1; as the number of training iterations increases, the ratio gradually changes to 1:1:0.5:1, i.e. the weight parameter of the SSIM loss in the overall loss of the U-shaped generation network gradually decreases according to the set total number of training iterations.
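As a sketch of how this schedule could be realized (the linear decay from 4 to 0.5 is an assumption; the patent specifies only the start and end ratios):

```python
def generator_loss(l_d1, l_d2, l_ssim, l_l1, step, total_steps):
    # Overall loss of formula (2): L_D1 + lam1*L_D2 + lam2*L_ssim + lam3*L1.
    lam1, lam3 = 1.0, 1.0
    frac = min(step / total_steps, 1.0)
    lam2 = 4.0 + (0.5 - 4.0) * frac   # SSIM weight decays from 4 to 0.5
    return l_d1 + lam1 * l_d2 + lam2 * l_ssim + lam3 * l_l1
```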
The cross-entropy loss function in step C is a cross-entropy loss function with a smoothing term, whose formula is:

$$loss=-\frac{1}{N}\sum_{i=1}^{N}\left[y_i\log(t_i+\varepsilon)+(1-y_i)\log(1-t_i+\varepsilon)\right]\qquad(4)$$

in formula (4), N is the batch size, $t_i$ is the predicted value for sample i, $y_i$ is the true sample value, and $\varepsilon$ is the added smoothing term, chosen as 0.005.
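A minimal sketch of this loss (assuming, as formula (4) is read here, that the smoothing term ε is added inside each logarithm to keep it finite):

```python
import torch

def smoothed_cross_entropy(t, y, eps=0.005):
    # t: predicted probabilities from the final Sigmoid layer; y: 0/1 labels.
    return -(y * torch.log(t + eps)
             + (1.0 - y) * torch.log(1.0 - t + eps)).mean()
```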
The generation process of the network generated image Output comprises the following steps:
a) normalizing the image to be converted into a 256 × 256 × 3 pixel image and inputting the normalized image as the Input image Input into the encoding network; the Input image Input passes sequentially through the 8 convolutional layers of the encoding network, and the final output is a 1 × 1 × 1024 feature image; the convolution kernel of each convolutional layer in the encoding network is 3 × 3 with stride 2 × 2;
b) inputting the 1 × 1 × 1024 feature image generated in step a) into the decoding network and passing it sequentially through the 8 deconvolution layers of the decoding network, while also feeding the feature image produced by each convolutional layer in step a) into the deconvolution layer of matching data tensor size for computation, finally generating the complete network generated image Output; thus the input of a deconvolution layer contains not only the feature image from the previous deconvolution operation but also the convolution feature image of matching tensor size; the deconvolution kernel of each deconvolution layer is 3 × 3 with stride 2 × 2.
In step b), for the feature images fed into the first three deconvolution layers, a Dropout operation is added when the feature image produced by the corresponding convolutional layer in step a) is fed into the deconvolution layer of matching data tensor size; the Dropout parameter is 0.2, i.e. 20% of the connections between the two layers are randomly dropped.
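The skip connections and Dropout of steps a) and b) could be wired as in the following sketch (PyTorch; the module layout and the exact placement of Dropout on the three deepest skip connections are assumptions consistent with the text, and the channel widths of the decoder layers must account for the concatenated skips):

```python
import torch
import torch.nn as nn

class UGenerator(nn.Module):
    def __init__(self, enc_layers, dec_layers):
        super().__init__()
        self.enc = nn.ModuleList(enc_layers)  # the 8 convolutional layers
        self.dec = nn.ModuleList(dec_layers)  # the 8 deconvolution layers
        self.drop = nn.Dropout(0.2)           # randomly closes 20% of connections

    def forward(self, x):
        feats = []
        for e in self.enc:            # encode, keeping every feature map
            x = e(x)
            feats.append(x)
        x = self.dec[0](feats[-1])    # deepest feature (1x1x1024) starts decoding
        for i, d in enumerate(self.dec[1:], start=1):
            skip = feats[-1 - i]      # encoder feature map of matching size
            if i <= 3:                # Dropout on the three deepest skip inputs
                skip = self.drop(skip)
            x = d(torch.cat([x, skip], dim=1))  # deconv input: previous + skip
        return x
```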
The SSIM algorithm in step E4 is computed in the form of a sliding window implemented as a convolution kernel, with a window size of 7 × 7.
The invention has the following advantages:
(1) a network model of the U-shaped generation network is established through an adversarial generation network comprising the U-shaped generation network, the pairing discrimination network and the true/false discrimination network, and image domain conversion of local regions is realized through the established model; this fills the current gap of adversarial generation networks dedicated to local image conversion, extends the range of application of adversarial generation networks in the image domain conversion field, and improves the effect and reliability of image domain conversion;
(2) in training the network model of the generation network, a multi-adversarial scheme combining the pairing discrimination network and the true/false discrimination network is adopted; to address the weak discrimination ability of the discrimination networks at the initial stage of training, an SSIM loss function is added: the similarity of the images is calculated by the SSIM algorithm and the result is used as a loss to update the weights of the generation network, compensating for the low initial adversarial ability of the network, so that the generation network converges better and a better image domain conversion effect is obtained;
(3) the cross-entropy loss function used in training the multi-adversarial discrimination networks makes their training more stable: the conventional cross-entropy loss function contains a log operation, so the loss fluctuates strongly at the initial stage, and a loss of 0 may occur during training and cause the training to fail; adding a smoothing term reduces the fluctuation during training, prevents training failure, and improves the stability of deep adversarial network training;
(4) in the generation network, the input of each deconvolution layer contains not only the feature image from the previous deconvolution operation but also the convolution feature image of matching tensor size, so the information of the image is retained to the greatest extent and the feature information of the original image is preserved more completely, improving the effect and authenticity of the image domain conversion; moreover, a Dropout operation is added where the first three convolutional layers of the encoding network feed their feature images to the corresponding deconvolution layers, which effectively prevents the decoded images from becoming overly uniform and further improves the image domain conversion quality;
(5) by adopting the Leaky ReLU activation layer with its parameter set to 0.2, the generation network better retains the information of the original image domain and preserves as much residual information as possible during back-propagation, improving the integrity of the converted image and ensuring the conversion effect.
drawings
FIG. 1 is a diagram of a network architecture of the present invention;
FIG. 2 is a diagram of the U-shaped generation network of FIG. 1;
FIG. 3 is a network architecture diagram of the true and false authentication network of FIG. 1;
FIG. 4 is a network architecture diagram of the paired authentication network of FIG. 1;
FIG. 5 is a U-shaped generated network training diagram of FIG. 1;
FIG. 6 is a network training diagram of the true and false authentication network of FIG. 1;
fig. 7 is a network training diagram of the pair authentication network of fig. 1.
Detailed Description
For a better understanding of the present invention, the technical solutions of the present invention are further described below with reference to the accompanying drawings.
As shown in FIG. 1, the invention comprises a U-shaped generation network U-net, a true/false discrimination network D1-net and a pairing discrimination network D2-net, and further comprises a structural similarity calculation part, which consists of an SSIM loss function calculation part and an L1 regularization part; the U-shaped generation network U-net performs the image domain conversion, the true/false discrimination network D1-net is a discriminator for judging whether the network generated image Output is real, and the pairing discrimination network D2-net is a discriminator for judging whether the network generated image Output is paired with the original image;
as shown in fig. 2, the U-shaped generation network U-net includes a coding network F-net and a decoding network G-net, the coding network F-net performs convolution operation on the image to output a high-dimensional feature map thereof, and the decoding network G-net performs generation of the image by performing deconvolution on the feature map using a deconvolution network.
The Input end of the coding network F-net is connected with the Input image Input, the Output end of the coding network F-net is connected with the Input end of the decoding network G-net, and the Output end of the decoding network G-net generates a network generated image Output.
The encoding network F-net comprises an eight-layer convolutional network; the convolution kernel of each layer is 3 × 3 with stride 2 × 2, and each layer comprises a convolution layer, a Batch Normalization layer and a Leaky ReLU activation layer. Because the generation network needs to retain as much original image domain information as possible and to preserve as much residual information as possible during back-propagation, the invention selects the Leaky ReLU activation function for every layer of the encoding network F-net in the generation network, with the α parameter of the Leaky ReLU activation layer set to 0.2.
The decoding network G-net comprises an eight-layer deconvolution network; the deconvolution kernel of each layer is 3 × 3 with stride 2 × 2, and each layer comprises a deconvolution layer, a Batch Normalization layer and an activation layer; the activation layers of the first to seventh deconvolution layers are ReLU layers, and the activation layer of the eighth deconvolution layer is a tanh layer.
As shown in fig. 3, the true/false discrimination network D1-net discriminates whether the generated image is a real image; its input is therefore a single image and its output a true/false judgment. The true/false discrimination network D1-net comprises multiple sequentially cascaded true/false discrimination convolutional layers, each comprising a convolution layer, a Batch Normalization layer and an activation layer.
The ReLU activation function transfers residuals effectively while retaining nonlinear fitting ability; the output of the tanh activation function lies between -1 and 1, while the output of the Sigmoid activation function lies between 0 and 1, which is convenient for label computation. Therefore every layer of the true/false discrimination network D1-net except the last uses the ReLU function, and the last output layer uses the Sigmoid activation function; that is, the activation layer of the last true/false discrimination convolutional layer uses a Sigmoid activation function and the activation layers of the remaining layers use the ReLU function.
The convolution kernels of every true/false discrimination convolutional layer follow the small-kernel principle and are all 3 × 3; except for the convolution between the 32 × 32 × 256 feature map and the 30 × 30 × 1 output, which uses stride 1, every layer uses stride 2. To prevent the gradient diffusion phenomenon, a Batch Normalization layer is added to every layer; meanwhile, since the stride-2 convolutions already provide a pooling effect, no pooling layer is added to the network.
The pairing discrimination network D2-net discriminates whether the generated image is paired with the input image; its input is therefore two images. The pairing discrimination network D2-net comprises a Concat layer and multiple sequentially cascaded pairing discrimination convolutional layers, each comprising a convolution layer, a Batch Normalization layer and an activation layer; the activation layer of the last pairing discrimination convolutional layer uses a Sigmoid activation function, and the activation layers of the remaining layers use the ReLU function.
As shown in fig. 4, the pairing discrimination network D2-net is similar in structure to the true/false discrimination network D1-net, except that one more image is supplied at the input of the first layer, i.e. the input of the first layer is a 256 × 256 × 6 tensor; like D1-net, it uses the ReLU and Sigmoid activation functions, Batch Normalization layers, and the loss function with the smoothing term.
Real target domain images paired one-to-one with the Input images are set as target domain images target. The network generated image Output is fed, as a training negative sample of the true/false discrimination network D1-net, into its negative sample input end, and the target domain image target is fed, as a training positive sample, into its positive sample input end; the value output by the true/false discrimination network D1-net is fed back, as a true/false loss value, to the true/false loss input end of the decoding network G-net. The network generated image Output together with the corresponding Input image Input is fed, as a training negative sample, into the negative sample input end of the pairing discrimination network D2-net, and the target domain image target together with the corresponding Input image Input is fed, as a training positive sample, into its positive sample input end; the value output by the pairing discrimination network D2-net is fed back, as a pairing loss value, to the pairing loss input end of the decoding network G-net. The structural similarity value between the network generated image Output and the target domain image target is fed back, as a compensation loss value, to the compensation loss input end of the decoding network G-net, the calculation of the structural similarity value comprising the SSIM loss function and L1 regularization.
The invention also provides an image domain conversion method based on the generative adversarial network, comprising the following steps:
1) training the U-shaped generation network U-net, and establishing a network model of the U-shaped generation network U-net; the method specifically comprises the following steps:
A. collecting a training image set of the domain to be converted, the set comprising one-to-one paired original domain images and target domain images; normalizing the original domain images in the training image set into 256 × 256 × 3 pixel images, the normalized images being the Input images during network training, and the target domain images in the training image set being the target domain images corresponding to the Input images;
B. converting the Input image Input obtained in step A, through the U-shaped generation network U-net, into the network generated image Output of the training network;
C. training the multi-adversarial discrimination networks using the Input image Input, the target domain image target and the network generated image Output obtained in steps A and B: this comprises the training of the true/false discrimination network D1-net and the training of the pairing discrimination network D2-net;
as shown in fig. 6, the training of the true/false discrimination network D1-net comprises the following steps:
c11: initializing the network weights of the true/false discrimination network D1-net by a random initialization method;
c12: taking the network generated image Output as a negative sample and the target domain image target corresponding to the Input image Input as a positive sample, performing binary classification training in the true/false discrimination network D1-net, and updating its network weights using the cross-entropy loss function with a smoothing term and the Adam optimization algorithm;
the cross-entropy loss is large at the initial stage of training, and a value of 0 may occur during training and cause the training to fail; adding a smoothing term reduces the fluctuation during training and prevents training failure, and the improved cross-entropy function with the smoothing term increases the stability of deep adversarial network training;
in the binary classification training, one-hot labels are first generated for the positive and negative samples, the cross-entropy loss is then calculated with a sigmoid cross-entropy augmented by the smoothing term from the 0-to-1 values output by the final Sigmoid activation layer, and finally the weights of the true/false discrimination network D1-net are updated according to the fed-back loss;
wherein the formula of the cross-entropy loss function with the smoothing term is

$$loss=-\frac{1}{N}\sum_{i=1}^{N}\left[y_i\log(t_i+\varepsilon)+(1-y_i)\log(1-t_i+\varepsilon)\right]\qquad(1)$$

in formula (1), N is the batch size, $t_i$ is the predicted value for sample i, $y_i$ is the true sample value, and $\varepsilon$ is the added smoothing term, chosen as 0.005;
as shown in fig. 7, the training of the pairing discrimination network D2-net comprises the following steps:
c21: initializing the network weights of the pairing discrimination network D2-net by a random initialization method;
c22: taking the network generated image Output with the corresponding Input image Input as a negative sample and the Input image Input with the corresponding target domain image target as a positive sample, performing binary classification training in the pairing discrimination network D2-net, and updating its network weights using the cross-entropy loss function with a smoothing term and the Adam optimization algorithm;
D. repeating step C; after the multi-adversarial discrimination networks have been trained twice, fixing the network weights of the true/false discrimination network D1-net and the pairing discrimination network D2-net: to make the training of the U-shaped generation network U-net more stable, the strategy of training the true/false discrimination network D1-net and the pairing discrimination network D2-net several times before training the U-shaped generation network U-net is adopted;
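The alternating schedule of steps C and D might be coded as follows (PyTorch; the optimizer hyperparameters and label tensors are assumptions, smoothed_cross_entropy and generator_loss refer to the earlier sketches, and ssim_loss to the windowed-SSIM sketch given after the sliding-window discussion below):

```python
import itertools
import torch

opt_d = torch.optim.Adam(itertools.chain(d1.parameters(), d2.parameters()))
opt_g = torch.optim.Adam(g.parameters())

for step, (x, y) in enumerate(loader):            # x: Input image, y: target
    fake = g(x)
    real_lbl = torch.ones(x.size(0), 1, 30, 30)   # 30x30x1 patch labels
    fake_lbl = torch.zeros_like(real_lbl)
    for _ in range(2):                            # step D: train discriminators twice
        loss_d = (smoothed_cross_entropy(d1(y), real_lbl)
                  + smoothed_cross_entropy(d1(fake.detach()), fake_lbl)
                  + smoothed_cross_entropy(d2(x, y), real_lbl)
                  + smoothed_cross_entropy(d2(x, fake.detach()), fake_lbl))
        opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # step E: update the generator with the discriminator weights fixed
    loss_g = generator_loss(smoothed_cross_entropy(d1(fake), real_lbl),
                            smoothed_cross_entropy(d2(x, fake), real_lbl),
                            ssim_loss(fake, y),
                            (fake - y).abs().mean(),
                            step, total_steps)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```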
E. training the U-shaped generation network U-net using the multi-adversarial discrimination networks obtained after the training in step D; as shown in fig. 5, the training of the U-net comprises the following steps:
e1: initializing the network weights of the U-shaped generation network U-net by the Xavier random initialization method;
e2: inputting the network generated image Output into the true/false discrimination network D1-net, which outputs a true/false loss value; feeding the output true/false loss value back to the decoding network G-net in the U-shaped generation network U-net for updating the network weights. The true/false discrimination network D1-net outputs a 30 × 30 × 1 image used to return the loss value measuring how close the network generated image Output is to the real image; the value of each pixel of the output image ranges from 0 to 1, where a value closer to 1 indicates that, within the receptive field of that pixel, the network generated image Output is closer to the real image, and a value closer to 0 indicates that it is farther from the real image;
e3: inputting the Input image Input and the corresponding network generated image Output into the pairing discrimination network D2-net, which outputs a pairing loss value; feeding the output pairing loss value back to the decoding network G-net in the U-shaped generation network U-net for updating the network weights. The pairing discrimination network D2-net outputs a 30 × 30 × 1 image used to return the loss value measuring whether the pair (Input image Input, network generated image Output) matches like the pair (Input image Input, target domain image target); the value of each pixel of the output image ranges from 0 to 1, where a value closer to 1 indicates that the Input image Input and the network generated image Output are better matched, and a value closer to 0 indicates that they are not matched;
e4: calculating the structural similarity value between the network generated image Output and the target domain image target, and feeding the calculated structural similarity value back as a loss to the decoding network G-net in the U-shaped generation network U-net for updating the network weights. The calculation of the structural similarity value comprises the SSIM loss function and L1 regularization; the SSIM loss function is derived from the SSIM algorithm, an index for measuring the similarity of two images, whose output value SSIM(x, y) represents the structural similarity between an input image x and a target domain image y; SSIM(x, y) ranges from -1 to 1, a value closer to 1 indicating higher similarity, and SSIM(x, y) equals 1 when the input image x and the target domain image y are identical. Using SSIM as a loss function lets the generation network converge better, thereby obtaining a better image domain conversion effect;
the calculation formula of the output value of the SSIM algorithm is as follows:
$$\mathrm{SSIM}(x,y)=\frac{(2\mu_x\mu_y+c_1)(2\sigma_{xy}+c_2)}{(\mu_x^2+\mu_y^2+c_1)(\sigma_x^2+\sigma_y^2+c_2)}\qquad(2)$$

in formula (2), x is the Input image Input and y is the target domain image target corresponding to the Input image Input; $\mu_x$ is the mean of x, $\mu_y$ is the mean of y, $\sigma_x^2$ is the variance of x, $\sigma_y^2$ is the variance of y, and $\sigma_{xy}$ is the covariance of x and y; $c_1=(k_1 L)^2$ and $c_2=(k_2 L)^2$ are constants for maintaining stability, L is the dynamic range of the pixel values, $k_1=0.01$ and $k_2=0.03$;
F. Steps C to E constitute one round of weight training for the U-shaped generation network U-net; repeating steps C to E, and after two rounds of weight training the training of the U-shaped generation network U-net is complete, the resulting generation network being the network model of the U-shaped generation network U-net;
However, computing the SSIM value of two images requires converting the computation into sliding-window form, and different window sizes and different parameters give different results. The SSIM algorithm originally proposed by Wang et al. uses an 11 × 11 sliding window; however, since the image generated by the generation network at the initial stage differs greatly from the target image, the SSIM value computed with a large window is very close to 0 at the initial stage, so the loss cannot be effectively propagated back to the generation network and the training of the adversarial generation network GAN fails. In view of this problem, and considering that the network input is a 256 × 256 pixel photograph, the SSIM computation in the invention finally takes the form of a sliding window implemented as a convolution kernel, with a window size of 7 × 7.
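A sketch of the windowed SSIM computed with a 7 × 7 convolution kernel (PyTorch; a uniform window is assumed here, whereas Wang et al.'s original formulation uses a Gaussian window, and L = 2 assumes images scaled to [-1, 1]):

```python
import torch
import torch.nn.functional as F

def ssim_map(x, y, win=7, L=2.0, k1=0.01, k2=0.03):
    # Per-window SSIM of formula (2), computed by sliding a win x win kernel.
    c1, c2 = (k1 * L) ** 2, (k2 * L) ** 2
    ch = x.size(1)
    w = torch.ones(ch, 1, win, win, device=x.device) / win ** 2
    mu_x = F.conv2d(x, w, groups=ch)                    # windowed means
    mu_y = F.conv2d(y, w, groups=ch)
    var_x = F.conv2d(x * x, w, groups=ch) - mu_x ** 2   # windowed variances
    var_y = F.conv2d(y * y, w, groups=ch) - mu_y ** 2
    cov = F.conv2d(x * y, w, groups=ch) - mu_x * mu_y   # windowed covariance
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)
            / ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)))

def ssim_loss(x, y):
    # Loss fed back to the generation network: 1 - mean SSIM over all windows.
    return 1.0 - ssim_map(x, y).mean()
```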
In the step E of training the U-shaped generation network U-net, the overall loss function of the U-shaped generation network U-net is:
$$L_{GAN}(G,D_1,D_2)=L_{D_1}+\lambda_1 L_{D_2}+\lambda_2 L_{ssim}+\lambda_3 L_1\qquad(3)$$
the overall loss function to be optimized in the overall generative countermeasure network is:
$$G^{*}=\arg\min_{G}\max_{D_1}\max_{D_2}\left(L_{GAN}(G,D_1,D_2)+L_{D_1}+L_{D_2}\right)\qquad(4)$$
in formulas (3) and (4),

$$L_{D_1}=E_{y}\left[\log D_1(y)\right]+E_{x}\left[\log\left(1-D_1(G(x))\right)\right]$$

represents the true/false loss output by the true/false discrimination network D1-net,

$$L_{D_2}=E_{x,y}\left[\log D_2(x,y)\right]+E_{x}\left[\log\left(1-D_2(x,G(x))\right)\right]$$

represents the pairing loss output by the pairing discrimination network D2-net,

$$L_{ssim}=1-\mathrm{SSIM}(G(x),y)$$

represents the SSIM loss calculated by the SSIM loss function, and

$$L_1=E_{x,y}\left[\left\lVert y-G(x)\right\rVert_1\right]$$

represents the L1 regular-term loss, where x represents the Input image Input and y represents the target domain image target corresponding to the Input image Input; $\lambda_1$ is the weight parameter of the pairing loss in the overall loss fed back to the decoding network G-net, $\lambda_2$ is the weight parameter of the SSIM loss, and $\lambda_3$ is the weight parameter of the L1 regular term.
In the initial training stage of the U-shaped generation network U-net, the ratio of the true/false loss, the pairing loss, the SSIM loss and the L1 regular-term loss is 1:1:4:1; as the number of training iterations increases, the ratio gradually changes to 1:1:0.5:1, i.e. the weight parameter of the SSIM loss in the overall loss of the U-shaped generation network U-net gradually decreases according to the set total number of training iterations.
At the initial stage of network training, when the discrimination ability of the true/false discrimination network D1-net and the pairing discrimination network D2-net is still low, the SSIM loss function can feed residuals back to the U-shaped generation network U-net so that target domain images are still generated effectively; as the discrimination ability of D1-net and D2-net keeps improving during training, the weight of the SSIM loss function in the fed-back generator residual decreases, so that the larger part of the generator residual comes from the losses fed back by D1-net and D2-net. The effect is better than that of existing image domain conversion methods, and the generated image is more realistic.
2) Normalizing the image to be converted to 256 × 256 pixels and inputting the normalized image into the network model established in step 1) completes the image domain conversion of the image to be converted: the normalized image is input as the Input image into the network model established in step 1), the encoding network F-net extracts the high-dimensional features of the Input image Input, and the decoding network G-net outputs the network generated image Output, which is the target domain image after image domain conversion.
In the image domain conversion process of the U-shaped generation network, the image must first be input into the encoding network F-net for convolution operations, followed by deconvolution operations to realize the conversion of the image domain. However, in a conventional U-shaped generation network part of the information of the original image is hard to retain through the convolution process; therefore, in order to preserve the feature information of the original image better and more completely, the generation process of the network generated image Output of the invention comprises the following steps:
a) normalizing the image to be converted into a 256 × 256 × 3 pixel image and inputting the normalized image as the Input image Input into the encoding network F-net; the Input image Input passes sequentially through the 8 convolutional layers of the encoding network F-net, and the final output is a 1 × 1 × 1024 feature image; the convolution kernel of each convolutional layer in the encoding network F-net is 3 × 3 with stride 2 × 2;
b) inputting the 1 × 1 × 1024 feature image generated in step a) into the decoding network G-net and passing it sequentially through the 8 deconvolution layers of the decoding network G-net, while also feeding the feature image produced by each convolutional layer in step a) into the deconvolution layer of matching data tensor size for computation, finally generating the complete network generated image Output; thus the input of a deconvolution layer contains not only the feature image from the previous deconvolution operation but also the convolution feature image of matching tensor size; the deconvolution kernel of each deconvolution layer is 3 × 3 with stride 2 × 2.
For the feature images fed into the first three deconvolution layers, a Dropout operation is added when the feature image produced by the corresponding convolutional layer in step a) is fed into the deconvolution layer of matching data tensor size; the Dropout parameter is 0.2, i.e. 20% of the connections between the two layers are randomly dropped.
Because the input of a deconvolution layer contains not only the feature image from the previous deconvolution operation but also the convolution feature image of matching tensor size, the information of the image is retained to the greatest extent and the feature information of the original image is preserved better and more completely, improving the effect and authenticity of the image domain conversion; and to prevent the images produced by the decoding network G-net from becoming overly uniform, a Dropout operation is added where the first three convolutional layers of the encoding network F-net feed their feature images to the corresponding deconvolution layers, which effectively prevents the decoded images from becoming overly uniform and further improves the image domain conversion quality.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those skilled in the art that various changes, modifications and substitutions can be made therein without departing from the spirit and scope of the embodiments of the present invention.

Claims (10)

1. An image domain conversion network system based on a generative adversarial network, characterized in that: it comprises a U-shaped generation network, a true/false discrimination network and a pairing discrimination network, wherein the U-shaped generation network comprises an encoding network and a decoding network, the input end of the encoding network receives the Input image Input, the output end of the encoding network is connected to the input end of the decoding network, and the output end of the decoding network outputs the network generated image Output; real target domain images paired one-to-one with the Input images are set as target domain images target; the network generated image Output is input, as a training negative sample of the true/false discrimination network, into the negative sample input end of the true/false discrimination network, the target domain image target is input, as a training positive sample of the true/false discrimination network, into the positive sample input end of the true/false discrimination network, and the value output by the true/false discrimination network is fed back, as a true/false loss value, to the true/false loss input end of the decoding network; the network generated image Output and the corresponding Input image Input are input, as training negative samples, into the negative sample input end of the pairing discrimination network, the target domain image target and the corresponding Input image Input are input, as training positive samples, into the positive sample input end of the pairing discrimination network, and the value output by the pairing discrimination network is fed back, as a pairing loss value, to the pairing loss input end of the decoding network; and the structural similarity value between the network generated image Output and the target domain image target is fed back, as a compensation loss value, to the compensation loss input end of the decoding network.
2. The image domain conversion network system based on the generative adversarial network as claimed in claim 1, characterized in that: the encoding network comprises an eight-layer convolutional network, each layer having a 3 × 3 convolution kernel with stride 2 × 2 and comprising a convolution layer, a Batch Normalization layer and a Leaky ReLU activation layer, the α parameter of the Leaky ReLU activation layer being 0.2; the decoding network comprises an eight-layer deconvolution network, each layer having a 3 × 3 deconvolution kernel with stride 2 × 2 and comprising a deconvolution layer, a Batch Normalization layer and an activation layer, the activation layers of the first to seventh deconvolution layers being ReLU layers and the activation layer of the eighth deconvolution layer being a tanh layer.
3. The image domain conversion network system based on the generative adversarial network as claimed in claim 2, characterized in that: the true/false discrimination network comprises multiple sequentially cascaded true/false discrimination convolutional layers, each comprising a convolution layer, a Batch Normalization layer and an activation layer; the activation layer of the last true/false discrimination convolutional layer uses a Sigmoid activation function, and the activation layers of the remaining true/false discrimination convolutional layers use the ReLU function.
4. The image domain conversion network system based on the generative adversarial network as claimed in claim 3, characterized in that: the pairing discrimination network comprises a Concat layer and multiple sequentially cascaded pairing discrimination convolutional layers, each comprising a convolution layer, a Batch Normalization layer and an activation layer; the activation layer of the last pairing discrimination convolutional layer uses a Sigmoid activation function, and the activation layers of the remaining pairing discrimination convolutional layers use the ReLU function.
5. An image domain conversion method using the image domain conversion network system based on the generative adversarial network as claimed in claim 4, characterized in that the method comprises the following steps:
1) training the U-shaped generation network, and establishing a network model of the U-shaped generation network; the method specifically comprises the following steps:
A. collecting a training image set of a domain to be converted, wherein the training image set comprises original domain images and target domain images which are matched one by one, normalizing the original domain images in the training image set, the normalized images are Input images during network training, and the target domain images in the training image set are target domain images corresponding to the Input images;
B. passing the Input image Input obtained in step A through the U-shaped generation network to obtain the network-generated image Output;
C. training the multi-adversarial discrimination networks with the Input image Input, the target domain image target and the network-generated image Output obtained in steps A and B (one full round of steps C to E is sketched in code after this claim): this comprises training the true/false discrimination network and training the pairing discrimination network, where training the true/false discrimination network comprises the following steps:
c11: initializing the network weights of the true/false discrimination network by random initialization;
c12: taking the network-generated image Output as the negative sample and the target domain image target corresponding to the Input image Input as the positive sample, training the true/false discrimination network, and updating its network weights with the cross-entropy loss function and the Adam optimization algorithm;
training the pairing discrimination network comprises the following steps:
c21: initializing the network weights of the pairing discrimination network by random initialization;
c22: taking the network-generated image Output with its corresponding Input image Input as the negative sample and the Input image Input with its corresponding target domain image target as the positive sample, training the pairing discrimination network, and updating its network weights with the cross-entropy loss function and the Adam optimization algorithm;
D. repeating step C; after two rounds of training the multi-adversarial discrimination networks, fixing the network weights of the true/false discrimination network and the pairing discrimination network;
E. training the U-shaped generation network against the multi-adversarial discrimination networks obtained after the training of step D, specifically:
e1: initializing the network weights of the U-shaped generation network with the Xavier random initialization method;
e2: inputting the network-generated image Output into the true/false discrimination network, which outputs a true/false loss value that is fed back to the decoding network of the U-shaped generation network to update its weights: the true/false discrimination network outputs a 30 × 30 × 1 image used to return the loss measuring how close the network-generated image Output is to the real image; each pixel value of this output ranges from 0 to 1, where a value closer to 1 means the image input to the discrimination network resembles the real image within that pixel's receptive field, and a value closer to 0 means it does not;
e3: inputting the Input image Input and the corresponding network-generated image Output into the pairing discrimination network, which outputs a pairing loss value that is fed back to the decoding network of the U-shaped generation network to update its weights: the pairing discrimination network outputs a 30 × 30 × 1 image used to return the loss measuring whether the Input image Input and the network-generated image Output form a pair like the Input image Input and the target domain image target; each pixel value of this output ranges from 0 to 1, where a value closer to 1 means the Input image Input and the network-generated image Output are matched, and a value closer to 0 means they are not;
e4: calculating the structural similarity value between the network-generated image Output and the target domain image target and feeding it back as a loss to the decoding network of the U-shaped generation network to update its weights; the structural similarity value comprises the result of the SSIM loss function and the result of the L1 regularization term; the SSIM loss function derives from the SSIM algorithm, whose output SSIM(x, y) expresses the structural similarity between an input image x and a target domain image y; SSIM(x, y) ranges from −1 to 1, values closer to 1 indicate more similar images, and SSIM(x, y) equals 1 when x and y are identical (a sliding-window implementation is sketched after claim 10);
the output value of the SSIM algorithm is calculated as:

SSIM(x, y) = ((2 μ_x μ_y + c_1)(2 σ_xy + c_2)) / ((μ_x² + μ_y² + c_1)(σ_x² + σ_y² + c_2))   (1)

in formula (1), x is the Input image Input, y is the target domain image target corresponding to the Input image Input, μ_x is the mean of x, μ_y is the mean of y, σ_x² is the variance of x, σ_y² is the variance of y, σ_xy is the covariance of x and y, c_1 = (k_1 L)² and c_2 = (k_2 L)² are constants that maintain numerical stability, L is the dynamic range of the pixel values, k_1 = 0.01, and k_2 = 0.03;
F. steps C to E constitute one round of weight training of the U-shaped generation network; repeating steps C to E, and after two rounds of weight training the U-shaped generation network is fully trained; the resulting generation network is the network model of the U-shaped generation network;
2) after normalization, inputting the image to be converted into the network model established in step 1) to complete its image domain conversion: the normalized image is input as the Input image into the network model established in step 1); the encoding network extracts high-dimensional features of the Input image Input, and the decoding network outputs the network-generated image Output, which is the target domain image after image domain conversion.
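By way of illustration, one full round of steps C to E might be sketched as follows, assuming PyTorch, the generator and discriminator sketches given under claims 2, 4 and 9, Adam optimizers created by the caller, plain binary cross-entropy in place of the smoothed variant of claim 7, the `ssim` helper sketched under claim 10, and the initial 1:1:4:1 loss ratio of claim 6:

```python
import torch
import torch.nn.functional as F

def train_round(G, D_tf, D_pair, opt_g, opt_tf, opt_pair, x, y,
                lam=(1.0, 4.0, 1.0)):   # (lambda1, lambda2, lambda3) of claim 6
    """One round of steps C-E; x = Input batch, y = matched target batch."""
    # Step C: train both discriminators, Output as negative, target as positive.
    with torch.no_grad():
        fake = G(x)                                   # network-generated image Output
    for D, opt, cond in ((D_tf, opt_tf, None), (D_pair, opt_pair, x)):
        p_fake, p_real = D(fake, cond), D(y, cond)
        loss_d = (F.binary_cross_entropy(p_fake, torch.zeros_like(p_fake))
                  + F.binary_cross_entropy(p_real, torch.ones_like(p_real)))
        opt.zero_grad(); loss_d.backward(); opt.step()

    # Steps D-E: the discriminator weights are now held fixed; the generator is
    # updated from the true/false loss (e2), pairing loss (e3) and SSIM + L1 (e4).
    fake = G(x)
    p_tf, p_pair = D_tf(fake), D_pair(fake, x)
    loss_g = (F.binary_cross_entropy(p_tf, torch.ones_like(p_tf))
              + lam[0] * F.binary_cross_entropy(p_pair, torch.ones_like(p_pair))
              + lam[1] * (1 - ssim((fake + 1) / 2, (y + 1) / 2))  # tanh range -> [0, 1]
              + lam[2] * F.l1_loss(fake, y))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```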
6. The image domain conversion method of the image domain conversion network system based on the generative adversarial network as claimed in claim 5, wherein: the overall loss function of the U-shaped generation network in step E is:
L_GAN(G, D_1, D_2) = L_D1 + λ_1 L_D2 + λ_2 L_ssim + λ_3 L_1   (2)

the overall loss function to be optimized in the whole generative adversarial network is:

G* = arg min_G max_D1 max_D2 ( L_GAN(G, D_1, D_2) + L_D1 + L_D2 )   (3)

in formulas (2) and (3),

L_D1 = E_y[log D_1(y)] + E_x[log(1 − D_1(G(x)))]

represents the true/false loss output by the true/false discrimination network,

L_D2 = E_(x,y)[log D_2(x, y)] + E_x[log(1 − D_2(x, G(x)))]

represents the pairing loss output by the pairing discrimination network,

L_ssim = 1 − SSIM(G(x), y)

represents the loss calculated by the SSIM loss function, and

L_1 = E_(x,y)[ ‖y − G(x)‖_1 ]

represents the L1 regular-term loss; x denotes the Input image Input, y denotes the target domain image target corresponding to the Input image Input, λ_1 is the weight parameter of the pairing loss in the overall loss of the U-shaped generation network, λ_2 is the weight parameter of the SSIM loss in the overall loss of the U-shaped generation network, and λ_3 is the weight parameter of the L1 regular term in the overall loss of the U-shaped generation network;
in the initial stage of training the U-shaped generation network, the ratio of the true/false loss, pairing loss, SSIM loss and L1 regular-term loss is 1:1:4:1; as the number of training iterations increases, the ratio gradually becomes 1:1:0.5:1, i.e. the weight parameter λ_2 of the SSIM loss in the overall loss of the U-shaped generation network is gradually reduced according to the set overall number of training iterations.
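A sketch of this weighting, assuming a linear decay of the SSIM weight λ_2 from 4 to 0.5 (the claim requires only a gradual reduction over the set number of training iterations, not a particular schedule):

```python
def lambda2(step, total_steps, start=4.0, end=0.5):
    # SSIM weight in formula (2): 4 at the start of training, 0.5 at the end.
    t = min(step / float(total_steps), 1.0)
    return start + (end - start) * t

def generator_loss(l_d1, l_d2, l_ssim, l_l1, step, total_steps, lam1=1.0, lam3=1.0):
    # Formula (2): L_GAN = L_D1 + lam1*L_D2 + lam2*L_ssim + lam3*L_1, with the
    # true/false : pairing : SSIM : L1 ratio moving from 1:1:4:1 toward 1:1:0.5:1.
    return l_d1 + lam1 * l_d2 + lambda2(step, total_steps) * l_ssim + lam3 * l_l1
```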
7. The image domain conversion method of the image domain conversion network system based on the generative adversarial network as claimed in claim 5, wherein: the cross-entropy loss function in step C is a cross-entropy loss function with a smoothing term;
the cross-entropy loss function with the smoothing term is:

loss = −(1/i) Σ_(n=1..i) [ y_n log(t_n + EPS) + (1 − y_n) log(1 − t_n + EPS) ]   (4)

in formula (4), i is the batch size, t_n is the predicted sample value, y_n is the true sample value, and EPS is the added smoothing term, whose value is chosen as 0.005.
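Formula (4) translates directly into a few lines of PyTorch; the tensor layout is an assumption, with t holding predictions in (0, 1) and y the 0/1 labels:

```python
import torch

def smoothed_cross_entropy(t, y, eps=0.005):
    # Formula (4): EPS keeps log() finite when a prediction saturates at 0 or 1.
    return -(y * torch.log(t + eps) + (1 - y) * torch.log(1 - t + eps)).mean()
```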
8. The image domain conversion method of the image domain conversion network system based on the generative adversarial network as claimed in claim 5, wherein: the generation process of the network-generated image Output comprises the following steps:
a) the image to be converted is normalized to 256 × 256 × 3 pixels; the normalized image is input as the Input image Input into the encoding network and passes sequentially through its eight convolutional layers, the final output being a 1 × 1 × 1024 feature image; each convolutional layer of the encoding network has a 3 × 3 kernel and a 2 × 2 stride;
b) the 1 × 1 × 1024 feature image generated in step a) is input into the decoding network and passes sequentially through its eight deconvolution layers; at the same time, the feature image produced by each convolutional layer in step a) is fed into the deconvolution layer whose data tensor has the same size, so each deconvolution layer receives both the feature image from the previous deconvolution operation and the convolution feature image of matching tensor size; this finally produces the complete network-generated image Output; each deconvolution layer has a 3 × 3 kernel and a 2 × 2 stride.
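An inference-time sketch of steps a) and b), assuming a trained generator G (for example the UGenerator sketched under claim 9 below), PIL-based image loading, and normalization to [−1, 1] to match the tanh output layer:

```python
import numpy as np
import torch
from PIL import Image

# Step a): normalize the image to be converted to 256 x 256 x 3 pixels.
img = np.asarray(Image.open("input.png").convert("RGB").resize((256, 256)))
x = torch.from_numpy(img.copy()).permute(2, 0, 1).float().unsqueeze(0) / 127.5 - 1.0

# Step b): the encoder contracts the Input to the 1 x 1 x 1024 feature image;
# the decoder, fed the size-matched skip features, emits the Output image.
G.eval()
with torch.no_grad():
    out = G(x)                                    # network-generated image Output
out_img = ((out[0].permute(1, 2, 0) + 1) * 127.5).clamp(0, 255).byte().numpy()
Image.fromarray(out_img).save("converted.png")
```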
9. The image domain conversion method of the image domain conversion network system based on the generative adversarial network as claimed in claim 8, wherein: in step b), a Dropout operation is added to the feature images input into the first three deconvolution layers, i.e. where the feature image produced by a convolutional layer in step a) is fed into the deconvolution layer of matching tensor size; the Dropout rate is 0.2, meaning 20% of the connections between the two layers are randomly dropped.
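Claims 8 and 9 together describe a U-Net-style generator; the following hedged sketch uses the enc_block/dec_block helpers from claim 2, with channel widths as illustrative assumptions (a batch size greater than 1 is assumed during training, since Batch Normalization is applied to the 1 × 1 bottleneck):

```python
import torch
import torch.nn as nn

class UGenerator(nn.Module):
    """U-shaped generator of claims 8-9, built from the enc_block/dec_block
    sketches under claim 2. The claims fix the layer counts, kernel/stride
    sizes, skip connections and Dropout rate; widths are illustrative."""

    def __init__(self, widths=(64, 128, 256, 512, 512, 512, 512, 1024)):
        super().__init__()
        chans = (3,) + widths
        self.enc = nn.ModuleList(enc_block(chans[i], chans[i + 1]) for i in range(8))
        dec_in = [widths[-1]] + [2 * w for w in reversed(widths[:-1])]
        dec_out = list(reversed(widths[:-1])) + [3]
        self.dec = nn.ModuleList(
            dec_block(dec_in[i], dec_out[i], last=(i == 7)) for i in range(8))
        self.drop = nn.Dropout(0.2)   # claim 9: 20% of connections closed at random

    def forward(self, x):
        skips = []
        for e in self.enc:                 # 256x256x3 -> ... -> 1x1x1024
            x = e(x)
            skips.append(x)
        for i, d in enumerate(self.dec):
            s = skips[7 - i]               # encoder feature of matching tensor size
            if i < 3:
                s = self.drop(s)           # Dropout on the first three deconv inputs
            x = d(s if i == 0 else torch.cat([s, x], dim=1))
        return x
```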
10. The image domain conversion method of the image domain conversion network system based on the generative adversarial network as claimed in claim 5, wherein: the SSIM algorithm in step e4 is computed with a convolution-kernel sliding window of size 7 × 7.
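A sketch of the SSIM computation of formula (1) evaluated, as this claim specifies, with a 7 × 7 sliding window implemented as a depthwise convolution; the uniform window weighting and NCHW tensor layout are assumptions:

```python
import torch
import torch.nn.functional as F

def ssim(x, y, window=7, L=1.0, k1=0.01, k2=0.03):
    # Formula (1), computed per 7x7 window position and averaged over all
    # positions; x and y are (N, C, H, W) tensors with values in [0, L].
    c1, c2 = (k1 * L) ** 2, (k2 * L) ** 2
    ch = x.shape[1]
    w = torch.ones(ch, 1, window, window, device=x.device) / window ** 2
    mu_x, mu_y = F.conv2d(x, w, groups=ch), F.conv2d(y, w, groups=ch)
    var_x = F.conv2d(x * x, w, groups=ch) - mu_x ** 2
    var_y = F.conv2d(y * y, w, groups=ch) - mu_y ** 2
    cov = F.conv2d(x * y, w, groups=ch) - mu_x * mu_y
    num = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return (num / den).mean()
```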
CN201711273921.5A 2017-12-06 2017-12-06 Image domain conversion network and conversion method based on generative countermeasure network Expired - Fee Related CN108171320B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711273921.5A CN108171320B (en) 2017-12-06 2017-12-06 Image domain conversion network and conversion method based on generative countermeasure network

Publications (2)

Publication Number Publication Date
CN108171320A (en) 2018-06-15
CN108171320B (en) 2021-10-19

Family

ID=62525151

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20211019