CN111476294A - Zero-shot image recognition method and system based on a generative adversarial network - Google Patents

Zero-shot image recognition method and system based on a generative adversarial network

Info

Publication number
CN111476294A
Authority
CN
China
Prior art keywords
semantic
visual
discriminator
features
loss function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010263452.4A
Other languages
Chinese (zh)
Other versions
CN111476294B (en)
Inventor
张桂梅
龙邦耀
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanchang Hangkong University
Original Assignee
Nanchang Hangkong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanchang Hangkong University
Priority to CN202010263452.4A
Publication of CN111476294A
Application granted
Publication of CN111476294B
Legal status: Expired - Fee Related
Anticipated expiration

Classifications

    • G06F18/2415 — Pattern recognition; classification techniques relating to the classification model based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06N3/045 — Computing arrangements based on biological models; neural network architectures: combinations of networks
    • G06N3/084 — Computing arrangements based on biological models; neural network learning methods: backpropagation, e.g. using gradient descent


Abstract

The invention discloses a zero-shot image recognition method and system based on a generative adversarial network. The method comprises the following steps: acquiring training image samples with annotation information and test image samples without annotation information; constructing a generative adversarial network model comprising a semantic feature generator, a visual feature generator, a semantic discriminator and a visual discriminator; constructing a multi-objective loss function comprising a cycle consistency loss function, an adversarial loss function of the semantic discriminator, an adversarial loss function of the visual discriminator and a classification loss function of the semantic discriminator; taking the training image samples as the input of the generative adversarial network model and iteratively training the model based on the multi-objective loss function to obtain a trained generative adversarial network model; and inputting the test image samples into the trained model to obtain a recognition result. The invention can recognize sketches without annotation information and achieves high zero-shot recognition accuracy.

Description

Zero-shot image recognition method and system based on a generative adversarial network
Technical Field
The invention relates to the field of weakly/semi-supervised image recognition, and in particular to a zero-shot image recognition method and system based on a generative adversarial network.
Background
The concept of Zero-Shot Learning (ZSL) was first proposed in 2008 by H. Larochelle et al. It is mainly used to solve the problem of how to correctly classify and recognize unknown new objects when the labeled training samples do not cover all object classes.
Zero-shot recognition uses only labeled samples of known classes to predict unknown classes. The main idea is to introduce class semantic information as an intermediate-layer feature that links visual features with semantic features. At the feature level, the key problems of zero-shot recognition are therefore: 1) finding visual features that can fully express the visual information of an image and semantic features that can fully represent its semantic information; 2) relating the visual features to the class semantic information.
For key problem 1), finding visual features that can sufficiently express the visual information of an image is one of the challenges of zero-shot recognition. With the rise of deep learning, researchers extract discriminative image features using deep convolutional neural networks. Zero-shot image recognition requires not only the visual features of an image but also semantic features that can represent the semantics of image classes, so as to link known classes to unknown classes. The most widely used semantic features are currently attribute features and text features. Because attribute features are annotated manually, their accuracy is poor. In recent years, with the development of natural language processing techniques, research using text description features instead of attribute features has received much attention. Text description features can be extracted directly from a corpus, and each class corresponds to a vector in the text description space. Compared with attribute features, text description features can obtain text vectors of arbitrary words from an unlabeled text corpus through natural language processing, and therefore scale better. A commonly used text vector extraction method is Word2Vec.
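As an illustration, a class text vector can be obtained with Word2Vec roughly as follows; this is a minimal sketch assuming the gensim library, with a two-sentence toy corpus standing in for a real unlabeled corpus such as Wikipedia:

```python
# Minimal Word2Vec sketch: each class name maps to one vector in the
# text description space. The toy corpus is illustrative only.
from gensim.models import Word2Vec

corpus = [
    ["the", "zebra", "has", "black", "and", "white", "stripes"],
    ["the", "horse", "is", "a", "large", "hoofed", "mammal"],
]

# vector_size is the dimensionality of the text description space.
model = Word2Vec(sentences=corpus, vector_size=100, window=5,
                 min_count=1, workers=1, seed=0)

zebra_vec = model.wv["zebra"]  # text vector for the class "zebra"
print(zebra_vec.shape)         # (100,)
```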
Existing semantic feature spaces can be divided into three categories: (1) attribute-based semantic feature spaces; (2) text-based semantic feature spaces; (3) common semantic feature spaces. After a semantic feature space is selected, establishing the mapping relationship between visual features and semantic features is the second key problem of zero-shot recognition.
For key problem 2), after the semantic features of known and unknown classes are extracted in a given semantic space, the semantic correlation between classes can be obtained from the similarity between their semantic features. However, sample images are represented by visual features in the visual space, and because of the semantic gap they cannot be linked directly to semantic features in the semantic space. Most existing methods learn a mapping function from the visual space to the semantic space using the visual features of known-class images and the semantic features of their labels. The visual features of a test image are then mapped into the semantic space through this function to obtain predicted semantic features. Finally, the nearest unknown-class semantic features are found to determine the class to which the test image belongs.
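This nearest-neighbour decision can be sketched as follows; the mapping, feature dimensions and class vectors here are illustrative stand-ins, not the ones used by the invention:

```python
# Predict the class of a test image by mapping its visual features
# into the semantic space and taking the most similar class vector.
import numpy as np

def predict_class(visual_feat, mapping, class_semantics):
    """visual_feat: (d_v,) visual feature of one test image.
    mapping: callable d_v -> d_s (the learned visual-to-semantic map).
    class_semantics: dict class_name -> (d_s,) semantic vector."""
    pred = mapping(visual_feat)  # predicted semantic feature
    def cos(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))
    # choose the unseen class whose semantic vector is most similar
    return max(class_semantics, key=lambda c: cos(pred, class_semantics[c]))

rng = np.random.default_rng(0)
W = rng.normal(size=(300, 2048))  # stand-in for a learned linear map
semantics = {"zebra": rng.normal(size=300), "horse": rng.normal(size=300)}
x = rng.normal(size=2048)
print(predict_class(x, lambda v: W @ v, semantics))
```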
In zero-shot image recognition, since the known and unknown classes are disjoint, directly applying a model learned from the training set to the test set causes a large deviation between the mapping of test samples in the semantic space and the true class semantics; this is called domain shift. Recently, many methods have been proposed to solve the domain shift problem in zero-shot learning, such as data augmentation, self-training, and pivot correction.
Zero-shot recognition has received wide attention from scholars in recent years, and related algorithms have begun to be applied in practice. Previous zero-shot learning methods mainly recognize targets under the conventional zero-shot setting, i.e., the test images are limited to the target (unseen) classes, whereas in a real scenario the test images may come not only from the target classes but also from the source (seen) classes. In this case, data from both the source and target classes should be taken into account, so the generalized zero-shot setting has been introduced in recent years. However, recognition accuracy under generalized zero-shot learning is much lower than under conventional zero-shot learning. Conventional generalized zero-shot recognition methods therefore suffer from low recognition accuracy.
Disclosure of Invention
Based on the above, there is a need for a zero-shot image recognition method and system based on a generative adversarial network that can recognize test images from both the target classes and the source classes with high accuracy.
In order to achieve the above purpose, the invention provides the following scheme:
A zero-shot image recognition method based on a generative adversarial network comprises the following steps:
acquiring training image samples and test image samples; the training image samples are sample images with annotation information, and the test image samples are sample images without annotation information;
constructing a generative adversarial network model; the generative adversarial network model comprises a semantic feature generator, a visual feature generator, a semantic discriminator and a visual discriminator; the semantic feature generator is used for generating pseudo-semantic features from real visual features; the visual feature generator is used for generating pseudo-visual features from the pseudo-semantic features; the semantic discriminator is used for discriminating between real semantic features and pseudo-semantic features; the visual discriminator is used for discriminating between real visual features and pseudo-visual features;
constructing a multi-objective loss function; the multi-objective loss function comprises a cycle consistency loss function of the real and pseudo visual features, an adversarial loss function of the semantic discriminator, an adversarial loss function of the visual discriminator and a classification loss function of the semantic discriminator;
taking the training image samples as the input of the generative adversarial network model, and iteratively training the model based on the multi-objective loss function to obtain a trained generative adversarial network model;
and inputting the test image samples into the trained generative adversarial network model to obtain a recognition result.
The invention also provides a zero-shot image recognition system based on a generative adversarial network, comprising:
a sample acquisition module, used for acquiring training image samples and test image samples; the training image samples are sample images with annotation information, and the test image samples are sample images without annotation information;
a network model construction module, used for constructing a generative adversarial network model; the generative adversarial network model comprises a semantic feature generator, a visual feature generator, a semantic discriminator and a visual discriminator; the semantic feature generator is used for generating pseudo-semantic features from real visual features; the visual feature generator is used for generating pseudo-visual features from the pseudo-semantic features; the semantic discriminator is used for discriminating between real semantic features and pseudo-semantic features; the visual discriminator is used for discriminating between real visual features and pseudo-visual features;
a loss function construction module, used for constructing a multi-objective loss function; the multi-objective loss function comprises a cycle consistency loss function of the real and pseudo visual features, an adversarial loss function of the semantic discriminator, an adversarial loss function of the visual discriminator and a classification loss function of the semantic discriminator;
a training module, used for taking the training image samples as the input of the generative adversarial network model and iteratively training the model based on the multi-objective loss function to obtain a trained generative adversarial network model;
and a test recognition module, used for inputting the test image samples into the trained generative adversarial network model to obtain a recognition result.
Compared with the prior art, the invention has the following beneficial effects:
The invention provides a zero-shot image recognition method and system based on a generative adversarial network. The method constructs a generative adversarial network model comprising a semantic feature generator, a visual feature generator, a semantic discriminator and a visual discriminator; constructs a multi-objective loss function comprising a cycle consistency loss function, an adversarial loss function of the semantic discriminator, an adversarial loss function of the visual discriminator and a classification loss function of the semantic discriminator; takes the training image samples as the input of the model and trains it iteratively based on the multi-objective loss function; and inputs the test image samples into the trained model to obtain a recognition result. The method can recognize sketches without annotation information, improves zero-shot recognition accuracy, and improves the generalization ability of the model.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the embodiments are briefly described below. Obviously, the drawings in the following description show only some embodiments of the present invention; for those skilled in the art, other drawings can be obtained from them without inventive effort.
FIG. 1 is a flow chart of a zero-shot image recognition method based on a generative adversarial network according to an embodiment of the present invention;
FIG. 2 shows the network structure of the semantic feature generator G1 according to an embodiment of the present invention;
FIG. 3 shows the network structure of the visual feature generator G2 according to an embodiment of the present invention;
FIG. 4 shows the network structure of the semantic discriminator D1 according to an embodiment of the present invention;
FIG. 5 shows the network structure of the visual discriminator D2 according to an embodiment of the present invention;
FIG. 6 is a structural diagram of the trained generative adversarial network model according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a zero-shot image recognition system based on a generative adversarial network according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. All other embodiments obtained by a person skilled in the art from these embodiments without creative effort shall fall within the protection scope of the present invention.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
To improve generalized zero-shot recognition accuracy, two problems must be solved: on the one hand, existing approaches either require aligned image pairs or rely on inefficient feature fusion to map visual information into the semantic space; on the other hand, when an auto-encoder extracts semantic information from Wikipedia, redundant noisy text is present and degrades the recognition result.
Fig. 1 is a flowchart of a zero-shot image recognition method based on a generative adversarial network according to an embodiment of the present invention. Referring to fig. 1, the method of this embodiment includes:
step 101: training image samples and test image samples are obtained.
The training image samples are sample images with annotation information, and the test image samples are sample images without annotation information.
Step 102: constructing a generative adversarial network model; the model comprises a semantic feature generator, a visual feature generator, a semantic discriminator and a visual discriminator.
The semantic feature generator generates pseudo-semantic features from real visual features; the visual feature generator generates pseudo-visual features from the pseudo-semantic features; the semantic discriminator discriminates between real and pseudo semantic features; the visual discriminator discriminates between real and pseudo visual features.
Before this step, it is also necessary to: 1) input Wikipedia texts into a hierarchical model to obtain the useful information of the texts, and input this useful information into an auto-encoder to obtain the real semantic features; 2) input the training image samples into an attention-based CNN model to obtain the real visual features.
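A minimal sketch of such an attention-based CNN feature extractor is given below, assuming TensorFlow/Keras (the embodiment is implemented under a TensorFlow framework); the patent does not fix this architecture, so the layer sizes and the spatial-attention form are illustrative assumptions only:

```python
# Sketch of an attention-based CNN that extracts real visual features:
# a 1x1 convolution produces a per-location attention map that
# re-weights the feature map before global pooling.
import tensorflow as tf
from tensorflow.keras import layers

def attention_cnn(input_shape=(224, 224, 3), feat_dim=2048):
    inp = layers.Input(shape=input_shape)
    x = layers.Conv2D(64, 3, strides=2, padding="same", activation="relu")(inp)
    x = layers.Conv2D(128, 3, strides=2, padding="same", activation="relu")(x)
    attn = layers.Conv2D(1, 1, activation="sigmoid")(x)  # spatial attention map
    x = x * attn                                          # re-weight features
    x = layers.GlobalAveragePooling2D()(x)
    out = layers.Dense(feat_dim)(x)                       # real visual feature x
    return tf.keras.Model(inp, out, name="attention_cnn")

print(attention_cnn().output_shape)  # (None, 2048)
```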
Step 103: constructing a multi-objective loss function; the multi-objective loss function comprises a cycle consistency loss function of the real and pseudo visual features, an adversarial loss function of the semantic discriminator, an adversarial loss function of the visual discriminator and a classification loss function of the semantic discriminator.
Step 104: taking the training image samples as the input of the generative adversarial network model, and iteratively training the model based on the multi-objective loss function to obtain a trained generative adversarial network model.
Step 105: inputting the test image samples into the trained generative adversarial network model to obtain a recognition result.
Step 101 is the initial training stage of this embodiment. The initial training stage of the recognition model is completed under the deep learning framework TensorFlow. The specific flow of obtaining the training and test image samples is as follows:
the training image samples and the test image samples in this embodiment may be selected from Sketchy and TU-Berlin. Sketchy and TU-Berlin are two common and popular sketch datasets.
The Sketchy dataset is a large sketch set. The dataset consists of 125 different classes of slaves, each class having 100 drafts. The sketch of the object appearing in this 12500 sketch was collected by group sourcing, with the result being 75471 sketches. The data set also contains fine-grained correspondence (alignment) between particular images and sketches, as well as various data augmentations for deep learning based methods. The data set was then expanded by adding 60502 photos, yielding a total of 73002 sketches. We randomly draw 25 classes of sketches as invisible test sets for zero sample recognition (without their labeling information) and the remaining 100 classes of data are used for training (with labeling information).
The TU-Berlin dataset (extended) contains 250 categories, followed by an extension of 20000 sketches, the natural image corresponding to the sketches class, with a total size of 204489. Randomly selecting 30 types of sketches as a test set (without using the labeled information thereof); the remaining 220 classes are used for training (using annotation information).
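The random class split can be sketched as follows; the class list is a hypothetical stand-in for the Sketchy category names:

```python
# Randomly hold out 25 classes as the unseen zero-shot test set;
# the remaining 100 classes (with annotations) are used for training.
import random

all_classes = [f"class_{i:03d}" for i in range(125)]  # stand-in for Sketchy
random.seed(0)
unseen = set(random.sample(all_classes, 25))          # annotations unused
seen = [c for c in all_classes if c not in unseen]    # annotations used
print(len(seen), len(unseen))                         # 100 25
```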
Step 102 is the model construction stage of this embodiment, i.e., building the structure of the generative adversarial network model, which includes the semantic feature generator G1, the visual feature generator G2, the semantic discriminator D1 and the visual discriminator D2. The specific construction process is as follows:
1) Construction of the generator network:
Two generators are constructed: the semantic feature generator G1 and the visual feature generator G2. As shown in FIG. 2, the semantic feature generator G1 comprises 2 groups of convolution modules and 2 groups of fully-connected modules; each convolution module consists of a convolution layer (Conv), a max pooling layer (Max Pooling) and a normalization layer (Normalization), and each fully-connected module consists of a fully-connected layer (FC) and a Leaky ReLU. As shown in fig. 3, the visual feature generator G2 comprises two groups of fully-connected modules, three 4096-dimensional fully-connected layers (FC 4096), a resampling layer (Reshape) and 5 groups of up-sampling modules; each fully-connected module consists of a fully-connected layer and a Leaky ReLU, and each up-sampling module consists of two up-sampling layers (Upconv) and two Leaky ReLUs, alternately connected. The input of G2 is the semantic features output by G1.
Specifically, the semantic feature generator G1 comprises 2 groups of convolution modules and 2 groups of fully-connected modules. After an image is input into the generator, the first convolution module applies a convolution layer with kernel size 11 and stride 4; max pooling with pool size 3 and stride 2 reduces the mean-square-error deviation left by the convolution-layer parameter errors, and the subsequent normalization normalizes the dimensions of the input data. The second convolution module then applies a convolution layer with kernel size 5 and stride 1, again followed by max pooling with pool size 3 and stride 2 and normalization, and the result is fed into a 1024-unit fully-connected module. Finally, the semantic features are generated from the input visual features through two fully-connected modules of the same size.
Specifically, the visual feature generator G2 comprises two groups of fully-connected modules, three 4096-dimensional fully-connected layers, a resampling layer and 5 groups of up-sampling modules. The semantic features generated by the semantic feature generator are input into the visual feature generator and first pass through two 1024-unit fully-connected modules; three 4096-dimensional fully-connected layers then extract a 4096-dimensional feature vector, which the resampling layer reshapes to 4 × 4 × 256. Finally, 5 up-sampling modules with kernel size 4 and stride 2 up-sample the feature map, applying an activation function after each up-sampling to prevent vanishing gradients, and the result is output.
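The two generators can be sketched in Keras as follows; the module grouping (kernel 11 with stride 4, pooling 3 with stride 2, 1024-unit fully-connected modules, three 4096-dimensional layers, reshape to 4 × 4 × 256, five stride-2 up-sampling stages) follows the text above, while the channel counts and input sizes, which the patent does not state, are illustrative assumptions:

```python
# Sketch of the semantic feature generator G1 and visual feature
# generator G2 described in the embodiment.
import tensorflow as tf
from tensorflow.keras import layers

def semantic_generator_g1(input_shape=(224, 224, 3)):
    inp = layers.Input(shape=input_shape)
    # Convolution module 1: conv (kernel 11, stride 4) + max pool + norm.
    x = layers.Conv2D(96, 11, strides=4, padding="same")(inp)
    x = layers.MaxPooling2D(pool_size=3, strides=2)(x)
    x = layers.BatchNormalization()(x)
    # Convolution module 2: conv (kernel 5, stride 1) + max pool + norm.
    x = layers.Conv2D(256, 5, strides=1, padding="same")(x)
    x = layers.MaxPooling2D(pool_size=3, strides=2)(x)
    x = layers.BatchNormalization()(x)
    x = layers.Flatten()(x)
    # Two fully-connected modules of the same size: FC + LeakyReLU.
    for _ in range(2):
        x = layers.Dense(1024)(x)
        x = layers.LeakyReLU()(x)
    return tf.keras.Model(inp, x, name="G1")  # pseudo-semantic features

def visual_generator_g2(sem_dim=1024):
    inp = layers.Input(shape=(sem_dim,))
    x = inp
    # Two fully-connected modules (FC + LeakyReLU).
    for _ in range(2):
        x = layers.Dense(1024)(x)
        x = layers.LeakyReLU()(x)
    # Three 4096-dimensional fully-connected layers.
    for _ in range(3):
        x = layers.Dense(4096)(x)
    # Resample to a 4 x 4 x 256 feature map (4 * 4 * 256 = 4096).
    x = layers.Reshape((4, 4, 256))(x)
    # Five up-sampling stages: transposed conv (kernel 4, stride 2)
    # alternating with LeakyReLU to keep gradients from vanishing.
    for filters in (256, 128, 64, 32, 3):
        x = layers.Conv2DTranspose(filters, 4, strides=2, padding="same")(x)
        x = layers.LeakyReLU()(x)
    return tf.keras.Model(inp, x, name="G2")  # pseudo-visual features
```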
2) Construction of the discriminator network:
Two discriminators are constructed: the semantic discriminator D1 and the visual discriminator D2. D1 has two branches: one branch comprises a group of fully-connected modules and a two-way fully-connected layer, and the other comprises a group of fully-connected modules and an n-way fully-connected layer; each fully-connected module consists of a fully-connected layer and a Leaky ReLU. D2 comprises a group of fully-connected modules and a fully-connected layer, where the fully-connected module consists of a fully-connected layer and a Leaky ReLU. In both discriminators D1 and D2, the last fully-connected layer serves as a classifier within the overall convolutional neural network.
As shown in FIG. 4, the semantic discriminator D1 specifically comprises two branches: one branch performs 0/1 binary classification, and the other performs class-label classification. It receives the real semantic features extracted by the auto-encoder and the pseudo-semantic features generated by the semantic feature generator G1. In the binary branch, features are first extracted through a group of 1024-unit fully-connected modules, an activation function then stabilizes the gradient, and finally a fully-connected layer performs 0/1 binary classification to judge whether the input features are real or fake. In the n-way branch, the last fully-connected layer performs n-way classification of the input data.
As shown in FIG. 5, the visual discriminator D2 discriminates between the pseudo-visual features generated by the visual feature generator G2 and the real visual features extracted by the CNN. The generated pseudo-visual features are input into the discriminator network D2; a 1024-unit fully-connected layer first extracts features, an activation function then prevents vanishing gradients, and finally a fully-connected layer performs binary classification to judge whether the input features are real or fake.
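The two discriminators can be sketched as follows; the branch widths and the input shape of D2 are illustrative, and the 0/1 branch of D1 is left as a linear critic score since it is trained with the WGAN/CTGAN loss given below:

```python
# Sketch of the semantic discriminator D1 (two branches: real/fake
# and n-way class label) and the visual discriminator D2 (binary).
import tensorflow as tf
from tensorflow.keras import layers

def semantic_discriminator_d1(sem_dim=1024, n_classes=100):
    inp = layers.Input(shape=(sem_dim,))
    # Branch 1: FC module + two-way output judging real vs. pseudo
    # semantic features (linear critic score for the WGAN-style loss).
    h1 = layers.LeakyReLU()(layers.Dense(1024)(inp))
    real_fake = layers.Dense(1)(h1)
    # Branch 2: FC module + n-way output classifying the class label.
    h2 = layers.LeakyReLU()(layers.Dense(1024)(inp))
    label = layers.Dense(n_classes, activation="softmax")(h2)
    return tf.keras.Model(inp, [real_fake, label], name="D1")

def visual_discriminator_d2(input_shape=(128, 128, 3)):
    inp = layers.Input(shape=input_shape)
    h = layers.Flatten()(inp)
    h = layers.LeakyReLU()(layers.Dense(1024)(h))   # FC module
    out = layers.Dense(1, activation="sigmoid")(h)  # real/fake probability
    return tf.keras.Model(inp, out, name="D2")
```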
In step 103, a multi-objective loss function is constructed. The purpose of constructing the loss function is as follows: according to the convergence of the loss values, the corresponding parameters of the zero-shot recognition network model can be better updated and optimized, an optimized generative adversarial network model is finally obtained, and the images to be recognized in the real dataset are recognized more accurately. Specifically:
the above-mentioned antagonism loss function is divided into two parts, one is the antagonism loss of CTGAN which evaluates the synthesis semantic features, the antagonism loss of CTGAN can make corresponding constraint to the gradient punishment to improve the quality of the synthesis features; and secondly, the antagonism loss of the general GAN for evaluating the synthesized pseudo-visual characteristics, and the general antagonism mechanism can well reduce the domain difference.
The cycle consistency loss function measures how well the visual features extracted by the attention-based CNN match the generated pseudo-visual features.
The classifier is attached to the semantic discriminator D1, so that the classifier can effectively classify the class-label data and thus serve the zero-shot image recognition task. The adversarial loss function of the semantic discriminator D1 in the generative adversarial network model is:

L_CTGAN = E_{x~P_f}[D1(G1(x))] - E_{a~P_r}[D1(a)] + λ1·E_{â~P_{r,f}}[(||∇_â D1(â)||_2 - 1)^2] + λ2·CT|_{x',x''}

where x represents the real visual features, a represents the real semantic features, G1(x) denotes the semantic generator with visual features x as input, D1(G1(x)) denotes the semantic discriminator with G1(x) as input, D1(a) denotes the semantic discriminator with semantic features a as input, P_f denotes the prior distribution of the real visual features, P_r denotes the prior distribution of the real semantic features, â = ε·a + (1-ε)·G1(x) (with ε ~ U(0,1)) is a linear interpolation between features, and P_{r,f} denotes the prior distribution obeyed jointly by the real visual and real semantic features. The first term, E_{x~P_f}[D1(G1(x))], is the expectation over the pseudo-feature distribution; the second term, E_{a~P_r}[D1(a)], is the expectation over the real feature distribution; the difference between the first and second terms estimates the Wasserstein distance between the feature distributions. The third term, λ1·E[(||∇_â D1(â)||_2 - 1)^2], is the gradient penalty enforcing the Lipschitz constraint; λ2·CT|_{x',x''} is the consistency (continuity) term added to strengthen the gradient-penalty constraint; λ1 is the weight of the gradient penalty, and λ2 is the weight of the consistency term. The consistency term is

CT|_{x',x''} = E[max(0, ||D1(x') - D1(x'')|| / ||x' - x''|| - c)]

where x' and x'' both denote perturbed data near the real features (perturbations drawn arbitrarily in the neighborhood of the real samples), c is a fixed constant, D1(x') and D1(x'') denote the discriminator outputs for inputs x' and x'', ||D1(x') - D1(x'')|| is the distance between the two discriminator values, and ||x' - x''|| is the distance between the two perturbed features. The consistency term uses the ratio ||D1(x') - D1(x'')|| / ||x' - x''|| to approximate the gradient and constrains it to be less than c.
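A TensorFlow sketch of L_CTGAN as reconstructed above is given below; d1 here denotes the critic (0/1) head of the semantic discriminator, and the perturbation scale, weights λ1, λ2 and constant c are illustrative assumptions:

```python
# Wasserstein critic terms + gradient penalty on interpolated features
# + consistency term over two perturbations of the real features.
import tensorflow as tf

def ctgan_loss(d1, real_sem, fake_sem, lam1=10.0, lam2=2.0, c=0.0):
    # Wasserstein terms: E[D1(fake)] - E[D1(real)].
    wass = tf.reduce_mean(d1(fake_sem)) - tf.reduce_mean(d1(real_sem))
    # Gradient penalty on a_hat = eps * a + (1 - eps) * G1(x).
    eps = tf.random.uniform([tf.shape(real_sem)[0], 1], 0.0, 1.0)
    a_hat = eps * real_sem + (1.0 - eps) * fake_sem
    with tf.GradientTape() as tape:
        tape.watch(a_hat)
        score = d1(a_hat)
    grad = tape.gradient(score, a_hat)
    gp = tf.reduce_mean((tf.norm(grad, axis=1) - 1.0) ** 2)
    # Consistency term: the difference quotient of D1 between two
    # perturbed real samples x', x'' is pushed below the constant c.
    x1 = real_sem + 0.01 * tf.random.normal(tf.shape(real_sem))
    x2 = real_sem + 0.01 * tf.random.normal(tf.shape(real_sem))
    quot = tf.abs(d1(x1) - d1(x2)) / (
        tf.norm(x1 - x2, axis=1, keepdims=True) + 1e-12)
    ct = tf.reduce_mean(tf.maximum(0.0, quot - c))
    return wass + lam1 * gp + lam2 * ct
```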
The adversarial loss function of the visual discriminator is constructed as:

L_adv = E_x[log D2(x)] + E_ã[log(1 - D2(G2(ã)))]

where ã = G1(x) denotes the pseudo-semantic features, D2(x) denotes the visual discriminator with visual features x as input, G2(ã) denotes the visual feature generator with the pseudo-semantic features ã as input, and x̃ = G2(ã) denotes the generated pseudo-visual features. The network is continuously optimized through this loss function so that the generated pseudo-visual features x̃ come ever closer to the real visual features x.
The adversarial loss function analyzes the real feature distribution and the generated feature distribution as a whole, outputs a feedback signal to the generator network, and adjusts and optimizes the parameters of the network.
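A minimal sketch of this standard GAN loss, assuming D2 outputs a real/fake probability:

```python
# L_adv = E[log D2(x)] + E[log(1 - D2(G2(a_tilde)))], with
# fake_vis = G2(G1(x)); eps guards the logarithm numerically.
import tensorflow as tf

def visual_adv_loss(d2, real_vis, fake_vis, eps=1e-12):
    real_term = tf.reduce_mean(tf.math.log(d2(real_vis) + eps))
    fake_term = tf.reduce_mean(tf.math.log(1.0 - d2(fake_vis) + eps))
    return real_term + fake_term  # D2 maximises this; G2 pushes the fake term up
```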
The cycle consistency loss function of the real and pseudo visual features is constructed as:

L_cyc = E[||G2(G1(x)) - x||_1] + E[||G1(G2(a)) - a||_1]

where E[||G2(G1(x)) - x||_1] is the expected distance between the two visual feature distributions measured by cycle consistency, and E[||G1(G2(a)) - a||_1] is the expected distance between the two semantic feature distributions measured by cycle consistency. The cycle consistency loss L_cyc optimizes the network parameters so that the real visual features x and the pseudo-semantic features ã match better.
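A minimal sketch of L_cyc, assuming G1 and G2 map between the visual and semantic domains so that the two compositions type-check:

```python
# L1 distances of the two reconstruction cycles:
# x -> G1 -> G2 -> x_tilde and a -> G2 -> G1 -> a_tilde.
import tensorflow as tf

def cycle_loss(g1, g2, real_vis, real_sem):
    vis_cycle = tf.reduce_mean(tf.abs(g2(g1(real_vis)) - real_vis))
    sem_cycle = tf.reduce_mean(tf.abs(g1(g2(real_sem)) - real_sem))
    return vis_cycle + sem_cycle
```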
The classification loss function of the semantic discriminator is constructed as:

L_cls = -E[log P(b | G1(a); θ)]

where P(b | G1(a); θ) is the class-conditional probability of the class label, G1(a) denotes the semantic generator with semantic features a as input, θ is the parameter of the classification network, and b is the class label of a. Minimizing the classification loss of the generated features improves the classification accuracy of the class labels.
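A minimal sketch of L_cls, assuming the n-way branch of D1 outputs a softmax over class labels:

```python
# Negative log-likelihood of the true labels b under the n-way
# softmax branch, i.e. sparse categorical cross-entropy.
import tensorflow as tf

def cls_loss(label_probs, labels):
    # label_probs: (B, n) softmax output of D1's n-way branch;
    # labels: (B,) integer class labels b.
    return tf.reduce_mean(
        tf.keras.losses.sparse_categorical_crossentropy(labels, label_probs))
```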
In step 104, the constructed generative adversarial network model is trained iteratively, and the parameters of the network model are updated and optimized to obtain the trained model. Specifically, the training image samples are taken as the input of the semantic feature generator, and the semantic feature generator, the visual feature generator, the semantic discriminator and the visual discriminator are jointly trained by backpropagation according to the multi-objective loss function, so that their parameters are continuously updated and optimized and a trained generative adversarial network model is obtained. Fig. 6 shows the structure of the trained generative adversarial network model according to an embodiment of the present invention. The specific iterative training steps are as follows:
Training sample data from the Sketchy and TU-Berlin datasets is input into the attention-based CNN to extract the visual feature information of the training samples, which is then input into the semantic feature generator G1 to generate the pseudo-semantic features ã. The pseudo-semantic features obtained in the previous step are input into the visual feature generator G2 to generate the pseudo-visual features x̃.
To better measure the similarity between sketches and real images during training, the cycle consistency constraint of CycleGAN is introduced, since CycleGAN consists of two generators and two discriminators. Treating the semantic features and visual features as data from two different domains, the semantic feature generator G1 generates the pseudo-semantic features ã from the real visual features x, and the visual feature generator G2 reversely generates the pseudo-visual features x̃ from the obtained pseudo-semantic features ã. L_cyc is then used to measure the similarity between the real and pseudo visual features.
The texts in Wikipedia are input into the hierarchical model to obtain the useful information of the texts, which is then input into the auto-encoder to extract the real semantic information of the Wikipedia texts. The real semantic information a serves as the input of the discriminator D1 for adversarial learning against the pseudo-semantic features generated by G1.
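The auto-encoder step can be sketched as follows; the input text representation (e.g., a bag-of-words vector) and the 1024-dimensional bottleneck are illustrative assumptions, since the patent does not specify them:

```python
# Auto-encoder that compresses the filtered text representation into
# real semantic features a; after training, encoder(text) gives a.
import tensorflow as tf
from tensorflow.keras import layers

def text_autoencoder(text_dim=10000, sem_dim=1024):
    inp = layers.Input(shape=(text_dim,))
    a = layers.Dense(sem_dim, activation="relu", name="semantic")(inp)  # encoder
    rec = layers.Dense(text_dim, activation="sigmoid")(a)               # decoder
    auto = tf.keras.Model(inp, rec, name="text_autoencoder")
    encoder = tf.keras.Model(inp, a, name="text_encoder")
    auto.compile(optimizer="adam", loss="mse")
    return auto, encoder
```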
A variant of WGAN, CTGAN, is used as the discriminator D1, because the gradient penalty of WGAN alone is insufficient: if the real sample distribution and the generated pseudo-sample distribution are far apart, the gradient penalty cannot check the continuity of the region near the real samples, i.e., the discriminator may violate the Lipschitz continuity constraint there. The consistency term is therefore added to strengthen the Lipschitz continuity around the data sample distribution.
The pseudo-visual features x̃ generated by the visual feature generator G2 and the real visual features x serve as the input of the visual discriminator D2; D2 and G2 judge the authenticity of the visual features to produce the adversarial loss, and the network parameters are updated and optimized through the loss function so that the pseudo-visual features x̃ come ever closer to the real visual features x.
The adversarial loss function L_CTGAN of the discriminator D1 and the adversarial loss function L_adv of the discriminator D2 are constructed from the feature information of the Wikipedia texts and the sketches; the cycle consistency loss function L_cyc is constructed from the real and pseudo visual features of the sketches; and the loss function L_cls for classifying the label categories is then constructed.
The specific update and optimization process is: fix the generator network parameters and train the discriminator network to obtain a trained discriminator network model; then fix the trained discriminator network parameters and train the generator network by backpropagation to obtain an optimized generator network model; repeat these steps to obtain the optimal generative adversarial network model.
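This alternating update scheme can be sketched as a single training step as follows; all models and loss helpers are the hypothetical ones sketched earlier, and the loss weighting is illustrative:

```python
# One alternating step: update the discriminators with the generators
# frozen, then update the generators with the discriminators frozen.
import tensorflow as tf

def train_step(g1, g2, d1, d2, opt_g, opt_d, x, a, b,
               ctgan_loss, visual_adv_loss, cycle_loss, cls_loss):
    # 1) Discriminator step (generator parameters fixed).
    with tf.GradientTape() as tape:
        fake_sem = g1(x, training=False)
        fake_vis = g2(fake_sem, training=False)
        d_loss = (ctgan_loss(lambda s: d1(s)[0], a, fake_sem)   # critic head
                  - visual_adv_loss(d2, x, fake_vis))           # D2 maximises
    d_vars = d1.trainable_variables + d2.trainable_variables
    opt_d.apply_gradients(zip(tape.gradient(d_loss, d_vars), d_vars))
    # 2) Generator step (discriminator parameters fixed).
    with tf.GradientTape() as tape:
        fake_sem = g1(x, training=True)
        fake_vis = g2(fake_sem, training=True)
        g_loss = (-tf.reduce_mean(d1(fake_sem)[0])               # fool the critic
                  - tf.reduce_mean(tf.math.log(d2(fake_vis) + 1e-12))
                  + 10.0 * cycle_loss(g1, g2, x, a)
                  + cls_loss(d1(fake_sem)[1], b))  # classify generated features
    g_vars = g1.trainable_variables + g2.trainable_variables
    opt_g.apply_gradients(zip(tape.gradient(g_loss, g_vars), g_vars))
    return d_loss, g_loss
```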
The zero-shot image recognition method based on a generative adversarial network in this embodiment has the following advantages. A semantically aligned cycle consistency constraint is introduced into the generative model to solve the problem that common semantic knowledge cannot be shared between training and test images in real scenes, and to measure the correlation between visual and semantic features; a classification network parallel to the discriminator is added at the discriminator output to correctly classify the class labels. The WGAN variant CTGAN performs adversarial learning between the real and synthesized features, adding a consistency term on top of WGAN to constrain the gradient over the real feature distribution. Because zero-shot learning incurs high training cost and complexity when recognition is based on the full attribute set of features, an extraction scheme based on Wikipedia texts and a hierarchical structure is proposed: features of attribute subsets are extracted, the hierarchical structure divides the subsets and screens useful information, and the important feature information from the texts is extracted, which reduces training cost and complexity; recognizing with an attribute subset is more effective than with the full attribute set in zero-shot learning.
The method of this embodiment uses a generative adversarial network to achieve zero-shot recognition; it can recognize sketches without annotation information, improve zero-shot recognition accuracy, and improve the generalization ability of the model.
Fig. 7 is a schematic structural diagram of a zero-shot image recognition system based on a generative adversarial network according to an embodiment of the present invention. Referring to fig. 7, the system includes:
a sample acquisition module 201, configured to acquire training image samples and test image samples; the training image samples are sample images with annotation information, and the test image samples are sample images without annotation information;
a network model construction module 202, configured to construct a generative adversarial network model comprising a semantic feature generator, a visual feature generator, a semantic discriminator and a visual discriminator; the semantic feature generator generates pseudo-semantic features from real visual features; the visual feature generator generates pseudo-visual features from the pseudo-semantic features; the semantic discriminator discriminates between real and pseudo semantic features; the visual discriminator discriminates between real and pseudo visual features;
a loss function construction module 203, configured to construct a multi-objective loss function comprising a cycle consistency loss function of the real and pseudo visual features, an adversarial loss function of the semantic discriminator, an adversarial loss function of the visual discriminator and a classification loss function of the semantic discriminator;
a training module 204, configured to take the training image samples as the input of the generative adversarial network model and train it iteratively based on the multi-objective loss function to obtain a trained generative adversarial network model;
and a test recognition module 205, configured to input the test image samples into the trained generative adversarial network model to obtain a recognition result.
As an optional implementation, the zero-shot image recognition system based on a generative adversarial network further includes:
a real semantic feature acquisition module, configured to input the Wikipedia texts into the hierarchical model to obtain the useful information of the texts, and to input this useful information into the auto-encoder to obtain the real semantic features;
and a real visual feature acquisition module, configured to input the training image samples into the attention-based CNN model to obtain the real visual features.
As an optional implementation, the network model construction module 202 specifically includes:
a first generator construction unit, configured to construct the semantic feature generator, which comprises two groups of convolution modules and two groups of fully-connected modules; each convolution module comprises a convolution layer, a max pooling layer and a normalization layer connected in sequence, and each fully-connected module comprises a fully-connected layer and a Leaky ReLU layer;
a second generator construction unit, configured to construct the visual feature generator, which comprises two groups of fully-connected modules, three 4096-dimensional fully-connected layers, one resampling layer and five groups of up-sampling modules connected in sequence; each up-sampling module comprises two up-sampling layers and two Leaky ReLU layers, the up-sampling layers alternating with the Leaky ReLU layers;
a first discriminator construction unit, configured to construct the semantic discriminator, which comprises a group of fully-connected modules, a two-way fully-connected layer and an n-way fully-connected layer, serving as a binary classifier and an input-label classifier;
and a second discriminator construction unit, configured to construct the visual discriminator, which comprises a group of fully-connected modules, a fully-connected layer and a classifier.
As an optional implementation, the loss function construction module 203 specifically includes:
a first loss function construction unit, configured to construct the adversarial loss function of the semantic discriminator:

L_CTGAN = E_{x~P_f}[D1(G1(x))] - E_{a~P_r}[D1(a)] + λ1·E_{â~P_{r,f}}[(||∇_â D1(â)||_2 - 1)^2] + λ2·CT|_{x',x''}

where x represents the real visual features, a represents the real semantic features, G1(x) denotes the semantic generator with visual features x as input, D1(G1(x)) denotes the semantic discriminator with G1(x) as input, D1(a) denotes the semantic discriminator with semantic features a as input, P_f denotes the prior distribution of the real visual features, P_r denotes the prior distribution of the real semantic features, â = ε·a + (1-ε)·G1(x) is a linear interpolation between features, and P_{r,f} denotes the prior distribution obeyed jointly by the real visual and real semantic features; E_{x~P_f}[D1(G1(x))] is the expectation over the pseudo-feature distribution; E_{a~P_r}[D1(a)] is the expectation over the real feature distribution; λ1·E[(||∇_â D1(â)||_2 - 1)^2] is the gradient penalty enforcing the Lipschitz constraint; λ2·CT|_{x',x''} is the consistency (continuity) term added to strengthen the gradient-penalty constraint; λ1 is the weight of the gradient penalty; λ2 is the weight of the consistency term; and

CT|_{x',x''} = E[max(0, ||D1(x') - D1(x'')|| / ||x' - x''|| - c)]

where x' and x'' both denote perturbed data near the real features, c is a fixed constant, D1(x') and D1(x'') denote the discriminator outputs for inputs x' and x'', ||D1(x') - D1(x'')|| is the distance between the two discriminator values, and ||x' - x''|| is the distance between the two perturbed features;
a second loss function construction unit, configured to construct the adversarial loss function of the visual discriminator:

L_adv = E_x[log D2(x)] + E_ã[log(1 - D2(G2(ã)))]

where ã denotes the pseudo-semantic features, D2(x) denotes the visual discriminator with visual features x as input, and G2(ã) denotes the visual feature generator with the pseudo-semantic features ã as input;
a third loss function construction unit, configured to construct the cycle consistency loss function of the real and pseudo visual features:

L_cyc = E[||G2(G1(x)) - x||_1] + E[||G1(G2(a)) - a||_1]

where E[||G2(G1(x)) - x||_1] is the expected distance between the two visual feature distributions measured by cycle consistency, and E[||G1(G2(a)) - a||_1] is the expected distance between the two semantic feature distributions measured by cycle consistency;
and a fourth loss function construction unit, configured to construct the classification loss function of the semantic discriminator:

L_cls = -E[log P(b | G1(a); θ)]

where P(b | G1(a); θ) is the class-conditional probability of the class label, G1(a) denotes the semantic generator with semantic features a as input, θ is the parameter of the classification network, and b is the class label of a.
As an optional implementation, the training module 204 specifically includes:
a training unit, configured to take the training image samples as the input of the semantic feature generator, and to jointly train the semantic feature generator, the visual feature generator, the semantic discriminator and the visual discriminator by backpropagation according to the multi-objective loss function, so that their parameters are continuously updated and optimized and a trained generative adversarial network model is obtained.
The zero-shot image recognition system based on a generative adversarial network in this embodiment uses a generative adversarial network to achieve zero-shot recognition; it can recognize sketches without annotation information, improve zero-shot recognition accuracy, and improve the generalization ability of the model.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description.
The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the above, the present disclosure should not be construed as limiting the invention.

Claims (10)

1. A zero-shot image recognition method based on a generative adversarial network, characterized by comprising:
acquiring training image samples and test image samples; the training image samples being sample images with annotation information, and the test image samples being sample images without annotation information;
constructing a generative adversarial network model; the generative adversarial network model comprising a semantic feature generator, a visual feature generator, a semantic discriminator and a visual discriminator; the semantic feature generator being used for generating pseudo-semantic features from real visual features; the visual feature generator being used for generating pseudo-visual features from the pseudo-semantic features; the semantic discriminator being used for discriminating between real semantic features and pseudo-semantic features; the visual discriminator being used for discriminating between real visual features and pseudo-visual features;
constructing a multi-objective loss function; the multi-objective loss function comprising a cycle consistency loss function of the real and pseudo visual features, an adversarial loss function of the semantic discriminator, an adversarial loss function of the visual discriminator and a classification loss function of the semantic discriminator;
taking the training image samples as the input of the generative adversarial network model, and iteratively training the model based on the multi-objective loss function to obtain a trained generative adversarial network model;
and inputting the test image samples into the trained generative adversarial network model to obtain a recognition result.
2. The zero-shot image recognition method based on a generative adversarial network according to claim 1, further comprising, before the constructing of the generative adversarial network model:
inputting Wikipedia texts into a hierarchical model to obtain useful information of the texts, and inputting the useful information of the texts into an auto-encoder to obtain real semantic features;
and inputting the training image samples into an attention-based CNN model to obtain real visual features.
3. The zero-shot image recognition method based on a generative adversarial network according to claim 1, wherein the constructing of the generative adversarial network model specifically comprises:
constructing the semantic feature generator, which comprises two groups of convolution modules and two groups of fully-connected modules; each convolution module comprises a convolution layer, a max pooling layer and a normalization layer connected in sequence, and each fully-connected module comprises a fully-connected layer and a Leaky ReLU layer;
constructing the visual feature generator, which comprises two groups of fully-connected modules, three 4096-dimensional fully-connected layers, a resampling layer and five groups of up-sampling modules connected in sequence; each up-sampling module comprises two up-sampling layers and two Leaky ReLU layers, the up-sampling layers alternating with the Leaky ReLU layers;
constructing the semantic discriminator, which comprises a group of fully-connected modules, a two-way fully-connected layer and an n-way fully-connected layer, serving as a binary classifier and an input-label classifier;
and constructing the visual discriminator, which comprises a group of fully-connected modules, a fully-connected layer and a classifier.
4. The zero-shot image recognition method based on a generative adversarial network according to claim 1, wherein the constructing of the multi-objective loss function specifically comprises:
constructing the adversarial loss function of the semantic discriminator:

L_CTGAN = E_{x~P_f}[D1(G1(x))] - E_{a~P_r}[D1(a)] + λ1·E_{â~P_{r,f}}[(||∇_â D1(â)||_2 - 1)^2] + λ2·CT|_{x',x''}

where x represents the real visual features, a represents the real semantic features, G1(x) denotes the semantic generator with visual features x as input, D1(G1(x)) denotes the semantic discriminator with G1(x) as input, D1(a) denotes the semantic discriminator with semantic features a as input, P_f denotes the prior distribution of the real visual features, P_r denotes the prior distribution of the real semantic features, â = ε·a + (1-ε)·G1(x) is a linear interpolation between features, and P_{r,f} denotes the prior distribution obeyed jointly by the real visual and real semantic features; E_{x~P_f}[D1(G1(x))] is the expectation over the pseudo-feature distribution; E_{a~P_r}[D1(a)] is the expectation over the real feature distribution; λ1·E[(||∇_â D1(â)||_2 - 1)^2] is the gradient penalty enforcing the Lipschitz constraint; λ2·CT|_{x',x''} is the consistency or continuity term added to strengthen the gradient-penalty constraint; λ1 is the weight of the gradient penalty; λ2 is the weight of the consistency or continuity term; and

CT|_{x',x''} = E[max(0, ||D1(x') - D1(x'')|| / ||x' - x''|| - c)]

where x' and x'' both denote perturbed data near the real visual features; c is a fixed constant; D1(x') and D1(x'') denote the semantic discriminator outputs for inputs x' and x''; ||D1(x') - D1(x'')|| is the distance between the two discriminator values, and ||x' - x''|| is the distance between the two perturbed data features;
constructing the adversarial loss function of the visual discriminator:

L_adv = E_x[log D2(x)] + E_ã[log(1 - D2(G2(ã)))]

where ã denotes the pseudo-semantic features, D2(x) denotes the visual discriminator with visual features x as input, and G2(ã) denotes the visual feature generator with the pseudo-semantic features ã as input;
constructing the cycle consistency loss function of the real and pseudo visual features:

L_cyc = E[||G2(G1(x)) - x||_1] + E[||G1(G2(a)) - a||_1]

where E[||G2(G1(x)) - x||_1] is the expected distance between the two visual feature distributions measured by cycle consistency, and E[||G1(G2(a)) - a||_1] is the expected distance between the two semantic feature distributions measured by cycle consistency;
and constructing the classification loss function of the semantic discriminator:

L_cls = -E[log P(b | G1(a); θ)]

where P(b | G1(a); θ) is the class-conditional probability of the class label, G1(a) denotes the semantic generator with semantic features a as input, θ is the parameter of the classification network, and b is the class label of a.
5. The zero-shot image recognition method based on a generative adversarial network according to claim 1, wherein taking the training image samples as the input of the generative adversarial network model and iteratively training the model based on the multi-objective loss function to obtain a trained generative adversarial network model specifically comprises:
taking the training image samples as the input of the semantic feature generator, and jointly training the semantic feature generator, the visual feature generator, the semantic discriminator and the visual discriminator by backpropagation according to the multi-objective loss function, so that their parameters are continuously updated and optimized and a trained generative adversarial network model is obtained.
6. A zero-sample image recognition system based on a generative adversarial network, comprising:

a sample acquisition module for acquiring training image samples and test image samples, the training image samples being sample images with annotation information and the test image samples being sample images without annotation information;

a network model construction module for constructing a generative adversarial network model, the model comprising a semantic feature generator, a visual feature generator, a semantic discriminator and a visual discriminator, wherein the semantic feature generator generates pseudo-semantic features from the real visual features, the visual feature generator generates pseudo-visual features from the pseudo-semantic features, the semantic discriminator discriminates between the real semantic features and the pseudo-semantic features, and the visual discriminator discriminates between the real visual features and the pseudo-visual features;

a loss function construction module for constructing a multi-objective loss function comprising a cycle-consistency loss function of the real visual features and the pseudo-visual features, an adversarial loss function of the semantic discriminator, an adversarial loss function of the visual discriminator, and a classification loss function of the semantic discriminator;

a training module for taking the training image samples as the input of the generative adversarial network model and iteratively training the model based on the multi-objective loss function to obtain a trained generative adversarial network model; and

a test and recognition module for inputting the test image samples into the trained generative adversarial network model to obtain the recognition results.
7. The system of claim 6, further comprising:

a real semantic feature acquisition module for inputting Wikipedia text into a hierarchical model to extract the useful information of the text, and inputting that useful information into an autoencoder to obtain the real semantic features; and

a real visual feature acquisition module for inputting the training image samples into an attention-based CNN model to obtain the real visual features.
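As a sketch only: the claim names a hierarchical model and an autoencoder without fixing their shapes. Below, the encoder half of a plain fully connected autoencoder stands in for the semantic-feature step; the input text dimension and the semantic dimension are assumptions.

```python
# Hedged sketch of the semantic-feature autoencoder: the encoder output z
# would serve as the real semantic features a; all sizes are assumptions.
import torch.nn as nn

class SemanticAutoencoder(nn.Module):
    def __init__(self, text_dim=5000, sem_dim=300):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(text_dim, 1024), nn.ReLU(),
                                     nn.Linear(1024, sem_dim))
        self.decoder = nn.Sequential(nn.Linear(sem_dim, 1024), nn.ReLU(),
                                     nn.Linear(1024, text_dim))

    def forward(self, t):
        z = self.encoder(t)          # candidate real semantic features
        return self.decoder(z), z    # reconstruction for training, features
```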
8. The zero-sample image recognition system based on a generative adversarial network according to claim 6, wherein the network model construction module specifically comprises:

a first generator construction unit for constructing the semantic feature generator, the semantic feature generator comprising two convolution modules and two fully connected modules, each convolution module comprising a convolution layer, a max-pooling layer and a normalization layer connected in sequence, and each fully connected module comprising a fully connected layer and a LeakyReLU layer;

a second generator construction unit for constructing the visual feature generator, the visual feature generator comprising two fully connected modules, three 4096-dimensional fully connected layers, a resampling layer and five upsampling modules connected in sequence, each upsampling module comprising two upsampling layers and two LeakyReLU layers, with the upsampling layers and the LeakyReLU layers connected alternately;

a first discriminator construction unit for constructing the semantic discriminator, the semantic discriminator comprising one fully connected module, a two-way fully connected layer, an n-way fully connected layer, two classifiers and an input-label classifier; and

a second discriminator construction unit for constructing the visual discriminator, the visual discriminator comprising one fully connected module, a fully connected layer and a classifier.
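One way to read the semantic feature generator of this claim in code: the claim fixes the module types and their order, so every channel count and feature size below is an assumption.

```python
# Hedged sketch: two convolution modules (conv -> max-pool -> batch norm)
# followed by two fully connected modules (linear -> LeakyReLU). Assumes
# 512-channel 4x4 visual feature maps as input; all sizes are illustrative.
import torch.nn as nn

def conv_module(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
                         nn.MaxPool2d(2),
                         nn.BatchNorm2d(c_out))

def fc_module(f_in, f_out):
    return nn.Sequential(nn.Linear(f_in, f_out), nn.LeakyReLU(0.2))

class SemanticGenerator(nn.Module):
    def __init__(self, in_channels=512, sem_dim=300):
        super().__init__()
        self.conv = nn.Sequential(conv_module(in_channels, 256),
                                  conv_module(256, 128))
        # A 4x4 input halved twice by max-pooling leaves a 1x1 spatial map.
        self.fc = nn.Sequential(nn.Flatten(),
                                fc_module(128, 1024),
                                fc_module(1024, sem_dim))

    def forward(self, x):
        return self.fc(self.conv(x))
```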
9. The system for zero-sample image recognition based on generation of a countermeasure network according to claim 6, wherein the loss function construction module specifically comprises:
a first loss function construction unit for constructing the adversarial loss function of the semantic discriminator:

L_adv(D1) = E_{x~P_f}[D1(G1(x))] − E_{a~P_r}[D1(a)] + λ1·E_{x̂~P_{r,f}}[(||∇_x̂ D1(x̂)||_2 − 1)^2] + λ2·CT|_{x′,x″};

where x denotes the real visual features and a denotes the real semantic features; G1(x) denotes the output of the semantic generator for the input visual features x; D1(G1(x)) denotes the output of the semantic discriminator for the input G1(x), and D1(a) denotes its output for the input semantic features a; P_f denotes the prior distribution of the real visual features and P_r denotes the prior distribution of the real semantic features; x̂ = ε·a + (1 − ε)·G1(x), ε ∈ [0, 1], denotes a linear interpolation between features, which obeys P_{r,f}, the prior distribution of the real visual features and the real semantic features; E_{x~P_f}[D1(G1(x))] denotes the expectation over the pseudo-feature distribution; E_{a~P_r}[D1(a)] denotes the expectation over the real-feature distribution; the third term denotes the gradient penalty enforcing the Lipschitz constraint, and λ2·CT|_{x′,x″} denotes the consistency (continuity) term added to constrain the gradient penalty; λ1 denotes the weight of the gradient penalty and λ2 denotes the weight of the consistency term; wherein

CT|_{x′,x″} = E[max(0, ||D(x′) − D(x″)|| − c·||x′ − x″||)];

x′ and x″ both denote perturbed data in the vicinity of the real visual features; c is a fixed constant; D(x′) denotes the discriminator output for the input x′ and D(x″) denotes it for the input x″; ||D(x′) − D(x″)|| denotes the distance between the two discriminator values, and ||x′ − x″|| denotes the distance between the two perturbed features;
a second loss function construction unit for constructing the adversarial loss function of the visual discriminator:

L_adv(D2) = E[D2(G2(ã))] − E[D2(x)];

where ã denotes the pseudo-semantic features; D2(x) denotes the output of the visual discriminator for the input visual features x; G2(ã) denotes the output of the visual feature generator for the input pseudo-semantic features ã; and D2(G2(ã)) denotes the output of the visual discriminator for the input G2(ã);
a third loss function construction unit for constructing the cycle-consistency loss function of the real visual features and the pseudo-visual features:

L_cyc = E[||G2(G1(x)) − x||_1] + E[||G1(G2(a)) − a||_1];

where E[||G2(G1(x)) − x||_1] denotes the expected distance between the two visual features measured by cycle consistency, and E[||G1(G2(a)) − a||_1] denotes the expected distance between the two semantic features measured by cycle consistency;
a fourth loss function construction unit for constructing the classification loss function of the semantic discriminator:

L_cls = −E[log P(b | G1(a); θ)];

where P(b | G1(a); θ) denotes the class-conditional probability of the class label b, G1(a) denotes the output of the semantic generator for the input semantic features a, θ denotes the parameters of the classification network, and b denotes the class label of a.
10. The system of claim 6, wherein the training module specifically comprises:

a training unit for taking the training image samples as the input of the semantic feature generator, and jointly training the semantic feature generator, the visual feature generator, the semantic discriminator and the visual discriminator by back propagation according to the multi-objective loss function, so that the parameters of the four networks are continuously updated and optimized, yielding the trained generative adversarial network model.
CN202010263452.4A 2020-04-07 2020-04-07 Zero sample image identification method and system based on generation countermeasure network Expired - Fee Related CN111476294B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010263452.4A CN111476294B (en) 2020-04-07 2020-04-07 Zero sample image identification method and system based on generation countermeasure network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010263452.4A CN111476294B (en) 2020-04-07 2020-04-07 Zero sample image identification method and system based on generation countermeasure network

Publications (2)

Publication Number Publication Date
CN111476294A true CN111476294A (en) 2020-07-31
CN111476294B CN111476294B (en) 2022-03-22

Family

ID=71749908

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010263452.4A Expired - Fee Related CN111476294B (en) 2020-04-07 2020-04-07 Zero sample image identification method and system based on generation countermeasure network

Country Status (1)

Country Link
CN (1) CN111476294B (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108875818A (en) * 2018-06-06 2018-11-23 西安交通大学 Based on variation from code machine and confrontation network integration zero sample image classification method
US20190378311A1 (en) * 2018-06-12 2019-12-12 Siemens Healthcare Gmbh Machine-Learned Network for Fourier Transform in Reconstruction for Medical Imaging
CN109460814A (en) * 2018-09-28 2019-03-12 浙江工业大学 A kind of deep learning classification method for attacking resisting sample function with defence
CN109816032A (en) * 2019-01-30 2019-05-28 中科人工智能创新技术研究院(青岛)有限公司 Zero sample classification method and apparatus of unbiased mapping based on production confrontation network
CN110334781A (en) * 2019-06-10 2019-10-15 大连理工大学 A kind of zero sample learning algorithm based on Res-Gan
CN110490946A (en) * 2019-07-15 2019-11-22 同济大学 Text generation image method based on cross-module state similarity and generation confrontation network
CN110443293A (en) * 2019-07-25 2019-11-12 天津大学 Based on double zero sample image classification methods for differentiating and generating confrontation network text and reconstructing
CN110795585A (en) * 2019-11-12 2020-02-14 福州大学 Zero sample image classification model based on generation countermeasure network and method thereof

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
JUN-YAN ZHU et al., "Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks", 2017 IEEE International Conference on Computer Vision (ICCV) *
张桂梅 et al., "Zero-shot attribute recognition based on de-redundant features and semantic relation constraints" (基于去冗余特征和语义关系约束的零样本属性识别), Pattern Recognition and Artificial Intelligence (模式识别与人工智能) *
张桂梅 et al., "Zero-shot text recognition combining transfer guidance and a bidirectional cyclic-structure GAN" (结合迁移引导和双向循环结构GAN的零样本文本识别), Pattern Recognition and Artificial Intelligence (模式识别与人工智能) *

Cited By (86)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111950619A (en) * 2020-08-05 2020-11-17 东北林业大学 Active learning method based on dual-generation countermeasure network
CN112069397A (en) * 2020-08-21 2020-12-11 三峡大学 Rumor detection method combining self-attention mechanism with generation of confrontation network
CN112069397B (en) * 2020-08-21 2023-08-04 三峡大学 Rumor detection method combining self-attention mechanism and generation of countermeasure network
CN112001122A (en) * 2020-08-26 2020-11-27 合肥工业大学 Non-contact physiological signal measuring method based on end-to-end generation countermeasure network
CN112001122B (en) * 2020-08-26 2023-09-26 合肥工业大学 Non-contact physiological signal measurement method based on end-to-end generation countermeasure network
CN112149802A (en) * 2020-09-17 2020-12-29 广西大学 Image content conversion method with consistent semantic structure
CN112101470A (en) * 2020-09-18 2020-12-18 上海电力大学 Guide zero sample identification method based on multi-channel Gauss GAN
CN112101470B (en) * 2020-09-18 2023-04-11 上海电力大学 Guide zero sample identification method based on multi-channel Gauss GAN
CN112199637A (en) * 2020-09-21 2021-01-08 浙江大学 Regression modeling method for generating countermeasure network data enhancement based on regression attention
CN112199637B (en) * 2020-09-21 2024-04-12 浙江大学 Regression modeling method for generating contrast network data enhancement based on regression attention
CN112232378A (en) * 2020-09-23 2021-01-15 中国人民解放军战略支援部队信息工程大学 Zero-order learning method for fMRI visual classification
CN112308113A (en) * 2020-09-23 2021-02-02 济南浪潮高新科技投资发展有限公司 Target identification method, device and medium based on semi-supervision
CN112364138A (en) * 2020-10-12 2021-02-12 上海交通大学 Visual question-answer data enhancement method and device based on anti-attack technology
CN112287779A (en) * 2020-10-19 2021-01-29 华南农业大学 Low-illuminance image natural illuminance reinforcing method and application
CN112287779B (en) * 2020-10-19 2022-03-25 华南农业大学 Low-illuminance image natural illuminance reinforcing method and application
CN112364894B (en) * 2020-10-23 2022-07-08 天津大学 Zero sample image classification method of countermeasure network based on meta-learning
CN112364894A (en) * 2020-10-23 2021-02-12 天津大学 Zero sample image classification method of countermeasure network based on meta-learning
CN112415514A (en) * 2020-11-16 2021-02-26 北京环境特性研究所 Target SAR image generation method and device
CN112415514B (en) * 2020-11-16 2023-05-02 北京环境特性研究所 Target SAR image generation method and device
CN113191381A (en) * 2020-12-04 2021-07-30 云南大学 Image zero-order classification model based on cross knowledge and classification method thereof
CN112560034A (en) * 2020-12-11 2021-03-26 宿迁学院 Malicious code sample synthesis method and device based on feedback type deep countermeasure network
CN112560034B (en) * 2020-12-11 2024-03-29 宿迁学院 Malicious code sample synthesis method and device based on feedback type deep countermeasure network
CN112667496B (en) * 2020-12-14 2022-11-18 清华大学 Black box countermeasure test sample generation method and device based on multiple prior
CN112667496A (en) * 2020-12-14 2021-04-16 清华大学 Black box countermeasure test sample generation method and device based on multiple prior
CN112580722A (en) * 2020-12-20 2021-03-30 大连理工大学人工智能大连研究院 Generalized zero sample image identification method based on conditional countermeasure automatic coding machine
CN112731327A (en) * 2020-12-25 2021-04-30 南昌航空大学 HRRP radar target identification method based on CN-LSGAN, STFT and CNN
CN112731327B (en) * 2020-12-25 2023-05-23 南昌航空大学 HRRP radar target identification method based on CN-LSGAN, STFT and CNN
WO2022142445A1 (en) * 2020-12-28 2022-07-07 ***股份有限公司 Model training method, and image quality evaluation method and apparatus
CN112767505B (en) * 2020-12-31 2023-12-22 深圳市联影高端医疗装备创新研究院 Image processing method, training device, electronic terminal and storage medium
CN112767505A (en) * 2020-12-31 2021-05-07 深圳市联影高端医疗装备创新研究院 Image processing method, training method, device, electronic terminal and storage medium
CN112767507A (en) * 2021-01-15 2021-05-07 大连理工大学 Cartoon sketch coloring method based on dynamic memory module and generation confrontation network
CN112767507B (en) * 2021-01-15 2022-11-18 大连理工大学 Cartoon sketch coloring method based on dynamic memory module and generation confrontation network
CN112766386A (en) * 2021-01-25 2021-05-07 大连理工大学 Generalized zero sample learning method based on multi-input multi-output fusion network
CN112766386B (en) * 2021-01-25 2022-09-20 大连理工大学 Generalized zero sample learning method based on multi-input multi-output fusion network
CN112818995A (en) * 2021-01-27 2021-05-18 北京达佳互联信息技术有限公司 Image classification method and device, electronic equipment and storage medium
CN112818995B (en) * 2021-01-27 2024-05-21 北京达佳互联信息技术有限公司 Image classification method, device, electronic equipment and storage medium
CN113283423B (en) * 2021-01-29 2022-08-16 南京理工大学 Natural scene distortion text image correction method and system based on generation network
CN113283423A (en) * 2021-01-29 2021-08-20 南京理工大学 Natural scene distortion text image correction method and system based on generation network
CN113221948A (en) * 2021-04-13 2021-08-06 复旦大学 Digital slice image classification method based on countermeasure generation network and weak supervised learning
CN113221948B (en) * 2021-04-13 2022-08-05 复旦大学 Digital slice image classification method based on countermeasure generation network and weak supervised learning
CN113222002B (en) * 2021-05-07 2024-04-05 西安交通大学 Zero sample classification method based on generative discriminative contrast optimization
CN113222002A (en) * 2021-05-07 2021-08-06 西安交通大学 Zero sample classification method based on generative discriminative contrast optimization
CN113140020A (en) * 2021-05-13 2021-07-20 电子科技大学 Method for generating image based on text of countermeasure network generated by accompanying supervision
CN113269274A (en) * 2021-06-18 2021-08-17 南昌航空大学 Zero sample identification method and system based on cycle consistency
CN113726545A (en) * 2021-06-23 2021-11-30 清华大学 Network traffic generation method and device for generating countermeasure network based on knowledge enhancement
CN113378959B (en) * 2021-06-24 2022-03-15 中国矿业大学 Zero sample learning method for generating countermeasure network based on semantic error correction
CN113378959A (en) * 2021-06-24 2021-09-10 中国矿业大学 Zero sample learning method for generating countermeasure network based on semantic error correction
CN113706645A (en) * 2021-06-30 2021-11-26 酷栈(宁波)创意科技有限公司 Information processing method for landscape painting
CN113609569B (en) * 2021-07-01 2023-06-09 湖州师范学院 Distinguishing type generalized zero sample learning fault diagnosis method
CN113609569A (en) * 2021-07-01 2021-11-05 湖州师范学院 Discriminant generalized zero-sample learning fault diagnosis method
CN113361646A (en) * 2021-07-01 2021-09-07 中国科学技术大学 Generalized zero sample image identification method and model based on semantic information retention
CN113537322A (en) * 2021-07-02 2021-10-22 电子科技大学 Zero sample visual classification method for cross-modal semantic enhancement generation countermeasure network
CN113537322B (en) * 2021-07-02 2023-04-18 电子科技大学 Zero sample visual classification method for cross-modal semantic enhancement generation countermeasure network
CN113505845A (en) * 2021-07-23 2021-10-15 黑龙江省博雅智睿科技发展有限责任公司 Deep learning training set image generation method based on language
CN113706379B (en) * 2021-07-29 2023-05-26 山东财经大学 Interlayer interpolation method and system based on medical image processing
CN113706379A (en) * 2021-07-29 2021-11-26 山东财经大学 Interlayer interpolation method and system based on medical image processing
CN113642621A (en) * 2021-08-03 2021-11-12 南京邮电大学 Zero sample image classification method based on generation countermeasure network
CN113657272A (en) * 2021-08-17 2021-11-16 山东建筑大学 Micro-video classification method and system based on missing data completion
CN113746087B (en) * 2021-08-19 2023-03-21 浙江大学 Power grid transient stability sample controllable generation and evaluation method and system based on CTGAN
CN113746087A (en) * 2021-08-19 2021-12-03 浙江大学 Power grid transient stability sample controllable generation and evaluation method and system based on CTGAN
CN113763442A (en) * 2021-09-07 2021-12-07 南昌航空大学 Deformable medical image registration method and system
CN113763442B (en) * 2021-09-07 2023-06-13 南昌航空大学 Deformable medical image registration method and system
CN113762180A (en) * 2021-09-13 2021-12-07 中国科学技术大学 Training method and system for human body activity imaging based on millimeter wave radar signals
CN113762180B (en) * 2021-09-13 2023-09-01 中国科学技术大学 Training method and system for human body activity imaging based on millimeter wave radar signals
CN113806584A (en) * 2021-09-17 2021-12-17 河海大学 Self-supervision cross-modal perception loss-based method for generating command actions of band
CN114176549B (en) * 2021-12-23 2024-04-16 杭州电子科技大学 Fetal heart rate signal data enhancement method and device based on generation type countermeasure network
CN114176549A (en) * 2021-12-23 2022-03-15 杭州电子科技大学 Fetal heart rate signal data enhancement method and device based on generative countermeasure network
CN114005005A (en) * 2021-12-30 2022-02-01 深圳佑驾创新科技有限公司 Double-batch standardized zero-instance image classification method
CN114511737A (en) * 2022-01-24 2022-05-17 北京建筑大学 Training method of image recognition domain generalization model
CN114723611A (en) * 2022-06-10 2022-07-08 季华实验室 Image reconstruction model training method, reconstruction method, device, equipment and medium
CN114757342A (en) * 2022-06-14 2022-07-15 南昌大学 Electronic data information evidence-obtaining method based on confrontation training
CN114757342B (en) * 2022-06-14 2022-09-09 南昌大学 Electronic data information evidence-obtaining method based on confrontation training
CN115314254B (en) * 2022-07-07 2023-06-23 中国人民解放军战略支援部队信息工程大学 Semi-supervised malicious traffic detection method based on improved WGAN-GP
CN115314254A (en) * 2022-07-07 2022-11-08 中国人民解放军战略支援部队信息工程大学 Semi-supervised malicious flow detection method based on improved WGAN-GP
CN115222752B (en) * 2022-09-19 2023-01-24 之江实验室 Pathological image feature extractor training method and device based on feature decoupling
CN115222752A (en) * 2022-09-19 2022-10-21 之江实验室 Pathological image feature extractor training method and device based on feature decoupling
CN115424119B (en) * 2022-11-04 2023-03-24 之江实验室 Image generation training method and device capable of explaining GAN based on semantic fractal
CN115424119A (en) * 2022-11-04 2022-12-02 之江实验室 Semantic fractal-based interpretable GAN image generation training method and device
CN115527216B (en) * 2022-11-09 2023-05-23 中国矿业大学(北京) Text image generation method based on modulation fusion and antagonism network generation
CN115527216A (en) * 2022-11-09 2022-12-27 中国矿业大学(北京) Text image generation method based on modulation fusion and generation countermeasure network
CN116579414B (en) * 2023-03-24 2024-04-02 浙江医准智能科技有限公司 Model training method, MRI thin layer data reconstruction method, device and equipment
CN116579414A (en) * 2023-03-24 2023-08-11 北京医准智能科技有限公司 Model training method, MRI thin layer data reconstruction method, device and equipment
CN117541883A (en) * 2024-01-09 2024-02-09 四川见山科技有限责任公司 Image generation model training, image generation method, system and electronic equipment
CN117541883B (en) * 2024-01-09 2024-04-09 四川见山科技有限责任公司 Image generation model training, image generation method, system and electronic equipment
CN117610614A (en) * 2024-01-11 2024-02-27 四川大学 Attention-guided generation countermeasure network zero sample nuclear power seal detection method
CN117610614B (en) * 2024-01-11 2024-03-22 四川大学 Attention-guided generation countermeasure network zero sample nuclear power seal detection method

Also Published As

Publication number Publication date
CN111476294B (en) 2022-03-22

Similar Documents

Publication Publication Date Title
CN111476294B (en) Zero sample image identification method and system based on generation countermeasure network
CN110147457B (en) Image-text matching method, device, storage medium and equipment
CN108875818B (en) Zero sample image classification method based on combination of variational self-coding machine and antagonistic network
CN110059217B (en) Image text cross-media retrieval method for two-stage network
CN111881262B (en) Text emotion analysis method based on multi-channel neural network
CN110232395B (en) Power system fault diagnosis method based on fault Chinese text
CN109766277A (en) A kind of software fault diagnosis method based on transfer learning and DNN
KR102095892B1 (en) Method, apparatus and system for determining similarity of patent documents using artificial intelligence model
CN112732916A (en) BERT-based multi-feature fusion fuzzy text classification model
CN113204675B (en) Cross-modal video time retrieval method based on cross-modal object inference network
CN113254678A (en) Training method of cross-media retrieval model, cross-media retrieval method and equipment thereof
Tang et al. Class-level prototype guided multiscale feature learning for remote sensing scene classification with limited labels
CN113378919B (en) Image description generation method for fusing visual sense and enhancing multilayer global features
Xie et al. Writer-independent online signature verification based on 2D representation of time series data using triplet supervised network
Huang et al. An effective multimodal representation and fusion method for multimodal intent recognition
Nijhawan et al. VTnet+ Handcrafted based approach for food cuisines classification
CN113420833A (en) Visual question-answering method and device based on question semantic mapping
Hu et al. Decouple the object: Component-level semantic recognizer for point clouds classification
CN115640418A (en) Cross-domain multi-view target website retrieval method and device based on residual semantic consistency
CN113723111B (en) Small sample intention recognition method, device, equipment and storage medium
Li et al. Evaluating BERT on cloud-edge time series forecasting and sentiment analysis via prompt learning
Gong et al. KDCTime: Knowledge distillation with calibration on InceptionTime for time-series classification
Zhang et al. Deep captioning with attention-based visual concept transfer mechanism for enriching description
Raboh et al. Learning latent scene-graph representations for referring relationships
Wang et al. Contrastive embedding-based feature generation for generalized zero-shot learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee (granted publication date: 20220322)