CN112529772B - Unsupervised image conversion method under zero sample setting - Google Patents


Info

Publication number
CN112529772B
CN112529772B CN202011501620.5A
Authority
CN
China
Prior art keywords
attribute
image
category
space
image conversion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011501620.5A
Other languages
Chinese (zh)
Other versions
CN112529772A (en)
Inventor
陈元祺
余晓铭
刘杉
李革
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Instritute Of Intelligent Video Audio Technology Longgang Shenzhen
Original Assignee
Instritute Of Intelligent Video Audio Technology Longgang Shenzhen
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Instritute Of Intelligent Video Audio Technology Longgang Shenzhen filed Critical Instritute Of Intelligent Video Audio Technology Longgang Shenzhen
Priority to CN202011501620.5A priority Critical patent/CN112529772B/en
Publication of CN112529772A publication Critical patent/CN112529772A/en
Application granted granted Critical
Publication of CN112529772B publication Critical patent/CN112529772B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/04Context-preserving transformations, e.g. by using an importance map
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

An unsupervised image conversion method under a zero-sample setting includes applying an attribute-visual association constraint and expanding the attribute space with unseen attributes, where the two are performed synchronously. By applying the attribute-visual association constraint and expanding the attribute space with unseen attributes, the model is prompted to fully exploit the attribute features of each category, thereby enabling unsupervised image conversion under zero samples.

Description

Unsupervised image conversion method under zero sample setting
Technical Field
The invention relates to the field of image generation and image conversion, and in particular to an unsupervised image conversion method under a zero-sample setting.
Background
In recent years, with the development of generative adversarial networks (GANs), generative models have received more and more attention. On the one hand, GAN-based generative models show surprisingly good results: the generated images are both high-resolution and visually convincing enough to pass for real. On the other hand, as the famous physicist Feynman put it, "What I cannot create, I do not understand." Although machine learning models in recent years excel at tasks such as image classification, the success of these applications does not imply that we truly understand images or have achieved intelligence. Being able to generate images is significant for a deeper understanding of images.
Image-to-image translation is a branch of generative modeling that belongs to the conditional generative models, with the input image serving as the condition. It studies how to transform an image from one domain into a corresponding image in another domain. For example, an image photographed during the daytime is converted into a night scene while the scene content is kept unchanged. This is a challenging task: first, the output of the model should be both realistic and carry the characteristics of the target domain; second, the model should preserve the individual characteristics of the input and not turn the converted result into a completely different picture. The second issue is also called mode collapse, i.e., the output collapses into a few modes, and the network produces the same single result even when given different inputs.
The above problems are well addressed in the supervised setting. With paired datasets (such as daytime and nighttime images of the same scene), the converted image can be constrained to approximate the ground-truth image in the target domain after the transition from the source domain. However, in many real-world scenarios, paired samples cannot be obtained at low cost, if they exist at all. In this case, how to train the image conversion model without supervision is the difficulty. Furthermore, the mode collapse problem in image conversion is particularly acute when some classes have too few samples, or even no samples at all. In summary, unsupervised image conversion under the zero-sample setting is a challenging problem.
Disclosure of Invention
The invention provides an unsupervised image conversion method under a zero-sample setting, which realizes unsupervised image conversion with zero samples.
The technical scheme of the invention is as follows:
The invention relates to an unsupervised image conversion method under a zero-sample setting, comprising the steps of applying an attribute-visual association constraint and expanding the attribute space with unseen attributes, wherein the attribute-visual association constraint is applied and the attribute space is expanded with unseen attributes synchronously.
Preferably, in the above-mentioned unsupervised image conversion method under the zero-sample setting, applying the attribute-visual association constraint includes the following steps: two seen category attributes a_m and a_n are sampled from the attribute space and their relevance s(a_m, a_n) is calculated; following the adaptive instance normalization (AdaIN) method from style transfer, the visual features w_m and w_n of the visual space determined by the two seen category attributes a_m and a_n are computed, and their relevance s(w_m, w_n) is calculated; the association constraint is then applied: the regularization term L_reg = ||s(a_m, a_n) - s(w_m, w_n)||^2 is imposed on the relevance of the two seen category attributes a_m and a_n and the relevance of the visual features w_m and w_n determined by them.
Preferably, in the above-mentioned unsupervised image conversion method under the zero-sample setting, the attribute space is expanded with unseen attributes through the following steps: the unseen category attribute a_u and the input image x_i are sampled, and an image x_t is generated by the generator; a loss function constrains the image x_t to have the features of the unseen category attribute a_u; and the discriminator performs attribute regression, expanding the attribute space.
According to the technical scheme of the invention, the beneficial effects are that:
According to the method, the attribute-visual relevance constraint is applied and the attribute space is expanded with unseen attributes, so that the model fully exploits the attribute features of each category, thereby realizing unsupervised image conversion under the zero-sample setting.
For a better understanding and explanation of the conception, working principle and inventive effect of the present invention, the present invention will be explained in detail below by means of specific embodiments with reference to the drawings.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application.
Fig. 1 is a general block diagram of an unsupervised image conversion method under a zero sample setting of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to fall within the scope of the present invention.
The image conversion model involved in the unsupervised image conversion method under the zero-sample setting of the present invention is based on a generative adversarial network and comprises a generator and a discriminator (shown in Fig. 1). Training a generative adversarial network is a minimax game, in which the goal of the generator is to generate samples realistic enough to fool the discriminator, while the discriminator attempts to distinguish samples from the real data distribution from generated ones. Once training reaches a stable stage, the generator is able to produce high-quality samples that the discriminator can hardly distinguish from real ones.
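The minimax game described above can be sketched with the standard GAN objective. The losses below are illustrative stand-ins, not the patent's exact formulation:

```python
import numpy as np

def discriminator_loss(d_real, d_fake):
    # The discriminator wants real samples scored near 1 and generated
    # samples scored near 0 (binary cross-entropy form of the minimax game).
    return float(-np.mean(np.log(d_real) + np.log(1.0 - d_fake)))

def generator_loss(d_fake):
    # The generator wants the discriminator to score its samples as real.
    return float(-np.mean(np.log(d_fake)))
```

A discriminator that separates real from fake well (scores 0.9 / 0.1) incurs a lower loss than an undecided one (0.5 / 0.5), which is what drives the game toward the stable stage described above.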
In the zero-sample case, a portion of the classes lack image sample data; these are referred to as unseen classes. For unseen classes, the only available knowledge is the attribute features each of them holds. The image conversion model under the zero-sample setting takes the image to be converted and a category attribute as input, and converts the image into the target category. The key to the problem is how to transfer knowledge from the seen categories to the unseen categories, and how to make the model perform the conversion according to the category attributes.
The working principle of the invention is as follows: by applying the attribute-visual association constraint and expanding the attribute space with unseen attributes, the model is prompted to fully exploit the attribute features of each category, thereby enabling unsupervised image conversion under the zero-sample setting.
The attribute-visual association constraint means that the relevance maintained by an attribute pair in the attribute space and the relevance of the image pair converted according to that attribute pair should be consistent. Because no image samples of the unseen categories are available during training, the attribute vectors must be exploited to provide effective guidance for image conversion. Introducing the attribute-visual relevance constraint guides the learned visual space to preserve the structure of the attribute space, thereby facilitating image conversion for the unseen categories.
Expanding the attribute space with unseen attributes means bringing the attributes of the unseen categories into the training process. Although image samples of the unseen categories cannot be obtained, their category attributes can. In the training phase, an unseen attribute is fed into the image conversion model together with the input image, and the model is required to make that attribute recoverable from the converted image. This strategy better avoids the mapping bias in zero-sample image conversion, namely the tendency of the conversion model to map unseen categories onto similar seen categories, thereby narrowing the conversion performance gap between seen and unseen categories.
Fig. 1 is the overall framework diagram of the unsupervised image conversion method under the zero-sample setting of the present invention. As shown, the method comprises two strategies. Strategy 1: apply the attribute-visual association constraint. Strategy 2: expand the attribute space with unseen attributes. The two strategies are carried out synchronously.
Wherein applying the attribute-visual association constraint comprises the steps of:
1) Two seen category attributes a_m and a_n (i.e., attribute 1 and attribute 2 in the attribute space of Fig. 1) are sampled from the attribute space, and their relevance s(a_m, a_n) is calculated;
2) Following the adaptive instance normalization (AdaIN) method from style transfer, the visual features w_m and w_n of the visual space (corresponding to image 1 and image 2 in the visual space of Fig. 1) determined by the two seen category attributes a_m and a_n are computed, and their relevance s(w_m, w_n) is calculated; and
3) The association constraint is applied: the regularization term L_reg = ||s(a_m, a_n) - s(w_m, w_n)||^2 is imposed on the relevance of the two seen category attributes a_m and a_n and the relevance of the visual features w_m and w_n determined by them.
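Taken together, these steps can be sketched as follows. The relevance measure s(·,·) is taken here to be cosine similarity, which is an assumption: the text only names it a relevance function. The AdaIN stand-in operates on toy 1-D feature vectors rather than real feature maps:

```python
import numpy as np

def s(u, v):
    # Relevance of a pair of vectors; cosine similarity is assumed here.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def adain(content, style, eps=1e-5):
    # Adaptive instance normalization (AdaIN): renormalize the content
    # feature statistics to match those of the style (attribute) feature.
    normalized = (content - content.mean()) / (content.std() + eps)
    return normalized * style.std() + style.mean()

def l_reg(a_m, a_n, w_m, w_n):
    # Regularization term L_reg = ||s(a_m, a_n) - s(w_m, w_n)||^2,
    # pulling attribute-space relevance and visual-space relevance together.
    return (s(a_m, a_n) - s(w_m, w_n)) ** 2
```

When the visual-space relevance already matches the attribute-space relevance, L_reg is zero; any mismatch is penalized quadratically.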
Expanding the attribute space with unseen attributes comprises the steps of:
1) The unseen category attribute a_u and the input image x_i are sampled, and an image x_t is generated by the generator;
2) A loss function constrains the generated image x_t to have the features of the unseen category attribute a_u; and
3) The discriminator performs attribute regression, expanding the attribute space.
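A toy sketch of these steps follows. The generator and the discriminator's attribute-regression head are stand-in linear maps, and the L2 penalty between the regressed attributes and a_u is an assumed form, since the exact loss is not reproduced in this text:

```python
import numpy as np

rng = np.random.default_rng(0)
attr_dim, img_dim = 4, 8
W = rng.normal(size=(attr_dim, img_dim))   # toy generator conditioning map
V = rng.normal(size=(img_dim, attr_dim))   # toy attribute-regression head

def generate(x_i, a_u):
    # Toy "generator": injects the unseen attribute into the input
    # image features (a real model would use a conditioned network).
    return x_i + a_u @ W

def attribute_regression_loss(x_t, a_u):
    # Constrain x_t to carry a_u: regress attributes back from x_t with
    # the discriminator head and penalize the distance to a_u.
    return float(np.mean((x_t @ V - a_u) ** 2))
```

Minimizing this loss during training is what forces the unseen attribute to be recoverable from the converted image, as described above.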
Compared with existing image conversion methods, the method of the present invention achieves better conversion accuracy and better generation quality. The two concepts of conversion accuracy and generation quality in image conversion, together with their associated evaluation indices, are explained below.
Conversion accuracy: measures whether an image, after conversion, belongs to the target domain. The probability that the converted image belongs to the target domain is judged by a pre-trained classifier. The evaluation indices include Top-1 and Top-5 classification accuracy: for a picture, if the top one (or top five) predicted classes contain the correct answer, the prediction is counted as correct.
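The Top-1 / Top-5 accuracy just described can be computed as below (the probabilities and labels are hypothetical toy data, not results from the patent):

```python
import numpy as np

def top_k_accuracy(probs, labels, k):
    # probs: (N, C) class probabilities from the pretrained classifier.
    # A sample counts as correct if its true label appears among the
    # k highest-probability classes.
    topk = np.argsort(probs, axis=1)[:, -k:]
    hits = [label in row for row, label in zip(topk, labels)]
    return sum(hits) / len(labels)
```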
Generation quality: measures whether the converted image has high image quality. Evaluation is divided into objective and subjective assessment. The Fréchet Inception Distance (FID) is a commonly used objective measure of generation quality. To calculate the FID of an image conversion model, a batch of converted images is first generated with the model, and a batch of images is sampled from the dataset for comparison. Features are then extracted from both batches, their statistics are computed, and the difference between the distributions of the generated and real images is measured from these statistics as an assessment of generation quality. For subjective evaluation, the conversion results of several models are typically shown to subjects simultaneously, who pick out the highest-quality image. After many trials, the model with the higher selection rate is considered to have higher generation quality.
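The FID computation described above follows the pattern below. This sketch assumes diagonal covariances to avoid the matrix square root of the full formulation, so it is a simplification of the standard FID, not the metric itself:

```python
import numpy as np

def frechet_distance_diag(feats_real, feats_gen):
    # Fréchet distance between two Gaussians fitted to feature batches,
    # simplified with diagonal covariances; the standard FID uses full
    # covariance matrices and a matrix square root term.
    mu_r, mu_g = feats_real.mean(axis=0), feats_gen.mean(axis=0)
    var_r, var_g = feats_real.var(axis=0), feats_gen.var(axis=0)
    return float(np.sum((mu_r - mu_g) ** 2)
                 + np.sum(var_r + var_g - 2.0 * np.sqrt(var_r * var_g)))
```

Identical feature batches give a distance of zero, and the distance grows as the generated-feature distribution drifts away from the real one, which is why lower FID indicates better generation quality.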
As shown in Table 1, the objective comparison of the present invention with other algorithms includes conversion accuracy for both seen and unseen categories and the generation quality index FID. Compared with existing models (FUNIT-1 and FUNIT-5 are not under the zero-sample setting and thus constitute an unfair comparison; StarGAN is under the zero-sample setting), the present invention achieves better results on both the CUB and FLO datasets, with the improvement being especially significant for the unseen categories.
TABLE 1 comparison of objective indicators of the results of the present invention with other algorithms
As shown in Table 2, for subjective evaluation, when the conversion results of several models are presented to subjects simultaneously, the present invention is chosen far more often than StarGAN, which is also under the zero-sample setting; the present invention also delivers competitive results against FUNIT-1 and FUNIT-5, which operate under a few-sample setting.
Table 2 shows the subjective index comparison of the results of the present invention with other algorithms.
Model                  CUB dataset   FLO dataset
FUNIT-1                27.8%         21.8%
FUNIT-5                34.2%         27.8%
StarGAN                7.8%          14.3%
The present invention  30.2%         36.1%
The foregoing is only illustrative of the present invention and is not intended to limit it; any modifications, equivalent replacements, improvements, and the like made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (2)

1. An unsupervised image conversion method under a zero-sample setting, comprising applying an attribute-visual association constraint and expanding the attribute space with unseen attributes, wherein said applying of the attribute-visual association constraint and said expanding of the attribute space with unseen attributes are performed synchronously;
The applying of the attribute-visual association constraint comprises the steps of:
sampling two seen category attributes a_m and a_n from the attribute space and calculating their relevance s(a_m, a_n); following the adaptive instance normalization (AdaIN) method from style transfer, computing the visual features w_m and w_n of the visual space determined by the two seen category attributes a_m and a_n, and calculating their relevance s(w_m, w_n); and applying the association constraint: imposing the regularization term L_reg = ||s(a_m, a_n) - s(w_m, w_n)||^2 on the relevance of the two seen category attributes a_m and a_n and the relevance of the visual features w_m and w_n determined by them.
2. The unsupervised image conversion method under a zero-sample setting according to claim 1, wherein expanding the attribute space with unseen attributes comprises the steps of:
sampling the unseen category attribute a_u and the input image x_i, and generating an image x_t with the generator; constraining, by a loss function, the image x_t to have the features of the unseen category attribute a_u; and performing attribute regression with the discriminator, expanding the attribute space.
CN202011501620.5A 2020-12-18 2020-12-18 Unsupervised image conversion method under zero sample setting Active CN112529772B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011501620.5A CN112529772B (en) 2020-12-18 2020-12-18 Unsupervised image conversion method under zero sample setting


Publications (2)

Publication Number Publication Date
CN112529772A CN112529772A (en) 2021-03-19
CN112529772B true CN112529772B (en) 2024-05-28

Family

ID=75001328

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011501620.5A Active CN112529772B (en) 2020-12-18 2020-12-18 Unsupervised image conversion method under zero sample setting

Country Status (1)

Country Link
CN (1) CN112529772B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI769820B (en) * 2021-05-19 2022-07-01 鴻海精密工業股份有限公司 Method for optimizing the generative adversarial network and electronic equipment

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105740888A (en) * 2016-01-26 2016-07-06 天津大学 Joint embedded model for zero sample learning
CN109359670A (en) * 2018-09-18 2019-02-19 北京工业大学 A kind of individual strength of association automatic testing method based on traffic big data
CN109582960A (en) * 2018-11-27 2019-04-05 上海交通大学 The zero learn-by-example method based on structured asso- ciation semantic embedding
CN109598279A (en) * 2018-09-27 2019-04-09 天津大学 Based on the zero sample learning method for generating network from coding confrontation
CN110097095A (en) * 2019-04-15 2019-08-06 天津大学 A kind of zero sample classification method generating confrontation network based on multiple view
CN110163796A (en) * 2019-05-29 2019-08-23 北方民族大学 A kind of image generating method and frame that unsupervised multi-modal confrontation encodes certainly
CN110795585A (en) * 2019-11-12 2020-02-14 福州大学 Zero sample image classification model based on generation countermeasure network and method thereof


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on image style transfer methods; Hou Yubing; China New Telecommunications; 2020-09-05 (No. 17); full text *

Also Published As

Publication number Publication date
CN112529772A (en) 2021-03-19

Similar Documents

Publication Publication Date Title
Wang et al. Minegan: effective knowledge transfer from gans to target domains with few images
Bielski et al. Emergence of object segmentation in perturbed generative models
CN110163258B (en) Zero sample learning method and system based on semantic attribute attention redistribution mechanism
CN111754596B (en) Editing model generation method, device, equipment and medium for editing face image
CN110378844B (en) Image blind motion blur removing method based on cyclic multi-scale generation countermeasure network
CN109711426B (en) Pathological image classification device and method based on GAN and transfer learning
US9152926B2 (en) Systems, methods, and media for updating a classifier
CN112784929A (en) Small sample image classification method and device based on double-element group expansion
CN110543832A (en) Electroencephalogram data classification method based on random forest and convolutional neural network
Teo et al. Fair generative models via transfer learning
CN111027610B (en) Image feature fusion method, apparatus, and medium
CN112529772B (en) Unsupervised image conversion method under zero sample setting
CN115952493A (en) Reverse attack method and attack device for black box model and storage medium
KR20200058295A (en) Method and Device of High Magnetic Field Magnetic Resonance Image Synthesis
CN112995433B (en) Time sequence video generation method and device, computing equipment and storage medium
CN112488238B (en) Hybrid anomaly detection method based on countermeasure self-encoder
US20220101145A1 (en) Training energy-based variational autoencoders
Tayyub et al. Explaining deep neural networks for point clouds using gradient-based visualisations
WO2021171384A1 (en) Clustering device, clustering method, and clustering program
JP7148078B2 (en) Attribute estimation device, attribute estimation method, attribute estimator learning device, and program
Lyu et al. DeCapsGAN: generative adversarial capsule network for image denoising
KR20210107261A (en) Method and apparatus for clustering data by using latent vector
CN113486925B (en) Model training method, fundus image generation method, model evaluation method and device
Khalesi et al. Synthetic Data Augmentation to Aid Small Training Datasets
EP4386657A1 (en) Image optimization method and apparatus, electronic device, medium, and program product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant