CN112529772B - Unsupervised image conversion method under zero sample setting - Google Patents
- Publication number
- CN112529772B (application CN202011501620.5A)
- Authority
- CN
- China
- Prior art keywords
- attribute
- image
- category
- space
- image conversion
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/04—Context-preserving transformations, e.g. by using an importance map
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/088—Non-supervised learning, e.g. competitive learning
Abstract
An unsupervised image conversion method under a zero-sample setting applies an attribute-visual association constraint and expands the attribute space with unseen attributes, the two strategies being performed simultaneously. Together they prompt the model to fully exploit the attribute features of each category, thereby enabling unsupervised image conversion with zero samples.
Description
Technical Field
The invention relates to the field of image generation and image conversion, and in particular to an unsupervised image conversion method under a zero-sample setting.
Background
In recent years, with the development of generative adversarial networks, generative models have received increasing attention. On the one hand, models based on generative adversarial networks produce strikingly good results: the generated images are both high-resolution and realistic enough to pass for genuine photographs. On the other hand, as the famous physicist Richard Feynman put it, "What I cannot create, I do not understand." Although machine learning models have excelled in recent years at tasks such as image classification, the success of these applications does not mean that we truly understand images or have achieved real intelligence. Being able to generate images is therefore significant for a deeper understanding of images.
Image-to-image translation is a branch of generative modeling: it is a conditional generative model whose condition is the input image. It studies how to transform an image from one domain into the corresponding image in another domain, for example converting a photograph taken in the daytime into a night scene while keeping the scene unchanged. This is a challenging task. First, the output of the model should be realistic and should carry the characteristics of the target domain. Second, the model should preserve the individual characteristics of the input and should not turn the converted result into a completely different picture. The second problem is also known as mode collapse: the output collapses into a few modes, and the network produces the same single result even when given different inputs.
The above problems are well addressed in the supervised case. With paired datasets (such as daytime and nighttime images of the same scene), the model can be constrained to approximate the ground-truth image corresponding to the input after translation from the source domain to the target domain. In many real scenarios, however, paired samples cannot be obtained at low cost, or do not exist at all. In that case, how to train an image conversion model without supervision becomes a difficulty. Furthermore, the mode-collapse problem in image conversion is particularly acute when some classes have too few samples, or no samples at all. In summary, unsupervised image conversion under a zero-sample setting is a challenging problem.
Disclosure of Invention
The invention provides an unsupervised image conversion method under a zero-sample setting, which realizes unsupervised image conversion with zero samples.
The technical scheme of the invention is as follows:
The invention relates to an unsupervised image conversion method under a zero-sample setting, which comprises applying an attribute-visual association constraint and expanding the attribute space with unseen attributes, wherein the two strategies are performed simultaneously.
Preferably, in the above unsupervised image conversion method under the zero-sample setting, applying the attribute-visual association constraint comprises the following steps: two seen category attributes a_m and a_n are sampled from the attribute space, and their correlation s(a_m, a_n) is calculated; according to the adaptive instance normalization (AdaIN) method from style transfer, the visual features w_m and w_n of the visual space determined by the two seen category attributes a_m and a_n are computed, and their correlation s(w_m, w_n) is calculated; and an association constraint is applied: the regularization term L_reg = ||s(a_m, a_n) - s(w_m, w_n)||^2 is imposed on the correlation of the two seen category attributes a_m and a_n and the correlation of the visual features w_m and w_n determined by them.
Preferably, in the above unsupervised image conversion method under the zero-sample setting, expanding the attribute space with unseen attributes comprises the following steps: an unseen category attribute a_u and an input image x_i are sampled, and an image x_t is generated by the generator; a loss function constrains the generated image x_t to carry the features of the unseen category attribute a_u; and the discriminator performs attribute regression, thereby expanding the attribute space.
The technical scheme of the invention has the following beneficial effects:
By applying the attribute-visual association constraint and expanding the attribute space with unseen attributes, the method prompts the model to fully exploit the attribute features of each category, thereby realizing unsupervised image conversion with zero samples.
For a better understanding and explanation of the conception, working principle and inventive effect of the present invention, the present invention will be explained in detail below by means of specific embodiments with reference to the drawings.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application.
Fig. 1 is a general block diagram of an unsupervised image conversion method under a zero sample setting of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to fall within the scope of the present invention.
The image conversion model used in the unsupervised image conversion method under the zero-sample setting of the invention is based on a generative adversarial network and comprises a generator and a discriminator (shown in Fig. 1). Training a generative adversarial network is a min-max game: the generator aims to generate samples realistic enough to fool the discriminator, while the discriminator tries to distinguish samples drawn from the real data distribution from generated samples. When training reaches the stable phase, the generator produces high-quality samples, and the discriminator can hardly distinguish them from real samples.
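The adversarial game described above can be summarized by the standard GAN value function V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))]. The following is a minimal numpy sketch, an illustration rather than the patent's implementation, that evaluates this value function on discriminator outputs; at the theoretical equilibrium the discriminator outputs 0.5 everywhere, giving V = -log 4.

```python
import numpy as np

def gan_value(d_real, d_fake):
    """GAN value function V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))].

    d_real: discriminator probabilities on real samples.
    d_fake: discriminator probabilities on generated samples.
    """
    eps = 1e-8  # guard against log(0)
    return np.mean(np.log(d_real + eps)) + np.mean(np.log(1.0 - d_fake + eps))

# A sharp discriminator (real -> 0.9, fake -> 0.1) achieves a high value.
v_sharp = gan_value(np.array([0.9, 0.9]), np.array([0.1, 0.1]))

# At the theoretical equilibrium D outputs 0.5 everywhere, so V = -log 4.
v_eq = gan_value(np.full(4, 0.5), np.full(4, 0.5))
```

The generator minimizes V while the discriminator maximizes it; the stable phase of training described above corresponds to the -log 4 equilibrium, where neither player can improve.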
In the zero-sample case, a portion of the classes, referred to as the unseen classes, have no image sample data. For the unseen classes, the only available information is the attribute features each of them holds. The image conversion model under the zero-sample setting takes the image to be converted and a category attribute as input, and converts the image into the target category. The key to the problem is how to transfer knowledge from the seen categories to the unseen categories, and how to make the model perform the conversion according to the category attributes.
The working principle of the invention is as follows: by applying the attribute-visual association constraint and expanding the attribute space with unseen attributes, the model is prompted to fully exploit the attribute features of each category, thereby enabling unsupervised image conversion with zero samples.
The attribute-visual association constraint requires that the association held by an attribute pair in the attribute space and the association of the image pair converted according to that attribute pair be consistent. Because no image samples of the unseen categories exist in the training phase, the attribute vectors must be used to provide effective guidance for image conversion. Introducing the attribute-visual association constraint guides the learned visual space to preserve the structure of the attribute space, which facilitates image conversion for the unseen categories.
Expanding the attribute space with unseen attributes refers to bringing the attributes of the unseen categories into the training process. Although image samples of the unseen categories cannot be obtained, their category attributes can. In the training phase, an unseen attribute is fed into the image conversion model together with an input image, and the model is forced to make that attribute recoverable from the converted image. This strategy mitigates the mapping bias in zero-sample image conversion, whereby the conversion model is biased toward converting unseen categories into similar seen categories, and thus narrows the performance gap between seen and unseen categories.
Fig. 1 is the overall framework diagram of the unsupervised image conversion method under the zero-sample setting of the present invention. As shown, the method comprises two strategies: strategy 1, applying the attribute-visual association constraint, and strategy 2, expanding the attribute space with unseen attributes, where the two strategies are carried out simultaneously.
Applying the attribute-visual association constraint comprises the following steps:
1) Two seen category attributes a_m and a_n (i.e., attribute 1 and attribute 2 in the attribute space of Fig. 1) are sampled from the attribute space, and their correlation s(a_m, a_n) is calculated;
2) According to the adaptive instance normalization (AdaIN) method from style transfer, the visual features w_m and w_n of the visual space determined by the two seen category attributes (corresponding to image 1 and image 2 in the visual space of Fig. 1) are computed, and their correlation s(w_m, w_n) is calculated; and
3) The association constraint is applied: the regularization term L_reg = ||s(a_m, a_n) - s(w_m, w_n)||^2 is imposed on the correlation of the two seen category attributes and the correlation of the visual features determined by them.
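The steps above can be sketched in numpy as follows. This is a simplified illustration, not the patent's implementation: the correlation function s is taken here to be cosine similarity (the patent does not specify the choice of s), and the visual features w_m, w_n are passed in directly instead of being produced by the AdaIN-based generator.

```python
import numpy as np

def cosine(u, v):
    """Correlation s(u, v); cosine similarity is an assumed choice of s."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def association_regularizer(a_m, a_n, w_m, w_n):
    """L_reg = || s(a_m, a_n) - s(w_m, w_n) ||^2: the correlation of an
    attribute pair must match the correlation of its visual features."""
    return (cosine(a_m, a_n) - cosine(w_m, w_n)) ** 2

rng = np.random.default_rng(0)
a_m, a_n = rng.normal(size=16), rng.normal(size=16)

# If the visual space exactly preserves the attribute-space geometry
# (here: identical vectors), the regularizer vanishes.
loss_zero = association_regularizer(a_m, a_n, a_m.copy(), a_n.copy())

# Orthogonal attributes mapped to identical visual features are penalized.
e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
loss_broken = association_regularizer(e1, e2, e1, e1)
```

Minimizing this regularizer is what guides the learned visual space to preserve the structure of the attribute space, as described above.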
Expanding the attribute space with unseen attributes comprises the following steps:
1) An unseen category attribute a_u and an input image x_i are sampled, and an image x_t is generated by the generator;
2) A loss function constrains the generated image x_t to carry the features of the unseen category attribute a_u; and
3) The discriminator performs attribute regression, thereby expanding the attribute space.
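A minimal numpy sketch of the attribute regression in step 3, under assumptions not fixed by the patent: the discriminator is modeled as a regression head whose output on the generated image is compared with the sampled unseen attribute a_u via a mean-squared error (the patent's exact loss formula is not reproduced in this text, so the MSE form is an assumption).

```python
import numpy as np

def attribute_regression_loss(predicted_attr, a_u):
    """Mean-squared error between the discriminator's attribute-regression
    output on the generated image x_t and the sampled unseen attribute a_u.
    (Assumed loss form; the exact formula is not given in this text.)"""
    predicted_attr = np.asarray(predicted_attr, dtype=float)
    a_u = np.asarray(a_u, dtype=float)
    return float(np.mean((predicted_attr - a_u) ** 2))

a_u = np.array([1.0, 0.0, 1.0])    # sampled unseen category attribute
pred = np.array([0.9, 0.1, 0.8])   # hypothetical regression-head output
loss = attribute_regression_loss(pred, a_u)
```

Driving this loss toward zero forces the unseen attribute to be recoverable from the converted image, which is how the strategy brings unseen attributes into the training process.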
Compared with existing image conversion methods, the method of the invention achieves better conversion accuracy and better generation quality. These two concepts and their associated evaluation indices are explained below.
Conversion accuracy: measures whether an image belongs to the target domain after conversion. A pre-trained classifier judges the probability that the converted image belongs to the target domain. The evaluation indices include Top-1 and Top-5 classification accuracy: for a picture, if the top one (or top five) predicted classes contain the correct answer, the prediction is considered correct.
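As an illustration, Top-k conversion accuracy can be computed from classifier logits as follows (a generic sketch; the pre-trained classifier itself is not shown).

```python
import numpy as np

def topk_accuracy(logits, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring
    classes; k = 1 and k = 5 give the Top-1 and Top-5 accuracy above."""
    topk = np.argsort(-logits, axis=1)[:, :k]  # indices of k largest logits
    hits = [labels[i] in topk[i] for i in range(len(labels))]
    return float(np.mean(hits))

# Toy classifier outputs for two converted images over three categories.
logits = np.array([[0.1, 0.7, 0.2],
                   [0.5, 0.2, 0.3]])
labels = np.array([1, 2])                # target categories of the conversion
top1 = topk_accuracy(logits, labels, 1)  # only the first sample is correct
top2 = topk_accuracy(logits, labels, 2)  # both fall within the top two
```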
Generation quality: measures whether the converted image has high image quality. Its evaluation indices are divided into objective and subjective evaluation. The Fréchet Inception Distance (FID) is a commonly used objective measure of generation quality. To compute the FID of an image conversion model, a batch of converted images is first generated with the model, and a batch of images is sampled from the dataset for comparison. Features of the two batches are then extracted, their statistics are computed, and the difference between the distributions of the generated and real images is measured from these statistics and used as the evaluation of generation quality. For subjective evaluation, the conversion results of several models are presented to subjects at the same time, and each subject picks the image with the highest quality. After a large number of tests, the model with the higher selection rate is considered to have higher generation quality.
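The FID computation described above reduces to the Fréchet distance between two Gaussians fitted to the feature statistics: FID = ||mu1 - mu2||^2 + Tr(C1 + C2 - 2 (C1 C2)^{1/2}). A numpy sketch follows; for simplicity the matrix square root is taken via symmetric eigendecomposition, which assumes C1 C2 is symmetric positive semi-definite (true e.g. for commuting covariances), whereas production implementations use scipy.linalg.sqrtm.

```python
import numpy as np

def _sqrtm_psd(mat):
    """Matrix square root via eigendecomposition; valid only for symmetric
    positive semi-definite matrices (a simplification of this sketch)."""
    vals, vecs = np.linalg.eigh(mat)
    vals = np.clip(vals, 0.0, None)  # clamp tiny negative eigenvalues
    return vecs @ np.diag(np.sqrt(vals)) @ vecs.T

def fid_gaussian(mu1, cov1, mu2, cov2):
    """FID = ||mu1 - mu2||^2 + Tr(C1 + C2 - 2 (C1 C2)^{1/2}) between the
    feature statistics of generated images (mu1, cov1) and real images."""
    diff = mu1 - mu2
    covmean = _sqrtm_psd(cov1 @ cov2)
    return float(diff @ diff + np.trace(cov1 + cov2 - 2.0 * covmean))

# Identical statistics give FID = 0; shifting the mean by a unit vector
# while keeping identity covariance gives FID = 1.
fid_same = fid_gaussian(np.zeros(2), np.eye(2), np.zeros(2), np.eye(2))
fid_shift = fid_gaussian(np.zeros(2), np.eye(2), np.array([1.0, 0.0]), np.eye(2))
```

Lower FID means the generated feature distribution is closer to the real one, which is why it serves as an objective generation-quality index.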
As shown in Table 1, the objective comparison between the results of the invention and other algorithms includes the conversion accuracy for both seen and unseen categories and the generation-quality index FID. Compared with existing models (FUNIT-1 and FUNIT-5 are not under the zero-sample setting and are therefore not a fair comparison; StarGAN is under the zero-sample setting), the invention achieves better results on both the CUB and FLO datasets, and the improvement is especially remarkable for the unseen categories.
TABLE 1 comparison of objective indicators of the results of the present invention with other algorithms
As shown in Table 2, for subjective evaluation, when the conversion results of several models are presented to subjects at the same time, the invention is chosen far more often than StarGAN, which is also under the zero-sample setting; the invention also achieves competitive results against FUNIT-1 and FUNIT-5, which are under a few-shot setting.
Table 2 shows the subjective index comparison of the results of the present invention with other algorithms.
Model | CUB data set | FLO data set |
---|---|---|
FUNIT-1 | 27.8% | 21.8% |
FUNIT-5 | 34.2% | 27.8% |
StarGAN | 7.8% | 14.3% |
The present invention | 30.2% | 36.1% |
The foregoing is only illustrative of the present invention and is not to be construed as limiting thereof, but rather as various modifications, equivalent arrangements, improvements, etc., within the spirit and principles of the present invention.
Claims (2)
1. An unsupervised image conversion method under a zero-sample setting, comprising applying an attribute-visual association constraint and expanding the attribute space with unseen attributes, wherein said applying an attribute-visual association constraint and said expanding the attribute space with unseen attributes are performed simultaneously;
The applying of the attribute-visual association constraint comprises the steps of:
sampling two seen category attributes a_m and a_n from the attribute space and calculating their correlation s(a_m, a_n); computing, according to the adaptive instance normalization (AdaIN) method from style transfer, the visual features w_m and w_n of the visual space determined by the two seen category attributes a_m and a_n, and calculating their correlation s(w_m, w_n); and applying an association constraint: imposing the regularization term L_reg = ||s(a_m, a_n) - s(w_m, w_n)||^2 on the correlation of the two seen category attributes a_m and a_n and the correlation of the visual features w_m and w_n determined by them.
2. The unsupervised image conversion method under the zero-sample setting according to claim 1, wherein expanding the attribute space with unseen attributes comprises the steps of:
sampling an unseen category attribute a_u and an input image x_i, and generating an image x_t with the generator; constraining, by a loss function, the generated image x_t to carry the features of the unseen category attribute a_u; and performing attribute regression with the discriminator, thereby expanding the attribute space.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011501620.5A CN112529772B (en) | 2020-12-18 | 2020-12-18 | Unsupervised image conversion method under zero sample setting |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011501620.5A CN112529772B (en) | 2020-12-18 | 2020-12-18 | Unsupervised image conversion method under zero sample setting |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112529772A CN112529772A (en) | 2021-03-19 |
CN112529772B true CN112529772B (en) | 2024-05-28 |
Family
ID=75001328
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011501620.5A Active CN112529772B (en) | 2020-12-18 | 2020-12-18 | Unsupervised image conversion method under zero sample setting |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112529772B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI769820B (en) * | 2021-05-19 | 2022-07-01 | 鴻海精密工業股份有限公司 | Method for optimizing the generative adversarial network and electronic equipment |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105740888A (en) * | 2016-01-26 | 2016-07-06 | 天津大学 | Joint embedded model for zero sample learning |
CN109359670A (en) * | 2018-09-18 | 2019-02-19 | 北京工业大学 | A kind of individual strength of association automatic testing method based on traffic big data |
CN109582960A (en) * | 2018-11-27 | 2019-04-05 | 上海交通大学 | The zero learn-by-example method based on structured asso- ciation semantic embedding |
CN109598279A (en) * | 2018-09-27 | 2019-04-09 | 天津大学 | Based on the zero sample learning method for generating network from coding confrontation |
CN110097095A (en) * | 2019-04-15 | 2019-08-06 | 天津大学 | A kind of zero sample classification method generating confrontation network based on multiple view |
CN110163796A (en) * | 2019-05-29 | 2019-08-23 | 北方民族大学 | A kind of image generating method and frame that unsupervised multi-modal confrontation encodes certainly |
CN110795585A (en) * | 2019-11-12 | 2020-02-14 | 福州大学 | Zero sample image classification model based on generation countermeasure network and method thereof |
Non-Patent Citations (1)
Title |
---|
Research on image style transfer methods; Hou Yubing; China New Telecommunications; 2020-09-05 (Issue 17); full text *
Also Published As
Publication number | Publication date |
---|---|
CN112529772A (en) | 2021-03-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Wang et al. | Minegan: effective knowledge transfer from gans to target domains with few images | |
Bielski et al. | Emergence of object segmentation in perturbed generative models | |
CN110163258B (en) | Zero sample learning method and system based on semantic attribute attention redistribution mechanism | |
CN111754596B (en) | Editing model generation method, device, equipment and medium for editing face image | |
CN110378844B (en) | Image blind motion blur removing method based on cyclic multi-scale generation countermeasure network | |
CN109711426B (en) | Pathological image classification device and method based on GAN and transfer learning | |
US9152926B2 (en) | Systems, methods, and media for updating a classifier | |
CN112784929A (en) | Small sample image classification method and device based on double-element group expansion | |
CN110543832A (en) | Electroencephalogram data classification method based on random forest and convolutional neural network | |
Teo et al. | Fair generative models via transfer learning | |
CN111027610B (en) | Image feature fusion method, apparatus, and medium | |
CN112529772B (en) | Unsupervised image conversion method under zero sample setting | |
CN115952493A (en) | Reverse attack method and attack device for black box model and storage medium | |
KR20200058295A (en) | Method and Device of High Magnetic Field Magnetic Resonance Image Synthesis | |
CN112995433B (en) | Time sequence video generation method and device, computing equipment and storage medium | |
CN112488238B (en) | Hybrid anomaly detection method based on countermeasure self-encoder | |
US20220101145A1 (en) | Training energy-based variational autoencoders | |
Tayyub et al. | Explaining deep neural networks for point clouds using gradient-based visualisations | |
WO2021171384A1 (en) | Clustering device, clustering method, and clustering program | |
JP7148078B2 (en) | Attribute estimation device, attribute estimation method, attribute estimator learning device, and program | |
Lyu et al. | DeCapsGAN: generative adversarial capsule network for image denoising | |
KR20210107261A (en) | Method and apparatus for clustering data by using latent vector | |
CN113486925B (en) | Model training method, fundus image generation method, model evaluation method and device | |
Khalesi et al. | Synthetic Data Augmentation to Aid Small Training Datasets | |
EP4386657A1 (en) | Image optimization method and apparatus, electronic device, medium, and program product |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||