CN113902613A - Image style migration system and method based on three-branch clustering semantic segmentation - Google Patents
Image style migration system and method based on three-branch clustering semantic segmentation
- Publication number
- CN113902613A CN202111399319.2A
- Authority
- CN
- China
- Prior art keywords
- image
- style
- points
- content
- semantic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/04—Context-preserving transformations, e.g. by using an importance map
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses an image style migration system and method based on three-branch clustering semantic segmentation. The method comprises the following steps: image preprocessing, image semantic segmentation, extraction of image content and style features, style matching, and image similarity measurement. Applying semantic segmentation to image style migration effectively addresses the style overflow that can occur during migration; the MUNIT model used is an unsupervised deep learning model that needs no paired data set and can produce images in multiple styles, largely meeting users' diversity requirements; an image similarity measurement step based on the SSIM index suppresses the generation of similar-style images, which meets the diversity requirement while keeping the whole system stable and effective.
Description
Technical Field
The invention belongs to the technical field of image processing and pattern recognition, and particularly relates to an image style migration system and method based on three-branch clustering semantic segmentation.
Background
Image style migration is a distinctive application of deep neural networks. The technique was developed by Gatys, Johnson and others, and image stylization gives satisfactory results under specific conditions. Current popular image style migration algorithms fall into two types: slow style migration based on image iteration, and fast style migration based on model iteration. Methods based on model iteration include those based on feed-forward stylization models and those based on GANs. The two representative works on feed-forward stylization models are those of Johnson et al. and Ulyanov et al., while GAN-based methods are more varied, each with advantages and disadvantages in different scenes. These methods perform well in scenes where semantic information is not prominent, but semantic mismatches occur easily in semantically sensitive scenes, so applying semantic segmentation to image style migration is of great significance.
Semantic segmentation combines image classification, target detection and image segmentation: the image is segmented into region blocks with definite semantic meaning and the semantic category of each block is identified, finally yielding a segmented image with semantic annotation. Applications that combine semantic segmentation with image style migration are still few; most current research focuses separately on improving the precision of semantic segmentation or the speed of image style migration.
Chinese patent publication No. CN 112950454 A, titled "Image style migration method based on multi-scale semantic matching", mainly extracts multi-scale deep features from the content image and the style image.
Although the above method can achieve a good style migration effect, the following problems remain: 1) paired data sets are difficult or even impossible to collect, which greatly limits image style migration; 2) after training, results of only one style can be obtained, which cannot meet users' diversity requirements; 3) only the similarity of the overall style of the image is considered, so the specific style of a specific object cannot be preserved; 4) style overflow occurs, which damages the harmony and aesthetic quality of the whole image; 5) in most models that use a clustering method during image style migration, the clustering effect is poor, which degrades the style migration effect; 6) the multi-style images output by other technical schemes include images that are highly similar to one another.
Disclosure of Invention
Referring to fig. 1, the present invention aims to overcome the defects of the prior art by providing an image style migration system and method based on three-branch clustering semantic segmentation, which effectively address several problems in the image style migration process and make up for these technical shortcomings; before style migration, the semantic information in the image is first extracted, and during style migration the result of semantic segmentation is matched with the semantic information in the target image, thereby achieving style migration of the whole image.
In order to solve the technical problems, the invention adopts the following technical scheme.
The invention relates to an image style migration system based on three-branch clustering semantic segmentation, which comprises the following modules:
the image preprocessing module is used for adding Gaussian noise to the sample image and expanding image data so as to solve the problems of uneven texture in the image style migration process and poor style migration effect caused by insufficient sample data;
the semantic segmentation module is used for segmenting each semantic block in the content image and the style image respectively, providing basic semantic information for the subsequent style matching; the process comprises: normalizing the pixel values, obtaining the clustering centers with the K-means algorithm, assigning core domain labels, and assigning boundary domain labels; pixel value normalization converts the image into a standard form that resists subsequent affine transformations; the clustering centers obtained by the K-means algorithm serve as the initial input of the subsequent improved k-nearest neighbor algorithm; the improved k-nearest neighbor algorithm introduces the concept of three-branch clustering, sets different discrimination rules for the core domain and the boundary domain, and assigns labels to sample points in two steps; the points to be assigned in the boundary domain are those that remain unlabeled after core domain assignment, that is, points that the core domain rule cannot decide are classified into the boundary domain; these steps complete the clustering of the sample points and yield the semantic segmentation image;
the feature extraction module is used for simultaneously extracting low-order and high-order features of the content image and the style image and inputting them into a feature synthesis network to obtain an image fusing the content features and the style features;
the style matching module is used for matching objects of the same class in the content image and the original image so that style migration is carried out between objects of the same class; it includes a content encoder, a style encoder, and a joint decoder; the content encoder consists of several convolutional layers that downsample the input, followed by residual blocks for further processing, and every convolutional layer is followed by instance normalization, which removes the original feature mean and variance that represent the style information; the style encoder comprises several convolutional layers, an average pooling layer and a fully connected layer; the joint decoder processes the content code through a set of residual blocks and then generates the reconstructed image through upsampling and convolutional layers;
the image similarity measurement module is used for measuring the similarity between every two images generated by the system and screening out the images with lower similarity as the final output of the system; it uses the SSIM index to calculate the similarity between every pair of style images generated by the system, comparing the brightness, contrast and structural characteristics of the two images to obtain a similarity value, and screens out the images with low similarity as the final output of the system.
The image preprocessing module adds Gaussian noise to avoid the uneven texture that may arise in the content and style feature extraction module, and its data augmentation method effectively addresses under-fitting in the image style migration process; the semantic features of the content image and the style image obtained by the semantic segmentation module, together with the content and style features obtained by the content and style feature extraction module, provide the input images for the style matching module; the image similarity measurement module optimizes the output of the whole system.
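The instance normalization that follows the convolutional layers of the content encoder can be illustrated with a small NumPy sketch. It is not the patent's implementation: the function name `instance_norm`, the channel-first `(C, H, W)` layout and the `eps` value are assumptions chosen for illustration. It shows how removing the per-channel mean and variance discards the statistics that the text associates with style.

```python
import numpy as np

def instance_norm(features, eps=1e-5):
    """Normalize each channel of a single feature map of shape (C, H, W).

    The per-channel mean and variance (the statistics the text associates
    with style) are removed, leaving zero-mean, unit-variance channels.
    """
    mean = features.mean(axis=(1, 2), keepdims=True)
    var = features.var(axis=(1, 2), keepdims=True)
    return (features - mean) / np.sqrt(var + eps)
```

A learned scale and shift, as used in many network layers, is omitted here to keep the sketch minimal.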
The invention discloses an image style migration method based on three-branch clustering semantic segmentation, which comprises the following steps of:
step 1, image preprocessing: adding Gaussian noise to the original image; expanding the sample set by using a data augmentation method;
step 2, semantic segmentation: performing semantic segmentation on the image with a K-means three-branch clustering method improved by k-nearest neighbors, to obtain semantic images of the different objects in the image;
step 3, feature extraction: extracting the content and style characteristics of the image by using a MUNIT model;
step 4, style matching: in order to fully integrate semantic information, the style matching network is divided into a semantic matching sub-network and a style integration sub-network; the two sub-networks can fully utilize the semantic information image obtained in the step 2;
step 5, measuring image similarity: calculating pairwise similarity values between the generated images with the SSIM similarity measurement function, so that the generated images of different styles are screened, and several images with low mutual similarity are selected and finally displayed to the user as output.
Further, the step 1 image preprocessing process includes:
step 1.1, adding Gaussian noise: the content image is preprocessed by constructing a Gaussian noise matrix with the same size and number of channels as the content image I_c; the noise matrix is added to the original image to obtain an image containing Gaussian noise, which serves as the content input image; for any point (x_i, y_i) of a channel in the content image, whose pixel value is denoted z, the probability density function of the Gaussian noise is:
$P(z) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(z-\mu)^2}{2\sigma^2}\right)$
wherein z is the pixel value of a point, P(z) is the probability density, σ is the standard deviation, and μ is the average of the pixel values of all points;
step 1.2, data augmentation: by applying any one or more of scaling, cropping, color transformation, rotation and translation, a series of random changes is made to the training images to generate similar but different training samples, which enlarges the training data set, reduces the model's dependence on particular attributes, and improves its generalization capability.
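Step 1.1 can be sketched as follows, assuming 8-bit images stored as NumPy arrays. The function name `add_gaussian_noise`, the default `mu` and `sigma`, and the clipping to the range 0-255 are illustrative assumptions, not values given in the source.

```python
import numpy as np

def add_gaussian_noise(image, mu=0.0, sigma=10.0, seed=None):
    """Add a Gaussian noise matrix with the same size and channels as `image`.

    `image` is assumed to be a uint8 array of shape (H, W, C).
    """
    rng = np.random.default_rng(seed)
    # Noise matrix with exactly the same shape as the content image.
    noise = rng.normal(loc=mu, scale=sigma, size=image.shape)
    noisy = image.astype(np.float64) + noise
    # Clip back to the valid 8-bit range before converting (an assumption).
    return np.clip(noisy, 0, 255).astype(np.uint8)
```

With `sigma=0.0` and `mu=0.0` the function returns the input unchanged, which is a convenient sanity check.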
Further, the step 2 semantic segmentation process includes:
step 2.1, pixel value normalization processing: parameters determined from the invariant moments of the image are used to eliminate the influence of other transformation functions on the image, so that the image can withstand subsequent geometric transformations;
for ease of processing, the pixel values of all points are mapped to the range 0-1 with the formula:
$data^{*} = \frac{data - \min(data)}{\max(data) - \min(data)}$
wherein data is the original pixel value, data* is the normalized pixel value, min(data) is the minimum of the original pixel values, and max(data) is the maximum of the original pixel values;
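The mapping of pixel values to the range 0-1 can be written as a short NumPy sketch. The function name `min_max_normalize` and the handling of a constant image (returning zeros to avoid division by zero) are assumptions added for illustration.

```python
import numpy as np

def min_max_normalize(data):
    """Map pixel values to [0, 1] using (data - min) / (max - min)."""
    data = np.asarray(data, dtype=np.float64)
    lo, hi = data.min(), data.max()
    if hi == lo:
        # Constant image: the formula is undefined, so return zeros (assumption).
        return np.zeros_like(data)
    return (data - lo) / (hi - lo)
```

For example, the values 0, 5 and 10 map to 0, 0.5 and 1.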
step 2.2, obtaining the clustering centers with the K-means algorithm: K points are selected as the initial cluster centers according to a certain strategy, and each data point is assigned to the cluster whose center is closest, that is, the data are divided into K clusters, which completes one partition; since the initial partition is not necessarily the best one, the center point of each newly generated cluster is recalculated and the data are partitioned again, until the result of the partition no longer changes; in practical applications a maximum number of iterations is usually preset, and the calculation terminates when it is reached;
a relatively reasonable set of clustering centers is thereby obtained, in preparation for the subsequent division into core domain and boundary domain;
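The K-means loop described in step 2.2 can be sketched as below. The source only says the initial centers are chosen "according to a certain strategy"; picking K random distinct sample points is an assumption here, as are the function name `kmeans` and its parameters.

```python
import numpy as np

def kmeans(points, k, max_iter=100, seed=0):
    """Plain K-means: returns (centers, labels) for an (N, D) array of points."""
    points = np.asarray(points, dtype=np.float64)
    rng = np.random.default_rng(seed)
    # Assumed initialization strategy: k distinct samples chosen at random.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    labels = np.full(len(points), -1)
    for _ in range(max_iter):  # preset maximum number of iterations
        # Assign every point to the cluster with the closest center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # the partition no longer changes
        labels = new_labels
        # Recalculate the center of every non-empty cluster.
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return centers, labels
```

For pixel clustering, `points` would typically hold one row per pixel, for example the normalized color values, optionally with coordinates appended.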
step 2.3, core domain category label assignment: the idea of three-branch clustering is introduced to assist decision making; three-branch clustering divides the sample data into three regions: for a category C, Co(C), Fr(C) and Tr(C) denote the core domain, the boundary domain and the outer region, respectively; the core domain is the set of sample points that definitely belong to class C, the boundary domain is the set of sample points that may belong to class C, and the outer region is the set of sample points that definitely do not belong to class C;
the relationship of the three regions is as follows:
$Co(C) \cup Fr(C) \cup Tr(C) = U$
$Co(C) \cap Fr(C) = \emptyset, \quad Co(C) \cap Tr(C) = \emptyset, \quad Fr(C) \cap Tr(C) = \emptyset$
wherein U is the universal set, Co(C) is the core domain, Fr(C) is the boundary domain, Tr(C) is the outer region, and $\emptyset$ is the empty set;
that is, the three regions are mutually exclusive and have no intersection;
an improved k nearest neighbor algorithm is used, and the idea of three-branch clustering is introduced, so that labels are distributed to sample points except for a sample clustering center, and the clustering effect is achieved;
the k-nearest neighbor algorithm calculates the distance between one point and all other points, takes the K points closest to it, and assigns the point to the class that accounts for the largest proportion among those K points; the distance between two points is generally the Euclidean distance, with the formula:
$\rho = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$
where ρ is the Euclidean distance between two points, and (x_1, y_1) and (x_2, y_2) are any two points;
in this way, the K points closest to a given point are obtained and are called the neighborhood points of that point; the shared neighborhood is then obtained from the neighborhoods of two points, in preparation for the label assignment of the subsequent core domain points and boundary domain points;
leaving the outer region aside, category labels should be assigned differently in the core domain and the boundary domain;
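The neighborhood and shared-neighborhood computation can be sketched with NumPy as follows. The function names `knn_indices` and `shared_neighbors` are hypothetical, and excluding a point from its own neighborhood is an assumption.

```python
import numpy as np

def knn_indices(points, k):
    """Return an (N, k) array: row i holds the indices of the k points nearest to point i."""
    points = np.asarray(points, dtype=np.float64)
    # Pairwise Euclidean distances between all points.
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbor (assumption)
    return np.argsort(dists, axis=1)[:, :k]

def shared_neighbors(neighbors, this, nxt):
    """Shared neighborhood SNN(this, next): points in both neighborhoods."""
    return set(neighbors[this].tolist()) & set(neighbors[nxt].tolist())
```

The size of the returned set corresponds to |SNN(this, next)| in the discrimination rule for core domain points.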
label assignment of core domain points: the discrimination formula of the core domain point is as follows
wherein |SNN(this, next)| is the number of points in the shared neighborhood of the two points, this is the current point, and next is the point to be judged; when the number of points in the shared neighborhood of the next point and the this point, |SNN(this, next)|, satisfies the formula, the next point is assigned to the class of the this point;
label assignment of boundary domain points: this is the process of reassigning the points that received no label during core domain assignment; an allocation matrix M is formed to record the categories of all neighborhood points of a given point, the cluster that contains the most of these neighborhood points is taken, and its label is assigned to the unlabeled point.
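The two-step label assignment can be sketched in plain Python. The exact discrimination formula for core domain points is not reproduced in this text, so the sketch assumes a simple threshold `min_shared` on |SNN(this, next)|; the function name `assign_labels`, the seed points and the propagation order are also illustrative assumptions rather than the patent's procedure.

```python
from collections import Counter, deque

def assign_labels(neighbors, seeds, min_shared):
    """Two-step label assignment in the spirit of three-branch clustering.

    neighbors:  dict mapping each point id to the list of its nearest neighbors.
    seeds:      dict mapping starting points (e.g. the points nearest the
                K-means centers) to their cluster labels.
    min_shared: ASSUMED threshold on |SNN(this, next)|; the source's actual
                discrimination formula is not available here.
    Returns (labels, core), where core is the set of core domain points.
    """
    labels = dict(seeds)
    queue = deque(seeds)
    # Step 1: core domain. Propagate a label from `this` to `next` when the
    # two points share enough neighbors.
    while queue:
        this = queue.popleft()
        for nxt in neighbors[this]:
            if nxt in labels:
                continue
            shared = set(neighbors[this]) & set(neighbors[nxt])
            if len(shared) >= min_shared:
                labels[nxt] = labels[this]
                queue.append(nxt)
    core = set(labels)
    # Step 2: boundary domain. Each unlabeled point takes the label held by
    # most of its core domain neighbors.
    for point in neighbors:
        if point in labels:
            continue
        votes = Counter(labels[q] for q in neighbors[point] if q in core)
        if votes:
            labels[point] = votes.most_common(1)[0][0]
    return labels, core
```

Points with no core domain neighbor stay unlabeled in this sketch; how such points are treated is left open here.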
Further, in step 3, the process of extracting the content and style features includes:
the MUNIT model extends the UNIT model to translation between multi-modal data; UNIT assumes that different data sets can share the same hidden space, and the MUNIT model further divides the hidden space into a content hidden space and a style hidden space, wherein the style hidden space measures the difference between the original image and the target image;
as in the UNIT model, the model is composed of two autoencoders; unlike UNIT, the encoding stage maps the input into the hidden space through a two-part network, decomposing it into content features and style features; the decoding stage then reconstructs the image from these two parts; the whole process requires the content and style losses to be minimized, and the loss function, following the MUNIT formulation, is defined as:
$\min_{E_1,E_2,G_1,G_2} \max_{D_1,D_2} \mathcal{L} = \mathcal{L}_{GAN}^{x_1} + \mathcal{L}_{GAN}^{x_2} + \lambda_x(\mathcal{L}_{recon}^{x_1} + \mathcal{L}_{recon}^{x_2}) + \lambda_c(\mathcal{L}_{recon}^{c_1} + \mathcal{L}_{recon}^{c_2}) + \lambda_s(\mathcal{L}_{recon}^{s_1} + \mathcal{L}_{recon}^{s_2})$
wherein $\mathcal{L}_{GAN}$ is the adversarial loss, $\mathcal{L}_{recon}$ is the reconstruction loss, and λ_x, λ_c, λ_s are weights that control the importance of the reconstruction terms.
Further, in step 4, the style matching process includes:
performing style migration on semantic information obtained based on semantic segmentation, namely migration among objects of the same class;
in order to incorporate semantic information, the semantic mask of the original image is first downsampled correspondingly, with the formula:
$m_l = \mathrm{downsampling}(m, \mathrm{scale}(l))$ (7)
wherein m_l is the semantic mask at network layer l, and scale(l) is the downsampling ratio applied to m at layer l, which is determined by the resolution of the input image and the output resolution of network layer l;
then, the mask is spliced with the style features along the feature dimension to form new style features; a hyper-parameter λ is introduced to balance the influence of the traditional features and the semantic information on the style: when λ is 0 only the traditional features are used for style migration, and when λ is +∞ only the semantic information is used for style migration;
$s_n = \mathrm{norm}(s_l) \,\|\, \lambda \cdot \mathrm{norm}(m_l)$ (8)
wherein s_n is the style feature after fusing semantic information, s_l is the style feature of network layer l, and m_l is the semantic information from the content image;
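Formulas (7) and (8) can be sketched with NumPy under several stated assumptions: downsampling is done by strided (nearest-neighbor) sampling with an integer factor, `norm` is taken to be channel-wise L2 normalization, and the mask is stored with one channel per semantic class. None of these choices is specified in the source, and the function names are hypothetical.

```python
import numpy as np

def downsample_mask(mask, scale):
    """Formula (7), assuming nearest-neighbor downsampling by an integer factor."""
    return mask[::scale, ::scale]

def l2_normalize(features, eps=1e-8):
    """Assumed meaning of norm(.): unit L2 norm along the channel axis."""
    norm = np.linalg.norm(features, axis=-1, keepdims=True)
    return features / (norm + eps)

def fuse_style_and_semantics(style_feat, mask_l, lam):
    """Formula (8): concatenate norm(s_l) and lam * norm(m_l) along the channels.

    style_feat: (H, W, C_s) style features of layer l.
    mask_l:     (H, W, C_m) downsampled semantic mask, one channel per class.
    """
    return np.concatenate(
        [l2_normalize(style_feat), lam * l2_normalize(mask_l)], axis=-1
    )
```

With `lam=0.0` the semantic channels vanish and only the traditional features influence matching; larger values let the semantic channels dominate.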
in the style matching sub-network, cosine similarity is used as the matching criterion, with the formula:
where φ is a function for extracting the features of an image block, and the remaining two symbols denote the style features of the target image and the style features of the style image, respectively.
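Cosine-similarity matching between feature vectors can be sketched as below. The feature extractor φ is not modeled; the inputs are assumed to be already-extracted patch features, and the function names `cosine_similarity` and `best_match` are hypothetical.

```python
import numpy as np

def cosine_similarity(a, b, eps=1e-8):
    """Cosine of the angle between two feature vectors (flattened if needed)."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def best_match(target_feat, style_feats):
    """Index of the style patch feature most similar to the target patch feature."""
    scores = [cosine_similarity(target_feat, s) for s in style_feats]
    return int(np.argmax(scores)), scores
```

Because cosine similarity ignores vector length, two patches with the same feature direction match regardless of overall feature magnitude.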
Further, in step 5, the process of measuring the similarity of the image includes:
the structural similarity index SSIM measures the similarity between two images and is often used to evaluate the quality of restoration after image restoration modeling;
the SSIM index compares images by extracting three main features: brightness, contrast and structure; in the implementation, the brightness of an image is characterized by its mean, the contrast by its variance, and the structure by the correlation coefficient; the specific formulas are:
$l(x,y) = \frac{2\mu_x\mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1}, \quad c(x,y) = \frac{2\sigma_x\sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2}, \quad s(x,y) = \frac{\sigma_{xy} + C_3}{\sigma_x\sigma_y + C_3}$
where l(x, y) represents luminance, c(x, y) represents contrast, s(x, y) represents structure, μ_x is the mean of sample x, μ_y is the mean of sample y, σ_x² is the variance of x, σ_y² is the variance of y, and σ_xy is the covariance of x and y; in the standard SSIM definition C_3 = C_2/2;
the similarity function is:
$SSIM(x,y) = \frac{(2\mu_x\mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}$
wherein SSIM is the image similarity measure index, and C_1, C_2 are constants.
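The SSIM computation and the screening of similar outputs can be sketched as follows. For brevity the statistics are computed over the whole image rather than over local sliding windows, the constants use the commonly cited choices C1 = (0.01·L)² and C2 = (0.03·L)² for dynamic range L, and the greedy screening rule with a `threshold` is an assumption; the source does not state how the low-similarity images are selected.

```python
import numpy as np

def ssim_global(x, y, data_range=255.0, k1=0.01, k2=0.03):
    """SSIM from whole-image statistics (a simplification of windowed SSIM)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )

def screen_diverse(images, threshold):
    """Greedy screening (assumed rule): keep an image only if its SSIM to
    every image already kept is below `threshold`."""
    kept = []
    for img in images:
        if all(ssim_global(img, other) < threshold for other in kept):
            kept.append(img)
    return kept
```

An image compared with itself scores 1, so exact duplicates among the generated results are always dropped by `screen_diverse` for any threshold of at most 1.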
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. The invention applies a semantic segmentation technique improved with K-means, KNN and three-branch clustering to an image style migration system, and introduces SSIM-based image similarity measurement into the system to strengthen the contrast among different results and to suppress the repeated generation of similar images.
2. The improved semantic segmentation algorithm effectively addresses the style overflow that often occurs during image style migration, and the integrated MUNIT model addresses both the lack of paired data sets before model training and the limitation that only single-style images can be generated after training.
3. The image similarity measurement introduced at the later stage ensures that the system does not output multiple styled images whose styles are similar or whose differences cannot be seen with the naked eye, improving the diversity and stability of style migration while preserving the style migration effect.
4. The image style migration system and method based on three-branch clustering semantic segmentation can be applied to style migration of traditional clothes, oil paintings and ceramic patterns, and are beneficial to developing traditional culture in China and further promoting vigorous development of the culture industry.
Drawings
Fig. 1 is a system block diagram and a method flowchart of an embodiment of the present invention.
Fig. 2 is a landscape art image in the starry-sky style produced by the invention; the original image is a color image, here converted to a grayscale image.
Fig. 3 is a result obtained after image preprocessing according to an embodiment of the present invention, in which fig. 3a is an original image; FIG. 3b is an image with Gaussian noise added; FIG. 3c is the image of the original image after being flipped; the original image is a color image, which is now processed to be a grayscale image.
Fig. 4 is a diagram of semantic segmentation input/output according to an embodiment of the present invention, where fig. 4a is an original diagram, and fig. 4b is an image obtained by semantic segmentation.
FIG. 5 is a schematic diagram of the MUNIT model hidden space according to the present invention; the original image is a color image, which is now processed to be a grayscale image.
FIG. 6 is a diagram of a MUNIT model self-encoder structure according to the present invention.
Fig. 7 shows two example images used for calculating image similarity in an embodiment of the present invention, where fig. 7a is the image before style migration and fig. 7b is the image after style migration; the original image is a color image, here converted to a grayscale image.
Detailed description of the preferred embodiments
The present invention will be described in further detail with reference to the accompanying drawings.
As shown in FIG. 2, by applying the improved K-means clustering method to the MUNIT model, the image style migration system based on three-branch clustering semantic segmentation provided by the invention can produce diverse style images of rich artistic interest.
As shown in fig. 1, an image style migration system based on three-branch clustering semantic segmentation of the present invention includes:
the image preprocessing module is used for adding Gaussian noise to the sample image and expanding image data so as to solve the problems of uneven texture in the image style migration process and poor style migration effect caused by insufficient sample data;
the semantic segmentation module is used for segmenting each semantic block in the content image and the style image respectively and providing basic semantic information for the subsequent style matching;
the content and style feature extraction module is used for simultaneously extracting low-order and high-order features of the content image and the style image and inputting the features into a feature synthesis network to obtain an image fused with the content feature and the style feature;
the style matching module is used for matching the same type of objects in the content image and the original image so as to carry out style migration between the same type of objects;
the image similarity measurement module is used for measuring the similarity between every two images generated by the system and screening out the image with lower similarity as the final output of the system;
the image preprocessing module adds Gaussian noise to avoid the uneven texture that may arise in the content and style feature extraction module, and its data augmentation method effectively addresses under-fitting in the image style migration process; the semantic features of the content image and the style image obtained by the semantic segmentation module, together with the content and style features obtained by the content and style feature extraction module, provide the input images for the style matching module; the image similarity measurement module optimizes the output of the whole system.
The image preprocessing module is used for the following processes:
(1) adding Gaussian noise to sample set images
Gaussian noise is added to all images in the initial sample set according to the following formula:
$$P(z) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(z - \mu)^2}{2\sigma^2}\right)$$
where z is the pixel value, P(z) is the probability density, σ is the standard deviation, and μ is the mean of the pixel values of all points;
(2) sample set data augmentation
An insufficient number of samples is often an important factor that degrades both the training of the model and the performance of the whole system. The data augmentation method applies a series of random changes to the training images to generate similar but different training samples, which enlarges the training data set, reduces the dependence of the model on particular attributes, and improves the generalization capability of the model.
The semantic segmentation module is used for the following processes, including:
normalizing the pixel values, and solving a clustering center, core domain label distribution and boundary domain label distribution by using a K-means algorithm;
(1) the main purpose of the pixel value normalization process is to transform the image into a standard form to resist subsequent affine transformations.
Pixel value normalization processing: the invariant moment of the image is utilized to search parameters to eliminate the influence of other transformation functions on image transformation, so that the image can resist the attack of subsequent geometric transformation;
for ease of processing, the pixel values of all points are mapped to the range 0-1 by the formula:
$$\mathrm{data}' = \frac{\mathrm{data} - \min(\mathrm{data})}{\max(\mathrm{data}) - \min(\mathrm{data})}$$
where data is the original pixel value, data' is the normalized pixel value, min(data) is the minimum of the original pixel values, and max(data) is the maximum of the original pixel values;
(2) the K-means algorithm mainly aims at taking the obtained clustering center as the initial input of a subsequent improved KNN algorithm; selecting k points as the initial center of each cluster according to a certain strategy, and dividing data into the clusters closest to the k points, namely: dividing data into k clusters to finish one division; considering that the initial partition is not necessarily the best partition, the center point of each cluster is recalculated in the generated new clusters, and then the new clusters are divided again until the result of each division is kept unchanged; in practical application, the maximum iteration times are usually preset, and when the maximum iteration times are reached, the calculation is terminated;
(3) the main idea of the improved KNN algorithm is to introduce the concept of three-branch clustering into the KNN algorithm.
Three-branch clustering divides the sample data into three regions. For a category C, Co(C), Fγ(C) and Tγ(C) denote the core domain, the boundary domain and the outer region respectively; the core domain is the set of sample points that definitely belong to class C, the boundary domain is the set of sample points that may belong to class C, and the outer region is the set of sample points that definitely do not belong to class C;
the relationship of the three regions is as follows:
$$\mathrm{Co}(C) \cup F\gamma(C) \cup T\gamma(C) = U, \quad \mathrm{Co}(C) \cap F\gamma(C) = \varnothing, \quad \mathrm{Co}(C) \cap T\gamma(C) = \varnothing, \quad F\gamma(C) \cap T\gamma(C) = \varnothing$$
where U is the universal set, Co(C) is the core domain, Fγ(C) is the boundary domain, Tγ(C) is the outer region, and ∅ is the empty set;
in the KNN algorithm, the distance between one point and all other points is calculated, the k points closest to the point are taken, and the class of the point is determined as the class that accounts for the largest proportion among those k points; the distance between two points is generally the Euclidean distance, given by:
$$\rho = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$$
where ρ is the Euclidean distance between two points, and (x_1, y_1) and (x_2, y_2) are any two points;
the specific method of label distribution is to set different discrimination rules for the core domain and the boundary domain, and to distribute labels to the sample points in two steps; the discrimination formula of the core domain is as follows:
where |SNN(this, next)| is the number of points in the shared neighborhood of the two points, this is the current point, and next is the point to be judged; when |SNN(this, next)| satisfies the formula, the next point is assigned to the class to which the this point belongs;
the points to be distributed in the boundary domain are those that have not received a label after label distribution in the core domain, that is, points that cannot be decided by the core domain rule are classified into the boundary domain; the clustering of the sample points is completed through the above steps, and a semantic segmentation image is thereby obtained.
The content and style feature extraction module comprises:
a content encoder, a style encoder, and a joint decoder; the content encoder is composed of a plurality of convolutional layers that down-sample the input, followed by residual blocks for further processing, and all of its convolutional layers are followed by instance normalization, whose main purpose is to remove the original feature mean and variance that represent the style information; the style encoder is composed of a plurality of convolutional layers, an average pooling layer and a fully connected layer. The joint decoder processes the content code through a set of residual blocks and then generates a reconstructed image through an up-sampling layer and a convolutional layer.
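The role of instance normalization described above can be illustrated with a small NumPy sketch. It is not the network used in the embodiment; the function name `instance_norm`, the (N, C, H, W) layout and the value of `eps` are assumptions made for illustration. Each channel of each sample is normalized over its spatial positions, which removes the per-channel mean and variance that the text associates with style.

```python
import numpy as np

def instance_norm(features, eps=1e-5):
    """Normalize every channel of every sample over its spatial dimensions.

    features: array of shape (N, C, H, W) (layout assumed for this sketch).
    """
    mean = features.mean(axis=(2, 3), keepdims=True)
    var = features.var(axis=(2, 3), keepdims=True)
    # Subtracting the mean and dividing by the standard deviation removes
    # the per-channel statistics of each individual image.
    return (features - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(2, 4, 8, 8))
y = instance_norm(x)
print(y.mean(axis=(2, 3)).round(6))  # per-channel means are close to 0
```

After normalization every channel has approximately zero mean and unit variance, regardless of the statistics of the input.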
The loss function is defined as follows:
where the loss combines an adversarial loss and a reconstruction loss, and λx, λc and λs are weights that control the importance of the reconstruction terms.
The style matching module comprises a semantic matching sub-module and a style fusion sub-module.
(1) The semantic matching sub-module matches the segmented image regions with their semantics and constructs the image style. First, in order to integrate semantic information, the semantic mask of the original image is downsampled accordingly, by the formula:
$$m_l = \mathrm{downsampling}(m, \mathrm{scale}(l)) \quad (7)$$
where m_l is the semantic mask at network layer l and scale(l) is the downsampling ratio applied to m at layer l, which is determined by the resolution of the input image and the output resolution of network layer l;
then the style features and the semantic mask are spliced along the feature dimension to form new style features; a hyper-parameter λ is introduced to balance the influence of the traditional features and the semantic information on the style: when λ = 0 only the traditional features are used for style migration, and when λ = +∞ only the semantic information is used;
$$s_n = \mathrm{norm}(s_l) \,\|\, \lambda \cdot \mathrm{norm}(m_l) \quad (8)$$
where s_l is the style feature after fusing semantic information for network layer l, m_l is the semantic information from the content image, and ∥ denotes splicing along the feature dimension;
(2) The style fusion sub-module matches at the granularity of image blocks, using cosine similarity as the matching criterion, by the formula:
where φ is a function that extracts the features of an image block, and the two quantities compared are the style feature of the target image and the style feature of the style image.
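Block-level matching by cosine similarity can be sketched as follows. This is a minimal illustration under stated assumptions: the blocks are treated as already-extracted feature arrays (standing in for the output of φ), and the names `cosine_similarity` and `best_matching_patch` are hypothetical.

```python
import numpy as np

def cosine_similarity(a, b, eps=1e-8):
    """Cosine of the angle between two feature arrays, flattened to vectors."""
    a = np.ravel(a)
    b = np.ravel(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def best_matching_patch(target_patch, style_patches):
    """Return the index of the style block most similar to the target block."""
    scores = [cosine_similarity(target_patch, p) for p in style_patches]
    return int(np.argmax(scores)), scores

target = np.array([[1.0, 2.0], [3.0, 4.0]])
style_patches = [
    np.array([[4.0, 3.0], [2.0, 1.0]]),
    np.array([[2.0, 4.0], [6.0, 8.0]]),      # same direction as the target
    np.array([[-1.0, -2.0], [-3.0, -4.0]]),  # opposite direction
]
index, scores = best_matching_patch(target, style_patches)
print(index)
```

Because cosine similarity ignores magnitude, the second block, which is a scaled copy of the target, is selected even though its values differ.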
The image similarity measurement module is used for the following processes, including:
the SSIM index is used to calculate the similarity between every pair of style images generated by the system; the brightness, contrast and structural characteristics of the two images are compared in turn to obtain a similarity value, and the images with low similarity are screened out as the final output of the system.
The three main characteristics of an image are its brightness, contrast and structure, and the specific formulas are as follows:
where l(x, y) represents luminance, c(x, y) represents contrast, s(x, y) represents structure, μ_x is the mean of sample x, μ_y is the mean of sample y, σ_x² is the variance of x, σ_y² is the variance of y, and σ_xy is the covariance of x and y;
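the similarity function is:
$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x\mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}$$
where SSIM is the image similarity measurement index, and C_1 and C_2 are constants.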
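The similarity function can be illustrated with a short NumPy sketch. It computes the statistics over the whole image, whereas common SSIM implementations average the index over local windows; the function name `ssim_global`, the assumed value range of 0-1, and the conventional choices C_1 = (0.01·L)² and C_2 = (0.03·L)² are assumptions of this sketch, not values stated in the text.

```python
import numpy as np

def ssim_global(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """SSIM computed from global image statistics (single window)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (k1 * data_range) ** 2  # assumed conventional constants
    c2 = (k2 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(numerator / denominator)

rng = np.random.default_rng(0)
image = rng.random((32, 32))
same = ssim_global(image, image)
inverted = ssim_global(image, 1.0 - image)
print(same, inverted)
```

An image compared with itself scores 1, while the inverted image, whose structure is anti-correlated with the original, scores below zero; images with low pairwise scores are the ones the module would keep as diverse outputs.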
The image style migration method based on three-branch clustering semantic segmentation of the invention comprises the following steps:
First step, image preprocessing.
As shown in fig. 3, fig. 3a and fig. 3b illustrate the Gaussian noise adding operation and the image flipping operation respectively.
1. Adding Gaussian noise
In order for the background of the output image to have uniform texture after image style migration, the content image must be preprocessed: a Gaussian noise matrix with the same size (feature dimensions) and number of channels as the content image I_c is constructed, and the noise matrix is added to the original image to obtain an image containing Gaussian noise.
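The noise-adding step can be sketched in NumPy as follows. The function name `add_gaussian_noise`, the value range of 0-1, the parameter values and the final clipping are assumptions made for illustration; the text does not specify them.

```python
import numpy as np

def add_gaussian_noise(image, mean=0.0, sigma=0.05, seed=None):
    """Add a Gaussian noise matrix of the same shape as `image`.

    image: float array of shape (H, W, C) with values in [0, 1] (assumed).
    mean, sigma: parameters of the noise distribution (illustrative values).
    """
    rng = np.random.default_rng(seed)
    # Noise matrix with the same size and number of channels as the image.
    noise = rng.normal(loc=mean, scale=sigma, size=image.shape)
    # Clipping keeps the result a valid image; this step is an assumption.
    return np.clip(image + noise, 0.0, 1.0)

content = np.full((4, 5, 3), 0.5)
noisy = add_gaussian_noise(content, sigma=0.05, seed=0)
print(noisy.shape)
```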
2. Data augmentation
In order to solve the problem of poor image style migration effect caused by insufficient image data, data needs to be augmented, similar but different training samples are generated by carrying out a series of random changes on training images, so that the scale of a training data set is enlarged, the dependence of a model on certain attributes is greatly reduced, and the generalization capability of the model is improved.
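A minimal augmentation sketch, assuming images stored as (H, W, C) NumPy arrays, is given below. It combines a random horizontal flip, a rotation by a multiple of 90 degrees and a random crop; the function name `augment`, the `crop_fraction` parameter and this particular choice of transforms are illustrative assumptions.

```python
import numpy as np

def augment(image, rng, crop_fraction=0.9):
    """Return a randomly flipped, rotated and cropped copy of `image`."""
    out = image
    if rng.random() < 0.5:
        out = out[:, ::-1]                       # horizontal flip
    out = np.rot90(out, k=int(rng.integers(0, 4)))  # rotate in the image plane
    h, w = out.shape[:2]
    ch, cw = int(h * crop_fraction), int(w * crop_fraction)
    top = int(rng.integers(0, h - ch + 1))       # random crop position
    left = int(rng.integers(0, w - cw + 1))
    return out[top:top + ch, left:left + cw].copy()

rng = np.random.default_rng(0)
image = np.arange(10 * 20 * 3, dtype=np.float64).reshape(10, 20, 3)
samples = [augment(image, rng) for _ in range(4)]
print([s.shape for s in samples])
```

Each call produces a sample that contains only pixels of the original image but in a different geometric arrangement, which is the "similar but different" property the paragraph describes.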
Second step, semantic segmentation
The whole semantic segmentation process is as follows:
1. pixel value normalization processing:
As shown in fig. 4a, the image has 499 × 701 pixels, and its pixel values initially lie in the range 0-255, as shown in table a; the pixel values of all pixels are mapped into the range 0-1, and table b shows the processed data:
2. Obtaining the clustering centers by the K-means method:
Basic K-means procedure:
Assume that the input of the algorithm is the normalized data set data = {point_1, point_2, ..., point_m}, the number of clusters is k, and the maximum number of iterations is N; the output is a partition of the original sample set, C = {C_1, C_2, ..., C_k}.
(1) Select k objects from the data as the initial clustering centers {μ_1, μ_2, ..., μ_k}.
(2) In each iteration, calculate the distance from every object to each clustering center and partition the objects accordingly, as follows:
a) initialize the cluster partition to C_t = ∅, t = 1, 2, ..., k;
b) for each point point_i in the sample set, calculate its distance to each clustering center μ_j using the following formula:
$$d_{ij} = \|\mathrm{point}_i - \mu_j\|_2 \quad (12)$$
mark point_i with the cluster λ_i corresponding to the smallest d_ij, and update C_{λ_i} = C_{λ_i} ∪ {point_i}.
(3) Recalculate the center of each cluster and repeat step (2) until the clustering centers no longer change or the maximum number of iterations is reached.
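The normalization and K-means steps above can be sketched together in NumPy. This is an illustrative sketch rather than the implementation of the embodiment: the function names, the random choice of initial centers, the handling of empty clusters and the toy pixel values are all assumptions.

```python
import numpy as np

def min_max_normalize(data):
    """Map values to the range 0-1 using (data - min) / (max - min)."""
    data = np.asarray(data, dtype=np.float64)
    lo, hi = data.min(), data.max()
    if hi == lo:                      # constant input: avoid division by zero
        return np.zeros_like(data)
    return (data - lo) / (hi - lo)

def k_means(points, k, max_iter=100, seed=0):
    """Basic K-means; returns the cluster centers and a label per point."""
    rng = np.random.default_rng(seed)
    # (1) choose k points as initial centers (random choice is an assumption)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(max_iter):
        # (2) distance of every point to every center, shape (m, k)
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # (3) recompute centers; an empty cluster keeps its previous center
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

pixels = np.array([[10.0], [12.0], [11.0], [240.0], [250.0], [245.0]])
normalized = min_max_normalize(pixels)
centers, labels = k_means(normalized, k=2)
print(labels)
```

The three dark pixels and the three bright pixels end up in different clusters; which cluster receives index 0 depends on the initial centers.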
3. Core domain class label assignment
The class labels of the core domain and the boundary domain are assigned in different ways.
The k neighborhood points of each point are obtained with the KNN algorithm, which comprises the following steps:
1) calculating the distance between the test data and each training data point;
2) sorting the distances in increasing order;
3) for each point, selecting the K points at the smallest distance as its neighborhood points;
4) for every two points, calculating and storing the points shared by their neighborhoods, which form their shared neighborhood;
the KNN algorithm is implemented in Python as follows:
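Steps 1) to 4) can be sketched as follows. This is an illustrative sketch and not the listing referred to in the text; the function names and the decision to exclude a point from its own neighborhood are assumptions.

```python
import numpy as np

def k_nearest_neighbors(points, k):
    """Return an (m, k) array with the indices of each point's k nearest points."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))   # 1) pairwise Euclidean distances
    np.fill_diagonal(dist, np.inf)            # a point is not its own neighbor
    order = np.argsort(dist, axis=1)          # 2) sort by increasing distance
    return order[:, :k]                       # 3) keep the k closest points

def shared_neighbors(neighbors, i, j):
    """4) Points that appear in the neighborhoods of both point i and point j."""
    return set(neighbors[i].tolist()) & set(neighbors[j].tolist())

points = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [10.0, 10.0]])
neighbors = k_nearest_neighbors(points, k=2)
shared = shared_neighbors(neighbors, 0, 3)
print(shared)
```

Points 0 and 3 are opposite corners of the unit square, so both have the remaining two corners as nearest neighbors and their shared neighborhood contains those two points.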
Starting from the obtained clustering centers and the k neighboring points of each clustering center, labels are distributed to the neighboring points according to the following rule:
label assignment of core domain points:
where |KNNP(this, next)| is the number of points in the shared neighborhood of the two points.
If the rule is satisfied, the point is assigned the same label as the clustering center. If not, the point is left unassigned.
Labels are then distributed in the same way to the neighborhood points of the clustering center's neighbors, and so on, until no further neighborhood points remain.
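The propagation of core domain labels can be sketched as a breadth-first traversal. The threshold in the discrimination rule is not legible in this text, so the sketch uses a hypothetical parameter `min_shared` (a point is accepted when it shares at least that many neighbors with the current point); using the data point nearest to each clustering center as the starting point is also an assumption.

```python
from collections import deque
import numpy as np

def knn_indices(points, k):
    """Indices of the k nearest other points for every point."""
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    return np.argsort(dist, axis=1)[:, :k]

def assign_core_labels(points, centers, k, min_shared):
    """Spread each center's label through shared-neighbor links; -1 = unassigned."""
    neighbors = knn_indices(points, k)
    neighbor_sets = [set(row.tolist()) for row in neighbors]
    labels = np.full(len(points), -1)
    queue = deque()
    for c, center in enumerate(centers):
        seed = int(((points - center) ** 2).sum(axis=1).argmin())
        labels[seed] = c                      # start from the point nearest the center
        queue.append(seed)
    while queue:
        this = queue.popleft()
        for nxt in neighbors[this].tolist():
            if labels[nxt] != -1:
                continue
            # hypothetical rule: enough shared neighbors -> same class as `this`
            if len(neighbor_sets[this] & neighbor_sets[nxt]) >= min_shared:
                labels[nxt] = labels[this]
                queue.append(nxt)
    return labels

points = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
                   [10, 10], [10, 11], [11, 10], [11, 11],
                   [5.5, 5.5]], dtype=np.float64)
centers = np.array([[0.5, 0.5], [10.5, 10.5]])
labels = assign_core_labels(points, centers, k=3, min_shared=2)
print(labels.tolist())
```

The two compact groups receive labels 0 and 1, while the isolated point between them is never reached and stays unassigned, which leaves it for the boundary domain step.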
4. Boundary domain label assignment
Labels are assigned to the points left unassigned in the previous step, until no further points can be assigned labels.
The boundary domain point label allocation rule is as follows:
label assignment for boundary domain points
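Claim 7 describes this rule as recording the classes of all neighborhood points of an unlabeled point and taking the cluster that contains most of them. A minimal sketch of that majority vote is given below; the function name, the use of -1 for unlabeled points, the hand-written neighbor table and the tie-breaking toward the lowest cluster index are assumptions.

```python
import numpy as np

def assign_boundary_labels(labels, neighbors, n_clusters):
    """Give each unlabeled point (-1) the most common label among its neighbors."""
    labels = labels.copy()
    changed = True
    while changed:                     # repeat until no more points can be labeled
        changed = False
        for i in np.flatnonzero(labels == -1):
            neighbor_labels = labels[neighbors[i]]
            neighbor_labels = neighbor_labels[neighbor_labels != -1]
            if neighbor_labels.size == 0:
                continue               # no labeled neighbor yet; try again later
            # one row of the allocation matrix M: neighbor count per cluster
            counts = np.bincount(neighbor_labels, minlength=n_clusters)
            labels[i] = int(counts.argmax())
            changed = True
    return labels

# Hypothetical neighbor table: row i lists the neighborhood points of point i.
neighbors = np.array([
    [1, 2, 5], [0, 2, 5], [0, 1, 5],
    [4, 6, 5], [3, 6, 5],
    [0, 1, 3],   # point 5: two neighbors in cluster 0, one in cluster 1
    [3, 4, 5],   # point 6: two neighbors in cluster 1
])
core_labels = np.array([0, 0, 0, 1, 1, -1, -1])
final_labels = assign_boundary_labels(core_labels, neighbors, n_clusters=2)
print(final_labels.tolist())
```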
As shown in fig. 4b, the process of clustering the pixel points of one image is completed, and different objects in the image can be identified according to different categories, so that semantic information of one image is obtained.
Third step, content and style feature extraction.
The MUNIT model training process is shown in fig. 5 and 6.
Before the MUNIT model is formally trained, the parameters of the model need to be configured; the parameter table is as follows:
The training of the MUNIT model is mainly divided into three processes, namely forward propagation of the network, optimization of the generative model, and optimization of the discriminative model; the optimization of the generator and the discriminator is set up in the initialization function.
Forward propagation of the network is an encoding process: the two input pictures are first encoded to obtain the content code and the style code of each, the codes are then exchanged between the two pictures, and noise following a normal distribution is added to generate the new pictures x_ab and x_ba. The generative model learns the mapping between the two data sets, and the discriminative model identifies whether a generated image is consistent with the distribution of the other data set. The parameters of the discriminator and the generator, the learning rate decay strategy and the corresponding optimizers are also defined in the initialization function, which additionally loads the VGG model used to calculate the perceptual loss.
The loss is mainly divided into 4 parts:
1. the loss between the reconstructed picture and the real picture;
2. the loss between the latent code obtained by encoding the reconstructed picture and the latent code obtained by encoding the real picture;
3. the loss between the original picture and the picture obtained by translating it to the target domain and back;
4. the domain perceptual loss computed with VGG.
After training reaches a specified number of iterations, inference is run on the pre-selected samples through the sample(self, x_a, x_b) function, and the results are stored in outputs.
Fourth step, style matching.
In order to incorporate semantic information, the semantic mask of the original image first needs to be downsampled accordingly, as described by the following formula:
$$m_l = \mathrm{downsampling}(m, \mathrm{scale}(l)) \quad (7)$$
where l is the network layer number, m_l is the semantic mask at network layer l, and scale(l) is the downsampling ratio applied to m at layer l, which is determined by the resolution of the input image and the output resolution of network layer l.
Then s and m are spliced along the feature dimension to form a new style feature s_n. To balance the influence of the traditional features and the semantic information on the style, a hyper-parameter λ is introduced: when λ = 0, only the traditional features are used for style migration, and when λ = +∞, only the semantic information is used; the user can choose different values according to the actual situation.
$$s_n = \mathrm{norm}(s_l) \,\|\, \lambda \cdot \mathrm{norm}(m_l) \quad (8)$$
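Formulas (7) and (8) can be sketched in NumPy as follows. The text does not define norm or the downsampling method, so this sketch assumes L2 normalization of each position's feature vector and strided nearest-neighbor downsampling of a one-hot mask; all function names are hypothetical.

```python
import numpy as np

def downsample_mask(mask, scale):
    """Formula (7): strided downsampling of an (H, W, K) mask by an integer factor."""
    return mask[::scale, ::scale]

def channel_norm(features, eps=1e-8):
    """Scale the feature vector at every spatial position to unit L2 norm (assumed)."""
    norms = np.linalg.norm(features, axis=-1, keepdims=True)
    return features / (norms + eps)

def fuse_style_and_semantics(style_features, mask, lam):
    """Formula (8): concatenate normalized style features and weighted mask."""
    scale = mask.shape[0] // style_features.shape[0]   # plays the role of scale(l)
    m_l = downsample_mask(mask, scale).astype(np.float64)
    return np.concatenate(
        [channel_norm(style_features), lam * channel_norm(m_l)], axis=-1
    )

rng = np.random.default_rng(0)
style_features = rng.normal(size=(4, 4, 8))           # layer-l features, 8 channels
classes = rng.integers(0, 3, size=(16, 16))           # semantic class per pixel
mask = np.eye(3)[classes]                             # one-hot mask, shape (16, 16, 3)
fused = fuse_style_and_semantics(style_features, mask, lam=10.0)
print(fused.shape)
```

With `lam=0.0` the mask channels are all zero, so only the traditional features influence matching; larger values of λ make the semantic channels dominate the comparison.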
The style fusion sub-network part is judged by cosine similarity, and the formula is as follows:
Fifth step, image similarity measurement.
As shown in fig. 7, taking the images before and after style migration as an example, the SSIM index is used to calculate the similarity between them, and the basic process is as follows:
1) for the input original image and the image y after style migration, first calculate the brightness characterization and compare the two to obtain the first similarity-related evaluation;
2) exclude the influence of the brightness characteristic, calculate the contrast characterization, and compare the two to obtain the second evaluation;
3) exclude the influence of the brightness and contrast characteristics, and compare the structures;
4) combine the three calculated characteristic values to obtain the similarity value of the two images, which is 0.2292.
At this point, the training process of the model is completed; in the testing stage, a single input image is sufficient to generate a plurality of images with different styles.
According to this embodiment, the data sets do not need to be paired (manually labeled in advance), the style overflow problem that may arise during image style migration is effectively solved, and images in a variety of styles can be generated from a single input image provided by the user, which meets the user's requirement for diversity.
Claims (10)
1. An image style migration system based on three-branch clustering semantic segmentation is characterized by comprising:
the image preprocessing module is used for adding Gaussian noise to the sample image and expanding image data so as to solve the problems of uneven texture in the image style migration process and poor style migration effect caused by insufficient sample data;
the semantic segmentation module is used for segmenting each semantic block in the content image and the style image respectively and providing basic semantic information for the subsequent style matching;
the characteristic extraction module is used for simultaneously extracting low-order and high-order characteristics of the content image and the style image and inputting the characteristics into a characteristic synthesis network to obtain an image fusing the content characteristics and the style characteristics;
the style matching module is used for matching the same type of objects in the content image and the original image so as to carry out style migration between the same type of objects;
the image similarity measurement module is used for measuring the similarity between every two images generated by the system and screening out the image with lower similarity as the final output of the system;
the image preprocessing module adds Gaussian noise to avoid the uneven texture that may otherwise arise in the content and style feature extraction module, and its data augmentation method effectively solves the under-fitting problem in the image style migration process; the semantic features of the content image and the style image obtained by the semantic segmentation module, together with the features of the content and style images obtained by the content and style feature extraction module, provide the inputs for the style matching module, and the image similarity measurement module optimizes the output of the whole system.
2. The image style migration system based on three-branch clustering semantic segmentation according to claim 1, wherein the semantic segmentation module is used for the following processes comprising:
normalizing the pixel values, and solving a clustering center, core domain label distribution and boundary domain label distribution by using a K-means algorithm; the pixel value normalization processing converts the image into a standard form to resist subsequent affine transformation; the K-means algorithm provides the obtained clustering centers as the initial input of the subsequent improved k nearest neighbor algorithm; the improved k nearest neighbor algorithm introduces the concept of three-branch clustering into the k nearest neighbor algorithm, sets different discrimination rules for the core domain and the boundary domain, and distributes labels to the sample points in two steps; the points to be distributed in the boundary domain are those that have not received a label after label distribution in the core domain, that is, points that cannot be decided by the core domain rule are classified into the boundary domain; the clustering of the sample points is completed through the above steps, and a semantic segmentation image is thereby obtained.
3. The image style migration system based on three-branch clustering semantic segmentation according to claim 1, wherein the content and style feature extraction module comprises a content encoder, a style encoder and a joint decoder; the content encoder is composed of a plurality of convolutional layers that down-sample the input, followed by residual blocks for further processing, and all of its convolutional layers are followed by instance normalization, which removes the original feature mean and variance that represent the style information; the style encoder comprises a plurality of convolutional layers, an average pooling layer and a fully connected layer; the joint decoder processes the content code through a set of residual blocks and then generates a reconstructed image through an up-sampling layer and a convolutional layer.
4. The image style migration system based on three-branch clustering semantic segmentation according to claim 1, wherein the image similarity measurement module is used for the following process: the SSIM index is used to calculate the similarity between every pair of style images generated by the system; the brightness, contrast and structural characteristics of the two images are compared in turn to obtain a similarity value, and the images with low similarity are screened out as the final output of the system.
5. An image style migration method based on three-branch clustering semantic segmentation is characterized by comprising the following steps:
step 1, image preprocessing: adding Gaussian noise to the original image; expanding the sample set by using a data augmentation method;
step 2, semantic segmentation: performing semantic segmentation on the image by a K-means three-branch clustering method improved with k nearest neighbors, to obtain semantic images of the different objects in the image;
step 3, feature extraction: extracting the content and style characteristics of the image by using a MUNIT model;
step 4, style matching: in order to fully integrate semantic information, the style matching network is divided into a semantic matching sub-network and a style integration sub-network; the two sub-networks can fully utilize the semantic information image obtained in the step 2;
step 5, measuring image similarity: and (3) calculating similarity values between different images pairwise by adopting an SSIM similarity measurement function, so that optimization is performed in the generated images of different styles, and a plurality of images with low similarity are further screened out and finally displayed to the user for output.
6. The image style migration method based on three-branch clustering semantic segmentation according to claim 5, wherein the step 1 image preprocessing process comprises:
step 1.1, adding Gaussian noise: the content image is preprocessed by constructing a Gaussian noise matrix with the same size and number of channels as the content image I_c, and the noise matrix is added to the original image to obtain an image containing Gaussian noise, which is used as the content input image; for any point (x_i, y_i) of a channel in the content image, the pixel value is denoted z, and the probability density function of the Gaussian noise is:
$$P(z) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(z - \mu)^2}{2\sigma^2}\right)$$
where z is the pixel value, P(z) is the probability density, σ is the standard deviation, and μ is the mean of the pixel values of all points;
step 1.2, data augmentation; by adopting any one or more of scaling transformation, clipping, color transformation, rotation and translation, a series of random changes are made on the training images to generate similar but different training samples, so that the scale of the training data set is enlarged, the dependence of the model on certain attributes is reduced, and the generalization capability of the model is improved.
7. The image style migration method based on the three-branch clustering semantic segmentation according to claim 5, wherein the step 2 semantic segmentation process comprises:
step 2.1, pixel value normalization processing: the invariant moment of the image is utilized to search parameters to eliminate the influence of other transformation functions on image transformation, so that the image can resist the attack of subsequent geometric transformation;
for ease of processing, the pixel values of all points are mapped to the range 0-1 by the formula:
$$\mathrm{data}' = \frac{\mathrm{data} - \min(\mathrm{data})}{\max(\mathrm{data}) - \min(\mathrm{data})}$$
where data is the original pixel value, data' is the normalized pixel value, min(data) is the minimum of the original pixel values, and max(data) is the maximum of the original pixel values;
step 2.2.K-means algorithm to obtain clustering center: selecting K points as the initial center of each cluster according to a certain strategy, and dividing data into the clusters closest to the K points, namely: dividing data into K clusters to finish one-time division; considering that the initial partition is not necessarily the best partition, the center point of each cluster is recalculated in the generated new clusters, and then the new clusters are divided again until the result of each division is kept unchanged; in practical application, the maximum iteration times are usually preset, and when the maximum iteration times are reached, the calculation is terminated;
a relatively reasonable set of clustering centers is thereby obtained, in preparation for the subsequent division into the core domain and the boundary domain;
step 2.3, core domain category label distribution: the idea of three-branch clustering is introduced to assist decision making; three-branch clustering divides the sample data into three regions, and for a category C, Co(C), Fγ(C) and Tγ(C) denote the core domain, the boundary domain and the outer region respectively; the core domain is the set of sample points that definitely belong to class C, the boundary domain is the set of sample points that may belong to class C, and the outer region is the set of sample points that definitely do not belong to class C;
the relationship of the three regions is as follows:
$$\mathrm{Co}(C) \cup F\gamma(C) \cup T\gamma(C) = U, \quad \mathrm{Co}(C) \cap F\gamma(C) = \varnothing, \quad \mathrm{Co}(C) \cap T\gamma(C) = \varnothing, \quad F\gamma(C) \cap T\gamma(C) = \varnothing$$
where U is the universal set, Co(C) is the core domain, Fγ(C) is the boundary domain, Tγ(C) is the outer region, and ∅ is the empty set;
that is, the three regions are mutually exclusive and have no intersection;
an improved k nearest neighbor algorithm is used, and the idea of three-branch clustering is introduced, so that labels are distributed to sample points except for a sample clustering center, and the clustering effect is achieved;
in the k nearest neighbor algorithm, the distance between one point and all other points is calculated, the K points closest to the point are taken, and the class of the point is determined as the class that accounts for the largest proportion among those K points; the distance between two points is generally the Euclidean distance, given by:
$$\rho = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$$
where ρ is the Euclidean distance between two points, and (x_1, y_1) and (x_2, y_2) are any two points;
the K points closest to a given point are thus obtained and are called the neighborhood points of that point; the shared neighborhood of two points is then obtained from their neighborhoods, in preparation for the label distribution of the subsequent core domain points and boundary domain points;
leaving the outer region aside, the class labels of the core domain and the boundary domain are assigned in different ways;
label assignment of core domain points: the discrimination formula of the core domain point is as follows:
where |SNN(this, next)| is the number of points in the shared neighborhood of the two points, this is the current point, and next is the point to be judged; when |SNN(this, next)| satisfies the formula, the next point is assigned to the class to which the this point belongs;
label assignment of boundary domain points: this is the process of assigning labels to the points left unlabeled after core domain assignment; an allocation matrix M is formed to record the classes of all neighborhood points of a given point, the cluster containing the most of its neighborhood points is selected, and its label is assigned to the unlabeled point.
8. The image style migration method based on the three-branch clustering semantic segmentation according to claim 5, wherein in the step 3, the process of extracting the content and style features comprises:
the MUNIT model is an extension of the UNIT model that performs translation between multi-modal data; UNIT assumes that different data sets can share the same hidden space, and the MUNIT model further divides the hidden space into a content hidden space and a style hidden space, wherein the style hidden space measures the difference between the original image and the target image;
as in the UNIT model, the encoding stage is composed of two auto-encoders; unlike the UNIT model, the input is mapped to the hidden space through two networks and decomposed there into content features and style features; in the decoding stage, the image is reconstructed from these two parts; the whole process requires the content and style losses to be minimized, and the loss function is defined as follows:
9. The image style migration method based on the three-branch clustering semantic segmentation according to claim 5, wherein in the step 4, the style matching process comprises:
performing style migration on semantic information obtained based on semantic segmentation, namely migration among objects of the same class;
in order to incorporate semantic information, the semantic mask of the original image is first downsampled accordingly, by the formula:
$$m_l = \mathrm{downsampling}(m, \mathrm{scale}(l)) \quad (7)$$
where m_l is the semantic mask at network layer l and scale(l) is the downsampling ratio applied to m at layer l, which is determined by the resolution of the input image and the output resolution of network layer l;
then the style features and the semantic mask are spliced along the feature dimension to form new style features; a hyper-parameter λ is introduced to balance the influence of the traditional features and the semantic information on the style: when λ = 0 only the traditional features are used for style migration, and when λ = +∞ only the semantic information is used;
$$s_n = \mathrm{norm}(s_l) \,\|\, \lambda \cdot \mathrm{norm}(m_l) \quad (8)$$
where s_l is the style feature after fusing semantic information for network layer l, m_l is the semantic information from the content image, and ∥ denotes splicing along the feature dimension;
in the style fusion sub-network, cosine similarity is used for the judgment, and the formula is as follows:
10. The image style migration method based on three-branch clustering semantic segmentation according to claim 5, wherein in the step 5, the image similarity measurement process comprises:
the structural similarity index (SSIM) measures the similarity between two images and is often used to evaluate restoration quality after image restoration modeling;
the SSIM index compares images by extracting three main features, namely luminance, contrast, and structure; in the specific implementation, the luminance of an image is characterized by the mean, the contrast by the variance, and the structure by the correlation coefficient, and the specific formula is as follows:
where $l(x, y)$ represents luminance, $c(x, y)$ represents contrast, $s(x, y)$ represents structure, $\mu_x$ is the mean of sample $x$, $\mu_y$ is the mean of sample $y$, $\sigma_x$ is the variance of $x$, $\sigma_y$ is the variance of $y$, and $\sigma_{xy}$ is the covariance of $x$ and $y$;
the similarity function is:
wherein SSIM is the image similarity measurement index, and $C_1$ and $C_2$ are constants.
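For illustration, the widely used two-constant form of SSIM, (2·μx·μy + C1)(2·σxy + C2) / ((μx² + μy² + C1)(var_x + var_y + C2)), can be sketched as below. This is an assumption about the intended formula, not a reproduction of it. The sketch computes one global score over flattened pixel lists rather than over local windows, and the constants C1 = (0.01·L)² and C2 = (0.03·L)² with L = 255 are the conventional defaults, not values taken from the claim.

```python
def ssim(x, y, data_range=255.0, k1=0.01, k2=0.03):
    """Global (single-window) SSIM between two equal-length pixel sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n   # variance of x
    var_y = sum((b - mu_y) ** 2 for b in y) / n   # variance of y
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (k1 * data_range) ** 2  # stabilizes the luminance term
    c2 = (k2 * data_range) ** 2  # stabilizes the contrast/structure term
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


original = [10.0, 20.0, 30.0, 40.0]
stylized = [12.0, 18.0, 33.0, 41.0]
print(ssim(original, original))  # identical inputs give 1.0 up to rounding
print(ssim(original, stylized))  # below 1.0 for differing inputs
```

Practical implementations usually average SSIM over sliding local windows; the global version here only shows how the mean, variance, and covariance terms combine.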
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111399319.2A CN113902613A (en) | 2021-11-19 | 2021-11-19 | Image style migration system and method based on three-branch clustering semantic segmentation |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113902613A (en) | 2022-01-07 |
Family
ID=79194933
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111399319.2A Pending CN113902613A (en) | 2021-11-19 | 2021-11-19 | Image style migration system and method based on three-branch clustering semantic segmentation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113902613A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160364625A1 (en) * | 2015-06-10 | 2016-12-15 | Adobe Systems Incorporated | Automatically Selecting Example Stylized Images for Image Stylization Operations Based on Semantic Content |
CN112069940A (en) * | 2020-08-24 | 2020-12-11 | 武汉大学 | Cross-domain pedestrian re-identification method based on staged feature learning |
Non-Patent Citations (2)
Title |
---|
SIDDHARTHA GAIROLA: "Unsupervised Image Style Embeddings for Retrieval and Recognition Tasks", 《PROCEEDINGS OF THE IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV)》, 31 December 2020 (2020-12-31), pages 3281 - 3289 * |
田瑶琳; 陈善雄; 赵富佳; 林小渝; 熊海灵: "Handwritten layout analysis and multi-style ancient book background fusion", Journal of Computer-Aided Design & Computer Graphics, no. 07, 11 July 2020 (2020-07-11) * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114549554A (en) * | 2022-02-22 | 2022-05-27 | 山东融瓴科技集团有限公司 | Air pollution source segmentation method based on style invariance |
CN114549554B (en) * | 2022-02-22 | 2024-05-14 | 山东融瓴科技集团有限公司 | Air pollution source segmentation method based on style invariance |
CN114511646A (en) * | 2022-04-19 | 2022-05-17 | 南通东德纺织科技有限公司 | Cloth style identification method and system based on image processing |
CN116137060A (en) * | 2023-04-20 | 2023-05-19 | 城云科技(中国)有限公司 | Same-scene multi-grid image matching method, device and application |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||