CN113902613A - Image style migration system and method based on three-branch clustering semantic segmentation - Google Patents


Info

Publication number
CN113902613A
CN113902613A
Authority
CN
China
Prior art keywords
image
style
points
content
semantic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111399319.2A
Other languages
Chinese (zh)
Inventor
程柳
祁云嵩
姜元昊
吴婷凤
赵呈祥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu University of Science and Technology
Original Assignee
Jiangsu University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu University of Science and Technology filed Critical Jiangsu University of Science and Technology
Priority to CN202111399319.2A priority Critical patent/CN113902613A/en
Publication of CN113902613A publication Critical patent/CN113902613A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/04Context-preserving transformations, e.g. by using an importance map
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods


Abstract

The invention discloses an image style migration system and method based on three-branch clustering semantic segmentation, comprising the following steps: image preprocessing, image semantic segmentation, extraction of image content and style features, style matching, and image similarity measurement. Applying semantic segmentation to image style migration effectively suppresses the style overflow that can occur during the migration process; the MUNIT model used is an unsupervised deep learning model that requires no paired data set and can produce images in a variety of styles, largely satisfying users' demand for diversity; the SSIM-based image similarity measurement step suppresses the generation of stylistically similar images, meeting the diversity requirement while ensuring the stability and effectiveness of the whole system.

Description

Image style migration system and method based on three-branch clustering semantic segmentation
Technical Field
The invention belongs to the technical field of image processing and pattern recognition, and particularly relates to an image style migration system and method based on three-branch clustering semantic segmentation.
Background
Deep neural networks have found a distinctive application in the style migration of images. Developed by Gatys, Johnson and others, image stylization can achieve satisfactory results under specific conditions. Popular image style migration algorithms currently fall into two classes: slow style migration based on image iteration, and fast style migration based on model iteration. Model-iteration methods include those based on feed-forward stylized models and those based on GANs. The two representative feed-forward works are those of Johnson et al. and Ulyanov et al., while GAN-based methods are more varied, each with advantages and disadvantages in different scenes. These methods perform well in scenes where semantic information is unobtrusive, but semantic mismatching easily occurs in semantically sensitive scenes, so applying semantic segmentation to image style migration is of great significance.
Semantic segmentation combines image classification, target detection and image segmentation: the image is divided into region blocks carrying certain semantic meanings and the semantic category of each block is identified, finally yielding a segmented image with semantic annotations. Work combining semantic segmentation with image style migration is still scarce; most current research focuses separately on improving the precision of semantic segmentation or the speed of style migration.
Chinese patent publication No. CN 112950454 A, titled "Image style migration method based on multi-scale semantic matching", discloses a method whose main feature is the extraction of multi-scale depth features from the content image and the style image.
Although the above methods can achieve a good style migration effect, the following problems remain: 1) paired data sets are difficult or even impossible to collect, which greatly limits image style migration; 2) after training, only results of a single style can be obtained, which cannot meet users' diversity requirements; 3) only the similarity of the overall image style is considered, so the specific style of a specific object cannot be preserved; 4) style overflow damages the harmony and visual appeal of the whole image; 5) in most models that use clustering during style migration, the clustering effect is poor and degrades the migration result; 6) among the multiple style images output by other schemes, some images are highly similar to each other.
Disclosure of Invention
Referring to fig. 1, the present invention aims to overcome the defects of the prior art by providing an image style migration system and method based on three-branch clustering semantic segmentation, which effectively solves several problems in the image style migration process and makes up for the corresponding technical shortcomings. Before style migration, semantic information is first extracted from the image; during migration, the semantic segmentation result is matched with the semantic information in the target image, achieving the purpose of overall style migration.
In order to solve the technical problems, the invention adopts the following technical scheme.
The invention relates to an image style migration system based on three-branch clustering semantic segmentation, which comprises the following steps:
the image preprocessing module is used for adding Gaussian noise to the sample image and expanding image data so as to solve the problems of uneven texture in the image style migration process and poor style migration effect caused by insufficient sample data;
the semantic segmentation module is used for segmenting each semantic block in the content image and the style image respectively and providing basic semantic information for the subsequent style matching. The process comprises: normalizing the pixel values, obtaining the cluster centers with the K-means algorithm, and then assigning core-domain and boundary-domain labels. The pixel value normalization converts the image into a standard form to resist subsequent affine transformation; the K-means algorithm supplies the obtained cluster centers as the initial input of the subsequently improved k-nearest-neighbour algorithm; the improved k-nearest-neighbour algorithm introduces the concept of three-branch clustering, sets different discrimination rules for the core domain and the boundary domain, and assigns labels to the sample points in two steps. The points to be assigned in the boundary domain are those left without labels after the core-domain pass, i.e. points the core domain cannot discriminate are classified into the boundary domain. Through these steps the clustering of the sample points is completed and the semantic segmentation image is obtained.
The characteristic extraction module is used for simultaneously extracting low-order and high-order characteristics of the content image and the style image and inputting the characteristics into a characteristic synthesis network to obtain an image fusing the content characteristics and the style characteristics;
the style matching module is used for matching objects of the same class in the content image and the original image so as to carry out style migration between them; it comprises a content encoder, a style encoder, and a joint decoder. The content encoder consists of several convolutional layers that downsample the input, followed by residual blocks for further processing; every convolutional layer is followed by instance normalization, which removes the original feature mean and variance that represent the style information. The style encoder comprises several convolutional layers, an average pooling layer and a fully connected layer. The joint decoder processes the content code through a set of residual blocks and then generates the reconstructed image through upsampling and convolutional layers.
The image similarity measurement module is used for measuring the pairwise similarity between the images generated by the system and selecting the images with low mutual similarity as the final output. The process comprises: computing the similarity between every pair of generated style images with the SSIM index, comparing the luminance, contrast and structural features of the two images to obtain a similarity value, and screening out the images with low similarity as the final output of the system.
The image preprocessing module adopts a Gaussian noise adding method to avoid the problem of uneven texture possibly occurring in the content and style extraction module, and the adopted data amplification method effectively solves the problem of under-fitting in the image style migration process; the semantic features of the content image and the style image obtained by the semantic segmentation module and the semantic features of the content and style image obtained by the content and style feature extraction module are used for providing input images for the style matching module, and the image similarity measurement module is used for optimizing the output of the whole system.
The invention discloses an image style migration method based on three-branch clustering semantic segmentation, which comprises the following steps of:
step 1, image preprocessing: adding Gaussian noise to the original image; expanding the sample set by using a data augmentation method;
step 2, semantic segmentation: performing semantic segmentation on the image with a K-means three-branch clustering method improved by k-nearest neighbours, obtaining semantic images of the different objects in the image;
step 3, feature extraction: extracting the content and style characteristics of the image by using a MUNIT model;
step 4, style matching: in order to fully integrate semantic information, the style matching network is divided into a semantic matching sub-network and a style integration sub-network; the two sub-networks can fully utilize the semantic information image obtained in the step 2;
step 5, image similarity measurement: the SSIM similarity function computes pairwise similarity values between the generated images of different styles, so that several images with low similarity are screened out and finally presented to the user as output.
Further, the step 1 image preprocessing process includes:
step 1.1, adding Gaussian noise: the content image is preprocessed by constructing a Gaussian noise matrix with the same size and channel count as the content image I_c; the noise matrix is added to the original image to obtain an image containing Gaussian noise, which serves as the content input image. For any point (x_i, y_i) of a channel in the content image, with pixel value z, the probability density function of the Gaussian noise is:
$$P(z)=\frac{1}{\sqrt{2\pi}\,\sigma}e^{-\frac{(z-\mu)^{2}}{2\sigma^{2}}} \qquad (1)$$
wherein z is a pixel point, P (z) is probability density, sigma is standard deviation, and mu is the average value of pixel values of all points;
step 1.2, data augmentation; by adopting any one or more of scaling transformation, clipping, color transformation, rotation and translation, a series of random changes are made on the training images to generate similar but different training samples, so that the scale of the training data set is enlarged, the dependence of the model on certain attributes is reduced, and the generalization capability of the model is improved.
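As a concrete illustration of step 1 (not part of the patent text), the preprocessing can be sketched in Python with NumPy; the function names, the σ value and the augmentation choices are illustrative assumptions:

```python
import numpy as np

def add_gaussian_noise(image, mu=0.0, sigma=10.0, rng=None):
    """Add Gaussian noise N(mu, sigma^2) to an image array.

    The noise matrix has the same height, width, and channel count as the
    input, matching the construction described in step 1.1.
    """
    rng = np.random.default_rng(rng)
    noise = rng.normal(mu, sigma, size=image.shape)
    noisy = image.astype(np.float64) + noise
    return np.clip(noisy, 0, 255).astype(np.uint8)  # keep 8-bit range

def augment(image):
    """Tiny augmentation set: original, horizontal flip, 90-degree rotation."""
    return [image, image[:, ::-1], np.rot90(image)]

img = np.full((4, 4, 3), 128, dtype=np.uint8)
noisy = add_gaussian_noise(img, sigma=5.0, rng=0)
samples = augment(img)
```

In practice the augmented copies would be appended to the training sample set before feature extraction.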
Further, the step 2 semantic segmentation process includes:
step 2.1, pixel value normalization processing: the invariant moment of the image is utilized to search parameters to eliminate the influence of other transformation functions on image transformation, so that the image can resist the attack of subsequent geometric transformation;
for ease of processing, the pixel values of all points are mapped to a range of 0-1, which is the formula:
$$data^{*}=\frac{data-\min(data)}{\max(data)-\min(data)} \qquad (2)$$
wherein, data is the original pixel value, min (data) is the minimum value of the original pixel value, and max (data) is the maximum value of the original pixel value;
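A minimal sketch of the min-max mapping above in plain Python (the function name and the division-by-zero guard for constant images are illustrative assumptions):

```python
def min_max_normalize(values):
    """Map pixel values into [0, 1] via (data - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:                      # constant image: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```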
step 2.2.K-means algorithm to obtain clustering center: selecting K points as the initial center of each cluster according to a certain strategy, and dividing data into the clusters closest to the K points, namely: dividing data into K clusters to finish one-time division; considering that the initial partition is not necessarily the best partition, the center point of each cluster is recalculated in the generated new clusters, and then the new clusters are divided again until the result of each division is kept unchanged; in practical application, the maximum iteration times are usually preset, and when the maximum iteration times are reached, the calculation is terminated;
then a relatively reasonable set of cluster centers is obtained, preparing for the subsequent division into core domain and boundary domain;
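The K-means step described in 2.2 can be sketched as a generic Lloyd iteration with a preset iteration cap; the function name and toy data are illustrative assumptions, not the patent's implementation:

```python
import numpy as np

def k_means_centers(points, k, max_iter=100, rng=0):
    """Lloyd's K-means: return cluster centers and labels.

    The resulting centers serve as the initial input of the improved
    k-nearest-neighbour step (step 2.3).
    """
    points = np.asarray(points, dtype=float)
    gen = np.random.default_rng(rng)
    centers = points[gen.choice(len(points), size=k, replace=False)]
    for _ in range(max_iter):                      # preset iteration cap
        # assign every point to its nearest center
        dists = np.linalg.norm(points[:, None] - centers[None, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([points[labels == j].mean(axis=0)
                                for j in range(k)])
        if np.allclose(new_centers, centers):      # partition unchanged: stop
            break
        centers = new_centers
    return centers, labels

pts = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
centers, labels = k_means_centers(pts, k=2)
```

On this well-separated toy data the iteration converges to the two cluster means regardless of the random initial pick.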
step 2.3, core domain category label assignment: the idea of three-branch clustering is introduced to assist the decision. Three-branch clustering divides the data samples into three regions: with C denoting a certain category, Co(C), Fγ(C) and Tγ(C) respectively denote the core domain, the boundary domain and the outer region. The core domain is the set of sample points that certainly belong to class C, the boundary domain the set of sample points that may belong to class C, and the outer region the set of sample points that certainly do not belong to class C;
The relationship of the three regions is:

$$Co(C)\cup F\gamma(C)\cup T\gamma(C)=U,\qquad Co(C)\cap F\gamma(C)=Co(C)\cap T\gamma(C)=F\gamma(C)\cap T\gamma(C)=\varnothing \qquad (3)$$

where U is the corpus, Co(C) the core domain, Fγ(C) the boundary domain, Tγ(C) the outer region, and ∅ the empty set; that is, the three regions are mutually exclusive with no intersection;
an improved k nearest neighbor algorithm is used, and the idea of three-branch clustering is introduced, so that labels are distributed to sample points except for a sample clustering center, and the clustering effect is achieved;
the K-nearest neighbor algorithm is characterized in that the distance between one point and all other points is calculated, the K points closest to the point are taken out, the class of the point is judged according to the class with the largest classification proportion in the K points, and the distance between the point and the point is generally the Euclidean distance, and the formula is as follows:
$$\rho=\sqrt{(x_{2}-x_{1})^{2}+(y_{2}-y_{1})^{2}} \qquad (4)$$

where ρ is the Euclidean distance between the two arbitrary points (x_1, y_1) and (x_2, y_2);
In this way the K points closest to a given point, called its neighbourhood points, are obtained; the shared neighbourhood is then derived from the neighbourhoods of the two points, preparing for the label assignment of the subsequent core-domain and boundary-domain points;
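The neighbourhood and shared-neighbourhood computation can be sketched as follows; function names and toy points are illustrative assumptions:

```python
import math

def k_nearest(points, i, k):
    """Indices of the k points nearest to points[i] by Euclidean distance."""
    dists = [(math.dist(points[i], points[j]), j)
             for j in range(len(points)) if j != i]
    return {j for _, j in sorted(dists)[:k]}      # ties broken by index

def shared_neighbourhood(points, a, b, k):
    """|SNN(a, b)|: size of the intersection of the two k-neighbourhoods."""
    return len(k_nearest(points, a, k) & k_nearest(points, b, k))

pts = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
```

The shared-neighbourhood size |SNN(this, next)| is what the core-domain discrimination rule of the patent thresholds.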
If the outer region is left aside, the category labels of the core domain and the boundary domain should be assigned in different ways;
label assignment of core domain points: the discrimination formula of the core domain point is as follows
(Equation (5), rendered as an image in the original, states a threshold condition on the shared-neighbourhood size |SNN(this, next)|.)
Wherein, | SNN (this, next) | is the number of two-point shared neighborhood points, this is the current point, and next is the point to be judged; when the number of points in the shared neighborhood of the next point and the this point, namely | SNN (this, next) | satisfies the formula, the next point is classified as the class to which the this point belongs;
Label assignment of boundary-domain points: this step reassigns the points left without labels after the core-domain assignment. An allocation matrix M is formed to record the categories of all neighbourhood points of a given point; the point is then assigned the label of the cluster in which most of its neighbourhood points lie.
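A hedged sketch of the boundary-domain pass as a majority vote over labelled neighbourhood points; the data structures (a label dictionary and a neighbourhood dictionary in place of the allocation matrix M) are illustrative assumptions, not the patent's implementation:

```python
from collections import Counter

def assign_boundary_labels(labels, neighbourhoods):
    """Second pass: give each still-unlabelled point the majority label
    among its labelled neighbourhood points.

    `labels` maps point index -> label or None (unlabelled);
    `neighbourhoods` maps unlabelled point index -> list of neighbour indices.
    """
    out = dict(labels)
    for p, lab in labels.items():
        if lab is not None:
            continue                               # core-domain label kept
        votes = Counter(out[q] for q in neighbourhoods[p]
                        if labels[q] is not None)
        if votes:
            out[p] = votes.most_common(1)[0][0]    # most frequent cluster wins
    return out

labels = {0: 'A', 1: 'A', 2: 'B', 3: None}
neighbourhoods = {3: [0, 1, 2]}
result = assign_boundary_labels(labels, neighbourhoods)
```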
Further, in the step 3, the process of extracting the content and style features includes;
The MUNIT model is an extension of the UNIT model to translation between multimodal data. UNIT assumes that different data sets can share the same hidden space; the MUNIT model further divides the hidden space into a content hidden space and a style hidden space, the latter measuring the difference between the original image and the target image;
the coding stage is composed of two self-coders as same as the UNIT model, and is different from the prior art that the coding stage is mapped to a hidden space through two parts of networks and is decomposed into the characteristics of two parts of content and style in the hidden space; then the reconstruction is also done from these two parts in the decoding phase; the whole process requires the content and style loss to be minimized, and the loss function is defined as follows:
$$\min_{E_1,E_2,G_1,G_2}\;\max_{D_1,D_2}\;\mathcal{L}=\mathcal{L}_{GAN}^{x_1}+\mathcal{L}_{GAN}^{x_2}+\lambda_x\left(\mathcal{L}_{recon}^{x_1}+\mathcal{L}_{recon}^{x_2}\right)+\lambda_c\left(\mathcal{L}_{recon}^{c_1}+\mathcal{L}_{recon}^{c_2}\right)+\lambda_s\left(\mathcal{L}_{recon}^{s_1}+\mathcal{L}_{recon}^{s_2}\right) \qquad (6)$$

where L_GAN is the adversarial loss, L_recon the reconstruction loss, and λ_x, λ_c, λ_s weights controlling the importance of the reconstruction terms.
Further, in step 4, the style matching process includes:
performing style migration on semantic information obtained based on semantic segmentation, namely migration among objects of the same class;
in order to incorporate semantic information, firstly, the semantic mask of the original image is correspondingly downsampled, and the formula is as follows:
$$m_l=\mathrm{downsampling}(m,\,\mathrm{scale}(l)) \qquad (7)$$

where m_l is the downsampled semantic mask and scale(l) is the downsampling ratio applied to m, determined by the resolution of the input image and the output resolution of network layer l;
then the style features are spliced along the feature dimension to form new style features. A hyper-parameter λ is introduced to balance the influence of the conventional features and of the semantic information on the style: when λ = 0 only the conventional features are used for style migration, and as λ → +∞ only the semantic information is used;
$$s_n=\mathrm{norm}(s_l)\,\|\,\lambda\cdot\mathrm{norm}(m_l) \qquad (8)$$

where s_n is the style feature of network layer l after fusing the semantic information, s_l the style feature of layer l, ∥ concatenation, and m_l the semantic information from the content image;
when the style matches the sub-network part, the cosine similarity is used for judgment, and the formula is as follows:
$$\mathrm{sim}\left(s_i^{t},s_j^{s}\right)=\frac{\left\langle \varphi\left(s_i^{t}\right),\varphi\left(s_j^{s}\right)\right\rangle}{\left\|\varphi\left(s_i^{t}\right)\right\|\cdot\left\|\varphi\left(s_j^{s}\right)\right\|} \qquad (9)$$

where φ is the function extracting the features of an image block, s_i^t a style feature of the target image, and s_j^s a style feature of the style image.
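The cosine-similarity judgment used by the style-matching sub-network can be sketched for plain feature vectors (the function name is an illustrative assumption; in the system the vectors would be the extracted patch features):

```python
import math

def cosine_similarity(u, v):
    """cos(u, v) = <u, v> / (||u|| * ||v||): 1.0 for parallel vectors,
    0.0 for orthogonal ones."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))
```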
Further, in step 5, the process of measuring the similarity of the image includes:
the structural similarity index SSIM is used for measuring the similarity between two images and is often used for evaluating the image restoration condition after image restoration modeling;
the SSIM index extracts three main features of brightness, contrast, and structure from an image to compare the images, and from the specific implementation point of view, the brightness of an image is characterized by a mean value, the contrast is characterized by a variance, and the structure is characterized by a correlation coefficient, and the specific formula is as follows:
$$l(x,y)=\frac{2\mu_x\mu_y+C_1}{\mu_x^{2}+\mu_y^{2}+C_1},\quad c(x,y)=\frac{2\sigma_x\sigma_y+C_2}{\sigma_x^{2}+\sigma_y^{2}+C_2},\quad s(x,y)=\frac{\sigma_{xy}+C_3}{\sigma_x\sigma_y+C_3} \qquad (10)$$

where l(x, y) is the luminance term, c(x, y) the contrast term and s(x, y) the structure term; μ_x and μ_y are the means of samples x and y, σ_x and σ_y their standard deviations, and σ_xy their covariance;
the similarity function is:
$$\mathrm{SSIM}(x,y)=\frac{\left(2\mu_x\mu_y+C_1\right)\left(2\sigma_{xy}+C_2\right)}{\left(\mu_x^{2}+\mu_y^{2}+C_1\right)\left(\sigma_x^{2}+\sigma_y^{2}+C_2\right)} \qquad (11)$$

where SSIM is the image similarity measure index and C_1, C_2 are small stabilising constants.
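A minimal global-statistics sketch of the similarity function above. Real implementations usually compute SSIM over sliding local windows and average; the constants assume the usual 8-bit dynamic range, C1 = (0.01·255)² and C2 = (0.03·255)²:

```python
import numpy as np

def ssim(x, y, c1=6.5025, c2=58.5225):
    """Global SSIM between two grayscale images (single-window variant)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mx, my = x.mean(), y.mean()            # luminance: means
    vx, vy = x.var(), y.var()              # contrast: variances
    cov = ((x - mx) * (y - my)).mean()     # structure: covariance
    return ((2 * mx * my + c1) * (2 * cov + c2) /
            ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2)))

a = np.arange(16, dtype=float).reshape(4, 4)
```

Identical images score 1.0; in the system, image pairs scoring near 1.0 would be treated as duplicates and one of them suppressed.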
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. The invention applies a semantic segmentation technique improved with K-means, KNN and three-branch clustering to an image style migration system, and introduces an SSIM-based image similarity measurement into the system to enhance the contrast between different results and suppress the repeated generation of similar images.
2. The improved semantic segmentation algorithm effectively solves the style overflow that often occurs during image style migration, and the integrated MUNIT model addresses both the lack of paired data sets before training and the limitation of generating only a single style image after training.
3. The image similarity measurement introduced at the later stage prevents the system from outputting multiple style images whose styles are so similar that the differences cannot be observed by the naked eye, effectively improving the diversity and stability of style migration while preserving the migration effect.
4. The image style migration system and method based on three-branch clustering semantic segmentation can be applied to style migration of traditional clothes, oil paintings and ceramic patterns, and are beneficial to developing traditional culture in China and further promoting vigorous development of the culture industry.
Drawings
Fig. 1 is a system block diagram and a method flowchart of an embodiment of the present invention.
Fig. 2 is a landscape artistic drawing with a starry-sky characteristic produced by the invention; the original drawing is a color image, here processed into a grayscale image.
Fig. 3 is a result obtained after image preprocessing according to an embodiment of the present invention, in which fig. 3a is an original image; FIG. 3b is an image with Gaussian noise added; FIG. 3c is the image of the original image after being flipped; the original image is a color image, which is now processed to be a grayscale image.
Fig. 4 is a diagram of semantic segmentation input/output according to an embodiment of the present invention, where fig. 4a is an original diagram, and fig. 4b is an image obtained by semantic segmentation.
FIG. 5 is a schematic diagram of the MUNIT model hidden space according to the present invention; the original image is a color image, which is now processed to be a grayscale image.
FIG. 6 is a diagram of a MUNIT model self-encoder structure according to the present invention.
Fig. 7 is two exemplary graphs for calculating image similarity in an embodiment of the present invention, where fig. 7a is a pre-style migration image,
FIG. 7b is an image after style migration; the original image is a color image, which is now processed to be a grayscale image.
Detailed description of the preferred embodiments
The present invention will be described in further detail with reference to the accompanying drawings.
As shown in FIG. 2, the image style migration system based on the three-branch clustering semantic segmentation provided by the invention can finally obtain various and artistic-interest-rich style images by applying the improved K-means clustering method to the MUNIT model.
As shown in fig. 1, an image style migration system based on three-branch clustering semantic segmentation of the present invention includes:
the image preprocessing module is used for adding Gaussian noise to the sample image and expanding image data so as to solve the problems of uneven texture in the image style migration process and poor style migration effect caused by insufficient sample data;
the semantic segmentation module is used for segmenting each semantic block in the content image and the style image respectively and providing basic semantic information for the subsequent style matching;
the content and style feature extraction module is used for simultaneously extracting low-order and high-order features of the content image and the style image and inputting the features into a feature synthesis network to obtain an image fused with the content feature and the style feature;
the style matching module is used for matching the same type of objects in the content image and the original image so as to carry out style migration between the same type of objects;
the image similarity measurement module is used for measuring the similarity between every two images generated by the system and screening out the image with lower similarity as the final output of the system;
the image preprocessing module adopts a Gaussian noise adding method to avoid the problem of uneven texture possibly occurring in the content and style extraction module, and the adopted data amplification method effectively solves the problem of under-fitting in the image style migration process; the semantic features of the content image and the style image obtained by the semantic segmentation module and the semantic features of the content and style image obtained by the content and style feature extraction module are used for providing input images for the style matching module, and the image similarity measurement module is used for optimizing the output of the whole system.
The image preprocessing module is used for the following processes:
(1) adding Gaussian noise to sample set images
Gaussian noise is added to all images in the initial sample set according to the following formula:
$$P(z)=\frac{1}{\sqrt{2\pi}\,\sigma}e^{-\frac{(z-\mu)^{2}}{2\sigma^{2}}}$$
wherein: z is a pixel point, P (z) is probability density, sigma is standard deviation, and mu is the average value of pixel values of all points;
(2) sample set data augmentation
An insufficient number of samples is usually an important factor affecting the training effect of the model and the performance of the whole system. The data augmentation method generates similar but different training samples by making a series of random changes to the training images, enlarging the training data set, reducing the model's dependence on certain attributes and improving its generalization capability.
The semantic segmentation module is used for the following processes, including:
normalizing the pixel values, and solving a clustering center, core domain label distribution and boundary domain label distribution by using a K-means algorithm;
(1) the main purpose of the pixel value normalization process is to transform the image into a standard form to resist subsequent affine transformations.
Pixel value normalization processing: the invariant moment of the image is utilized to search parameters to eliminate the influence of other transformation functions on image transformation, so that the image can resist the attack of subsequent geometric transformation;
for ease of processing, the pixel values of all points are mapped to a range of 0-1, which is the formula:
$$data^{*}=\frac{data-\min(data)}{\max(data)-\min(data)}$$
wherein, data is the original pixel value, min (data) is the minimum value of the original pixel value, and max (data) is the maximum value of the original pixel value;
(2) the K-means algorithm mainly aims at taking the obtained clustering center as the initial input of a subsequent improved KNN algorithm; selecting k points as the initial center of each cluster according to a certain strategy, and dividing data into the clusters closest to the k points, namely: dividing data into k clusters to finish one division; considering that the initial partition is not necessarily the best partition, the center point of each cluster is recalculated in the generated new clusters, and then the new clusters are divided again until the result of each division is kept unchanged; in practical application, the maximum iteration times are usually preset, and when the maximum iteration times are reached, the calculation is terminated;
(3) the main idea of the improved KNN algorithm is to introduce the concept of three-way clustering into the KNN algorithm.
Three-way clustering divides the data samples into three regions: for a given class C, Co(C), Fr(C) and Tr(C) denote the core domain, the boundary domain and the outer region, respectively; the core domain is the set of sample points that certainly belong to class C, the boundary domain is the set of sample points that may belong to class C, and the outer region is the set of sample points that certainly do not belong to class C;
the relationship of the three regions is as follows:

Co(C) ∪ Fr(C) ∪ Tr(C) = U,  Co(C) ∩ Fr(C) = Co(C) ∩ Tr(C) = Fr(C) ∩ Tr(C) = ∅

wherein U is the universal set, Co(C) is the core domain, Fr(C) is the boundary domain, Tr(C) is the outer region, and ∅ is the empty set;
the KNN algorithm computes the distance from one point to all other points, takes the k points closest to it, and assigns the point to the class that accounts for the largest proportion among those k points; the point-to-point distance is generally the Euclidean distance:

ρ = √((x₂ − x₁)² + (y₂ − y₁)²)

where ρ is the Euclidean distance between the two points (x₁, y₁) and (x₂, y₂), which may be any two points;
the specific method of label assignment is to set different discrimination rules for the core domain and the boundary domain, and to assign labels to the sample points in two steps; the discrimination rule of the core domain is:

[formula: a threshold condition on |SNN(this, next)|, not recoverable from the extracted text]

wherein |SNN(this, next)| is the number of points shared by the neighborhoods of the two points, this is the current point, and next is the point to be judged; when |SNN(this, next)| satisfies the formula, next is assigned the class to which this belongs;
the boundary domain contains the points left unlabeled after core-domain assignment, i.e. the points that the core-domain rule cannot decide are classified into the boundary domain; the clustering of the sample points is completed through these steps, yielding a semantically segmented image.
The content and style feature extraction module comprises:
a content encoder, a style encoder, and a joint decoder; the content encoder consists of several convolutional layers that downsample the input, followed by residual blocks for further processing; all convolutional layers are followed by instance normalization, which removes the original feature mean and variance that carry the style information; the style encoder consists of several convolutional layers, an average pooling layer, and a fully connected layer; the joint decoder processes the content code through a set of residual blocks and then generates the reconstructed image through upsampling and convolutional layers.
The loss function is defined as follows:

min_{E1,E2,G1,G2} max_{D1,D2} L(E1, E2, G1, G2, D1, D2) = L_GAN^{x1} + L_GAN^{x2} + λ_x(L_recon^{x1} + L_recon^{x2}) + λ_c(L_recon^{c1} + L_recon^{c2}) + λ_s(L_recon^{s1} + L_recon^{s2})

wherein L_GAN is the adversarial loss, L_recon is the reconstruction loss, and λ_x, λ_c, λ_s are weights controlling the importance of the reconstruction terms.
The style matching module comprises a semantic matching sub-module and a style fusion sub-module.
(1) The semantic matching sub-module matches each semantic region of the segmented image and builds the image style. To integrate semantic information, the semantic mask of the original image is first downsampled accordingly:

m_l = downsampling(m, scale(l))  (7)

wherein m_l is the semantic mask at network layer l, and scale(l) is the downsampling ratio applied to m, determined by the resolution of the input image and the output resolution of network layer l;
then the style features are concatenated along the feature dimension to form a new style feature; a hyper-parameter λ is introduced to balance the influence of the conventional features and the semantic information on the style: when λ = 0 only the conventional features are used for style migration, and when λ = +∞ only the semantic information is used;

s_n = norm(s_l) ‖ λ·norm(m_l),  (8)

wherein s_l is the style feature of network layer l before fusing semantic information, and m_l is the semantic information from the content image;
(2) the style fusion sub-module performs matching at the granularity of image blocks; the style matching sub-network judges by cosine similarity:

ρ(Φ_i(c), Φ_j(s)) = ⟨Φ_i(c), Φ_j(s)⟩ / (‖Φ_i(c)‖·‖Φ_j(s)‖)  (9)

where Φ is the function that extracts image-block features, Φ_i(c) is the style feature of a block of the target image, and Φ_j(s) is the style feature of a block of the style image.
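The cosine-similarity judgment above can be sketched as follows (an illustration, not the patent's implementation; the patch features are assumed to be pre-extracted flat vectors):

```python
import numpy as np

def cosine_sim(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity between two flattened patch-feature vectors
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def best_style_patch(content_feat: np.ndarray, style_feats: np.ndarray) -> int:
    # Index of the style patch whose feature is most similar to the content patch
    sims = [cosine_sim(content_feat, s) for s in style_feats]
    return int(np.argmax(sims))

c = np.array([1.0, 0.0])                              # content-patch feature
styles = np.array([[0.0, 1.0], [2.0, 0.1], [-1.0, 0.0]])  # candidate style patches
idx = best_style_patch(c, styles)
```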
The image similarity measurement module is used for the following processes, including:
the SSIM index is used to compute the pairwise similarity between the style images generated by the system: the luminance, contrast and structural features of each pair of images are compared to produce a similarity value, and the images with low mutual similarity are screened out as the final output of the system.
The three main features of an image are luminance, contrast and structure; the specific formulas are:

l(x, y) = (2μ_x μ_y + C₁) / (μ_x² + μ_y² + C₁)
c(x, y) = (2σ_x σ_y + C₂) / (σ_x² + σ_y² + C₂)
s(x, y) = (σ_xy + C₃) / (σ_x σ_y + C₃)

where l(x, y) is the luminance comparison, c(x, y) the contrast comparison, s(x, y) the structure comparison, μ_x and μ_y are the means of samples x and y, σ_x and σ_y their standard deviations, and σ_xy the covariance of x and y;
the similarity function is:

SSIM(x, y) = (2μ_x μ_y + C₁)(2σ_xy + C₂) / ((μ_x² + μ_y² + C₁)(σ_x² + σ_y² + C₂))

wherein SSIM is the image similarity index and C₁, C₂, C₃ are constants.
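A minimal single-window sketch of the SSIM computation just described (real implementations usually apply it over sliding windows; the C₁, C₂ values here follow the common (0.01·255)² and (0.03·255)² convention, which is an assumption, not stated in the patent):

```python
import numpy as np

def ssim(x: np.ndarray, y: np.ndarray, c1: float = 6.5025, c2: float = 58.5225) -> float:
    # Global (single-window) SSIM: compares mean (luminance), variance
    # (contrast) and covariance (structure) of the two images.
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx**2 + my**2 + c1) * (vx + vy + c2))

a = np.arange(64, dtype=np.float64).reshape(8, 8)  # illustrative 8x8 "image"
```

An image compared with itself scores exactly 1; any structural change lowers the score.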
The invention discloses an image style migration method based on three-branch clustering semantic segmentation, which comprises the following steps of:
first, preprocessing of the image.
As shown in fig. 3, figs. 3a and 3b illustrate the Gaussian-noise addition operation and the image flipping operation, respectively.
1. Adding Gaussian noise
To make uniform texture appear in the background of the output image after style migration, the content image must be preprocessed: a Gaussian noise matrix with the same size and channel number as the content image I_c is constructed, and the noise matrix is added to the original image to obtain an image containing Gaussian noise.
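A sketch of this noise-addition step, assuming 8-bit images and an illustrative σ (the patent does not fix these values):

```python
import numpy as np

def add_gaussian_noise(img: np.ndarray, mu: float = 0.0, sigma: float = 10.0) -> np.ndarray:
    # Build a Gaussian noise matrix with the same size and channel count as
    # the image, add it, and clip back into the valid 8-bit range.
    noise = np.random.normal(mu, sigma, img.shape)
    return np.clip(img.astype(np.float64) + noise, 0, 255).astype(np.uint8)

content = np.full((4, 4, 3), 128, dtype=np.uint8)  # toy content image I_c
noisy = add_gaussian_noise(content)
```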
2. Data augmentation
To address the poor style-migration results caused by insufficient image data, the data need to be augmented: a series of random changes to the training images generates similar but different training samples, enlarging the training data set, greatly reducing the model's dependence on particular attributes, and improving its generalization ability.
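A minimal augmentation sketch using random flips and a random crop, two of the transformations the patent lists (sizes and probabilities here are illustrative assumptions):

```python
import numpy as np

def augment(img: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    # Random horizontal/vertical flips followed by a random crop to 3/4 of
    # the original height and width: a similar-but-different sample.
    out = img
    if rng.random() < 0.5:
        out = out[:, ::-1]          # horizontal flip
    if rng.random() < 0.5:
        out = out[::-1, :]          # vertical flip
    h, w = out.shape[:2]
    ch, cw = (3 * h) // 4, (3 * w) // 4
    top = rng.integers(0, h - ch + 1)
    left = rng.integers(0, w - cw + 1)
    return out[top:top + ch, left:left + cw].copy()

rng = np.random.default_rng(0)
img = np.arange(64, dtype=np.uint8).reshape(8, 8)
aug = augment(img, rng)
```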
Second step, semantic segmentation
The whole semantic segmentation process is as follows:
1. pixel value normalization processing:
as shown in fig. 4a, the image has 499 × 701 pixels whose values initially lie in the range 0-255 (table a below); mapping all pixel values into the range 0-1 gives the processed data in table b:

[tables a and b: raw and normalized pixel values, not reproduced in the extracted text]
obtaining a clustering center by a K-means method:
k-means basic procedure:
assume that the input of the algorithm after normalization is data = {point₁, point₂, …, point_m}, the number of classes is k, and the maximum number of iterations is N; the output is then a partition of the original sample set, {C₁, C₂, …, C_k}.
(1) Select k objects from the data as initial cluster centers {μ₁, μ₂, …, μ_k}
(2) For each iteration, compute the distance from every object to each cluster center and partition as follows:
a) initialize the cluster partition C_t = ∅ for t = 1, 2, …, k
b) for each point_i in the sample set, compute its distance to every cluster center μ_j:

d_ij = ‖point_i − μ_j‖²  (12)

mark point_i with the cluster λ_i having the smallest d_ij and update C_λi = C_λi ∪ {point_i}.
c) recompute the cluster centers:

μ_j = (1 / |C_j|) Σ_{x ∈ C_j} x  (13)
(3) Repeat until the cluster centers no longer change or the maximum number of iterations is reached.
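The K-means steps above can be sketched as follows (random initial centers and a convergence check on the centers; the seed and data are illustrative):

```python
import numpy as np

def kmeans(points: np.ndarray, k: int, max_iter: int = 100, seed: int = 0):
    # Plain K-means: pick k initial centers, assign each point to the
    # nearest center (formula (12)), recompute centers as cluster means
    # (formula (13)), and repeat until the centers stop moving.
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), k, replace=False)]
    labels = np.zeros(len(points), dtype=int)
    for _ in range(max_iter):
        d = np.linalg.norm(points[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        new_centers = np.array([points[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

pts = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
centers, labels = kmeans(pts, 2)
```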
3. Core domain class label assignment
Class labels for the core domain and the boundary domain are assigned in different ways.
Obtaining k neighborhood points of each point according to a KNN algorithm, wherein the algorithm comprises the following steps:
1) calculating the distance between the test data and each training data;
2) sorting according to the increasing relation of the distances;
3) selecting the K points with the smallest distance to each point as that point's neighborhood points;
4) computing and storing, for every pair of points, the points shared by their neighborhoods; these shared points form the shared neighborhood;
the KNN algorithm is realized in Python as follows:

[Python listing, not reproduced in the extracted text]
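The Python listing referenced here is not reproduced in the extracted text; the following is a sketch of steps 1)-4) under the assumption of Euclidean distance over raw coordinates:

```python
import numpy as np

def knn_neighbors(points, k):
    # Steps 1)-3): for every point, the indices of its k nearest points
    # by Euclidean distance (the point itself excluded).
    d = np.linalg.norm(points[:, None] - points[None], axis=2)
    np.fill_diagonal(d, np.inf)              # a point is not its own neighbor
    return [set(np.argsort(row)[:k]) for row in d]

def shared_neighborhood(nbrs, i, j):
    # Step 4): the points shared by the neighborhoods of points i and j.
    return nbrs[i] & nbrs[j]

pts = np.array([[0.0], [1.0], [2.0], [10.0]])
nbrs = knn_neighbors(pts, 2)
snn = shared_neighborhood(nbrs, 0, 2)        # both have point 1 as a neighbor
```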
according to the obtained clustering center and k adjacent points of the clustering center, distributing labels to the adjacent points according to the following rules:
label assignment of core domain points:
[formula: a threshold condition on |KNNP(this, next)|, not recoverable from the extracted text]

wherein |KNNP(this, next)| is the number of points shared by the neighborhoods of the two points.
If the rule is satisfied, the point is assigned the same label as the cluster center; otherwise it is marked as unassigned.
Labels are then assigned to the neighborhood points of the cluster center's neighbors, and so on, until no further neighborhood points exist.
4. Boundary domain label assignment
Labels are assigned to the points not handled in the previous step, until no more points can be assigned a label.
The boundary-domain label assignment rule is: each remaining point takes the class that occurs most often among its neighborhood points, i.e.

label(next) = argmax_C |{p ∈ KNN(next) : label(p) = C}|
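A sketch of the two-step label assignment: since the patent's core-domain threshold formula is not recoverable from the extracted text, a stand-in rule |SNN(this, next)| ≥ θ·k is assumed; the boundary step uses the majority-vote behavior described above:

```python
from collections import Counter

def assign_labels(nbrs, labels, k, theta=0.5):
    # Core-domain pass: an unlabeled point inherits a labeled neighbor's
    # class when their shared neighborhood is large enough (the theta*k
    # threshold is an assumed stand-in for the patent's unrecovered rule).
    labels = list(labels)
    changed = True
    while changed:
        changed = False
        for i, lab in enumerate(labels):
            if lab is None:
                continue
            for j in nbrs[i]:
                if labels[j] is None and len(nbrs[i] & nbrs[j]) >= theta * k:
                    labels[j] = lab
                    changed = True
    # Boundary-domain pass: remaining points take the majority class
    # among their labeled neighborhood points.
    for i, lab in enumerate(labels):
        if lab is None:
            votes = Counter(labels[j] for j in nbrs[i] if labels[j] is not None)
            if votes:
                labels[i] = votes.most_common(1)[0][0]
    return labels

# Toy neighborhoods: points 0 and 3 are cluster centers of classes A and B.
nbrs = [{1, 2}, {0, 2}, {0, 1}, {1, 2}]
out = assign_labels(nbrs, ['A', None, None, 'B'], k=2)
```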
As shown in fig. 4b, the clustering of the pixel points of an image is now complete; the different objects in the image can be identified by their categories, yielding the semantic information of the image.
And thirdly, extracting the content and style characteristics.
The MUNIT model training process is shown in figs. 5 and 6.
Before the MUNIT model is formally trained, its parameters must be configured; the parameter table is as follows:

[parameter table, not reproduced in the extracted text]
the training of the MUNIT model is divided into three processes: forward propagation through the network, generator training and optimization, and discriminator training and optimization, with the optimization of the generator and the discriminator integrated in the initialization function.
Forward propagation is an encoding process: the two input pictures are first encoded to obtain their content codes and style codes; the codes are then swapped, and noise following a normal distribution is added to generate the new pictures x_ab and x_ba. The generator network models the mapping between the two data sets, and the discriminator identifies whether a generated image matches the distribution of the other data set. The initialization function also defines the parameters of the discriminator and generator, the learning-rate decay strategy and the corresponding optimizers, and loads the VGG model used to compute the perceptual loss.
The main loss is divided into 4 parts:
1. Loss between the reconstructed picture and the real picture
2. Loss between the latent code obtained by encoding the reconstructed picture and the latent code obtained by encoding the real picture
3. Loss between the original picture and the picture translated to the target domain and back
4. Perceptual loss computed with VGG
After training reaches the specified number of iterations, the pre-selected samples are run through the def sample(self, x_a, x_b) function and the inference results are stored in outputs.
And fourthly, style matching.
To incorporate semantic information, the semantic mask of the original image must first be downsampled accordingly, as in the following formula:

m_l = downsampling(m, scale(l))  (7)

wherein l is the network layer number, m_l is the semantic mask at network layer l, and scale(l) is the downsampling ratio applied to m, determined by the resolution of the input image and the output resolution of network layer l.
The style feature s_l and the mask m_l are then concatenated along the feature dimension to form a new style feature s_n; to balance the influence of the conventional features and the semantic information on the style, a hyper-parameter λ is introduced: when λ = 0 only the conventional features are used for style migration, and when λ = +∞ only the semantic information is used; the user can choose different values according to the actual situation.

s_n = norm(s_l) ‖ λ·norm(m_l),  (8)
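A sketch of formulas (7) and (8): nearest-neighbour mask downsampling followed by channel-wise concatenation (the L2 reading of norm(·) and the tensor layout are assumptions, not specified by the patent):

```python
import numpy as np

def norm(x: np.ndarray) -> np.ndarray:
    # Channel-wise L2 normalization (one possible reading of norm(.) in (8))
    return x / (np.linalg.norm(x, axis=0, keepdims=True) + 1e-8)

def downsample(mask: np.ndarray, scale: int) -> np.ndarray:
    # Nearest-neighbour downsampling of an (H, W) semantic mask, formula (7)
    return mask[::scale, ::scale]

def fuse_style(s_l: np.ndarray, m_l: np.ndarray, lam: float) -> np.ndarray:
    # Formula (8): s_n = norm(s_l) || lam * norm(m_l), concatenated on channels
    return np.concatenate([norm(s_l), lam * norm(m_l)], axis=0)

rng = np.random.default_rng(0)
s_l = rng.random((64, 8, 8))                    # layer-l style features (C, H, W)
m = (rng.random((32, 32)) > 0.5).astype(float)  # full-resolution semantic mask
m_l = downsample(m, 4)[None]                    # mask aligned to layer l, (1, 8, 8)
s_n = fuse_style(s_l, m_l, lam=0.5)
```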
The style fusion sub-network judges by cosine similarity:

ρ(Φ_i(c), Φ_j(s)) = ⟨Φ_i(c), Φ_j(s)⟩ / (‖Φ_i(c)‖·‖Φ_j(s)‖)  (9)
and fifthly, measuring the image similarity.
As shown in fig. 7, taking the images before and after style migration as an example, the SSIM index is used to compute the similarity before and after migration; the basic process is:
1) for the input original image x and the image y after style migration, first compute the luminance characterization and compare, obtaining a first similarity evaluation
2) removing the influence of luminance, compute the contrast characterization and compare, obtaining a second evaluation
3) excluding the influence of luminance and contrast, compare the structures
4) from the three computed feature values, the similarity of the two images is calculated to be 0.2292
At this point, the training process of the model is completed, and in the testing stage, a single image is input, so that a plurality of images with different styles can be generated.
In this embodiment, the data sets do not need to be paired (manually labeled in advance), the style-overflow problem that can arise during image style migration is effectively mitigated, and multiple images of different styles can be generated from a single input image supplied by the user, meeting the user's demand for diversity.

Claims (10)

1. An image style migration system based on three-branch clustering semantic segmentation is characterized by comprising:
the image preprocessing module is used for adding Gaussian noise to the sample image and expanding image data so as to solve the problems of uneven texture in the image style migration process and poor style migration effect caused by insufficient sample data;
the semantic segmentation module is used for segmenting each semantic block in the content image and the style image respectively and providing basic semantic information for the subsequent style matching;
the characteristic extraction module is used for simultaneously extracting low-order and high-order characteristics of the content image and the style image and inputting the characteristics into a characteristic synthesis network to obtain an image fusing the content characteristics and the style characteristics;
the style matching module is used for matching the same type of objects in the content image and the original image so as to carry out style migration between the same type of objects;
the image similarity measurement module is used for measuring the similarity between every two images generated by the system and screening out the image with lower similarity as the final output of the system;
the image preprocessing module adopts a Gaussian noise adding method to avoid the problem of uneven texture possibly occurring in the content and style extraction module, and the adopted data amplification method effectively solves the problem of under-fitting in the image style migration process; the semantic features of the content image and the style image obtained by the semantic segmentation module and the semantic features of the content and style image obtained by the content and style feature extraction module are used for providing input images for the style matching module, and the image similarity measurement module is used for optimizing the output of the whole system.
2. The image style migration system based on three-branch clustering semantic segmentation according to claim 1, wherein the semantic segmentation module is used for the following processes comprising:
normalizing the pixel values, and obtaining the cluster centers, core-domain label assignment and boundary-domain label assignment with a K-means algorithm; the pixel value normalization converts the image into a standard form to withstand subsequent affine transformations; the K-means algorithm provides the obtained cluster centers as the initial input of the subsequently improved K-nearest-neighbor algorithm; the improved K-nearest-neighbor algorithm introduces the concept of three-way clustering into the K-nearest-neighbor algorithm, sets different discrimination rules for the core domain and the boundary domain, and assigns labels to the sample points in two steps; the boundary domain contains the points left unlabeled after core-domain assignment, namely the points that the core-domain rule cannot decide are classified into the boundary domain; the clustering of the sample points is completed through these steps, yielding a semantically segmented image.
3. The image style migration system based on the three-branch clustering semantic segmentation of the claim 1 is characterized in that the content and style feature extraction module comprises a content encoder, a style encoder and a joint decoder; the content encoder is composed of a plurality of convolutional layers for downsampling the input and further processing using a residual block, all of which are followed by an instance normalization that acts to remove the original feature mean and variance representing the style information; the style encoder comprises a plurality of convolutional layers, an average pooling layer and a full-link layer; the joint decoder encodes the content through a set of residual blocks and then generates a reconstructed image through an upsampled layer and a convolutional layer.
4. The image style migration system based on three-branch clustering semantic segmentation according to claim 1, wherein the image similarity measurement module is used for the following process: the SSIM index is used to compute the pairwise similarity between the style images generated by the system, the luminance, contrast and structural features of each pair of images are compared to produce a similarity value, and the images with low similarity are screened out as the final output of the system.
5. An image style migration method based on three-branch clustering semantic segmentation is characterized by comprising the following steps:
step 1, image preprocessing: adding Gaussian noise to the original image; expanding the sample set by using a data augmentation method;
step 2, semantic segmentation: performing semantic segmentation on the image by a K-means three-branch clustering method improved by K neighbors to obtain semantic images of different objects in the image;
step 3, feature extraction: extracting the content and style characteristics of the image by using a MUNIT model;
step 4, style matching: in order to fully integrate semantic information, the style matching network is divided into a semantic matching sub-network and a style integration sub-network; the two sub-networks can fully utilize the semantic information image obtained in the step 2;
step 5, measuring image similarity: similarity values between the different generated images are computed pairwise with an SSIM similarity measurement function, so that the generated images of different styles are optimized; several images with low similarity are further screened out and finally displayed to the user as output.
6. The image style migration method based on three-branch clustering semantic segmentation according to claim 5, wherein the step 1 image preprocessing process comprises:
step 1.1, Gaussian noise is added: the content image is preprocessed by constructing a Gaussian noise matrix with the same size and channel number as the content image I_c and adding the noise matrix to the original image, obtaining an image containing Gaussian noise that is used as the content input image; for any point (x_i, y_i) of a channel in the content image, its pixel value can be denoted z, and the probability density function of the Gaussian noise is:

P(z) = (1 / (√(2π)·σ)) · e^(−(z−μ)² / (2σ²))

wherein z is a pixel value, P(z) is the probability density, σ is the standard deviation, and μ is the mean of the pixel values of all points;
step 1.2, data augmentation; by adopting any one or more of scaling transformation, clipping, color transformation, rotation and translation, a series of random changes are made on the training images to generate similar but different training samples, so that the scale of the training data set is enlarged, the dependence of the model on certain attributes is reduced, and the generalization capability of the model is improved.
7. The image style migration method based on the three-branch clustering semantic segmentation according to claim 5, wherein the step 2 semantic segmentation process comprises:
step 2.1, pixel value normalization processing: the invariant moment of the image is utilized to search parameters to eliminate the influence of other transformation functions on image transformation, so that the image can resist the attack of subsequent geometric transformation;
for ease of processing, the pixel values of all points are mapped into the range 0-1 by the formula:

data' = (data − min(data)) / (max(data) − min(data))

wherein data is the original pixel value, min(data) is the minimum of the original pixel values, and max(data) is the maximum of the original pixel values;
step 2.2.K-means algorithm to obtain clustering center: selecting K points as the initial center of each cluster according to a certain strategy, and dividing data into the clusters closest to the K points, namely: dividing data into K clusters to finish one-time division; considering that the initial partition is not necessarily the best partition, the center point of each cluster is recalculated in the generated new clusters, and then the new clusters are divided again until the result of each division is kept unchanged; in practical application, the maximum iteration times are usually preset, and when the maximum iteration times are reached, the calculation is terminated;
a relatively reasonable set of cluster centers is thus obtained, preparing for the subsequent division into core domain and boundary domain;
step 2.3, core domain category label assignment: the idea of three-way clustering is introduced to assist the decision; three-way clustering divides the data samples into three regions: for a given class C, Co(C), Fr(C) and Tr(C) denote the core domain, the boundary domain and the outer region, respectively; the core domain is the set of sample points that certainly belong to class C, the boundary domain is the set of sample points that may belong to class C, and the outer region is the set of sample points that certainly do not belong to class C;
the relationship of the three regions is as follows:

Co(C) ∪ Fr(C) ∪ Tr(C) = U,  Co(C) ∩ Fr(C) = Co(C) ∩ Tr(C) = Fr(C) ∩ Tr(C) = ∅

wherein U is the universal set, Co(C) is the core domain, Fr(C) is the boundary domain, Tr(C) is the outer region, and ∅ is the empty set; that is, the three regions are mutually exclusive with no intersection;
an improved k nearest neighbor algorithm is used, and the idea of three-branch clustering is introduced, so that labels are distributed to sample points except for a sample clustering center, and the clustering effect is achieved;
the K-nearest-neighbor algorithm computes the distance from one point to all other points, takes the K points closest to it, and assigns the point to the class that accounts for the largest proportion among those K points; the point-to-point distance is generally the Euclidean distance:

ρ = √((x₂ − x₁)² + (y₂ − y₁)²)

where ρ is the Euclidean distance between the two points (x₁, y₁) and (x₂, y₂), which may be any two points;
the K points closest to a given point are thus obtained and called the neighborhood points of that point; the shared neighborhood is then obtained from the neighborhoods of two points, preparing for the label assignment of the subsequent core-domain and boundary-domain points;
if the outer region is not considered, the manner of the core domain and the edge domain category labels should be different;
label assignment of core domain points: the discrimination rule of a core domain point is

[formula: a threshold condition on |SNN(this, next)|, not recoverable from the extracted text]

wherein |SNN(this, next)| is the number of points shared by the neighborhoods of the two points, this is the current point, and next is the point to be judged; when |SNN(this, next)| satisfies the formula, next is assigned the class to which this belongs;
label assignment of boundary domain points: this is the process of reassigning the points left unlabeled by the core-domain step; an assignment matrix M is formed that records the classes of all neighborhood points of a given point, and the unassigned point is given the label of the cluster in which most of its neighborhood points lie.
8. The image style migration method based on the three-branch clustering semantic segmentation according to claim 5, wherein in the step 3, the process of extracting the content and style features comprises;
the MUNIT model is an extension of the UNIT model that performs translation between multimodal data; UNIT assumes that different data sets can share the same latent space, and the MUNIT model further divides the latent space into a content latent space and a style latent space, the style latent space measuring the difference between the original image and the target image;
the encoding stage consists of two auto-encoders, as in the UNIT model; the difference is that the input is mapped to the latent space through two sub-networks and decomposed there into content and style components; the decoding stage then reconstructs from these two components; the whole process minimizes the content and style losses, and the loss function is defined as follows:
min_{E1,E2,G1,G2} max_{D1,D2} L(E1, E2, G1, G2, D1, D2) = L_GAN^{x1} + L_GAN^{x2} + λ_x(L_recon^{x1} + L_recon^{x2}) + λ_c(L_recon^{c1} + L_recon^{c2}) + λ_s(L_recon^{s1} + L_recon^{s2})

wherein L_GAN is the adversarial loss, L_recon is the reconstruction loss, and λ_x, λ_c, λ_s are weights controlling the importance of the reconstruction terms.
9. The image style migration method based on the three-branch clustering semantic segmentation according to claim 5, wherein in the step 4, the style matching process comprises:
performing style migration on semantic information obtained based on semantic segmentation, namely migration among objects of the same class;
to incorporate semantic information, the semantic mask of the original image is first downsampled accordingly:

m_l = downsampling(m, scale(l))  (7)

wherein m_l is the semantic mask at network layer l, and scale(l) is the downsampling ratio applied to m, determined by the resolution of the input image and the output resolution of network layer l;
then the style features are concatenated along the feature dimension to form a new style feature; a hyper-parameter λ is introduced to balance the influence of the conventional features and the semantic information on the style: when λ = 0 only the conventional features are used for style migration, and when λ = +∞ only the semantic information is used;

s_n = norm(s_l) ‖ λ·norm(m_l),  (8)

wherein s_l is the style feature of network layer l before fusing semantic information, and m_l is the semantic information from the content image;
the style matching sub-network judges by cosine similarity:

ρ(Φ_i(c), Φ_j(s)) = ⟨Φ_i(c), Φ_j(s)⟩ / (‖Φ_i(c)‖·‖Φ_j(s)‖)  (9)

where Φ is the function that extracts image-block features, Φ_i(c) is the style feature of a block of the target image, and Φ_j(s) is the style feature of a block of the style image.
10. The image style migration method based on three-branch clustering semantic segmentation according to claim 5, wherein in the step 5, the image similarity measurement process comprises:
the structural similarity index SSIM measures the similarity between two images and is often used to evaluate reconstruction quality after image restoration;
the SSIM index extracts three main features from an image, luminance, contrast, and structure, to compare images; in terms of implementation, luminance is characterized by the mean, contrast by the variance, and structure by the correlation coefficient; the specific formulas are:

l(x, y) = (2μ_x μ_y + C₁) / (μ_x² + μ_y² + C₁)
c(x, y) = (2σ_x σ_y + C₂) / (σ_x² + σ_y² + C₂)
s(x, y) = (σ_xy + C₃) / (σ_x σ_y + C₃)

where l(x, y) is the luminance comparison, c(x, y) the contrast comparison, s(x, y) the structure comparison, μ_x and μ_y are the means of samples x and y, σ_x and σ_y their standard deviations, and σ_xy the covariance of x and y;
the similarity function is:

SSIM(x, y) = (2μ_x μ_y + C₁)(2σ_xy + C₂) / ((μ_x² + μ_y² + C₁)(σ_x² + σ_y² + C₂))

wherein SSIM is the image similarity index and C₁, C₂, C₃ are constants.
CN202111399319.2A 2021-11-19 2021-11-19 Image style migration system and method based on three-branch clustering semantic segmentation Pending CN113902613A (en)

Publications (1)

Publication Number Publication Date
CN113902613A true CN113902613A (en) 2022-01-07

Family

ID=79194933

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111399319.2A Pending CN113902613A (en) 2021-11-19 2021-11-19 Image style migration system and method based on three-branch clustering semantic segmentation

Country Status (1)

Country Link
CN (1) CN113902613A (en)


Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160364625A1 (en) * 2015-06-10 2016-12-15 Adobe Systems Incorporated Automatically Selecting Example Stylized Images for Image Stylization Operations Based on Semantic Content
CN112069940A (en) * 2020-08-24 2020-12-11 武汉大学 Cross-domain pedestrian re-identification method based on staged feature learning


Non-Patent Citations (2)

Title
SIDDHARTHA GAIROLA: "Unsupervised Image Style Embeddings for Retrieval and Recognition Tasks", Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 31 December 2020 (2020-12-31), pages 3281-3289 *
田瑶琳; 陈善雄; 赵富佳; 林小渝; 熊海灵: "Handwritten layout analysis and multi-style ancient-book background fusion" (in Chinese), Journal of Computer-Aided Design & Computer Graphics, no. 07, 11 July 2020 (2020-07-11) *

Cited By (4)

Publication number Priority date Publication date Assignee Title
CN114549554A (en) * 2022-02-22 2022-05-27 山东融瓴科技集团有限公司 Air pollution source segmentation method based on style invariance
CN114549554B (en) * 2022-02-22 2024-05-14 山东融瓴科技集团有限公司 Air pollution source segmentation method based on style invariance
CN114511646A (en) * 2022-04-19 2022-05-17 南通东德纺织科技有限公司 Cloth style identification method and system based on image processing
CN116137060A (en) * 2023-04-20 2023-05-19 城云科技(中国)有限公司 Same-scene multi-grid image matching method, device and application

Similar Documents

Publication Publication Date Title
CN107977671B (en) Tongue picture classification method based on multitask convolutional neural network
CN106547880B (en) Multi-dimensional geographic scene identification method fusing geographic area knowledge
CN113902613A (en) Image style migration system and method based on three-branch clustering semantic segmentation
Sowmya et al. Colour image segmentation using fuzzy clustering techniques and competitive neural network
CN111242841B (en) Image background style migration method based on semantic segmentation and deep learning
Varish et al. Image retrieval scheme using quantized bins of color image components and adaptive tetrolet transform
Kadam et al. Detection and localization of multiple image splicing using MobileNet V1
CN110766051A (en) Lung nodule morphological classification method based on neural network
CN110119753B (en) Lithology recognition method by reconstructed texture
Li et al. Globally and locally semantic colorization via exemplar-based broad-GAN
CN110047139B (en) Three-dimensional reconstruction method and system for specified target
Cao et al. Ancient mural restoration based on a modified generative adversarial network
JP2008097607A (en) Method to automatically classify input image
Pesaresi et al. A new compact representation of morphological profiles: Report on first massive VHR image processing at the JRC
CN113743484A (en) Image classification method and system based on space and channel attention mechanism
Zhang et al. Improved adaptive image retrieval with the use of shadowed sets
CN113592893A (en) Image foreground segmentation method combining determined main body and refined edge
CN112990340B (en) Self-learning migration method based on feature sharing
Kim et al. Image-based TF colorization with CNN for direct volume rendering
CN115033721A (en) Image retrieval method based on big data
CN112836755B (en) Sample image generation method and system based on deep learning
Hanif et al. Blind bleed-through removal in color ancient manuscripts
CN110210561B (en) Neural network training method, target detection method and device, and storage medium
Yuan et al. Explore double-opponency and skin color for saliency detection
CN112884773B (en) Target segmentation model based on target attention consistency under background transformation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination