CN115965818A - Small sample image classification method based on similarity feature fusion - Google Patents
- Publication number
- CN115965818A (application CN202310032701.2A)
- Authority
- CN
- China
- Prior art keywords
- sample
- representation
- image
- text
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Image Analysis (AREA)
Abstract
The invention discloses a small sample image classification method based on similarity feature fusion, which comprises the following steps: step one, performing feature extraction on an input image; step two, extracting the similarity relation at the text end; step three, extracting the similarity relation among samples; step four, feature fusion based on text similarity; step five, feature fusion based on sample similarity; step six, multi-stage feature fusion; step seven, model training and testing. Based on the similarity between samples and between categories, the method fuses the features of the input small sample images with the natural image features of the base categories, which enriches the feature diversity of the small sample images, improves their category representation, strengthens the classifier's response to small sample images, and improves the accuracy of small sample image classification.
Description
Technical Field
The invention belongs to the field of image classification, and particularly relates to a small sample image classification method based on similarity feature fusion.
Background
In recent years, convolutional neural networks (CNNs) have shown strong performance on a wide range of visual tasks, including image classification and segmentation, but they rely on large-scale labeled data for training, and labeling data at that scale requires a large amount of manpower and material cost, which limits their application scenarios. To solve this problem, the task of small sample learning (FSL) has been proposed; it aims to classify test samples using only a limited number of training samples.
Currently, a pre-training approach is often adopted for the small sample learning (FSL) task. It uses a feature extractor (backbone) pre-trained on the base classes to directly extract sample features of the support classes and uses the features of the support samples to train a classifier. Training a robust feature extractor (backbone) can effectively improve the performance of a small sample learning (FSL) model; however, designing, training, and validating a feature extractor from scratch is time-consuming and expensive. Moreover, because the base classes and the support classes are disjoint, a feature extractor (backbone) pre-trained on the base classes tends to focus on the texture and structure information of the base-class samples it has learned and to ignore the details of the support samples, which leads to poor classification performance.
To address this insufficient classification performance on a small number of support samples, data-generation-based approaches generate new samples from the current support samples to assist the optimization of the classifier; however, they ignore the difference between the base classes and the support classes and introduce extra noise during data generation, which may mislead the classifier.
Based on the above analysis, reducing the bias in feature representations introduced by the differences between the base categories and the support categories and between the base samples and the support samples, so as to improve the classifier's response to the support categories, is a problem that small sample learning urgently needs to solve.
Disclosure of Invention
The invention aims to overcome the above defects in the prior art and provides a small sample image classification method based on similarity feature fusion, which improves the accuracy of small sample image classification by directly modeling the similarity between support samples and base samples and between support categories and base categories.
In order to achieve the purpose, the invention adopts the following technical scheme:
the invention relates to a small sample image classification method based on similarity feature fusion, which is characterized by comprising the following steps of:
step 1.1, acquiring a natural image set, inputting the natural image set into a pre-trained CNN model for feature extraction to obtain feature representation of a natural image and a basic category set thereof, and recording the feature representation and the basic category set asWherein it is present>Represents a feature representation of the i-th natural image, and->d represents the dimension of the feature representation>Represents the base class to which the i-th natural image belongs, and->C base Set of base classes, | C, representing a set of natural images base I denotes natureNumber of base classes of image set, N base Representing the number of natural images in each base category; />
Step 1.2, another image sample set is obtained and input into the pre-trained CNN model for feature extraction, and feature representation and support category sets of the image samples are obtained and recorded asWherein it is present>Represents a feature representation of a jth image sample, and +> Represents the support class to which the jth image sample belongs, and->C novel Represents a set of support classes for the image sample and satisfies C novel ∩C base =φ,|C novel I represents the number of supported classes of an image sample, N novel Representing the number of image samples in each support category;
step 2: extracting the similarity relation of the text ends:
step 2.1, extracting a basic category set C by using a pre-trained word embedding model base Vector representation of text information of each basic categoryWherein it is present>Vector representation representing the text information of the kth base class>t represents the dimension of the vector representation;
step 2.2, extracting a support category set C by using the pre-trained word embedding model novel Vector representation of text information of each support categoryWherein it is present>Vector representation of the text information representing the s-th support category, and->
Step 2.3, calculating the vector representation of the text information of the s-th support category by using the formula (1)With a vector representation of the ith base category text information->Is greater than or equal to>And the similarity relation between the text end of the s-th support category and the text end of one basic category is used as the similarity relation between the text end of the s-th support category and the text end of one basic category, so that a text end similarity relation vector between the s-th support category and all the basic categories is obtained>
In the formula (1), the reaction mixture is,represents->And/or>Is greater than or equal to>And/or>Respectively represent->And/or>The L2 paradigm of (1);
Step 3: extracting the similarity relation among samples:
Using equation (2), calculating the cosine distance R_I(j, i) between the feature representation x_j^novel of the j-th image sample and the feature representation x_i^base of the i-th natural image, and taking it as the similarity relation between the j-th image sample and one natural image, thereby obtaining the inter-sample similarity relation vector R_I(j) between the j-th image sample and all natural images;
R_I(j, i) = <x_j^novel, x_i^base> / (||x_j^novel||_2 · ||x_i^base||_2)    (2)
In equation (2), <x_j^novel, x_i^base> represents the inner product of x_j^novel and x_i^base, and ||x_j^novel||_2 and ||x_i^base||_2 respectively represent the L2 norms of x_j^novel and x_i^base;
Step 7: model training and testing:
Step 7.1, extracting the feature representations of the images in the base sample set and the support sample set with the feature extraction module; forming a similarity feature fusion module from the feature fusion based on text similarity, the feature fusion based on sample similarity, and the multi-stage feature fusion; and performing feature fusion on the support sample feature x_j^novel according to the selected fusion mode to obtain the fused sample;
Step 7.2, constructing a loss function L with equation (3);
In equation (3), L_CE represents the cross-entropy loss, γ represents the classifier, and λ is the harmonic factor used in the feature fusion; y_j^novel represents the category of the support sample and is consistent with the category of the fused sample;
Step 7.3, training the classifier γ with a gradient descent algorithm, computing the loss function L to update the parameters of the classifier γ, and stopping training when the number of training iterations reaches the set number, so as to obtain the trained classifier γ* for predicting the category of a new image sample.
The small sample image classification method based on similarity feature fusion is also characterized in that step 4 comprises the following steps:
Step 4.1, for the feature representation x_j^novel of the j-th image sample, recording the vector representation in V_novel of the text information corresponding to its support category as v_j^novel, and extracting the text similarity relation R_T(j) between v_j^novel and all base categories in the base category set C_base;
Step 4.2, from the text similarity relation R_T(j) of the feature representation x_j^novel of the j-th image sample, selecting the base categories corresponding to the β closest distances, and taking the feature representations of all natural images in these β base categories as the text-end candidate set D_textual = {x_r^textual}, wherein x_r^textual represents the feature representation of the r-th natural image in the text-end candidate set D_textual and serves as a candidate feature representation;
Step 4.3, generating a text-end random vector V_T ∈ R^d that obeys the 0-1 uniform distribution V_T ~ U(0, 1), defining a hyper-parameter α with α ∈ [0, 1], and constructing a text-end mask vector M_T ∈ R^d from the random vector V_T and the hyper-parameter α using equation (4);
In equation (4), v_Tt represents the t-th random value of the text-end random vector V_T, and m_Tt represents the t-th mask value of M_T;
Step 4.4, according to the candidate feature representation x_r^textual and the text-end mask vector M_T, performing feature fusion on the feature representation x_j^novel of the j-th image sample using equation (5) to generate the fused feature;
In equation (5), ⊙ denotes the element-wise (Hadamard) product of vectors, and λ is the harmonic factor randomly sampled from the Beta(2, 2) distribution.
Step 5 comprises the following steps:
Step 5.1, for the feature representation x_j^novel of the j-th image sample, extracting the inter-sample similarity relation R_I(j) between x_j^novel and the feature representations of all natural images in the base sample set D_base;
Step 5.2, from the inter-sample similarity relation R_I(j) of the current sample x_j^novel, selecting the feature representations of the γ closest natural images as the sample-end candidate set D_instance = {x_r^instance}, wherein x_r^instance represents the feature representation of the r-th natural image in the sample-end candidate set D_instance and serves as a candidate feature representation;
Step 5.3, generating a sample-end random vector V_I ∈ R^d that obeys the 0-1 uniform distribution V_I ~ U(0, 1), defining a hyper-parameter α with α ∈ [0, 1], and constructing a sample-end mask vector M_I ∈ R^d from the random vector V_I and the hyper-parameter α using equation (6);
In equation (6), v_Ik represents the k-th random value of the sample-end random vector V_I, and m_Ik represents the k-th mask value of M_I;
Step 5.4, according to the candidate feature representation x_r^instance and the sample-end mask vector M_I, performing feature fusion on the feature representation x_j^novel of the j-th image sample using equation (7) to generate the fused feature;
In equation (7), ⊙ denotes the element-wise (Hadamard) product of vectors, and λ is the harmonic factor randomly sampled from the Beta(2, 2) distribution.
Step 6 comprises the following steps:
Step 6.1, for the feature representation x_j^novel of the j-th image sample, recording the vector representation in V_novel of the text information corresponding to its support category as v_j^novel; extracting the text similarity relation R_T(j) between v_j^novel and all base categories in the base category set C_base; and extracting the inter-sample similarity relation R_I(j) between x_j^novel and the feature representations of all natural images in the base sample set D_base;
Step 6.2, from the text similarity relation R_T(j) of the feature representation x_j^novel of the j-th image sample, selecting the base categories corresponding to the β closest distances, and taking the feature representations of all natural images in these β base categories as the text-end candidate set D_textual = {x_r^textual}, wherein x_r^textual represents the feature representation of the r-th natural image in the text-end candidate set D_textual;
Step 6.3, from the text-end candidate set D_textual, selecting, according to the inter-sample similarity relation R_I(j), the γ closest base image samples as the candidate set D_candidate = {x_f^candidate}, wherein x_f^candidate represents the feature representation of the f-th natural image in the candidate set D_candidate and serves as a candidate feature representation for feature fusion;
Step 6.4, generating a random vector V ∈ R^d that obeys the 0-1 uniform distribution V ~ U(0, 1), defining a hyper-parameter α with α ∈ [0, 1], and constructing a mask vector M ∈ R^d from the random vector V and the hyper-parameter α using equation (8);
Step 6.5, according to the candidate feature representation x_f^candidate and the mask vector M, performing feature fusion on the feature representation x_j^novel of the j-th image sample using equation (9) to generate the fused feature;
In equation (9), ⊙ denotes the element-wise (Hadamard) product of vectors, and λ is the harmonic factor randomly sampled from the Beta(2, 2) distribution.
The electronic device according to the invention comprises a memory and a processor, wherein the memory is used for storing a program that supports the processor in executing the above small sample image classification method, and the processor is configured to execute the program stored in the memory.
The invention also relates to a computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, performs the steps of the above small sample image classification method.
Compared with the prior art, the invention has the beneficial effects that:
1. The invention designs a small sample image classification method based on similarity feature fusion. By directly modeling the similarity between support samples and base samples and between support categories and base categories, it alleviates the information loss and the insufficient attention to support-feature details caused by extracting support-class sample features with a feature extractor pre-trained only on the base classes.
2. The method simultaneously exploits the similarities between base categories and support categories and between base samples and support samples to generate new samples that are more discriminative, representative, and expressive. Compared with traditional data-generation-based methods, it reduces the bias and noise introduced during data generation, fully considers the difference between base categories and support categories, better assists the training of the classifier, and improves the classification accuracy of the small sample classification method.
3. Compared with traditional schemes based on training a feature extractor, the method directly trains the classifier on the generated support features, which is simpler and more efficient; it greatly reduces the time and computation cost of training a feature extractor, compensates for the semantic bias caused by category differences, and improves classification accuracy.
Drawings
FIG. 1 is a flowchart of a small sample image classification method based on similarity feature fusion according to the present invention;
FIG. 2 is a schematic diagram of inter-sample similarity relationship extraction according to the present invention;
FIG. 3 is a diagram illustrating text-end similarity relationship extraction according to the present invention;
FIG. 4 is a schematic diagram of a feature fusion method of the present invention.
Detailed Description
In this embodiment, a small sample classification method based on similarity feature fusion directly models the similarity between support samples and base samples and between support categories and base categories, generates new samples based on these similarities, refines the description of the support samples, and assists the optimization of the classifier, thereby reducing the semantic bias caused by category differences and improving the accuracy of the small sample image classification method. Specifically, as shown in FIG. 1, the method comprises the following steps:
before similarity relation extraction, image samples from a natural image set and another image set are first converted into feature representations through a CNN model pre-trained on the natural image set.
Step 1: feature extraction of the input image:
Step 1.1, acquiring a natural image set and inputting it into a pre-trained CNN model for feature extraction, so as to obtain the feature representations of the natural images and their base category set, recorded as D_base = {(x_i^base, y_i^base)}, wherein x_i^base ∈ R^d represents the feature representation of the i-th natural image, d represents the dimension of the feature representation, y_i^base ∈ C_base represents the base category to which the i-th natural image belongs, C_base represents the set of base categories of the natural image set, |C_base| represents the number of base categories of the natural image set, and N_base represents the number of natural images in each base category;
Step 1.2, acquiring another image sample set and inputting it into the pre-trained CNN model for feature extraction, so as to obtain the feature representations of the image samples and their support category set, recorded as {(x_j^novel, y_j^novel)}, wherein x_j^novel ∈ R^d represents the feature representation of the j-th image sample, y_j^novel ∈ C_novel represents the support category to which the j-th image sample belongs, C_novel represents the set of support categories of the image samples and satisfies C_novel ∩ C_base = ∅, |C_novel| represents the number of support categories of the image samples, and N_novel represents the number of image samples in each support category;
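As a concrete illustration of this feature-extraction step, the following Python sketch uses a pre-trained torchvision ResNet-18 as the CNN model; the backbone choice, the image paths, and the feature dimension d = 512 are assumptions for illustration only and are not fixed by this embodiment.

```python
# Illustrative sketch only: extract d-dimensional feature representations for the
# base (natural) images and the support images with a pre-trained CNN backbone.
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
backbone.fc = torch.nn.Identity()   # drop the classification head; output is a d = 512 feature
backbone.eval()

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def extract_features(image_paths):
    """Return an (N, d) tensor: one feature representation per input image."""
    batch = torch.stack([preprocess(Image.open(p).convert("RGB")) for p in image_paths])
    return backbone(batch)

x_base = extract_features(["base_0001.jpg", "base_0002.jpg"])    # features of natural images
x_novel = extract_features(["support_0001.jpg"])                 # features of support samples
```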
step 2: extracting the similarity relation of the text ends:
in order to implement feature fusion based on category text similarity, the similarity relationship between the text information of each support category and the text information of all basic categories needs to be extracted. Firstly, converting semantic labels of a basic category and a support category into a vector representation form by a pre-trained word embedding method, and then calculating a Cosine distance between the support category vector representation and the vector representation of each basic category as a similarity relation of a text end.
Step 2.1, using a pre-trained word embedding model to extract the vector representation of the text information of each base category in the base category set C_base, recorded as V_base = {v_k^base}, wherein v_k^base ∈ R^t represents the vector representation of the text information of the k-th base category and t represents the dimension of the vector representation;
Step 2.2, using the pre-trained word embedding model to extract the vector representation of the text information of each support category in the support category set C_novel, recorded as V_novel = {v_s^novel}, wherein v_s^novel ∈ R^t represents the vector representation of the text information of the s-th support category;
Step 2.3, using equation (1) to calculate the cosine distance R_T(s, k) between the vector representation v_s^novel of the text information of the s-th support category and the vector representation v_k^base of the text information of the k-th base category, and taking it as the text-end similarity relation between the s-th support category and one base category, thereby obtaining the text-end similarity relation vector R_T(s) between the s-th support category and all base categories;
R_T(s, k) = <v_s^novel, v_k^base> / (||v_s^novel||_2 · ||v_k^base||_2)    (1)
In equation (1), <v_s^novel, v_k^base> represents the inner product of v_s^novel and v_k^base, and ||v_s^novel||_2 and ||v_k^base||_2 respectively represent the L2 norms of v_s^novel and v_k^base;
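For illustration, the text-end vector representations of step 2 could be obtained as in the following sketch, which assumes pre-trained word2vec vectors loaded with gensim; the vector file name, the t = 300 dimension, and the example category names are placeholders, and multi-word labels are simply averaged.

```python
# Illustrative sketch only: vector representations of category text labels using a
# pre-trained word-embedding model (gensim word2vec here).
import numpy as np
from gensim.models import KeyedVectors

word_vectors = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True)   # placeholder vector file, t = 300

def embed_label(label):
    """Average the word vectors of a (possibly multi-word) category label."""
    tokens = [w for w in label.lower().split() if w in word_vectors]
    return np.mean([word_vectors[w] for w in tokens], axis=0)

V_base = np.stack([embed_label(c) for c in ["dog", "car", "tree"]])       # v_k^base, one row per base category
V_novel = np.stack([embed_label(c) for c in ["snow leopard", "drone"]])   # v_s^novel, one row per support category
```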
Step 3: extracting the similarity relation among samples:
In order to realize similarity feature fusion between samples, the similarity relation between the image sample of each support category and all natural image samples needs to be extracted: for each support-category image sample, the cosine distance between its feature representation and the feature representations of all natural image samples is calculated as the inter-sample similarity relation.
Step 3.1, using equation (2) to calculate the cosine distance R_I(j, i) between the feature representation x_j^novel of the j-th image sample and the feature representation x_i^base of the i-th natural image, and taking it as the similarity relation between the j-th image sample and one natural image, thereby obtaining the inter-sample similarity relation vector R_I(j) between the j-th image sample and all natural images;
R_I(j, i) = <x_j^novel, x_i^base> / (||x_j^novel||_2 · ||x_i^base||_2)    (2)
In equation (2), <x_j^novel, x_i^base> represents the inner product of x_j^novel and x_i^base, and ||x_j^novel||_2 and ||x_i^base||_2 respectively represent the L2 norms of x_j^novel and x_i^base;
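The cosine computations of equations (1) and (2) can be written compactly as a single matrix operation, as in the sketch below; the placeholder arrays merely stand in for the vector representations and feature representations produced in the previous steps.

```python
# Illustrative sketch of equations (1) and (2): row-wise cosine similarity between
# support-side vectors and base-side vectors.
import numpy as np

def cosine_matrix(A, B, eps=1e-12):
    """Rows of A against rows of B: <a, b> / (||a||_2 * ||b||_2)."""
    A_n = A / (np.linalg.norm(A, axis=1, keepdims=True) + eps)
    B_n = B / (np.linalg.norm(B, axis=1, keepdims=True) + eps)
    return A_n @ B_n.T

rng = np.random.default_rng(0)
V_novel, V_base = rng.normal(size=(5, 300)), rng.normal(size=(64, 300))      # placeholder text vectors
x_novel, x_base = rng.normal(size=(25, 512)), rng.normal(size=(6400, 512))   # placeholder image features

R_T = cosine_matrix(V_novel, V_base)   # text-end similarity: one row per support category
R_I = cosine_matrix(x_novel, x_base)   # inter-sample similarity: one row per support sample
```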
Step 4: feature fusion based on text similarity:
Step 4.1, for the feature representation x_j^novel of the j-th image sample, recording the vector representation in V_novel of the text information corresponding to its support category as v_j^novel, and extracting the text similarity relation R_T(j) between v_j^novel and all base categories in the base category set C_base;
Step 4.2, as shown in FIG. 2, from the text similarity relation R_T(j) of the feature representation x_j^novel of the j-th image sample, selecting the base categories corresponding to the β closest distances, and taking the feature representations of all natural images in these β base categories as the text-end candidate set D_textual = {x_r^textual}, wherein x_r^textual represents the feature representation of the r-th natural image in the text-end candidate set D_textual and serves as a candidate feature representation;
Step 4.3, generating a text-end random vector V_T ∈ R^d that obeys the 0-1 uniform distribution V_T ~ U(0, 1), and defining a hyper-parameter α with α ∈ [0, 1]; in this example α = 0.7; constructing a text-end mask vector M_T ∈ R^d from the random vector V_T and the hyper-parameter α using equation (3);
In equation (3), v_Tt represents the t-th random value of the text-end random vector V_T, and m_Tt represents the t-th mask value of M_T;
Step 4.4, according to the candidate feature representation x_r^textual and the text-end mask vector M_T, performing feature fusion on the feature representation x_j^novel of the j-th image sample using equation (4) to generate the fused feature;
In equation (4), ⊙ denotes the element-wise (Hadamard) product of vectors, and λ is the harmonic factor randomly sampled from the Beta(2, 2) distribution;
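The mask and fusion equations of steps 4.3 and 4.4 appear only as figures in this text, so the sketch below is one plausible mixup-style reading under explicit assumptions: the mask keeps a dimension when the 0-1 uniform value is below α, and the masked dimensions are mixed with a single randomly chosen candidate feature through λ ~ Beta(2, 2). It is an illustration, not the exact equations (3) and (4).

```python
# Illustrative sketch of steps 4.2-4.4 under stated assumptions (the exact mask and
# fusion equations are not reproduced in this text).
import numpy as np

rng = np.random.default_rng(0)

def fuse_by_text_similarity(x_j, R_T_row, base_feats, base_labels, beta=2, alpha=0.7):
    # Step 4.2: D_textual = features of natural images belonging to the beta most
    # text-similar base categories (base_labels holds the category index of each image).
    nearest_cats = np.argsort(R_T_row)[::-1][:beta]
    D_textual = base_feats[np.isin(base_labels, nearest_cats)]

    # Step 4.3 (assumed form): binary text-end mask M_T from V_T ~ U(0, 1) and alpha.
    V_T = rng.uniform(0.0, 1.0, size=x_j.shape[0])
    M_T = (V_T < alpha).astype(x_j.dtype)

    # Step 4.4 (assumed form): mixup-style fusion on the masked dimensions only,
    # with one randomly chosen candidate feature x_r^textual and lambda ~ Beta(2, 2).
    x_r = D_textual[rng.integers(len(D_textual))]
    lam = rng.beta(2.0, 2.0)
    return M_T * (lam * x_j + (1.0 - lam) * x_r) + (1.0 - M_T) * x_j
```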
Step 5: feature fusion based on sample similarity:
Step 5.1, for the feature representation x_j^novel of the j-th image sample, extracting the inter-sample similarity relation R_I(j) between x_j^novel and the feature representations of all natural images in the base sample set D_base;
Step 5.2, as shown in FIG. 3, from the inter-sample similarity relation R_I(j) of the current sample x_j^novel, selecting the feature representations of the γ closest natural images as the sample-end candidate set D_instance = {x_r^instance}, wherein x_r^instance represents the feature representation of the r-th natural image in the sample-end candidate set D_instance and serves as a candidate feature representation; in this example γ = 512;
Step 5.3, generating a sample-end random vector V_I ∈ R^d that obeys the 0-1 uniform distribution V_I ~ U(0, 1), and defining a hyper-parameter α with α ∈ [0, 1]; in this example α = 0.7; constructing a sample-end mask vector M_I ∈ R^d from the random vector V_I and the hyper-parameter α using equation (5);
In equation (5), v_Ik represents the k-th random value of the sample-end random vector V_I, and m_Ik represents the k-th mask value of M_I;
Step 5.4, according to the candidate feature representation x_r^instance and the sample-end mask vector M_I, performing feature fusion on the feature representation x_j^novel of the j-th image sample using equation (6) to generate the fused feature;
In equation (6), ⊙ denotes the element-wise (Hadamard) product of vectors, and λ is the harmonic factor randomly sampled from the Beta(2, 2) distribution;
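Under the same assumptions as the previous sketch, the sample-end fusion of step 5 differs only in how the candidate pool is built (the γ natural-image features closest to the current sample), as shown below.

```python
# Illustrative sketch of steps 5.1-5.4 under the same assumed mask-and-mix form.
import numpy as np

rng = np.random.default_rng(0)

def fuse_by_sample_similarity(x_j, R_I_row, base_feats, gamma=512, alpha=0.7):
    D_instance = base_feats[np.argsort(R_I_row)[::-1][:gamma]]   # gamma closest natural images
    V_I = rng.uniform(0.0, 1.0, size=x_j.shape[0])
    M_I = (V_I < alpha).astype(x_j.dtype)                        # assumed mask form
    x_r = D_instance[rng.integers(len(D_instance))]              # one candidate x_r^instance
    lam = rng.beta(2.0, 2.0)
    return M_I * (lam * x_j + (1.0 - lam) * x_r) + (1.0 - M_I) * x_j
```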
Step 6: multi-stage feature fusion:
Step 6.1, for the feature representation x_j^novel of the j-th image sample, recording the vector representation in V_novel of the text information corresponding to its support category as v_j^novel; extracting the text similarity relation R_T(j) between v_j^novel and all base categories in the base category set C_base; and extracting the inter-sample similarity relation R_I(j) between x_j^novel and the feature representations of all natural images in the base sample set D_base;
Step 6.2, from the text similarity relation R_T(j) of the feature representation x_j^novel of the j-th image sample, selecting the base categories corresponding to the β closest distances, and taking the feature representations of all natural images in these β base categories as the text-end candidate set D_textual = {x_r^textual}, wherein x_r^textual represents the feature representation of the r-th natural image in the text-end candidate set D_textual and serves as a candidate feature representation; in this example β = 2;
Step 6.3, from the text-end candidate set D_textual, selecting, according to the inter-sample similarity relation R_I(j), the γ closest base image samples as the candidate set D_candidate = {x_f^candidate}, wherein x_f^candidate represents the feature representation of the f-th natural image in the candidate set D_candidate and serves as a candidate feature representation for feature fusion; in this example γ = 512;
Step 6.4, as shown in FIG. 4, generating a random vector V ∈ R^d that obeys the 0-1 uniform distribution V ~ U(0, 1), and defining a hyper-parameter α with α ∈ [0, 1]; in this example α = 0.7; constructing a mask vector M ∈ R^d from the random vector V and the hyper-parameter α using equation (7);
Step 6.5, according to the candidate feature representation x_f^candidate and the mask vector M, performing feature fusion on the feature representation x_j^novel of the j-th image sample using equation (8) to generate the fused feature;
In equation (8), ⊙ denotes the element-wise (Hadamard) product of vectors, and λ is the harmonic factor randomly sampled from the Beta(2, 2) distribution;
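The multi-stage fusion of step 6 can be sketched by chaining the two selections, again under the assumed mask-and-mix form: the β most text-similar base categories are kept first, the γ most sample-similar features within that pool form D_candidate, and one candidate is fused with the support feature.

```python
# Illustrative sketch of steps 6.1-6.5 under the same assumed mask-and-mix form.
import numpy as np

rng = np.random.default_rng(0)

def fuse_multi_stage(x_j, R_T_row, R_I_row, base_feats, base_labels,
                     beta=2, gamma=512, alpha=0.7):
    # Step 6.2: indices of natural images in the beta most text-similar base categories.
    keep = np.isin(base_labels, np.argsort(R_T_row)[::-1][:beta])
    idx_textual = np.flatnonzero(keep)
    # Step 6.3: keep the gamma images in D_textual that are most similar to x_j.
    order = np.argsort(R_I_row[idx_textual])[::-1][:gamma]
    D_candidate = base_feats[idx_textual[order]]
    # Steps 6.4-6.5 (assumed form): mask from V ~ U(0, 1) and alpha, then mix.
    V = rng.uniform(0.0, 1.0, size=x_j.shape[0])
    M = (V < alpha).astype(x_j.dtype)
    x_f = D_candidate[rng.integers(len(D_candidate))]            # x_f^candidate
    lam = rng.beta(2.0, 2.0)
    return M * (lam * x_j + (1.0 - lam) * x_f) + (1.0 - M) * x_j
```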
Step 7: model training and testing:
Step 7.1, extracting the feature representations of the images in the base sample set and the support sample set with the feature extraction module; forming a similarity feature fusion module from the feature fusion based on text similarity, the feature fusion based on sample similarity, and the multi-stage feature fusion; and performing feature fusion on the support sample feature x_j^novel according to the selected fusion mode to obtain the fused sample;
Step 7.2, constructing a loss function L with equation (9);
In equation (9), L_CE represents the cross-entropy loss, γ represents the classifier, and λ is the harmonic factor used in the feature fusion; y_j^novel represents the category of the support sample and is consistent with the category of the fused sample;
Step 7.3, training the classifier γ with a gradient descent algorithm, computing the loss function L to update the parameters of the classifier γ, and stopping training when the number of training iterations reaches the set number, so as to obtain the trained classifier γ* for predicting the category of a new image sample.
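A minimal sketch of the training loop of step 7 is given below; since equation (9) is shown only as a figure, the plain cross-entropy objective and the linear classifier here are simplifying assumptions (the actual loss also involves the harmonic factor λ from the fusion step), and the feature dimension, class count, and learning rate are placeholders.

```python
# Illustrative sketch of step 7: a linear classifier trained with gradient descent
# and cross-entropy on fused support features.
import torch
import torch.nn as nn

d, num_support_classes = 512, 5
classifier = nn.Linear(d, num_support_classes)
optimizer = torch.optim.SGD(classifier.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

def train(fused_features, labels, iterations=100):
    """fused_features: (N, d) float tensor of fused samples; labels: (N,) long tensor."""
    for _ in range(iterations):                 # stop after a set number of iterations
        optimizer.zero_grad()
        loss = criterion(classifier(fused_features), labels)
        loss.backward()
        optimizer.step()
    return classifier                           # trained classifier gamma*
```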
In this embodiment, an electronic device comprises a memory and a processor, wherein the memory is used for storing a program that supports the processor in executing the above small sample classification method, and the processor is configured to execute the program stored in the memory.
In this embodiment, a computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the computer program performs the steps of the small sample classification method.
Claims (6)
1. A small sample image classification method based on similarity feature fusion is characterized by comprising the following steps:
step 1, feature extraction of an input image:
step 1.1, acquiring a natural image set and inputting it into a pre-trained CNN model for feature extraction to obtain the feature representations of the natural images and their base category set, recorded as D_base = {(x_i^base, y_i^base)}, wherein x_i^base ∈ R^d represents the feature representation of the i-th natural image, d represents the dimension of the feature representation, y_i^base ∈ C_base represents the base category to which the i-th natural image belongs, C_base represents the set of base categories of the natural image set, |C_base| represents the number of base categories of the natural image set, and N_base represents the number of natural images in each base category;
step 1.2, acquiring another image sample set and inputting it into the pre-trained CNN model for feature extraction to obtain the feature representations of the image samples and their support category set, recorded as {(x_j^novel, y_j^novel)}, wherein x_j^novel ∈ R^d represents the feature representation of the j-th image sample, y_j^novel ∈ C_novel represents the support category to which the j-th image sample belongs, C_novel represents the set of support categories of the image samples and satisfies C_novel ∩ C_base = ∅, |C_novel| represents the number of support categories of the image samples, and N_novel represents the number of image samples in each support category;
step 2: extracting the similarity relation of the text ends:
step 2.1, extracting, with a pre-trained word embedding model, the vector representation of the text information of each base category in the base category set C_base, recorded as V_base = {v_k^base}, wherein v_k^base ∈ R^t represents the vector representation of the text information of the k-th base category and t represents the dimension of the vector representation;
step 2.2, extracting, with the pre-trained word embedding model, the vector representation of the text information of each support category in the support category set C_novel, recorded as V_novel = {v_s^novel}, wherein v_s^novel ∈ R^t represents the vector representation of the text information of the s-th support category;
step 2.3, calculating, with equation (1), the cosine distance R_T(s, k) between the vector representation v_s^novel of the text information of the s-th support category and the vector representation v_k^base of the text information of the k-th base category, and taking it as the text-end similarity relation between the s-th support category and one base category, thereby obtaining the text-end similarity relation vector R_T(s) between the s-th support category and all base categories;
R_T(s, k) = <v_s^novel, v_k^base> / (||v_s^novel||_2 · ||v_k^base||_2)    (1)
in equation (1), <v_s^novel, v_k^base> represents the inner product of v_s^novel and v_k^base, and ||v_s^novel||_2 and ||v_k^base||_2 respectively represent the L2 norms of v_s^novel and v_k^base;
step 3: extracting the similarity relation among samples:
calculating, with equation (2), the cosine distance R_I(j, i) between the feature representation x_j^novel of the j-th image sample and the feature representation x_i^base of the i-th natural image, and taking it as the similarity relation between the j-th image sample and one natural image, thereby obtaining the inter-sample similarity relation vector R_I(j) between the j-th image sample and all natural images;
R_I(j, i) = <x_j^novel, x_i^base> / (||x_j^novel||_2 · ||x_i^base||_2)    (2)
in equation (2), <x_j^novel, x_i^base> represents the inner product of x_j^novel and x_i^base, and ||x_j^novel||_2 and ||x_i^base||_2 respectively represent the L2 norms of x_j^novel and x_i^base;
step 7: model training and testing:
step 7.1, extracting the feature representations of the images in the base sample set and the support sample set with the feature extraction module; forming a similarity feature fusion module from the feature fusion based on text similarity, the feature fusion based on sample similarity, and the multi-stage feature fusion; and performing feature fusion on the support sample feature x_j^novel according to the selected fusion mode to obtain the fused sample;
step 7.2, constructing a loss function L with equation (3);
in equation (3), L_CE represents the cross-entropy loss, γ represents the classifier, and λ is the harmonic factor used in the feature fusion; y_j^novel represents the category of the support sample and is consistent with the category of the fused sample;
step 7.3, training the classifier γ with a gradient descent algorithm, computing the loss function L to update the parameters of the classifier γ, and stopping training when the number of training iterations reaches the set number, so as to obtain the trained classifier γ* for predicting the category of a new image sample.
2. The method for classifying small sample images based on similarity feature fusion according to claim 1, wherein the step 4 comprises:
step 4.1, for the feature representation x_j^novel of the j-th image sample, recording the vector representation in V_novel of the text information corresponding to its support category as v_j^novel, and extracting the text similarity relation R_T(j) between v_j^novel and all base categories in the base category set C_base;
step 4.2, from the text similarity relation R_T(j) of the feature representation x_j^novel of the j-th image sample, selecting the base categories corresponding to the β closest distances, and taking the feature representations of all natural images in these β base categories as the text-end candidate set D_textual = {x_r^textual}, wherein x_r^textual represents the feature representation of the r-th natural image in the text-end candidate set D_textual and serves as a candidate feature representation;
step 4.3, generating a text-end random vector V_T ∈ R^d that obeys the 0-1 uniform distribution V_T ~ U(0, 1), defining a hyper-parameter α with α ∈ [0, 1], and constructing a text-end mask vector M_T ∈ R^d from the random vector V_T and the hyper-parameter α using equation (4);
in equation (4), v_Tt represents the t-th random value of the text-end random vector V_T, and m_Tt represents the t-th mask value of M_T;
step 4.4, according to the candidate feature representation x_r^textual and the text-end mask vector M_T, performing feature fusion on the feature representation x_j^novel of the j-th image sample using equation (5) to generate the fused feature.
3. The method for classifying small sample images based on similarity characteristic fusion according to claim 2, wherein the step 5 comprises:
step 5.1, for the feature representation x_j^novel of the j-th image sample, extracting the inter-sample similarity relation R_I(j) between x_j^novel and the feature representations of all natural images in the base sample set D_base;
step 5.2, from the inter-sample similarity relation R_I(j) of the current sample x_j^novel, selecting the feature representations of the γ closest natural images as the sample-end candidate set D_instance = {x_r^instance}, wherein x_r^instance represents the feature representation of the r-th natural image in the sample-end candidate set D_instance and serves as a candidate feature representation;
step 5.3, generating a sample-end random vector V_I ∈ R^d that obeys the 0-1 uniform distribution V_I ~ U(0, 1), defining a hyper-parameter α with α ∈ [0, 1], and constructing a sample-end mask vector M_I ∈ R^d from the random vector V_I and the hyper-parameter α using equation (6);
in equation (6), v_Ik represents the k-th random value of the sample-end random vector V_I, and m_Ik represents the k-th mask value of M_I;
step 5.4, according to the candidate feature representation x_r^instance and the sample-end mask vector M_I, performing feature fusion on the feature representation x_j^novel of the j-th image sample using equation (7) to generate the fused feature.
4. The method for classifying small sample images based on similarity feature fusion according to claim 3, wherein the step 6 comprises:
step 6.1, for the feature representation x_j^novel of the j-th image sample, recording the vector representation in V_novel of the text information corresponding to its support category as v_j^novel; extracting the text similarity relation R_T(j) between v_j^novel and all base categories in the base category set C_base; and extracting the inter-sample similarity relation R_I(j) between x_j^novel and the feature representations of all natural images in the base sample set D_base;
step 6.2, from the text similarity relation R_T(j) of the feature representation x_j^novel of the j-th image sample, selecting the base categories corresponding to the β closest distances, and taking the feature representations of all natural images in these β base categories as the text-end candidate set D_textual = {x_r^textual}, wherein x_r^textual represents the feature representation of the r-th natural image in the text-end candidate set D_textual;
step 6.3, from the text-end candidate set D_textual, selecting, according to the inter-sample similarity relation R_I(j), the γ closest base image samples as the candidate set D_candidate = {x_f^candidate}, wherein x_f^candidate represents the feature representation of the f-th natural image in the candidate set D_candidate and serves as a candidate feature representation for feature fusion;
step 6.4, generating a random vector V ∈ R^d that obeys the 0-1 uniform distribution V ~ U(0, 1), defining a hyper-parameter α with α ∈ [0, 1], and constructing a mask vector M ∈ R^d from the random vector V and the hyper-parameter α using equation (8);
step 6.5, according to the candidate feature representation x_f^candidate and the mask vector M, performing feature fusion on the feature representation x_j^novel of the j-th image sample using equation (9) to generate the fused feature.
5. An electronic device comprising a memory and a processor, wherein the memory is configured to store a program that enables the processor to perform the method of classifying a small sample image according to any one of claims 1-4, and the processor is configured to execute the program stored in the memory.
6. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method for classifying a small sample image according to any one of claims 1 to 4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310032701.2A CN115965818A (en) | 2023-01-10 | 2023-01-10 | Small sample image classification method based on similarity feature fusion |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310032701.2A CN115965818A (en) | 2023-01-10 | 2023-01-10 | Small sample image classification method based on similarity feature fusion |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115965818A (en) | 2023-04-14
Family
ID=87363362
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310032701.2A Pending CN115965818A (en) | 2023-01-10 | 2023-01-10 | Small sample image classification method based on similarity feature fusion |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115965818A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116452895A (en) * | 2023-06-13 | 2023-07-18 | 中国科学技术大学 | Small sample image classification method, device and medium based on multi-mode symmetrical enhancement |
CN116452895B (en) * | 2023-06-13 | 2023-10-20 | 中国科学技术大学 | Small sample image classification method, device and medium based on multi-mode symmetrical enhancement |
CN116503674A (en) * | 2023-06-27 | 2023-07-28 | 中国科学技术大学 | Small sample image classification method, device and medium based on semantic guidance |
CN116503674B (en) * | 2023-06-27 | 2023-10-20 | 中国科学技术大学 | Small sample image classification method, device and medium based on semantic guidance |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |