CN113963165A - Small sample image classification method and system based on self-supervision learning - Google Patents

Info

Publication number
CN113963165A
Authority
CN
China
Prior art keywords
image
classifier
learning
representing
rotation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111098484.4A
Other languages
Chinese (zh)
Other versions
CN113963165B (en)
Inventor
王蕊
施璠
操晓春
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Information Engineering of CAS
Original Assignee
Institute of Information Engineering of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Information Engineering of CAS
Priority to CN202111098484.4A
Priority claimed from CN202111098484.4A
Publication of CN113963165A
Application granted
Publication of CN113963165B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a small sample image classification method and system based on self-supervised learning, belonging to the technical field of computer vision. A feature extractor with good generalization performance is trained on all training data of a data set by means of self-supervised learning, contrastive learning and co-learning. Self-supervised learning is applied to the training of small sample learning to improve the representation capability of the feature extractor; contrastive learning is applied to small sample learning while the metric function is optimized, so that the features learned by the feature extractor have clearer classification boundaries; co-learning is applied to the training of small sample learning, which introduces a regularization constraint and improves the generalization performance of the network.

Description

Small sample image classification method and system based on self-supervision learning
Technical Field
The invention belongs to the technical field of computer vision, and particularly relates to a method and a system capable of classifying images in a small sample scene.
Background
Image classification is a fundamental task in computer vision. The current trend in deep learning is to keep deepening the network structure in order to improve the classification accuracy of the model. However, a deeper model has more learnable parameters, and more data is needed to train them; this data-driven model training is itself one of the mainstream trends of current deep learning. Such a data-driven approach typically requires a large amount of annotated data, which poses a significant problem. On the one hand, labeling data is very labor-intensive; semi-supervised and unsupervised learning methods start from this observation and aim to reduce the network's dependence on labeled data. On the other hand, in many application scenarios, such as rare species and newly emerging things, there is no way to acquire large-scale data. In both cases, traditional deep learning methods have difficulty achieving the desired classification performance.
Small sample learning aims to solve the problem that model learning is difficult when little data is available. It generally requires only a small number of images of a class in order to make classification predictions for that class. However, the quality of a small sample learning model depends largely on the difference between the data distributions of the test set and the training set, because many deep learning networks can handle tasks similar to those they were trained on but adapt poorly to domains they have not encountered. This is also the difference between the small sample learning task and the traditional image classification task: the network must be able to handle unseen data classes.
By type, small sample learning is mainly divided into transductive learning and inductive learning. The main difference is whether the unlabeled samples to be predicted are available during training. Transductive learning has access to the test data during training, and its final goal is only to predict the labels of that test data; when a new sample to be predicted appears, the model needs to be retrained. Inductive learning does not require the test data during training, i.e., the trained model can be used directly to predict unknown test data. Of the two, inductive small sample learning can handle more data that the network has not seen and needs no retraining, but it requires the network to generalize; its range of use is wider.
The main approaches to small sample learning fall into the following three categories:
1) Optimization-based methods. Small sample learning is regarded as a new task and the concept of meta-learning, i.e. the model learning how to learn, is used; the final aim is to converge as quickly as possible when the model faces a new group of learning tasks. Training based on this approach usually consists of two loops. For example, MAML is composed of a base learner and a meta learner: during training, the inner loop trains the base learner on each independent task, the outer loop optimizes the meta learner according to the validation performance of the base learner, and the result is an initialization of the base learner's parameters that can quickly adapt to a new task.
2) Augmentation methods based on generated data. At the data level, the problem of too little original data can be addressed by generating new data samples; at the feature level, a generative method can learn not only the classification boundary between specific categories but also, by introducing the concept of data distribution, the complete boundary of the category distribution, thereby solving the problem of category combination.
3) Metric-learning-based methods. A feature extractor is first obtained through training; an image is mapped by the feature extractor to a feature vector in a feature space; the distance between different images is computed with a suitable metric function (such as Euclidean distance or cosine distance); and the images are finally classified according to these distances. The importance of obtaining a better feature space in small sample learning is also pointed out in RFS.
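As an illustration of the metric-based idea in item 3), the following minimal sketch classifies a query feature vector by its distance to per-class mean vectors. Averaging the support features into one prototype per class is an assumption made here for brevity, not something the text prescribes, and the feature values and the names `class_prototypes` and `classify` are hypothetical.

```python
import math

def euclidean_distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """One minus the cosine similarity of two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def class_prototypes(features, labels):
    """Average the feature vectors of each class (assumed prototype rule)."""
    sums, counts = {}, {}
    for vec, label in zip(features, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for k, value in enumerate(vec):
            acc[k] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def classify(query, prototypes, distance=euclidean_distance):
    """Return the label whose prototype is closest to the query vector."""
    return min(prototypes, key=lambda label: distance(query, prototypes[label]))

# Hypothetical 2-D features standing in for the output of a feature extractor.
support_features = [[1.0, 0.1], [0.9, 0.0], [0.0, 1.0], [0.2, 0.9]]
support_labels = ["bird", "bird", "fruit", "fruit"]
prototypes = class_prototypes(support_features, support_labels)
print(classify([0.8, 0.2], prototypes))
print(classify([0.1, 0.7], prototypes, cosine_distance))
```

Swapping the `distance` argument changes the metric without touching the rest of the pipeline, which is the property metric-based methods rely on.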
Current small sample learning still faces great challenges. During training, the small amount of data causes a deeper model to under-fit, giving poor representation capability, while a shallower model easily over-fits, giving poor generalization performance. During testing, a support set that is too small can differ greatly from the real data distribution and cannot represent the new class well, so the classification accuracy on the query set is low. These problems remain to be solved.
Disclosure of Invention
The invention aims to provide a small sample image classification method and system based on self-supervised learning, which can realize image classification by using less data volume based on the self-supervised learning.
In order to achieve the purpose, the technical scheme adopted by the invention is as follows:
a small sample image classification method based on self-supervision learning comprises the following steps:
constructing two image classification networks with the same structure but without shared weights, wherein each image classification network comprises a feature extractor, a rotation class classifier, a supervised contrast learning classifier and an image class classifier;
in the training stage, image training data with class labels are input into each of the two image classification networks, and the two networks are trained simultaneously, wherein the training steps are as follows:
rotating each input image in 90-degree steps to obtain four images in four directions, and extracting the feature vector of each of these images through the feature extractor;
inputting the feature vectors of the rotated images into the rotation class classifier to classify the rotation direction of each image, and calculating the cross entropy loss of the rotation class classifier;
for each image, taking its own rotated images and the rotated images of other images of the same class as positive examples, taking the rotated images of other classes as negative examples, inputting the feature vectors of the positive and negative examples into the supervised contrast learning classifier to obtain the probability of belonging to the same class, and calculating the cross entropy loss of the supervised contrast learning classifier;
directly inputting the feature vector of each image into the image class classifier for classification, and calculating the cross entropy loss of the image class classifier;
performing co-learning between the outputs of the image class classifiers of the two image classification networks through a KL divergence constraint, and calculating the co-learning loss;
carrying out a weighted summation of the cross entropy loss of the rotation class classifier, the cross entropy loss of the supervised contrast learning classifier, the cross entropy loss of the image class classifier and the co-learning loss to obtain the overall loss; minimizing the overall loss through iterative training to obtain a trained feature extractor;
in the using stage, the image to be classified is classified as follows:
inputting training images whose class labels are consistent with the classes of the images to be classified into the trained feature extractor to extract feature vectors, and training a rotation class classifier, a supervised contrast learning classifier and an image class classifier with these feature vectors;
inputting the images to be classified into the trained feature extractor to extract feature vectors, inputting the extracted feature vectors into the trained rotation class classifier, supervised contrast learning classifier and image class classifier, and outputting the image classification results.
A small sample image classification system based on self-supervision learning comprises two image classification networks with the same structure and without sharing weight, wherein each image classification network comprises:
a feature extractor for extracting a feature vector of the image obtained by the rotation;
a rotation category classifier for classifying the feature vectors of the image obtained by rotation according to a rotation direction;
a supervised contrast learning classifier for classifying the feature vectors of positive examples and negative examples to obtain the probability of belonging to the same class, wherein, for a given image, the positive examples are its own rotated images and the rotated images of other images of the same class, and the negative examples are the rotated images of other classes;
an image class classifier for classifying each image according to its input feature vector;
wherein the feature extractor is trained in advance on image training data with class labels; during training, co-learning is performed between the outputs of the image class classifiers of the two image classification networks through a KL divergence constraint, and the overall loss is minimized through iterative training to complete the training.
The main innovation points of the invention comprise the following three points:
1) the self-supervision learning is applied to the training of small sample learning, and the representation capability of the feature extractor in the small sample learning is improved.
2) Contrastive learning is applied to small sample learning while the metric function is optimized, so that the features learned by the feature extractor have clearer classification boundaries.
3) Co-learning is applied to the training of small sample learning, which introduces a regularization constraint and improves the generalization performance of the network.
Drawings
FIG. 1 is a schematic diagram of the network structure in the training stage of the method of the present invention;
FIG. 2 is a schematic diagram of the network structure in the testing stage of the method of the present invention.
Detailed Description
In order to make the aforementioned and other features and advantages of the invention more comprehensible, embodiments accompanied with figures are described in detail below.
In this embodiment, an image classification network structure as shown in fig. 1 is established, and is composed of two networks with the same structure but not sharing weights, and each image classification network includes a feature extractor, a rotation class classifier, a supervised contrast learning classifier, and an image class classifier.
During training, each training image with a category label (for example, bird, fruit, billboard) is augmented by rotation (0°, 90°, 180°, 270°) into images in four directions; these images are fed as the network input to the feature extractor, which extracts their feature vectors.
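The rotation augmentation described here can be sketched as follows. This is a minimal illustration on a nested-list "image"; the helper names `rotate90_clockwise` and `rotation_augment` are hypothetical, and a real pipeline would operate on image tensors instead.

```python
def rotate90_clockwise(image):
    """Rotate a 2-D image (a list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def rotation_augment(image):
    """Return (rotated_image, rotation_label) pairs for 0, 90, 180 and 270 degrees.

    The rotation label r in {0, 1, 2, 3} is derived from the data itself,
    so no manual annotation is needed for it.
    """
    views = []
    current = [list(row) for row in image]
    for rotation_label in range(4):
        views.append((current, rotation_label))
        current = rotate90_clockwise(current)
    return views

# Tiny 2x2 example image; pixel values are arbitrary.
image = [[1, 2],
         [3, 4]]
views = rotation_augment(image)
for rotated, label in views:
    print(label, rotated)
```

Each of the four views keeps the class label of the source image and additionally carries its rotation label, which is what the rotation classifier is trained on.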
1. In each network, the features extracted by the feature extractor are input to the following three classifiers:
1) Rotation class classifier Θ
A self-supervised learning task is constructed by rotating the images. Intuitively, the orientation of an image is closely related to the orientation of the objects in it: in a normally shot photo, for example, a house has its roof on top and a billboard stands on the ground, so the correct orientation of a rotated picture can be inferred from such clues. In the experiments, the rotation angle of each input image has to be predicted, and clockwise rotations of 0, 90, 180 and 270 degrees are taken as the 4 rotation angle classes, i.e., the task is a four-class classification problem. Specifically, D is the small sample training data set, C_r is the set of the 4 rotation classes, x^r denotes the r-th rotation transformation applied to the input x, L is the cross entropy loss function, F_i denotes the feature extractor of the i-th network, and Θ_i denotes the rotation class classifier of the i-th network.
[Equation image BDA0003269904140000042 in the original publication: definition of the rotation loss L_rot.]
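Since the formula itself is only available as an image, the following is a minimal sketch of one loss consistent with the definitions given in the text: the cross entropy between the rotation classifier's output for each rotated view and that view's rotation index, averaged over the four views. The averaging and the example logits are assumptions, not values from the patent.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Cross entropy between softmax(logits) and a one-hot target index."""
    return -math.log(softmax(logits)[target])

def rotation_loss(logits_per_view):
    """Mean cross entropy over the rotated views; view r has rotation label r."""
    losses = [cross_entropy(logits, r) for r, logits in enumerate(logits_per_view)]
    return sum(losses) / len(losses)

# Hypothetical rotation-classifier outputs for the four views of one image.
uninformed = [[0.0, 0.0, 0.0, 0.0] for _ in range(4)]
confident = [[10.0 if k == r else 0.0 for k in range(4)] for r in range(4)]
print(rotation_loss(uninformed))
print(rotation_loss(confident))
```

A classifier that outputs equal scores for all four rotations incurs a loss of ln 4, and the loss approaches zero as the correct rotation receives most of the probability mass.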
2) Supervised contrast learning classifier Φ
The features obtained by the feature extractor are input into a multi-layer neural network, which maps them into a smaller space. In this space, features of the same class should be as close to each other as possible and features of different classes as far apart as possible, which yields more robust classification boundaries. Specifically, within a training batch, the positive examples of an image are the four input images obtained by rotating it, together with the other images of the same category and their rotated versions in four directions; the negative examples are the images of other categories in the training batch and their rotated versions. This constructs a binary classification task: the feature vectors of the positive and negative examples are input into the supervised contrast learning classifier Φ, and cross entropy is used as the loss function.
The similarity between features is calculated within a training batch, with label 1 for a positive example and label 0 for a negative example, and the cross entropy loss function is computed. In the formula, D* denotes the data set after rotation augmentation, B(x, y) denotes the samples that are in the same training batch as x and are labeled y, a second set (its symbol is image BDA0003269904140000051 in the original publication) denotes the samples that are in the same data set as x but carry a label other than y, F_i denotes the feature extractor of the i-th network, and Φ_i denotes the supervised contrast learning classifier of the i-th network. τ denotes the temperature coefficient; a lower temperature coefficient generally gives a better training result, but a very low one makes the network harder to train.
[Equation image BDA0003269904140000053 in the original publication: definition of the supervised contrast learning loss L_scl.]
Here E is the mathematical expectation, and the subscript (x, y) ∈ D* of E indicates the data range; the input images belong to the sets D*, B(x, y) and the second set described above, respectively; y and a second label symbol (image BDA0003269904140000056 in the original publication) denote different labels; the base of the logarithm is not restricted, and τ is the temperature coefficient.
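Because the exact formula is only available as an image, the sketch below substitutes a widely used supervised contrastive formulation: for each anchor, same-label samples in the batch are positives, all other samples form the denominator, and similarities are cosine similarities scaled by the temperature τ. It illustrates the idea of pulling same-class features together; it is not claimed to be the patent's formula. The default τ = 0.5 is the value stated in the embodiment; the embeddings are hypothetical.

```python
import math

def normalize(vec):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]

def supervised_contrastive_loss(embeddings, labels, tau=0.5):
    """Average over anchors of -log softmax probability of each positive.

    Positives of an anchor are the other batch samples with the same label;
    the softmax runs over all other samples in the batch (assumed form).
    """
    unit = [normalize(vec) for vec in embeddings]
    total, anchors = 0.0, 0
    for i in range(len(unit)):
        sims = {
            j: sum(a * b for a, b in zip(unit[i], unit[j])) / tau
            for j in range(len(unit)) if j != i
        }
        positives = [j for j in sims if labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        log_denominator = math.log(sum(math.exp(s) for s in sims.values()))
        total += -sum(sims[j] - log_denominator for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0

labels = [0, 0, 1, 1]
clustered = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]  # classes separated
mixed = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.1], [0.1, 1.0]]      # classes interleaved
print(supervised_contrastive_loss(clustered, labels))
print(supervised_contrastive_loss(mixed, labels))
```

The loss is lower when embeddings of the same class lie close together, which matches the stated goal of obtaining clearer classification boundaries in the projected space.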
3) Image class classifier Ψ
The feature vector of each rotated input image is fed into the image class classifier Ψ for category prediction, i.e., to determine the true category of the image. Cross entropy is used as the loss function. In the formula, D* denotes the data set after rotation augmentation, L is the cross entropy loss function, F_i denotes the feature extractor of the i-th network, and Ψ_i denotes the image class classifier of the i-th network.
[Equation image BDA0003269904140000058 in the original publication: definition of the image classification loss L_cls.]
2. Co-learning between two feature extractors
With the co-learning method, co-learning is performed between the outputs of the two image class classifiers through a KL divergence constraint.
[Equation image BDA0003269904140000059 in the original publication: definition of the co-learning loss L_kl.]
Here L_kl is the loss function corresponding to the KL divergence, D* denotes the data set after rotation augmentation, F_i denotes the feature extractor of the i-th network, Ψ_i denotes the image class classifier of the i-th network, and KL denotes the KL divergence formula. Because the KL divergence is asymmetric, the positions of the two terms are exchanged in co-learning so that the two networks learn from each other.
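The exchange of the two KL terms can be sketched as below. The formula image is not available, so this is an assumed form: the class logits of both networks are softened by a temperature T (the embodiment states T = 4) and the two directed KL divergences are added. How exactly T enters the original formula is an assumption, and the logits are hypothetical.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Softmax of logits divided by a temperature (higher = softer)."""
    scaled = [v / temperature for v in logits]
    peak = max(scaled)
    exps = [math.exp(v - peak) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with q > 0 everywhere."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def co_learning_loss(logits_1, logits_2, temperature=4.0):
    """Sum of KL(p1 || p2) and KL(p2 || p1), making the constraint symmetric."""
    p1 = softmax_with_temperature(logits_1, temperature)
    p2 = softmax_with_temperature(logits_2, temperature)
    return kl_divergence(p1, p2) + kl_divergence(p2, p1)

# Hypothetical class logits from the two networks for the same image.
net1_logits = [2.0, 0.5, -1.0]
net2_logits = [1.0, 1.5, -0.5]
print(co_learning_loss(net1_logits, net2_logits))
print(co_learning_loss(net1_logits, net1_logits))
```

Adding both directions removes the asymmetry of a single KL term, so neither network acts as a fixed teacher for the other.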
3. Overall loss function
The above loss functions are multiplied by respective coefficients and added to obtain an overall loss function, which is specifically expressed as follows, wherein α, β, γ, and η represent weight coefficients of the loss functions.
L_total = α·L_cls + β·L_rot + γ·L_scl + η·L_kl
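As a worked example of this weighted sum, the sketch below uses the coefficient value 0.5 that the embodiment gives for all four weights; the individual loss values are made up for illustration.

```python
def total_loss(l_cls, l_rot, l_scl, l_kl, alpha=0.5, beta=0.5, gamma=0.5, eta=0.5):
    """Weighted sum of the classification, rotation, contrastive and KL losses."""
    return alpha * l_cls + beta * l_rot + gamma * l_scl + eta * l_kl

# Hypothetical per-batch loss values.
print(total_loss(l_cls=1.0, l_rot=2.0, l_scl=3.0, l_kl=4.0))
```

With all weights at 0.5 the overall loss is simply half the sum of the four terms; unequal weights would shift the balance between the main classification task and the auxiliary tasks.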
4. Algorithm flow
The algorithm flow of the training process is shown in Table 1: in steps 2-3 the data set D is changed into D* through rotation augmentation, and the final output is the two feature extractors F_1 and F_2, which are used to extract features in the testing stage.
The image classification in the small sample scene based on the self-supervised learning mainly comprises a training stage and a testing stage.
1) Training phase
In the training process, unlike the episode-based training used in optimization-based small sample learning methods, each image together with its class label is used as one data unit in this embodiment. The self-supervised learning method used in the training process does not require additional manually provided training labels for the rotation direction; these labels are constructed from the data itself. The co-learning strategy is a variant of knowledge distillation; unlike knowledge distillation, which first trains a teacher network and then a student network, the two networks are trained together and training is completed in a single stage.
In the training process, each input image is rotated in turn to obtain four images, which are fed into the feature extractor to obtain their feature vectors. Each feature vector is input into the rotation class classifier to predict the rotation angle of the image. All images in a training batch are input into the supervised contrast learning classifier to obtain the probability that each of the other images in the batch belongs to the same class as a given image. Each feature vector is input into the image class classifier to predict the class of the image.
The same steps are carried out in the other, identically structured network. Because classification is the main task of the experiment and the other tasks are auxiliary, the KL divergence is constructed only between the classification results of the two networks; this realizes co-learning and improves the generalization performance of the model.
In the co-learning, the KL divergence adjusts the degree of the co-learning by the temperature coefficient T, which is 4 in this embodiment. In contrast learning, the loss is adjusted by the temperature coefficient τ, and the value of τ in this embodiment is 0.5. In the overall loss function, the weighting coefficients α, β, γ, η of the respective loss functions are all 0.5 in this embodiment.
[Table 1 (algorithm flow of the training process) is provided as images BDA0003269904140000061 and BDA0003269904140000071 in the original publication.]
2) Testing phase
During testing, the test set is divided into a support set and a query set. The feature extractor trained in the training stage extracts the features of the support set, a classifier is trained on these features, and the classifier is then used to classify the query set and obtain its predicted labels. This embodiment runs the experiments with two common small sample learning settings, namely support sets of "five categories with one image each" and "five categories with five images each". The classifier obtained on the support set is used to predict the query set and the classification accuracy is calculated; in each experiment, 15 images are taken per category, giving 75 query images in total. The experiment is repeated 600 times and the average accuracy is calculated. This procedure is repeated three times, and the median of the average accuracies is taken as the final result.
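The evaluation protocol described above can be sketched as follows. The numbers (5 categories, 1 or 5 support images, 15 query images per category, 600 experiments, 3 repetitions, median) come from the text; the function names, the toy data and the `episode_accuracy` callback, which stands in for "extract features, fit a classifier on the support set, score the query set", are hypothetical.

```python
import random
import statistics

def sample_episode(images_by_class, rng, n_way=5, n_shot=1, n_query=15):
    """Draw one task: n_way classes, n_shot support and n_query query images each."""
    classes = rng.sample(sorted(images_by_class), n_way)
    support, query = [], []
    for label in classes:
        picked = rng.sample(images_by_class[label], n_shot + n_query)
        support += [(img, label) for img in picked[:n_shot]]
        query += [(img, label) for img in picked[n_shot:]]
    return support, query

def evaluate(images_by_class, episode_accuracy, n_episodes=600, n_runs=3,
             seed=0, **episode_args):
    """Median over n_runs of the mean accuracy across n_episodes tasks."""
    rng = random.Random(seed)
    run_means = []
    for _ in range(n_runs):
        accuracies = [
            episode_accuracy(*sample_episode(images_by_class, rng, **episode_args))
            for _ in range(n_episodes)
        ]
        run_means.append(statistics.mean(accuracies))
    return statistics.median(run_means)

# Toy test split: 20 classes with 600 placeholder "images" each.
toy_data = {c: [(c, k) for k in range(600)] for c in range(20)}
support, query = sample_episode(toy_data, random.Random(0))
print(len(support), len(query))

def always_right(support_set, query_set):
    """Placeholder scorer that reports perfect accuracy for every task."""
    return 1.0

print(evaluate(toy_data, always_right, n_episodes=10))
```

Passing `n_shot=5` to `evaluate` switches from the "one image per category" setting to the "five images per category" setting without other changes.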
The network structure of the testing stage is shown in FIG. 2, where F is the feature extractor obtained in the training stage; the classifier on top of the features is trained for each individual task on that task's support set and is finally used for prediction on the query set. The algorithm flow is shown in Algorithm 2. At this stage the data set D does not need rotation augmentation, and either F_1 or F_2 may be chosen as the feature extractor F. S and Q in the algorithm denote the support set and the query set of the testing stage, respectively, and LR denotes logistic regression. The final output is the average classification accuracy of small sample learning.
[Algorithm 2 (algorithm flow of the testing stage) is provided as image BDA0003269904140000072 in the original publication.]
The method mainly addresses the image classification task in small sample scenarios: a feature extractor with good generalization performance is trained on all training data of a data set through self-supervised learning, contrastive learning and co-learning. In the testing process, each small sample classification task consists of a support set and a query set. The support set contains the classes to be learned, each generally with only a small number of data samples. The query set contains samples to be predicted from the classes that appear in the support set, and the prediction accuracy on the query set is the accuracy of small sample learning. The features of the support set are obtained through the feature extractor, a classification function is obtained on these features with methods such as logistic regression, Euclidean distance or cosine distance, the categories of the query set are predicted on that basis, and the classification accuracy is calculated.
Applying the method to a concrete case is essentially the same as the testing stage, except that no query set needs to be split off: the trained feature extractor extracts the feature vectors of the images to be classified, the classifier is temporarily trained on images of the specified classes, and once trained it has the knowledge needed to classify those classes and can be used to classify the images to be classified.
The experimental setup and results of the proposed method for classifying images in small sample scenarios are as follows:
(1) Test environment:
System environment: CentOS 7;
Hardware environment: memory: 64 GB; GPU: TITAN XP; hard disk: 2 TB.
(2) Experimental data:
Experiments were performed on four data sets: MiniImageNet, Tiered ImageNet, CIFAR-FS and FC100.
MiniImageNet is a subset of ImageNet whose data size and image dimensions are much smaller than those of ImageNet; training on it requires fewer resources, and it is often used for small sample classification. It has 100 classes, of which 64 form the training set, 16 the validation set and 20 the test set; each class has 600 images, and each image is 84 × 84.
Tiered ImageNet is another subset of ImageNet, somewhat larger than MiniImageNet. It has 608 classes in total, which are grouped into 34 major categories; 20 of these major categories form the training set, 6 the validation set and 8 the test set, for a total of 779165 images.
CIFAR-FS is a small sample learning data set built on CIFAR100: the 100 classes are randomly divided into 64, 16 and 20 classes, used as the training set, validation set and test set respectively; each class has 600 images, and each image is 32 × 32.
FC100 is another small sample learning data set built on CIFAR100 and is more complex than CIFAR-FS. Its 100 classes belong to 20 major classes: 60 classes from 12 major classes form the training set, and two groups of 20 classes, each from 4 major classes, form the validation set and the test set respectively; each class has 600 images, and each image is 32 × 32.
For CIFAR-FS and FC100, in the training stage each image is padded with a border of size 4 on every side and then randomly cropped to 32 × 32; color jitter and horizontal flipping are added as data augmentation, followed by normalization. The test stage performs only the normalization.
For MiniImageNet and Tiered ImageNet, in the training stage each image is padded with a border of size 4 on every side and then randomly cropped to 84 × 84; color jitter and horizontal flipping are added as data augmentation, followed by normalization. The test stage performs only the normalization.
Training optimization: Adam, with an initial learning rate of 0.05. For MiniImageNet, the learning rate is decayed by a factor of 0.1 at the 60th and 80th epochs, for a total of 90 epochs; for FC100 and CIFAR-FS, it is decayed by a factor of 0.1 at the 50th, 65th and 80th epochs, for a total of 90 epochs; for Tiered ImageNet, it is decayed by a factor of 0.1 at the 30th, 40th and 50th epochs, for a total of 60 epochs.
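The schedule in this paragraph can be written down as a small lookup. The sketch assumes that the 0.1 attenuation at the listed epochs is a multiplicative learning-rate decay and that the decay takes effect from the listed epoch onward with epochs counted from 1; both points are interpretations, and the names `MILESTONES` and `learning_rate` are hypothetical.

```python
# Decay epochs per data set, taken from the paragraph above.
MILESTONES = {
    "MiniImageNet": (60, 80),
    "FC100": (50, 65, 80),
    "CIFAR-FS": (50, 65, 80),
    "Tiered ImageNet": (30, 40, 50),
}

def learning_rate(epoch, milestones, initial_lr=0.05, decay=0.1):
    """Learning rate at a 1-based epoch under step decay (assumed semantics)."""
    passed = sum(1 for milestone in milestones if epoch >= milestone)
    return initial_lr * decay ** passed

for epoch in (1, 59, 60, 80, 90):
    print(epoch, learning_rate(epoch, MILESTONES["MiniImageNet"]))
```

Under these assumptions MiniImageNet trains at 0.05 up to epoch 59, at about 0.005 from epoch 60 and at about 0.0005 from epoch 80 until training ends at epoch 90.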
(3) The experimental results are as follows:
A comparison between current mainstream methods and the method of the invention is shown in Tables 1 and 2 below, for experiments on CIFAR-FS and FC100 and on MiniImageNet and Tiered ImageNet, respectively. The experimental results show that the method outperforms the current mainstream algorithms and improves the classification accuracy of small sample learning under different experimental settings on multiple data sets.
TABLE 1 Comparison of experimental results on CIFAR-FS and FC100
[Table image BDA0003269904140000091 in the original publication.]
TABLE 2 Comparison of experimental results on MiniImageNet and Tiered ImageNet
[Table image BDA0003269904140000092 in the original publication.]
Although the present invention has been described with reference to the above embodiments, it should be understood that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (8)

1. A small sample image classification method based on self-supervision learning is characterized by comprising the following steps:
constructing two image classification networks with the same structure but without shared weights, wherein each image classification network comprises a feature extractor, a rotation class classifier, a supervised contrast learning classifier and an image class classifier;
in the training stage, image training data with class labels are input into each of the two image classification networks, and the two networks are trained simultaneously, wherein the training steps are as follows:
rotating each input image in 90-degree steps to obtain four images in four directions, and extracting the feature vector of each of these images through the feature extractor;
inputting the feature vectors of the rotated images into the rotation class classifier to classify the rotation direction of each image, and calculating the cross entropy loss of the rotation class classifier;
for each image, taking its own rotated images and the rotated images of other images of the same class as positive examples, taking the rotated images of other classes as negative examples, inputting the feature vectors of the positive and negative examples into the supervised contrast learning classifier to obtain the probability of belonging to the same class, and calculating the cross entropy loss of the supervised contrast learning classifier;
directly inputting the feature vector of each image into the image class classifier for classification, and calculating the cross entropy loss of the image class classifier;
performing co-learning between the outputs of the image class classifiers of the two image classification networks through a KL divergence constraint, and calculating the co-learning loss;
carrying out a weighted summation of the cross entropy loss of the rotation class classifier, the cross entropy loss of the supervised contrast learning classifier, the cross entropy loss of the image class classifier and the co-learning loss to obtain the overall loss; minimizing the overall loss through iterative training to obtain a trained feature extractor;
in the using stage, the image to be classified is classified as follows:
inputting training images whose class labels are consistent with the classes of the images to be classified into the trained feature extractor to extract feature vectors, and training a rotation class classifier, a supervised contrast learning classifier and an image class classifier with these feature vectors;
inputting the images to be classified into the trained feature extractor to extract feature vectors, inputting the extracted feature vectors into the trained rotation class classifier, supervised contrast learning classifier and image class classifier, and outputting the image classification results.
2. The method of claim 1, wherein the cross-entropy loss of the rotation class classifier is calculated by the following function:
Figure FDA0003269904130000011
wherein L_rot represents the cross-entropy loss of the rotation class classifier, D represents the small-sample training data set, C_r represents the set of four rotation direction classes, x^r represents the r-th rotation transformation applied to the input image x, L represents the cross-entropy loss function, F_i represents the feature extractor of the i-th image classification network, and the symbol shown in Figure FDA0003269904130000012 represents the rotation class classifier of the i-th image classification network.
3. The method of claim 1, wherein the cross-entropy loss of the supervised contrastive learning classifier is calculated by the following function:
Figure FDA0003269904130000021
wherein L_scl represents the cross-entropy loss of the supervised contrastive learning classifier, E represents the mathematical expectation, D* represents the rotation-enhanced data set, x and the symbol shown in Figure FDA0003269904130000022 represent input images, y and the symbol shown in Figure FDA0003269904130000023 represent labels of different classes, B(x, y) represents the samples in the same training batch as x that have label y, the symbol shown in Figure FDA0003269904130000024 represents the samples in the same data set as x whose labels are other than y, F_i represents the feature extractor of the i-th image classification network, the symbol shown in Figure FDA0003269904130000027 represents the supervised contrastive learning classifier of the i-th image classification network, τ represents a temperature coefficient, and the base of the logarithm is not limited.
4. The method of claim 1, wherein the cross-entropy loss of the image class classifier is calculated by the following function:
Figure FDA0003269904130000025
wherein L_cls represents the cross-entropy loss of the image class classifier, E represents the mathematical expectation, D* represents the rotation-enhanced data set, x represents the input image, y represents the class label, L represents the cross-entropy loss function, F_i represents the feature extractor of the i-th image classification network, and the symbol shown in Figure FDA0003269904130000028 represents the image class classifier of the i-th image classification network.
5. The method of claim 1, wherein the cross-entropy loss of co-learning is calculated by the following function:
Figure FDA0003269904130000026
wherein L_kl represents the cross-entropy loss of co-learning corresponding to the KL divergence, E represents the mathematical expectation, D* represents the rotation-enhanced data set, x represents the input image, y represents the class label, F_1 and F_2 represent the feature extractors of the 1st and 2nd image classification networks, and the symbols shown in Figure FDA0003269904130000029 represent the image class classifiers of the 1st and 2nd image classification networks.
6. The method according to claim 1 or 5, wherein the KL divergence adjusts the degree of co-learning by means of a temperature coefficient T.
7. The method of any one of claims 1-5, wherein the overall loss is calculated by the following function:
L_total = α·L_cls + β·L_rot + γ·L_scl + η·L_kl
wherein L_total represents the overall loss, L_cls represents the cross-entropy loss of the image class classifier, L_rot represents the cross-entropy loss of the rotation class classifier, L_scl represents the cross-entropy loss of the supervised contrastive learning classifier, L_kl represents the cross-entropy loss of co-learning corresponding to the KL divergence, and α, β, γ and η represent weight coefficients.
8. A small-sample image classification system based on self-supervised learning, for implementing the method of any one of claims 1-7, wherein the system comprises two image classification networks that have the same structure but do not share weights, and each image classification network comprises:
a feature extractor for extracting feature vectors of the images obtained by rotation;
a rotation class classifier for classifying the feature vectors of the images obtained by rotation according to rotation direction;
a supervised contrastive learning classifier for classifying the feature vectors of positive examples and negative examples to obtain the probability of belonging to the same class, wherein the positive examples of an image are its own rotated images and the rotated images of the same class, and the negative examples are the rotated images of other classes;
an image class classifier for classifying each image according to its input feature vector;
wherein the feature extractor is trained in advance on image training data with class labels, the outputs of the image class classifiers of the two image classification networks are co-learned under a KL divergence constraint during training, and the overall loss is minimized through iterative training to complete the training.
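
The training steps recited in the claims can be illustrated with a minimal sketch in plain Python. It is not the patented implementation: the function names (rotate90, four_rotations, kl_co_learning, total_loss) are hypothetical, images are modeled as nested lists rather than tensors, and the direction of the KL term (peer network as target, own network as approximation) is an assumption, since the claims only state that the two image class classifiers are constrained by a KL divergence softened by a temperature coefficient T. The supervised contrastive term is passed in as a precomputed value.

```python
import math

def rotate90(image):
    """Rotate a 2-D image (list of rows) 90 degrees counter-clockwise."""
    # Transposing and then reversing the row order gives a CCW rotation.
    return [list(row) for row in zip(*image)][::-1]

def four_rotations(image):
    """Return (rotated_image, rotation_label) pairs for 0/90/180/270 degrees."""
    views, current = [], [list(row) for row in image]
    for label in range(4):
        views.append((current, label))
        current = rotate90(current)
    return views

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Cross-entropy of one sample: -log softmax(logits)[label]."""
    return -math.log(softmax(logits)[label])

def kl_co_learning(logits_self, logits_peer, temperature):
    """KL(peer || self) on temperature-softened outputs (direction is assumed)."""
    p = softmax(logits_peer, temperature)
    q = softmax(logits_self, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def total_loss(l_cls, l_rot, l_scl, l_kl, alpha, beta, gamma, eta):
    """Weighted sum of the four losses, as in the formula of claim 7."""
    return alpha * l_cls + beta * l_rot + gamma * l_scl + eta * l_kl
```

In this sketch, each rotated view returned by four_rotations would be passed through the feature extractor; cross_entropy applied to the rotation logits with the rotation label gives the rotation loss, and applied to the class logits with the class label gives the image class loss. A larger temperature flattens both distributions before the KL term is computed, which is one way to read the adjustment of the degree of co-learning described in claim 6.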
CN202111098484.4A 2021-09-18 Small sample image classification method and system based on self-supervision learning Active CN113963165B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111098484.4A CN113963165B (en) 2021-09-18 Small sample image classification method and system based on self-supervision learning

Publications (2)

Publication Number Publication Date
CN113963165A (en) 2022-01-21
CN113963165B (en) 2024-08-02

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200285896A1 (en) * 2019-03-09 2020-09-10 Tongji University Method for person re-identification based on deep model with multi-loss fusion training strategy
US20210124993A1 (en) * 2019-10-23 2021-04-29 Adobe Inc. Classifying digital images in few-shot tasks based on neural networks trained using manifold mixup regularization and self-supervision
CN112348792A (en) * 2020-11-04 2021-02-09 广东工业大学 X-ray chest radiography image classification method based on small sample learning and self-supervision learning
CN113239947A (en) * 2021-03-10 2021-08-10 安徽省农业科学院农业经济与信息研究所 Pest image classification method based on fine-grained classification technology

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
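Gong Ronglin; Shi Jun; Wang Jun: "Hybrid supervised dual-channel feedback U-Net for breast ultrasound image segmentation", Journal of Image and Graphics (中国图象图形学报), no. 10, 16 October 2020 (2020-10-16) *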

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant