CN115294381B - Small sample image classification method and device based on feature migration and orthogonal prior - Google Patents

Small sample image classification method and device based on feature migration and orthogonal prior

Info

Publication number
CN115294381B
CN115294381B (application CN202210487137.9A)
Authority
CN
China
Prior art keywords
orthogonal
feature
module
training
class
Prior art date
Legal status
Active
Application number
CN202210487137.9A
Other languages
Chinese (zh)
Other versions
CN115294381A (en)
Inventor
李晓旭
张志敏
刘俊
汤卓和
刘忠源
张文斌
曾俊瑀
马占宇
陶剑
董洪飞
Current Assignee
Lanzhou University of Technology
Original Assignee
Lanzhou University of Technology
Priority date
Filing date
Publication date
Application filed by Lanzhou University of Technology filed Critical Lanzhou University of Technology
Priority to CN202210487137.9A priority Critical patent/CN115294381B/en
Publication of CN115294381A publication Critical patent/CN115294381A/en
Application granted granted Critical
Publication of CN115294381B publication Critical patent/CN115294381B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a small sample image classification method and device based on feature migration and an orthogonal prior, and studies a small sample classification framework for extracting highly discriminative features on the basis of small sample image classification based on deep metrics. By introducing feature migration and orthogonal-prior small sample image feature learning, and assuming that the new classes and the base classes share a feature extraction mode and that the features of different new classes are orthogonal to one another, an orthogonalization feature adaptation network is constructed to learn an orthogonal feature subspace, so that the features of different classes are mutually orthogonal and the discriminability of the features is improved. The invention is of great significance for theoretical research on small sample learning and for promoting the wide application of machine recognition technology, and at the same time contributes to breaking through the theoretical bottleneck of small sample learning and to mastering advanced artificial intelligence technology in China.

Description

Small sample image classification method and device based on feature migration and orthogonal prior
Technical Field
The invention relates to the technical field of image classification, in particular to a small sample image classification method and device based on feature migration and orthogonal prior.
Background
In recent years, with the development of deep learning, the recognition performance of machines has exceeded that of humans on many large sample image classification tasks. However, when the sample size is small, the recognition level of machines still lags far behind that of humans. Therefore, image classification with a small number of training samples, in particular small sample image classification (Few-shot Image Classification) with only one or a few labeled samples per class, has received considerable attention from researchers in the last two years.
Small sample classification (Few-shot Classification) belongs to the category of small sample learning (Few-shot Learning) and usually involves two types of data with disjoint class spaces, namely base class data and new class data. Small sample classification aims to learn classification rules using the knowledge learned from the base class data together with a small number of labeled samples (support samples) of the new class data, and to accurately predict the classes of the unlabeled samples (query samples) in new class tasks; the framework of small sample classification is shown in fig. 1.
Small sample image classification is a research problem that urgently needs to be solved in the fields of computer vision and artificial intelligence. Existing successful large sample image classification methods depend heavily on the number of samples, whereas the sample sizes of things in the real world follow a long-tail distribution, i.e. for a large number of things the sample size is severely insufficient. For example, in fields such as the military, medicine, industry and astronomy, sample collection consumes a large amount of manpower, material resources, time and money, and large-scale image samples are difficult to collect. Therefore, research on small sample image classification is of great value to the wide application of image classification technology.
In the prior art, classification methods based on deep metrics mainly judge the class by comparing the distances between samples, or between a sample and a class prototype. They are often combined with techniques such as data augmentation and transfer learning to compensate for the insufficient amount of data and the tendency of the model to overfit, and obtain good classification performance on many small sample classification tasks. However, compared with large sample image classification, the performance of existing small sample image classification is still unsatisfactory, which greatly limits the practicability of small sample image classification technology, and several problems remain to be solved, among them highly discriminative feature learning. For large sample image classification, existing deep learning techniques can learn highly discriminative image features by increasing the model capacity and the sample size. However, these techniques are not applicable to small sample classification tasks with few labeled samples. Therefore, how to learn a highly discriminative feature representation from the base class data and the new class data with few labeled samples is a problem worth exploring.
Disclosure of Invention
Aiming at the technical problem of learning highly discriminative features in small sample image classification, the invention provides a small sample image classification method and device based on feature migration and an orthogonal prior.
In order to achieve the above object, the present invention provides the following technical solutions:
the invention firstly provides a small sample image classification method based on feature migration and orthogonal prior, which comprises the following steps:
S1, preparing data and pre-training to obtain an embedding module f_θ for extracting image features, wherein the images comprise a training set and a test set;
S2, introducing the idea of an orthogonal prior into the convolutional neural network model and constructing a feature learning network model based on feature migration and the orthogonal prior;
S3, training and optimizing the objective function of the feature learning network model based on the orthogonal prior;
S4, classifying the test set images using the optimized orthogonal-prior feature learning network model.
Further, step S1 includes:
S11, dividing the data set D into two parts, D_train and D_test, which are mutually exclusive in class space; D_train serves as the base class data for training the model, and D_test serves as the new class data for testing the model;
S12, for a C-way K-shot classification task, randomly selecting C classes from D_train and randomly selecting M samples from each class, among which K samples are used as support samples S_i and the remaining M−K samples are used as query samples Q_i; S_i and Q_i form a task T_i, and tasks T^test are likewise constructed from D_test (a sketch of this task construction is given after step S13 below);
S13, the first stage of training: pre-training the embedding module f_θ using the base class data; f_θ comprises 4 convolution blocks, each containing a convolution layer, batch normalization, a nonlinear activation layer and a pooling layer; each convolution block uses a 3×3 convolution kernel window, the input is a three-channel RGB image, a 2×2 max pooling layer is adopted as the pooling layer, the max pooling layers of the last two blocks are removed, and ReLU is adopted as the activation function of the nonlinear activation layer.
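For illustration, a minimal Python sketch of the C-way K-shot task construction in step S12 is given below; the dictionary-of-samples data layout and the fixed per-class sample count M are assumptions made only for this sketch.

```python
import random

def sample_task(data_by_class, C=5, K=1, M=16):
    """Build one C-way K-shot task T_i from a dict {class_label: [samples]}.

    The dict layout and the per-class sample count M are illustrative
    assumptions; only C, K and the support/query split follow the text.
    """
    classes = random.sample(list(data_by_class.keys()), C)   # randomly select C classes
    support, query = [], []
    for c in classes:
        samples = random.sample(data_by_class[c], M)          # M samples per class
        support += [(x, c) for x in samples[:K]]              # K support samples S_i
        query   += [(x, c) for x in samples[K:]]              # remaining M-K query samples Q_i
    return support, query                                     # S_i and Q_i form a task T_i
```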
Further, in step S2, in the feature learning network model based on feature migration and the orthogonal prior, the orthogonalization feature adaptation network consists of three parts: the embedding module f_θ, the orthogonal adaptation module g_φ, and the metric module; the orthogonal adaptation module g_φ consists of two convolution layers with 5×5 convolution kernels and is used to transform the features of new class samples and learn an orthogonalized feature subspace.
Further, step S3 includes:
S31, in the second stage of training, a classification task is performed on the new class data, and all support samples are input into the embedding module f_θ with fixed parameters to obtain the corresponding support sample features f_θ(S_ck);
S32, feature transformation is performed with the orthogonal adaptation module to obtain the transformed features g_φ(f_θ(S_ck));
S33, the transformed features are multiplied by the mask M_c corresponding to each class so that the features of different classes are pairwise orthogonal;
S34, the metric module is used to calculate the cosine distance cos(P_ci, P_cj) (i, j ∈ [0, K), i ≠ j) between features of the same class;
S35, the orthogonal adaptation module g_φ is optimized with a mean square error loss function.
Further, the calculation formula in step S33 is:

P_ck = g_φ(f_θ(S_ck)) ⊙ M_c    (1)

wherein S_ck is the kth support sample of class c, ⊙ denotes element-wise multiplication of matrices of the same order, and M_cijh is the value at row i, column j, channel h of the class-c mask M_c; the elements of M_c are:

M_cijh = 1 when the channel index h lies within the range assigned to class c (channel indices start from 0), and M_cijh = 0 at the remaining positions    (2)

wherein C is the total number of categories under the current task, H is the number of feature channels, and H is an integer multiple of C.
Further, the calculation formula in step S34 is as follows:

cos(P_ci, P_cj) = Σ(P_ci ⊙ P_cj) / (‖P_ci‖₂ · ‖P_cj‖₂)    (3)

wherein cos(P_ci, P_cj) is the cosine distance between features of the same class, K is the number of support samples, c denotes the c-th class, P_ci denotes the feature of the ith support sample in class c, P_cj denotes the feature of the jth support sample in class c, ⊙ denotes element-wise multiplication of the matrices, and ‖P_ci‖₂ denotes the 2-norm of the matrix P_ci.
Further, the mean square error loss function in step S35 is calculated as follows:

L = (1/N) Σ_{c=1}^{N} Σ_{i≠j} MSE[cos(P_ci, P_cj), 1]    (4)

wherein N is the total number of categories under the current task, cos(P_ci, P_cj) is the cosine distance between features of the same class, and MSE[cos(P_ci, P_cj), 1] = [cos(P_ci, P_cj) − 1]².
After the loss over the support samples is calculated, gradient descent is performed; a mini-batch Adam optimizer is used to update the orthogonal adaptation module g_φ, and training is repeated over multiple tasks until the network converges.
Further, the Adam adaptive optimization algorithm specifically comprises the following steps:
Initialize the data: v_dW = 0, S_dW = 0, v_db = 0, S_db = 0, which represent the biased first- and second-moment estimates; dW and db denote the gradients with respect to W and b, respectively.
Calculate the Momentum exponentially weighted averages:

v_dW = β₁·v_dW + (1 − β₁)·dW    (5)
v_db = β₁·v_db + (1 − β₁)·db    (6)

Calculate the exponentially weighted averages of the squared gradients according to the RMSprop formula:

S_dW = β₂·S_dW + (1 − β₂)·(dW)²    (7)
S_db = β₂·S_db + (1 − β₂)·(db)²    (8)

Calculate the bias corrections of the Momentum and RMSprop estimates:
Bias correction of the Momentum terms:

v_dW^corrected = v_dW / (1 − β₁^t)    (9)
v_db^corrected = v_db / (1 − β₁^t)    (10)

Bias correction of the RMSprop terms:

S_dW^corrected = S_dW / (1 − β₂^t)    (11)
S_db^corrected = S_db / (1 − β₂^t)    (12)

Perform gradient descent and update the weights:

W = W − α·v_dW^corrected / (√(S_dW^corrected) + ε)    (13)
b = b − α·v_db^corrected / (√(S_db^corrected) + ε)    (14)

In equations (5)-(14), t denotes the t-th iteration, α denotes the learning rate, which controls the update rate of the weights, ε denotes a very small constant, β₁ and β₂ respectively denote the exponential decay rates of the first- and second-moment estimates, and v_dW^corrected, v_db^corrected, S_dW^corrected, S_db^corrected denote the first- and second-moment estimates after bias correction.
Further, step S4 includes:
S41, testing process: each task T^test consists of a support set S^test and a query set Q^test; the query set Q^test of the test set is input into the embedding module f_θ and the fine-tuned orthogonal adaptation module g_φ to obtain the query features;
S42, the features output by the orthogonal adaptation module are multiplied by the masks M_c of the different classes; the specific operation is shown in formula (1):

P_ck = g_φ(f_θ(Q_k)) ⊙ M_c    (1)

wherein Q_k is the kth query sample, ⊙ denotes element-wise multiplication of matrices of the same order, and M_cijh is the value at row i, column j, channel h of the class-c mask M_c, whose elements are:

M_cijh = 1 when the channel index h lies within the range assigned to class c (channel indices start from 0), and M_cijh = 0 at the remaining positions    (2)

wherein C is the total number of categories under the current task, H is the number of feature channels, and H is an integer multiple of C;
S43, the masked features are fed into the metric module to calculate the cosine distances between the query sample and all support samples;
S44, the class of the support sample closest to the query sample is taken as the predicted class of the query sample.
On the other hand, the invention also provides a small sample image classification device based on feature migration and orthogonal prior, which is used to implement the above method and comprises the following functional modules:
a pre-training module, which pre-trains on the images to obtain the embedding module f_θ for extracting image features, wherein the images comprise a training set and a test set;
a processing module, which introduces the idea of an orthogonal prior and constructs a feature learning network model based on feature migration and the orthogonal prior;
a computing module, which trains and optimizes the objective function of the orthogonal-prior feature learning network model to solve the model parameters;
a classification module, which classifies the test set images using the optimized orthogonal-prior feature learning network model.
Compared with the prior art, the invention has the beneficial effects that:
the small sample image classification method and device based on feature migration and orthogonal prior, provided by the invention, are based on a depth convolutional neural network (Deep Convolutional Neural Networks, DCNN for short), and a small sample classification framework of high-identification feature extraction is researched on the basis of small sample image classification research based on depth measurement. By introducing feature migration and small sample image feature learning of orthogonal priori, the feature extraction mode is assumed to be shared by the new class and the base class, and the feature orthogonality of the new class data among different classes is assumed to be free of correlation, and by constructing an orthogonalization feature adaptation network, an orthogonalization feature subspace is learned, so that the different classes of features are orthogonalized with each other, the different classes are easy to distinguish, and the identification degree of the features is improved. The invention has very important significance for theoretical research of small sample learning and promotion of wide application of machine identification technology. Meanwhile, the method plays a role in adding bricks and tiles for the advanced technology of breaking through the theoretical bottleneck of small sample learning and mastering artificial intelligence in China.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments described in the present invention, and other drawings may be obtained according to these drawings for a person having ordinary skill in the art.
FIG. 1 is a small sample classification (Few-shot Classification) framework.
Fig. 2 is a flowchart of a small sample image classification method and apparatus based on feature migration and orthogonal prior according to an embodiment of the present invention.
Fig. 3 is a structure diagram of the embedding module f_θ according to an embodiment of the present invention.
Fig. 4 is a network diagram for feature learning of a small sample image with feature migration and orthogonal prior introduced according to an embodiment of the present invention.
Fig. 5 is a model structure diagram of the orthogonal adaptation module g_φ according to an embodiment of the present invention.
Fig. 6 is a schematic diagram of a functional module of a small sample image classification device based on feature migration and orthogonal prior according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. Embodiments of the present invention are intended to be within the scope of the present invention as defined by the appended claims.
The invention provides a small sample image classification method based on feature migration and orthogonal prior; the flow is shown in fig. 2 and comprises the following steps.
The method comprises the following steps:
S1, preparing data and pre-training to obtain an embedding module f_θ for extracting image features, wherein the images comprise a training set and a test set;
specifically, step S1 includes:
S11, the data set D is divided into two parts, D_train and D_test, which are mutually exclusive in class space; D_train serves as the base class data for training the model, and D_test serves as the new class data for testing the model;
S12, for a C-way K-shot classification task, C classes are randomly selected from D_train and M samples are randomly selected from each class, among which K samples are used as support samples S_i and the remaining M−K samples are used as query samples Q_i; S_i and Q_i form a task T_i, and tasks T^test are likewise constructed from D_test;
S13, the first stage of training: the embedding module f_θ is pre-trained using the base class data; f_θ comprises 4 convolution blocks, each containing a convolution layer, batch normalization, a nonlinear activation layer and a pooling layer; each convolution block uses a 3×3 convolution kernel window, the input is a three-channel RGB image, a 2×2 max pooling layer is adopted as the pooling layer, the max pooling layers of the last two blocks are removed, and ReLU is adopted as the activation function of the nonlinear activation layer. For example, for an 84×84×3 RGB image, each block uses a 3×3 convolution kernel with 64 filters, and each block consists of one convolution, one ReLU and one pooling operation, as shown in fig. 3. The pre-trained embedding module can be reused in different scenarios.
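For illustration, a PyTorch-style sketch of such an embedding module f_θ is given below; it follows the Conv-4 layout described above (3×3 convolutions with 64 filters, batch normalization, ReLU, 2×2 max pooling only in the first two blocks), while the exact ordering of layers inside a block and the padding are assumptions.

```python
import torch.nn as nn

def conv_block(in_ch, out_ch, pool=True):
    # one block: 3x3 convolution, batch normalization, ReLU, optional 2x2 max pooling
    layers = [nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
              nn.BatchNorm2d(out_ch),
              nn.ReLU(inplace=True)]
    if pool:
        layers.append(nn.MaxPool2d(2))
    return nn.Sequential(*layers)

class EmbeddingModule(nn.Module):
    """f_theta: four convolution blocks; max pooling removed from the last two blocks."""
    def __init__(self, in_channels=3, hidden=64):
        super().__init__()
        self.blocks = nn.Sequential(
            conv_block(in_channels, hidden, pool=True),
            conv_block(hidden, hidden, pool=True),
            conv_block(hidden, hidden, pool=False),
            conv_block(hidden, hidden, pool=False),
        )

    def forward(self, x):          # x: (B, 3, 84, 84) RGB images
        return self.blocks(x)      # feature maps, e.g. (B, 64, 21, 21)
```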
S2, the idea of an orthogonal prior is introduced into the convolutional neural network model, and a feature learning network model based on feature migration and the orthogonal prior is constructed, as shown in fig. 4.
Specifically, in step S2, in the feature learning network model based on feature migration and the orthogonal prior, the orthogonalization feature adaptation network consists of three parts: the embedding module f_θ, the orthogonal adaptation module g_φ, and the metric module; the orthogonal adaptation module g_φ consists of two convolution layers with 5×5 convolution kernels and is used to transform the features of new class samples and learn an orthogonalized feature subspace, as shown in fig. 5.
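A hedged PyTorch-style sketch of the orthogonal adaptation module g_φ follows; the channel count, the padding and the ReLU between the two 5×5 convolution layers are illustrative choices not specified above.

```python
import torch.nn as nn

class OrthogonalAdaptationModule(nn.Module):
    """g_phi: two 5x5 convolution layers that transform new-class features."""
    def __init__(self, channels=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),   # assumed non-linearity between the two layers
            nn.Conv2d(channels, channels, kernel_size=5, padding=2),
        )

    def forward(self, feat):         # feat: output of the embedding module f_theta
        return self.net(feat)
```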
S3, the objective function of the feature learning network model based on the orthogonal prior is trained and optimized;
specifically, step S3 includes:
S31, in the second stage of training, a classification task is performed on the new class data, and all support samples are input into the embedding module f_θ with fixed parameters to obtain the corresponding support sample features f_θ(S_ck);
S32, feature transformation is performed with the orthogonal adaptation module to obtain the transformed features g_φ(f_θ(S_ck));
S33, the transformed features are multiplied by the mask (Mask) M_c corresponding to each class so that the features of different classes are pairwise orthogonal;
the calculation formula in step S33 is:

P_ck = g_φ(f_θ(S_ck)) ⊙ M_c    (1)

wherein S_ck is the kth support sample of class c, ⊙ denotes element-wise multiplication of matrices of the same order, and M_cijh is the value at row i, column j, channel h of the class-c mask M_c; the elements of M_c are:

M_cijh = 1 when the channel index h lies within the range assigned to class c (channel indices start from 0), and M_cijh = 0 at the remaining positions    (2)

wherein C is the total number of categories under the current task, H is the number of feature channels, and H is an integer multiple of C.
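The following sketch shows one way to build the class masks M_c consistent with the above description, under the assumption that the range assigned to class c is the contiguous block of H/C channels starting at channel c·H/C (the original formula image is not reproduced in this text).

```python
import torch

def build_class_masks(C, H, height, width):
    """Return a (C, H, height, width) tensor of masks M_c.

    H must be an integer multiple of C. For class c, the mask is 1 on an
    assumed contiguous block of H // C channels and 0 elsewhere, so that
    features masked with different M_c occupy disjoint channels and are
    therefore pairwise orthogonal.
    """
    assert H % C == 0, "H must be an integer multiple of C"
    block = H // C
    masks = torch.zeros(C, H, height, width)
    for c in range(C):
        masks[c, c * block:(c + 1) * block] = 1.0   # channel indices start from 0
    return masks
```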
S34, the metric module is used to calculate the cosine distance cos(P_ci, P_cj) (i, j ∈ [0, K), i ≠ j) between features of the same class, so that the within-class cosine distances of the features under all classes are obtained;
the calculation formula in step S34 is as follows:

cos(P_ci, P_cj) = Σ(P_ci ⊙ P_cj) / (‖P_ci‖₂ · ‖P_cj‖₂)    (3)

wherein cos(P_ci, P_cj) is the cosine distance between features of the same class, K is the number of support samples, c denotes the c-th class, P_ci denotes the feature of the ith support sample in class c, P_cj denotes the feature of the jth support sample in class c, ⊙ denotes element-wise multiplication of the matrices, and ‖P_ci‖₂ denotes the 2-norm of the matrix P_ci.
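A short sketch of the within-class cosine similarity of formula (3), computed on flattened masked features, is given below; the flattening of the feature maps into vectors is an implementation assumption.

```python
import torch
import torch.nn.functional as F

def within_class_cosines(P_c):
    """P_c: (K, ...) masked features of the K support samples of one class.

    Returns the cosine similarities cos(P_ci, P_cj) for all pairs i != j.
    """
    flat = P_c.flatten(start_dim=1)                      # treat each feature as a vector
    sims = F.cosine_similarity(flat.unsqueeze(1), flat.unsqueeze(0), dim=-1)  # (K, K)
    off_diag = ~torch.eye(flat.size(0), dtype=torch.bool, device=flat.device)
    return sims[off_diag]                                # keep only the i != j pairs
```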
S35, the orthogonal adaptation module g_φ is optimized with a mean square error loss function.
The mean square error loss function is calculated as follows:

L = (1/N) Σ_{c=1}^{N} Σ_{i≠j} MSE[cos(P_ci, P_cj), 1]    (4)

wherein N is the total number of categories under the current task, cos(P_ci, P_cj) is the cosine distance between features of the same class, and MSE[cos(P_ci, P_cj), 1] = [cos(P_ci, P_cj) − 1]².
After the loss over the support samples is calculated, gradient descent is performed; a mini-batch Adam optimizer is used to update the orthogonal adaptation module g_φ, and training is repeated over multiple tasks until the network converges.
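Putting these pieces together, the following sketch illustrates the second-stage fine-tuning described in steps S31-S35: the mean-square-error-to-1 loss of formula (4) and an Adam update of the orthogonal adaptation module only. The batch organization, the learning rate and the assumption that class labels are numbered 0 to C−1 (matching the mask indices) are illustrative.

```python
import torch
import torch.nn.functional as F

def orthogonal_prior_loss(masked_support_feats):
    """masked_support_feats: list whose element c is the (K, ...) masked features P_c of class c.

    Implements L = (1/N) * sum_c sum_{i != j} [cos(P_ci, P_cj) - 1]^2 (formula (4)).
    """
    loss = 0.0
    for P_c in masked_support_feats:
        flat = P_c.flatten(start_dim=1)
        sims = F.cosine_similarity(flat.unsqueeze(1), flat.unsqueeze(0), dim=-1)
        off_diag = ~torch.eye(flat.size(0), dtype=torch.bool, device=flat.device)
        loss = loss + ((sims[off_diag] - 1.0) ** 2).sum()
    return loss / len(masked_support_feats)

def finetune_adaptation_module(f_theta, g_phi, tasks, masks, lr=1e-3):
    """Second-stage fine-tuning: update only g_phi, keep f_theta fixed (S31-S35)."""
    for p in f_theta.parameters():
        p.requires_grad_(False)                           # embedding module parameters fixed
    optimizer = torch.optim.Adam(g_phi.parameters(), lr=lr)
    for support_images, support_labels in tasks:          # one C-way K-shot task per step
        feats = g_phi(f_theta(support_images))            # transformed support features
        masked = [feats[support_labels == c] * masks[c]   # P_ck = g_phi(f_theta(S_ck)) ⊙ M_c
                  for c in range(masks.size(0))]          # labels assumed to be 0..C-1
        loss = orthogonal_prior_loss(masked)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()                                  # repeat over tasks until convergence
```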
The Adam adaptive optimization algorithm comprises the following specific steps:
Initialize the data: v_dW = 0, S_dW = 0, v_db = 0, S_db = 0, which represent the biased first- and second-moment estimates; dW and db denote the gradients with respect to W and b, respectively.
Calculate the Momentum exponentially weighted averages:

v_dW = β₁·v_dW + (1 − β₁)·dW    (5)
v_db = β₁·v_db + (1 − β₁)·db    (6)

Calculate the exponentially weighted averages of the squared gradients according to the RMSprop formula:

S_dW = β₂·S_dW + (1 − β₂)·(dW)²    (7)
S_db = β₂·S_db + (1 − β₂)·(db)²    (8)

Calculate the bias corrections of the Momentum and RMSprop estimates:
Bias correction of the Momentum terms:

v_dW^corrected = v_dW / (1 − β₁^t)    (9)
v_db^corrected = v_db / (1 − β₁^t)    (10)

Bias correction of the RMSprop terms:

S_dW^corrected = S_dW / (1 − β₂^t)    (11)
S_db^corrected = S_db / (1 − β₂^t)    (12)

Perform gradient descent and update the weights:

W = W − α·v_dW^corrected / (√(S_dW^corrected) + ε)    (13)
b = b − α·v_db^corrected / (√(S_db^corrected) + ε)    (14)

In equations (5)-(14), t denotes the t-th iteration, α denotes the learning rate, which controls the update rate of the weights, ε denotes a very small constant, β₁ and β₂ respectively denote the exponential decay rates of the first- and second-moment estimates, and v_dW^corrected, v_db^corrected, S_dW^corrected, S_db^corrected denote the first- and second-moment estimates after bias correction.
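For completeness, a plain-Python sketch of a single Adam iteration implementing equations (5)-(14) for one weight matrix W and bias b is given below (NumPy is used for the element-wise operations; the hyper-parameter defaults are assumptions).

```python
import numpy as np

def adam_step(W, b, dW, db, state, t, alpha=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam iteration; `state` holds v_dW, S_dW, v_db, S_db, initialised to zeros."""
    # Momentum exponentially weighted averages, eqs. (5)-(6)
    state["v_dW"] = beta1 * state["v_dW"] + (1 - beta1) * dW
    state["v_db"] = beta1 * state["v_db"] + (1 - beta1) * db
    # RMSprop exponentially weighted averages of the squared gradients, eqs. (7)-(8)
    state["S_dW"] = beta2 * state["S_dW"] + (1 - beta2) * dW ** 2
    state["S_db"] = beta2 * state["S_db"] + (1 - beta2) * db ** 2
    # Bias corrections, eqs. (9)-(12)
    v_dW_c = state["v_dW"] / (1 - beta1 ** t)
    v_db_c = state["v_db"] / (1 - beta1 ** t)
    S_dW_c = state["S_dW"] / (1 - beta2 ** t)
    S_db_c = state["S_db"] / (1 - beta2 ** t)
    # Gradient descent with the corrected moments, eqs. (13)-(14)
    W = W - alpha * v_dW_c / (np.sqrt(S_dW_c) + eps)
    b = b - alpha * v_db_c / (np.sqrt(S_db_c) + eps)
    return W, b
```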
S4, the test set images are classified using the optimized orthogonal-prior feature learning network model.
The step S4 includes:
S41, testing process: each task T^test consists of a support set S^test and a query set Q^test; the query set Q^test of the test set is input into the embedding module f_θ and the fine-tuned orthogonal adaptation module g_φ to obtain the query features;
S42, the features output by the orthogonal adaptation module are multiplied by the masks M_c of the different classes; the specific operation is shown in formula (1):

P_ck = g_φ(f_θ(Q_k)) ⊙ M_c    (1)

wherein Q_k is the kth query sample, ⊙ denotes element-wise multiplication of matrices of the same order, and M_cijh is the value at row i, column j, channel h of the class-c mask M_c, whose elements are:

M_cijh = 1 when the channel index h lies within the range assigned to class c (channel indices start from 0), and M_cijh = 0 at the remaining positions    (2)

wherein C is the total number of categories under the current task, H is the number of feature channels, and H is an integer multiple of C;
S43, the masked features are fed into the metric module to calculate the cosine distances between the query sample and all support samples; note that in the training stage the metric module only calculates cosine distances among features of the same class and does not compute them across different classes, which differs from how the metric module is used in the testing stage;
S44, the class of the support sample closest to the query sample is taken as the predicted class of the query sample. Unlike conventional training, the model is fine-tuned with the support samples of the new classes, and the query samples are tested directly after the optimization is finished.
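A sketch of the test-stage prediction of steps S41-S44 under the same assumptions as above is given below: each query feature is masked with every class mask in turn, compared by cosine similarity with that class's masked support features, and the class of the nearest support sample is returned.

```python
import torch
import torch.nn.functional as F

def predict_query(f_theta, g_phi, masks, support_feats_by_class, query_image):
    """support_feats_by_class: list whose element c holds the masked support features P_c of class c.

    Returns the predicted class index of one query image (steps S41-S44).
    """
    with torch.no_grad():
        q = g_phi(f_theta(query_image.unsqueeze(0)))          # (1, H, h, w) query feature
        best_class, best_sim = None, -float("inf")
        for c, P_c in enumerate(support_feats_by_class):
            q_c = (q * masks[c]).flatten(start_dim=1)          # mask the query with M_c
            s_c = P_c.flatten(start_dim=1)                     # (K, D) masked supports of class c
            sims = F.cosine_similarity(q_c, s_c, dim=-1)       # cosine to each support sample
            if sims.max().item() > best_sim:                   # nearest support sample wins
                best_sim, best_class = sims.max().item(), c
        return best_class
```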
On the other hand, the invention also provides a small sample image classification device based on feature migration and orthogonal prior, which is used to implement the above method and, as shown in fig. 6, comprises the following functional modules:
a pre-training module, which pre-trains on the images to obtain the embedding module f_θ for extracting image features, wherein the images comprise a training set and a test set;
a processing module, which introduces the idea of an orthogonal prior and constructs a feature learning network model based on feature migration and the orthogonal prior;
a computing module, which trains and optimizes the objective function of the orthogonal-prior feature learning network model to solve the model parameters;
a classification module, which classifies the test set images using the optimized orthogonal-prior feature learning network model.
According to the method, feature migration and orthogonal-prior small sample image feature learning are introduced; it is assumed that the new classes and the base classes share a feature extraction mode and that the features of different new classes are orthogonal, and an orthogonalization feature adaptation network is constructed to learn an orthogonal feature subspace, so that the features of different classes are mutually orthogonal and the discriminability of the features is improved.
The specific embodiments of the small sample image classification method and device based on feature migration and orthogonal prior have been set forth above with reference to the accompanying drawings. The implementation of the method and device will be apparent to those skilled in the art from the above description of the embodiments.
The algorithms and displays presented herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with the teachings herein. The required structure for a construction of such a system is apparent from the description above. In addition, the disclosure herein is not directed to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the present disclosure as described herein, and the above description of specific languages is provided for disclosure of enablement and best mode of the present disclosure.
Similarly, it should be appreciated that in the above description of exemplary embodiments, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding the understanding of one or more of the disclosed aspects. However, this method of disclosure is not to be interpreted as reflecting an intention that the claims require more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this disclosure.
The foregoing examples are merely specific embodiments of the present application and are not intended to limit its protection scope. Any person skilled in the art may, within the technical scope disclosed in the present application, modify or easily conceive of changes to the technical solutions described in the foregoing embodiments, or make equivalent substitutions for some of the technical features; such modifications, changes or substitutions do not cause the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments, and are intended to be encompassed within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (6)

1. A small sample image classification method based on feature migration and orthogonal prior, characterized by comprising the following steps:
S1, preparing data and pre-training to obtain an embedding module f_θ for extracting image features, wherein the images comprise a training set and a test set;
S2, introducing the idea of an orthogonal prior into the convolutional neural network model and constructing a feature learning network model based on feature migration and the orthogonal prior;
S3, training and optimizing the objective function of the feature learning network model based on the orthogonal prior;
wherein the training and optimization of the feature learning network model based on the orthogonal prior in step S3 comprises:
S31, in the second stage of training, performing a classification task on the new class data, and inputting all support samples into the embedding module f_θ with fixed parameters to obtain the corresponding support sample features f_θ(S_ck);
S32, performing feature transformation with the orthogonal adaptation module to obtain the transformed features g_φ(f_θ(S_ck));
S33, multiplying the transformed features by the mask M_c corresponding to each class so that the features of different classes are pairwise orthogonal; the calculation formula in step S33 is:
P_ck = g_φ(f_θ(S_ck)) ⊙ M_c    (1)
wherein S_ck is the kth support sample of class c, ⊙ denotes element-wise multiplication of matrices of the same order, and M_cijh is the value at row i, column j, channel h of the class-c mask M_c, whose elements are:
M_cijh = 1 when the channel index h lies within the range assigned to class c, the channel index starting from 0, and M_cijh = 0 at the remaining positions    (2)
wherein C is the total number of categories under the current task, H is the number of feature channels, and H is an integer multiple of C;
S34, using the metric module to calculate the cosine distance cos(P_ci, P_cj) (i, j ∈ [0, K), i ≠ j) between features of the same class; the calculation formula in step S34 is as follows:
cos(P_ci, P_cj) = Σ(P_ci ⊙ P_cj) / (‖P_ci‖₂ · ‖P_cj‖₂)    (3)
wherein cos(P_ci, P_cj) is the cosine distance between features of the same class, K is the number of support samples, c denotes the c-th class, P_ci denotes the feature of the ith support sample in class c, P_cj denotes the feature of the jth support sample in class c, ⊙ denotes element-wise multiplication of the matrices, and ‖P_ci‖₂ denotes the 2-norm of the matrix P_ci;
S35, optimizing the orthogonal adaptation module g_φ with a mean square error loss function; the mean square error loss function in step S35 is calculated as follows:
L = (1/C) Σ_c Σ_{i≠j} MSE[cos(P_ci, P_cj), 1]    (4)
wherein C is the total number of categories under the current task, cos(P_ci, P_cj) is the cosine distance between features of the same class, and MSE[cos(P_ci, P_cj), 1] = [cos(P_ci, P_cj) − 1]²;
after the loss over the support samples is calculated, gradient descent is performed; a mini-batch Adam optimizer is used to update the orthogonal adaptation module g_φ, and training is repeated over multiple tasks until the network converges;
S4, classifying the test set images using the optimized orthogonal-prior feature learning network model.
2. The small sample image classification method based on feature migration and orthogonal prior according to claim 1, wherein step S1 comprises:
S11, dividing the data set D into two parts, D_train and D_test, which are mutually exclusive in class space, D_train serving as the base class data for training the model and D_test serving as the new class data for testing the model;
S12, for a C-way K-shot classification task, randomly selecting C classes from D_train and randomly selecting M samples from each class, among which K samples are used as support samples S_i and the remaining M−K samples are used as query samples Q_i, S_i and Q_i forming a task T_i; tasks T^test are likewise constructed from D_test;
S13, the first stage of training: pre-training the embedding module f_θ using the base class data, f_θ comprising 4 convolution blocks, each containing a convolution layer, batch normalization, a nonlinear activation layer and a pooling layer; each convolution block uses a 3×3 convolution kernel window, the input is a three-channel RGB image, a 2×2 max pooling layer is adopted as the pooling layer, the max pooling layers of the last two blocks are removed, and ReLU is adopted as the activation function of the nonlinear activation layer.
3. The small sample image classification method based on feature migration and orthogonal prior according to claim 1, wherein in step S2, in the feature learning network model based on feature migration and the orthogonal prior, the orthogonalization feature adaptation network consists of three parts: the embedding module f_θ, the orthogonal adaptation module g_φ and the metric module; the orthogonal adaptation module g_φ consists of two convolution layers with 5×5 convolution kernels and is used to transform the features of new class samples and learn an orthogonalized feature subspace.
4. The small sample image classification method based on feature migration and orthogonal prior according to claim 1, wherein the Adam adaptive optimization algorithm specifically comprises the following steps:
initializing the data: v_dW = 0, S_dW = 0, v_db = 0, S_db = 0, wherein v_dW, v_db, S_dW, S_db respectively represent the biased first- and second-moment estimates, and dW, db respectively denote the gradients with respect to W and b;
calculating the Momentum exponentially weighted averages:
v_dW = β₁·v_dW + (1 − β₁)·dW    (5)
v_db = β₁·v_db + (1 − β₁)·db    (6)
calculating the exponentially weighted averages of the squared gradients according to the RMSprop formula:
S_dW = β₂·S_dW + (1 − β₂)·(dW)²    (7)
S_db = β₂·S_db + (1 − β₂)·(db)²    (8)
calculating the bias corrections of the Momentum and RMSprop estimates:
bias correction of the Momentum terms:
v_dW^corrected = v_dW / (1 − β₁^t)    (9)
v_db^corrected = v_db / (1 − β₁^t)    (10)
bias correction of the RMSprop terms:
S_dW^corrected = S_dW / (1 − β₂^t)    (11)
S_db^corrected = S_db / (1 − β₂^t)    (12)
performing gradient descent and updating the weights:
W = W − α·v_dW^corrected / (√(S_dW^corrected) + ε)    (13)
b = b − α·v_db^corrected / (√(S_db^corrected) + ε)    (14)
in equations (5)-(14), t denotes the t-th iteration, α denotes the learning rate, which controls the update rate of the weights, ε denotes a very small constant, β₁ and β₂ respectively denote the exponential decay rates of the first- and second-moment estimates, and v_dW^corrected, v_db^corrected, S_dW^corrected, S_db^corrected denote the first- and second-moment estimates after bias correction.
5. The small sample image classification method based on feature migration and orthogonal prior according to claim 1, wherein step S4 comprises:
S41, testing process: each task T^test consists of a support set S^test and a query set Q^test; the query set Q^test of the test set is input into the embedding module f_θ and the fine-tuned orthogonal adaptation module g_φ to obtain the query features;
S42, multiplying the features output by the orthogonal adaptation module by the masks M_c of the different classes, the specific operation being shown in formula (1):
P_ck = g_φ(f_θ(Q_k)) ⊙ M_c    (1)
wherein Q_k is the kth query sample, ⊙ denotes element-wise multiplication of matrices of the same order, and M_cijh is the value at row i, column j, channel h of the class-c mask M_c, whose elements are:
M_cijh = 1 when the channel index h lies within the range assigned to class c, the channel index starting from 0, and M_cijh = 0 at the remaining positions    (2)
wherein C is the total number of categories under the current task, H is the number of feature channels, and H is an integer multiple of C;
S43, feeding the masked features into the metric module to calculate the cosine distances between the query sample and all support samples;
S44, taking the class of the support sample closest to the query sample as the predicted class of the query sample.
6. A small sample image classification device based on feature migration and orthogonal prior, characterized by being adapted to implement the method of any of claims 1-5 and comprising the following functional modules:
a pre-training module, which pre-trains on the images to obtain the embedding module f_θ for extracting image features, wherein the images comprise a training set and a test set;
a processing module, which introduces the idea of an orthogonal prior and constructs a feature learning network model based on feature migration and the orthogonal prior;
a computing module, which trains and optimizes the objective function of the orthogonal-prior feature learning network model to solve the model parameters;
a classification module, which classifies the test set images using the optimized orthogonal-prior feature learning network model.
CN202210487137.9A 2022-05-06 2022-05-06 Small sample image classification method and device based on feature migration and orthogonal prior Active CN115294381B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210487137.9A CN115294381B (en) 2022-05-06 2022-05-06 Small sample image classification method and device based on feature migration and orthogonal prior

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210487137.9A CN115294381B (en) 2022-05-06 2022-05-06 Small sample image classification method and device based on feature migration and orthogonal prior

Publications (2)

Publication Number Publication Date
CN115294381A CN115294381A (en) 2022-11-04
CN115294381B true CN115294381B (en) 2023-06-30

Family

ID=83819949

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210487137.9A Active CN115294381B (en) 2022-05-06 2022-05-06 Small sample image classification method and device based on feature migration and orthogonal prior

Country Status (1)

Country Link
CN (1) CN115294381B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116778268A (en) * 2023-04-20 2023-09-19 江苏济远医疗科技有限公司 Sample selection deviation relieving method suitable for medical image target classification

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110929603A (en) * 2019-11-09 2020-03-27 北京工业大学 Weather image identification method based on lightweight convolutional neural network
CN113379614A (en) * 2021-03-31 2021-09-10 西安理工大学 Computed ghost imaging reconstruction recovery method based on Resnet network

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB201703908D0 (en) * 2017-03-10 2017-04-26 Artificial Intelligence Res Group Ltd Object identification and motion tracking
CN109508655B (en) * 2018-10-28 2023-04-25 北京化工大学 SAR target recognition method based on incomplete training set of twin network
CN110188795B (en) * 2019-04-24 2023-05-09 华为技术有限公司 Image classification method, data processing method and device
CN111985611A (en) * 2020-07-21 2020-11-24 上海集成电路研发中心有限公司 Computing method based on physical characteristic diagram and DCNN machine learning reverse photoetching solution

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110929603A (en) * 2019-11-09 2020-03-27 北京工业大学 Weather image identification method based on lightweight convolutional neural network
CN113379614A (en) * 2021-03-31 2021-09-10 西安理工大学 Computed ghost imaging reconstruction recovery method based on Resnet network

Also Published As

Publication number Publication date
CN115294381A (en) 2022-11-04

Similar Documents

Publication Publication Date Title
WO2022160771A1 (en) Method for classifying hyperspectral images on basis of adaptive multi-scale feature extraction model
CN110197205A (en) A kind of image-recognizing method of multiple features source residual error network
CN104616029B (en) Data classification method and device
CN111401156B (en) Image identification method based on Gabor convolution neural network
CN110826056B (en) Recommended system attack detection method based on attention convolution self-encoder
Golovko et al. Development of solar panels detector
CN115294381B (en) Small sample image classification method and device based on feature migration and orthogonal prior
CN111832580B (en) SAR target recognition method combining less sample learning and target attribute characteristics
CN114782752B (en) Small sample image integrated classification method and device based on self-training
CN114943859B (en) Task related metric learning method and device for small sample image classification
Termritthikun et al. Accuracy improvement of Thai food image recognition using deep convolutional neural networks
CN112926485A (en) Few-sample sluice image classification method
CN114612450A (en) Image detection segmentation method and system based on data augmentation machine vision and electronic equipment
CN114329031A (en) Fine-grained bird image retrieval method based on graph neural network and deep hash
CN113837046A (en) Small sample remote sensing image scene classification method based on iterative feature distribution learning
CN116341629A (en) Self-supervision learning method based on position coding prediction combined with information packet
CN114818945A (en) Small sample image classification method and device integrating category adaptive metric learning
CN114627496A (en) Robust pedestrian re-identification method based on depolarization batch normalization of Gaussian process
CN110110769A (en) Image classification method based on width radial basis function network
Han et al. An Image Classification Approach based on Deep Learning and Transfer Learning
CN114782779B (en) Small sample image feature learning method and device based on feature distribution migration
CN113807400B (en) Hyperspectral image classification method, hyperspectral image classification system and hyperspectral image classification equipment based on attack resistance
CN116721278B (en) Hyperspectral image collaborative active learning classification method based on capsule network
CN111984816B (en) Tobacco bale random code image retrieval method based on L2 normalization
CN114372537B (en) Image description system-oriented universal countermeasure patch generation method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CB03 Change of inventor or designer information
Inventor after: Zhang Zhimin, Dong Hongfei, Li Xiaoxu, Liu Jun, Tang Zhuohe, Liu Zhongyuan, Zhang Wenbin, Zeng Junyu, Ma Zhanyu, Tao Jian
Inventor before: Li Xiaoxu, Dong Hongfei, Zhang Zhimin, Liu Jun, Tang Zhuohe, Liu Zhongyuan, Zhang Wenbin, Zeng Junyu, Ma Zhanyu, Tao Jian