CN114330650A - Small-sample feature analysis method and device based on evolutionary meta-learning model training - Google Patents

Small-sample feature analysis method and device based on evolutionary meta-learning model training

Info

Publication number
CN114330650A
Authority
CN
China
Prior art keywords
model
neural network
sample set
parameters
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111520388.4A
Other languages
Chinese (zh)
Inventor
Li Shuxiao (李书晓)
Zhu Chengfei (朱承飞)
Zhu Xiaomeng (朱晓萌)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science filed Critical Institute of Automation of Chinese Academy of Science
Priority to CN202111520388.4A
Publication of CN114330650A
Current legal status: Pending

Landscapes

  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention provides a small-sample feature analysis method and device based on evolutionary meta-learning model training, an electronic device, and a storage medium. The method comprises: acquiring a base-class sample set and a target sample set; training a first neural network model on the base-class sample set to obtain pre-training parameters; training a second neural network model with an evolutionary meta-learning training method based on the base-class sample set and the pre-training parameters to obtain general model parameters, the feature layers of the second and first neural network models having the same network structure; and adjusting the second neural network model based on the target sample set and the general model parameters to obtain a target neural network model. The method improves the effectiveness of feature analysis under small-sample conditions.

Description

Small-sample feature analysis method and device based on evolutionary meta-learning model training
Technical Field
The invention relates to the technical field of machine learning, and in particular to a small-sample feature analysis method and device based on evolutionary meta-learning model training, an electronic device, and a storage medium.
Background
With the development of machine learning technology, machine learning has spread to almost every application field, such as the military, financial, medical, and part-defect-inspection fields. In these fields, however, models must often be trained from only a small number of samples. For example, in special application scenarios such as abnormal-behavior diagnosis in finance or abnormal-state diagnosis of equipment in key industries, anomalies are low-probability events, so large numbers of training samples cannot be obtained. In medical disease diagnosis, annotated samples are also scarce owing to factors such as privacy, individual differences, and high annotation costs. In monitoring endangered wild animals, the animals are rare, most live in uninhabited areas, and images in typical environments and typical postures are difficult to capture. In high-end part defect detection, the qualification rate is high, defective parts are few, defect types are complex and varied, and samples of each defect type are extremely rare.
In the prior art, feature analysis performed with such small samples suffers from poor accuracy.
Disclosure of Invention
The invention provides a small-sample feature analysis method and device based on evolutionary meta-learning model training, an electronic device, and a storage medium, which overcome the poor small-sample feature analysis performance of the prior art and improve the effectiveness of small-sample feature analysis.
The invention provides a small-sample feature analysis method based on evolutionary meta-learning model training, comprising: acquiring a base-class sample set and a target sample set; training a first neural network model on the base-class sample set to obtain pre-training parameters; training a second neural network model with an evolutionary meta-learning training method based on the base-class sample set and the pre-training parameters to obtain general model parameters, wherein the feature layers of the second neural network model and the first neural network model have the same network structure; and adjusting the second neural network model based on the target sample set and the general model parameters to obtain a target neural network model.
According to the small-sample feature analysis method based on evolutionary meta-learning model training provided by the invention, training the second neural network model with the evolutionary meta-learning training method to obtain the general model parameters comprises: initializing the feature-layer parameters of the second neural network model with the feature-layer parameters in the pre-training parameters, and randomly initializing the parameters of the other network layers of the second neural network model; randomly sampling from the base-class sample set a plurality of candidate sample sets of the same scale as the target sample set, training the second neural network model to obtain model parameters for each candidate sample set, and obtaining the class accuracy corresponding to each set of model parameters using test samples; obtaining, by federated computation and evolutionary computation based on the model parameters and class accuracies of the candidate sample sets, synthesis model parameters and initial model parameters for the next generation of candidate sample sets; and repeating the sampling, training, evaluation, and federated and evolutionary computation steps until the neural network model converges, yielding the general model parameters.
According to the small-sample feature analysis method based on evolutionary meta-learning model training provided by the invention, the class accuracy comprises an accuracy statistic, obtained as follows: randomly sampling from the base-class sample set samples belonging to the same tasks as the candidate sample set to obtain the test samples; verifying the validity of the model parameters of the candidate sample set with the test samples to obtain a set of classification accuracies; and computing a statistic over the classification accuracies in the set to obtain the accuracy statistic.
According to the small-sample feature analysis method based on evolutionary meta-learning model training provided by the invention, obtaining the synthesis model parameters and the initial model parameters of the next generation of candidate sample sets by federated computation and evolutionary computation based on the model parameters and class accuracies of the candidate sample sets comprises: screening the model parameters of the candidate sample sets by their corresponding class accuracies and selecting the model parameters with the highest class accuracy as the preferred model parameters; synthesizing the model parameters of the candidate sample sets, weighted by their corresponding class accuracies, with federated computation to obtain the synthesis model parameters; randomly selecting two sets of parameters from the candidate-sample-set model parameters and the synthesis model parameters and crossing them to obtain variation model parameters, repeating until an execution-count threshold is reached to obtain a variation model parameter set; and combining the preferred model parameters, the synthesis model parameters, and the variation model parameter set to determine the initial model parameters of the next generation of candidate sample sets.
According to the small-sample feature analysis method based on evolutionary meta-learning model training provided by the invention, adjusting the second neural network model to obtain the target neural network model comprises: initializing the second neural network model with the general model parameters to obtain an initialized neural network model; and, among the network layers of the initialized neural network model, fixing the model parameters of all layers except the fully-connected layer and training the second neural network model on the target sample set until convergence, obtaining the target neural network model.
The invention also provides a model training apparatus, comprising: a first processing module for acquiring a base-class sample set and a target sample set; a second processing module for training a first neural network model on the base-class sample set to obtain pre-training parameters; a third processing module for training a second neural network model with an evolutionary meta-learning training method based on the base-class sample set and the pre-training parameters to obtain general model parameters, wherein the feature layers of the second neural network model and the first neural network model have the same network structure; and a fourth processing module for adjusting the second neural network model based on the target sample set and the general model parameters to obtain a target neural network model.
The present invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor executes the computer program to implement the steps of any of the above-mentioned model training methods.
The invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the model training method as described in any of the above.
The invention also provides a computer program product comprising a computer program which, when executed by a processor, performs the steps of the model training method as described in any one of the above.
According to the small-sample feature analysis method, device, electronic device, and storage medium based on evolutionary meta-learning model training provided by the invention, a base-class sample set and a target sample set are acquired; a first neural network model is trained on the base-class sample set to obtain pre-training parameters; a second neural network model is trained with an evolutionary meta-learning training method based on the base-class sample set and the pre-training parameters to obtain general model parameters, the feature layers of the second and first neural network models having the same network structure; and the second neural network model is adjusted based on the target sample set and the general model parameters to obtain the target neural network model. Because the general model parameters are obtained from the pre-training parameters and the base-class sample set, and the second neural network model is then adjusted with the target sample set and the general model parameters, the optimization capability is improved and model training performance under small-sample conditions is improved.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of the small-sample feature analysis method based on evolutionary meta-learning model training provided by the invention;
FIG. 2 is a second schematic flow chart of the small-sample feature analysis method based on evolutionary meta-learning model training provided by the invention;
FIG. 3 is a third schematic flow chart of the small-sample feature analysis method based on evolutionary meta-learning model training provided by the invention;
FIG. 4 is a fourth schematic flow chart of the small-sample feature analysis method based on evolutionary meta-learning model training provided by the invention;
FIG. 5 is a fifth schematic flow chart of the small-sample feature analysis method based on evolutionary meta-learning model training provided by the invention;
FIG. 6 is a schematic architecture diagram of the small-sample feature analysis method based on evolutionary meta-learning model training provided by the invention;
FIG. 7 is a schematic structural diagram of the small-sample feature analysis device based on evolutionary meta-learning model training provided by the invention;
FIG. 8 is a schematic structural diagram of the electronic device provided by the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The model training method of the present invention is described below in conjunction with fig. 1-5.
In one embodiment, as shown in FIG. 1, a small-sample feature analysis method based on evolutionary meta-learning model training is provided. The method is described here as applied to a server and comprises the following steps:
step 102, a base class sample set and a target sample set are obtained.
The base-class samples are a public data set of typical tasks that can be downloaded from the Internet; the data set covers many tasks, and each task contains a large number of samples. For example, the ImageNet data set in the image classification field contains 1000 object classes (i.e., tasks), each object class has more than 1000 annotated images (i.e., samples), and the total number of samples exceeds ten million. The target samples are the small samples to be trained on.
Specifically, the server may obtain the base-class sample set by public download and acquire the target sample set as required.
Step 104, a first neural network model is trained on the base-class sample set to obtain pre-training parameters.
The first neural network model is a common model suited to the task to be completed and comprises a feature layer and a classification layer. The feature layer is chosen according to the characteristics of the task, for example a Transformer network in speech recognition, a ResNet network in image classification, or a PointNet network in point-cloud recognition. The classification layer is a fully-connected layer whose number of output neurons equals the number of tasks, i.e., classes, in the sample set.
Specifically, after obtaining the base-class sample set and the target sample set, the server randomly initializes the first neural network model and trains it on the base-class sample set until convergence. Because the number of classification-layer neurons for the base-class sample set differs from that for the target task sample set, only the feature-layer network parameters are retained as the pre-training parameters.
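For illustration only, a minimal PyTorch sketch of this pre-training step is given below; the ResNet-18 backbone, the optimizer settings, and the data-loader interface are assumptions of the sketch, not requirements of the method.

```python
import torch
import torch.nn as nn
from torchvision import models

def pretrain_on_base_classes(base_loader, num_base_classes, epochs=10, device="cpu"):
    """Train a first network (feature layer + classification layer) on the
    base-class sample set and return only the feature-layer parameters."""
    model = models.resnet18(num_classes=num_base_classes).to(device)  # assumed backbone
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
    criterion = nn.CrossEntropyLoss()

    for _ in range(epochs):
        for images, labels in base_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()

    # Keep only feature-layer parameters; the classification layer ("fc") is dropped
    # because the number of target classes differs from the number of base classes.
    return {k: v.cpu() for k, v in model.state_dict().items() if not k.startswith("fc.")}
```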
Step 106, a second neural network model is trained with an evolutionary meta-learning training method based on the base-class sample set and the pre-training parameters to obtain general model parameters, wherein the feature layers of the second neural network model and the first neural network model have the same network structure.
the second neural network model is a small sample training model, and can train a small sample set. The small sample set is the target sample set. The common model parameters are model parameters that can be applied to different small sample learning tasks having the same task size as the small sample set.
Specifically, after obtaining the pre-training parameters, the server trains the second neural network model with the evolutionary meta-learning training method, based on the base-class sample set and the pre-training parameters, to obtain the general model parameters.
Step 108, the second neural network model is adjusted based on the target sample set and the general model parameters to obtain the target neural network model.
Specifically, after obtaining the general model parameters, the server inputs each target sample in the target sample set into the second neural network model for training, and continuously adjusts the classification layer parameters of the second neural network model in the training process until the model converges to obtain the target neural network model.
According to the small-sample feature analysis method based on evolutionary meta-learning model training, a base-class sample set and a target sample set are obtained; a first neural network model is trained on the base-class sample set to obtain pre-training parameters; a second neural network model is trained with an evolutionary meta-learning training method based on the base-class sample set and the pre-training parameters to obtain general model parameters, the feature layers of the second and first neural network models having the same network structure; and the second neural network model is adjusted based on the target sample set and the general model parameters to obtain the target neural network model. Because the general model parameters are obtained from the pre-training parameters and the base-class sample set, and the second neural network model is then adjusted with the target sample set and the general model parameters, the optimization capability is improved and model training performance under small-sample conditions is improved.
In one embodiment, as shown in FIG. 2, training the second neural network model with the evolutionary meta-learning training method to obtain the general model parameters includes:
Step 202, the feature-layer parameters of the second neural network model are initialized with the feature-layer parameters in the pre-training parameters, and the parameters of the other network layers of the second neural network model are randomly initialized.
The feature layer parameters refer to parameters of a network layer for extracting features of the sample.
Specifically, after obtaining the pre-training parameters, the server initializes the feature layer parameters of the second neural network model by using the feature layer parameters in the pre-training parameters, and randomly initializes the parameters of other network layers of the second neural network model.
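A minimal sketch of this initialization, assuming the same hypothetical ResNet-style backbone as in the earlier sketch; loading with strict=False leaves the randomly initialized classification layer untouched.

```python
from torchvision import models

def init_second_model(pretrained_feature_params, num_target_classes):
    """Initialize the second model: feature layers from the pre-training parameters,
    remaining layers (here the classification layer) keep their random initialization."""
    model = models.resnet18(num_classes=num_target_classes)  # same feature-layer structure assumed
    # strict=False: only feature-layer keys need to be present in the loaded dict.
    model.load_state_dict(pretrained_feature_params, strict=False)
    return model
```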
Step 204, a plurality of candidate sample sets of the same scale as the target sample set are randomly sampled from the base-class sample set, the second neural network model is trained to obtain model parameters for each candidate sample set, and the class accuracy corresponding to each set of model parameters is obtained with test samples.
A candidate task sample set is formed by randomly selecting C tasks from the base-class sample set, where C is the number of classes in the target task sample set, and randomly sampling K samples for each task, where K equals the number of samples per task in the target task sample set. For example, if the target task sample set contains 2 classes of tasks, one with 6 sample images and the other with 10, then 2 target classes may be randomly selected from ImageNet with 8 samples each (the number of selected samples being the average number of samples per task), forming a candidate task sample set. A candidate task sample set is essentially a simulated small-sample feature analysis task. The test samples are samples belonging to the same tasks as the candidate task sample set but disjoint from it; for example, the samples in the candidate task sample set are first removed from ImageNet, and then 100 samples are randomly selected for each of the C tasks of the candidate task sample set as test samples. The class accuracy is the accuracy of class recognition on the test samples; for example, with two tasks of 100 test samples each, if one task has 60 samples recognized correctly and the other 80, the class accuracy is 70%.
Specifically, the server first generates k candidate task sample sets, where k is the total number of candidate task sample sets in one generation; it then performs small-step training for a fixed number of iterations starting from the pre-training parameters, which prevents the model from overfitting and yields the model parameters corresponding to the k candidate task sample sets; finally, it verifies the validity of these model parameters with the test samples to obtain the class accuracies corresponding to the k candidate task sample sets.
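The sampling of one candidate task sample set and its disjoint test samples might look as follows; the dictionary-based sample store and the 100 test samples per task are assumptions taken from the example above.

```python
import random
from collections import defaultdict

def sample_episode(base_samples, target_classes_c, shots_k, test_per_task=100):
    """Randomly draw a candidate task sample set (C tasks, K samples each) plus a
    disjoint test set of the same tasks from the base-class sample set.
    `base_samples` maps class -> list of samples; counts mirror the target set."""
    classes = random.sample(list(base_samples.keys()), target_classes_c)
    support, test = defaultdict(list), defaultdict(list)
    for c in classes:
        pool = random.sample(base_samples[c], shots_k + test_per_task)
        support[c] = pool[:shots_k]       # simulated small-sample task
        test[c] = pool[shots_k:]          # held-out samples of the same task
    return support, test
```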
Step 206, synthesis model parameters and initial model parameters for the next generation of candidate sample sets are obtained by federated computation and evolutionary computation, based on the model parameters and class accuracies of the candidate sample sets.
The synthesis model parameters are obtained by federated computation from the model parameters of the k candidate task sample sets, using average synthesis, weighted average synthesis, or adaptive weighted average synthesis.
Specifically, after obtaining the model parameters and class accuracy of each candidate sample set, the server obtains the synthesis model parameters and the initial model parameters of the next generation of candidate sample sets by federated computation and evolutionary computation.
In one embodiment, suppose the model parameters obtained for the k candidate task sample sets are denoted {M_1, M_2, …, M_k} and the corresponding class accuracies {mAP_1, mAP_2, …, mAP_k}. Using average synthesis, the synthesis model parameters F are obtained as

F = \frac{1}{k}\sum_{i=1}^{k} M_i \qquad (1)

Using weighted average synthesis, the synthesis model parameters F are obtained as

F = \sum_{i=1}^{k} \frac{mAP_i}{\sum_{j=1}^{k} mAP_j}\, M_i \qquad (2)

Using adaptive weighted average synthesis, the synthesis model parameters F are expressed as

F = \sum_{i=1}^{k} \frac{g(M_i)}{\sum_{j=1}^{k} g(M_j)}\, M_i \qquad (3)

where g(M) is the information entropy of model M, reflecting how fully the model has been trained. Alternatively, supposing the model parameters {M_1, M_2, …, M_k} have per-model parameter means {\bar{m}_1, \bar{m}_2, …, \bar{m}_k}, the model parameters are adaptively weighted and averaged according to these means to obtain the synthesis model parameters F (formula (4)), where i is a positive integer from 1 to k.
Obtaining the synthesis model parameters by average synthesis, weighted average synthesis, or adaptive weighted average synthesis improves the stability of the model parameters.
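The synthesis options above can be sketched as one helper over parameter dictionaries; the adaptive weights (for example entropy-based ones) are passed in externally, since the patent does not fix how g(M) is computed.

```python
def synthesize(param_dicts, accuracies=None, weights=None):
    """Federated synthesis of k model-parameter dicts (PyTorch tensors).
    - accuracies=None, weights=None: plain average, formula (1)
    - accuracies given: accuracy-weighted average, formula (2)
    - weights given: adaptive weights, e.g. entropy-based, formula (3)"""
    k = len(param_dicts)
    if weights is None:
        weights = accuracies if accuracies is not None else [1.0] * k
    total = sum(weights)
    norm = [w / total for w in weights]
    fused = {}
    for key in param_dicts[0]:
        fused[key] = sum(n * p[key].float() for n, p in zip(norm, param_dicts))
    return fused
```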
Step 208, the above steps are executed repeatedly: a plurality of candidate sample sets of the same scale as the target sample set are randomly sampled from the base-class sample set, the second neural network model is trained to obtain model parameters for each candidate sample set, the class accuracies corresponding to the model parameters are obtained with test samples, and synthesis model parameters and initial model parameters for the next generation of candidate sample sets are obtained by federated and evolutionary computation, until the neural network model converges and the general model parameters are obtained.
Specifically, suppose the model parameters of the candidate sample sets are denoted {M_1, M_2, …, M_k} and the corresponding class accuracies {mAP_1, mAP_2, …, mAP_k}. The model parameters E with the highest class accuracy are found and used as one of the initial parameter sets; the federated model parameters F, obtained by federated computation as in formulas (1)-(4), are used as another. Finally, by evolutionary computation, one of E and F is randomly selected as one parent and one of the model parameters {M_1, M_2, …, M_k} is randomly selected as the other parent; a random crossover operation then yields variation model parameters EV or federated variation model parameters FV, used as further initial model parameters for the next generation of candidate sample sets.
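A sketch of the selection and crossover step; the element-wise uniform crossover is one plausible reading of the random cross operation, not a form fixed by the patent.

```python
import random
import torch

def crossover(parent_a, parent_b):
    """Randomly cross two parameter dicts: each tensor element is taken
    from parent_a or parent_b with equal probability (assumed crossover scheme)."""
    child = {}
    for key in parent_a:
        mask = torch.rand_like(parent_a[key].float()) < 0.5
        child[key] = torch.where(mask, parent_a[key].float(), parent_b[key].float())
    return child

def make_variant(best_E, federated_F, individual_params):
    """Pick E or F as one parent, a random individual as the other, and cross them."""
    parent_a = random.choice([best_E, federated_F])
    parent_b = random.choice(individual_params)
    return crossover(parent_a, parent_b)
```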
In this embodiment, the feature-layer parameters of the second neural network model are initialized with the feature-layer parameters in the pre-training parameters and the parameters of its other network layers are randomly initialized; a plurality of candidate sample sets of the same scale as the target sample set are randomly sampled from the base-class sample set; the second neural network model is trained to obtain model parameters for each candidate sample set; the class accuracies corresponding to the model parameters are obtained with the test samples; synthesis model parameters and initial model parameters for the next generation of candidate sample sets are obtained by federated and evolutionary computation; and steps 204 and 206 are executed repeatedly until the neural network model converges, so that the general model parameters are obtained accurately.
In one embodiment, as shown in fig. 3, the category accuracy includes an accuracy statistic, and the obtaining process of the accuracy statistic includes:
step 302, randomly sampling samples with the same task as the candidate sample set from the base class sample set to obtain test samples.
Specifically, after obtaining the base-class sample set, the server randomly samples from it samples belonging to the same tasks as the candidate sample set, obtaining the test samples.
Step 304, the validity of the model parameters of the candidate sample set is verified with the test samples to obtain a classification accuracy set.
Specifically, after obtaining the test samples, the server feeds them into the neural network, whose parameters are the model parameters trained on the candidate sample set, and records the classification accuracy for each object class, forming the classification accuracy set. For example, verifying the test samples with the neural network model of the candidate sample set yields a classification accuracy for each object class in the test samples; assuming 5 object classes, the classification accuracies for classes 1 to 5 might be {30%, 50%, 20%, 90%, 80%}.
Step 306, a statistic is computed over the classification accuracies in the classification accuracy set to obtain the accuracy statistic.
Specifically, after obtaining the classification accuracy set, the server computes a statistic over the classification accuracies; for example, the mean of the classification accuracies may be taken as the accuracy statistic.
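A hedged sketch of this evaluation, computing per-class accuracies on the test samples and returning their mean as the accuracy statistic; the loader interface and model signature are assumptions.

```python
import torch

@torch.no_grad()
def class_accuracy_statistic(model, test_loader, num_classes, device="cpu"):
    """Compute per-class classification accuracies on the test samples and
    return them together with their mean (the accuracy statistic)."""
    model.eval()
    correct = torch.zeros(num_classes)
    total = torch.zeros(num_classes)
    for images, labels in test_loader:
        preds = model(images.to(device)).argmax(dim=1).cpu()
        for c in range(num_classes):
            mask = labels == c
            total[c] += mask.sum()
            correct[c] += (preds[mask] == c).sum()
    per_class = correct / total.clamp(min=1)
    return per_class, per_class.mean().item()
```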
In this embodiment, a sample with the same task as that of the candidate sample set is randomly sampled from the base class sample set to obtain a test sample, validity verification is performed on model parameters of the candidate sample set by using the test sample to obtain a classification accuracy set, and each classification accuracy in the classification accuracy set is counted to obtain an accuracy statistic. The purpose of accurately obtaining the accuracy statistic value can be achieved.
In one embodiment, as shown in FIG. 4, obtaining the synthesis model parameters and the initial model parameters of the next generation of candidate sample sets by federated and evolutionary computation, based on the model parameters and class accuracies of the candidate sample sets, includes:
and 402, screening the model parameters of each candidate sample set according to the corresponding category accuracy, and determining the model parameters with the highest category accuracy as the preferred model parameters.
The preferred model parameters are the best-performing parameters among the model parameters of the k candidate task sample sets. For example, assuming the class accuracy set is {20%, 30%, 60%, 70%} and the corresponding neural network model parameters are {M_1, M_2, M_3, M_4}, the class accuracy of candidate task sample set 4 is the maximum, so the model parameters M_4 are taken as the preferred model parameters.
Specifically, the object-class accuracies in the set are sorted in descending order, and the first is taken as the maximum object-class accuracy.
Specifically, after obtaining the maximum object-class accuracy, the server takes the neural network model parameters corresponding to it as the preferred model parameters. For example, assuming the object-class accuracy set for {candidate sample set 1, candidate sample set 2, candidate sample set 3, candidate sample set 4} is {20%, 30%, 60%, 70%} and the corresponding neural network model parameters are {M_1, M_2, M_3, M_4}, the object-class accuracy of candidate sample set 4 is the maximum, and its model parameters M_4 are taken as the preferred model parameters E.
Step 404, the model parameters of the candidate sample sets, weighted by their corresponding class accuracies, are synthesized by federated computation to obtain the synthesis model parameters.
Specifically, denoting the synthesis model parameters by F, the class accuracy of the i-th candidate sample set by mAP_i, and its model parameters by M_i, the synthesis model parameters F are given by formulas (1)-(4) above.
Step 406, two sets of parameters are randomly selected from the candidate-sample-set model parameters and the synthesis model parameters and crossed to obtain variation model parameters; this is repeated until an execution-count threshold is reached, yielding a variation model parameter set.
Specifically, after obtaining the preferred model parameters E and the synthesis model parameters F, the server randomly selects one of them as one parent, randomly selects one of the model parameters of the candidate sample sets as the other parent, and crosses the two to obtain variation model parameters EV or FV. This is repeated until the execution-count threshold is reached, yielding the variation model parameter set E_n or F_n.
Step 408, the preferred model parameters, the synthesis model parameters, and the variation model parameter set are combined to determine the initial model parameters of the next generation of candidate sample sets.
Specifically, after obtaining the preferred model parameters E, the synthesis model parameters F, and the variation model parameter set E_n or F_n, the server may select at least one type of these model parameters as the initial model parameters for each candidate sample set of the next generation.
In one embodiment, after obtaining the model parameters and class accuracies of the k candidate task sample sets, the server obtains the preferred model parameters by survival-of-the-fittest selection, the synthesis model parameters by federated computation, and k-2 variation model parameter sets by evolutionary computation; these k sets of model parameters together serve as the initial model parameters of the k candidate task sample sets of the next generation.
In one embodiment, the server has obtained the model parameters and class accuracies of the k first-generation candidate task sample sets, and obtains k sets of model parameters by survival-of-the-fittest selection, federated computation, and evolutionary computation, which serve as the initial model parameters of the second-generation candidate task sample sets; the second generation randomly samples k candidate task sample sets and trains from the initial model parameters handed down by the first generation to obtain the model parameters and class accuracies of the k second-generation candidate task sample sets; training continues in this way until the neural network model converges or a generation threshold is reached, for example 100 generations, after which the well-generalizing general model parameters are obtained.
In one embodiment, the variation model parameter set is obtained by randomly selecting two sets of parameters from the model parameters {M_1, M_2, …, M_k} of the candidate task sample sets and the synthesis model parameters F, crossing them to obtain variation model parameters, and repeating this k-2 times in total, where k is the total number of candidate task sample sets.
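Combining selection, federated synthesis, and variation, the hand-off from one generation to the next might be sketched as below, reusing the hypothetical synthesize and make_variant helpers from the earlier sketches.

```python
def next_generation_init(individual_params, accuracies):
    """Assemble k initial parameter sets for the next generation:
    1 preferred (highest accuracy) + 1 federated synthesis + (k-2) crossover variants."""
    k = len(individual_params)
    best_E = individual_params[max(range(k), key=lambda i: accuracies[i])]
    federated_F = synthesize(individual_params, accuracies=accuracies)  # formula (2) variant
    variants = [make_variant(best_E, federated_F, individual_params) for _ in range(k - 2)]
    return [best_E, federated_F] + variants
```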
In this embodiment, the model parameters of the candidate sample sets are screened by their class accuracies and the set with the highest class accuracy is taken as the preferred model parameters; the model parameters and their class accuracies are synthesized by federated computation into the synthesis model parameters; two sets of parameters are repeatedly selected at random from the candidate-sample-set model parameters and the synthesis model parameters and crossed, until the execution-count threshold is reached, to obtain the variation model parameter set; and the preferred, synthesis, and variation model parameters are combined to determine the initial model parameters of the next generation of candidate sample sets. Crossover increases the diversity of the model parameters. Federated computation of the synthesis model parameters strengthens convergence stability when the candidate task sample sets are not independently and identically distributed; evolutionary computation of the variation model parameters enriches the diversity of the next generation's initial model parameters and improves the optimization capability; combining the strengths of meta-learning, federated computation, and evolutionary computation improves model training performance under small-sample conditions.
In one embodiment, as shown in fig. 5, adjusting the second neural network model to obtain the target neural network model comprises:
and 502, initializing the second neural network model by using the general model parameters to obtain an initialized neural network model.
Specifically, after obtaining the general model parameters, the server initializes the neural network model by using the general model parameters, so as to obtain an initialized neural network model.
Step 504, among the network layers of the initialized neural network model, the model parameters of all layers except the fully-connected layer are fixed, and the second neural network model is trained on the target sample set until convergence to obtain the target neural network model.
Specifically, after obtaining the initialized neural network model, the server fixes the model parameters of all layers except the last, fully-connected layer and trains the neural network model on the target sample set until convergence to obtain the target neural network model. The target sample set can be regarded as a new-class set, i.e., small samples: samples that need to be classified into object classes and are available only in limited quantity, typically 1-100 samples.
In this embodiment, the second neural network model is initialized with the general model parameters to obtain an initialized neural network model; the model parameters of all network layers except the fully-connected layer are fixed; and the second neural network model is trained on the target sample set until convergence, yielding a target neural network model with optimal performance that converges rapidly.
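A minimal sketch of this fine-tuning step; treating model.fc as the fully-connected classification layer matches the hypothetical ResNet-style sketches above and is an assumption of the sketch.

```python
import torch
import torch.nn as nn

def finetune_on_target(model, target_loader, epochs=50, lr=0.01, device="cpu"):
    """Freeze every layer except the final fully-connected layer and train on the
    target (small-sample) set; model.fc is assumed to be the classification layer."""
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith("fc.")
    optimizer = torch.optim.SGD(model.fc.parameters(), lr=lr, momentum=0.9)
    criterion = nn.CrossEntropyLoss()
    model.to(device).train()
    for _ in range(epochs):
        for images, labels in target_loader:
            optimizer.zero_grad()
            loss = criterion(model(images.to(device)), labels.to(device))
            loss.backward()
            optimizer.step()
    return model
```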
In one embodiment, as shown in FIG. 6, a training process for small-sample feature analysis is taken as an example. It requires the following steps:
Step 1: several meta-samples are selected from the base-class samples, i.e., from the large-sample set, and used to train neural network models of the same generation in parallel, yielding the model parameters corresponding to the different meta-samples. For example, meta-sample 1-1, meta-sample 1-2, meta-sample 1-3, and meta-sample 1-4 are fed into their corresponding sub-neural-network models and trained in parallel to obtain the corresponding parameters M_{1-1}, M_{1-2}, M_{1-3}, and M_{1-4}.
Step 2: the initial parameters of the next generation of models are obtained using survival-of-the-fittest and genetic-variation mechanisms.
Step 3: steps 1 and 2 are executed in a loop to obtain the general feature model parameters of the neural network model.
Step 4: using the general feature model parameters obtained in step 3, the parameters of the last layer of the neural network model are optimized, and the neural network model is trained on the new-class sample set to obtain a small-sample feature analysis model capable of rapid convergence.
It is understood that the neural network models in steps 3 and 4 have the same network structure as the neural network model in step 1.
Through these steps, the small-sample feature analysis model can be initialized with general feature model parameters of good stability and strong generalization, improving the rapid-convergence capability of small-sample learning.
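As a rough end-to-end outline of steps 1 to 4, tying together the hypothetical helpers sketched earlier (sample_episode, init_second_model, next_generation_init, synthesize, finetune_on_target); the generation count, population size, and the omitted inner training and evaluation loop are placeholders, not values fixed by the patent.

```python
def evolutionary_meta_training(base_samples, pretrained_params, target_loader,
                               c_ways, k_shots, k_individuals=8, generations=100):
    """Steps 1-3: evolve general feature-model parameters over many generations;
    step 4: fine-tune on the target (new-class) small-sample set."""
    # All first-generation individuals share the same initial parameters.
    init_params = [dict(pretrained_params) for _ in range(k_individuals)]
    for _ in range(generations):
        individual_params, accuracies = [], []
        for params in init_params:
            support, test = sample_episode(base_samples, c_ways, k_shots)
            model = init_second_model(params, c_ways)
            # ... small-step training of `model` on `support` and evaluation on
            # `test` are omitted here; they produce updated parameters and a
            # measured class accuracy.
            individual_params.append(model.state_dict())
            accuracies.append(1.0)  # placeholder; replace with measured accuracy
        init_params = next_generation_init(individual_params, accuracies)
    general_params = synthesize(individual_params, accuracies=accuracies)
    target_model = init_second_model(general_params, c_ways)
    return finetune_on_target(target_model, target_loader)
```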
In an embodiment, step 2 is implemented as follows: test samples are obtained and fed into the parallel neural network models corresponding to the meta-sample sets to obtain the average object-class accuracy of each model; following the survival-of-the-fittest principle, individuals with strong adaptability have their life cycle extended, and the neural network model parameters E corresponding to the meta-sample set with the highest average accuracy are kept as one initial parameter set of the next generation of models; following the idea of federated computation, federated model parameters F, which have a certain fault tolerance and good convergence, are obtained and used as one or more further initial parameter sets of the next generation of models.
In one embodiment, the federated model parameters F are obtained from the individual model parameters {M_1, M_2, …, M_k} and their average accuracies {mAP_1, mAP_2, …, mAP_k}, where k is the number of individual models in one generation. Here M_1 comprises the model parameters output for the first meta-sample in each generation shown in FIG. 6, namely M_{1-1}, M_{2-1}, M_{3-1}, …; M_2 comprises the model parameters output for the second meta-sample in each generation, namely M_{1-2}, M_{2-2}, M_{3-2}, …; and so on. The individual model parameters are the parameters output by the neural network models after training and iteration on the meta-samples of the same generation. Similarly, mAP_1 comprises the average accuracy of the model parameters output for the first meta-sample in each generation shown in FIG. 6, mAP_2 the average accuracy for the second meta-sample, and so on. The federated model parameters F are obtained by any one of the following calculations:
Mode 1: the individual model parameters are averaged to obtain the federated model parameters F, as expressed by formula (1).
Mode 2: all individual model parameters are weighted-averaged according to their average accuracies to obtain the federated model parameters F, as expressed by formula (2).
Mode 3: all individual model parameters are adaptively weighted-averaged according to the means {\bar{m}_1, \bar{m}_2, …, \bar{m}_k} of the parameters within each set of individual model parameters to obtain the federated model parameters F, as expressed by formula (4).
In an embodiment, as shown in FIG. 6, based on the survival-of-the-fittest and genetic-variation mechanisms, variation may be performed on the model parameters E of the neural network model with the highest average accuracy or on the federated model parameters F, producing individual variation model parameters EV or federated variation model parameters FV to be used as initial parameters of the next generation of models. Specifically, one of the highest-accuracy model parameters E and the federated model parameters F is randomly selected as one parent, one of the individual model parameters {M_1, M_2, …, M_k} is randomly selected as the other parent, and a random crossover operation then generates an initial parameter set for the next generation of models. The number of variation model parameter sets is chosen according to actual needs, and the operation is executed the corresponding number of times so that the number of individual models per generation remains unchanged.
In an embodiment, step 3 proceeds as follows: the initial model parameters of the first-generation individuals are obtained by random initialization or by whole-network training on the base-class set, and all first-generation individuals share the same initial model parameters to avoid inconsistency of the overall form of the model parameters across individuals; step 1 is run to obtain updated individual model parameters under the different sample tasks, step 2 is run to obtain the initial model parameters of the next generation of individuals, and steps 1 and 2 are executed in a loop n times to obtain the general feature model parameters F_n of the neural network model. It should be noted that, because the pre-training classification-layer network structure does not match the small-sample feature-analysis classification-layer structure, only the pre-training parameters of the feature network are retained after whole-network training on the base-class set; the classification-layer parameters are initialized randomly, or may be initialized with the class-prototype feature vectors of the meta-sample (meta-task) set, which accelerates convergence.
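One way the class-prototype initialization mentioned above could be realized is sketched below; setting each row of the fully-connected weight matrix to the mean feature vector of one class is an assumption consistent with common prototype-based practice, not a form fixed by the patent.

```python
import torch

@torch.no_grad()
def init_fc_with_prototypes(model, feature_extractor, support_loader, num_classes, feat_dim):
    """Initialize the classification layer with class-prototype feature vectors:
    each row of the fc weight matrix is set to the mean feature of one class."""
    protos = torch.zeros(num_classes, feat_dim)
    counts = torch.zeros(num_classes)
    for images, labels in support_loader:
        feats = feature_extractor(images)          # expected shape: (batch, feat_dim)
        for c in range(num_classes):
            mask = labels == c
            protos[c] += feats[mask].sum(dim=0)
            counts[c] += mask.sum()
    protos /= counts.clamp(min=1).unsqueeze(1)
    model.fc.weight.copy_(protos)                  # assumes model.fc = Linear(feat_dim, num_classes)
    model.fc.bias.zero_()
    return model
```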
In one embodiment, the base-class large samples and the new-class small samples shown in FIG. 6 are characterized as follows. Suppose the small-sample new-class set has C classes with K samples each, i.e., it can be represented as a C-way K-shot small-sample classification task. The base-class set of large numbers of samples has C′ (C′ > C) classes with K′ (K′ > K) samples each and can be extracted from a large-scale common-object data set, e.g., ImageNet. The small-sample new-class set consists of uncommon object types and has no intersection with the target classes in the base-class set. C classes are randomly sampled from the base-class set and K samples are randomly sampled per class to obtain one meta-sample, i.e., meta-task set; repeating this sampling yields multiple meta-samples (meta-task sets).
In one embodiment, the individual model parameters are obtained as follows: starting from the initial model parameters, i.e., the first-generation parameters or the parameters produced by evolutionary computation, the meta-samples (meta-task sets) of the same generation are used to train neural network models under different episodic tasks, yielding individual model parameters for each episodic task; for example, a ResNet50 feature network and a fully-connected classification layer are trained by back-propagation to obtain the individual model parameters of each meta-sample (meta-task set). Different episodic tasks correspond to different meta-samples (meta-task sets).
The model training device provided by the present invention is described below, and the model training device described below and the model training method described above may be referred to in correspondence with each other.
In one embodiment, as shown in FIG. 7, there is provided a model training apparatus 700 comprising: a first processing module 702, a second processing module 704, a third processing module 706, and a fourth processing module 708, wherein: the first processing module is used for acquiring a base class sample set and a target sample set; the second processing module is used for training by utilizing the first neural network model based on the base class sample set to obtain a pre-training parameter; a third processing module, configured to train a second neural network model by using an evolutionary learning model training method based on the base class sample set and the pre-training parameters to obtain general model parameters, where feature layers of the second neural network model and the first neural network model have the same network structure; and the fourth processing module is used for adjusting the second neural network model based on the target sample set and the general model parameters to obtain a target neural network model.
In one embodiment, the third processing module 706 is configured to: initialize the feature-layer parameters of the second neural network model with the feature-layer parameters in the pre-training parameters and randomly initialize the parameters of the other network layers; randomly sample from the base-class sample set a plurality of candidate sample sets of the same scale as the target sample set, train the second neural network model to obtain model parameters for each candidate sample set, and obtain the class accuracies corresponding to the model parameters with test samples; obtain, by federated and evolutionary computation based on the model parameters and class accuracies of the candidate sample sets, synthesis model parameters and initial model parameters for the next generation of candidate sample sets; and repeat these steps until the neural network model converges, obtaining the general model parameters.
In an embodiment, the third processing module 706 is configured to randomly sample from the base-class sample set samples of the same tasks as the candidate sample set to obtain the test samples, verify the validity of the model parameters of the candidate sample set with the test samples to obtain a classification accuracy set, and compute a statistic over the classification accuracies in the set to obtain the accuracy statistic.
In an embodiment, the third processing module 706 is configured to screen the model parameters of the candidate sample sets by their corresponding class accuracies and take the model parameters with the highest class accuracy as the preferred model parameters; synthesize the model parameters of the candidate sample sets, weighted by their class accuracies, with federated computation to obtain the synthesis model parameters; randomly select two sets of parameters from the candidate-sample-set model parameters and the synthesis model parameters and cross them to obtain variation model parameters, repeating until the execution-count threshold is reached to obtain a variation model parameter set; and combine the preferred model parameters, the synthesis model parameters, and the variation model parameter set to determine the initial model parameters of the next generation of candidate sample sets.
In an embodiment, the fourth processing module 708 is configured to initialize the second neural network model with the general model parameters to obtain an initialized neural network model, fix the model parameters of all network layers except the fully-connected layer, and train the second neural network model on the target sample set until convergence to obtain the target neural network model.
FIG. 8 illustrates the physical structure of an electronic device. As shown in FIG. 8, the electronic device may include a processor 810, a communication interface 820, a memory 830, and a communication bus 840, where the processor 810, the communication interface 820, and the memory 830 communicate with one another via the communication bus 840. The processor 810 may invoke logic instructions in the memory 830 to perform the small-sample feature analysis method based on evolutionary meta-learning model training, the method comprising: acquiring a base-class sample set and a target sample set; training a first neural network model on the base-class sample set to obtain pre-training parameters; training a second neural network model with an evolutionary meta-learning training method based on the base-class sample set and the pre-training parameters to obtain general model parameters, wherein the feature layers of the second neural network model and the first neural network model have the same network structure; and adjusting the second neural network model based on the target sample set and the general model parameters to obtain a target neural network model.
In addition, the logic instructions in the memory 830 may be implemented in software functional units and stored in a computer readable storage medium when the logic instructions are sold or used as independent products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
In another aspect, the present invention also provides a computer program product comprising a computer program that can be stored on a non-transitory computer-readable storage medium and that, when executed by a processor, can perform a method of small sample feature analysis based on training of an evolutionary meta-learning model, the method comprising: acquiring a base class sample set and a target sample set; training by utilizing a first neural network model based on the base class sample set to obtain pre-training parameters; training a second neural network model by using an evolutionary element learning model training method based on the base class sample set and the pre-training parameters to obtain general model parameters, wherein the characteristic layers of the second neural network model and the first neural network model have the same network structure; and adjusting the second neural network model based on the target sample set and the general model parameters to obtain a target neural network model.
In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program that, when executed by a processor, implements a method of small sample feature analysis based on training of an evolutionary meta-learning model, the method comprising: acquiring a base class sample set and a target sample set; training by utilizing a first neural network model based on the base class sample set to obtain pre-training parameters; training a second neural network model by using an evolutionary element learning model training method based on the base class sample set and the pre-training parameters to obtain general model parameters, wherein the characteristic layers of the second neural network model and the first neural network model have the same network structure; and adjusting the second neural network model based on the target sample set and the general model parameters to obtain a target neural network model.
The above-described apparatus embodiments are merely illustrative; the units described as separate parts may or may not be physically separate, and the parts shown as units may or may not be physical units, that is, they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement this without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general-purpose hardware platform, and may of course also be implemented by hardware. Based on this understanding, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as a ROM/RAM, a magnetic disk, or an optical disk, and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute the methods described in the embodiments or in parts of the embodiments.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (9)

1. A small sample feature analysis method based on the training of an evolutionary element learning model is characterized by comprising the following steps:
acquiring a base class sample set and a target sample set;
training by utilizing a first neural network model based on the base class sample set to obtain pre-training parameters;
training a second neural network model by using an evolutionary element learning model training method based on the base class sample set and the pre-training parameters to obtain general model parameters, wherein the characteristic layers of the second neural network model and the first neural network model have the same network structure;
and adjusting the second neural network model based on the target sample set and the general model parameters to obtain a target neural network model.
2. The method of claim 1, wherein training the second neural network model by using the evolutionary element learning model training method to obtain the general model parameters comprises:
initializing the characteristic layer parameters of the second neural network model by using the characteristic layer parameters in the pre-training parameters, and randomly initializing the parameters of other network layers of the second neural network model;
randomly sampling a plurality of candidate sample sets with the same scale as the target sample set from the base class sample set, training the second neural network model to obtain model parameters of each candidate sample set, and obtaining the category accuracy corresponding to the model parameters by using test samples;
based on the model parameters of each candidate sample set and the category accuracy, obtaining synthetic model parameters and initial model parameters of each candidate sample set of the backward generation by using federal calculation and evolutionary calculation;
continuously executing the foregoing steps, namely randomly sampling a plurality of candidate sample sets with the same scale as the target sample set from the base class sample set, training the second neural network model to obtain the model parameters of each candidate sample set, obtaining the category accuracy corresponding to the model parameters by using test samples, and obtaining the synthetic model parameters and the initial model parameters of each candidate sample set of the backward generation by using federal calculation and evolutionary calculation based on the model parameters and the category accuracy of each candidate sample set, until the second neural network model converges, so as to obtain the general model parameters.
3. The method of claim 2, wherein the category accuracy comprises an accuracy statistic, and the obtaining of the accuracy statistic comprises:
randomly sampling samples with the same task as the candidate sample set from the base class sample set to obtain the test samples;
carrying out validity verification on the model parameters of the candidate sample set by using the test samples to obtain a classification accuracy set;
and counting each classification accuracy in the classification accuracy set to obtain the accuracy statistic value.
4. The method of claim 2, wherein obtaining the synthetic model parameters and the initial model parameters of each candidate sample set of the backward generation by using federal calculation and evolutionary calculation based on the model parameters and the category accuracy of each candidate sample set comprises:
screening the model parameters of each candidate sample set according to the corresponding category accuracy, and determining the model parameters with the highest category accuracy as the preferred model parameters;
performing model synthesis on the model parameters of each candidate sample set and the category accuracy rates corresponding to the model parameters by using federal calculation to obtain the synthetic model parameters;
randomly selecting two sets of model parameters from the model parameters of each candidate sample set and the synthetic model parameters and crossing them to obtain variation model parameters, and repeating this operation until an execution-count threshold is reached to obtain a variation model parameter set;
and synthesizing the preferred model parameters, the synthetic model parameters and the variation model parameter set to determine the initial model parameters of each candidate sample set of the backward generation.
5. The method of claim 1, wherein adjusting the second neural network model based on the target sample set and the general model parameters to obtain a target neural network model comprises:
initializing the second neural network model by using the general model parameters to obtain an initialized neural network model;
and, among the network layers of the initialized neural network model, fixing the model parameters of all network layers other than the fully-connected layer, and training the second neural network model by using the target sample set until convergence to obtain the target neural network model.
6. A small sample feature analysis device based on evolutionary element learning model training, characterized by comprising:
a first processing module, configured to acquire a base class sample set and a target sample set;
a second processing module, configured to perform training by using a first neural network model based on the base class sample set to obtain pre-training parameters;
a third processing module, configured to train a second neural network model by using an evolutionary learning model training method based on the base class sample set and the pre-training parameters to obtain general model parameters, where feature layers of the second neural network model and the first neural network model have the same network structure;
and a fourth processing module, configured to adjust the second neural network model based on the target sample set and the general model parameters to obtain a target neural network model.
7. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the small sample feature analysis method based on evolutionary element learning model training according to any one of claims 1 to 5.
8. A non-transitory computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the small sample feature analysis method based on evolutionary element learning model training according to any one of claims 1 to 5.
9. A computer program product comprising a computer program, wherein the computer program, when executed by a processor, implements the steps of the small sample feature analysis method based on evolutionary element learning model training according to any one of claims 1 to 5.
CN202111520388.4A 2021-12-13 2021-12-13 Small sample characteristic analysis method and device based on evolutionary element learning model training Pending CN114330650A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111520388.4A CN114330650A (en) 2021-12-13 2021-12-13 Small sample characteristic analysis method and device based on evolutionary element learning model training

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111520388.4A CN114330650A (en) 2021-12-13 2021-12-13 Small sample characteristic analysis method and device based on evolutionary element learning model training

Publications (1)

Publication Number Publication Date
CN114330650A true CN114330650A (en) 2022-04-12

Family

ID=81050280

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111520388.4A Pending CN114330650A (en) 2021-12-13 2021-12-13 Small sample characteristic analysis method and device based on evolutionary element learning model training

Country Status (1)

Country Link
CN (1) CN114330650A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116821907A (en) * 2023-06-29 2023-09-29 哈尔滨工业大学 Drop-MAML-based small sample learning intrusion detection method
CN116821907B (en) * 2023-06-29 2024-02-02 哈尔滨工业大学 Drop-MAML-based small sample learning intrusion detection method

Similar Documents

Publication Publication Date Title
Shmelkov et al. How good is my GAN?
CN109063724B (en) Enhanced generation type countermeasure network and target sample identification method
Xiao et al. Attentional factorization machines: Learning the weight of feature interactions via attention networks
Xu et al. Reasoning-rcnn: Unifying adaptive global reasoning into large-scale object detection
CN110287983B (en) Single-classifier anomaly detection method based on maximum correlation entropy deep neural network
CN112101190A (en) Remote sensing image classification method, storage medium and computing device
CN111785329B (en) Single-cell RNA sequencing clustering method based on countermeasure automatic encoder
CN112036513B (en) Image anomaly detection method based on memory-enhanced potential spatial autoregression
CN112199536A (en) Cross-modality-based rapid multi-label image classification method and system
CN109614611B (en) Emotion analysis method for fusion generation of non-antagonistic network and convolutional neural network
CN116662817B (en) Asset identification method and system of Internet of things equipment
CN110851654A (en) Industrial equipment fault detection and classification method based on tensor data dimension reduction
Huang et al. Deep prototypical networks for imbalanced time series classification under data scarcity
CN116015932A (en) Intrusion detection network model generation method and data flow intrusion detection method
CN111371611A (en) Weighted network community discovery method and device based on deep learning
CN113591962B (en) Network attack sample generation method and device
CN114330650A (en) Small sample characteristic analysis method and device based on evolutionary element learning model training
CN110288002B (en) Image classification method based on sparse orthogonal neural network
Dong et al. Scene-oriented hierarchical classification of blurry and noisy images
CN109325140B (en) Method and device for extracting hash code from image and image retrieval method and device
CN115497564A (en) Antigen identification model establishing method and antigen identification method
CN115146788A (en) Training method and device of distributed machine learning model and electric equipment storage medium
Silva et al. Analysis of CNN architectures for human action recognition in video
CN109145132B (en) Method and device for extracting hash code from image and image retrieval method and device
CN111858999A (en) Retrieval method and device based on difficult-to-segment sample generation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination