CN110516718B - Zero sample learning method based on deep embedding space - Google Patents
- Publication number
- CN110516718B CN110516718B CN201910740748.8A CN201910740748A CN110516718B CN 110516718 B CN110516718 B CN 110516718B CN 201910740748 A CN201910740748 A CN 201910740748A CN 110516718 B CN110516718 B CN 110516718B
- Authority
- CN
- China
- Prior art keywords
- label
- network
- deep
- branch
- embedding space
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Computing Systems (AREA)
- Molecular Biology (AREA)
- General Health & Medical Sciences (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a zero sample learning method based on a deep embedding space, which solves the technical problem that existing zero sample learning methods generalize poorly. The technical scheme is to learn an effective deep intermediary embedding space with deep learning techniques, map the semantic category descriptions and image feature descriptions of both known and unknown categories into this space through the trained deep network, and finally classify the embedded features with a corresponding classifier to obtain the predicted labels. In the prediction process, a mapping-network self-learning algorithm is adopted, which effectively improves generalization capability and raises the classification accuracy on samples of unknown categories.
Description
Technical Field
The invention relates to a zero sample learning method, in particular to a zero sample learning method based on a deep embedding space.
Background
In recent years, deep neural networks have achieved significant success in many computer vision applications, such as object recognition and detection. The key to this success is that, given a large number of labeled training examples, supervised learning can fully exploit the strong nonlinear fitting capacity of deep neural networks to mine the complex structural relationship between task input and task output. In practical applications, however, manual labeling of training samples is costly, especially in relatively complex tasks such as semantic segmentation, so sufficient labeled samples are often hard to obtain; in many applications no labeled samples are available at all (for example, for newly emerging objects or unknown environments), which severely limits the generalization ability of deep neural networks.
Zero sample learning methods, such as the one proposed in "Y. Annadani and S. Biswas. Preserving semantic relations for zero-shot learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 7603-7612, 2018", can effectively address the above problems. Unlike traditional supervised learning, in zero sample learning each class is associated with a specific semantic description, and the goal is to accurately classify and identify samples of unknown classes (for which no labeled training samples exist) by mining the relation between the samples of a class and the corresponding semantic descriptions. The key to zero sample learning is learning an effective embedding space that accurately captures the structural relationship between each class and its semantic description and generalizes to unknown classes and their associated descriptions. However, existing zero sample learning models do not fully consider the structural characteristics of the embedding space and are therefore generally affected by the hubness problem and a bias towards seen classes, which limits their generalization capability.
Disclosure of Invention
In order to overcome the defect that existing zero sample learning methods generalize poorly, the invention provides a zero sample learning method based on a deep embedding space. The method learns an effective deep intermediary embedding space with deep learning techniques, maps the semantic category descriptions and image feature descriptions of both known and unknown categories into this space through the trained deep network, and classifies the embedded features with a corresponding classifier to obtain the predicted labels. In the prediction process, a mapping-network self-learning algorithm is adopted, which effectively improves generalization capability and raises the classification accuracy on samples of unknown categories.
The technical scheme adopted by the invention for solving the technical problems is as follows: a zero sample learning method based on a deep embedding space is characterized by comprising the following steps:
step one, denote the training set with N samples as {(x_i, y_i)}_{i=1}^N, where x_i is the feature vector of the i-th image sample, of length b, and y_i ∈ Y_s is its class label, Y_s denoting the label set of all known classes. During testing, the goal of zero sample learning is to predict the class label y_j ∈ Y_u of a new sample x_j, where Y_u denotes the label set of all unknown classes and Y_s ∩ Y_u = ∅. Each known class in Y_s and each unknown class in Y_u has a corresponding semantic description z_y.
Step two, establish a two-branch deep embedding network. One branch is the image mapping branch: its backbone is a pretrained deep convolutional network, its input is the extracted image feature x_i, and a multi-layer perceptron f_v(x_i; θ_v) then learns the mapping that embeds the image feature x_i into an implicit space. The other branch of the two-branch network is the semantic class mapping branch, which likewise maps the semantic description z into the same implicit embedding space through a multi-layer perceptron f_s(z; θ_s). The loss function of the two-branch network is defined in the form

min_{θ_v, θ_s, W} (1/N) Σ_{i=1}^{N} [ L_cls(Wᵀ f_v(x_i; θ_v), y_i) + λ ‖f_v(x_i; θ_v) − f_s(z_{y_i}; θ_s)‖² ] + η (‖θ_v‖² + ‖θ_s‖² + ‖W‖²)

where θ_v and θ_s are the parameters of the multi-layer perceptrons of the two branches, W refers to the parameters of the linear classifier to be learned, and L_cls refers to the classification loss, for which the cross-entropy function is chosen. To avoid overfitting, the l₂ norm of all parameters is penalized, weighted by η. The loss function is optimized by the back-propagation algorithm, yielding the corresponding network parameters θ_v and θ_s. With θ_v and θ_s obtained, the predicted label of a test sample x_j is expressed as

ŷ_j = argmin_{y ∈ Y_u} ‖f_v(x_j; θ_v) − f_s(z_y; θ_s)‖²

where Y_u is the label set of the unknown classes and z_y represents the semantic description of label y.
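The two-branch training loss described above can be sketched numerically. The following NumPy snippet is a minimal illustration, not the patented implementation: the two branches are reduced to single random-initialized layers, the data are random, and the concrete values (λ = 0.2, η = 1e-4, five classes, eight samples) are placeholder assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d_img, d_sem, d_emb, n_cls, N = 2048, 85, 1024, 5, 8

def relu(a):
    return np.maximum(a, 0.0)

# Single-layer stand-ins for the two mapping branches and the classifier.
Wv = rng.normal(0.0, 0.01, (d_img, d_emb))   # image branch f_v(.; theta_v)
Ws = rng.normal(0.0, 0.01, (d_sem, d_emb))   # semantic branch f_s(.; theta_s)
W = rng.normal(0.0, 0.01, (d_emb, n_cls))    # linear classifier W

x = rng.normal(size=(N, d_img))              # image features (e.g. ResNet101)
y = rng.integers(0, n_cls, size=N)           # class labels
Z = rng.normal(size=(n_cls, d_sem))          # one semantic vector per class

fv = relu(x @ Wv)                            # embedded image features
fs = relu(Z[y] @ Ws)                         # embedded semantics of each label

logits = fv @ W
logp = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
ce = -logp[np.arange(N), y].mean()           # cross-entropy classification loss

lam, eta = 0.2, 1e-4                         # assumed: lambda in (0.1, 0.3)
align = ((fv - fs) ** 2).sum(axis=1).mean()  # image/semantic alignment term
reg = eta * sum(float((p ** 2).sum()) for p in (Wv, Ws, W))

loss = float(ce + lam * align + reg)
print(loss)
```

In training, this scalar would be minimized over θ_v, θ_s, and W by back-propagation, rather than merely evaluated as here.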
Step three, given the test samples, first predict pseudo labels for the test sample set according to the embedding space learned in step two; then, according to the generated pseudo labels and the image-semantic distance ‖f_v(x_j; θ_v) − f_s(z_y; θ_s)‖², select the M test samples in the test sample set that are closest to each pseudo label, with M = 40, and merge the selected samples together with their assigned pseudo labels into the training set as new training data, obtaining the extended training set.
Step four, after the trained mapping network and classifier are obtained, in order to prevent the learned deep embedding space from biasing the predicted labels of unknown samples towards the labels of known samples, an adaptive adjustment model is adopted. The new optimization objective function is expressed as

min_{θ_v, θ_s} (1/(M·C)) Σ_i ‖f_v(x̃_i; θ_v) − f_s(z_{ỹ_i}; θ_s)‖²

where C represents the number of unknown classes, x̃_i indicates the i-th selected test sample, and ỹ_i and z_{ỹ_i} indicate, respectively, its pseudo label in the extended training set and the semantic description of the class to which that pseudo label belongs.
The invention has the beneficial effects that: the method learns an effective deep intermediary embedding space with deep learning techniques, maps the semantic category descriptions and image feature descriptions of both known and unknown categories into this space through the trained deep network, and classifies the embedded features with a corresponding classifier to obtain the predicted labels. In the prediction process, a mapping-network self-learning algorithm is adopted, which effectively improves generalization capability and raises the classification accuracy on samples of unknown categories.
The present invention will be described in detail with reference to the following embodiments.
Detailed Description
The zero sample learning method based on the deep embedding space comprises the following specific steps:
1. Data preprocessing.
Denote the training set with N samples as {(x_i, y_i)}_{i=1}^N, where x_i is the i-th image feature vector, of length b, and y_i is the corresponding class label drawn from the label set of all known classes. During testing, zero sample learning aims at predicting the class label of each new sample x_j, drawn from the label set of all unknown classes, which is disjoint from the known label set. Each known or unknown class has a corresponding semantic feature vector z that describes the characteristics of the class; z_s denotes the semantic feature vectors of the training-set classes and z_u those of the test-set classes. Taking the AwA dataset as an example, the dataset contains 30,745 pictures of 50 different animal categories, where the semantic feature vector of each category characterizes that kind of animal with 85 different attributes. Each image sample x_i in the training split of the dataset is the feature vector, of length 2048, obtained by processing the corresponding picture with ResNet101, and the sample data x_j in the test set have the same shape.
2. Deep embedding network training.
After data preprocessing, the image features and the class semantic features must each be mapped, through a deep network, into the same implicit deep embedding space, in which the embedded image features and class semantic features satisfy intra-class compactness and inter-class separability. This mapping is realized by establishing a two-branch deep embedding network: one branch is the image mapping branch and the other is the class semantic feature mapping branch. The invention learns each of the two mappings into the embedding space with a multi-layer perceptron. The image mapping branch can be expressed as f_v(x_i; θ_v); it maps the image feature x_i into the implicit space, where θ_v denotes the parameters of the image mapping branch and x_i is the i-th image feature vector. The multi-layer perceptron of this branch is implemented as a fully connected (FC) layer followed by a rectified linear unit (ReLU) layer, where the input and output channel sizes of the fully connected layer are 2048 and 1024, respectively.
The other branch, the class semantic feature mapping branch, can be expressed as f_s(z; θ_s); it maps the semantic description z into the same implicit embedding space, where θ_s denotes the parameters of this branch and z is the class semantic feature vector corresponding to a training sample. The multi-layer perceptron of this branch is implemented with two fully connected layers and two linear rectification layers: the two fully connected layers are connected in series, and each is followed by a ReLU layer. The input channel size of the first fully connected layer equals the size of the semantic feature vector, which is 85 for the AwA dataset; the output channel size of the second fully connected layer is 1024, and its input channel size equals the output channel size of the first fully connected layer.
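The forward passes of the two branches, with the layer sizes stated above, can be sketched as follows. This is an illustrative NumPy sketch: the weights are random, and the output width of the first semantic fully connected layer, which is garbled in the source text, is set to an assumed placeholder of 512.

```python
import numpy as np

rng = np.random.default_rng(1)

def fc_relu(w, b, a):
    return np.maximum(a @ w + b, 0.0)          # fully connected layer + ReLU

# Image mapping branch: one FC layer, 2048 -> 1024, followed by ReLU.
W1, b1 = rng.normal(0.0, 0.01, (2048, 1024)), np.zeros(1024)

# Semantic mapping branch: two FC + ReLU layers, 85 -> 512 -> 1024.
# (512 is an assumed hidden width, not a value from the patent.)
S1, c1 = rng.normal(0.0, 0.01, (85, 512)), np.zeros(512)
S2, c2 = rng.normal(0.0, 0.01, (512, 1024)), np.zeros(1024)

x = rng.normal(size=(4, 2048))                 # batch of ResNet101 features
z = rng.normal(size=(4, 85))                   # matching AwA attribute vectors

emb_img = fc_relu(W1, b1, x)                   # image embedding
emb_sem = fc_relu(S2, c2, fc_relu(S1, c1, z))  # semantic embedding

# Both branches land in the same 1024-dimensional implicit space.
print(emb_img.shape, emb_sem.shape)
```

The key architectural point this illustrates is that both branches, whatever their depth, end in vectors of the same width, so distances between image and semantic embeddings are well defined.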
The error function of the deep embedding network is defined as

min_{θ_v, θ_s, W} (1/N) Σ_{i=1}^{N} [ L_cls(Wᵀ f_v(x_i; θ_v), y_i) + λ ‖f_v(x_i; θ_v) − f_s(z_{y_i}; θ_s)‖² ] + η (‖θ_v‖² + ‖θ_s‖² + ‖W‖²)    (1)

where W is the parameter matrix of the linear classifier learned by the proposed network structure during training and Wᵀ is its transpose. L_cls refers to the classification loss function, which measures the difference between the linear classifier's result on a training sample and the correct result; the cross-entropy function is chosen as the method for computing the classification loss. λ is the balance coefficient, with value range (0.1, 0.3), and, to avoid overfitting, the l₂ norm of all learnable parameters is penalized, weighted by η. Formula (1) is optimized with the standard back-propagation algorithm, thereby yielding the corresponding network parameters θ_v and θ_s. During training, the learning rate is set to 1e-4 and the number of epochs is T = 50.

After the corresponding network has been learned, a test sample can be classified by

ŷ_j = argmin_{y ∈ Y_u} ‖f_v(x_j; θ_v) − f_s(z_y; θ_s)‖²    (2)

where Y_u is the label set of the unknown classes and z_y is the semantic feature vector of class y.
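The nearest-semantic-embedding rule of formula (2) amounts to an argmin over pairwise distances, as in this illustrative NumPy sketch (the embedded vectors here are random placeholders, not outputs of trained branches):

```python
import numpy as np

rng = np.random.default_rng(2)
n_test, n_unseen, d = 6, 10, 1024      # C = 10 unknown classes on AwA

f_x = rng.normal(size=(n_test, d))     # embedded test image features
f_z = rng.normal(size=(n_unseen, d))   # embedded unseen-class semantics

# Pairwise squared Euclidean distances, shape (n_test, n_unseen).
dist = ((f_x[:, None, :] - f_z[None, :, :]) ** 2).sum(axis=-1)

pred = dist.argmin(axis=1)             # index of the closest unseen class
print(pred)
```

Each test sample is thus assigned the unknown class whose embedded semantic vector lies closest to its own embedded image feature.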
3. Dataset expansion.
The distances between the image feature vectors and the class semantic feature vectors in the learned deep embedding space are computed by formula (2); for each class, the M image feature vectors with the smallest distance to that class's semantic feature vector are assigned to the class and given its pseudo label, expanding the training dataset. The extended training set can be expressed as

D_ext = D_tr ∪ {(x̃_i, ỹ_i)}_{i=1}^{M·C}    (3)

where D_tr is the original training set. M denotes the number of pseudo-labelled test samples selected per class and C the number of unknown classes; in the present invention M = 40, while C varies from dataset to dataset and is 10 on the AwA dataset. x̃_i indicates the i-th sample assigned a pseudo label, ỹ_i the corresponding pseudo label, i.e., the prediction label obtained for that test sample according to formula (2), and z_{ỹ_i} the semantic description of the class to which the pseudo label belongs.
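The dataset-expansion step can be sketched as follows; the selection of the M nearest test samples per unseen class follows the description, while the toy sizes (M = 5, three classes, random embeddings) are placeholders for the patent's M = 40 and dataset-dependent C.

```python
import numpy as np

rng = np.random.default_rng(3)
n_test, n_unseen, d, M = 50, 3, 16, 5   # toy sizes; the patent uses M = 40

f_x = rng.normal(size=(n_test, d))      # embedded test image features
f_z = rng.normal(size=(n_unseen, d))    # embedded unseen-class semantics

dist = ((f_x[:, None, :] - f_z[None, :, :]) ** 2).sum(axis=-1)

extended = []                           # (test-sample index, pseudo label)
for c in range(n_unseen):
    nearest = np.argsort(dist[:, c])[:M]   # M samples closest to class c
    extended += [(int(i), c) for i in nearest]

# These M*C pairs would be merged into the training set as new data.
print(len(extended))
```

Note that a test sample near two class embeddings can be selected twice with different pseudo labels under this per-class rule; the text does not specify how such ties are resolved.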
4. Adaptive learning of the mapping network.
In most zero sample learning methods, only samples of known classes can serve as training samples for learning the embedding space; as a result, the learned embedding space biases the predicted labels of unknown samples towards the labels of known samples. To better address this problem, a deep embedding space adaptive adjustment model is adopted, which can use the unlabeled test data in training the model to improve classification accuracy.
After the extended training set is obtained, the objective function of the adaptive adjustment model is expressed as

min_{θ_v, θ_s} (1/(M·C)) Σ_{i=1}^{M·C} ‖f_v(x̃_i; θ_v) − f_s(z_{ỹ_i}; θ_s)‖²    (4)

where C represents the number of unknown classes. Step 4 adaptively adjusts the learned mapping network using the data in the extended training set. After each round of adjustment, the extended dataset is updated according to step 3; the total number of update rounds is R = 10, and the learning rate of the mapping network during adaptive adjustment is 1e-4. Afterwards, the updated parameters θ_v and θ_s are substituted into the classification rule of formula (2) to predict the label of each test sample x_j.
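The self-learning stage, steps 3 and 4 iterated for R rounds, can be sketched as the following loop. The single linear map and hand-written gradient step are illustrative stand-ins for the full mapping branches and back-propagation; the toy sizes and learning rate are assumptions, with only the control flow (re-label, select M per class, adapt, repeat R times) taken from the description.

```python
import numpy as np

rng = np.random.default_rng(4)
n_test, n_unseen, d_in, d_emb = 40, 2, 8, 4
R, M, lr = 10, 5, 1e-2                    # patent: R = 10, lr = 1e-4, M = 40

x = rng.normal(size=(n_test, d_in))       # test image features
f_z = rng.normal(size=(n_unseen, d_emb))  # fixed unseen-class embeddings
Wv = rng.normal(0.0, 0.1, (d_in, d_emb))  # image mapping being adapted

for _ in range(R):                        # R rounds of re-label + adapt
    dist = (((x @ Wv)[:, None, :] - f_z[None, :, :]) ** 2).sum(axis=-1)
    for c in range(n_unseen):
        sel = np.argsort(dist[:, c])[:M]  # re-selected pseudo-labelled set
        # Gradient of mean ||x_i Wv - f_z[c]||^2 with respect to Wv.
        grad = 2.0 * x[sel].T @ (x[sel] @ Wv - f_z[c]) / M
        Wv -= lr * grad                   # one adaptation step on the branch

# Mean distance of every test sample to its nearest class embedding.
final = (((x @ Wv)[:, None, :] - f_z[None, :, :]) ** 2).sum(axis=-1)
print(float(final.min(axis=1).mean()))
```

Because the pseudo-labelled set is re-selected after every round, the samples that drive the adaptation can change from round to round, which is what lets the embedding drift towards the unknown-class structure.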
On the AwA dataset, the method of the invention was compared with the PSR method proposed in the paper "Y. Annadani and S. Biswas. Preserving semantic relations for zero-shot learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 7603-7612, 2018" and with the RN method mentioned in the background art. The experimental results show that the proposed method performs better: under the conventional zero sample learning protocol, its overall classification accuracy on unknown samples on the AwA dataset is 2.7% higher than that of the best existing method, PSR, and under the generalized zero sample learning protocol its classification accuracy on unknown samples on the same AwA dataset is 5.2% higher than that of the background-art method RN.
Claims (1)
1. A zero sample learning method based on a deep embedding space is characterized by comprising the following steps:
step one, denoting the training set with N samples as {(x_i, y_i)}_{i=1}^N, wherein x_i is the feature vector of the i-th image sample, of length b, and y_i is the corresponding class label drawn from the label set of all known classes; during testing, zero sample learning aims at predicting the class label of each new sample x_j, drawn from the label set of all unknown classes, which is disjoint from the known label set; each known class and each unknown class has a corresponding semantic description z;
step two, establishing a two-branch deep embedding network, wherein one branch is an image mapping branch whose backbone is a pretrained deep convolutional network and whose input is the extracted image feature x_i, a multi-layer perceptron f_v(x_i; θ_v) learning the mapping that embeds the image feature x_i into an implicit space; the other branch of the two-branch network is a semantic class mapping branch, which likewise maps the semantic description z into the same implicit embedding space through a multi-layer perceptron f_s(z; θ_s); the loss function of the two-branch network is defined in the form

min_{θ_v, θ_s, W} (1/N) Σ_{i=1}^{N} [ L_cls(Wᵀ f_v(x_i; θ_v), y_i) + λ ‖f_v(x_i; θ_v) − f_s(z_{y_i}; θ_s)‖² ] + η (‖θ_v‖² + ‖θ_s‖² + ‖W‖²)

wherein θ_v and θ_s are the parameters of the multi-layer perceptrons of the two branches, W refers to the parameters of the linear classifier to be learned, and L_cls refers to the classification loss, the cross-entropy function being chosen as the method of computing the classification loss; to avoid overfitting, the l₂ norm of all parameters is penalized, weighted by η; the loss function is optimized by the back-propagation algorithm, thereby obtaining the corresponding network parameters θ_v and θ_s; after the parameters θ_v and θ_s are obtained, the predicted label of a test sample x_j is expressed as

ŷ_j = argmin_y ‖f_v(x_j; θ_v) − f_s(z_y; θ_s)‖²

wherein the argmin runs over the unknown-class labels and z_y represents the semantic description information of label y;
step three, given the test samples, predicting pseudo labels for the test sample set according to the embedding space learned in step two, then, according to the generated pseudo labels and the image-semantic distance, selecting the M test samples in the test sample set that are closest to each pseudo label, wherein M = 40, and merging the selected samples and their assigned pseudo labels into the training set as new training data, thereby obtaining the extended training set;
step four, after the trained mapping network and classifier are obtained, in order to prevent the learned deep embedding space from biasing the predicted labels of unknown samples towards the labels of known samples, adopting an adaptive adjustment model; the new optimization objective function is expressed as:

min_{θ_v, θ_s} (1/(M·C)) Σ_i ‖f_v(x̃_i; θ_v) − f_s(z_{ỹ_i}; θ_s)‖²

wherein C represents the number of unknown classes, x̃_i indicates the i-th selected test sample, and ỹ_i and z_{ỹ_i} indicate, respectively, its pseudo label in the extended training set and the semantic description of the class to which it belongs.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910740748.8A CN110516718B (en) | 2019-08-12 | 2019-08-12 | Zero sample learning method based on deep embedding space |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910740748.8A CN110516718B (en) | 2019-08-12 | 2019-08-12 | Zero sample learning method based on deep embedding space |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110516718A CN110516718A (en) | 2019-11-29 |
CN110516718B true CN110516718B (en) | 2023-03-24 |
Family
ID=68625047
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910740748.8A Active CN110516718B (en) | 2019-08-12 | 2019-08-12 | Zero sample learning method based on deep embedding space |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110516718B (en) |
Families Citing this family (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111553378B (en) * | 2020-03-16 | 2024-02-20 | 北京达佳互联信息技术有限公司 | Image classification model training method, device, electronic equipment and computer readable storage medium |
CN111126576B (en) * | 2020-03-26 | 2020-09-01 | 北京精诊医疗科技有限公司 | Deep learning training method |
CN111461025B (en) * | 2020-04-02 | 2022-07-05 | 同济大学 | Signal identification method for self-evolving zero-sample learning |
CN111797910B (en) * | 2020-06-22 | 2023-04-07 | 浙江大学 | Multi-dimensional label prediction method based on average partial Hamming loss |
CN112380374B (en) * | 2020-10-23 | 2022-11-18 | 华南理工大学 | Zero sample image classification method based on semantic expansion |
CN112651403B (en) * | 2020-12-02 | 2022-09-06 | 浙江大学 | Zero-sample visual question-answering method based on semantic embedding |
CN112686318B (en) * | 2020-12-31 | 2023-08-29 | 广东石油化工学院 | Zero sample learning mechanism based on sphere embedding, sphere alignment and sphere calibration |
CN113283514B (en) * | 2021-05-31 | 2024-05-21 | 高新兴科技集团股份有限公司 | Unknown class classification method, device and medium based on deep learning |
CN114092747A (en) * | 2021-11-30 | 2022-02-25 | 南通大学 | Small sample image classification method based on depth element metric model mutual learning |
CN114241260B (en) * | 2021-12-14 | 2023-04-07 | 四川大学 | Open set target detection and identification method based on deep neural network |
CN114998613B (en) * | 2022-06-24 | 2024-04-26 | 安徽工业大学 | Multi-mark zero sample learning method based on deep mutual learning |
CN114861670A (en) * | 2022-07-07 | 2022-08-05 | 浙江一山智慧医疗研究有限公司 | Entity identification method, device and application for learning unknown label based on known label |
CN116433977B (en) * | 2023-04-18 | 2023-12-05 | 国网智能电网研究院有限公司 | Unknown class image classification method, unknown class image classification device, computer equipment and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018188240A1 (en) * | 2017-04-10 | 2018-10-18 | 北京大学深圳研究生院 | Cross-media retrieval method based on deep semantic space |
CN108846412A (en) * | 2018-05-08 | 2018-11-20 | 复旦大学 | A kind of method of extensive zero sample learning |
CN108875818A (en) * | 2018-06-06 | 2018-11-23 | 西安交通大学 | Based on variation from code machine and confrontation network integration zero sample image classification method |
WO2019136946A1 (en) * | 2018-01-15 | 2019-07-18 | 中山大学 | Deep learning-based weakly supervised salient object detection method and system |
- 2019
- 2019-08-12: CN application CN201910740748.8A granted as patent CN110516718B (en), status Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018188240A1 (en) * | 2017-04-10 | 2018-10-18 | 北京大学深圳研究生院 | Cross-media retrieval method based on deep semantic space |
WO2019136946A1 (en) * | 2018-01-15 | 2019-07-18 | 中山大学 | Deep learning-based weakly supervised salient object detection method and system |
CN108846412A (en) * | 2018-05-08 | 2018-11-20 | 复旦大学 | A kind of method of extensive zero sample learning |
CN108875818A (en) * | 2018-06-06 | 2018-11-23 | 西安交通大学 | Based on variation from code machine and confrontation network integration zero sample image classification method |
Non-Patent Citations (2)
Title |
---|
End-to-end deep zero-shot learning based on common space embedding; 秦牧轩 et al.; Computer Technology and Development; 2018-06-29 (No. 11); full text *
Zero-shot classification algorithm with semantic autoencoder improved by metric learning; 陈祥凤 et al.; Journal of Beijing University of Posts and Telecommunications; 2018-09-20 (No. 04); full text *
Also Published As
Publication number | Publication date |
---|---|
CN110516718A (en) | 2019-11-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110516718B (en) | Zero sample learning method based on deep embedding space | |
CN114241282B (en) | Knowledge distillation-based edge equipment scene recognition method and device | |
Liu et al. | Ssd: Single shot multibox detector | |
CN111783831B (en) | Complex image accurate classification method based on multi-source multi-label shared subspace learning | |
Dhurandhar et al. | Improving simple models with confidence profiles | |
CN113610173B (en) | Knowledge distillation-based multi-span domain few-sample classification method | |
CN111079847B (en) | Remote sensing image automatic labeling method based on deep learning | |
CN111239137B (en) | Grain quality detection method based on transfer learning and adaptive deep convolution neural network | |
CN113343989B (en) | Target detection method and system based on self-adaption of foreground selection domain | |
CN111898704B (en) | Method and device for clustering content samples | |
CN110991500A (en) | Small sample multi-classification method based on nested integrated depth support vector machine | |
CN114782752B (en) | Small sample image integrated classification method and device based on self-training | |
CN116011507A (en) | Rare fault diagnosis method for fusion element learning and graph neural network | |
CN114780767A (en) | Large-scale image retrieval method and system based on deep convolutional neural network | |
CN117516937A (en) | Rolling bearing unknown fault detection method based on multi-mode feature fusion enhancement | |
CN116910571A (en) | Open-domain adaptation method and system based on prototype comparison learning | |
CN116681128A (en) | Neural network model training method and device with noisy multi-label data | |
CN113642499B (en) | Human body behavior recognition method based on computer vision | |
CN115098681A (en) | Open service intention detection method based on supervised contrast learning | |
Trentin et al. | Unsupervised nonparametric density estimation: A neural network approach | |
Zhang et al. | Underwater target recognition method based on domain adaptation | |
Khadempir et al. | Domain adaptation based on incremental adversarial learning | |
CN116388933B (en) | Communication signal blind identification system based on deep learning | |
CN114626520B (en) | Method, device, equipment and storage medium for training model | |
CN114187510B (en) | Small sample remote sensing scene classification method based on metanuclear network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |