CN111079790A - Image classification method for constructing class center - Google Patents

Image classification method for constructing class center

Info

Publication number
CN111079790A
Authority
CN
China
Prior art keywords
image
class
vector
vectors
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911129753.1A
Other languages
Chinese (zh)
Other versions
CN111079790B (en)
Inventor
王好谦 (Wang Haoqian)
刘志宏 (Liu Zhihong)
张永兵 (Zhang Yongbing)
杨芳 (Yang Fang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen International Graduate School of Tsinghua University
Original Assignee
Shenzhen International Graduate School of Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen International Graduate School of Tsinghua University filed Critical Shenzhen International Graduate School of Tsinghua University
Priority to CN201911129753.1A priority Critical patent/CN111079790B/en
Publication of CN111079790A publication Critical patent/CN111079790A/en
Application granted granted Critical
Publication of CN111079790B publication Critical patent/CN111079790B/en
Legal status: Active (granted)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00 Road transport of goods or passengers
    • Y02T 10/10 Internal combustion engine [ICE] based vehicles
    • Y02T 10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An image classification method for constructing class centers comprises the following steps: extracting image feature vectors and constructing class center vectors in the feature space; classifying the image features using the Euclidean distance between the image feature vector and the class center vectors as the classification basis; calculating the dispersion degree of the different classes from the pairwise Euclidean distances between the class center vectors; and calculating the network loss function from the classification result and the dispersion degree of the different classes, and learning the network parameters and the center vectors with the loss function. The method controls the intra-class and inter-class distances by directly constructing the class centers, so that the distribution of the image features is more favorable for classification and a better classification effect is obtained. Compared with the prior art, the method of the invention ensures that the features extracted by the network have better intra-class and inter-class distribution characteristics.

Description

Image classification method for constructing class center
Technical Field
The invention relates to the field of computer vision and image processing, in particular to an image classification method.
Background
Image classification, a classic computer vision problem, refers to inputting a picture into a computer and having the computer identify the type of object in the picture, for example identifying whether the picture shows a cat or a dog. Image classification has many applications in practical scenarios; for example, face recognition used at border inspection must determine whether the face in the camera and a face picture in the database belong to the same person.
Before deep learning was applied to image classification, traditional image recognition used pattern recognition methods, which mainly comprise three parts: image feature acquisition, classifier training, and image prediction. Image feature acquisition means obtaining from an image the features usable by a classifier; common image features take the form of high-dimensional vectors. The input of the classifier is the image features and its output is the classification result. The classifier contains unknown trainable parameters that are continuously optimized with a training set; the training set includes a label for each image, i.e. the class of the object in the picture, and during training the classifier learns the classification prior information contained in the training set. Once the classifier is trained, a picture with an unknown label can be input, and the classifier computes and outputs its class.
With the development of hardware resources and algorithmic theory, deep learning has come to be applied to a large number of computer vision problems. Image feature extraction uses multi-layer convolutional neural networks, which contain a large number of trainable parameters trained on a training set. After the features are extracted, a classifier is needed to classify the images; in deep learning, a fully-connected layer followed by a softmax function is often used to classify the image features, computing the probability that the features belong to each class. Training the network requires a loss function; the most common one in classification problems is the cross-entropy loss function.
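For concreteness, this conventional pipeline can be sketched as follows (a minimal illustration assuming PyTorch; the backbone layers, feature dimension, and class count are illustrative assumptions, not taken from this patent):

```python
import torch
import torch.nn as nn

num_classes = 10
backbone = nn.Sequential(                    # multi-layer convolutional feature extractor
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),   # -> (batch, 32) image feature vectors
)
classifier = nn.Linear(32, num_classes)      # fully-connected classification layer
criterion = nn.CrossEntropyLoss()            # softmax + cross-entropy loss

images = torch.randn(8, 3, 32, 32)           # dummy training batch
labels = torch.randint(0, num_classes, (8,))
logits = classifier(backbone(images))
loss = criterion(logits, labels)             # loss used to train all parameters
```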
The features extracted by common deep learning image classification methods are separable in the high-dimensional space, but the intra-class distance may be larger than the inter-class distance; when two pictures of unknown class are given and one must judge whether they belong to the same class, no effective decision threshold can be selected. Existing methods classify with a fully-connected layer, which essentially relies on the norm and angle of the features in the high-dimensional space. Similarly, Euclidean distance can be used as the classification basis in the high-dimensional space, and the network's loss function can be calculated based on Euclidean distances.
The above background disclosure is only intended to assist in understanding the inventive concept and technical solution of the present invention; it does not necessarily belong to the prior art of the present application, and in the absence of clear evidence that the above content was disclosed before the filing date of the present application, it should not be used to assess the novelty and inventive step of the present application.
Disclosure of Invention
The invention mainly aims to overcome the defects of the prior art described above and provides a deep learning image classification method based on constructing class centers.
To achieve this aim, the invention adopts the following technical solution:
a deep learning image classification method based on construction of a category center comprises the following steps:
1) inputting a training image, extracting a feature vector of the image by using a multilayer convolutional neural network, wherein the feature vector of the image is a high-dimensional vector distributed in a high-dimensional feature space, and meanwhile, constructing different types of central vectors in the feature space, wherein the dimensions of the central vectors are consistent with those of the feature vectors of the image;
2) calculating Euclidean distances between feature vectors of the image and central vectors of different classes, and taking the class of the central vector with the minimum distance as the class of the feature vectors of the image;
3) calculating the dispersion degree of the feature vector distribution of different types of images according to the Euclidean distance between the central vectors of the types;
4) calculating the probability score of the image for each class according to the Euclidean distance between the image feature vector and each class center vector, and introducing a margin parameter into the probability-score calculation to control the intra-class distance;
5) calculating a network loss function, and updating the values of the network weights and the class center vectors by back-propagation of gradients.
Further:
In step 1), the constructed class center vectors are trainable parameters, initialized by a random initialization method; in step 5), their values are updated by gradient back-propagation. Assuming the number of classification classes is $m$, all center vectors are denoted $c_i$, $i = 1, 2, \dots, m$.
In step 2), the image feature vector is denoted $f$, and its distance to the center vector of the $i$-th class is $L_i$, $i = 1, 2, \dots, m$. The minimum distance $L_k$ is found, i.e. $L_k = \min_{i=1,\dots,m} L_i$, and the class of the image feature is determined as the $k$-th class. In step 3), let the center vectors of the different classes be $c_i$, $i = 1, 2, \dots, m$; the pairwise Euclidean distances between the center vectors of different classes are calculated and denoted $D_{ij}$, where $i, j = 1, 2, \dots, m$ and $i \neq j$. Averaging over all or part of the $D_{ij}$ gives the dispersion degrees of the different classes.
In step 3), the averaging over part of the $D_{ij}$ is performed by averaging the Euclidean distance between each class center vector and its nearest center vector.
In step 4), the distances $L_i$, $i = 1, 2, \dots, m$, between the image feature vector $f$ and the center vectors of the different classes are used to compute the probability scores, as follows. All distances are first normalized to obtain the relative distances

$$R_i = \frac{L_i}{\sum_{j=1}^{m} L_j}, \quad i = 1, 2, \dots, m \qquad (1)$$

where a larger $R_i$ indicates a larger distance between the image feature vector and the center vector of the $i$-th class. To constrain the intra-class distance, a margin is introduced into the calculation of the probability scores. Assuming the true label of the image feature vector is the $k$-th class, the probability score of the feature vector belonging to the $k$-th class is

$$P_k = \frac{e^{mR_k}}{e^{mR_k} + \sum_{i \neq k} e^{R_i}} \qquad (2)$$

and the probability scores of the feature vector belonging to the other classes are expressed as

$$P_i = \frac{e^{R_i}}{e^{mR_k} + \sum_{j \neq k} e^{R_j}}, \quad i \neq k$$

The smaller the probability score $P_i$, the larger the probability that the image feature vector belongs to the $i$-th class, and $m$ denotes the margin.
Preferably, the margin $m$ takes a value greater than 1.
In step 5), the calculation of the loss function comprises two parts: the classification loss of the image features, $\mathrm{Loss}_1$, and the dispersion degree of the class centers, $\mathrm{Loss}_2$. A hyperparameter $\lambda$ adjusts the weights of the two constraints, i.e. the final training loss is

$$\mathrm{Loss} = \mathrm{Loss}_1 + \lambda\,\mathrm{Loss}_2 \qquad (3)$$

The classification loss $\mathrm{Loss}_1$ is calculated using cross entropy.

After the loss $\mathrm{Loss}$ is obtained, the values of the network weights and the center points are updated by gradient back-propagation.
A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the image classification method of any one of claims 1 to 8.
The invention has the following beneficial effects:
the invention provides an image classification method for constructing a category center, wherein image features are extracted, and the category center is constructed in a feature space; classifying the image features by taking the Euclidean distance between the image features and the class center as a classification basis of the image features; calculating the dispersion degrees of different categories according to the Euclidean distance between every two central vectors; and calculating a loss function of the network according to the classification result and the dispersion degrees of different classes, and learning the network parameters and the central vector by using the loss function. The image classification method has the characteristic of controlling the distance between the intra-class and the inter-class, and the method controls the distance between the intra-class and the inter-class by directly constructing the class center, so that the distribution of the image characteristics is more favorable for classification, and a better classification effect is obtained. Compared with the prior art, the method of the invention can ensure that the characteristics extracted by the network have better distribution characteristics within and among classes. Compared with a classification network of adding softmax operation to a full connection layer, the central point thought and the calculation of the Euclidean distance provided by the invention can directly control the distribution conditions of the feature vectors of different classes in a high-dimensional space, so that the feature vectors of each class are close to the class center as much as possible, and the feature vectors of different classes are far away from each other as much as possible. The method has very intuitive geometric interpretation, and is different from the full-connection layer and softmax method in that the geometric interpretation of the full-connection layer and the softmax method is the vector angle characteristic of a high-dimensional space, and the method is the Euclidean distance characteristic of the high-dimensional space and introduces a trainable class center point. The loss function obtains a very good classification effect on the task of image classification, and better intra-class and inter-class feature distribution characteristics can be obtained by adjusting the super parameters in the method.
Drawings
Fig. 1 is a basic flow of an image classification method of an embodiment of the present invention.
FIG. 2 is a schematic diagram of the first method of computing the dispersion degree according to an embodiment of the present invention.
FIG. 3 is a schematic diagram of the second method of computing the dispersion degree according to an embodiment of the present invention.
FIG. 4 is a schematic diagram of binary classification without a margin parameter according to an embodiment of the present invention.
FIG. 5 is a schematic diagram of binary classification with a margin parameter according to an embodiment of the present invention.
Detailed Description
The embodiments of the present invention will be described in detail below. It should be emphasized that the following description is merely exemplary in nature and is not intended to limit the scope of the invention or its application.
Referring to fig. 1 to 5, in an embodiment, a method for deep learning image classification based on constructing a class center includes the following steps:
step 1), constructing a deep learning model: inputting a training image and extracting the feature vector of the image with a multi-layer convolutional neural network, the feature vector being a high-dimensional vector distributed in a high-dimensional feature space; meanwhile, constructing center vectors of the different classes in the feature space, the center vectors having the same dimension as the image feature vectors;
step 2), calculating Euclidean distances between the feature vectors of the image and the central vectors of different classes, and taking the class of the central vector with the minimum distance as the class of the feature vectors of the image;
step 3), calculating the dispersion degree of the feature vector distribution of the images of different categories according to the Euclidean distance between the central vectors of the categories;
step 4), calculating the probability score of each class of the image according to the Euclidean distance between the feature vector of the image and the center vector of each class, and introducing a margin parameter in the probability score calculation to control the intra-class distance;
step 5), calculating a network loss function, and learning the network parameters and the center vectors by using the loss function.
The image features extracted in step 1) are high-dimensional vectors distributed in a high-dimensional feature space; a center feature vector is constructed for each class in the feature space, with the same dimension as the image feature vectors. Specifically, the constructed class center vectors are trainable parameters initialized by a random initialization method, and their values are updated by gradient back-propagation in step 5). Assuming the number of classification classes is $m$, all center vectors are denoted $c_i$, $i = 1, 2, \dots, m$.
In step 2), the Euclidean distance between the image feature vector and each class center vector is calculated, and the image is assigned to the class whose center vector has the minimum Euclidean distance from the image feature. The image feature vector is denoted $f$, and its distance to the center vector of the $i$-th class is $L_i$, $i = 1, 2, \dots, m$; the minimum distance $L_k$ is found, i.e. $L_k = \min_{i=1,\dots,m} L_i$, and the class of the image feature is determined as the $k$-th class.
In the step 3), Euclidean distances between the class center vectors are calculated to represent the dispersion degrees of the feature vectors of different classes. There are two ways to express the degree of dispersion, the first is to average the euclidean distance between every two central vectors, and the second is to average the euclidean distance between each class of central vector and its nearest central vector.
Specifically, let the center vectors of the different classes be $c_i$, $i = 1, 2, \dots, m$, and calculate the pairwise Euclidean distances between them, denoted $D_{ij}$, where $i, j = 1, 2, \dots, m$ and $i \neq j$; averaging over all or part of the $D_{ij}$ gives the dispersion degrees of the different classes. In one embodiment, the averaging over part of the $D_{ij}$ may be performed by averaging the Euclidean distance between each class center vector and its nearest center vector.
In the step 4), the probability score of the image belonging to each class is calculated according to the Euclidean distance between the image characteristics and the center of each class. In the process of calculating the probability score, in order to make the intra-class distance as small as possible, a margin parameter is introduced to ensure that the Euclidean distance between the image feature and the correct class center is far smaller than the Euclidean distance between the image feature and the wrong class center.
In a preferred embodiment, the distances $L_i$, $i = 1, 2, \dots, m$, between the image feature vector $f$ and the center vectors of the different classes are used to compute the probability scores, as follows. All distances are first normalized to obtain the relative distances

$$R_i = \frac{L_i}{\sum_{j=1}^{m} L_j}, \quad i = 1, 2, \dots, m \qquad (1)$$

where a larger $R_i$ indicates a larger distance between the image feature vector and the center vector of the $i$-th class. To constrain the intra-class distance, the conventional approach of classifying image features with a softmax function is improved by introducing a margin into the probability-score calculation. Assuming the true label of the image feature vector is the $k$-th class, the probability score of the feature vector belonging to the $k$-th class is calculated with the method of the present application as

$$P_k = \frac{e^{mR_k}}{e^{mR_k} + \sum_{i \neq k} e^{R_i}} \qquad (2)$$

and the probability scores of the feature vector belonging to the other classes are expressed as

$$P_i = \frac{e^{R_i}}{e^{mR_k} + \sum_{j \neq k} e^{R_j}}, \quad i \neq k$$

The smaller the probability score $P_i$, the larger the probability that the image feature vector belongs to the $i$-th class, and $m$ denotes the margin. Preferably, $m$ is greater than 1 in order to strengthen the intra-class distance constraint.
In step 5), the network loss function is calculated from the classification probability scores and the dispersion degrees of the different classes. The loss function comprises two parts, a classification constraint and a dispersion-degree constraint: the classification constraint can be calculated with a cross-entropy loss, the dispersion-degree constraint can be represented by the dispersion degree of the different classes, and a weight coefficient distributes different weights between the two constraints. The network weights and the center vectors are then updated by back-propagation.
Specifically, the computation of the loss function involves two parts: the classification loss of the image features, $\mathrm{Loss}_1$, and the dispersion degree of the class centers, $\mathrm{Loss}_2$. The hyperparameter $\lambda$ adjusts the weights of the two constraints, i.e. the final training loss is

$$\mathrm{Loss} = \mathrm{Loss}_1 + \lambda\,\mathrm{Loss}_2 \qquad (3)$$

In the preferred embodiment, after the loss $\mathrm{Loss}$ is obtained, the values of the network weights and the center points are updated by gradient back-propagation.
In some embodiments, the present invention provides an image classification method for constructing a class center, and the method mainly includes: extracting image features, and constructing a category center in a feature space; the Euclidean distance between the image features and the category center is used as a classification basis of the image features to classify the image features; calculating the dispersion degrees of different categories according to the Euclidean distance between every two central vectors; and calculating a loss function of the network according to the classification result and the dispersion degrees of different classes, and learning the network parameters and the central vector by using the loss function.
In some embodiments, the present invention generally comprises the following steps. First: construct a feature extraction network, extract abstract features of the input image with a multi-layer convolutional network to obtain high-dimensional feature vectors for classification, and construct center vectors of the different classes in the feature space. Second: determine the feature class by calculating the Euclidean distance between the image feature vector and each class center vector, taking the class of the center vector with the minimum distance as the class of the image feature. Third: calculate the dispersion degree of the feature distributions of the different classes from the Euclidean distances between the class center vectors. Fourth: calculate the probability score of the image features belonging to each class, introducing a margin parameter to control the intra-class distance. Fifth: calculate the network loss function, and update the values of the network weights and the center vectors by back-propagation of gradients. In these embodiments, the loss function is calculated using Euclidean distances, and the intra-class and inter-class distances of the image features are controlled through the margin and the dispersion degree.
The first step specifically comprises: extracting the features of the input image using multi-layer convolution, the image features being high-dimensional vectors distributed over a feature space, and constructing center vectors of the different classes in the feature space, with dimensions consistent with the image feature vectors.
The second step specifically comprises: and calculating Euclidean distances between the image features and the central vectors of the categories, and classifying the features into the category of the central vector with the minimum Euclidean distance.
The third step specifically comprises: and calculating Euclidean distances among the class center vectors to express the dispersion degree of the feature vectors of different classes. There are two ways to express the degree of dispersion, the first is to average the euclidean distance between every two central vectors, and the second is to average the euclidean distance between each class of central vector and its nearest central vector.
The fourth step specifically includes: and calculating the probability score of the image belonging to each class according to the Euclidean distance between the image characteristics and the center of each class. In the process of calculating the probability score, in order to make the intra-class distance as small as possible, a margin parameter is introduced to ensure that the Euclidean distance between the image feature and the correct class center is far smaller than the Euclidean distance between the image feature and the wrong class center.
The fifth step specifically includes: calculating a loss function of the network on the basis of the dispersion degrees of different classes and the image classification probability scores, wherein the loss function comprises a dispersion degree constraint part and a classification constraint part, the classification constraint part is calculated by using cross entropy loss, the dispersion degree constraint part is represented by the dispersion degrees of different classes, and different weights are distributed between the two constraint parts through a weight coefficient. And updating the network weight and the center vector by using a back propagation method.
As described in further detail below.
Center vector construction: as shown in FIG. 1, the image classification pipeline comprises two parts, feature extraction and classification using the features. Feature extraction relies on a convolutional neural network: the input is the picture to be classified, and after several convolutional layers a high-dimensional vector, the image feature in FIG. 1, is obtained. Assuming the image feature has dimension $1 \times N$ and the number of classes is $m$, the feature space is defined as $\mathbb{R}^{1 \times N}$. Then $m$ feature vectors $c_i$, $i = 1, 2, \dots, m$, are constructed as the center vectors of the different classes, and their values are randomly initialized using a Gaussian or uniform distribution.
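A minimal sketch of this construction, assuming PyTorch (the dimensions are illustrative, and registering the centers as a trainable parameter is one plausible realization of the text above):

```python
import torch
import torch.nn as nn

m, N = 10, 128                                      # number of classes, feature dimension
# Center vectors c_i as trainable parameters, Gaussian-initialized:
centers = nn.Parameter(0.01 * torch.randn(m, N))
# A uniform initialization, also mentioned in the text, would be e.g.:
# centers = nn.Parameter(torch.empty(m, N).uniform_(-0.05, 0.05))
```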
Degree of dispersion: to guarantee the classification effect, the features of different classes must be kept as far apart as possible; in the present application, the inter-class distance of the image features is controlled by controlling the distance between the centers of the different classes. Let all class centers be $c_i$, $i = 1, 2, \dots, m$. There are two specific methods for calculating the dispersion degree of the different classes. The first is shown in FIG. 2, where the small dots represent the distribution of the class centers. The Euclidean distance between each center and every other center point is calculated; FIG. 2 draws the Euclidean distances between center point 1 and the other center points, and the corresponding distances are calculated for the other points in the same way. Each point has $m-1$ Euclidean distances and there are $m$ points, so there are $m(m-1)$ distances in total, denoted $D_{ij}$, where $i, j = 1, 2, \dots, m$ and $i \neq j$. The dispersion degree of all the center points is obtained by averaging all the Euclidean distances, i.e.

$$\mathrm{Dis}_{avg} = \frac{1}{m(m-1)} \sum_{i \neq j} D_{ij}$$
The intuitive understanding is that $\mathrm{Dis}_{avg}$ is the average Euclidean distance between pairs of center points: the larger $\mathrm{Dis}_{avg}$, the larger the distance between the center points, the greater the dispersion degree, and the larger the inter-class distance of the image features. The drawback of this method is that while training pushes $\mathrm{Dis}_{avg}$ to be as large as possible, it may do so by pushing centers that are already far apart even further apart, while centers that are close to each other remain relatively unchanged or even move closer, which is detrimental to the final classification effect. The present application therefore proposes a second method of calculating the dispersion degree, shown in FIG. 3, where each small dot still represents a class center and the number on the dot is the index of the center point. For the center $c_i$, the distances to all other center points are still calculated, but only the distance to the nearest center point is kept; for example, for center point $c_3$ in FIG. 3, if point 1 is the nearest of the remaining points, $D_{31}$ is kept. Similarly, all

$$D_{i k_i}, \quad i = 1, 2, \dots, m$$

are found, where $k_i$ is taken to satisfy

$$k_i = \arg\min_{j \neq i} D_{ij}$$

The connecting lines in FIG. 3 represent all the $D_{i k_i}$. Averaging the $D_{i k_i}$ gives the dispersion degree, expressed as

$$\mathrm{Dis}_{min} = \frac{1}{m} \sum_{i=1}^{m} D_{i k_i}$$
Compared with the first method, the second only considers the distances between relatively close center points, so that pairs of centers with small distances are continuously pushed apart during training, avoiding the situation where the distance between two centers becomes too small.
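Both dispersion degrees can be computed from the pairwise distance matrix $D_{ij}$; the sketch below is one possible PyTorch realization (the function name and structure are illustrative assumptions):

```python
import torch

def dispersion(centers: torch.Tensor, mode: str = "min") -> torch.Tensor:
    """Dispersion degree of class centers of shape (m, N)."""
    m = centers.shape[0]
    D = torch.cdist(centers, centers)              # D_ij: pairwise Euclidean distances
    off_diag = ~torch.eye(m, dtype=torch.bool)     # exclude the zero diagonal D_ii
    if mode == "avg":                              # first method: mean of all m(m-1) distances
        return D[off_diag].mean()
    D = D.masked_fill(~off_diag, float("inf"))     # second method: nearest other center only
    return D.min(dim=1).values.mean()              # Dis_min = (1/m) * sum_i D_{i,k_i}
```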
Classification probability: the abstract feature $f$ of the image is obtained through the feature extraction network, and the center vector of the $i$-th class is $c_i$. The Euclidean distance between the image feature and each center vector is calculated, denoted $L_i$, $i = 1, 2, \dots, m$, the Euclidean distance between the image feature vector and the center of the $i$-th class. The image is classified into the class of the center point nearest to its feature, i.e. the minimum $L_k$ of the $L_i$, $i = 1, 2, \dots, m$, is found and the image is classified into the $k$-th class. To limit the range of values of the distances, they are normalized to obtain the relative distances

$$R_i = \frac{L_i}{\sum_{j=1}^{m} L_j}, \quad i = 1, 2, \dots, m$$

The smaller $R_i$, the smaller the distance between the feature and the $i$-th center, and the more likely the image feature belongs to the $i$-th class.
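As a sketch of this step for a batch of feature vectors (assuming PyTorch; names and shapes are illustrative):

```python
import torch

def relative_distances(features: torch.Tensor, centers: torch.Tensor):
    """Distances L_i, relative distances R_i, and nearest-center class for (B, N) features."""
    L = torch.cdist(features, centers)        # (B, m): Euclidean distance to each class center
    pred = L.argmin(dim=1)                    # class of the nearest center vector
    R = L / L.sum(dim=1, keepdim=True)        # R_i = L_i / sum_j L_j
    return R, pred
```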
Consider the case of only two classes, with relative distances $R_1$ and $R_2$ satisfying $R_1 + R_2 = 1$. When $R_1 < R_2$ the image is classified as class 1, and when $R_1 > R_2$ as class 2, so the classification surface is $R_1 = R_2 = 0.5$. As shown in FIG. 4, the horizontally hatched area indicates the range of $R_1$ values classified as class 1, and the vertically hatched area indicates the range of $R_1$ values classified as class 2. In the region around $R_1 = 0.5$ an image may be classified as either class 1 or class 2, and a small fluctuation makes the classification unstable, so the addition of a margin is considered.
Still in the binary case, after obtaining the two relative distances $R_1$ and $R_2$, the classification condition is computed with the margin parameter $m$: when $mR_1 < R_2$ the image is classified as class 1, and when $R_1 > mR_2$ as class 2. Using $R_1 + R_2 = 1$, the classification surfaces are

$$R_1 = \frac{1}{m+1} \quad \text{and} \quad R_1 = \frac{m}{m+1}$$

When the value of $m$ is greater than 1 there is a certain margin between the classification planes, as shown in FIG. 5, where $m = 3$: the horizontally and vertically hatched regions are the ranges of $R_1$ classified as class 1 and class 2 respectively, and the diagonally hatched region in the middle is the margin. Compared with the method without a margin, an image whose $R_1$ falls inside the margin is not classified, because its relative distances cannot be classified reliably; only when the relative distances are sufficiently reliable is the image correctly classified.
In a general classification task, cross entropy is used to calculate the loss function, which requires a probability score of the image belonging to each class. Assuming the label of the image is $k$, the probability scores are

$$P_k = \frac{e^{mR_k}}{e^{mR_k} + \sum_{i \neq k} e^{R_i}}, \qquad P_i = \frac{e^{R_i}}{e^{mR_k} + \sum_{j \neq k} e^{R_j}},\ i \neq k \qquad (4)$$

which satisfy

$$\sum_{i=1}^{m} P_i = 1$$

However, $P_i$ is not a direct probability: the smaller $P_i$, the larger the probability that the image belongs to the $i$-th class. When $m$ is larger than 1, the intra-class constraint is strengthened, so that image features of the same class cluster together as much as possible and the classification surface has a certain margin, making the learned image features more reliable in classification problems.
Loss function: the loss function includes two parts, the classification loss and the dispersion degree. After the probability score of each class is obtained for the image features, cross entropy is used to calculate the classification loss: assuming the image label is the $k$-th class and the probability score of the image belonging to the $k$-th class is $P_k$, the classification loss is expressed as

$$\mathrm{Loss}_1 = \log P_k \qquad (5)$$
The smaller the probability score of correct classification, the smaller the loss value, so the network parameters are optimized towards smaller and smaller probability scores of correct classification.
The dispersion degree of the class centers can be expressed as $\mathrm{Dis}_{avg}$ or $\mathrm{Dis}_{min}$, which are of the same order of magnitude as the Euclidean distance. To ensure that the network converges quickly with larger gradients in the early stage of training and converges stably with smaller gradients in the later stage, a logarithm is applied on top of the Euclidean distance, i.e. the dispersion-degree loss is expressed as

$$\mathrm{Loss}_2 = -\log \mathrm{Dis} \qquad (6)$$

where $\mathrm{Dis}$ may be $\mathrm{Dis}_{avg}$ or $\mathrm{Dis}_{min}$; the larger $\mathrm{Dis}$, the smaller the loss value and the smaller the gradient.
The weights of the two loss functions are adjusted by a parameter $\lambda$, and the resulting loss function is expressed as

$$\mathrm{Loss} = \mathrm{Loss}_1 + \lambda\,\mathrm{Loss}_2 \qquad (7)$$

The values of the network weights and the center vectors are then optimized by gradient back-propagation, so that the loss value decreases continuously; correspondingly, the classification accuracy improves continuously, and the distribution of the image features satisfies the requirements of small intra-class distance and large inter-class distance.
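Putting the pieces together, one possible training step combining equations (5) to (7) looks as follows (a sketch building on the illustrative helpers above; the optimizer is assumed to hold both the network weights and the `centers` parameter):

```python
import torch

def training_step(backbone, centers, images, labels, optimizer,
                  margin=3.0, lam=0.1, mode="min"):
    features = backbone(images)                       # (B, N) image feature vectors
    R, _ = relative_distances(features, centers)      # see the sketches above
    P = margin_scores(R, labels, margin)
    rows = torch.arange(len(labels))
    loss1 = torch.log(P[rows, labels]).mean()         # Loss1 = log P_k      (eq. 5)
    loss2 = -torch.log(dispersion(centers, mode))     # Loss2 = -log Dis     (eq. 6)
    loss = loss1 + lam * loss2                        # Loss = Loss1 + λ·Loss2 (eq. 7)
    optimizer.zero_grad()
    loss.backward()                                   # back-propagate gradients
    optimizer.step()                                  # update weights and center vectors
    return loss.item()
```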
The foregoing is a further detailed description of the invention in connection with specific/preferred embodiments, and it is not intended that the practice of the invention be limited to these descriptions. It will be apparent to those skilled in the art that various substitutions and modifications can be made to the described embodiments without departing from the spirit of the invention, and these substitutions and modifications should be considered to fall within the scope of the invention. In this description, references to the terms "one embodiment", "some embodiments", "preferred embodiments", "an example", "a specific example", or "some examples" mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of these terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, the various embodiments or examples and the features of the different embodiments or examples described in this specification can be combined by those skilled in the art without contradiction.

Claims (9)

1. A deep learning image classification method based on construction of a category center comprises the following steps:
1) inputting a training image, extracting a feature vector of the image by using a multilayer convolutional neural network, wherein the feature vector of the image is a high-dimensional vector distributed in a high-dimensional feature space, and meanwhile, constructing different types of central vectors in the feature space, wherein the dimensions of the central vectors are consistent with those of the feature vectors of the image;
2) calculating Euclidean distances between feature vectors of the image and central vectors of different classes, and taking the class of the central vector with the minimum distance as the class of the feature vectors of the image;
3) calculating the dispersion degree of the feature vector distribution of different types of images according to the Euclidean distance between the central vectors of the types;
4) calculating the probability score of the image for each class according to the Euclidean distance between the image feature vector and each class center vector, and introducing a margin parameter into the probability-score calculation to control the intra-class distance;
5) calculating a network loss function, and learning the network parameters and the class center vectors by using the loss function.
2. The image classification method according to claim 1, wherein in step 1) the constructed class center vectors are trainable parameters, initialized by a random initialization method, and in step 5) the values of the class center vectors are updated by gradient back-propagation; assuming the number of classification classes is $m$, all center vectors are denoted $c_i$, $i = 1, 2, \dots, m$.
3. The image classification method according to claim 1 or 2, characterized in that in step 2) the image feature vector is denoted $f$ and its distance to the center vector of the $i$-th class is $L_i$, $i = 1, 2, \dots, m$; the minimum distance $L_k$ is found, i.e. $L_k = \min_{i=1,\dots,m} L_i$, and the class of the image feature is determined as the $k$-th class.
4. The image classification method according to any one of claims 1 to 3, characterized in that in step 3), assuming the center vectors of the different classes are $c_i$, $i = 1, 2, \dots, m$, the pairwise Euclidean distances between the center vectors of different classes are calculated and denoted $D_{ij}$, where $i, j = 1, 2, \dots, m$ and $i \neq j$; averaging over all or part of the $D_{ij}$ gives the dispersion degrees of the different classes.
5. The image classification method according to claim 4, characterized in that in step 3) the averaging over part of the $D_{ij}$ is performed by averaging the Euclidean distance between each class center vector and its nearest center vector.
6. The image classification method according to any one of claims 1 to 5, characterized in that in step 4) the distances $L_i$, $i = 1, 2, \dots, m$, between the image feature vector $f$ and the center vectors of the different classes are used to compute the probability scores, as follows: all distances are first normalized to obtain the relative distances

$$R_i = \frac{L_i}{\sum_{j=1}^{m} L_j}, \quad i = 1, 2, \dots, m$$

where a larger $R_i$ indicates a larger distance between the image feature vector and the center vector of the $i$-th class; to constrain the intra-class distance, a margin is introduced into the calculation of the probability scores; assuming the true label of the image feature vector is the $k$-th class, the probability score of the feature vector belonging to the $k$-th class is

$$P_k = \frac{e^{mR_k}}{e^{mR_k} + \sum_{i \neq k} e^{R_i}}$$

where $e$ is a natural constant, and the probability scores of the feature vector belonging to the other classes are expressed as

$$P_i = \frac{e^{R_i}}{e^{mR_k} + \sum_{j \neq k} e^{R_j}}, \quad i \neq k$$

the smaller the probability score $P_i$, the larger the probability that the image feature vector belongs to the $i$-th class, and $m$ denotes the margin.
7. The image classification method according to claim 6, characterized in that m takes a value greater than 1.
8. The image classification method according to any one of claims 1 to 7, characterized in that in step 5) the computation of the loss function comprises two parts, the classification loss of the image features, $\mathrm{Loss}_1$, and the dispersion degree of the class centers, $\mathrm{Loss}_2$; a hyperparameter $\lambda$ adjusts the weights of the two constraints, i.e. the final training loss is

$$\mathrm{Loss} = \mathrm{Loss}_1 + \lambda\,\mathrm{Loss}_2 \qquad (3)$$

preferably, the classification loss $\mathrm{Loss}_1$ is calculated using cross entropy;

preferably, the values of the network weights and the class center vectors are updated by back-propagation of gradients.
9. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the image classification method of any one of claims 1 to 8.
CN201911129753.1A 2019-11-18 2019-11-18 Image classification method for constructing class center Active CN111079790B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911129753.1A CN111079790B (en) 2019-11-18 2019-11-18 Image classification method for constructing class center

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911129753.1A CN111079790B (en) 2019-11-18 2019-11-18 Image classification method for constructing class center

Publications (2)

Publication Number Publication Date
CN111079790A true CN111079790A (en) 2020-04-28
CN111079790B CN111079790B (en) 2023-06-30

Family

ID=70311126

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911129753.1A Active CN111079790B (en) 2019-11-18 2019-11-18 Image classification method for constructing class center

Country Status (1)

Country Link
CN (1) CN111079790B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112633407A (en) * 2020-12-31 2021-04-09 深圳云天励飞技术股份有限公司 Method and device for training classification model, electronic equipment and storage medium
CN112836629A (en) * 2021-02-01 2021-05-25 清华大学深圳国际研究生院 Image classification method
WO2021214943A1 (en) * 2020-04-23 2021-10-28 日本電信電話株式会社 Parameter optimization method, non-temporary recording medium, feature amount extraction method, and parameter optimization device
CN116912920A (en) * 2023-09-12 2023-10-20 深圳须弥云图空间科技有限公司 Expression recognition method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108108807A (en) * 2017-12-29 2018-06-01 北京达佳互联信息技术有限公司 Learning-oriented image processing method, system and server
CN108235770A (en) * 2017-12-29 2018-06-29 深圳前海达闼云端智能科技有限公司 image identification method and cloud system
CN108304859A (en) * 2017-12-29 2018-07-20 达闼科技(北京)有限公司 Image-recognizing method and cloud system
US20190213448A1 (en) * 2016-12-30 2019-07-11 Tencent Technology (Shenzhen) Company Limited Image recognition method, apparatus, server, and storage medium
CN110084318A (en) * 2019-05-07 2019-08-02 哈尔滨理工大学 A kind of image-recognizing method of combination convolutional neural networks and gradient boosted tree

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190213448A1 (en) * 2016-12-30 2019-07-11 Tencent Technology (Shenzhen) Company Limited Image recognition method, apparatus, server, and storage medium
CN108108807A (en) * 2017-12-29 2018-06-01 北京达佳互联信息技术有限公司 Learning-oriented image processing method, system and server
CN108235770A (en) * 2017-12-29 2018-06-29 深圳前海达闼云端智能科技有限公司 image identification method and cloud system
CN108304859A (en) * 2017-12-29 2018-07-20 达闼科技(北京)有限公司 Image-recognizing method and cloud system
CN110084318A (en) * 2019-05-07 2019-08-02 哈尔滨理工大学 A kind of image-recognizing method of combination convolutional neural networks and gradient boosted tree

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021214943A1 (en) * 2020-04-23 2021-10-28 日本電信電話株式会社 Parameter optimization method, non-temporary recording medium, feature amount extraction method, and parameter optimization device
CN112633407A (en) * 2020-12-31 2021-04-09 深圳云天励飞技术股份有限公司 Method and device for training classification model, electronic equipment and storage medium
CN112633407B (en) * 2020-12-31 2023-10-13 深圳云天励飞技术股份有限公司 Classification model training method and device, electronic equipment and storage medium
CN112836629A (en) * 2021-02-01 2021-05-25 清华大学深圳国际研究生院 Image classification method
CN112836629B (en) * 2021-02-01 2024-03-08 清华大学深圳国际研究生院 Image classification method
CN116912920A (en) * 2023-09-12 2023-10-20 深圳须弥云图空间科技有限公司 Expression recognition method and device
CN116912920B (en) * 2023-09-12 2024-01-05 深圳须弥云图空间科技有限公司 Expression recognition method and device

Also Published As

Publication number Publication date
CN111079790B (en) 2023-06-30

Similar Documents

Publication Publication Date Title
CN111079790B (en) Image classification method for constructing class center
CN109241817B (en) Crop image recognition method shot by unmanned aerial vehicle
CN111079639B (en) Method, device, equipment and storage medium for constructing garbage image classification model
CN108647583B (en) Face recognition algorithm training method based on multi-target learning
WO2020108474A1 (en) Picture classification method, classification identification model generation method and apparatus, device, and medium
CN109977757B (en) Multi-modal head posture estimation method based on mixed depth regression network
CN109063719B (en) Image classification method combining structure similarity and class information
CN105488528B (en) Neural network image classification method based on improving expert inquiry method
CN111178208A (en) Pedestrian detection method, device and medium based on deep learning
CN110516537B (en) Face age estimation method based on self-learning
CN108427740B (en) Image emotion classification and retrieval algorithm based on depth metric learning
CN108875933A (en) A kind of transfinite learning machine classification method and the system of unsupervised Sparse parameter study
CN110516533B (en) Pedestrian re-identification method based on depth measurement
CN108154156B (en) Image set classification method and device based on neural topic model
US11823490B2 (en) Non-linear latent to latent model for multi-attribute face editing
CN110472693B (en) Image processing and classifying method and system
CN111104831B (en) Visual tracking method, device, computer equipment and medium
CN116880538B (en) High subsonic unmanned plane large maneuvering flight control system and method thereof
WO2023088174A1 (en) Target detection method and apparatus
CN113505855A (en) Training method for anti-attack model
KR101676101B1 (en) A Hybrid Method based on Dynamic Compensatory Fuzzy Neural Network Algorithm for Face Recognition
Hu et al. An integrated classification model for incremental learning
CN112836629A (en) Image classification method
CN112560824B (en) Facial expression recognition method based on multi-feature adaptive fusion
CN115510986A (en) Countermeasure sample generation method based on AdvGAN

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant