CN114972834B - Image classification method and device of multi-level multi-classifier - Google Patents

Image classification method and device of multi-level multi-classifier

Info

Publication number
CN114972834B
Authority
CN
China
Prior art keywords
probability
feature
layer
classifier
level
Prior art date
Legal status
Active
Application number
CN202110516975.XA
Other languages
Chinese (zh)
Other versions
CN114972834A (en)
Inventor
丁小波
蔡茂贞
刘井安
彭琨
钟地秀
李小青
Current Assignee
China Mobile Communications Group Co Ltd
China Mobile Internet Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Internet Co Ltd
Priority date
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Internet Co Ltd
Priority to CN202110516975.XA
Publication of CN114972834A
Application granted
Publication of CN114972834B


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses an image classification method and device of a multi-level multi-classifier. The method comprises the following steps: obtaining the feature map of each convolution layer in each class and the first feature probability of the feature map, which are produced by a feature extraction classifier pre-classifying a target image and extracting its features; based on the first feature probability, accumulating the probabilities of the feature maps of each convolution layer in each class to obtain a first thermodynamic diagram corresponding to each convolution layer in each class; processing the first thermodynamic diagrams to a uniform size and splicing them into a first multi-level thermodynamic diagram for each class; inputting the first multi-level thermodynamic diagram into a multi-level multi-classifier to obtain the classification probability of each layer of classes output by the multi-level multi-classifier; and outputting the class of the target image according to the relation between the maximum probability among the lowest-layer classification probabilities and a preset threshold.

Description

Image classification method and device of multi-level multi-classifier
Technical Field
The application relates to the field of image classification, in particular to an image classification method and device of a multi-level multi-classifier.
Background
With the progress and development of science and technology, image recognition or image classification using neural networks has become common. In existing image classification methods, a neural network is generally trained with labeled images of various categories; during prediction, the image to be predicted is input into the neural network, the network directly outputs a probability for each category, and the category with the highest probability is selected as the prediction output.
However, when the probabilities output by the neural network are relatively low for all categories, directly selecting the category with the highest probability as the output can result in misclassification. How to accurately output the category of an image to be predicted has therefore become a problem to be solved.
Disclosure of Invention
The application discloses an image classification method and device of a multi-level multi-classifier, which are used to solve the problem that existing image classification methods produce a high rate of missed and wrong recognition when the probabilities of all categories are low.
In order to solve the problems, the application adopts the following technical scheme:
in a first aspect, an embodiment of the present application discloses an image classification method of a multi-level multi-classifier, where the method includes: obtaining the feature map of each convolution layer in each class and the first feature probability of the feature map, which are produced by a feature extraction classifier pre-classifying a target image and extracting its features; based on the first feature probability, accumulating the probabilities of the feature maps of each convolution layer in each class to obtain a first thermodynamic diagram corresponding to each convolution layer in each class; processing the first thermodynamic diagrams to a uniform size and splicing them into a first multi-level thermodynamic diagram for each class; inputting the first multi-level thermodynamic diagram into a multi-level classifier to obtain the classification probability of each layer of classes output by the multi-level classifier; and outputting the class of the target image according to the relation between the maximum probability among the bottommost-layer classification probabilities and a preset threshold.
In a second aspect, an embodiment of the present application discloses an image classification apparatus of a multi-level multi-classifier, the apparatus including: an obtaining module, configured to obtain the feature map of each convolution layer in each class and the first feature probability of the feature map, which are produced by a feature extraction classifier pre-classifying a target image and extracting its features; an accumulation module, configured to accumulate, based on the first feature probability, the probabilities of the feature maps of each convolution layer in each class to obtain a first thermodynamic diagram corresponding to each convolution layer in each class; a splicing module, configured to process the first thermodynamic diagrams to a uniform size and splice them into a first multi-level thermodynamic diagram for each class; an input module, configured to input the first multi-level thermodynamic diagram into a multi-level classifier to obtain the classification probability of each layer of classes output by the multi-level classifier; and an output module, configured to output the class of the target image according to the relation between the maximum probability among the lowest-layer classification probabilities and a preset threshold.
The technical scheme adopted in the embodiments of the application can achieve the following beneficial effects:
the embodiment of the application discloses an image classification method of a multi-level multi-classifier, which is characterized in that a target image is pre-classified and feature extracted through a feature extraction classifier, a feature extraction result is processed into a first thermodynamic diagram, the first multilevel thermodynamic diagram is input into the multi-level multi-classifier after being spliced into the first multilevel thermodynamic diagram, the target image is finely classified through the multi-level multi-classifier to obtain the classification probability of each layer of classification, and the classification of the target image is output according to the number relation between the maximum probability in the classification probability of the bottommost layer and a preset threshold value.
Drawings
Fig. 1 is a schematic flow chart of an image classification method of a multi-level multi-classifier according to an embodiment of the present application;
FIG. 2 is a flow chart of another image classification method of a multi-level multi-classifier according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a pre-training process of a feature extraction classifier and a multi-level multi-classifier according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of an image classification device of a multi-level multi-classifier according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The terms first, second and the like in the description and in the claims, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged, as appropriate, such that embodiments of the present application may be implemented in sequences other than those illustrated or described herein, and that the objects identified by "first," "second," etc. are generally of a type, and are not limited to the number of objects, such as the first object may be one or more. Furthermore, in the description and claims, "and/or" means at least one of the connected objects, and the character "/", generally means that the associated object is an "or" relationship.
The image classification method and device of the multi-level multi-classifier provided by the embodiment of the application are described in detail below through specific embodiments and application scenes thereof with reference to the accompanying drawings.
Fig. 1 is a flow chart of a method for classifying images of a multi-level multi-classifier according to an embodiment of the present application, where the method may be performed by an electronic device, in other words, the method may be performed by software or hardware installed in the electronic device. As shown in fig. 1, an image classification method of a multi-level multi-classifier according to an embodiment of the present application may include the following steps:
s110: the method comprises the steps of obtaining a feature extraction classifier to pre-classify a target image and extracting features, and obtaining a feature map of each convolution layer in each class and a first feature probability of the feature map.
After the target image is input into the feature extraction classifier, the convolution layers of the feature extraction classifier extract features from the target image; the pooling layers compress the extracted features so that the main features of the target image are retained; the fully-connected layer then connects all of the features and sends its output values to the classifier, which classifies them, so that the output layer obtains, for each class, the feature map corresponding to each convolution layer of the target image and the first feature probability of that feature map. Various classifiers can be used, for example a softmax classifier; specifically, a softmax function can convert the output values of the fully-connected layer into a probability distribution in which every value lies in [0, 1] and the values sum to 1, namely the first feature probability.
It should be noted that the feature extraction classifier is equivalent to a convolutional neural network, which includes an input layer, convolution layers, pooling layers, a fully-connected layer and an output layer.
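As an illustrative aid that is not part of the original disclosure, the following minimal NumPy sketch shows how the output values of the fully-connected layer can be converted by a softmax function into a probability distribution over classes, i.e. the first feature probability described above; the logit values are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Convert fully-connected output values into a probability distribution:
    every value lies in [0, 1] and the values sum to 1."""
    z = logits - np.max(logits)   # subtract the maximum for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Hypothetical fully-connected outputs for four pre-classification categories.
fc_output = np.array([2.0, 0.5, -1.0, 0.1])
first_feature_prob = softmax(fc_output)
print(first_feature_prob, first_feature_prob.sum())   # the probabilities sum to 1.0
```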
S120: and based on the first feature probability, carrying out probability accumulation on the feature graphs of each convolution layer in each category to obtain a first thermodynamic diagram corresponding to each convolution layer in each category.
S130: the first thermodynamic diagrams are processed to uniform dimensions and spliced into first multi-level thermodynamic diagrams of each class.
Inputting the target image into the feature extraction classifier yields a weight (namely the first feature probability) corresponding to each feature map in the plurality of convolution layers. Accumulating the feature maps of each convolution layer with these weights yields a first thermodynamic diagram. The first thermodynamic diagrams are then processed to a uniform size; in one implementation, ROI Pooling (Region of Interest Pooling) can be used to bring the plurality of first thermodynamic diagrams to a uniform size so that they can be spliced into a first multi-level thermodynamic diagram for each class.
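A minimal sketch of these two steps, assuming the per-layer feature maps and the corresponding class weights are already available as NumPy arrays; nearest-neighbour resizing stands in here for ROI Pooling, and all shapes, sizes and values are illustrative.

```python
import numpy as np

def heat_map(feature_maps, weights):
    """Weighted accumulation of one convolution layer's feature maps into a single
    map (the "thermodynamic diagram"). feature_maps: (C, H, W); weights: (C,)."""
    return np.tensordot(weights, feature_maps, axes=1)        # -> (H, W)

def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize used as a stand-in for ROI Pooling, so that heat
    maps from layers of different resolution share one uniform size."""
    h, w = img.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[np.ix_(rows, cols)]

# Hypothetical feature maps of three convolution layers and, for one class,
# the weight (first feature probability) of each channel in each layer.
layers  = [np.random.rand(8, 32, 32), np.random.rand(16, 16, 16), np.random.rand(32, 8, 8)]
weights = [np.random.rand(8), np.random.rand(16), np.random.rand(32)]

uniform = [resize_nearest(heat_map(f, w), 8, 8) for f, w in zip(layers, weights)]
multi_level = np.stack(uniform)   # spliced "first multi-level thermodynamic diagram" for this class
print(multi_level.shape)          # (3, 8, 8)
```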
S140: and inputting the first multi-level thermodynamic diagram into a multi-level classifier to obtain the classification probability of each layer of category output by the multi-level classifier.
S150: and outputting the category of the target image according to the quantity relation between the maximum probability in the lowest-layer classification probability and the preset threshold value.
After the spliced first multi-level thermodynamic diagram is input into the multi-level multi-classifier, which contains a classifier at each layer (each layer's classifier being equivalent to a convolutional neural network), the classifier of each layer processes the first multi-level thermodynamic diagram, so that the classification probability of each layer of classes of the target image is obtained; the class with the highest probability can then be output according to the relation between the maximum probability among the bottommost-layer classification probabilities and the preset threshold. This multi-level multi-classification approach avoids the missed recognition that can occur when the probabilities obtained by finely classifying the target image are all low.
The embodiment of the application discloses an image classification method of a multi-level multi-classifier, in which a target image is pre-classified and its features are extracted by a feature extraction classifier, the feature extraction results are processed into first thermodynamic diagrams, the first thermodynamic diagrams are spliced into a first multi-level thermodynamic diagram and input into the multi-level multi-classifier, the multi-level multi-classifier finely classifies the target image to obtain the classification probability of each layer of classes, and the class of the target image is output according to the relation between the maximum probability among the bottommost-layer classification probabilities and a preset threshold.
In one implementation, inputting the first multi-level thermodynamic diagram into the multi-level classifier to obtain the classification probability of each layer of classes output by the multi-level classifier includes:
and inputting the first multi-level thermodynamic diagram into each layer of classifier in the multi-level classifier to obtain the first classification probability of each layer of class.
And based on the first feature probability, carrying out probability accumulation on the first classification probability and the first feature probability of each layer of category to obtain a second classification probability.
And normalizing the second classification probability to obtain the classification probability of each layer of class.
After the multi-level classifier receives the first multi-level thermodynamic diagram, the probability of each class at the current level, namely the first classification probability, is obtained. Based on the first feature probability obtained when the feature extraction classifier pre-classified the target image, the first classification probability of each class at the current level is accumulated with the first feature probability to obtain the probability of the target image at the current level, namely the second classification probability. Normalizing the second classification probability then gives the classification probability of each class output by the multi-level classifier.
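One plausible reading of the accumulation and normalization steps, sketched in NumPy; the shapes and probability values are hypothetical and only illustrate computing p_j = Σ_i p_i · p_ij and then normalizing.

```python
import numpy as np

def level_class_probability(first_feature_prob, first_class_prob):
    """first_feature_prob: (k1,) pre-classification probabilities p_i.
    first_class_prob: (k1, k2) probabilities p_ij of the k2 classes of this
    level for each pre-classification class i. Returns the normalized
    classification probability of each class at this level."""
    accumulated = first_feature_prob @ first_class_prob   # p_j = sum_i p_i * p_ij
    return accumulated / accumulated.sum()                # normalized p'_j

# Hypothetical numbers: k1 = 2 pre-classification classes, k2 = 3 classes at this level.
p_i  = np.array([0.7, 0.3])
p_ij = np.array([[0.6, 0.3, 0.1],
                 [0.2, 0.2, 0.6]])
print(level_class_probability(p_i, p_ij))   # e.g. [0.48 0.27 0.25]
```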
After the classification probability of each layer of classes output by the multi-level multi-classifier is obtained, the class of the target image can be determined and output according to the relation between the maximum probability among the bottommost-layer classification probabilities and the preset threshold. In one implementation, outputting the class of the target image according to this relation includes:
and outputting the category corresponding to the maximum probability as the category of the target image under the condition that the maximum probability is larger than a preset threshold value.
Judging whether the current layer is the highest layer or not under the condition that the maximum probability is not greater than a preset threshold value, and if so, outputting a category corresponding to the maximum probability as the category of the target image; otherwise, the third classification probability of each category in the bottommost layer is recalculated by combining the classification probabilities output by the classifier of the previous layer, wherein the category corresponding to the maximum probability in the third classification probability is output as the category of the target image under the condition that the maximum probability in the third classification probability is larger than a preset threshold.
According to the classification probabilities of each class at each layer obtained by the multi-level classifier, the highest probability among the classes of the bottommost layer is selected first and compared with the preset threshold; if it exceeds the threshold, the corresponding class is output directly. Otherwise, the decision is made in combination with the classification probabilities output by the previous layer: the probability of each class of the bottommost layer (namely the third classification probability) is recalculated, and if the maximum of the third classification probabilities is greater than the preset threshold, the class corresponding to that maximum is output as the class of the target image.
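A sketch of this decision rule in plain Python; the function name, dictionary layout and threshold value are illustrative assumptions rather than the patented implementation.

```python
def decide(bottom_probs, upper_probs, upper_of, threshold=0.5, upper_is_highest=True):
    """bottom_probs: class -> probability at the bottommost level.
    upper_probs: upper class -> probability output by the previous layer.
    upper_of: maps each bottom class to its upper-layer class."""
    best = max(bottom_probs, key=bottom_probs.get)
    if bottom_probs[best] > threshold:
        return best                         # confident enough at the bottom level
    # Recalculate p_m = p_n * p_mn with the previous layer, then normalize
    # (the "third classification probability").
    recalced = {c: upper_probs[upper_of[c]] * p for c, p in bottom_probs.items()}
    total = sum(recalced.values())
    third = {c: p / total for c, p in recalced.items()}
    best = max(third, key=third.get)
    if third[best] > threshold or upper_is_highest:
        return best
    return None   # a deeper hierarchy would continue with the next layer up
```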
Fig. 2 is a flow chart of another image classification method of a multi-level multi-classifier, and as shown in fig. 2, an embodiment of the application discloses another image classification method of a multi-level multi-classifier, which includes the following steps:
s210: inputting the target image into a feature extraction classifier to perform feature extraction and category judgment to obtain the probability p of each category i (i=1Λk 1 ) Wherein k is 1 The number of classes of the classifier is extracted for the feature.
S220: thermodynamic diagrams of a plurality of convolution layers corresponding to each category are calculated.
S230: multiple thermodynamic diagrams are processed into uniform sizes by utilizing ROI Pooling, and are combined into a multi-level thermodynamic diagram.
S240: inputting the multi-level thermodynamic diagram of each category into each layer of the multi-level classifier to obtain the probability p_ij (i = 1, ..., k_1; j = 1, ..., k_2) of each category at the current level, where k_2 is the number of categories of the current level.
S250: probability accumulation is carried out over the categories of the current layer to obtain p_j = Σ_i p_i · p_ij.
S260: normalizing p_j to obtain the final classification probability p'_j.
S270: and selecting the maximum probability p in the classification probabilities of the bottommost layer according to the classification probability of each layer.
S280: judging whether the maximum probability p is larger than a preset threshold h or not;
if so, directly selecting the category with the highest probability as output;
otherwise, judging in combination with the classification probabilities output by the previous layer and recalculating the probability p_m of each class of the bottommost layer as p_m = p_n · p_mn, where p_mn is the probability of each category at the current bottom layer, p_n is the class probability output by the upper-layer classification, and p_m is the recalculated probability of each category at the bottom; p_m is then normalized to obtain p'_m = p_m / Σ_m p_m.
S290: judging whether the current layer is the highest layer; if so, directly outputting the category with the highest probability.
For example, suppose the bottommost classification gives cat (0.4), dog (0.1), flower (0.4) and grass (0.1) with a preset threshold of 0.5, so no class can be selected, while the upper-layer classification gives animal (0.9) and plant (0.1). According to p_m = p_n · p_mn, the probabilities are recalculated as cat (0.9 × 0.4 = 0.36), dog (0.9 × 0.1 = 0.09), flower (0.1 × 0.4 = 0.04) and grass (0.1 × 0.1 = 0.01); after normalization this gives cat (0.72), dog (0.18), flower (0.08) and grass (0.02). The probability of cat is now greater than the preset threshold 0.5, so the output class is cat.
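The numbers in this example can be checked with a few lines of Python (all values are taken from the example above):

```python
bottom = {"cat": 0.4, "dog": 0.1, "flower": 0.4, "grass": 0.1}   # bottommost level
upper  = {"animal": 0.9, "plant": 0.1}                           # previous layer
parent = {"cat": "animal", "dog": "animal", "flower": "plant", "grass": "plant"}

recalced = {c: upper[parent[c]] * p for c, p in bottom.items()}
# {'cat': 0.36, 'dog': 0.09, 'flower': 0.04, 'grass': 0.01}
total = sum(recalced.values())                                   # 0.5
normalized = {c: round(p / total, 2) for c, p in recalced.items()}
print(normalized)   # {'cat': 0.72, 'dog': 0.18, 'flower': 0.08, 'grass': 0.02} -> output "cat"
```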
In order to better ensure the accuracy of prediction, in one implementation manner, before the feature extraction classifier performs pre-classification and feature extraction on the target image, the method further includes:
pre-training a feature extraction classifier and a multi-level classifier.
Through the pre-training of the feature extraction classifier and the multi-level classifier, more accurate categories can be output when the target image is actually predicted.
In one implementation, pre-training the feature extraction classifier and the multi-level classifier may include the following steps:
and inputting the training images of the target categories into the convolutional neural network in the feature extraction classifier, and outputting feature vectors of feature graphs of each convolutional neural network layer.
And selecting the feature vectors of the last n convolutional neural network layers to carry out global average pooling, and splicing the pooled feature vectors into a first feature vector, wherein n is an integer greater than 1.
The first feature vector is fully connected with the output layer and classified by softmax.
By splicing the feature vectors of the last several convolutional neural network layers into the first feature vector, the feature extraction classifier can be pre-trained on the features of those layers, and once pre-training is complete it can perform feature extraction and pre-classification on the training images.
And carrying out feature extraction on the training image based on the pre-trained feature extraction classifier to obtain a feature map of each convolutional neural network layer and a second feature probability of the feature map.
And based on the second feature probability, carrying out probability accumulation on the feature map of each convolutional neural network layer corresponding to the target category to obtain a second thermodynamic diagram corresponding to each convolutional neural network layer.
The second thermodynamic diagram is processed to a uniform size and spliced into a multi-level thermodynamic diagram.
And inputting the multi-level thermodynamic diagram into each layer of classifier in the multi-level classifier, and training each layer of classifier in the multi-level classifier.
Based on the pre-trained feature extraction classifier, the weight (namely the second feature probability) corresponding to each feature map in the plurality of convolution layers can be obtained; the weights of the target class are selected and the feature maps of each convolution layer are accumulated with these weights to obtain the second thermodynamic diagram of each convolution layer; the second thermodynamic diagrams are spliced into a multi-level thermodynamic diagram, and the classifier of each layer can then be trained on this thermodynamic diagram.
Specifically, fig. 3 shows a schematic diagram of the pre-training process of the feature extraction classifier and the multi-level multi-classifier. After an image is input into the feature extraction classifier and processed by the multiple convolutional neural network layers, the feature vectors extracted by the last several convolutional layers are selected and globally average-pooled; the pooled feature vectors are spliced to obtain the first feature vector, which is then fully connected, and the fully-connected output values are classified by softmax. In this way, pre-training of the feature extraction classifier, together with pre-classification of the image and feature extraction, can be realized. Based on the trained feature extraction classifier, the weight corresponding to each feature map in the plurality of convolution layers can be obtained; the weights of the target class of the image are selected, the feature maps of each convolution layer are accumulated to obtain the thermodynamic diagram (namely the second thermodynamic diagram) of each convolution layer, and the thermodynamic diagrams are spliced into a multi-level thermodynamic diagram and input into the multi-level multi-classifier. Based on the multi-level thermodynamic diagram, the several convolutional neural networks in the multi-level multi-classifier, i.e. the classifier of each layer, can be trained.
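A minimal NumPy sketch of the pre-training head described above (global average pooling of the last n convolutional layers, splicing into the first feature vector, a fully-connected layer, and softmax classification); the layer sizes, random weights and class count are illustrative assumptions.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def global_average_pool(feature_map):
    """(C, H, W) -> (C,): average each channel over its spatial extent."""
    return feature_map.mean(axis=(1, 2))

# Hypothetical feature maps from the last n = 3 convolutional layers.
last_layers = [np.random.rand(8, 32, 32), np.random.rand(16, 16, 16), np.random.rand(32, 8, 8)]

pooled = [global_average_pool(f) for f in last_layers]
first_feature_vector = np.concatenate(pooled)     # spliced first feature vector (length 8 + 16 + 32)

num_classes = 5
rng = np.random.default_rng(0)
W = rng.normal(size=(num_classes, first_feature_vector.size))   # fully-connected (output) layer
b = np.zeros(num_classes)

class_probs = softmax(W @ first_feature_vector + b)   # softmax classification over the classes
print(class_probs.shape, class_probs.sum())           # (5,) 1.0
```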
Based on the above-mentioned image classification method of a multi-level multi-classifier, an embodiment of the present application discloses an image classification device 400 of a multi-level multi-classifier, as shown in fig. 4, the device includes:
the obtaining module 410 is configured to obtain a feature map of each convolution layer in each class and a first feature probability of the feature map, where the feature extraction classifier performs pre-classification and feature extraction on the target image.
And the accumulating module 420 is configured to accumulate the probabilities of the feature graphs of each convolution layer in each category based on the first feature probabilities, so as to obtain a first thermodynamic diagram corresponding to each convolution layer in each category.
And the splicing module 430 is configured to process the first thermodynamic diagram into a unified size and splice the first thermodynamic diagram into a first multi-level thermodynamic diagram of each class.
The input module 440 is configured to input the first multi-level thermodynamic diagram to the multi-level classifier to obtain the classification probability of each level of class output by the multi-level classifier.
And the output module 450 is used for outputting the category of the target image according to the number relation between the maximum probability of the lowest-level classification probability and the preset threshold value.
The embodiment of the application discloses an image classification device of a multi-level multi-classifier, in which a target image is pre-classified and its features are extracted by a feature extraction classifier, the feature extraction results are processed into first thermodynamic diagrams, the first thermodynamic diagrams are spliced into a first multi-level thermodynamic diagram and input into the multi-level multi-classifier, the multi-level multi-classifier can finely classify the target image to obtain the classification probability of each layer of classes, and the class of the target image is output according to the relation between the maximum probability among the bottommost-layer classification probabilities and a preset threshold.
In one implementation, the input module 440 is configured to:
and inputting the first multi-level thermodynamic diagram into each layer of classifier in the multi-level classifier to obtain the first classification probability of each layer of class.
And based on the first feature probability, carrying out probability accumulation on the first classification probability and the first feature probability of each layer of category to obtain a second classification probability.
And normalizing the second classification probability to obtain the classification probability of each layer of class.
After the multi-level classifier receives the first multi-level thermodynamic diagram, the probability of each class at the current level, namely the first classification probability, is obtained. Based on the first feature probability obtained when the feature extraction classifier pre-classified the target image, the first classification probability of each class at the current level is accumulated with the first feature probability to obtain the probability of the target image at the current level, namely the second classification probability. Normalizing the second classification probability then gives the classification probability of each class output by the multi-level classifier.
After the classification probability of each layer of classes output by the multi-level multi-classifier is obtained, the class of the target image can be determined and output according to the relation between the maximum probability among the bottommost-layer classification probabilities and the preset threshold. In one implementation, the output module 450 can be configured to:
and outputting the category corresponding to the maximum probability as the category of the target image under the condition that the maximum probability is larger than a preset threshold value.
Judging whether the current layer is the highest layer or not under the condition that the maximum probability is not greater than a preset threshold value, and if so, outputting a category corresponding to the maximum probability as the category of the target image; otherwise, the third classification probability of each category in the bottommost layer is recalculated by combining the classification probabilities output by the classifier of the previous layer, wherein the category corresponding to the maximum probability in the third classification probability is output as the category of the target image under the condition that the maximum probability in the third classification probability is larger than a preset threshold.
In this way, according to the relation between the maximum probability among the bottommost-layer categories and the preset threshold, the decision can be made in combination with the probabilities output by the previous layer until the maximum probability exceeds the preset threshold, and the category with the maximum probability is output.
In one implementation, the apparatus further includes:
a pre-training module for pre-training the feature extraction classifier and the multi-level classifier prior to the acquisition module 410.
In a further technical solution, the pre-training module is configured to:
and inputting the training images of the target categories into the convolutional neural network in the feature extraction classifier, and outputting feature vectors of feature graphs of each convolutional neural network layer.
And selecting the feature vectors of the last n convolutional neural network layers to carry out global average pooling, and splicing the pooled feature vectors into a first feature vector, wherein n is an integer greater than 1.
The first feature vector is fully connected with the output layer and classified by softmax.
And carrying out feature extraction on the training image based on the pre-trained feature extraction classifier to obtain a feature map of each convolutional neural network layer and a second feature probability of the feature map.
And based on the second feature probability, carrying out probability accumulation on the feature map of each convolutional neural network layer corresponding to the target category to obtain a second thermodynamic diagram corresponding to each convolutional neural network layer.
The second thermodynamic diagram is processed to a uniform size and spliced into a multi-level thermodynamic diagram.
And inputting the multi-level thermodynamic diagram into each layer of classifier in the multi-level classifier, and training each layer of classifier in the multi-level classifier.
Based on the pre-trained feature extraction classifier, the weight (namely the second feature probability) corresponding to each feature map in the plurality of convolution layers can be obtained; the weights of the target class are selected and the feature maps of each convolution layer are accumulated with these weights to obtain the second thermodynamic diagram of each convolution layer; the second thermodynamic diagrams are spliced into a multi-level thermodynamic diagram, and the classifier of each layer can then be trained on this thermodynamic diagram.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element. Furthermore, it should be noted that the scope of the methods and apparatus in the embodiments of the present application is not limited to performing the functions in the order shown or discussed, but may also include performing the functions in a substantially simultaneous manner or in an opposite order depending on the functions involved, e.g., the described methods may be performed in an order different from that described, and various steps may be added, omitted, or combined. Additionally, features described with reference to certain examples may be combined in other examples.
The foregoing embodiments of the present application mainly describe the differences between the embodiments; as long as there is no contradiction between the different optimization features of the embodiments, they may be combined to form a better embodiment, and for brevity of the text no further description is provided herein.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and variations of the present application will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. which come within the spirit and principles of the application are to be included in the scope of the claims of the present application.

Claims (2)

1. An image classification method of a multi-level multi-classifier, the method comprising:
the method comprises the steps of obtaining a feature extraction classifier to pre-classify a target image and extracting features, and obtaining a feature image of each convolution layer in each class and a first feature probability of the feature image;
based on the first feature probability, carrying out probability accumulation on the feature map of each convolution layer in each category to obtain a first thermodynamic diagram corresponding to each convolution layer in each category;
processing the first thermodynamic diagram into uniform size and splicing the first thermodynamic diagram into a first multi-level thermodynamic diagram of each category;
inputting the first multi-level thermodynamic diagram into a multi-level classifier to obtain the classification probability of each layer of class output by the multi-level classifier;
outputting the category of the target image according to the quantity relation between the maximum probability in the lowest-layer classification probability and a preset threshold value;
the step of inputting the first multi-level thermodynamic diagram into a multi-level classifier to obtain the classification probability of each layer of class output by the multi-level classifier comprises the following steps:
inputting the first multi-level thermodynamic diagram into each layer of classifier in the multi-level classifier to obtain a first classification probability of each layer of class;
based on the first feature probability, carrying out probability accumulation on the first classification probability of each layer of category and the first feature probability to obtain a second classification probability;
normalizing the second classification probability to obtain the classification probability of each layer of class;
wherein outputting the category of the target image according to the quantitative relation between the maximum probability of the lowest-layer classification probability and a preset threshold value comprises:
outputting a category corresponding to the maximum probability as the category of the target image under the condition that the maximum probability is larger than the preset threshold value;
judging whether the current layer is the highest layer or not under the condition that the maximum probability is not greater than the preset threshold value, and if so, outputting a category corresponding to the maximum probability as the category of the target image;
otherwise, calculating third classification probability of each class in the bottommost layer again by combining the classification probability output by the classifier of the previous layer, wherein the class corresponding to the maximum probability in the third classification probability is output as the class of the target image under the condition that the maximum probability in the third classification probability is larger than the preset threshold value;
before the feature extraction classifier performs pre-classification and feature extraction on the target image, and the obtained feature map of each convolution layer in each class and the first feature probability of the feature map, the method further includes:
pre-training the feature extraction classifier and the multi-level classifier;
wherein said pre-training said feature extraction classifier and said multi-level classifier comprises:
inputting training images of target categories into the convolutional neural network in the feature extraction classifier, and outputting feature vectors of feature graphs of each convolutional neural network layer;
selecting the feature vectors of the last n convolutional neural network layers to carry out global average pooling, and splicing the pooled feature vectors into a first feature vector, wherein n is an integer greater than 1;
fully connecting the first feature vector with an output layer and classifying softmax;
performing feature extraction on the training image based on the pre-trained feature extraction classifier to obtain a feature map of each convolutional neural network layer and a second feature probability of the feature map;
based on the second feature probability, carrying out probability accumulation on the feature map of each convolutional neural network layer corresponding to the target category to obtain a second thermodynamic diagram corresponding to each convolutional neural network layer;
processing the second thermodynamic diagram into uniform size and splicing the uniform size into a multi-level thermodynamic diagram;
and inputting the multi-level thermodynamic diagram into each layer of classifier in the multi-level classifier, and training each layer of classifier in the multi-level classifier.
2. An image classification apparatus of a multi-level multi-classifier, the apparatus comprising:
the acquisition module is used for acquiring a feature image of each convolution layer in each class and a first feature probability of the feature image, wherein the feature image is obtained by pre-classifying a target image by the feature extraction classifier and extracting features;
the accumulation module is used for carrying out probability accumulation on the feature graphs of each convolution layer in each category based on the first feature probability to obtain a first thermodynamic diagram corresponding to each convolution layer in each category;
the splicing module is used for processing the first thermodynamic diagrams into uniform sizes and splicing the first thermodynamic diagrams into first multi-level thermodynamic diagrams of various types;
the input module is used for inputting the first multi-level thermodynamic diagram into a multi-level classifier to obtain the classification probability of each layer of class output by the multi-level classifier;
the output module is used for outputting the category of the target image according to the quantity relation between the maximum probability in the lowest-layer classification probability and a preset threshold value;
wherein, the input module is used for:
inputting the first multi-level thermodynamic diagram into each layer of classifier in the multi-level classifier to obtain a first classification probability of each layer of class;
based on the first feature probability, carrying out probability accumulation on the first classification probability of each layer of category and the first feature probability to obtain a second classification probability;
normalizing the second classification probability to obtain the classification probability of each layer of class;
wherein, output module is used for:
outputting a category corresponding to the maximum probability as the category of the target image under the condition that the maximum probability is larger than the preset threshold value;
judging whether the current layer is the highest layer or not under the condition that the maximum probability is not greater than the preset threshold value, and if so, outputting a category corresponding to the maximum probability as the category of the target image;
otherwise, calculating third classification probability of each class in the bottommost layer again by combining the classification probability output by the classifier of the previous layer, wherein the class corresponding to the maximum probability in the third classification probability is output as the class of the target image under the condition that the maximum probability in the third classification probability is larger than the preset threshold value;
the pre-training module is used for pre-training the feature extraction classifier and the multi-level classifier before the acquisition module;
wherein, the pre-training module is used for:
inputting training images of target categories into the convolutional neural network in the feature extraction classifier, and outputting feature vectors of feature graphs of each convolutional neural network layer;
selecting the feature vectors of the last n convolutional neural network layers to carry out global average pooling, and splicing the pooled feature vectors into a first feature vector, wherein n is an integer greater than 1;
fully connecting the first feature vector with an output layer and classifying softmax;
performing feature extraction on the training image based on the pre-trained feature extraction classifier to obtain a feature map of each convolutional neural network layer and a second feature probability of the feature map;
based on the second feature probability, carrying out probability accumulation on the feature map of each convolutional neural network layer corresponding to the target category to obtain a second thermodynamic diagram corresponding to each convolutional neural network layer;
processing the second thermodynamic diagram into uniform size and splicing the uniform size into a multi-level thermodynamic diagram;
and inputting the multi-level thermodynamic diagram into each layer of classifier in the multi-level classifier, and training each layer of classifier in the multi-level classifier.
CN202110516975.XA 2021-05-12 2021-05-12 Image classification method and device of multi-level multi-classifier Active CN114972834B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110516975.XA CN114972834B (en) 2021-05-12 2021-05-12 Image classification method and device of multi-level multi-classifier

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110516975.XA CN114972834B (en) 2021-05-12 2021-05-12 Image classification method and device of multi-level multi-classifier

Publications (2)

Publication Number Publication Date
CN114972834A CN114972834A (en) 2022-08-30
CN114972834B (en) 2023-09-05

Family

ID=82973872

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110516975.XA Active CN114972834B (en) 2021-05-12 2021-05-12 Image classification method and device of multi-level multi-classifier

Country Status (1)

Country Link
CN (1) CN114972834B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106874921A (en) * 2015-12-11 2017-06-20 清华大学 Image classification method and device
CN109934293A (en) * 2019-03-15 2019-06-25 苏州大学 Image-recognizing method, device, medium and obscure perception convolutional neural networks
CN110136103A (en) * 2019-04-24 2019-08-16 平安科技(深圳)有限公司 Medical image means of interpretation, device, computer equipment and storage medium
CN111091147A (en) * 2019-12-10 2020-05-01 东软集团股份有限公司 Image classification method, device and equipment
CN111553438A (en) * 2020-05-07 2020-08-18 广州鹄志信息咨询有限公司 Image identification method based on convolutional neural network
CN112651438A (en) * 2020-12-24 2021-04-13 世纪龙信息网络有限责任公司 Multi-class image classification method and device, terminal equipment and storage medium


Also Published As

Publication number Publication date
CN114972834A (en) 2022-08-30

Similar Documents

Publication Publication Date Title
CN110619369B (en) Fine-grained image classification method based on feature pyramid and global average pooling
CN111027493B (en) Pedestrian detection method based on deep learning multi-network soft fusion
US20160163035A1 (en) Automatic Defect Classification Without Sampling and Feature Selection
CN111428807A (en) Image processing method and computer-readable storage medium
CN111061889B (en) Automatic identification method and device for multiple labels of picture
CN109871821B (en) Pedestrian re-identification method, device, equipment and storage medium of self-adaptive network
CN112489092B (en) Fine-grained industrial motion modality classification method, storage medium, device and apparatus
CN110135505B (en) Image classification method and device, computer equipment and computer readable storage medium
JP4098021B2 (en) Scene identification method, apparatus, and program
CN111783819B (en) Improved target detection method based on region of interest training on small-scale data set
CN112381763A (en) Surface defect detection method
CN111401343B (en) Method for identifying attributes of people in image and training method and device for identification model
CN110751191A (en) Image classification method and system
CN113743443B (en) Image evidence classification and recognition method and device
CN114972834B (en) Image classification method and device of multi-level multi-classifier
CN114037886A (en) Image recognition method and device, electronic equipment and readable storage medium
CN114067314B (en) Neural network-based peanut mildew identification method and system
CN115797701A (en) Target classification method and device, electronic equipment and storage medium
CN108427957B (en) Image classification method and system
Denisova et al. Supervised multichannel image classification algorithm using hierarchical histogram representation
CN113408546B (en) Single-sample target detection method based on mutual global context attention mechanism
CN112949731A (en) Target detection method, device, storage medium and equipment based on multi-expert model
CN111931767A (en) Multi-model target detection method, device and system based on picture information degree and storage medium
CN117593890B (en) Detection method and device for road spilled objects, electronic equipment and storage medium
CN111625672B (en) Image processing method, image processing device, computer equipment and storage medium

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant