CN111259967A - Image classification and neural network training method, device, equipment and storage medium


Info

Publication number
CN111259967A
CN111259967A
Authority
CN
China
Prior art keywords: class, feature, image, determining, target
Prior art date
Legal status
Granted
Application number
CN202010054273.XA
Other languages
Chinese (zh)
Other versions
CN111259967B (en)
Inventor
张潇
赵瑞
乔宇
李鸿升
Current Assignee
Beijing Sensetime Technology Development Co Ltd
Original Assignee
Beijing Sensetime Technology Development Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Sensetime Technology Development Co Ltd
Priority to CN202010054273.XA
Publication of CN111259967A
Application granted
Publication of CN111259967B
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure relates to an image classification and neural network training method, apparatus, device, and storage medium, the method comprising: performing feature extraction processing on a target image through a target neural network to obtain a target feature of the target image; determining, according to a first parameter, the radial basis distance between the target feature and the class center feature of each class of image; and determining the target class of the target image according to the radial basis distances. According to the image classification method of the embodiments of the present disclosure, the class of the target image can be determined from the radial basis distance between the target feature and the class center feature of each class, which improves the distribution of the target feature and the class center features, reduces distribution randomness, enhances the aggregation of same-class features, and improves the classification accuracy for the target image.

Description

Image classification and neural network training method, device, equipment and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method, an apparatus, a device, and a storage medium for image classification and neural network training.
Background
Image classification techniques are an important foundation of computer vision: techniques such as object detection, image semantic segmentation, and instance segmentation all build on image classification. Image classification feeds an image acquired by image acquisition equipment into a neural network, which outputs a set of probabilities describing whether the image belongs to specific categories or content. As a computer vision technology, image classification is widely applicable to fields such as industrial production, anomaly detection, and autonomous driving.
A central task of image classification is to distinguish between different objects with high similarity (e.g., telling apart people with very similar appearances, or classifying similar birds and animals). During training of a neural network, the extracted features need to be optimized in Euclidean space; however, because reasonable loss values cannot be assigned to the training samples in Euclidean space, the resulting feature distribution is not ideal, so the training effect of the neural network is not ideal and the classification accuracy is low.
Disclosure of Invention
The disclosure provides an image classification and neural network training method, device, equipment and storage medium.
According to an aspect of the present disclosure, there is provided an image classification method, including: carrying out feature extraction processing on a target image through a target neural network to obtain target features of the target image; determining a radial basis distance between the target feature and a class center feature of each class of image in an image set according to a preset first parameter, wherein the radial basis distance is used for representing classification probability that the target feature belongs to each class respectively, and the image set comprises at least one class of image; and determining the target category of the target image according to the radial basis distance.
According to the image classification method of the embodiments of the present disclosure, the class of the target image can be determined from the radial basis distance between the target feature and the class center feature of each class, which improves the distribution of the target feature and the class center features, reduces distribution randomness, enhances the aggregation of same-class features, and improves the classification accuracy for the target image.
In a possible implementation manner, determining, according to a preset first parameter, a radial basis distance between the target feature and a class center feature of each class of image includes: determining Euclidean distances between the target feature and class center features of various classes; and determining the radial basis distance between the target feature and the class center features of each class according to the Euclidean distance and the first parameter.
In one possible implementation, the method further includes: respectively extracting the features of each image in the image set through a target neural network to respectively obtain the feature information of each image; and determining the class center feature of each class in the feature information of the images with the same class.
In one possible implementation manner, determining class center features of each class in feature information of images with the same class includes: determining a central image in each category of images; and determining the feature information corresponding to the center image of each category as the class center feature of each category.
In one possible implementation manner, determining class center features of each class in feature information of images with the same class includes: and clustering the characteristic information of each image to obtain the class center characteristics of each class.
In one possible implementation manner, determining class center features of each class in feature information of images with the same class includes: and respectively carrying out weighted average processing on the characteristic information of each type of image to obtain the class center characteristic of each type.
According to an aspect of the present disclosure, there is provided a neural network training method, including: performing feature extraction processing on a first sample image through a neural network to obtain first features of the first sample image; determining radial basis distances of class center features respectively corresponding to the first features and sample images of various classes in a training sample set according to preset first parameters; determining the network loss of the neural network according to a preset second parameter and the radial basis distance; and training the neural network according to the network loss, and obtaining a target neural network after the training is finished.
According to the neural network training method of the embodiments of the present disclosure, the network loss can be determined from the radial basis distances and used to optimize the neural network, which improves the distribution of the features extracted by the optimized neural network, the training effect of the neural network, and the classification accuracy of the neural network.
In a possible implementation manner, determining a network loss of the neural network according to a preset second parameter and the radial basis distance includes: determining a first radial basis distance between the first feature and the class center feature of the class to which the first sample image belongs according to the annotated class of the first sample image; determining the classification probability of the first sample image according to the first radial basis distance, the second parameter, and the radial basis distance between the first feature and the class center feature of each class; and determining the network loss of the neural network according to the classification probability.
In this way, the first parameter and the radial basis distance add a distance constraint to the first features and second features extracted by the neural network, so that the distance between a first feature and the class center feature of its own class does not become too large: randomness is reduced, outliers are reduced, the distances between features are stable, the training difficulty is reduced, and the training effect is improved. In addition, because of the distance constraint, the relative difference between the radial basis distances of different-class features and those of same-class features in the middle and later stages of training is reduced, so the features of same-class images can be optimized further without additional training and optimization. Determining the classification probability through the second parameter and the radial basis distance, and then the network loss through the classification probability, keeps the network loss on same-class features from becoming too small in the middle and later stages of training, so the features of same-class images can be optimized further and the training effect is improved.
In one possible implementation, the method further includes: and determining the second parameter according to the number of the types of the plurality of sample images in the sample image set.
According to an aspect of the present disclosure, there is provided an image classification apparatus including: a first extraction module, configured to perform feature extraction processing on a target image through a target neural network to obtain a target feature of the target image; a first determining module, configured to determine, according to a preset first parameter, the radial basis distance between the target feature and the class center feature of each class of image in an image set, where the radial basis distance is used to represent the classification probability that the target feature belongs to each class, and the image set includes images of at least one class; and a second determining module, configured to determine the target class of the target image according to the radial basis distance.
In one possible implementation manner, the first determining module is further configured to: determining Euclidean distances between the target feature and class center features of various classes; and determining the radial basis distance between the target feature and the class center features of each class according to the Euclidean distance and the first parameter.
In one possible implementation, the apparatus further includes: the second extraction module is used for respectively extracting and processing the features of each image in the image set through the target neural network to respectively obtain the feature information of each image; and the third determining module is used for determining the class center features of all the classes in the feature information of the images with the same class.
In one possible implementation manner, the third determining module is further configured to: determining a central image in each category of images; and determining the feature information corresponding to the center image of each category as the class center feature of each category.
In one possible implementation manner, the third determining module is further configured to: and clustering the characteristic information of each image to obtain the class center characteristics of each class.
In one possible implementation manner, the third determining module is further configured to: and respectively carrying out weighted average processing on the characteristic information of each type of image to obtain the class center characteristic of each type.
According to an aspect of the present disclosure, there is provided a neural network training apparatus including: a third extraction module, configured to perform feature extraction processing on a first sample image through a neural network to obtain a first feature of the first sample image; a fourth determining module, configured to determine, according to a preset first parameter, the radial basis distances between the first feature and the class center features respectively corresponding to each class of sample image in the training sample set; a fifth determining module, configured to determine a network loss of the neural network according to a preset second parameter and the radial basis distances; and a training module, configured to train the neural network according to the network loss and obtain a target neural network after training is completed.
In one possible implementation manner, the fifth determining module is further configured to: determine a first radial basis distance between the first feature and the class center feature of the class to which the first sample image belongs according to the annotated class of the first sample image; determine the classification probability of the first sample image according to the first radial basis distance, the second parameter, and the radial basis distance between the first feature and the class center feature of each class; and determine the network loss of the neural network according to the classification probability.
In one possible implementation, the apparatus further includes: a sixth determining module, configured to determine the second parameter according to the number of categories of the plurality of sample images in the sample image set.
According to an aspect of the present disclosure, there is provided an electronic device including: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to: the above method is performed.
According to an aspect of the present disclosure, there is provided a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the above-described method.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure.
FIG. 1 shows a flow diagram of an image classification method according to an embodiment of the present disclosure;
FIG. 2 shows a flow diagram of a neural network training method in accordance with an embodiment of the present disclosure;
FIG. 3 shows a schematic diagram of a feature space at an early stage of training according to an embodiment of the present disclosure;
FIG. 4 shows a schematic diagram of a feature space in the middle and later stages of training according to an embodiment of the present disclosure;
FIG. 5 illustrates a graph of Euclidean distance versus radial basis distance in accordance with an embodiment of the disclosure;
FIG. 6 illustrates a graph of a first radial basis distance versus classification probability in accordance with an embodiment of the present disclosure;
FIG. 7 illustrates an application diagram of a neural network training method in accordance with an embodiment of the present disclosure;
fig. 8 shows a block diagram of an image classification apparatus according to an embodiment of the present disclosure;
FIG. 9 shows a block diagram of a neural network training device, in accordance with an embodiment of the present disclosure;
FIG. 10 shows a block diagram of an electronic device according to an embodiment of the present disclosure;
fig. 11 illustrates a block diagram of an electronic device in accordance with an embodiment of the disclosure.
Detailed Description
Various exemplary embodiments, features and aspects of the present disclosure will be described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers can indicate functionally identical or similar elements. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" is used exclusively herein to mean "serving as an example, embodiment, or illustration. Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
The term "and/or" herein is merely an association describing an associated object, meaning that three relationships may exist, e.g., a and/or B, may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the term "at least one" herein means any one of a plurality or any combination of at least two of a plurality, for example, including at least one of A, B, C, and may mean including any one or more elements selected from the group consisting of A, B and C.
Furthermore, in the following detailed description, numerous specific details are set forth in order to provide a better understanding of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, elements and circuits that are well known to those skilled in the art have not been described in detail so as not to obscure the present disclosure.
Fig. 1 shows a flowchart of an image classification method according to an embodiment of the present disclosure, as shown in fig. 1, the method comprising:
in step S11, performing feature extraction processing on a target image through a target neural network to obtain a target feature of the target image;
in step S12, according to a preset first parameter, determining a radial basis distance between the target feature and a class center feature of each class of image in an image set, where the radial basis distance is used to represent a classification probability that the target feature belongs to each class, and the image set includes at least one class of image;
in step S13, the category of the target image is determined according to the radial basis distance.
According to the image classification method of the embodiments of the present disclosure, the class of the target image can be determined from the radial basis distance between the target feature and the class center feature of each class, which improves the distribution of the target feature and the class center features, reduces distribution randomness, enhances the aggregation of same-class features, and improves the classification accuracy for the target image.
In one possible implementation, the image classification method may be performed by a terminal device or another processing device, where the terminal device may be user equipment (UE), a mobile device, a user terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, or the like, and the other processing device may be a server or a cloud server. In some possible implementations, the image classification method may be implemented by a processor calling computer-readable instructions stored in a memory.
In one possible implementation manner, the image classification method may be used in a process of classifying the portrait, for example, in a process of classifying the captured portrait image by using an access control device, a security device, or the like. For example, the target features of the target image may be extracted by a neural network and classified according to the target features. When a target image is classified, at least one class of class center features is generally known, the target features of the image are extracted through a target neural network, and then the class of the target image is determined through the feature similarity between the features of the target image and the class center features of each class. The target neural network can be a deep learning neural network such as a convolutional neural network, and the network structure of the target neural network is not limited by the disclosure.
In one possible implementation manner, an image set (e.g., a sample image set) including a plurality of reference images may be stored in the access control device or the security device, and may be used for comparison with a captured target image. The reference images in the image set can also be subjected to feature extraction by a neural network, and the extracted features are grouped into a plurality of clusters in a feature space according to the categories of the reference images, wherein each cluster is a category of reference images. For example, if the reference image is a portrait image, each cluster is a feature of the reference image of the same person, and further, each cluster may further include a class center feature of each class, for example, a feature extracted from a certificate photo, or a class center feature obtained by weighted averaging of features of similar reference images.
In an example, the image set may include images of multiple categories, e.g., category A, category B, category C, and so on, each having a class center. The method further includes: performing feature extraction processing on each image in the image set through the target neural network to obtain the feature information of each image; and determining the class center feature of each class from the feature information of the images having the same class.
In one possible implementation, the image set may include a plurality of sample images, for example, may include a plurality of portrait images, and the plurality of images may be classified into a plurality of categories, for example, portrait images including a plurality of persons in the image set, and portrait images of each person may be classified into one category. And performing feature extraction processing on each sample image through the target neural network to obtain feature information of each sample image. The feature information may be a feature vector, and the data type of the feature information is not limited by the present disclosure.
In one possible implementation manner, determining class center features of each class in feature information of images with the same class includes: determining a central image in each category of images; and determining the feature information corresponding to the center image of each category as the class center feature of each category.
In an example, a representative image may be selected as the center image in each category (e.g., if the images in the image set are portrait images, then classification may be by identity of the target objects in the images, a certificate photograph or a front photograph of each target object may be selected as the center image for the category), and features of the center image extracted by the neural network may be determined as center-like features.
In one possible implementation manner, determining class center features of each class in feature information of images with the same class includes: and clustering the characteristic information of each image to obtain the class center characteristics of each class.
In an example, the class center feature of each class may be obtained by clustering or the like, for example, the class center feature of each class may be obtained by extracting feature information of each image respectively through a neural network and performing clustering processing on the feature information.
In one possible implementation manner, determining class center features of each class in feature information of images with the same class includes: and respectively carrying out weighted average processing on the characteristic information of each type of image to obtain the class center characteristic of each type.
In an example, the class center feature of each class may be obtained by means of weighted average, for example, the class center feature of each class may be obtained by extracting feature information of each image respectively through a neural network and performing weighted average processing on the feature information of each class.
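As an illustrative sketch (not part of the patent text), the weighted-average class center computation might look as follows in Python with PyTorch, where the function name, tensor shapes, and uniform default weights are assumptions:

    import torch

    def class_center_features(features, labels, num_classes, weights=None):
        # features: (N, D) feature information of each image, extracted by the target neural network
        # labels:   (N,) integer class id of each image
        # weights:  optional (N,) per-image weights; equal weights reduce to a plain average
        if weights is None:
            weights = torch.ones(features.size(0))
        centers = torch.zeros(num_classes, features.size(1))
        for c in range(num_classes):
            mask = labels == c
            w = weights[mask] / weights[mask].sum()   # normalize the weights within the class
            centers[c] = (w.unsqueeze(1) * features[mask]).sum(dim=0)
        return centers

With uniform weights this reduces to the per-class mean; non-uniform weights could, for example, emphasize a certificate photo over candid photos of the same person.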
In one possible implementation manner, after the class center feature of each class is obtained, the class of the image may be determined according to the feature similarity between the target feature of the target image and the class center features of the classes, for example, the euclidean distance between the target feature of the target image and the class center features of the classes may be determined, and the class to which the class center feature having the smallest euclidean distance with the features of the image belongs is determined as the class of the image. However, the use of the euclidean distance to determine the feature similarity may cause the distribution randomness of the feature information in the feature space to be large, the aggregation effect of the feature information of the same class is not good, so that it is difficult to determine the class of the target feature, and the use of the euclidean distance to determine the feature similarity is also not beneficial to the training of the neural network.
In one possible implementation, in step S11, the target neural network may be used to extract target features of a target image. And in step S12, the radial basis distance between the target feature and the class center feature of each class in the reference image is determined according to the preset first parameter. In an example, a first parameter γ may be preset, and the radial basis distance of the target feature from the class-center feature of each class is calculated using the first parameter γ.
In one possible implementation, step S12 may include: determining Euclidean distances between the target feature and class center features of various classes; and determining the radial basis distance between the target feature and the class center features of each class according to the Euclidean distance and the first parameter.
In one possible implementation, euclidean distances between the target feature and each class of central feature may be determined, in an example, the target feature and each class of central feature may be feature vectors, and the euclidean distances between the target feature and each class of central feature may be determined using the following equation (1):
$$d_{i,j} = \left\| \tilde{x}_i - \tilde{w}_j \right\|_2 \qquad (1)$$

where $\tilde{x}_i$ is the target feature, the target image belonging to the $i$-th class ($i$ a positive integer), $\tilde{w}_j$ is the class center feature of the $j$-th class ($j$ a positive integer), and $\|\cdot\|_2$ denotes the two-norm, so that $d_{i,j}$ is the Euclidean distance between the target feature and the class center feature of the $j$-th class. The Euclidean distance between the target feature and the class center feature of each class can be determined according to equation (1) above.

In a possible implementation, the radial basis distance between the target feature and the class center feature of each class may be determined from the Euclidean distance $d_{i,j}$ and the preset first parameter $\gamma$, for example according to the following equation (2):

$$K_{i,j} = \exp\left(-\frac{d_{i,j}}{\gamma}\right) \qquad (2)$$

where $K_{i,j}$ is the radial basis distance between the target feature and the class center feature of the $j$-th class. The exponential form, together with the preset first parameter $\gamma$, adds a distance constraint to the radial basis distance, so that its value range is $0 < K_{i,j} \le 1$: as the Euclidean distance $d_{i,j}$ tends to infinity, $K_{i,j}$ tends to 0, and when $d_{i,j} = 0$, $K_{i,j} = 1$. The radial basis distance between the target feature and the class center feature of each class can be determined according to equation (2) above.
In one possible implementation, in step S13, the class of the target image is determined according to the radial basis distance between the target feature and the class center feature of each class. For example, the class center feature with the largest radial basis distance from the target feature (i.e., the smallest Euclidean distance, since $K_{i,j}$ increases as $d_{i,j}$ decreases) may be determined, and the class to which that class center feature belongs determined as the class of the target image. For example, if the images in the image set and the target image are all portrait images, the images in the image set may be divided into multiple classes by identity; if the class of the class center feature most similar to the target feature is identity A, the identity of the target object in the target image may be determined to be identity A. The present disclosure does not limit the type of the target image or the manner of classifying the images in the image set.
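A minimal sketch of steps S12 and S13, assuming the target feature has already been extracted in step S11 and assuming the exponential form of equation (2); the function name and the value of gamma are illustrative:

    import torch

    def classify(target_feature, centers, gamma=1.5):
        # target_feature: (D,) target feature of the target image
        # centers:        (C, D) class center features of each class in the image set
        d = torch.norm(centers - target_feature, dim=1)  # Euclidean distances, equation (1)
        k = torch.exp(-d / gamma)                        # radial basis distances, equation (2)
        return int(k.argmax())                           # class whose center is most similar (largest K)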
In one possible implementation, the neural network may be trained prior to using the neural network for the feature extraction process. In an example, a loss function of the neural network may be determined by the radial base distance to enhance the training effect.
Fig. 2 shows a flow chart of a neural network training method according to an embodiment of the present disclosure, as shown in fig. 2, the method including:
in step S21, performing feature extraction processing on a first sample image through a neural network to obtain a first feature of the first sample image;
in step S22, according to a preset first parameter, determining radial basis distances of class center features corresponding to the first feature and each class of sample images in the training sample set, respectively;
in step S23, determining a network loss of the neural network according to a preset second parameter and the radial basis distance;
in step S24, the neural network is trained according to the network loss, and a target neural network is obtained after the training is completed.
According to the neural network training method of the embodiments of the present disclosure, the network loss can be determined from the radial basis distances and used to optimize the neural network, which improves the distribution of the features extracted by the optimized neural network, the training effect of the neural network, and the classification accuracy of the neural network.
In one possible implementation, during training of the neural network, the network loss may likewise be determined according to the Euclidean distance. However, in the initial stage of training, when the network loss is large, the network parameters are not yet accurate, the features extracted from the images have large errors and strong randomness, and the features of each class are randomly distributed in the feature space; at this point the Euclidean distance between features cannot accurately represent the feature similarity between classes.
Fig. 3 is a schematic diagram of the feature space at an early stage of training according to an embodiment of the present disclosure. As shown in fig. 3, the unit of the feature space is a unit distance. Because of the high randomness and the absence of a distance constraint between features, the Euclidean distances between features of images of the same class (for example, between the class center feature of a class and the feature of any image of that class) may be large, with many outliers, which makes training difficult: it is hard to reduce the Euclidean distances between same-class features. On the other hand, due to the strong randomness, the Euclidean distance between features of different classes may be small, which also increases the training difficulty. In other words, the Euclidean distances between features are unstable. This may further cause the class center feature to drift during training; for example, in the feature space, the class center feature taken from the certificate photo of a target object may not lie at the center of the region occupied by all features of that class, or may deviate substantially from the weighted average of all features of that class.
In a possible implementation manner, in a stage where the network loss of the neural network is small in the middle and later stages of training, the network parameters are optimized, the error of the extracted features of the image is small, and the features of the images of the classes can be distributed in the feature space in a concentrated manner, for example, in a cluster manner.
Fig. 4 shows a schematic diagram of the feature space in the middle and later stages of training according to an embodiment of the present disclosure. As shown in fig. 4, the unit of the feature space is a unit distance. In the middle and later stages of training, the features of same-class images are distributed compactly, i.e., the Euclidean distance between features of the same class is much smaller than the Euclidean distance between features of different classes, which may cause the network loss to become very small or even vanish.
In summary, determining the network loss by the Euclidean distance between features may result in a poor training effect for the neural network. The network loss may instead be determined, for example, from the radial basis distance between features.
In one possible implementation, the neural network may be trained by a sample image set including sample images of at least one class. The sample image set may include a plurality of sample images, for example, may include a plurality of portrait images, and the plurality of sample images may be classified into a plurality of categories, for example, the sample image set may include portrait images of a plurality of persons, and the portrait images of each person may be classified into one category. The second feature of each sample image can be obtained by performing feature extraction processing on each sample image through a neural network. The second feature may be a feature vector, and the present disclosure does not limit the data type of the second feature.
In one possible implementation, the class-center feature of each class may be determined from the second feature of each sample image. In an example, the second features of each sample image may be clustered to obtain class-centered features for each class. Alternatively, the sample images of each category may have an annotation category, that is, given the category of each sample image, the second features of the sample images of the same category may be weighted and averaged to obtain the class center feature. Alternatively, a representative sample image may be selected as the class center in the category (for example, if the sample images in the sample image set are portrait images, a certificate photograph or a front photograph of each target object may be selected as the class center of the category), and the feature of the class center extracted by the neural network may be determined as the class center feature.
In one possible implementation manner, in step S21, the category of the first sample image is one of the categories of the sample images in the sample image set. For example, if the first sample image is a portrait image, the identity of the target object in the first sample image may be consistent with the identity of the target object in at least one sample image in the set of sample images. The first feature of the first sample image can be obtained by performing feature extraction processing on the first sample image through a neural network. The data type of the first feature is consistent with that of the second feature, for example, the first feature and the second feature are feature vectors with the same dimension, so as to calculate the feature similarity between the first feature and the second feature. The present disclosure does not limit the data type of the first feature and the second feature.
In one possible implementation, in step S22, the radial basis distance between the first feature of the first sample image and the class center feature of each class of sample image in the sample image set may be determined. The Euclidean distance $d_{i,j}$ between the first feature and the class center feature of each class may first be determined by equation (1) above, and the radial basis distance $K_{i,j}$ between the first feature and the class center feature of each class may then be determined from the Euclidean distance, for example according to equation (2) above.
Fig. 5 shows a graph of the Euclidean distance versus the radial basis distance according to an embodiment of the disclosure, where both are measured in unit distances of the feature space; the curves correspond to first-parameter values $\gamma$ = 0.8, 1.2, 1.6, 2.0, and 2.4. As the Euclidean distance $d_{i,j}$ increases, the radial basis distance $K_{i,j}$ decreases, and the first parameter $\gamma$ determines the speed of the decrease: the larger $\gamma$ is, the slower the radial basis distance decays. The value of the first parameter $\gamma$ may be preset, for example, to a value between 1 and 2.
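For illustration, assuming the exponential form reconstructed in equation (2): at $d_{i,j} = 1$, $\gamma = 1.2$ gives $K = e^{-1/1.2} \approx 0.43$ while $\gamma = 2.4$ gives $K = e^{-1/2.4} \approx 0.66$; at $d_{i,j} = 2$ the same parameters give $K \approx 0.19$ and $K \approx 0.43$ respectively, showing the slower decay at the larger $\gamma$.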
In this way, the first parameter and the radial basis distance add a distance constraint to the first features and second features extracted by the neural network, so that the distance between a first feature and the class center feature of its own class does not become too large: randomness is reduced, outliers are reduced, and the distances between features are stable, which helps reduce the training difficulty and improves the training effect. In addition, because of the distance constraint, the relative difference between the radial basis distances of different-class features and those of same-class features in the middle and later stages of training is reduced, so the features of same-class images can be optimized further without additional training and optimization, and the training effect is improved.
In one possible implementation, in step S23, the network loss of the neural network may be determined according to the radial basis distance and a preset second parameter s. In an example, the second parameter s may be preset, and the network loss of the neural network may be determined using the second parameter s and the radial basis distances obtained by the above method.
In one possible implementation, step S23 may include: determining a first radial base distance between the first feature and a class center feature of a class to which the first sample image belongs according to the labeling class of the first sample image; determining the classification probability of the first sample image according to the first radial basis distance, the second parameter and the radial basis distance between the first feature and the class center feature of each class; and determining the network loss of the neural network according to the classification probability.
In a possible implementation, the first sample image may have an annotated class, i.e., the (accurate) class of the first sample image may be labeled, and the class center feature of that class may then be identified among the class center features of all classes. The radial basis distance between the first feature of the first sample image and the class center feature of the class to which the first sample image belongs is the first radial basis distance $K_{i,i}$, i.e., the radial basis distance $K_{i,j}$ with $i = j$.
In one possible implementation, the classification probability of the first sample image may be calculated, for example, the classification probability may be calculated using the first radial basis distance, the radial basis distance between the first feature and the class center feature of each class, and the preset second parameter s.
In an example, the classification probability can be calculated according to the following equation (3):
$$P_{i,y_i} = \frac{e^{\,s K_{i,i}}}{\sum_{j=1}^{C} e^{\,s K_{i,j}}} \qquad (3)$$

where $y_i$ denotes the labeled (i.e., $i$-th) class of the first sample image, $K_{i,i}$ is the first radial basis distance, and $C$ is the number of classes, e.g., the number of classes of the plurality of sample images in the sample image set.
Fig. 6 shows a graph of the first radial basis distance versus the classification probability according to an embodiment of the present disclosure, where the unit of the first radial basis distance is a unit distance in the feature space; the curves correspond to second-parameter values s = 8, 16, 32, 64, and 128. As the first radial basis distance $K_{i,i}$ increases, the classification probability $P_{i,y_i}$ increases, and the second parameter s determines the speed of the increase: the larger s is, the faster the classification probability grows. Since the first radial basis distance satisfies $0 < K_{i,i} \le 1$, a suitable second parameter s must be preset so that, as $K_{i,i}$ varies between 0 and 1, the classification probability can cover the range 0 to 1. For example, as shown in fig. 6, when the second parameter s is 8, the classification probability cannot cover 0 to 1 over the range of the first radial basis distance, so setting s to 8 is not appropriate; other values, for example 16 or 32, may be set.
In a possible implementation, the second parameter s may also be derived from a calculation, and the method further includes: determining the second parameter according to a number of categories of at least one sample image in the sample image set. In an example, the second parameter may be determined according to the following equation (4):
$$s = 2\ln(C-1) \qquad (4)$$
wherein C is the number of categories.
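One way to motivate equation (4) (an interpretation, not stated explicitly in the text): if the first radial basis distance sits at the midpoint of its range, $K_{i,i} = 1/2$, while all other radial basis distances are near 0, then requiring the classification probability at that point to be $1/2$ recovers the stated value:

$$\frac{e^{s/2}}{e^{s/2} + (C-1)\,e^{0}} = \frac{1}{2} \iff e^{s/2} = C-1 \iff s = 2\ln(C-1)$$

This choice also scales $s$ with the number of classes, consistent with the requirement above that the classification probability cover 0 to 1 as $K_{i,i}$ varies.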
In one possible implementation, the network loss of the neural network, e.g., the radial basis index normalized cross entropy loss, may be determined from the classification probability that the first sample image belongs to its labeled class.
In an example, the network loss of the neural network may be determined according to the following equation (5):
$$L_{RBF} = -\log P_{i,y_i} \qquad (5)$$

i.e., the radial basis index normalized cross-entropy loss $L_{RBF}$ is the negative logarithm of the classification probability $P_{i,y_i}$. The base of the logarithm may be an arbitrarily set positive number, e.g., 10; the present disclosure does not limit the base of the logarithm.
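A compact sketch of equations (1) through (5) as a loss function in Python with PyTorch; a hedged illustration rather than the patent's implementation, noting that F.cross_entropy applies the softmax of equation (3) and the negative (natural) logarithm of equation (5) in a single call, and that the function and parameter names are assumptions:

    import math
    import torch
    import torch.nn.functional as F

    def rbf_loss(first_features, centers, labels, gamma=1.5, s=None):
        # first_features: (N, D) first features of the first sample images
        # centers:        (C, D) class center features of each class
        # labels:         (N,) annotated class of each first sample image
        C = centers.size(0)
        if s is None:
            s = 2 * math.log(C - 1)                  # second parameter, equation (4)
        d = torch.cdist(first_features, centers)     # Euclidean distances, equation (1)
        k = torch.exp(-d / gamma)                    # radial basis distances, equation (2)
        # softmax over s*k yields the classification probabilities of equation (3);
        # the negative log over the labeled class is the network loss of equation (5)
        return F.cross_entropy(s * k, labels)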
By the method, the classification probability is determined through the second parameter and the radial basis distance, and then the network loss is determined through the classification probability, so that the relative difference between the radial basis distance between different types of features in the middle and later periods of training and the radial basis distance between the same type of features is reduced, the network loss of the same type of features cannot be too small, the features of the same type of images can be further optimized, and the training effect is improved.
In one possible implementation, in step S24, the neural network may be trained according to the network loss. In an example, network parameters of the neural network may be adjusted in a direction that minimizes network loss, e.g., the network loss may be back-propagated using a gradient descent method to adjust the network parameters of the neural network. And when the training condition is met, the trained target neural network is obtained. The training condition may be an adjustment number, and the network parameter of the neural network may be adjusted a predetermined number of times. For example, the training condition may be the magnitude or convergence of the network loss, and when the network loss is reduced to a certain degree or falls within a certain threshold, the adjustment may be stopped to obtain a trained target neural network, and the trained target neural network may be used in the classification processing.
According to the neural network training method disclosed by the embodiment of the disclosure, distance constraints can be added to the first feature and the second feature extracted by the neural network through the first parameter and the radial basis distance, so that the radial basis distance between the first feature and the class center features of the same class cannot be too large, the randomness is reduced, the outliers are reduced, the distance between the features is stable, the training difficulty is favorably reduced, the distribution effect of the optimized features extracted by the neural network is improved, and the training effect is improved. And because the distance constraint is added, the relative difference between the radial basis distance between different types of features in the middle and later periods of training and the radial basis distance between similar features is reduced, the network loss of the similar features cannot be too small, the features of the similar images can be further optimized, the features of the similar images do not need to be trained additionally and optimized, the training effect is improved, and the classification accuracy of the neural network is improved.
Fig. 7 is a schematic diagram illustrating an application of a neural network training method according to an embodiment of the present disclosure, in the training process, sample images in a sample image set may be firstly classified, and a class center feature of each class may be obtained. The sample image set may include a plurality of sample images, for example, may include a plurality of portrait images, and the portrait images of each person may be classified into one category. The method comprises the steps of respectively carrying out feature extraction processing on each sample image through a neural network, respectively obtaining second features of each sample image, and determining class center features of each class in the second features of each class, for example, the sample images are portrait images, a certificate photo or a front photo of each target object can be selected as the class center of each class, and the second features of the class center are determined as the class center features.
In a possible implementation manner, feature extraction may be performed on a first sample image through a neural network to obtain first features of the first sample image, euclidean distances between the first features and various types of central features may be determined according to formula (1), further, a first parameter γ may be preset, and radial basis distances between the first features and various types of central features may be determined according to formula (2).
In one possible implementation, the second parameter may be calculated according to formula (4), and the classification probability of the first sample image may be calculated according to formula (3) according to the radial basis distance between the first feature and each type of central feature, and further, the network loss of the neural network may be calculated according to formula (5).
In one possible implementation, the network parameters of the neural network may be adjusted according to the network loss, and when the training condition is satisfied, the trained neural network is obtained. The performance of the target neural network trained by the method in the classification processing is better.
In fig. 7, diagram f illustrates the distribution in the feature space of the features obtained by extracting features from the sample images in the sample image set with the target neural network trained by the above method, while diagrams a, b, c, d, and e illustrate the distributions obtained with neural networks trained by other methods. As shown, the features of the same class in diagram f are distributed more compactly and lie closer to their class center feature, the classification effect is more pronounced, and the classification accuracy in diagram f (99.20%) is higher than the accuracies of the neural networks trained in the other ways (98.97%, 99.04%, 99.05%, 98.93%, and 98.74%).
In a possible implementation manner, the neural network training method can be used for classifying a large number of images and carrying out anomaly detection on industrial production, can also be used for image-based search engines, photo album classification and the like, or can help methods such as target detection, semantic segmentation and the like to improve performance. The application field of the neural network training method is not limited by the disclosure.
Fig. 8 illustrates a block diagram of an image classification apparatus according to an embodiment of the present disclosure, which includes, as illustrated in fig. 8:
the first extraction module 11 is configured to perform feature extraction processing on a target image through a target neural network to obtain a target feature of the target image; a first determining module 12, configured to determine, according to a preset first parameter, a radial basis distance between the target feature and a class center feature of each class of image in an image set, where the radial basis distance is used to represent a classification probability that the target feature belongs to each class respectively, and the image set includes at least one class of image; and a second determining module 13, configured to determine a target category of the target image according to the radial base distance.
In one possible implementation manner, the first determining module is further configured to: determining Euclidean distances between the target feature and class center features of various classes; and determining the radial basis distance between the target feature and the class center features of each class according to the Euclidean distance and the first parameter.
In one possible implementation, the apparatus further includes: the second extraction module is used for respectively extracting and processing the features of each image in the image set through the target neural network to respectively obtain the feature information of each image; and the third determining module is used for determining the class center features of all the classes in the feature information of the images with the same class.
In one possible implementation manner, the third determining module is further configured to: determining a central image in each category of images; and determining the feature information corresponding to the center image of each category as the class center feature of each category.
In one possible implementation manner, the third determining module is further configured to: clustering the feature information of the images to obtain the class center feature of each class.
In one possible implementation manner, the third determining module is further configured to: performing weighted average processing on the feature information of each class of images respectively to obtain the class center feature of each class.
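Of the three variants above, the weighted-average one is the most direct to sketch; the following hypothetical helper reduces to a per-class mean when the weights are uniform (the signature and weighting scheme are assumptions, not the patent's text):

```python
import torch

def class_center_features(feature_info, labels, num_classes, weights=None):
    # feature_info: (N, D) feature information of the images in the image set.
    if weights is None:
        weights = torch.ones(feature_info.size(0))     # uniform weights -> plain mean
    centers = torch.zeros(num_classes, feature_info.size(1))
    for k in range(num_classes):
        mask = labels == k                             # images labeled with class k
        w = weights[mask]
        centers[k] = (feature_info[mask] * w.unsqueeze(1)).sum(dim=0) / w.sum()
    return centers
```

The center-image and clustering variants described above would replace this computation while producing centers of the same shape.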
Fig. 9 shows a block diagram of a neural network training apparatus according to an embodiment of the present disclosure. As shown in fig. 9, the apparatus includes: a third extraction module 21, configured to perform feature extraction processing on a first sample image through a neural network to obtain a first feature of the first sample image; a fourth determining module 22, configured to determine, according to a preset first parameter, radial basis distances between the first feature and the class center features respectively corresponding to each class of sample images in the training sample set; a fifth determining module 23, configured to determine a network loss of the neural network according to a preset second parameter and the radial basis distances; and a training module 24, configured to train the neural network according to the network loss and obtain a target neural network after the training is completed.
In one possible implementation manner, the fifth determining module is further configured to: determining a first radial basis distance between the first feature and the class center feature of the class to which the first sample image belongs according to the labeled class of the first sample image; determining the classification probability of the first sample image according to the first radial basis distance, the second parameter, and the radial basis distances between the first feature and the class center feature of each class; and determining the network loss of the neural network according to the classification probability.
In one possible implementation, the apparatus further includes: a sixth determining module, configured to determine the second parameter according to the number of categories of the plurality of sample images in the sample image set.
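A hypothetical training step tying modules 21-24 together, reusing the rbf_classification_loss helper sketched earlier; deriving lam from the number of classes is one plausible reading of the sixth determining module, and the network, optimizer, and data loader are assumptions of this sketch:

```python
import torch

# network, loader, centers, sigma, and num_classes are assumed to be set up as
# in the sketches above; rbf_classification_loss is the helper sketched earlier.
optimizer = torch.optim.SGD(network.parameters(), lr=0.01)
lam = 1.0 / num_classes   # second parameter from the number of classes (assumption)
for images, labels in loader:
    first_features = network(images)     # third extraction module: first features
    loss = rbf_classification_loss(first_features, labels, centers, sigma, lam)
    optimizer.zero_grad()
    loss.backward()                      # train the network according to the loss
    optimizer.step()
```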
It is understood that the above-mentioned method embodiments of the present disclosure can be combined with each other to form combined embodiments without departing from the principle and logic; owing to space limitations, the details are not repeated in the present disclosure.
In addition, the present disclosure also provides an image classification apparatus, an electronic device, a computer-readable storage medium, and a program, each of which can be used to implement any one of the image classification methods provided by the present disclosure; for the corresponding technical solutions and descriptions, refer to the corresponding descriptions in the method section, which are not repeated here.
It will be understood by those skilled in the art that, in the methods of the present disclosure, the order in which the steps are written does not imply a strict execution order or impose any limitation on the implementation; the specific execution order of the steps should be determined by their functions and possible inherent logic.
In some embodiments, the functions of, or the modules included in, the apparatus provided in the embodiments of the present disclosure may be used to execute the methods described in the above method embodiments; for specific implementations, reference may be made to the descriptions of the above method embodiments, which, for brevity, are not repeated here.
Embodiments of the present disclosure also provide a computer-readable storage medium having stored thereon computer program instructions, which when executed by a processor, implement the above-mentioned method. The computer readable storage medium may be a non-volatile computer readable storage medium.
An embodiment of the present disclosure further provides an electronic device, including: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to perform the above method.
The electronic device may be provided as a terminal, server, or other form of device.
Fig. 10 is a block diagram illustrating an electronic device 800 in accordance with an example embodiment. For example, the electronic device 800 may be a terminal such as a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, or a personal digital assistant.
Referring to fig. 10, electronic device 800 may include one or more of the following components: processing component 802, memory 804, power component 806, multimedia component 808, audio component 810, input/output (I/O) interface 812, sensor component 814, and communication component 816.
The processing component 802 generally controls overall operation of the electronic device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing components 802 may include one or more processors 820 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operations at the electronic device 800. Examples of such data include instructions for any application or method operating on the electronic device 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The power supply component 806 provides power to the various components of the electronic device 800. The power components 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the electronic device 800.
The multimedia component 808 includes a screen that provides an output interface between the electronic device 800 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensors may not only sense the boundary of a touch or swipe action, but also detect the duration and pressure associated with the touch or swipe operation. In some embodiments, the multimedia component 808 includes a front camera and/or a rear camera. The front camera and/or the rear camera may receive external multimedia data when the electronic device 800 is in an operation mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have focusing and optical zoom capabilities.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the electronic device 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 also includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor assembly 814 includes one or more sensors for providing various aspects of state assessment for the electronic device 800. For example, the sensor assembly 814 may detect an open/closed state of the electronic device 800 and the relative positioning of components, such as the display and keypad of the electronic device 800; the sensor assembly 814 may also detect a change in the position of the electronic device 800 or of a component of the electronic device 800, the presence or absence of user contact with the electronic device 800, the orientation or acceleration/deceleration of the electronic device 800, and a change in the temperature of the electronic device 800. The sensor assembly 814 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor assembly 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate wired or wireless communication between the electronic device 800 and other devices. The electronic device 800 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the electronic device 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer-readable storage medium, such as the memory 804, is also provided that includes computer program instructions executable by the processor 820 of the electronic device 800 to perform the above-described methods.
The embodiments of the present disclosure also provide a computer program product, which includes computer-readable code; when the computer-readable code runs on a device, a processor in the device executes instructions for implementing the image classification method or the neural network training method provided in any one of the above embodiments.
The embodiments of the present disclosure also provide another computer program product for storing computer-readable instructions, which, when executed, cause a computer to perform the operations of the image classification method or the neural network training method provided in any one of the above embodiments.
The computer program product may be embodied in hardware, software, or a combination thereof. In an alternative embodiment, the computer program product is embodied in a computer storage medium; in another alternative embodiment, the computer program product is embodied in a software product, such as a Software Development Kit (SDK).
Fig. 11 is a block diagram illustrating an electronic device 1900 according to an example embodiment. For example, the electronic device 1900 may be provided as a server. Referring to fig. 11, electronic device 1900 includes a processing component 1922 further including one or more processors and memory resources, represented by memory 1932, for storing instructions, e.g., applications, executable by processing component 1922. The application programs stored in memory 1932 may include one or more modules that each correspond to a set of instructions. Further, the processing component 1922 is configured to execute instructions to perform the above-described method.
The electronic device 1900 may also include a power component 1926 configured to perform power management of the electronic device 1900, a wired or wireless network interface 1950 configured to connect the electronic device 1900 to a network, and an input/output (I/O) interface 1958. The electronic device 1900 may operate based on an operating system stored in the memory 1932, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
In an exemplary embodiment, a non-transitory computer readable storage medium, such as the memory 1932, is also provided that includes computer program instructions executable by the processing component 1922 of the electronic device 1900 to perform the above-described methods.
The present disclosure may be systems, methods, and/or computer program products. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied thereon for causing a processor to implement various aspects of the present disclosure.
The computer readable storage medium may be a tangible device that can hold and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or electrical signals transmitted through electrical wires.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
The computer program instructions for carrying out operations of the present disclosure may be assembler instructions, Instruction Set Architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk or C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as a programmable logic circuit, a Field Programmable Gate Array (FPGA), or a Programmable Logic Array (PLA), can be personalized by utilizing state information of the computer-readable program instructions, and the electronic circuitry can execute the computer-readable program instructions to implement aspects of the present disclosure.
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Having described embodiments of the present disclosure, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terms used herein were chosen in order to best explain the principles of the embodiments, the practical application, or technical improvements to the techniques in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (20)

1. An image classification method, comprising:
carrying out feature extraction processing on a target image through a target neural network to obtain target features of the target image;
determining a radial basis distance between the target feature and a class center feature of each class of image in an image set according to a preset first parameter, wherein the radial basis distance is used for representing classification probability that the target feature belongs to each class respectively, and the image set comprises at least one class of image;
and determining the target category of the target image according to the radial basis distance.
2. The method according to claim 1, wherein determining the radial basis distance between the target feature and the class center feature of each class of image according to a preset first parameter comprises:
determining Euclidean distances between the target feature and class center features of various classes;
and determining the radial basis distance between the target feature and the class center features of each class according to the Euclidean distance and the first parameter.
3. The method according to claim 1 or 2, characterized in that the method further comprises:
performing feature extraction processing on each image in the image set through the target neural network to respectively obtain the feature information of each image;
and determining the class center feature of each class from the feature information of images of the same class.
4. The method according to claim 3, wherein determining the class center feature of each class from the feature information of images of the same class comprises:
determining a central image in each category of images;
and determining the feature information corresponding to the center image of each category as the class center feature of each category.
5. The method according to claim 3, wherein determining the class center feature of each class from the feature information of images of the same class comprises:
and clustering the feature information of the images to obtain the class center feature of each class.
6. The method according to claim 3, wherein determining the class center feature of each class from the feature information of images of the same class comprises:
and respectively performing weighted average processing on the feature information of each class of images to obtain the class center feature of each class.
7. A neural network training method, comprising:
performing feature extraction processing on a first sample image through a neural network to obtain first features of the first sample image;
determining, according to a preset first parameter, radial basis distances between the first feature and the class center features respectively corresponding to each class of sample images in a training sample set;
determining the network loss of the neural network according to a preset second parameter and the radial basis distance;
and training the neural network according to the network loss, and obtaining a target neural network after the training is finished.
8. The method of claim 7, wherein determining the network loss of the neural network according to the preset second parameter and the radial basis distance comprises:
determining a first radial basis distance between the first feature and the class center feature of the class to which the first sample image belongs according to the labeled class of the first sample image;
determining the classification probability of the first sample image according to the first radial basis distance, the second parameter and the radial basis distance between the first feature and the class center feature of each class;
and determining the network loss of the neural network according to the classification probability.
9. The method of claim 7, further comprising:
and determining the second parameter according to the number of categories of the plurality of sample images in the sample image set.
10. An image classification apparatus, comprising:
the first extraction module is used for carrying out feature extraction processing on a target image through a target neural network to obtain target features of the target image;
the first determining module is used for determining radial basis distances between the target features and class center features of the images of all classes in the image set according to preset first parameters, wherein the radial basis distances are used for representing classification probabilities that the target features belong to all the classes respectively, and the image set comprises images of at least one class;
and the second determining module is used for determining the target category of the target image according to the radial basis distance.
11. The apparatus of claim 10, wherein the first determining module is further configured to:
determining Euclidean distances between the target feature and class center features of various classes;
and determining the radial basis distance between the target feature and the class center features of each class according to the Euclidean distance and the first parameter.
12. The apparatus of claim 10 or 11, further comprising:
the second extraction module is used for performing feature extraction processing on each image in the image set through the target neural network to respectively obtain the feature information of each image;
and the third determining module is used for determining the class center feature of each class from the feature information of images of the same class.
13. The apparatus of claim 12, wherein the third determining module is further configured to:
determining a central image in each category of images;
and determining the feature information corresponding to the center image of each category as the class center feature of each category.
14. The apparatus of claim 12, wherein the third determining module is further configured to:
and clustering the feature information of the images to obtain the class center feature of each class.
15. The apparatus of claim 12, wherein the third determining module is further configured to:
and respectively performing weighted average processing on the feature information of each class of images to obtain the class center feature of each class.
16. A neural network training device, comprising:
the third extraction module is used for carrying out feature extraction processing on the first sample image through a neural network to obtain first features of the first sample image;
the fourth determining module is used for determining, according to a preset first parameter, radial basis distances between the first feature and the class center features respectively corresponding to each class of sample images in the training sample set;
a fifth determining module, configured to determine a network loss of the neural network according to a preset second parameter and the radial basis distance;
and the training module is used for training the neural network according to the network loss and obtaining a target neural network after the training is finished.
17. The apparatus of claim 16, wherein the fifth determining module is further configured to:
determining a first radial basis distance between the first feature and the class center feature of the class to which the first sample image belongs according to the labeled class of the first sample image;
determining the classification probability of the first sample image according to the first radial basis distance, the second parameter and the radial basis distance between the first feature and the class center feature of each class;
and determining the network loss of the neural network according to the classification probability.
18. The apparatus of claim 16, further comprising:
a sixth determining module, configured to determine the second parameter according to the number of categories of the plurality of sample images in the sample image set.
19. An electronic device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the method of any one of claims 1 to 9.
20. A computer readable storage medium having computer program instructions stored thereon, which when executed by a processor implement the method of any one of claims 1 to 9.
CN202010054273.XA 2020-01-17 2020-01-17 Image classification and neural network training method, device, equipment and storage medium Active CN111259967B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010054273.XA CN111259967B (en) 2020-01-17 2020-01-17 Image classification and neural network training method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111259967A true CN111259967A (en) 2020-06-09
CN111259967B CN111259967B (en) 2024-03-08

Family

ID=70947114

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010054273.XA Active CN111259967B (en) 2020-01-17 2020-01-17 Image classification and neural network training method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111259967B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019127451A1 (en) * 2017-12-29 2019-07-04 深圳前海达闼云端智能科技有限公司 Image recognition method and cloud system
CN109063936A (en) * 2018-10-12 2018-12-21 南京千智电气科技有限公司 A kind of method and device for establishing wind power prediction model
CN109656737A (en) * 2018-10-31 2019-04-19 阿里巴巴集团控股有限公司 The statistical method and device of exception information
CN109829514A (en) * 2019-03-07 2019-05-31 西安电子科技大学 A kind of network inbreak detection method, device, computer equipment and storage medium
CN109978041A (en) * 2019-03-19 2019-07-05 上海理工大学 A kind of hyperspectral image classification method based on alternately update convolutional neural networks

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111860588A (en) * 2020-06-12 2020-10-30 华为技术有限公司 Training method for graph neural network and related equipment
WO2022042043A1 (en) * 2020-08-27 2022-03-03 京东方科技集团股份有限公司 Machine learning model training method and apparatus, and electronic device
CN112307934A (en) * 2020-10-27 2021-02-02 深圳市商汤科技有限公司 Image detection method, and training method, device, equipment and medium of related model
CN113722524A (en) * 2021-07-16 2021-11-30 上海通办信息服务有限公司 Method and device for classifying large number of images based on small number of image samples
CN113793325A (en) * 2021-09-22 2021-12-14 北京市商汤科技开发有限公司 Detection method, detection device, computer equipment and storage medium
WO2023045350A1 (en) * 2021-09-22 2023-03-30 上海商汤智能科技有限公司 Detection method and apparatus, computer device, storage medium, and program product
CN113793325B (en) * 2021-09-22 2024-05-24 北京市商汤科技开发有限公司 Detection method, detection device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN111259967B (en) 2024-03-08


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant