CN113762005B - Feature selection model training and object classification methods, devices, equipment and media - Google Patents

Feature selection model training and object classification methods, devices, equipment and media Download PDF

Info

Publication number
CN113762005B
CN113762005B CN202011242309.3A CN202011242309A CN113762005B CN 113762005 B CN113762005 B CN 113762005B CN 202011242309 A CN202011242309 A CN 202011242309A CN 113762005 B CN113762005 B CN 113762005B
Authority
CN
China
Prior art keywords
feature selection
training
selection model
feature
norm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011242309.3A
Other languages
Chinese (zh)
Other versions
CN113762005A (en
Inventor
祖辰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Wodong Tianjun Information Technology Co Ltd filed Critical Beijing Jingdong Century Trading Co Ltd
Priority to CN202011242309.3A priority Critical patent/CN113762005B/en
Publication of CN113762005A publication Critical patent/CN113762005A/en
Application granted granted Critical
Publication of CN113762005B publication Critical patent/CN113762005B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the invention discloses a training and object classification method, device, equipment and medium of a feature selection model. The training method of the feature selection model comprises the following steps: inputting a plurality of training samples and class labeling results respectively corresponding to each training sample into a feature selection model, wherein each training sample comprises a plurality of sample features, and the feature selection model is used for selecting target features from the sample features; and adjusting network parameters in an objective function determining module according to the objective function value output by the objective function determining module in the feature selection model, wherein the objective function determining module comprises a loss function measuring unit constructed based on an adaptive loss function, and the adaptive loss function comprises a first equation related to a first norm and a second equation related to a second norm. According to the technical scheme, the effect of high-accuracy feature selection is achieved through the self-adaptive loss function which can be suitable for data measurement under various data distribution.

Description

Feature selection model training and object classification methods, devices, equipment and media
Technical Field
The embodiment of the invention relates to the technical field of computers, in particular to a method, a device, equipment and a medium for training a feature selection model and classifying objects.
Background
Feature selection is an important component of many feature selection applications, particularly in many bioinformatics and computer vision applications, which require efficient and powerful feature selection techniques to extract meaningful features and remove noise and redundant features to avoid degrading the application performance of subsequent correlation algorithms.
The feature selection is a process of selecting a relevant feature subset, and is a key component for constructing a robust feature selection model for numerous applications such as classification, regression, clustering and the like, because the process can accelerate the learning process of the model, improve the generalization capability of the model and relieve the influence of dimension disasters on the model.
A number of feature selection techniques have been proposed by researchers and used in practical applications such as filtering feature selection techniques, wrapped feature selection techniques, embedded feature selection techniques, and the like. The embedded feature selection technology embeds the feature selection process into the training process of the feature selection model, and the feature selection is finished simultaneously with the completion of the training of the feature selection model.
In the process of realizing the invention, the inventor finds that the following technical problems exist in the prior art: the embedded feature selection technique often adopts a square loss function to measure the difference between the predicted value and the true value, which can amplify the loss value of an abnormal point in the data, i.e. the loss value is sensitive to the abnormal value in the data, which can have a great influence on the accuracy of feature selection.
Disclosure of Invention
The embodiment of the invention provides a training and object classification method, device, equipment and medium of a feature selection model, so as to realize the effect of high-accuracy feature selection under various data distribution.
In a first aspect, an embodiment of the present invention provides a training method for a feature selection model, which may include:
Inputting a plurality of training samples and class labeling results corresponding to each training sample into a feature selection model, wherein the training samples comprise samples related to biological feature processing, image feature processing, voice feature processing and/or text feature processing, each training sample comprises a plurality of sample features, and the feature selection model is used for selecting target features from the sample features;
And adjusting network parameters in the objective function determining module according to the objective function value output by the objective function determining module in the feature selection model, wherein the objective function determining module comprises a loss function measuring unit constructed based on an adaptive loss function, and the adaptive loss function comprises a first equation related to a first norm and a second equation related to a second norm.
In a second aspect, an embodiment of the present invention further provides an object classification method, which may include:
Acquiring characteristics of an object to be classified, and selecting target characteristics from the characteristics based on a characteristic selection model obtained by training according to the training method in any embodiment of the invention;
Classifying the object to be classified according to the target characteristics.
In a third aspect, an embodiment of the present invention further provides a training device for a feature selection model, which may include:
The data input module is used for inputting a plurality of training samples and class labeling results corresponding to each training sample into the feature selection model, wherein the training samples comprise samples related to biological feature processing, image feature processing, voice feature processing and/or text feature processing, each training sample comprises a plurality of sample features, and the feature selection model is used for selecting target features from the sample features;
And the model training module is used for adjusting network parameters in the objective function determining module according to the objective function value output by the objective function determining module in the feature selection model, wherein the objective function determining module comprises a loss function measuring unit constructed based on an adaptive loss function, and the adaptive loss function comprises a first form related to a first norm and a second form related to a second norm.
In a fourth aspect, an embodiment of the present invention further provides an object classification apparatus, which may include:
the feature selection module is used for acquiring the features of the object to be classified, and selecting target features from the features based on a feature selection model obtained by training according to the training method according to any embodiment of the invention;
and the object classification module is used for classifying the objects to be classified according to the target characteristics.
In a fifth aspect, an embodiment of the present invention further provides an electronic device, where the electronic device may include:
One or more processors;
a memory for storing one or more programs;
The one or more programs, when executed by the one or more processors, cause the one or more processors to implement the training method or the object classification method for the feature selection model provided by any embodiment of the present invention.
In a sixth aspect, an embodiment of the present invention further provides a computer readable storage medium, where a computer program is stored, where the computer program when executed by a processor implements the training method or the object classification method of the feature selection model provided in any embodiment of the present invention.
According to the technical scheme, training samples comprising a plurality of sample characteristics and class labeling results corresponding to each training sample are input into a characteristic selection model, wherein the training samples can comprise samples related to biological characteristic processing, image characteristic processing, voice characteristic processing and/or text characteristic processing, and the characteristic selection model is a model for selecting target characteristics from all sample characteristics; because the loss function measurement unit in the feature selection model is constructed based on the self-adaptive loss function which can be adapted to data measurement under various data distribution, the loss value of an abnormal point in the data can not be amplified, and therefore, the network parameters can be accurately adjusted based on the output result of the objective function determination module where the loss function measurement unit is positioned. According to the technical scheme, the self-adaptive loss function of the data measurement under various data distribution can be adapted, the adjustment precision of network parameters under various conditions is guaranteed, the feature selection model obtained based on the accurate adjustment of the network parameters improves the selection precision of target features, and the robustness of feature selection is high.
Drawings
FIG. 1 is a flow chart of a training method of a feature selection model in a first embodiment of the invention;
Fig. 2a is a schematic diagram showing a comparison between a curve of an adaptive loss function and a curve of the remaining norms when σ=0.1 in a training method of a feature selection model according to the first embodiment of the present invention;
Fig. 2b is a schematic diagram showing a comparison between a curve of an adaptive loss function and a curve of the remaining norms when σ=1 in a training method of a feature selection model according to the first embodiment of the present invention;
fig. 2c is a schematic diagram showing a comparison between a curve of an adaptive loss function and a curve of the remaining norms when σ=10 in a training method of a feature selection model according to the first embodiment of the present invention;
FIG. 3 is a flow chart of a training method of a feature selection model in a second embodiment of the invention;
FIG. 4 is a flow chart of a method of classifying objects in accordance with a third embodiment of the invention;
FIG. 5 is a block diagram of a training device for a feature selection model in accordance with a fourth embodiment of the present invention;
FIG. 6 is a block diagram of an object classification apparatus according to a fifth embodiment of the present invention;
fig. 7 is a schematic structural diagram of an electronic device in a sixth embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting thereof. It should be further noted that, for convenience of description, only some, but not all of the structures related to the present invention are shown in the drawings.
Example 1
Fig. 1 is a flowchart of a training method of a feature selection model according to a first embodiment of the present invention. The present embodiment is applicable to a case where a target feature is selected from a plurality of sample features based on an adaptive loss function adaptable to various training samples in a model training process. The method can be performed by the training device of the feature selection model provided by the embodiment of the invention, the device can be realized by software and/or hardware, and the device can be integrated on electronic equipment, and the electronic equipment can be various user terminals or servers.
Referring to fig. 1, the method of the embodiment of the present invention specifically includes the following steps:
S110, inputting a plurality of training samples and class labeling results corresponding to each training sample into a feature selection model, wherein the training samples comprise samples related to biological feature processing, image feature processing, voice feature processing and/or text feature processing, each training sample comprises a plurality of sample features, and the feature selection model is used for selecting target features from the sample features.
Each training sample may be data obtained by extracting features of a training object corresponding to the training sample, so each training sample may include a plurality of sample features. By way of example, the training object may be biological data, image data, text data, speech data, etc., and the training samples may be samples associated with biological feature processing, image feature processing, speech feature processing, text feature processing, etc., and the feature extraction of the training object may be implemented by Scale-INVARIANT FEATURE TRANSFORM, SIFT, wavelet transform, directional gradient histogram (Histogram of Oriented Gradient, HOG), natural language processing (Neuro Linguistic Programming, NLP), fourier transform, etc.
In particular, in the field of biological feature processing, feature processing modes that may be involved include filtering, segmentation, artifact removal, independent component analysis, time domain analysis, frequency domain analysis, sequence alignment, and the like. Taking the processing of the genetic data as an example, after the genetic data of a certain object is obtained, the genetic data and pre-stored genetic data (hereinafter may be simply referred to as pre-stored data) may be compared based on a sequence comparison algorithm, and a plurality of genetic features may be obtained in the comparison process. Then, for example, when the pre-stored data is genetic data of various categories such as human genetic data, canine genetic data, feline genetic data, etc., the category of the subject to which the genetic data belongs, such as human, canine, feline, etc., can be judged based on the above genetic characteristics; further exemplary, when parent-child identification is performed according to the gene data, whether the object to which the gene data belongs and the object to which the pre-stored data belongs have parent-child relationships may be determined according to the above-described gene characteristics; etc., and are not particularly limited herein.
In the field of image feature processing, feature processing modes that may be involved include SIFT, wavelet transform, HOG, image segmentation, morphological analysis, and the like. Taking the fingerprint data processing as an example, after the fingerprint data of a certain user is obtained, the fingerprint data is usually a color image, and is converted into a gray image and normalized to obtain a normalized image; solving a horizontal gray level image and a vertical gray level image for the normalized image based on a Sobel operator; detecting the positions of singular points of fingerprints in the horizontal gray level image and the vertical gray level image by using a singular point detection algorithm based on Poincare index; and solving the direction field information by using a size-variable template, correcting the direction field information according to the consistency of the direction field, and finally smoothing and filtering the corrected direction field information by using a mean value filtering algorithm to obtain final direction field information, wherein the final direction field information can be used as fingerprint characteristics of fingerprint data. Besides, histogram equalization can be performed on the color image, and the structural tensor of the color image is calculated according to the result of the histogram equalization; determining a feature vector corresponding to the feature value of the structure tensor; processing the feature vector based on an arctangent formula to obtain a point direction field of each pixel point in the color image, wherein the point direction field can also be used as fingerprint features of fingerprint data; and the like, are not particularly limited herein. In a subsequent application, it may be determined from the fingerprint feature whether the fingerprint data belongs to the fingerprint data of the target object, whether the fingerprint data belongs to the target category, or the like.
In the field of text feature processing, feature processing methods that may be involved include word segmentation, text cleaning, normalization, NLP, and the like. Processing texture of the text data, processing the text data after obtaining certain text data to obtain a similarity matrix of text vectors, and obtaining an initial equivalence division threshold value of each text data by adopting each row element of the similarity matrix, so as to perform initial equivalence division on the text data, and further determining initial cluster number and initial cluster center, wherein the initial cluster number and the initial cluster center can be used as text characteristics of the text data; in addition, an artificial fish swarm algorithm can be adopted in combination, and the state of each artificial fish is updated according to the global optimal information and the local optimal information so as to find a global optimal clustering center, wherein the global optimal clustering center can also be used as the text characteristic of the text data; etc. In a subsequent application, it may be determined from these text features whether the piece of text data originated from a particular composer, its reflected emotional tendency, and so forth.
In the field of speech feature processing, feature processing modes that may be involved include fourier transforms, mel-frequency cepstral coefficients, linear prediction coefficients, line spectral frequencies, and the like. Taking the processing of voice data as an example, after a certain voice data is obtained, windowing processing can be performed on the voice data to obtain a voice frame, wherein the voice frame can be used as the voice characteristic of the voice data; besides, a threshold value can be obtained from the voice frame, and the voice data in the voice data can be obtained according to the threshold value, wherein the voice data can be used as the voice characteristic of the voice data; etc. In a subsequent application, the voice features may be compared with pre-stored voice features, and based on the comparison, it may be determined whether the piece of voice data originated from the target object, whether the target emotion is reflected, and so on.
The feature selection model is any model capable of completing a feature selection task, and target features selected based on the feature selection model can be used for achieving tasks such as classification, regression, clustering and the like. Therefore, each training sample has a corresponding class labeling result, the class labeling result is a class in which the training sample is truly located in the feature selection tasks of classification, regression, clustering and the like, for example, the target feature selected based on the feature selection model is used for classifying the image into a cat, a dog and others, and the class labeling result corresponding to the training sample of a certain image can be a cat or be a data representation mode corresponding to a cat.
In practical applications, alternatively, and exemplarily, assuming that the number of training samples { x 1,x2,…,xn } is n, each training sample includes d sample features, such a sample set formed by a plurality of training samples may be obtained by a sample matrixRepresenting, wherein the ith row in X represents the ith training sample/>Assuming that the training samples totally relate to c categories, the labeling set formed by labeling results of the categories can be formed by labeling matrix/>Representation is made in which/>One-hot encoding (one-hot encoding) of class labeling results for the ith training sample, i.e., if x i belongs to the j-th class of the c classes, then the j-th element of y i The remaining elements are all 0.
And S120, adjusting network parameters in the objective function determining module according to the objective function value output by the objective function determining module in the feature selection model, wherein the objective function determining module comprises a loss function measuring unit constructed based on an adaptive loss function, and the adaptive loss function comprises a first equation related to a first norm and a second equation related to a second norm.
After the training sample and the class labeling result are input into the feature selection model, the feature selection model can conduct class prediction on the training sample to obtain a class prediction result, at this time, the loss function measurement unit in the objective function determination module can measure the class prediction result and the class labeling result which belong to the same training sample, and then the objective function value of the objective function determination module can be calculated according to the measurement result and other factors, wherein the other factors can be factors related to feature selection, and the objective function value can be the adjustment basis of network parameters in the objective function determination module. After the network parameters are adjusted, the adjusted network parameters can be used for feature selection (because the selection process of the feature selection unit related to feature selection in the objective function determination module involves the network parameters), and can also be used for performing the task of feature selection such as classification, regression, clustering, and the like.
It should be noted that, the loss function measurement unit may be a calculation unit constructed based on an adaptive loss function, and considering an application scenario possibly related to an embodiment of the present invention, each training object may possibly include data under multiple data distributions at the same time, for example, a certain training pair object includes data partially conforming to a gaussian distribution and data partially conforming to a laplace distribution, a loss function formed based on a single norm may hardly perform a better measurement on each data, for example, a square error loss function based on a i 2 norm is sensitive to an outlier in the data under the laplace distribution because a penalty is a square, and further, for example, an error loss function based on a i 1 norm may hardly accurately score the data under the gaussian distribution.
In order to adapt to accurate depiction of data under various data distributions, the adaptive loss function provided by the embodiment of the present invention includes a first form related to a first norm and a second form related to a second norm, where the adaptive loss function is a loss function obtained by fusing the first norm and the second norm, and the first norm and the second norm may be norms adapted to data under different data distributions, respectively, and the first norm may be norms adapted to data under laplace distribution, and the second norm may be norms adapted to data under gaussian distribution; and vice versa. The adaptive loss function can be suitable for accurate measurement between the predicted value and the true value of data under various data distributions, is not sensitive to the loss value of an abnormal point in the data under certain data distribution, so that the adaptive loss function can overcome Gaussian noise and Laplacian noise in the data, the accurate calculation of the objective function value is an important basis for the accuracy adjustment of network parameters under various data distributions, the network parameters after the accurate adjustment improve the learning performance of a feature selection model and the accuracy of selecting the objective features from various sample features, and the robustness of feature selection is high.
It should be noted that, as described above, the selection process of the feature selection unit related to feature selection in the objective function determining module involves the network parameter, so that, according to the network parameter after accurate adjustment, the objective feature can be accurately selected from the sample features, where the objective feature can be a sample feature that has a greater influence on the subsequent remaining tasks, for example, a sample feature that contributes to each classification subtask in the multiple classification tasks. The importance of each sample feature is automatically calculated in the model training process, and the target feature is selected from a plurality of sample features according to the importance, so that redundant and irrelevant features are deleted from the plurality of sample features, and the realization performance of the subsequent rest tasks is improved.
According to the technical scheme, training samples comprising a plurality of sample characteristics and class labeling results corresponding to each training sample are input into a characteristic selection model, wherein the training samples can comprise samples related to biological characteristic processing, image characteristic processing, voice characteristic processing and/or text characteristic processing, and the characteristic selection model is a model for selecting target characteristics from all sample characteristics; because the loss function measurement unit in the feature selection model is constructed based on the self-adaptive loss function which can be adapted to data measurement under various data distribution, the loss value of an abnormal point in the data can not be amplified, and therefore, the network parameters can be accurately adjusted based on the output result of the objective function determination module where the loss function measurement unit is positioned. According to the technical scheme, the self-adaptive loss function of the data measurement under various data distribution can be adapted, the adjustment precision of network parameters under various conditions is guaranteed, the feature selection model obtained based on the accurate adjustment of the network parameters improves the selection precision of target features, and the robustness of feature selection is high.
In practical applications, optionally, a fusion parameter for fusing the first norm and the second norm may be set in the first equation and/or the second equation, and a specific value of the fusion parameter may affect which norm is more biased by the fused adaptive loss function. The reason for this is that in practical application, the distribution condition of each data in a certain sample object is unknown, which may be data conforming to a gaussian distribution, data conforming to a laplace distribution, and more likely include both data conforming to a gaussian distribution and data conforming to a laplace distribution, so that the ratio between the gaussian distribution and the laplace distribution can be adjusted by setting a fusion parameter, so that the adaptive loss function can be adapted to the data distribution in various situations.
To improve the accuracy of feature selection, a set of fusion parameters may be preset, some of which may bias the adaptive loss function more toward being adaptable to the norm for gaussian distribution, and some of which may bias the adaptive loss function more toward being adaptable to the norm for laplace distribution. On the basis, each fusion parameter corresponds to an adaptive loss function, namely, based on each fusion parameter, a corresponding feature selection model can be obtained through training, and the suitability of which fusion parameter with a value to the data is better can be determined according to the excellent application results (such as classification results, regression results and clustering results) of different trained feature selection models, and the fusion parameter with the value can be used in the data.
Alternatively, the curve of the adaptive loss function may include a curve between the curve of the first norm and the curve of the second norm, such an adaptive loss function being a norm between the first norm and the second norm, which has advantages of both the first norm and the second norm, which is robust against data outliers in the laplace distribution, and which can effectively learn normal data in the gaussian distribution. Illustratively, the first norm may be the l 2,1 norm, and/or the second norm may be the l F norm, although vice versa.
Based on the above technical solutions, considering the application scenario possibly related to the embodiments of the present invention, for a d-dimensional vectorIts l 1 and l 2 norms are defined as/>, respectivelyAnd/>Z i represents the i-th element in z. The square error loss function based on the i 2 norm is insensitive to small losses but very sensitive to outliers, since outliers with large losses will dominate the objective function and thus have a large impact on the learning performance of the feature selection model. Although the use of a loss function based on the l 1 norm is insensitive to outliers, it is sensitive to small losses (it gives a relatively large penalty to small losses compared to the l 2 norm). In other words, the l 1 norm may be suitable for data characterization under a laplace distribution, and the l 2 norm may be suitable for data characterization under a gaussian distribution. Typically, if the correct feature selection model has been selected, most of the data may have less loss to fit the model, while only a small amount of data has greater loss to fit the model, which may be considered outliers under the model. In practice, the data may include data that partially conforms to a gaussian distribution and data that partially conforms to a laplacian distribution, i.e., it may be assumed that a smaller loss of most data is a gaussian distribution, while a larger loss of some data is a laplacian distribution. Based on such consideration, vector/>The adaptive loss function of (1) may be set between the l 1 norm and the l 2 norm, an alternative set is shown in equation (1):
Where z i is the i-th element of z, σ is the fusion parameter, which is the parameter used to approximate either the l 1 norm or the l 2 norm. The above-mentioned adaptive loss function may smoothly lie between the l 1 -and l 2 -norms, and fig. 2 a-2 c show the difference between the adaptive loss function and the l 1 -and l 2 -norms in the case of different values of σ, from which it is apparent that the curve of the adaptive loss function lies between the curve of the l 1 -norms and the curve of the l 2 -norms. Meanwhile, such an adaptive loss function has the following properties:
1. II σ is a non-negative convex function;
2. II Z II σ secondary can be made micro;
3. when σ tends to 0, the adaptive loss function ++z σ tends to l 1 norms ++z 1;
4. When sigma tends to be ≡infinity, the adaptive loss function iiz σ tends to l 2 norm iiz 2.
In practical data mining, in many cases, a multi-dimensional object of a training sample is fitted, and a loss of a matrix is immediately drawn, so that an adaptive loss function of the vector can be expanded into an adaptive loss function of the matrix. In particular, for a matrixIts l 2,1 norm (which is a matrix extension of the l 1 norm) and l F norm (which is a matrix extension of the l 2 norm) are defined as/>, respectivelyAnd/>Where Z i is denoted as the ith row vector of Z. On this basis, the adaptive loss function of the matrix Z may be set between the i 2,1 norm and the i F norm, and an alternative setting is shown in formula (2):
In connection with the application scenario that may be involved in the embodiment of the present invention, the meanings of the variables are described below, σ is the fusion parameter, z i||2 + σ is the first equation, The second expression is that Z is a difference between a class prediction result of the training sample predicted by the feature selection model and a class labeling result corresponding to the class prediction result, Z i is a difference corresponding to the ith training sample, and n is the number of training samples. Obviously, when the matrix Z is degraded to a vector Z, the formula (2) will be simplified to the formula (1), i.e. the adaptive loss function of the vector Z is a special case of the adaptive loss function of the matrix Z. Similar to the adaptive loss function of the vector, the adaptive loss function of the matrix also has the following properties:
1. II Z II σ is a non-negative convex function;
2. II Z II σ can be made micro twice;
3. When a tends to be 0, the adaptive loss function, zell σ, tends to be l 2,1 norm, zell 2,1;
4. When sigma tends to be ≡infinity, the adaptive loss function iiz σ tends to l F norm iiz F.
On this basis, optionally, considering the application scenario possibly involved in the embodiment of the present invention, one selectable value of Z may be z=xw+1b T -Y, where xw+1b T is a class prediction result, Y is a class labeling result, X is a training sample, W and b are network parameters, W is a regression coefficient, and b is a bias. Specifically, W can be regarded asB can be considered as/>Bias vector of/>For an all 1 vector with the size of n, xw+1b T means that the feature selection model can obtain a category prediction result by carrying out regression on the training sample to the category labeling result through W and b.
Example two
Fig. 3 is a flowchart of a training method of a feature selection model provided in the second embodiment of the present invention. The present embodiment is optimized based on the above technical solutions. In this embodiment, optionally, the objective function determining module further includes a feature selecting unit configured based on a sparse regularization term, where the sparse regularization term is a norm related to the network parameter; the feature selection model is specifically used for determining output results, which are output by the feature selection unit and respectively correspond to the features of each sample, according to the adjusted network parameters, and selecting target features from the features of each sample according to the output results. Wherein, the explanation of the same or corresponding terms as the above embodiments is not repeated herein.
Referring to fig. 3, the method of this embodiment may specifically include the following steps:
S210, inputting a plurality of training samples and category labeling results respectively corresponding to each training sample into a feature selection model, wherein the training samples comprise samples related to biological feature processing, image feature processing, voice feature processing and/or text feature processing, and each training sample comprises a plurality of sample features.
S220, adjusting network parameters in an objective function determining module according to an objective function value output by the objective function determining module in a feature selecting model, wherein the objective function determining module comprises a loss function measuring unit constructed based on an adaptive loss function and a feature selecting unit constructed based on a sparse regularization term, the adaptive loss function comprises a first equation related to a first norm and a second equation related to a second norm, and the sparse regularization term is a norm related to the network parameters; the feature selection model is specifically used for determining output results, which are output by the feature selection unit and respectively correspond to the features of each sample, according to the adjusted network parameters, and selecting target features from the features of each sample according to the output results.
The feature selection unit is a unit which is constructed based on a sparse regularization term and is used for selecting features, and the sparse regularization term is a norm related to network parameters. After the training samples and the class labeling results corresponding to each training sample are received by the feature selection model, a loss function measurement unit can calculate a loss function measurement value according to the training samples, the class labeling results and the network parameters, and the feature selection unit can calculate a feature selection value according to the network parameters, so that the calculation results of the loss function measurement value and the feature selection value can be used as objective function values output by the objective function determination module. Thus, the feature selection process is fused into the model training process, and the loss function metric value related to the loss function calculation and the feature selection value related to the feature selection can generate basis for the adjustment process of the network parameters.
In practical application, the sparse regularization term may be l 1 norms, for example, l 1 -SVM adopts l 1 norms regularization term which is easy to generate sparse solution to perform feature selection (sample features corresponding to non-zero elements are target features); the sparse regularization term may also be l 2,1 norms, for example, a characteristic selection for cross-task coupling by adopting a l 2,1 norms regularization term, wherein the l 2,1 norms regularization term can select characteristics with joint sparsity in data, specifically, if a certain sample characteristic has higher contribution to each classification subtask of a multi-class classification task, the sample characteristic can be reserved as a target characteristic, thereby generating a similar group sparse effect as a group lasso method; the sparse regularization term may also be l 2,1 norms related to the regression coefficient W, and by taking into account an application scenario possibly related to the embodiment of the present invention, a feature selection unit constructed based on the l 2,1 norms related to the regression coefficient W may be λiiwti 2,1, and an l 2,1 norms acting on W may generate a sparse effect, that is, elements in many rows of elements in the W obtained by final optimization are all 0, and the non-zero rows correspond to the retained target features, thereby implementing a feature selection function; λ is a regularization parameter that can be used to balance the weights of the loss function metric unit and the feature selection unit, while the sparseness of W in the l 2,1 norm can be controlled, with more rows in W compressed to 0 when λ is greater.
On this basis, the objective function determining module may be represented by the following formula (3), where the adjustment process of the network parameter may be an iterative optimization process, and the meaning of min is that W and b will take different values during the iterative process, and W and b are selected to minimize the value of |xw+1b T-Y‖σ.
minW,b‖XW+1bT-Y‖σ+λ‖W‖2,1 (3)
It should be noted that, the output result output by the feature selection unit and corresponding to each sample feature may be a certain row of an adjusted network parameter, each row of the adjusted network parameter may correspond to one sample feature, and the output result may be considered as a certain row vector in the matrix. According to the output results corresponding to the sample features, the target features can be selected from the sample features, for example, the sample features corresponding to the rows with non-zero elements can be used as the target features; and if the rows are also ranked from large to small, the feature selection is carried out according to the number of the target features to be selected and the ranking result, which are preset; etc., and are not particularly limited herein.
According to the technical scheme provided by the embodiment of the invention, the feature selection unit is constructed through the sparse regularization term related to the network parameters, the sparse regularization term can generate a sparse effect on the network parameters, and the target features can be accurately selected from the sample features according to the output result of the feature selection unit.
An optional technical solution, according to the objective function value output by the objective function determining module in the feature selection model, adjusts the network parameter in the objective function determining module, which may include: inputting network parameters into a feature selection model, and determining whether an objective function value output by an objective function determining module in the feature selection model is converged; if not, determining parameter adjustment data according to the network parameters, adjusting the network parameters according to the parameter adjustment data, and updating the network parameters according to the adjustment result; and repeatedly executing the step of inputting the network parameters into the feature selection model until the objective function value converges and finishing the network parameter adjustment. The parameter adjustment data may be intermediate variables related to network parameters, which are set for simplifying the calculation process, and may be used for updating the network parameters.
In order to better understand the specific implementation process of the above steps, an exemplary description will be given below of a training method of the feature selection model of this embodiment, taking an example shown in formula (3) as an example. Optionally, deriving the formula (3) results in the formula (4):
Wherein D is the element at the ith position on the diagonal of the diagonal matrix W i is the ith row in W, let And d i is parameter adjustment data (i.e., intermediate variables), since the adaptive loss function has a non-convex, minuscule property, the entire equation of equation (4) obtained after deriving equation (3) is 0, i.e., equation (4) is zero to obtain equation (5), whereby the expressions of W and b can be obtained:
Due to And d i are both related to the variables W to be optimized, while the optimization process to optimize these variables is very difficult, for which reason these variables can be optimized based on an iterative re-weighting algorithm: the input data may comprise a sample matrix/>, which is made up of a plurality of training samplesLabeling matrix/>, composed of multiple category labeling resultsThe number k of target features to be selected, regularization parameter lambda; the output data may include k target features, and this iterative re-weighting algorithm is performed as follows:
________________________________________________________
1: t=0 (t is the number of iterations)
2: Initialization W t =i (I is an identity matrix of size n×d) and b t =1
3: Repeat (repeatedly execute the following steps)
4: Calculated from W t and b t And/>
5: Updating W t+1 and updating b according to equation (5) t+1
6: Until converges (until the objective function value converges)
7: Ordering from large to small according to W i||2, and taking sample features corresponding to the first k rows as target features
_______________________________________________
Example III
Fig. 4 is a flowchart of an object classification method according to a third embodiment of the present invention. The method and the device can be applied to the situation that the object to be classified is classified based on the target features selected by the feature selection model obtained through pre-training. The method may be performed by an object classification apparatus provided by an embodiment of the present invention, where the apparatus may be implemented in software and/or hardware, and the apparatus may be integrated on an electronic device, where the electronic device may be a terminal or a server.
Referring to fig. 4, the method of the embodiment of the present invention specifically includes the following steps:
S310, acquiring characteristics of an object to be classified, and selecting target characteristics from the characteristics based on a characteristic selection model obtained through training according to the training method according to any embodiment of the invention.
Wherein the object to be classified is an object to be classified, such as biological data, image data, text data, voice data, and the like to be classified; the features are features obtained after feature extraction of the object to be classified, and as in the first embodiment of the present invention, there are various implementation manners of feature extraction, which are not described herein.
After the characteristics of the object to be classified are obtained, the target characteristics, which are the characteristics with larger contribution to the classification task of the object to be classified, can be screened from the characteristics based on the characteristic selection model obtained by training in advance.
S320, classifying the object to be classified according to the target characteristics.
The method comprises the steps of carrying out classification on the object to be classified according to target characteristics, for example, carrying out processing on each target characteristic again, and obtaining a classification result of the object to be classified according to the processing result; then, for example, inputting each target feature into an object classification model obtained by training in advance, and obtaining a classification result of the object to be classified according to an output result of the object classification model; etc., and are not particularly limited herein.
It should be noted that the above technical solution may be applied to many fields, and in the biological field, by way of example, continuing to take the above-described genetic data as an example, the genetic data may be classified into genetic data of humans, dogs or cats according to a genetic target feature of the genetic data (i.e., a target feature obtained by selecting each genetic feature), and the genetic data may be classified into genetic data of an object having or not having a relationship with the target object; in the image field, taking the fingerprint data as an example, the fingerprint data can be classified into fingerprint data belonging to or not belonging to a target object and fingerprint data belonging to or not belonging to a target class according to the fingerprint target characteristics of the fingerprint data, wherein the target class can be human beings, animals and the like; continuing with the text data described above as an example in the text field, the text data can be classified into text data from a-composer, B-composer or C-composer according to the text target characteristics of the text data, and the emotion tendencies reflected by the text data can be classified into happiness, pain or sadness; in the field of speech, continuing to take the above-mentioned speech data as an example, according to the speech target feature of a certain speech data, the speech data can be classified into speech data belonging to or not belonging to a target object, and the emotion represented by the speech data can be classified into emotion belonging to or not belonging to the target object; etc. Of course, the above technical solution can also be applied in other fields, and is not specifically limited herein.
According to the technical scheme, the characteristics of the object to be classified are obtained, and the target characteristics with larger contribution to the classification task of the subsequent object to be classified are selected from the characteristics based on the trained characteristic selection model; and classifying the object to be classified according to the target feature to obtain a classification result of the object to be classified. According to the technical scheme, the target features with larger contribution to the classification task of the subsequent object to be classified are selected from the features through the feature selection model, and then the classification accuracy of the object to be classified is improved when the object to be classified is classified based on the target features.
Example IV
Fig. 5 is a block diagram of a training device for a feature selection model according to a fourth embodiment of the present invention, where the training device is configured to execute the training method for a feature selection model according to any of the foregoing embodiments. The device and the training method of the feature selection model of each embodiment belong to the same invention conception, and reference is made to the embodiment of the training method of the feature selection model for details which are not described in detail in the embodiment of the training device of the feature selection model. Referring to fig. 5, the apparatus may specifically include: a data input module 410 and a model training module 420.
The data input module 410 is configured to input a plurality of training samples and class labeling results corresponding to each training sample into the feature selection model, where the training samples include samples related to biological feature processing, image feature processing, voice feature processing, and/or text feature processing, and each training sample includes a plurality of sample features, and the feature selection model is configured to select a target feature from among the sample features;
The model training module 420 is configured to adjust network parameters in the objective function determining module according to the objective function value output by the objective function determining module in the feature selection model, where the objective function determining module includes a loss function measurement unit constructed based on an adaptive loss function, and the adaptive loss function includes a first equation related to a first norm and a second equation related to a second norm.
Optionally, the curve of the adaptive loss function is a curve between the curve of the first norm and the curve of the second norm; and/or, a fusion parameter used for fusing the first norm and the second norm is arranged in the first equation and/or the second equation; and/or the first norm comprises the l 2,1 norm and/or the second norm comprises the l F norm.
Alternatively, the adaptive loss function is expressed by the following formula:
Wherein sigma is a fusion parameter, |z i||2 +sigma is a first expression, The second expression is that Z is a difference between a class prediction result of the training sample predicted by the feature selection model and a class labeling result corresponding to the class prediction result, Z i is a difference corresponding to the ith training sample, and n is the number of training samples.
Optionally, z=xw+1b T -Y, where xw+1b T is a class prediction result, Y is a class labeling result, X is a training sample, W and b are network parameters, W is a regression coefficient, and b is a bias.
Optionally, the objective function determining module further includes a feature selecting unit constructed based on a sparse regularization term, where the sparse regularization term is a norm related to the network parameter; the feature selection model is specifically used for determining output results, which are output by the feature selection unit and respectively correspond to the features of each sample, according to the adjusted network parameters, and selecting target features from the features of each sample according to the output results.
On this basis, optionally, the sparse regularization term is the l 2,1 norm associated with the regression coefficient W.
Optionally, the model training module 420 may specifically be configured to:
inputting network parameters into a feature selection model, and determining whether an objective function value output by an objective function determining module in the feature selection model is converged; if not, determining parameter adjustment data according to the network parameters, adjusting the network parameters according to the parameter adjustment data, and updating the network parameters according to the adjustment result; and repeatedly executing the step of inputting the network parameters into the feature selection model until the objective function value converges and finishing the network parameter adjustment.
According to the training device for the feature selection model, provided by the fourth embodiment of the invention, training samples comprising a plurality of sample features and class labeling results respectively corresponding to each training sample can be input into the feature selection model through the data input module, wherein the training samples can comprise samples related to biological feature processing, image feature processing, voice feature processing and/or text feature processing, and the feature selection model is a model for selecting target features from all sample features; the model training module is constructed based on the self-adaptive loss function which can be adapted to the data measurement under various data distribution, and the loss value of an abnormal point in the data can not be amplified, so that the network parameter can be accurately adjusted based on the output result of the objective function determining module where the loss function measuring unit is positioned. The device ensures the adjustment precision of network parameters under various conditions through the self-adaptive loss function which can be suitable for data measurement under various data distribution, and the feature selection model obtained based on the accurate adjustment of the network parameters improves the selection precision of target features, and the robustness of feature selection is higher.
The training device for the feature selection model provided by the embodiment of the invention can execute the training method for the feature selection model provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
It should be noted that, in the above embodiment of the training device of the feature selection model, the included units and modules are divided only according to functional logic, and the division is not limited to the above as long as the corresponding functions can be implemented; in addition, the specific names of the functional units are used only to distinguish them from one another and do not limit the protection scope of the present invention.
Example five
Fig. 6 is a block diagram of an object classification apparatus according to a fifth embodiment of the present invention; the apparatus is configured to perform the object classification method of any of the foregoing embodiments. The apparatus belongs to the same inventive concept as the object classification method of the above embodiments; for details not described at length in this embodiment, reference may be made to the embodiment of the object classification method. Referring to fig. 6, the apparatus may specifically include: a feature selection module 510 and an object classification module 520. Wherein,
The feature selection module 510 is configured to obtain features of an object to be classified, and select a target feature from the features based on a feature selection model obtained by training according to the training method according to any embodiment of the present invention;
the object classification module 520 is configured to classify the object to be classified according to the target feature.
According to the object classification device provided by the fifth embodiment of the invention, the feature selection module acquires the features of the object to be classified and, based on the trained feature selection model, selects from them the target features that contribute most to the subsequent classification task; the object classification module then classifies the object to be classified according to the target features to obtain its classification result. By selecting, through the feature selection model, the target features that contribute most to the classification task, the device improves the classification accuracy achieved when the object to be classified is classified based on those target features.
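To make the selection-then-classification pipeline concrete, here is a minimal sketch of the inference side: features are ranked by the row norms of the learned $W$, a common readout for $\ell_{2,1}$-regularized models, and a downstream classifier, which the embodiment does not specify, consumes only the selected columns. All variable names are illustrative.

```python
def select_features(W, k):
    """Return the indices of the k features with the largest row norms in W."""
    scores = np.linalg.norm(W, axis=1)
    return np.argsort(scores)[::-1][:k]

# Hypothetical end-to-end usage:
# W, b = train(X_train, Y_train)
# top = select_features(W, k=50)
# clf.fit(X_train[:, top], y_train)   # any off-the-shelf classifier
# pred = clf.predict(X_test[:, top])
```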
The object classification device provided by the embodiment of the invention can execute the object classification method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
It should be noted that, in the above embodiment of the object classification apparatus, the included units and modules are divided only according to functional logic, and the division is not limited to the above as long as the corresponding functions can be implemented; in addition, the specific names of the functional units are used only to distinguish them from one another and do not limit the protection scope of the present invention.
Example six
Fig. 7 is a schematic structural diagram of an electronic device according to a sixth embodiment of the present invention, and as shown in fig. 7, the electronic device includes a memory 610, a processor 620, an input device 630, and an output device 640. The number of processors 620 in the electronic device may be one or more, one processor 620 being taken as an example in fig. 7; the memory 610, processor 620, input device 630, and output device 640 in the electronic device may be connected by a bus or other means, for example by bus 650 in fig. 7.
The memory 610, as a computer-readable storage medium, is used for storing software programs, computer-executable programs and modules, such as the program instructions/modules corresponding to the training method of the feature selection model in the embodiments of the present invention (for example, the data input module 410 and the model training module 420 in the training device of the feature selection model), or the program instructions/modules corresponding to the object classification method in the embodiments of the present invention (for example, the feature selection module 510 and the object classification module 520 in the object classification device). By running the software programs, instructions and modules stored in the memory 610, the processor 620 executes the various functional applications and data processing of the electronic device, i.e., implements the training method of the feature selection model or the object classification method described above.
The memory 610 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, at least one application program required for functions; the storage data area may store data created according to the use of the electronic device, etc. In addition, memory 610 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, memory 610 may further include memory remotely located relative to processor 620, which may be connected to the device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 630 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the device. The output device 640 may include a display device such as a display screen.
Example seven
A seventh embodiment of the present invention provides a storage medium containing computer-executable instructions, which when executed by a computer processor, are for performing a training method of a feature selection model, the method comprising:
Inputting a plurality of training samples and class labeling results corresponding to each training sample into a feature selection model, wherein the training samples comprise samples related to biological feature processing, image feature processing, voice feature processing and/or text feature processing, each training sample comprises a plurality of sample features, and the feature selection model is used for selecting target features from the sample features;
And adjusting network parameters in the objective function determining module according to the objective function value output by the objective function determining module in the feature selection model, wherein the objective function determining module comprises a loss function measuring unit constructed based on an adaptive loss function, and the adaptive loss function comprises a first expression related to a first norm and a second expression related to a second norm.
Of course, the storage medium containing the computer executable instructions provided in the embodiments of the present invention is not limited to the above-described method operations, and may also perform the related operations in the training method of the feature selection model provided in any embodiment of the present invention.
Example eight
An eighth embodiment of the present invention provides a storage medium containing computer-executable instructions, which when executed by a computer processor, are for performing a method of object classification, the method comprising:
Acquiring characteristics of an object to be classified, and selecting target characteristics from the characteristics based on a characteristic selection model obtained by training according to the training method in any embodiment of the invention;
Classifying the object to be classified according to the target characteristics.
From the above description of the embodiments, it will be clear to a person skilled in the art that the present invention may be implemented by means of software together with the necessary general-purpose hardware, or by hardware alone, although in many cases the former is the preferred embodiment. Based on this understanding, the technical solution of the present invention, essentially or in the part contributing to the prior art, may be embodied in the form of a software product. The software product may be stored in a computer-readable storage medium such as a floppy disk, read-only memory (ROM), random access memory (RAM), flash memory, hard disk or optical disk, and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the various embodiments of the present invention.
Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the particular embodiments described herein, and that various obvious changes, rearrangements and substitutions can be made without departing from the scope of the invention. Therefore, while the invention has been described in some detail through the above embodiments, it is not limited to those embodiments; other equivalent embodiments may be included without departing from the inventive concept, and the scope of the invention is determined by the scope of the appended claims.

Claims (12)

1. A method of training a feature selection model, comprising:
Inputting a plurality of training samples and class labeling results corresponding to each training sample into a feature selection model, wherein the training samples comprise samples related to image feature processing, voice feature processing and/or text feature processing, each training sample comprises a plurality of sample features, and the feature selection model is used for selecting target features from the sample features;
Adjusting network parameters in an objective function determining module according to an objective function value output by the objective function determining module in the feature selection model, wherein the objective function determining module comprises a loss function measuring unit constructed based on an adaptive loss function, the adaptive loss function comprises a first expression related to a first norm and a second expression related to a second norm, a fusion parameter used for fusing the first norm and the second norm is arranged in the first expression and/or the second expression, and the adaptive loss function is related to the ratio of the second expression to the first expression;
The feature selection model is further used for respectively carrying out category prediction on each training sample to obtain category prediction results, the loss function measurement unit is used for measuring the category prediction results and the category labeling results which belong to the same training sample, the objective function value is calculated according to the measurement results and other factors, and the other factors are factors related to feature selection.
2. The method of claim 1, wherein the curve of the adaptive loss function is a curve between the curve of the first norm and the curve of the second norm; and/or,
the first norm comprises the $\ell_{2,1}$ norm, and/or the second norm comprises the $\ell_F$ norm.
3. The method according to claim 1 or 2, wherein the adaptive loss function is expressed by the following formula:

$$\|Z\|_{\sigma}=\sum_{i=1}^{n}\frac{(1+\sigma)\,\|z_{i}\|_{2}^{2}}{\|z_{i}\|_{2}+\sigma}$$

wherein $\sigma$ is the fusion parameter, $\|z_{i}\|_{2}+\sigma$ is the first expression, $(1+\sigma)\|z_{i}\|_{2}^{2}$ is the second expression, $Z$ is the difference between the class prediction result of the training samples predicted by the feature selection model and the corresponding class labeling result, $z_{i}$ is the difference corresponding to the $i$-th training sample, and $n$ is the number of training samples.
4. The method of claim 3, wherein $Z=XW+\mathbf{1}b^{T}-Y$, wherein $XW+\mathbf{1}b^{T}$ is the class prediction result, $Y$ is the class labeling result, $X$ is the training sample matrix, and $W$ and $b$ are the network parameters, $W$ being the regression coefficients and $b$ the bias.
5. The method of claim 1, wherein the objective function determination module further comprises a feature selection unit constructed based on a sparse regularization term, the sparse regularization term being a norm associated with the network parameter;
the feature selection model is specifically configured to determine output results corresponding to each sample feature output by the feature selection unit according to the adjusted network parameters, and select a target feature from each sample feature according to each output result.
6. The method of claim 5, wherein the sparse regularization term is the $\ell_{2,1}$ norm related to the regression coefficient $W$.
7. The method of claim 1, wherein the adjusting the network parameters in the objective function determination module according to the objective function value output by the objective function determination module in the feature selection model comprises:
Inputting the network parameters into the feature selection model, and determining whether the objective function values output by an objective function determining module in the feature selection model are converged;
if not, determining parameter adjustment data according to the network parameters, adjusting the network parameters according to the parameter adjustment data, and updating the network parameters according to an adjustment result;
and repeating the step of inputting the network parameters into the feature selection model until the objective function value converges, and then ending the network parameter adjustment.
8. An object classification method, comprising:
Acquiring characteristics of an object to be classified, and selecting target characteristics from the characteristics based on a characteristic selection model obtained by training according to the training method of any one of claims 1 to 7;
and classifying the object to be classified according to the target characteristics.
9. A training device for a feature selection model, comprising:
The data input module is used for inputting a plurality of training samples and category labeling results corresponding to each training sample into the feature selection model, wherein the training samples comprise samples related to image feature processing, voice feature processing and/or text feature processing, each training sample comprises a plurality of sample features, and the feature selection model is used for selecting target features from the sample features;
The model training module is used for adjusting network parameters in the objective function determining module according to the objective function value output by the objective function determining module in the feature selection model, wherein the objective function determining module comprises a loss function measuring unit constructed based on an adaptive loss function, the adaptive loss function comprises a first expression related to a first norm and a second expression related to a second norm, a fusion parameter used for fusing the first norm and the second norm is arranged in the first expression and/or the second expression, and the adaptive loss function is related to the ratio of the second expression to the first expression;
The feature selection model is further used for respectively carrying out category prediction on each training sample to obtain category prediction results, the loss function measurement unit is used for measuring the category prediction results and the category labeling results which belong to the same training sample, the objective function value is calculated according to the measurement results and other factors, and the other factors are factors related to feature selection.
10. An object classification apparatus, comprising:
The feature selection module is used for acquiring the features of the object to be classified, and selecting target features from the features based on a feature selection model obtained by training according to the training method of any one of claims 1-7;
and the object classification module is used for classifying the objects to be classified according to the target characteristics.
11. An electronic device, comprising:
One or more processors;
a memory for storing one or more programs;
The one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of training a feature selection model as claimed in any one of claims 1-7, or the method of object classification as claimed in claim 8.
12. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements a training method of a feature selection model according to any one of claims 1-7 or an object classification method according to claim 8.
CN202011242309.3A 2020-11-09 2020-11-09 Feature selection model training and object classification methods, devices, equipment and media Active CN113762005B (en)
