CN114139658A - Method for training classification model and computer readable storage medium - Google Patents

Method for training classification model and computer readable storage medium

Info

Publication number
CN114139658A
CN114139658A
Authority
CN
China
Prior art keywords
data
training
data set
classification model
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210117469.8A
Other languages
Chinese (zh)
Inventor
刘国清
杨广
王启程
郑伟
贺硕
杨国武
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Minieye Innovation Technology Co Ltd
Original Assignee
Shenzhen Minieye Innovation Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Minieye Innovation Technology Co Ltd filed Critical Shenzhen Minieye Innovation Technology Co Ltd
Priority to CN202210117469.8A priority Critical patent/CN114139658A/en
Publication of CN114139658A publication Critical patent/CN114139658A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a training method of a classification model, which comprises the following steps: training an initial classification model according to a first data set to obtain an intermediate classification model, wherein the first data set is a set of labeled data; extracting feature vectors of a training data set by using the intermediate classification model, wherein the training data set comprises the first data set and a second data set, and the second data set is a set of unlabeled data; constructing a nearest data graph and a farthest data graph according to the feature vectors of the training data set; obtaining predicted labels of the second data set according to the nearest data graph and the farthest data graph; and training the intermediate classification model according to the first data set and the second data set with the predicted labels to obtain a target classification model. In addition, the invention also provides a computer-readable storage medium. The technical scheme of the invention effectively solves the problem of low classification accuracy caused by a small quantity of labeled data.

Description

Method for training classification model and computer readable storage medium
Technical Field
The present invention relates to the field of machine learning technologies, and in particular, to a training method for a classification model and a computer-readable storage medium.
Background
Deep learning models have achieved great success in various fields, and supervised learning algorithms in particular have produced remarkable results in a large number of application areas. Deep learning generally learns a model from a large number of labeled training samples so as to predict labels as correctly as possible for unseen samples. However, in many practical application scenarios, manually labeling large-scale training samples requires enormous manpower and material resources. Therefore, much research focuses on semi-supervised learning, i.e., learning a model from only a small number of labeled samples together with a large number of unlabeled samples.
Disclosure of Invention
The invention provides a training method of a classification model and a computer readable storage medium, which are used for solving the problem of low accuracy of the classification model caused by small quantity of labeled data.
In a first aspect, an embodiment of the present invention provides a method for training a classification model, where the method for training the classification model includes:
training an initial classification model according to a first data set to obtain an intermediate classification model, wherein the first data set is a set of labeled data;
extracting feature vectors of a training data set by using the intermediate classification model, wherein the training data set comprises the first data set and a second data set, and the second data set is a set of unlabeled data;
constructing a nearest data graph and a farthest data graph according to the feature vectors of the training data set, wherein the nearest data graph is a relation graph of each data in the training data set and a plurality of other data which are closest to each data, and the farthest data graph is a relation graph of each data in the training data set and a plurality of other data which are farthest from each data;
obtaining predicted labels of the second data set according to the nearest data graph and the farthest data graph; and
training the intermediate classification model according to the first data set and the second data set with the predicted labels to obtain a target classification model.
In a second aspect, an embodiment of the present invention provides a computer-readable storage medium for storing program instructions executable by a processor to implement a training method of a classification model as described above.
According to the training method of the classification model and the computer-readable storage medium, the initial classification model is trained on the labeled first data set to obtain the intermediate classification model, so that the intermediate classification model acquires a certain feature-extraction capability and can subsequently be used to extract feature vectors of the data. The process in which the initial classification model is trained on the first data set may be referred to as a warm-start phase. The intermediate classification model is then used to extract the feature vectors of the first data set and the second data set, and the nearest data graph and the farthest data graph are constructed from these feature vectors. The label information carried by the first data set is propagated to the second data set according to the similarity between each data and its closest other data in the nearest data graph, and the dissimilarity between each data and its farthest other data in the farthest data graph, thereby obtaining the predicted labels of the second data set. Each predicted label thus combines the label information of the data closest to it with the label information of the data farthest from it, and the similarity and dissimilarity between data are used to disambiguate, making the predicted label more credible and more accurate. Finally, the intermediate classification model is trained on the first data set together with the second data set carrying the predicted labels, which effectively increases the amount of labeled data, yields a target classification model with good classification capability, and improves the performance of the target classification model.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. It is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained from them by those skilled in the art without creative effort.
Fig. 1 is a flowchart of a method for training a classification model according to an embodiment of the present invention.
Fig. 2 is a first sub-flowchart of a method for training a classification model according to an embodiment of the present invention.
Fig. 3 is a second sub-flowchart of the training method of the classification model according to the embodiment of the present invention.
Fig. 4 is a third sub-flowchart of a method for training a classification model according to an embodiment of the present invention.
Fig. 5 is a schematic diagram of the nearest data graph shown in Fig. 1.
Fig. 6 is a schematic diagram of the farthest data graph shown in Fig. 1.
Fig. 7 is a schematic diagram of the neighbor similarity matrix shown in Fig. 3.
Fig. 8 is a schematic diagram of the distant similarity matrix shown in Fig. 3.
Fig. 9 is a schematic diagram of the initial label matrix shown in Fig. 3.
Fig. 10 is a schematic diagram of the target label matrix shown in Fig. 4.
Fig. 11 is a schematic internal structure diagram of a terminal according to an embodiment of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims of the present application and in the above-described drawings (if any) are used for distinguishing between similar items and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances, in other words that the embodiments described are to be practiced in sequences other than those illustrated or described herein. Moreover, the terms "comprises," "comprising," and any other variation thereof, may also include other things, such as processes, methods, systems, articles, or apparatus that comprise a list of steps or elements is not necessarily limited to only those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such processes, methods, articles, or apparatus.
It should be noted that the descriptions relating to "first," "second," and the like in the present invention are for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In addition, the technical solutions of the various embodiments may be combined with each other, provided that the combination can be realized by a person skilled in the art; when a combination of technical solutions is contradictory or cannot be realized, such a combination should be considered not to exist and does not fall within the protection scope of the present invention.
Please refer to fig. 1, which is a flowchart illustrating a method for training a classification model according to an embodiment of the present invention. The training method is used for training the classification model, and the classification model obtained through training can classify the label-free data. The training method of the classification model specifically comprises the following steps.
Step S102, training an initial classification model according to the first data set to obtain an intermediate classification model. The first data set is a set of labeled data. In this embodiment, a plurality of preset categories are defined before the initial classification model is trained. Each first data in the first data set carries a real label, and the real label of each first data corresponds to one of the plurality of preset categories. The initial classification model is trained on the first data set at a preset learning rate, and whether the number of times the initial classification model has been trained reaches a preset number of times is judged. When the number of training times reaches the preset number of times, the initial classification model trained for the preset number of times is taken as the intermediate classification model. The specific process of training the initial classification model is as follows: the first data in the first data set are input into the initial classification model to obtain corresponding first labels, and the parameters of the initial classification model are updated at the preset learning rate according to the first labels and the real labels. The first data are then input into the initial classification model with updated parameters to obtain new first labels, and the parameters are updated again at the preset learning rate according to the new first labels and the real labels, until the preset number of training times is reached. The initial classification model after the last parameter update is taken as the intermediate classification model; at this point the intermediate classification model has not yet converged. The preset learning rate and the preset number of times can be set according to actual conditions and are not limited herein. The first data in the first data set include, but are not limited to, images, text, audio, and the like.
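By way of non-limiting illustration, the warm-start phase of step S102 could be sketched as follows; PyTorch is assumed, and the names initial_model and labeled_loader as well as the hyper-parameter values are illustrative rather than part of the embodiment.

```python
# Hypothetical sketch of step S102: train the initial classification model on the
# labeled first data set for a preset number of updates at a preset learning rate.
import torch
import torch.nn as nn

def warm_start(initial_model, labeled_loader, preset_times=1000, preset_lr=1e-3, device="cpu"):
    model = initial_model.to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=preset_lr)
    criterion = nn.CrossEntropyLoss()
    step = 0
    while step < preset_times:
        for first_data, real_label in labeled_loader:
            first_label = model(first_data.to(device))           # "first label" (logits)
            loss = criterion(first_label, real_label.to(device))
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()                                      # update at the preset learning rate
            step += 1
            if step >= preset_times:
                break
    return model  # intermediate classification model (not yet converged)
```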
Step S104, extracting the feature vectors of the training data set by using the intermediate classification model. It will be appreciated that the intermediate classification model trained on the first data set has a certain feature-extraction capability and can be used to extract feature vectors of data. The data in the training data set are input into the intermediate classification model, and a feature vector corresponding to each data is obtained. The training data set comprises the first data set and a second data set, and the second data set is a set of unlabeled data. The second data in the second data set include, but are not limited to, images, text, audio, and the like. It will be appreciated that the data in the training data set comprise the first data and the second data, and that the first data and the second data are of the same type. That is, if the first data are images, the second data should also be images; if the first data are text, the second data should also be text; if the first data are audio, the second data should also be audio.
The feature vectors directly output by the intermediate classification model have a high dimensionality, usually 512 dimensions or even higher. Therefore, the feature vectors output by the intermediate classification model are subjected to dimension reduction so as to facilitate subsequent calculation and greatly reduce the amount of computation. In this embodiment, the feature vectors are normalized, and the normalized feature vectors are reduced to 128 dimensions.
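A minimal sketch of step S104 together with the normalization and dimension reduction is given below; the features() hook on the intermediate model and the use of PCA for the reduction to 128 dimensions are assumptions, since the embodiment does not fix a particular reduction technique.

```python
# Illustrative sketch: extract feature vectors with the intermediate model,
# normalize them, then reduce them to 128 dimensions (PCA assumed).
import numpy as np
import torch
from sklearn.decomposition import PCA

@torch.no_grad()
def extract_reduced_features(intermediate_model, data_loader, device="cpu"):
    intermediate_model.eval().to(device)
    feats = []
    for batch in data_loader:                                  # loader assumed to yield input tensors only
        f = intermediate_model.features(batch.to(device))      # assumed penultimate-layer output, e.g. 512-d
        feats.append(f.cpu().numpy())
    feats = np.concatenate(feats, axis=0)
    feats = feats / (np.linalg.norm(feats, axis=1, keepdims=True) + 1e-12)  # normalization
    return PCA(n_components=128).fit_transform(feats)                       # dimension reduction
```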
Step S106, constructing a nearest data graph and a farthest data graph according to the feature vectors of the training data set. The feature vectors can be used to represent the distance between data, and the nearest data graph and the farthest data graph are constructed according to these distances. It is understood that the farther apart two data are, the more dissimilar they are and the less likely they belong to the same category; the closer two data are, the more similar they are and the more likely they belong to the same category. The nearest data graph is a relation graph of each data in the training data set and the plurality of other data closest to it. The farthest data graph is a relation graph of each data in the training data set and the plurality of other data farthest from it. The specific process of constructing the nearest data graph and the farthest data graph from the feature vectors of the training data set is described in detail below.
Step S108, acquiring the predicted labels of the second data set according to the nearest data graph and the farthest data graph. Following the idea of the label propagation algorithm, the fact that data connected in the nearest data graph tend to be similar while data connected in the farthest data graph tend to be dissimilar can be used to transfer the real labels of the first data to the second data, so as to form the predicted label of each second data in the second data set. The specific process of obtaining the predicted labels of the second data set from the nearest data graph and the farthest data graph is described in detail below.
Step S110, training the intermediate classification model according to the first data set and the second data set with the predicted labels to obtain a target classification model. The first data in the first data set and the second data in the second data set are input into the intermediate classification model to obtain corresponding second labels, and the parameters of the intermediate classification model are updated according to the second labels and the real labels of the first data, and according to the second labels and the predicted labels of the second data, until the intermediate classification model converges. The converged intermediate classification model is taken as the target classification model for classifying unlabeled data.
In some possible embodiments, an intermediate classification model that has been trained on the first data set and the second data set for a preset number of times may also be used as the target classification model.
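The retraining of step S110 could, under the same assumptions as above, look like the following sketch, where combined_loader is an assumed loader yielding both first data with their real labels and second data with the predicted labels obtained in step S108.

```python
# Hedged sketch of step S110: continue training the intermediate model on the
# first data set plus the pseudo-labeled second data set until convergence
# (approximated here by a fixed number of epochs).
import torch
import torch.nn as nn

def train_target_model(intermediate_model, combined_loader, epochs=10, lr=1e-3, device="cpu"):
    model = intermediate_model.to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for data, label in combined_loader:         # real label for first data, predicted label for second data
            second_label = model(data.to(device))   # "second label" (logits)
            loss = criterion(second_label, label.to(device))
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model  # target classification model
```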
In the above embodiment, the initial classification model is trained on the labeled first data set to obtain the intermediate classification model, so that the intermediate classification model acquires a certain feature-extraction capability and can subsequently be used to extract feature vectors of the data. The process in which the initial classification model is trained on the first data set may be referred to as a warm-start phase. The intermediate classification model is used to extract the feature vectors of the first data set and the second data set, and the nearest data graph and the farthest data graph are constructed from these feature vectors. The label information carried by the first data set is propagated to the second data set according to the similarity between each data and its closest other data in the nearest data graph, and the dissimilarity between each data and its farthest other data in the farthest data graph, thereby obtaining the predicted labels of the second data set. Each predicted label thus combines the label information of the data closest to it with the label information of the data farthest from it, and the similarity and dissimilarity between data are used to disambiguate, making the predicted label more credible and more accurate. Finally, the intermediate classification model is trained on the first data set together with the second data set carrying the predicted labels, which effectively increases the amount of labeled data, yields a target classification model with good classification capability, and improves the performance of the target classification model.
Please refer to fig. 2, which is a first sub-flowchart of a training method of a classification model according to an embodiment of the present invention. Step S106 specifically includes the following steps.
Step S202, respectively calculating the distance between every two data in the training data set according to the feature vectors of the training data set. The distance between two data can be represented by the Euclidean distance between their feature vectors, or by the cosine similarity between their feature vectors. The Euclidean distance or cosine similarity between the feature vectors of every two data is then calculated as the distance between the corresponding two data.
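A small sketch of this pairwise computation, assuming the 128-dimensional feature vectors are stacked in an (n, 128) NumPy array, is given below; either measure can serve as the pairwise quantity referred to above.

```python
# Illustrative sketch of step S202: pairwise Euclidean distance or cosine similarity.
import numpy as np

def pairwise_euclidean(feats):
    # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    sq = np.sum(feats ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * feats @ feats.T
    return np.sqrt(np.maximum(d2, 0.0))

def pairwise_cosine_similarity(feats):
    normed = feats / (np.linalg.norm(feats, axis=1, keepdims=True) + 1e-12)
    return normed @ normed.T
```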
Step S204, sorting the distances corresponding to each data in order from small to large. The distances corresponding to each data are sorted from small to large. It will be appreciated that if the training data set contains n data, then there are n-1 distances corresponding to each data, and the n-1 distances corresponding to the same data are sorted. For example, the training data set includes 6 data: data A, data B, data C, data D, data E, and data F. The distances between data A and data B, data C, data D, data E, and data F are A1, A2, A3, A4, and A5 respectively; sorting the distances corresponding to data A from small to large may give A3, A2, A4, A1, A5. The distances between data B and data A, data C, data D, data E, and data F are B1, B2, B3, B4, and B5 respectively; sorting the distances corresponding to data B from small to large may give B4, B2, B1, B5, B3. The distances corresponding to data C, data D, data E, and data F are sorted in the same way and are not described in detail herein.
Step S206, selecting, for each data, a preset number of other data from the training data set as neighbor data of that data, starting from the data corresponding to the smallest distance. It should be understood that the plurality of other data closest to each data does not mean that these other data are all at the same distance from the corresponding data; rather, the closest other data is selected from the training data set first, and then the next closest, until the number of selected other data reaches the preset number. The preset number can be set according to actual conditions and is not limited herein. In this embodiment, each data has the preset number of neighbor data. For example, suppose the preset number is 2. For data A, the data corresponding to the smallest distance A3 is data D, so data D is selected from the training data set as neighbor data of data A. Next, the data corresponding to the distance A2 is data C, so data C is selected from the training data set as neighbor data of data A. The neighbor data of data A therefore include data D and data C. For data B, the data corresponding to the smallest distance B4 is data E, so data E is selected from the training data set as neighbor data of data B. Next, the data corresponding to the distance B2 is data C, so data C is selected from the training data set as neighbor data of data B. The neighbor data of data B therefore include data E and data C. The neighbor data of data C, data D, data E, and data F are selected in the same way and are not described in detail herein.
Step S208, constructing the nearest data graph according to the neighbor data of each data in the training data set. In this embodiment, each data in the training data set is a node in the nearest data graph, and each data is connected to its corresponding neighbor data by an edge. For example, data A, data B, data C, data D, data E, and data F each have two neighbor data, and the constructed nearest data graph is shown in fig. 5. The neighbor data of data C include data B and data E, the neighbor data of data D include data B and data E, the neighbor data of data E include data B and data F, and the neighbor data of data F include data A and data D.
Step S210, selecting, for each data, a preset number of other data from the training data set as distant data of that data, starting from the data corresponding to the largest distance. It should be understood that the plurality of other data farthest from each data does not mean that these other data are all at the same distance from the corresponding data; rather, the farthest other data is selected from the training data set first, and then the next farthest, until the number of selected other data reaches the preset number. The preset number can be set according to actual conditions and is not limited herein. In this embodiment, each data has the preset number of distant data; that is, the number of neighbor data of each data is the same as its number of distant data. In some possible embodiments, the number of neighbor data and the number of distant data of each data may differ. For example, suppose the preset number is 2. For data A, the data corresponding to the largest distance A5 is data F, so data F is selected from the training data set as distant data of data A. Next, the data corresponding to the distance A1 is data B, so data B is selected from the training data set as distant data of data A. The distant data of data A therefore include data F and data B. For data B, the data corresponding to the largest distance B3 is data D, so data D is selected from the training data set as distant data of data B. Next, the data corresponding to the distance B5 is data F, so data F is selected from the training data set as distant data of data B. The distant data of data B therefore include data D and data F. The distant data of data C, data D, data E, and data F are selected in the same way and are not described in detail herein.
Step S212, constructing the farthest data graph according to the distant data of each data in the training data set. In this embodiment, each data in the training data set is a node in the farthest data graph, and each data is connected to its corresponding distant data by an edge. For example, data A, data B, data C, data D, data E, and data F each have two distant data, and the constructed farthest data graph is shown in fig. 6. The distant data of data C include data A and data D, the distant data of data D include data A and data F, the distant data of data E include data A and data C, and the distant data of data F include data B and data E.
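Steps S204 to S212 can be summarized in the following sketch which, given a pairwise distance matrix, keeps for every datum its k closest data as neighbor data and its k farthest data as distant data; k stands for the preset number and the names are illustrative.

```python
# Illustrative sketch of steps S204-S212: build the edge lists of the nearest
# data graph and the farthest data graph from a pairwise distance matrix.
import numpy as np

def build_graphs(dist, k):
    n = dist.shape[0]
    order = np.argsort(dist, axis=1)        # distances sorted from small to large per row
    neighbors = np.empty((n, k), dtype=int)
    distants = np.empty((n, k), dtype=int)
    for i in range(n):
        others = order[i][order[i] != i]    # exclude the datum itself
        neighbors[i] = others[:k]           # k closest -> neighbor data (nearest data graph)
        distants[i] = others[-k:]           # k farthest -> distant data (farthest data graph)
    return neighbors, distants
```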
In the above embodiment, the distance between every two data in the training data set is calculated according to their feature vectors, and for each data a preset number of neighbor data and a preset number of distant data are selected from the training data set according to these distances. The nearest data graph is constructed from the neighbor data of all the data, and the farthest data graph is constructed from the distant data of all the data, so that the nearest data graph covers the similarity relations among the data and the farthest data graph covers the dissimilarity relations among the data. Label propagation based on the nearest data graph and the farthest data graph can therefore exploit the similarity relations and the dissimilarity relations between data simultaneously, removing potentially wrong label information and propagating more credible label information to the second data.
Referring to fig. 3 and fig. 4 in combination, fig. 3 is a second sub-flowchart of a training method of a classification model according to an embodiment of the present invention, and fig. 4 is a third sub-flowchart of the training method of the classification model according to the embodiment of the present invention. Step S108 specifically includes the following steps.
Step S302, constructing a neighbor similarity matrix according to the nearest data graph. The values in the neighbor similarity matrix include the similarity between each data in the training data set and the plurality of other data closest to it, the similarity between each data and the remaining data in the training data set other than those closest data, and the similarity between identical data in the training data set. In this embodiment, the distance between each data and each of its neighbor data is taken as the similarity between that data and the corresponding closest other data. The similarity between each data and the remaining data in the training data set other than its closest other data is set to 0, and the similarity between identical data in the training data set is set to 0. For example, the neighbor data of data A include data C and data D; in the neighbor similarity matrix (as shown in fig. 7), the similarity between data A and data A is 0, the similarity between data A and data B is 0, the similarity between data A and data C is A2, the similarity between data A and data D is A3, the similarity between data A and data E is 0, and the similarity between data A and data F is 0. The neighbor data of data B include data C and data E; in the neighbor similarity matrix, the similarity between data B and data A is 0, the similarity between data B and data B is 0, the similarity between data B and data C is B2, the similarity between data B and data D is 0, the similarity between data B and data E is B4, and the similarity between data B and data F is 0. The similarities between data C, data D, data E, data F and the other data in the training data set are determined in the same way and are not described in detail herein.
After the neighbor similarity matrix is constructed according to the nearest data graph, it is normalized. In this embodiment, the degree matrix of the neighbor similarity matrix is used to normalize the neighbor similarity matrix. The degree matrix is a diagonal matrix, and the elements on its diagonal are the number of neighbor data of each data, namely the preset number. Specifically, the neighbor similarity matrix is normalized by a first formula of the form

$\hat{W}_{n} = D_{n}^{-1/2}\, W_{n}\, D_{n}^{-1/2}$,

wherein $\hat{W}_{n}$ denotes the normalized neighbor similarity matrix, $W_{n}$ denotes the neighbor similarity matrix, and $D_{n}$ denotes the degree matrix of the neighbor similarity matrix.
Step S304, constructing a distant similarity matrix according to the farthest data graph. The values in the distant similarity matrix include the similarity between each data in the training data set and the plurality of other data farthest from it, the similarity between each data and the remaining data in the training data set other than those farthest data, and the similarity between identical data in the training data set. In this embodiment, the distance between each data and each of its distant data is taken as the similarity between that data and the corresponding farthest other data. The similarity between each data and the remaining data in the training data set other than its farthest other data is set to 0, and the similarity between identical data in the training data set is set to 0. For example, the distant data of data A include data B and data F; in the distant similarity matrix (as shown in fig. 8), the similarity between data A and data A is 0, the similarity between data A and data B is A1, the similarity between data A and data C is 0, the similarity between data A and data D is 0, the similarity between data A and data E is 0, and the similarity between data A and data F is A5. The distant data of data B include data D and data F; in the distant similarity matrix, the similarity between data B and data A is 0, the similarity between data B and data B is 0, the similarity between data B and data C is 0, the similarity between data B and data D is B3, the similarity between data B and data E is 0, and the similarity between data B and data F is B5. The similarities between data C, data D, data E, data F and the other data in the training data set are determined in the same way and are not described in detail herein.
After the distant similarity matrix is constructed according to the farthest data graph, it is normalized. In this embodiment, the degree matrix of the distant similarity matrix is used to normalize the distant similarity matrix. The degree matrix is a diagonal matrix, and the elements on its diagonal are the number of distant data of each data, namely the preset number. Specifically, the distant similarity matrix is normalized by a second formula of the form

$\hat{W}_{f} = D_{f}^{-1/2}\, W_{f}\, D_{f}^{-1/2}$,

wherein $\hat{W}_{f}$ denotes the normalized distant similarity matrix, $W_{f}$ denotes the distant similarity matrix, and $D_{f}$ denotes the degree matrix of the distant similarity matrix.
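Steps S302 and S304, together with the degree normalization above, could be sketched as follows; using the raw distances as edge weights follows the embodiment, and since every degree equals the preset number k, the symmetric normalization amounts to dividing by k.

```python
# Sketch of steps S302-S304: neighbor and distant similarity matrices with
# degree normalization (degree matrix = k * I, so D^{-1/2} W D^{-1/2} = W / k).
import numpy as np

def similarity_matrices(dist, neighbors, distants):
    n, k = neighbors.shape
    w_near = np.zeros((n, n))
    w_far = np.zeros((n, n))
    for i in range(n):
        w_near[i, neighbors[i]] = dist[i, neighbors[i]]  # similarity to the k closest data
        w_far[i, distants[i]] = dist[i, distants[i]]     # similarity to the k farthest data
    # remaining entries, including the diagonal (same data), stay 0
    return w_near / k, w_far / k                         # normalized neighbor / distant similarity matrices
```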
Step S306, acquiring the predicted labels of the second data set according to the initial label matrix of the training data set, the neighbor similarity matrix, and the distant similarity matrix. In this embodiment, the initial label matrix of the training data set has m rows and n columns, and the values in the initial label matrix are 0 or 1, where m is the number of preset categories and n is the number of data in the training data set. Each value in the initial label matrix represents the relationship between a data in the training data set and a preset category: if the data belongs to the preset category, the corresponding value is 1; if the data does not belong to the preset category, the corresponding value is 0. Since the second data are unlabeled, the values corresponding to the second data and all preset categories are 0. Since the first data are labeled and their real labels are one-hot vectors, the values in a real label correspond one to one to the preset categories, the number of values in a real label equals the number of preset categories, and the values in the real label of a first data correspond one to one to the values of that first data for all preset categories. For example, the preset categories include category a, category b, and category c. Among data A, data B, data C, data D, data E, and data F, data C, data D, and data F are first data, and data A, data B, and data E are second data. The real label of data C is (0, 0, 1), the real label of data D is (1, 0, 0), and the real label of data F is (0, 1, 0). The initial label matrix of the training data set is shown in fig. 9, where data C belongs to category c, data D belongs to category a, and data F belongs to category b. The specific process of obtaining the predicted labels of the second data set according to the initial label matrix, the neighbor similarity matrix, and the distant similarity matrix includes the following steps (an illustrative sketch of the initial label matrix construction is given first).
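An illustrative sketch of the initial label matrix described above (m preset categories as rows, n training data as columns, one-hot columns for first data and all-zero columns for second data) is given here; the function and argument names are hypothetical.

```python
# Sketch: build the m x n initial label matrix from the labeled first data.
import numpy as np

def initial_label_matrix(n_data, n_categories, labeled_indices, labeled_categories):
    y = np.zeros((n_categories, n_data))
    for idx, cat in zip(labeled_indices, labeled_categories):
        y[cat, idx] = 1.0        # first data idx belongs to preset category cat
    return y                     # columns of second data remain all zeros
```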
Step S3061, calculating a neighbor label matrix according to the initial label matrix and the neighbor similarity matrix. In this embodiment, the neighbor label matrix is calculated with a third formula of the form

$F_{n} = Y\,(I - \alpha \hat{W}_{n})^{-1}$,

wherein $F_{n}$ denotes the neighbor label matrix, $I$ denotes the identity matrix, $\hat{W}_{n}$ denotes the normalized neighbor similarity matrix, $Y$ denotes the initial label matrix, and $\alpha$ denotes a coefficient. In this embodiment, $\alpha$ is between 0 and 1 and can be set according to actual conditions; it is not limited herein.
Step S3062, calculating a distant label matrix according to the initial label matrix and the distant similarity matrix. In this embodiment, the distant label matrix is calculated with a fourth formula of the form

$F_{f} = Y\,(I - \beta \hat{W}_{f})^{-1}$,

wherein $F_{f}$ denotes the distant label matrix, $I$ denotes the identity matrix, $\hat{W}_{f}$ denotes the normalized distant similarity matrix, $Y$ denotes the initial label matrix, and $\beta$ denotes a coefficient. In this embodiment, $\beta$ is between 0 and 1 and can be set according to actual conditions. $\alpha$ and $\beta$ may be the same or different, and are not limited herein.
Step S3063, calculating a target label matrix according to the neighbor label matrix and the distant label matrix. In this embodiment, the target label matrix is calculated with a fifth formula that combines the two matrices through a coefficient, of the form

$F = F_{n} - \gamma F_{f}$,

wherein $F$ denotes the target label matrix, $F_{n}$ denotes the neighbor label matrix, $F_{f}$ denotes the distant label matrix, and $\gamma$ denotes a coefficient. $\gamma$ can be set according to actual conditions and is not limited herein.
Step S3064, obtain the predicted label of each second data in the second data set according to the target label matrix. In this embodiment, the maximum value in the column corresponding to the second data in the target label matrix is selected as the credible value, and the predicted label of the second data is formed according to the credible value. Each numerical value in each column in the target label matrix represents the correlation between each data in the training data set and each preset category. It is understood that the larger the numerical value is, the stronger the correlation between the second data and the corresponding preset category is; the smaller the value, the weaker the correlation between the second data and the corresponding preset category. The prediction labels are one-hot vectors, numerical values in the prediction labels correspond to preset categories one by one, and the number of the numerical values in the prediction labels is the same as that of the preset categories. The preset category corresponding to the credible numerical value is the category of the second data, accordingly, the numerical value corresponding to the preset category in the prediction label is 1, and the rest are 0. For example, the target tag matrix is shown in fig. 10. The credible numerical value of the data A is 0.8, the corresponding preset category is category a, and the prediction label of the data A is (1, 0, 0); the credible numerical value of the data B is 0.6, the corresponding preset category is category B, and the prediction label of the data B is (0, 1, 0); the credible value of the data E is 0.7, the corresponding preset category is category b, and the prediction tag of the data E is (0, 1, 0).
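Steps S3061 to S3064 could be sketched as follows, using the closed-form propagation given above; the particular combination of the two label matrices (the subtraction of the distant label matrix scaled by the coefficient) should be read as one possible instantiation under the stated assumptions, not the only one.

```python
# Hedged sketch of steps S3061-S3064: propagate labels over the normalized
# neighbor and distant similarity matrices, then read off predicted labels.
import numpy as np

def propagate_labels(y0, w_near_hat, w_far_hat, alpha=0.9, beta=0.9, gamma=0.5, unlabeled_indices=()):
    n = y0.shape[1]
    eye = np.eye(n)
    f_near = y0 @ np.linalg.inv(eye - alpha * w_near_hat)  # neighbor label matrix (third formula)
    f_far = y0 @ np.linalg.inv(eye - beta * w_far_hat)     # distant label matrix (fourth formula)
    f = f_near - gamma * f_far                              # target label matrix (fifth formula, assumed form)
    predicted = {j: int(np.argmax(f[:, j])) for j in unlabeled_indices}  # credible value -> category index
    return f, predicted
```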
In the above embodiment, the neighbor similarity matrix is constructed according to the nearest data graph and the distant similarity matrix is constructed according to the farthest data graph; the neighbor similarity matrix embeds the similarity relations between data into the neighbor label matrix, and the distant similarity matrix embeds the dissimilarity relations between data into the distant label matrix. The target label matrix is calculated from the neighbor label matrix and the distant label matrix to obtain the predicted label of each second data, so that the predicted labels of the second data effectively fuse the similarity relations of neighbor data with the dissimilarity relations of distant data and have high accuracy.
Please refer to fig. 11, which is a schematic diagram of an internal structure of a terminal according to an embodiment of the present invention. The terminal 10 includes a computer-readable storage medium 11, a processor 12, and a bus 13. The computer-readable storage medium 11 includes at least one type of readable storage medium, which includes a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory, etc.), a magnetic memory, a magnetic disk, an optical disk, and the like. The computer readable storage medium 11 may in some embodiments be an internal storage unit of the terminal 10, such as a hard disk of the terminal 10. The computer readable storage medium 11 may also be, in other embodiments, an external storage device of the terminal 10, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), etc. provided on the terminal 10. Further, the computer-readable storage medium 11 may also include both an internal storage unit and an external storage device of the terminal 10. The computer-readable storage medium 11 may be used not only to store application software and various types of data installed in the terminal 10 but also to temporarily store data that has been output or will be output.
The bus 13 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in FIG. 11, but this is not intended to represent only one bus or type of bus.
Further, the terminal 10 may also include a display assembly 14. The display component 14 may be a Light Emitting Diode (LED) display, a liquid crystal display, a touch-sensitive liquid crystal display, an Organic Light-Emitting Diode (OLED) touch panel, or the like. The display component 14 may also be referred to as a display device or display unit, as appropriate, for displaying information processed in the terminal 10 and for displaying a visual user interface, among other things.
Further, the terminal 10 may also include a communication component 15. The communication component 15 may optionally include a wired communication component and/or a wireless communication component, such as a WI-FI communication component, a bluetooth communication component, etc., typically used to establish a communication connection between the terminal 10 and other intelligent control devices.
The processor 12 may be, in some embodiments, a Central Processing Unit (CPU), controller, microcontroller, microprocessor or other data Processing chip for executing program codes stored in the computer-readable storage medium 11 or Processing data. Specifically, the processor 12 executes a processing program to control the terminal 10 to implement the training method of the classification model.
While fig. 11 illustrates only the terminal 10 having components 11-15 for implementing a training method for classification models, those skilled in the art will appreciate that the architecture illustrated in fig. 11 does not constitute a limitation of the terminal 10, and that the terminal 10 may include fewer or more components than those illustrated, or may combine certain components, or a different arrangement of components.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, insofar as these modifications and variations of the invention fall within the scope of the claims of the invention and their equivalents, the invention is intended to include these modifications and variations.
The above-mentioned embodiments are only examples of the present invention and should not be construed as limiting its scope; the scope of protection of the present invention is defined by the appended claims.

Claims (10)

1. A training method of a classification model is characterized by comprising the following steps:
training an initial classification model according to a first data set to obtain an intermediate classification model, wherein the first data set is a set of labeled data;
extracting feature vectors of a training data set by using the intermediate classification model, wherein the training data set comprises the first data set and a second data set, and the second data set is a set of unlabeled data;
constructing a nearest data graph and a farthest data graph according to the feature vectors of the training data set, wherein the nearest data graph is a relation graph of each data in the training data set and a plurality of other data which are closest to each data, and the farthest data graph is a relation graph of each data in the training data set and a plurality of other data which are farthest from each data;
obtaining predicted labels of the second data set according to the nearest data graph and the farthest data graph; and
training the intermediate classification model according to the first data set and the second data set with the predicted labels to obtain a target classification model.
2. The method for training a classification model according to claim 1, wherein the obtaining of the predicted labels of the second data set according to the nearest data graph and the farthest data graph specifically comprises:
constructing a neighbor similarity matrix according to the nearest data graph, wherein numerical values in the neighbor similarity matrix comprise the similarity of each data in the training data set with a plurality of other data which are nearest to the each data, the similarity of each data with the rest of the training data set except the plurality of other data which are nearest to the each data, and the similarity of the same data in the training data set;
constructing a distant similarity matrix according to the farthest data graph, wherein numerical values in the distant similarity matrix comprise the similarity of each data in the training data set and a plurality of other data which are farthest away from each data, the similarity of each data and the rest data in the training data set except the plurality of other data which are farthest away, and the similarity of the same data in the training data set; and
acquiring the predicted labels of the second data set according to the initial label matrix of the training data set, the neighbor similarity matrix, and the distant similarity matrix.
3. The method for training a classification model according to claim 2, wherein the obtaining the predicted label of the second data set according to the initial label matrix of the training data set, the neighboring similarity matrix, and the distant similarity matrix specifically comprises:
calculating to obtain a neighbor label matrix according to the initial label matrix and the neighbor similarity matrix;
calculating to obtain a distant label matrix according to the initial label matrix and the distant similarity matrix;
calculating to obtain a target label matrix according to the neighbor label matrix and the distant label matrix; and
acquiring the predicted label of each second data in the second data set according to the target label matrix.
4. The method for training a classification model according to claim 3, wherein obtaining the predicted label of each second data in the second data set according to the target label matrix specifically comprises:
selecting the maximum numerical value in the column corresponding to the second data in the target label matrix as a credible numerical value, wherein each numerical value in each column in the target label matrix represents the correlation between each data in the training data set and each preset category; and
forming a predicted label of the second data according to the credible numerical value, wherein the preset category corresponding to the credible numerical value is the category of the second data.
5. The method for training a classification model according to claim 2, wherein constructing the nearest data graph and the farthest data graph from the feature vectors of the training data set specifically comprises:
respectively calculating the distance between every two data in the training data set according to the feature vectors of the training data set;
sorting the distances corresponding to each data in order from small to large;
selecting a preset number of other data from the training data set from the data corresponding to the minimum distance as neighbor data of the data;
constructing the nearest data map from neighbor data of each data in the training data set;
selecting a preset number of other data from the training data set from the data corresponding to the maximum distance as distant data of the data; and
constructing the farthest data graph according to the distant data of each data in the training data set.
6. The method for training a classification model according to claim 5, wherein constructing a neighbor similarity matrix from the nearest data graph specifically comprises:
respectively taking the distance between each datum and the corresponding adjacent datum as the similarity of each datum and a plurality of other data closest to each datum;
setting the similarity of each data and the rest data except for the plurality of other data closest to the data in the training data set to be 0; and
setting the similarity of the same data in the training data set to be 0.
7. The method for training classification models according to claim 5, wherein constructing the distant similarity matrix according to the farthest data graph specifically comprises:
respectively taking the distance between each datum and the corresponding distant datum as the similarity of each datum and a plurality of other data with the farthest distance from each datum;
setting the similarity of each data and the rest data except for a plurality of other data with the farthest distance in the training data set to be 0; and
setting the similarity of the same data in the training data set to be 0.
8. The method for training a classification model according to claim 1, wherein training an initial classification model from a first data set to obtain an intermediate classification model specifically comprises:
training the initial classification model according to the first data set at a preset learning rate;
judging whether the training times of the initial classification model reach preset times or not; and
when the training times of the initial classification model reach the preset times, taking the initial classification model trained for the preset times as the intermediate classification model.
9. The method for training a classification model according to claim 1, wherein after extracting feature vectors of a training data set using the intermediate classification model, the method for training a classification model further comprises:
normalizing the feature vector; and
performing dimension reduction on the feature vector after the normalization processing.
10. A computer-readable storage medium for storing program instructions executable by a processor to implement a method of training a classification model according to any one of claims 1 to 9.
CN202210117469.8A 2022-02-08 2022-02-08 Method for training classification model and computer readable storage medium Pending CN114139658A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210117469.8A CN114139658A (en) 2022-02-08 2022-02-08 Method for training classification model and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210117469.8A CN114139658A (en) 2022-02-08 2022-02-08 Method for training classification model and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN114139658A true CN114139658A (en) 2022-03-04

Family

ID=80382124

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210117469.8A Pending CN114139658A (en) 2022-02-08 2022-02-08 Method for training classification model and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN114139658A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116257760A (en) * 2023-05-11 2023-06-13 浪潮电子信息产业股份有限公司 Data partitioning method, system, equipment and computer readable storage medium
CN116257760B (en) * 2023-05-11 2023-08-11 浪潮电子信息产业股份有限公司 Data partitioning method, system, equipment and computer readable storage medium

Similar Documents

Publication Publication Date Title
US20200302340A1 (en) Systems and methods for learning user representations for open vocabulary data sets
CN112164391A (en) Statement processing method and device, electronic equipment and storage medium
CN112819023B (en) Sample set acquisition method, device, computer equipment and storage medium
CN108830329B (en) Picture processing method and device
CN111898739B (en) Data screening model construction method, data screening method, device, computer equipment and storage medium based on meta learning
KR101472451B1 (en) System and Method for Managing Digital Contents
US20200364216A1 (en) Method, apparatus and storage medium for updating model parameter
CN114780746A (en) Knowledge graph-based document retrieval method and related equipment thereof
CN113254354A (en) Test case recommendation method and device, readable storage medium and electronic equipment
CN114724156B (en) Form identification method and device and electronic equipment
CN110781818A (en) Video classification method, model training method, device and equipment
CN113869464B (en) Training method of image classification model and image classification method
CN114139658A (en) Method for training classification model and computer readable storage medium
CN114021670A (en) Classification model learning method and terminal
CN114359582A (en) Small sample feature extraction method based on neural network and related equipment
CN110929647B (en) Text detection method, device, equipment and storage medium
CN112287140A (en) Image retrieval method and system based on big data
CN111709475A (en) Multi-label classification method and device based on N-grams
CN116069985A (en) Robust online cross-modal hash retrieval method based on label semantic enhancement
CN111666902B (en) Training method of pedestrian feature extraction model, pedestrian recognition method and related device
CN112132150B (en) Text string recognition method and device and electronic equipment
CN114373088A (en) Training method of image detection model and related product
CN114610953A (en) Data classification method, device, equipment and storage medium
CN113989596B (en) Training method of image classification model and computer readable storage medium
CN109670552B (en) Image classification method, device and equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20220304)