CN108304876B - Classification model training method and device and classification method and device - Google Patents


Info

Publication number
CN108304876B
CN108304876B
Authority
CN
China
Prior art keywords
feature vector
domain data
target domain
data
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810098964.2A
Other languages
Chinese (zh)
Other versions
CN108304876A (en)
Inventor
孙源良
樊雨茂
刘萌
Current Assignee
Guoxin Youe Data Co Ltd
Original Assignee
Guoxin Youe Data Co Ltd
Priority date
Filing date
Publication date
Application filed by Guoxin Youe Data Co Ltd
Priority to CN201810098964.2A
Publication of CN108304876A
Application granted
Publication of CN108304876B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks

Abstract

The invention provides a classification model training method and device and a classification method and device. The classification model training method comprises the following steps: performing common feature capture on source domain data and target domain data by using a first neural network, so that a first target domain feature vector learns the common features of the source domain data and the target domain data; performing difference feature capture on the source domain data and the target domain data by using a second neural network, so that a second target domain feature vector learns the difference features of the source domain data and the target domain data; clustering the first target domain feature vector and the second target domain feature vector respectively; and performing the current round of training on the first neural network and a first classifier according to the clustering results and a first classification result. Because the method exploits both the common features and the difference features between the source domain and the target domain, the trained classification model yields more accurate classification results.

Description

Classification model training method and device and classification method and device
Technical Field
The invention relates to the technical field of deep learning, in particular to a classification model training method and device and a classification method and device.
Background
Transfer learning trains a classification model that labels data of a target domain (target domain data) using labeled training samples from a known domain (source domain data), without requiring the source domain data and the target domain data to have the same distribution. In essence, transfer learning finds the relation between the data to be labeled and the known labeled data. For example, the source domain data and the target domain data may be mapped into a common space by a kernel function such that they share the same distribution in that space; a classifier can then be trained on the labeled source domain samples, as represented in that space, to label the target domain.
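The patent does not specify how "same distribution in the space" is measured; as a hedged illustration only, a common way to quantify how alike two mapped domains look is the maximum mean discrepancy (MMD) under a kernel, sketched here for one-dimensional toy samples (the function names and the RBF kernel choice are illustrative assumptions, not from the patent):

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """RBF kernel between two scalar points (toy illustration)."""
    return math.exp(-gamma * (x - y) ** 2)

def mmd_sq(source, target, gamma=1.0):
    """Squared maximum mean discrepancy between two 1-D samples.
    A value near zero suggests the two domains are hard to tell
    apart under the kernel, i.e. they 'share a distribution'."""
    def mean_k(a, b):
        return sum(rbf_kernel(x, y, gamma) for x in a for y in b) / (len(a) * len(b))
    return mean_k(source, source) + mean_k(target, target) - 2 * mean_k(source, target)
```

A small MMD between the mapped source and target samples corresponds to the condition the background paragraph describes, under which a source-trained classifier can transfer.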
In existing transfer learning methods, a single neural network is usually trained on both the source domain data and the target domain data, yielding a parameter-sharing network. This training process can find commonalities between the source domain data and the target domain data, usually by mapping both into a high-dimensional space in which they are highly comparable, thereby capturing the distribution characteristics of the two domains. Although such training exploits the features that the source domain data and the target domain data share, it largely discards the difference features between the source domain and the target domain, so the resulting classification model makes certain errors when classifying the target domain data.
Disclosure of Invention
In view of the above, an object of the embodiments of the present invention is to provide a classification model training method and apparatus and a classification method and apparatus that exploit both the common features and the difference features between a source domain and a target domain, so that the resulting classification model yields more accurate classification results.
In a first aspect, an embodiment of the present invention provides a classification model training method, where the method includes:
acquiring source domain data carrying a label and target domain data not carrying the label;
inputting the source domain data and the target domain data into a first neural network, extracting a first source domain feature vector for the source domain data, and extracting a first target domain feature vector for the target domain data;
performing common feature capture on the source domain data and the target domain data, so that the first target domain feature vector learns the common features of the source domain data and the target domain data;
inputting the first source domain feature vector into a first classifier to obtain a first classification result;
inputting the source domain data and the target domain data into a second neural network, and extracting a second target domain feature vector for the target domain data;
performing difference feature capture on the source domain data and the target domain data, so that the second target domain feature vector learns the difference features of the source domain data and the target domain data;
clustering the first target domain feature vector and the second target domain feature vector respectively;
performing a current round of training on the first neural network and the first classifier according to the clustering result and the first classification result;
and performing multi-round training on the first neural network and the first classifier to obtain a classification model.
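The patent defines no code for the steps above; as a structural sketch only, one round of the first-aspect procedure can be laid out as follows, where every callable is a placeholder supplied by the caller (none of these names come from the patent):

```python
def training_round(source, source_labels, target,
                   first_net, second_net, classifier, cluster, update):
    """One round of the first-aspect procedure (structural sketch).

    first_net / second_net: feature extractors (placeholders).
    classifier: the first classifier (placeholder).
    cluster: clustering routine applied to each feature set.
    update: parameter-update step for the first network and classifier.
    """
    # Extract first source/target domain feature vectors (common features).
    src_vec1 = [first_net(x) for x in source]
    tgt_vec1 = [first_net(x) for x in target]
    # Extract second target domain feature vectors (difference features).
    tgt_vec2 = [second_net(x) for x in target]
    # First classification result on the source-domain features.
    first_result = [classifier(v) for v in src_vec1]
    # Cluster the two target-domain feature sets separately.
    clusters1 = cluster(tgt_vec1)
    clusters2 = cluster(tgt_vec2)
    # Train the first network and classifier from both signals.
    update(first_result, source_labels, clusters1, clusters2)
    return first_result, clusters1, clusters2
```

Repeating this round multiple times, as the last step specifies, yields the classification model consisting of the first neural network and the first classifier.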
In a second aspect, an embodiment of the present invention further provides a classification method, where the method includes:
acquiring data to be classified;
inputting the data to be classified into a classification model obtained by the classification model training method provided by the embodiment of the application, and obtaining a classification result of the data to be classified;
wherein the classification model comprises: the first neural network and the first classifier.
In a third aspect, an embodiment of the present invention further provides a classification model training apparatus, including:
the acquisition module is used for acquiring source domain data carrying a label and target domain data not carrying the label;
the first processing module is used for inputting the source domain data and the target domain data into a first neural network, extracting a first source domain feature vector for the source domain data and extracting a first target domain feature vector for the target domain data; common feature capture is carried out on the source domain data and the target domain data, so that the first target domain feature vector learns the common features of the source domain data and the target domain data; inputting the first source domain feature vector into a first classifier to obtain a first classification result;
the second processing module is used for inputting the source domain data and the target domain data into a second neural network and extracting a second target domain feature vector for the target domain data; differential feature capture is carried out on the source domain data and the target domain data, so that a second target domain feature vector learns the differential features of the source domain data and the target domain data;
the clustering module is used for clustering the first target domain feature vector and the second target domain feature vector respectively;
the training module is used for carrying out the training of the first neural network and the first classifier in the current round according to the clustering result and the first classification result; and performing multi-round training on the first neural network and the first classifier to obtain a classification model.
In a fourth aspect, an embodiment of the present invention further provides a classification apparatus, including: a to-be-classified data acquisition module, used for acquiring data to be classified;
the classification module is used for inputting the data to be classified into the classification model obtained by the classification model training method provided by the embodiment of the application to obtain the classification result of the data to be classified; wherein, the classification model includes: a first neural network and a first classifier.
In a fifth aspect, an embodiment of the present invention further provides a computer device, where the computer device includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the classification model training method provided in the embodiment of the present application when executing the computer program.
In a sixth aspect, an embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program performs the steps of the classification model training method provided in this embodiment.
In the embodiments of the present invention, while the first neural network extracts a first source domain feature vector for the source domain data and a first target domain feature vector for the target domain data, it captures the common features of the source domain data and the target domain data, so that the first target domain feature vector learns those common features; meanwhile, the second neural network extracts a second target domain feature vector for the target domain data while capturing the difference features of the source domain data and the target domain data, so that the second target domain feature vector learns those difference features. The first neural network and the first classifier are then trained according to the captured common features and difference features to obtain a classification model. Because both the common features and the difference features between the source domain and the target domain are exploited, the classification model can obtain more accurate classification results.
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
FIG. 1 is a flow chart of a classification model training method according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a specific method for performing common feature capture on source domain data and target domain data in the classification model training method provided by the embodiment of the present invention;
fig. 3 is a flowchart illustrating a specific method for adjusting parameters of a first neural network according to a domain classification result in the classification model training method according to the embodiment of the present invention;
fig. 4 is a flowchart illustrating a specific method for performing differential feature capture on source domain data and target domain data in the classification model training method according to the embodiment of the present invention;
fig. 5 is a flowchart illustrating a specific method for performing parameter adjustment on a second neural network according to a domain classification result in the classification model training method according to the embodiment of the present invention;
fig. 6 is a flowchart illustrating a specific method for performing a current round of training on a first neural network according to a clustering result in the classification model training method according to the embodiment of the present invention;
FIG. 7 is a flowchart illustrating a specific method of similarity calculation operation in the classification model training method according to an embodiment of the present invention;
FIG. 8 is a flow chart of a classification method provided by an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of a classification model training apparatus according to an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of a classification apparatus provided in an embodiment of the present invention;
FIG. 11 is a schematic structural diagram of a computer device according to an embodiment of the present invention.
Detailed Description
Different from the prior art, in the embodiments of the present invention, while the first neural network extracts a first source domain feature vector for the source domain data and a first target domain feature vector for the target domain data, it captures the common features of the source domain data and the target domain data, so that the first target domain feature vector learns those common features; meanwhile, the second neural network extracts a second target domain feature vector for the target domain data while capturing the difference features of the source domain data and the target domain data, so that the second target domain feature vector learns those difference features. The first neural network and the first classifier are then trained according to the captured common features and difference features to obtain a classification model. Because both the common features and the difference features between the source domain and the target domain are exploited, the classification model can obtain more accurate classification results.
To facilitate understanding of the present embodiment, the classification model training method disclosed in the embodiments of the present invention is first described in detail. The method is used for training classification models for various kinds of data, and the resulting classification model can classify the corresponding data.
Referring to fig. 1, a method for training a classification model according to an embodiment of the present invention includes:
s101: and acquiring source domain data carrying a label and target domain data not carrying the label.
In a specific implementation, the source domain data are labeled data and the target domain data are unlabeled data. The two domains share certain commonalities and also differ in certain respects.
The source domain data, serving as training samples, are sufficient in quantity, whereas the target domain data, which carry the preset features and actually need to be classified, are insufficient in quantity as training samples or are difficult to train on directly. Transfer learning is therefore used so that the preset features are learned at the same time as the source domain data and fused with the source domain data features; likewise, the difference features between the source domain data and the target domain data are learned during source domain learning, so that the preset features and the difference features of the two domains are fused with the source domain data features. In this way the target domain feature space is fully learned, and the classification of the target domain data becomes more accurate.
Here, the source domain data and the target domain data may be images, videos, language data, or other data that can be classified using neural network learning.
For example, when the source domain data and the target domain data are both image data, the source domain data may be images of better quality, such as clear face images acquired by a higher-resolution image acquisition device under uniform illumination, with the faces unobstructed. The faces in the source domain data may be imaged from a variety of angles, such as front views, side views, oblique views, bottom views, and top views.
The target domain data are images with the preset features, such as images of poorer quality: unclear face images acquired by a lower-resolution image acquisition device under various non-uniform illumination conditions. The faces in the target domain data may likewise be imaged from a variety of angles.
For another example, when the source domain data and the target domain data are both language data, the source domain data may be French vocabulary and the target domain data Spanish vocabulary. Since French and Spanish both descend from Latin, the two share some common features; however, they are two different languages and thus also differ in certain respects. The features of Spanish are learned with the aid of recognizable French, so that Spanish can be recognized.
For another example, when the source domain data and the target domain data are language data, sentiment analysis may be performed on certain words or dialogues; the source domain data are words annotated with sentiment labels, and the target domain data are words without sentiment labels.
S102: inputting source domain data and target domain data into a first neural network, extracting a first source domain feature vector for the source domain data, and extracting a first target domain feature vector for the target domain data; common feature capture is carried out on the source domain data and the target domain data, so that the first target domain feature vector learns the common features of the source domain data and the target domain data; and inputting the first source domain feature vector into a first classifier to obtain a first classification result.
In a specific implementation, the first neural network may adopt a convolutional neural network (CNN) to extract a first source domain feature vector for the source domain data and a first target domain feature vector for the target domain data.
The source domain data carry labels indicating their correct classification results; the target domain data carry no labels. After the source domain data and the target domain data are input into the first neural network, the first neural network performs shared-parameter feature learning on both. In this process the first neural network learns from the source domain data with supervision and from the target domain data without supervision, and because the same network shares parameters across the two domains, its parameters are continuously adjusted during learning. The parameters of the first neural network are therefore influenced by the target domain data during training, so the first source domain feature vector extracted for each source domain datum is perturbed by the target domain data, which realizes inter-domain mixing of the source domain data and the target domain data.
Meanwhile, after the first source domain feature vector is extracted for the source domain data and the first target domain feature vector is extracted for the target domain data, the common features of the two domains are captured. During training, the first source domain feature vector is thus perturbed specifically by the features that the target domain data share with the source domain data, so that the inter-domain fusion of the source domain data and the target domain data fuses the common parts of the two domains, and the first target domain feature vector ultimately learns the common features of the source domain data and the target domain data.
Referring to fig. 2, the embodiment of the present invention further provides a specific method for performing common feature capture on source domain data and target domain data. The method comprises the following steps:
s201: after extracting a first source domain feature vector for the source domain data and a first target domain feature vector for the target domain data, performing gradient inversion processing on the first source domain feature vector and the first target domain feature vector.
S202: and inputting the first source domain feature vector and the first target domain feature vector subjected to gradient inverse processing into a first domain classifier.
S203: and adjusting parameters of the first neural network according to domain classification results of the source domain data and the target domain data respectively represented by the first source domain feature vector and the first target domain feature vector by the first domain classifier.
In implementation, training the first neural network on the source domain data and the target domain data is in effect a process of domain-mixing the two. That is, the first source domain feature vector, obtained by feature extraction on the source domain data with the first neural network, is influenced by features of the target domain data and moves closer to them; meanwhile, the first target domain feature vector, obtained by feature extraction on the target domain data with the first neural network, is influenced by features of the source domain data and moves closer to them. Therefore, to achieve domain mixing of the source domain data and the target domain data, after a first target domain feature vector is extracted for each target domain datum and a first source domain feature vector is extracted for each source domain datum, gradient inversion processing is applied to both feature vectors, the processed vectors are input into a first domain classifier, and the first domain classifier performs domain classification on them.
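The "gradient inversion processing" of steps S201 to S203 matches the standard gradient reversal trick used in adversarial domain adaptation. A minimal sketch, with all names illustrative rather than taken from the patent:

```python
class GradientReversal:
    """Gradient reversal layer: identity in the forward pass; the
    backward pass multiplies incoming gradients by -lam, so the
    feature extractor upstream is pushed to *confuse* the domain
    classifier that follows it (toy list-based sketch)."""

    def __init__(self, lam=1.0):
        self.lam = lam  # reversal strength

    def forward(self, features):
        return features  # pass features through unchanged

    def backward(self, upstream_grads):
        # Reverse (and scale) the gradient flowing back to the extractor.
        return [-self.lam * g for g in upstream_grads]
```

Because the reversed gradient flows back into the first neural network, the network is updated to make the first domain classifier's task harder, which is exactly the domain-mixing behavior described above.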
The more often the domain classification result is correct, that is, the higher the probability that the first domain classifier correctly classifies the first source domain feature vector and the first target domain feature vector, the smaller the degree of domain mixing; conversely, the more often the result is wrong, the larger the degree of domain mixing. The parameters of the first neural network are therefore adjusted based on the first domain classifier's classification of the source domain data and the target domain data represented by the first source domain feature vector and the first target domain feature vector.
Here, referring to fig. 3, the parameter adjustment of the first neural network according to the domain classification result may be implemented by performing the following domain classification loss determination operation:
s301: and determining the domain classification loss of the current domain classification of the source domain data and the target domain data respectively represented by the current first source domain feature vector and the first target domain feature vector.
Here, the degree of domain mixing is characterized by the domain classification loss. The domain classification loss of the source domain data is the number of source domain samples classified into the target domain when the source domain data and the target domain data are classified based on the first source domain feature vector and the first target domain feature vector; correspondingly, the domain classification loss of the target domain data is the number of target domain samples classified into the source domain. After the first domain classifier performs domain classification on the source domain data and the target domain data represented by the two feature vectors, a domain classification result is obtained, and the domain classification losses corresponding to the source domain data and the target domain data are determined from it.
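Taking the count-based definition above literally, the two domain classification losses can be sketched as follows (the labels and function name are illustrative assumptions, not from the patent):

```python
def domain_classification_losses(pred_domains, true_domains):
    """Count-based domain classification losses: how many source
    samples were classified into the target domain, and vice versa.
    Domains are the strings 'source' and 'target'."""
    src_loss = sum(1 for p, t in zip(pred_domains, true_domains)
                   if t == "source" and p == "target")
    tgt_loss = sum(1 for p, t in zip(pred_domains, true_domains)
                   if t == "target" and p == "source")
    return src_loss, tgt_loss
```

High counts mean the domain classifier is often fooled, i.e. the degree of domain mixing is large.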
S302: and generating third feedback information aiming at the fact that the difference between the domain classification losses of the latest preset times is not smaller than a preset difference threshold value, and carrying out parameter adjustment on the first neural network based on the third feedback information.
Here, the preset difference threshold constrains the degree of domain mixing. The first domain classifier stores the distribution of the domains to which the first source domain feature vector and the first target domain feature vector respectively belong. When the difference between the domain classification losses of the latest preset number of rounds is not smaller than the preset difference threshold, domain classification has not reached a stable state: in some rounds the first domain classifier can correctly distinguish the domains to which the two feature vectors belong, and in others it cannot, so the degree of domain mixing is unstable. The parameters of the first neural network therefore need to be adjusted: third feedback information, indicating that the domain classification loss difference is too large, is generated and fed back to the first neural network, which adjusts its parameters upon receiving it.
S303: based on the adjusted parameters, a first neural network is used for extracting a new first source domain feature vector for source domain data, a new first target domain feature vector for target domain data is extracted, domain classification loss determination operation is carried out until the difference is larger than a preset difference threshold value, and the current round of training of the first neural network based on a first domain classifier is completed.
Training the first neural network against the first domain classifier keeps the domain classification loss, determined from the first domain classifier's classification of the first source domain feature vector and the first target domain feature vector, near a certain value, so that it becomes as hard as possible to tell whether a datum belongs to the source domain or the target domain, and the common features of the two domains are thereby extracted.
It should be noted that when the difference between the domain classification losses of the latest preset number of rounds is smaller than the preset difference threshold, feedback information indicating a suitable domain classification loss is also generated and fed back to the first neural network. Upon receiving it, the first neural network adjusts its parameters by a smaller amplitude, descending the gradient toward a local optimum.
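The stopping condition of S302 and S303, namely that the recent domain classification losses differ from one another by less than the preset threshold, can be sketched as follows (a hedged reading of the text; the names are illustrative):

```python
def domain_mixing_stable(recent_losses, threshold):
    """Return True when the domain classification losses of the
    latest rounds differ by less than the preset threshold, i.e.
    domain mixing has reached a stable state and the current round
    of adversarial training can end."""
    spread = max(recent_losses) - min(recent_losses)
    return spread < threshold
```

While this returns False, S302 keeps generating the third feedback information and adjusting the first neural network's parameters; once it returns True, the current round completes.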
S103: inputting the source domain data and the target domain data into a second neural network, and extracting a second target domain feature vector for the target domain data; and carrying out difference feature capture on the source domain data and the target domain data, so that the second target domain feature vector learns the difference features of the source domain data and the target domain data.
In a specific implementation, the second Neural Network may use a Convolutional Neural Network (CNN) to extract a second source domain feature vector for the source domain data and a second target domain feature vector for the target domain data.
Optionally, the first neural network and the second neural network are structurally identical neural networks.
The source domain data is data carrying labels, where a label indicates the correct classification result of the source domain data; the target domain data carries no labels. After the source domain data and the target domain data are input into the second neural network, the second neural network performs feature learning on both with shared parameters: supervised learning on the source domain data and unsupervised learning on the target domain data. Because the same second neural network is used for both domains, its parameters are continuously adjusted under the influence of the target domain data during training. As a result, after the second neural network performs feature learning on the source domain data and the target domain data, the second source domain feature vector extracted for each piece of source domain data is perturbed by the target domain data, realizing inter-domain mixing of the source domain data and the target domain data.
Meanwhile, after a second source domain feature vector is extracted for the source domain data and a second target domain feature vector for the target domain data, the differential features of the source domain data and the target domain data are captured. Thus, while the second neural network is being trained, the second source domain feature vector extracted for the source domain data is perturbed by the target domain data, that is, by the differential features between the two domains. In realizing the inter-domain fusion of the source domain data and the target domain data, their differing portions are fused as well, and in the end the second target domain feature vector learns the differential features of the source domain data and the target domain data.
Before differential feature capture is performed on the source domain data and the target domain data, a second source domain feature vector is extracted for the source domain data.
Referring to fig. 4, the embodiment of the present invention further provides a specific method for performing differential feature capture on source domain data and target domain data. The method comprises the following steps:
s401: and inputting the second source domain feature vector and the second target domain feature vector into a second domain classifier.
S402: and adjusting parameters of the second neural network according to the domain classification results, by the second domain classifier, of the source domain data and the target domain data respectively characterized by the second source domain feature vector and the second target domain feature vector.
In specific implementation, the second source domain feature vector and the second target domain feature vector are not subjected to gradient reversal processing; they are input directly into the second domain classifier, which performs domain classification on the source domain data and the target domain data respectively characterized by the two feature vectors. The resulting domain classification loss is small, so the domains to which the source domain data and the target domain data belong can be distinguished as clearly as possible. In this way, the second neural network can capture the differential features between the source domain data and the target domain data, widening the distance between them.
Specifically, referring to fig. 5, the parameter adjustment of the second neural network according to the domain classification result can be realized by performing the following domain classification loss determination operation:
s501: and determining the domain classification loss of the current domain classification of the source domain data and the target domain data respectively represented by the current second source domain feature vector and the second target domain feature vector.
Here, the degree of domain mixing of the second source domain feature vector and the second target domain feature vector is characterized by the domain classification loss. The domain classification loss of the source domain data refers to the number of source domain data classified as target domain data when the source domain data and the target domain data are classified based on the second source domain feature vector and the second target domain feature vector; the domain classification loss of the target domain data refers to the number of target domain data classified as source domain data in the same process. After the domain classifier performs domain classification on the source domain data and the target domain data respectively characterized by the second source domain feature vector and the second target domain feature vector, a domain classification result is obtained, from which the domain classification losses corresponding to the source domain data and the target domain data are determined.
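The per-domain loss counts defined above can be sketched as follows; the 0 = source / 1 = target label convention is an assumption of this sketch, not stated in the text:

```python
import numpy as np

def domain_classification_losses(pred_domain, true_domain):
    """Per-domain classification losses as defined above: the number of
    source samples classified as target, and the number of target samples
    classified as source. Label convention (assumed): 0 = source, 1 = target."""
    pred = np.asarray(pred_domain)
    true = np.asarray(true_domain)
    source_loss = int(np.sum((true == 0) & (pred == 1)))  # source mistaken for target
    target_loss = int(np.sum((true == 1) & (pred == 0)))  # target mistaken for source
    return source_loss, target_loss

# 4 source samples then 4 target samples; two source samples and one
# target sample are misclassified.
true = [0, 0, 0, 0, 1, 1, 1, 1]
pred = [0, 1, 0, 1, 0, 1, 1, 1]
print(domain_classification_losses(pred, true))  # → (2, 1)
```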
S502: and generating fourth feedback information aiming at the condition that the domain classification result is wrong, and carrying out parameter adjustment on the second neural network based on the fourth feedback information.
Here, the correctness of the domain classification result must be ensured, because only when the domain classification result is correct is the distance between the source domain data and the target domain data widened, that is, only then are the differential features between the source domain data and the target domain data extracted. Therefore, when the domain classification result is incorrect, fourth feedback information is generated and parameter adjustment is performed on the second neural network based on the fourth feedback information.
S503: based on the adjusted parameters, a second neural network is used for extracting a new second source domain feature vector for the source domain data, extracting a new second target domain feature vector for the target domain data, and performing domain classification loss determination operation.
Until the domain classification result is correct, or the accuracy of the domain classification result reaches a preset threshold.
S104: and clustering the first target domain feature vector and the second target domain feature vector respectively.
S105: and performing the training of the current round on the first neural network and the first classifier according to the clustering result and the first classification result.
In specific implementation, performing the current round of training on the first neural network and the first classifier according to the clustering result and the first classification result means adjusting the parameters of the first neural network during training according to the clustering result, and adjusting the parameters of the first neural network and the first classifier during training according to the first classification result.
First, the parameters of the first neural network are adjusted during training according to the clustering result. The first neural network extracts a first target domain feature vector for the target domain data and the second neural network extracts a second target domain feature vector for the same data; since the final goal of training the model is to classify the target domain data as correctly as possible, the training of the first neural network and the second neural network must ensure that the distributions of the first target domain feature vectors and the second target domain feature vectors they respectively extract remain similar to within a certain range.
Here, the similarity of the distributions of the first target domain feature vector and the second target domain feature vector is understood as follows: the target domain data comprises a plurality of data items, all belonging to the target domain; a feature vector is extracted for each of them; and the distributions over the feature space of the first target domain feature vectors and the second target domain feature vectors of these data items are similar.
Therefore, the first target domain feature vector and the second target domain feature vector are clustered separately.
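The clustering step can be sketched with the distance-threshold rule described below in S601; the greedy transitive grouping here is an assumption, as the patent does not prescribe a specific clustering algorithm:

```python
import numpy as np

def threshold_cluster(points, dist_threshold):
    """Cluster feature vectors viewed as points in space: points within the
    preset distance threshold of one another end up in the same class
    (grouping is transitive, flood-fill style)."""
    n = len(points)
    labels = [-1] * n
    next_label = 0
    for i in range(n):
        if labels[i] != -1:
            continue
        labels[i] = next_label
        stack = [i]
        while stack:  # flood-fill the connected component of nearby points
            a = stack.pop()
            for b in range(n):
                if labels[b] == -1 and np.linalg.norm(points[a] - points[b]) <= dist_threshold:
                    labels[b] = next_label
                    stack.append(b)
        next_label += 1
    return labels

features = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
print(threshold_cluster(features, 0.5))  # → [0, 0, 1, 1]
```

The same routine would be applied separately to the first and second target domain feature vectors.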
Referring to fig. 6, when performing the current round of training on the first neural network according to the clustering result, the method includes:
s601: and generating a first adjacency matrix according to the clustering result of the first target domain feature vector.
Specifically, when clustering the first target domain feature vectors, each first target domain feature vector may be regarded as a point mapped into a high-dimensional space, and the points are clustered according to the distances between them, so that points whose distance falls within a preset threshold are grouped into the same class. A first adjacency matrix is then formed based on the clustering result.
In the first adjacency matrix, the entry for a pair of points is 1 if the two points belong to the same class in the clustering, and 0 if they do not.
For example, suppose there are 5 target domain data, and the obtained first target domain feature vectors are numbered 1, 2, 3, 4 and 5. If the result of clustering the first target domain feature vectors is {1,3}, {2}, {4,5}, the adjacency matrix formed is:
1 0 1 0 0
0 1 0 0 0
1 0 1 0 0
0 0 0 1 1
0 0 0 1 1
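The adjacency matrix of the example above can be constructed programmatically; a minimal sketch (indices written 0-based):

```python
import numpy as np

def adjacency_from_clusters(clusters, n):
    """Build the n x n adjacency matrix described above: entry (i, j) is 1
    if samples i and j fell into the same cluster, and 0 otherwise. Each
    sample trivially shares a cluster with itself, giving 1s on the
    diagonal."""
    A = np.zeros((n, n), dtype=int)
    for members in clusters:
        for i in members:
            for j in members:
                A[i, j] = 1
    return A

# Feature vectors 1..5 clustered as {1,3}, {2}, {4,5} (0-based: {0,2}, {1}, {3,4}).
A1 = adjacency_from_clusters([[0, 2], [1], [3, 4]], 5)
print(A1)
```

By construction the matrix is symmetric, since "same class" is a symmetric relation.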
s602: and generating a second adjacency matrix according to the clustering result of the second target domain feature vector.
Here, the method of generating the second adjacency matrix is similar to the method of generating the first adjacency matrix, and is not described herein again.
S603: and training the first neural network according to the similarity between the first adjacent matrix and the second adjacent matrix.
Here, when the first neural network is trained according to the similarity between the first adjacency matrix and the second adjacency matrix, the following similarity calculation operation is performed until the similarity measure between the first adjacency matrix and the second adjacency matrix is smaller than the preset similarity threshold.
Referring to fig. 7, the similarity calculation operation includes:
s701: and calculating the similarity between the first adjacency matrix and the second adjacency matrix which are obtained currently.
In specific implementation, when calculating the similarity between the currently obtained first adjacency matrix and second adjacency matrix, the trace of the first adjacency matrix and the trace of the second adjacency matrix are computed; the closer the two traces, the higher the similarity between the matrices. The difference between the trace of the first adjacency matrix and the trace of the second adjacency matrix may be taken as the similarity measure: the greater the absolute value of this difference, the lower the similarity between the first adjacency matrix and the second adjacency matrix.
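The trace-based measure can be sketched as follows. Note that, taken literally, adjacency matrices with 1s on the diagonal always have equal traces, so an overlap variant tr(A1·A2ᵀ) is also shown for contrast; that variant is an assumption of this sketch, not stated in the text:

```python
import numpy as np

def trace_difference(A1, A2):
    """Literal reading of the text: the absolute difference between the
    traces of the two adjacency matrices; larger means lower similarity."""
    return abs(int(np.trace(A1)) - int(np.trace(A2)))

def trace_overlap(A1, A2):
    """Assumed variant: tr(A1 @ A2.T) counts the sample pairs on which the
    two clusterings agree in placing samples in the same class; larger
    means the clusterings are more similar."""
    return int(np.trace(A1 @ A2.T))

# Clustering {1,3}, {2}, {4,5} vs. the all-singleton clustering of 5 samples.
A = np.array([[1, 0, 1, 0, 0],
              [0, 1, 0, 0, 0],
              [1, 0, 1, 0, 0],
              [0, 0, 0, 1, 1],
              [0, 0, 0, 1, 1]])
I = np.eye(5, dtype=int)
print(trace_difference(A, I))            # → 0 (both traces are 5)
print(trace_overlap(A, A), trace_overlap(A, I))  # → 9 5
```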
S702: and generating first feedback information aiming at the condition that the similarity is not less than a preset similarity threshold, and adjusting parameters of the first neural network and the second neural network based on the first feedback information.
S703: based on the adjusted parameters, extracting new first target domain feature vectors for the target domain data by using the first neural network, and extracting new second target domain feature vectors for the target domain data by using the second neural network;
s704: clustering the new first target domain feature vector to generate a new first adjacency matrix, clustering the new second target domain feature vector to generate a new second adjacency matrix, and performing the similarity calculation operation again.
Since a higher similarity between the first adjacency matrix and the second adjacency matrix means that the clustering result characterized by the first adjacency matrix on the first target domain feature vectors is closer to the clustering result characterized by the second adjacency matrix on the second target domain feature vectors, adjusting the parameters of the first neural network according to this similarity limits the influence of the differential features between the source domain data and the target domain data on the first neural network and accelerates model convergence.
Second, the parameters of the first neural network and the first classifier are adjusted during training according to the first classification result.
Specifically, when adjusting the parameters of the first neural network and the first classifier during training according to the first classification result, the following classification operation is performed until the obtained first classification result is correct, completing the current round of training of the first neural network and the first classifier based on the second neural network.
The classification operation includes:
classifying the currently extracted first source domain feature vector by using a first classifier;
generating second feedback information aiming at the condition that the classification result is wrong, and adjusting parameters of the first neural network and the first classifier according to the second feedback information;
and based on the adjusted parameters, extracting a new first source domain feature vector for the source domain data by using the first neural network, and executing the classification operation again.
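The classify / feedback / adjust loop above can be sketched with toy stand-ins; the synthetic batch of "first source domain feature vectors", the labels, and the logistic-regression "first classifier" are all assumptions of this sketch:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy labeled source-domain batch: 40 feature vectors with binary labels.
X = rng.normal(size=(40, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)

w = np.zeros(3)  # stand-in for the first classifier's parameters
lr = 0.5

def classify(X, w):
    """Threshold the logistic output at 0.5."""
    return (1.0 / (1.0 + np.exp(-(X @ w))) > 0.5).astype(float)

# Repeat: classify, and if the result is wrong, generate the error signal
# ("second feedback information") and adjust the parameters.
for step in range(500):
    if np.all(classify(X, w) == y):
        break  # classification result is correct: this round is complete
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    w -= lr * X.T @ (p - y) / len(y)  # gradient step on cross-entropy loss
```

In the patent the same feedback also adjusts the first neural network that produced the feature vectors; here only the classifier is updated for brevity.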
Here, since the classification model is meant to classify the target domain data and the accuracy of that classification must be ensured as much as possible, the first classification result is used to constrain the parameters of the first neural network and the first classifier while the differential features between the source domain and the target domain are exploited, ensuring that the classification result for the source domain data remains accurate.
S106: and performing multi-round training on the first neural network and the first classifier to obtain a classification model.
In specific implementation, performing one round of training on the first neural network and the first classifier means training them with one group of source domain data and target domain data. Multiple groups of source domain data and target domain data are then input in succession to train the first neural network and the first classifier until they meet the requirements, and the resulting first neural network and first classifier are taken as the trained classification model.
In the above process, there is no required execution order between S102 and S103.
In the classification model training method provided by the embodiment of the present invention, the first neural network extracts a first source domain feature vector for the source domain data and a first target domain feature vector for the target domain data while capturing the common features of the source domain data and the target domain data, so that the first target domain feature vector learns these common features. Meanwhile, the second neural network extracts a second target domain feature vector for the target domain data while capturing the differential features of the source domain data and the target domain data, so that the second target domain feature vector learns these differential features. The first neural network and the first classifier are then trained according to the captured common and differential features to obtain a classification model. In this way, both the common features and the differential features between the source domain and the target domain are utilized, enabling the classification model to obtain more accurate classification results.
In another embodiment, because the first neural network and the first classifier are trained based on the second neural network, the second neural network is trained simultaneously with the first neural network. When training the second neural network, first, its parameters are adjusted according to the clustering result; second, the accuracy of the second neural network's classification result for the source domain data is also ensured.
Therefore, after extracting a second source domain feature vector for the source domain data, the second source domain feature vector is input into a second classifier to obtain a second classification result;
and adjusting parameters of the second neural network according to a second classification result of the second classifier on the source domain data characterized by the second source domain feature vector.
Here, the parameter adjustment of the second neural network according to the second classification result of the second classifier on the source domain data characterized by the second source domain feature vector specifically includes:
executing the following classification operation until the obtained second classification result is correct, and finishing the training of the second neural network and the second classifier;
the classification operation includes:
classifying the currently extracted second source domain feature vector by using a second classifier;
generating fifth feedback information aiming at the condition that the classification result is wrong, and adjusting parameters of the second neural network and the second classifier according to the fifth feedback information;
based on the adjusted parameters, a new second source domain feature vector is extracted for the source domain data using the second neural network, and the classification operation is performed again.
The process of using the second classification result to perform parameter adjustment on the second neural network and the second classifier is similar to the process of using the first classification result to perform parameter adjustment on the first neural network and the first classifier, and is not described herein again.
Referring to fig. 8, an embodiment of the present invention further provides a classification method, where the method includes:
s801: acquiring data to be classified;
s802: inputting data to be classified into a classification model obtained by a classification model training method provided by the embodiment of the application to obtain a classification result of the data to be classified;
wherein, the classification model includes: a first neural network and a first classifier.
Based on the same inventive concept, the embodiment of the present invention further provides a classification model training apparatus corresponding to the classification model training method, and since the principle of problem solving of the apparatus in the embodiment of the present invention is similar to that of the classification model training method in the embodiment of the present invention, the implementation of the apparatus can refer to the implementation of the method, and repeated details are not repeated.
Referring to fig. 9, a classification model training apparatus provided in an embodiment of the present invention includes:
an obtaining module 901, configured to obtain source domain data carrying a tag and target domain data not carrying a tag;
a first processing module 902, configured to input source domain data and target domain data into a first neural network, extract a first source domain feature vector for the source domain data, and extract a first target domain feature vector for the target domain data; common feature capture is carried out on the source domain data and the target domain data, so that the first target domain feature vector learns the common features of the source domain data and the target domain data; inputting the first source domain feature vector into a first classifier to obtain a first classification result;
a second processing module 903, configured to input the source domain data and the target domain data into a second neural network, and extract a second target domain feature vector for the target domain data; differential feature capture is carried out on the source domain data and the target domain data, so that a second target domain feature vector learns the differential features of the source domain data and the target domain data;
a clustering module 904, configured to cluster the first target domain feature vector and the second target domain feature vector respectively;
the training module 905 is configured to perform a current training on the first neural network and the first classifier according to the clustering result and the first classification result; and performing multi-round training on the first neural network and the first classifier to obtain a classification model.
In the embodiment of the present invention, the first neural network extracts a first source domain feature vector for the source domain data and a first target domain feature vector for the target domain data while capturing the common features of the source domain data and the target domain data, so that the first target domain feature vector learns these common features. Meanwhile, the second neural network extracts a second target domain feature vector for the target domain data while capturing the differential features of the source domain data and the target domain data, so that the second target domain feature vector learns these differential features. The first neural network and the first classifier are then trained according to the captured common and differential features to obtain a classification model. In this way, both the common features and the differential features between the source domain and the target domain are utilized, enabling the classification model to obtain more accurate classification results.
Optionally, the first processing module 902 is configured to perform common feature capture on the source domain data and the target domain data as follows:
after extracting a first source domain feature vector for source domain data and a first target domain feature vector for target domain data, carrying out gradient reverse processing on the first source domain feature vector and the first target domain feature vector;
inputting the first source domain feature vector and the first target domain feature vector subjected to gradient inverse processing into a first domain classifier;
and adjusting parameters of the first neural network according to domain classification results of the source domain data and the target domain data respectively represented by the first source domain feature vector and the first target domain feature vector by the first domain classifier.
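The gradient reversal processing described above can be sketched with a linear toy model: during backpropagation, the gradient flowing from the domain classifier back into the feature extractor has its sign flipped, so the classifier learns to separate the domains while the extractor learns to mix them. A minimal numpy sketch under toy assumptions (linear stand-ins for the first neural network and the first domain classifier, synthetic Gaussian batches):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two small batches whose distributions differ by a mean shift.
Xs = rng.normal(0.0, 1.0, size=(64, 4))   # source domain data
Xt = rng.normal(2.0, 1.0, size=(64, 4))   # target domain data
X = np.vstack([Xs, Xt])
d = np.concatenate([np.zeros(64), np.ones(64)])  # 0 = source, 1 = target

W = rng.normal(0, 0.1, size=(4, 4))  # feature extractor parameters
w = rng.normal(0, 0.1, size=4)       # domain classifier parameters
lr, lam = 0.05, 1.0                  # lam scales the reversed gradient

for _ in range(200):
    F = X @ W                             # first source/target domain feature vectors
    p = 1.0 / (1.0 + np.exp(-(F @ w)))    # predicted probability of "target"
    g = (p - d) / len(d)                  # dLoss/dlogit for binary cross-entropy
    grad_w = F.T @ g                      # classifier gradient: learn to tell domains apart
    grad_W = X.T @ np.outer(g, w)         # extractor gradient before reversal
    w -= lr * grad_w
    W -= lr * (-lam * grad_W)             # reversed sign: extractor works to mix the domains
```

In framework code the same effect is usually obtained with a custom autograd layer that is the identity in the forward pass and negates gradients in the backward pass.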
Optionally, the second processing module 903 extracts a second source domain feature vector for the source domain data after inputting the source domain data and the target domain data into the second neural network and performing feature learning on the source domain data and the target domain data;
the second processing module 903 is configured to perform differential feature capture on the source domain data and the target domain data in the following manner:
inputting the second source domain feature vector and the second target domain feature vector into a second domain classifier;
and adjusting parameters of the second neural network according to the domain classification results, by the second domain classifier, of the source domain data and the target domain data respectively characterized by the second source domain feature vector and the second target domain feature vector.
Optionally, the system further comprises a second training module, configured to, after extracting a second source domain feature vector for the source domain data, input the second source domain feature vector into a second classifier to obtain a second classification result;
and adjusting parameters of the second neural network according to a second classification result of the second classifier on the source domain data characterized by the second source domain feature vector.
Optionally, the training module 905 is specifically configured to generate a first adjacency matrix according to a result of clustering the first target domain feature vector;
generating a second adjacency matrix according to the result of clustering the second target domain feature vector;
and performing the training of the current round on the first neural network and the first classifier according to the similarity between the first adjacency matrix and the second adjacency matrix and the first classification result.
Optionally, the training module 905 is specifically configured to: performing similarity calculation operation and classification operation until the similarity between the first adjacent matrix and the second adjacent matrix is smaller than a preset similarity threshold and the obtained first classification result is correct, and finishing the training of the first neural network and the first classifier based on the second neural network;
the similarity calculation operation includes:
calculating the similarity between the first adjacency matrix and the second adjacency matrix which are obtained currently;
generating first feedback information aiming at the condition that the similarity is not less than a preset similarity threshold, and adjusting parameters of the first neural network and the second neural network based on the first feedback information;
based on the adjusted parameters, extracting new first target domain feature vectors for the target domain data by using the first neural network, and extracting new second target domain feature vectors for the target domain data by using the second neural network;
clustering the new first target domain feature vector to generate a new first adjacency matrix, clustering the new second target domain feature vector to generate a new second adjacency matrix, and performing similarity calculation again;
the classification operation includes:
classifying the currently extracted first source domain feature vector by using a first classifier;
generating second feedback information aiming at the condition that the classification result is wrong, and adjusting parameters of the first neural network and the first classifier according to the second feedback information;
and based on the adjusted parameters, extracting a new first source domain feature vector for the source domain data by using the first neural network, and executing the classification operation again.
Optionally, the training module 905 is specifically configured to calculate a similarity between the currently obtained first adjacency matrix and the second adjacency matrix according to the following manner:
calculating a trace of the first adjacency matrix and a trace of the second adjacency matrix;
the difference between the traces of the first adjacency matrix and the traces of the second adjacency matrix is taken as the similarity between the first adjacency matrix and the second adjacency matrix.
Optionally, the second training module is further configured to perform parameter adjustment on the second neural network in the following manner:
the following domain classification loss determination operations are performed:
determining the domain classification loss of the current domain classification of the source domain data and the target domain data respectively represented by the current second source domain feature vector and the second target domain feature vector;
generating fourth feedback information aiming at the condition that the domain classification result is wrong, and adjusting parameters of the second neural network based on the fourth feedback information;
based on the adjusted parameters, a second neural network is used for extracting a new second source domain feature vector for the source domain data, extracting a new second target domain feature vector for the target domain data, and performing domain classification loss determination operation.
Still another embodiment of the present invention further provides a classification apparatus, as shown in fig. 10, the classification apparatus provided in the embodiment of the present invention includes:
a to-be-classified data acquisition module 1001 configured to acquire to-be-classified data;
the classification module 1002 is configured to input data to be classified into a classification model obtained by the classification model training method provided in the embodiment of the present application, so as to obtain a classification result of the data to be classified; wherein, the classification model includes: a first neural network and a first classifier.
Corresponding to the classification model training method in fig. 1, an embodiment of the present invention further provides a computer device, as shown in fig. 11, the device includes a memory 1000, a processor 2000 and a computer program stored in the memory 1000 and executable on the processor 2000, wherein the processor 2000 implements the steps of the classification model training method when executing the computer program.
Specifically, the memory 1000 and the processor 2000 may be general-purpose memories and processors, which are not specifically limited herein. When the processor 2000 runs the computer program stored in the memory 1000, the above classification model training method can be executed, solving the problem that classification models trained using only the common features between source domain data and target domain data, without their differential features, classify the target domain data erroneously. Thus both the common features and the differential features between the source domain and the target domain are utilized, and the trained classification model can obtain more accurate classification results.
Corresponding to the classification model training method in fig. 1, an embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to perform the steps of the classification model training method.
Specifically, the storage medium can be a general-purpose storage medium, such as a removable disk or a hard disk. When the computer program on the storage medium is run, the above classification model training method can be executed. This solves the problem that, in existing classification model training, only the features shared by the source domain data and the target domain data can be utilized, while the difference features between the source domain data and the target domain data cannot, which causes errors when the classification model classifies target domain data. As a result, both the common features and the difference features between the source domain and the target domain are utilized, and the trained classification model yields a more accurate classification result.
The classification model training method and apparatus, and the computer program product of the classification method and apparatus provided in the embodiments of the present invention include a computer-readable storage medium storing a program code, where instructions included in the program code may be used to execute the method described in the foregoing method embodiments, and specific implementations may refer to the method embodiments, and are not described herein again.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (5)

1. A classification model training method, comprising:
the method comprises the steps of obtaining source domain data with a label and target domain data without the label, wherein the source domain data comprise at least one of a sample face image, a sample video and a sample language vocabulary;
inputting the source domain data and the target domain data into a first neural network, extracting a first source domain feature vector of the source domain data, and extracting a first target domain feature vector of the target domain data; and
performing common feature capture on the source domain data and the target domain data, so that the first target domain feature vector learns common features of the source domain data and the target domain data;
Inputting the first source domain feature vector into a first classifier to obtain a first classification result;
inputting the source domain data and the target domain data into a second neural network, and extracting a second target domain feature vector of the target domain data; and
performing difference feature capture on the source domain data and the target domain data, so that the second target domain feature vector learns difference features of the source domain data and the target domain data;
clustering the first target domain feature vector and the second target domain feature vector respectively;
performing a current round of training on the first neural network and the first classifier according to the clustering result and the first classification result;
performing multiple rounds of training on the first neural network and the first classifier to obtain a classification model;
performing common feature capture on the source domain data and the target domain data in the following manner:
after extracting the first source domain feature vector from the source domain data and the first target domain feature vector from the target domain data, performing gradient reversal processing on the first source domain feature vector and the first target domain feature vector;
inputting the first source domain feature vector and the first target domain feature vector subjected to gradient reversal processing into a first domain classifier;
performing parameter adjustment on the first neural network according to the first domain classifier's domain classification results for the source domain data and the target domain data, as respectively represented by the first source domain feature vector and the first target domain feature vector;
after the inputting the source domain data and the target domain data into a second neural network and performing feature learning on the source domain data and the target domain data, the method further includes:
extracting a second source domain feature vector for the source domain data;
performing differential feature capture on the source domain data and the target domain data as follows:
inputting the second source domain feature vector and a second target domain feature vector into a second domain classifier;
performing parameter adjustment on the second neural network according to the second domain classifier's domain classification results for the source domain data and the target domain data, as respectively characterized by the second source domain feature vector and the second target domain feature vector;
performing a current training round on the first neural network and the first classifier according to the result of clustering and the first classification result, specifically including:
generating a first adjacency matrix according to a clustering result of the first target domain feature vector;
generating a second adjacency matrix according to the result of clustering the second target domain feature vector;
performing a current training on the first neural network and the first classifier according to the similarity between the first adjacency matrix and the second adjacency matrix and the first classification result;
the performing a training round on the first neural network and the first classifier according to the similarity between the first adjacency matrix and the second adjacency matrix and the first classification result specifically includes:
performing the similarity calculation operation and the classification operation until the similarity between the first adjacency matrix and the second adjacency matrix is smaller than a preset similarity threshold and the obtained first classification result is correct, thereby completing the current round of training of the first neural network and the first classifier based on the second neural network;
the similarity calculation operation includes:
calculating the similarity between the first adjacency matrix and the second adjacency matrix which are obtained currently;
generating first feedback information aiming at the condition that the similarity is not less than a preset similarity threshold, and carrying out parameter adjustment on the first neural network and the second neural network based on the first feedback information;
based on the adjusted parameters, extracting a new first target domain feature vector corresponding to the target domain data by using the first neural network, and extracting a new second target domain feature vector corresponding to the target domain data by using the second neural network;
clustering the new first target domain feature vector to generate a new first adjacency matrix, clustering the new second target domain feature vector to generate a new second adjacency matrix, and executing the similarity calculation operation again;
the classification operation includes:
classifying the first source domain feature vector currently extracted using the first classifier;
generating second feedback information aiming at the condition that the classification result is wrong, and adjusting the parameters of the first neural network and the first classifier according to the second feedback information;
based on the adjusted parameters, extracting a new first source domain feature vector for the source domain data by using a first neural network, and executing the classification operation again;
acquiring data to be classified;
and inputting the data to be classified into the classification model to obtain a classification result of the data to be classified.
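The "gradient reverse processing" recited in claim 1 is commonly realized as a gradient reversal layer: the identity in the forward pass, a negated gradient in the backward pass, so the first neural network learns domain-invariant (common) features by confusing the first domain classifier. A minimal sketch of that pattern follows, with the scaling factor `lam` as an assumed hyperparameter not taken from the patent:

```python
import numpy as np

def grl_forward(features):
    # Forward pass: identity — feature vectors reach the first domain
    # classifier unchanged.
    return features

def grl_backward(grad_from_domain_classifier, lam=1.0):
    # Backward pass: the gradient flowing back into the first neural
    # network is negated (and scaled by lam). Minimizing the domain
    # classification loss downstream therefore *maximizes* domain
    # confusion upstream, pushing the extracted features toward what the
    # source domain and the target domain share.
    return -lam * grad_from_domain_classifier

g = np.array([0.5, -1.0, 2.0])          # gradient from the domain classifier
reversed_g = grl_backward(g, lam=1.0)   # gradient seen by the feature extractor
```

In an autograd framework this would be written as a custom forward/backward function; the two small functions above only show the sign flip that makes common-feature capture work.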
2. The method of claim 1, further comprising, after extracting the second source domain feature vector of the source domain data:
inputting the second source domain feature vector into a second classifier to obtain a second classification result;
and adjusting parameters of the second neural network according to a second classification result of the second classifier on the source domain data characterized by the second source domain feature vector.
3. The method according to claim 1, wherein the calculating the similarity between the currently obtained first adjacency matrix and the second adjacency matrix specifically includes:
calculating the trace of the first adjacency matrix and the trace of the second adjacency matrix;
taking the difference between the trace of the first adjacency matrix and the trace of the second adjacency matrix as the similarity between the first adjacency matrix and the second adjacency matrix.
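Claims 1 and 3 can be illustrated with a small sketch: build an adjacency matrix from each clustering of the target domain feature vectors, then compare the two matrices. Claim 3's similarity measure (the difference of the traces) is implemented literally below; note that for the 0/1 "same-cluster" adjacency used here the diagonal is constant, so practical variants often compare matrix products or off-diagonal structure instead. The helper names are hypothetical.

```python
import numpy as np

def adjacency_from_clusters(labels):
    # A[i, j] = 1 when target-domain samples i and j fall in the same
    # cluster, 0 otherwise.
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(float)

def trace_similarity(a1, a2):
    # Claim 3, taken literally: the similarity is the difference between
    # the trace of the first adjacency matrix and that of the second.
    return abs(float(np.trace(a1)) - float(np.trace(a2)))

# Clusterings of the same five target-domain samples by the two networks
a_first = adjacency_from_clusters([0, 0, 1, 1, 2])   # from first network
a_second = adjacency_from_clusters([0, 1, 1, 2, 2])  # from second network
sim = trace_similarity(a_first, a_second)
```

When the two clusterings agree in structure, the adjacency matrices coincide and the similarity measure is driven below the preset threshold, which is the stopping condition of the similarity calculation operation in claim 1.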
4. The method according to claim 1, wherein the parameter adjustment of the first neural network according to the domain classification results of the source domain data and the target domain data respectively characterized by a first source domain feature vector and a first target domain feature vector by the first domain classifier specifically comprises:
the following domain classification loss determination operations are performed:
determining the domain classification loss of the current domain classification of the source domain data and the target domain data respectively represented by the current first source domain feature vector and the first target domain feature vector;
generating third feedback information aiming at the fact that the difference between the domain classification losses of the latest preset times is not smaller than a preset difference threshold value, and carrying out parameter adjustment on the first neural network based on the third feedback information;
based on the adjusted parameters, extracting a new first source domain feature vector corresponding to the source domain data and a new first target domain feature vector corresponding to the target domain data by using the first neural network, and executing the domain classification loss determination operation again, until the difference is smaller than the preset difference threshold, thereby completing the current round of training of the first neural network based on the first domain classifier.
5. The method according to claim 1, wherein the performing parameter adjustment on the second neural network according to the domain classification results of the source domain data and the target domain data respectively characterized by a second source domain feature vector and a second target domain feature vector of the second domain classifier specifically includes:
the following domain classification loss determination operations are performed:
determining the domain classification loss of the current domain classification of the source domain data and the target domain data respectively represented by the current second source domain feature vector and the second target domain feature vector;
generating fourth feedback information aiming at the condition that the domain classification result is wrong, and carrying out parameter adjustment on the second neural network based on the fourth feedback information;
and based on the adjusted parameters, extracting a new second source domain feature vector corresponding to the source domain data and a new second target domain feature vector corresponding to the target domain data by using a second neural network, and executing the domain classification loss determination operation.
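Claims 4 and 5 both describe repeating a domain-classification-loss determination and adjusting network parameters until successive losses stop changing by more than a preset threshold. A toy sketch of that stopping rule follows; the quadratic loss and the numeric gradient are stand-ins for the real domain classification loss and backpropagation, and all names are illustrative.

```python
import numpy as np

def domain_loss(params):
    # Toy stand-in for the domain classification loss as a function of the
    # network's parameters (a real setup would backpropagate a
    # cross-entropy loss from the domain classifier).
    return float(np.sum(params ** 2))

def numerical_grad(f, params, eps=1e-6):
    # Central-difference gradient, standing in for backpropagation.
    grad = np.zeros_like(params)
    for i in range(params.size):
        step = np.zeros_like(params)
        step[i] = eps
        grad[i] = (f(params + step) - f(params - step)) / (2 * eps)
    return grad

def adjust_until_plateau(f, params, lr=0.1, diff_threshold=1e-6, max_rounds=500):
    # Repeat the domain classification loss determination; stop once the
    # change between successive losses falls below the preset threshold.
    prev = f(params)
    loss = prev
    for _ in range(max_rounds):
        # "feedback information": the loss gradient drives the parameter
        # adjustment of the stand-in network
        params = params - lr * numerical_grad(f, params)
        loss = f(params)
        if abs(prev - loss) < diff_threshold:
            break
        prev = loss
    return params, loss

params, final_loss = adjust_until_plateau(domain_loss, np.array([1.0, -2.0]))
```

The loop converges quickly on this toy loss; in the patented scheme the same plateau test decides when a round of training of the first (claim 4) or second (claim 5) neural network is complete.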
CN201810098964.2A 2018-01-31 2018-01-31 Classification model training method and device and classification method and device Active CN108304876B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810098964.2A CN108304876B (en) 2018-01-31 2018-01-31 Classification model training method and device and classification method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810098964.2A CN108304876B (en) 2018-01-31 2018-01-31 Classification model training method and device and classification method and device

Publications (2)

Publication Number Publication Date
CN108304876A CN108304876A (en) 2018-07-20
CN108304876B true CN108304876B (en) 2021-07-06

Family

ID=62850685

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810098964.2A Active CN108304876B (en) 2018-01-31 2018-01-31 Classification model training method and device and classification method and device

Country Status (1)

Country Link
CN (1) CN108304876B (en)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109214421B (en) * 2018-07-27 2022-01-28 创新先进技术有限公司 Model training method and device and computer equipment
CN109189921B (en) * 2018-08-07 2021-09-07 创新先进技术有限公司 Comment evaluation model training method and device
CN110825853B (en) * 2018-08-07 2023-05-26 阿里巴巴集团控股有限公司 Data training method and device
CN109409896B (en) * 2018-10-17 2020-04-10 北京芯盾时代科技有限公司 Bank fraud recognition model training method, bank fraud recognition method and device
CN111090753B (en) * 2018-10-24 2020-11-20 马上消费金融股份有限公司 Training method of classification model, classification method, device and computer storage medium
CN109522867A (en) * 2018-11-30 2019-03-26 国信优易数据有限公司 A kind of video classification methods, device, equipment and medium
CN111600734B (en) * 2019-02-21 2021-11-02 烽火通信科技股份有限公司 Network fault processing model construction method, fault processing method and system
CN110070535A (en) * 2019-04-23 2019-07-30 东北大学 A kind of retinal vascular images dividing method of Case-based Reasoning transfer learning
CN110210561B (en) * 2019-05-31 2022-04-01 北京市商汤科技开发有限公司 Neural network training method, target detection method and device, and storage medium
US11113829B2 (en) * 2019-08-20 2021-09-07 GM Global Technology Operations LLC Domain adaptation for analysis of images
CN110728294A (en) * 2019-08-30 2020-01-24 北京影谱科技股份有限公司 Cross-domain image classification model construction method and device based on transfer learning
CN113065662A (en) * 2020-01-02 2021-07-02 阿里巴巴集团控股有限公司 Data processing method, self-learning system and electronic equipment
CN112000072A (en) * 2020-08-25 2020-11-27 唐山惠米智能家居科技有限公司 Intelligent bathroom system with face recognition function and control method
CN112184640A (en) * 2020-09-15 2021-01-05 中保车服科技服务股份有限公司 Image detection model construction method and device and image detection method and device
CN112116025A (en) * 2020-09-28 2020-12-22 北京嘀嘀无限科技发展有限公司 User classification model training method and device, electronic equipment and storage medium
CN112634048B (en) * 2020-12-30 2023-06-13 第四范式(北京)技术有限公司 Training method and device for money backwashing model
CN112836753A (en) * 2021-02-05 2021-05-25 北京嘀嘀无限科技发展有限公司 Methods, apparatus, devices, media and products for domain adaptive learning
CN113343087A (en) * 2021-06-09 2021-09-03 南京星云数字技术有限公司 Method and system for acquiring marketing user
CN113535951B (en) * 2021-06-21 2023-02-17 深圳大学 Method, device, terminal equipment and storage medium for information classification

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101794396A (en) * 2010-03-25 2010-08-04 西安电子科技大学 System and method for recognizing remote sensing image target based on migration network learning
JP2013229656A (en) * 2012-04-24 2013-11-07 Nippon Telegr & Teleph Corp <Ntt> Mail processing method and system
CN104318214A (en) * 2014-10-27 2015-01-28 中国科学院自动化研究所 Cross view angle face recognition method based on structuralized dictionary domain transfer
US9122664B2 (en) * 2007-10-29 2015-09-01 International Business Machines Corporation Method for automatically creating transforms
EP2993618A1 (en) * 2014-09-04 2016-03-09 Xerox Corporation Domain adaptation for image classification with class priors
CN105469109A (en) * 2015-11-19 2016-04-06 中国地质大学(武汉) Transfer learning method based on class centroid alignment and for remote sensing image classification
CN105894074A (en) * 2016-04-15 2016-08-24 大连声鹭科技有限公司 Anti-counterfeiting bar code label, anti-counterfeiting bar code label information acquisition device, acquisition method, and anti-counterfeiting verification system
CN106157375A (en) * 2016-07-06 2016-11-23 南京大学 A kind of threedimensional model component categories automatic marking method
CN106469560A (en) * 2016-07-27 2017-03-01 江苏大学 A kind of speech-emotion recognition method being adapted to based on unsupervised domain
CN106599922A (en) * 2016-12-16 2017-04-26 中国科学院计算技术研究所 Transfer learning method and transfer learning system for large-scale data calibration
CN106980876A (en) * 2017-03-13 2017-07-25 南京邮电大学 A kind of zero sample image recognition methods learnt based on distinctive sample attribute

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8069182B2 (en) * 2006-04-24 2011-11-29 Working Research, Inc. Relevancy-based domain classification
US20170220951A1 (en) * 2016-02-02 2017-08-03 Xerox Corporation Adapting multiple source classifiers in a target domain

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9122664B2 (en) * 2007-10-29 2015-09-01 International Business Machines Corporation Method for automatically creating transforms
CN101794396A (en) * 2010-03-25 2010-08-04 西安电子科技大学 System and method for recognizing remote sensing image target based on migration network learning
JP2013229656A (en) * 2012-04-24 2013-11-07 Nippon Telegr & Teleph Corp <Ntt> Mail processing method and system
EP2993618A1 (en) * 2014-09-04 2016-03-09 Xerox Corporation Domain adaptation for image classification with class priors
CN104318214A (en) * 2014-10-27 2015-01-28 中国科学院自动化研究所 Cross view angle face recognition method based on structuralized dictionary domain transfer
CN105469109A (en) * 2015-11-19 2016-04-06 中国地质大学(武汉) Transfer learning method based on class centroid alignment and for remote sensing image classification
CN105894074A (en) * 2016-04-15 2016-08-24 大连声鹭科技有限公司 Anti-counterfeiting bar code label, anti-counterfeiting bar code label information acquisition device, acquisition method, and anti-counterfeiting verification system
CN106157375A (en) * 2016-07-06 2016-11-23 南京大学 A kind of threedimensional model component categories automatic marking method
CN106469560A (en) * 2016-07-27 2017-03-01 江苏大学 A kind of speech-emotion recognition method being adapted to based on unsupervised domain
CN106599922A (en) * 2016-12-16 2017-04-26 中国科学院计算技术研究所 Transfer learning method and transfer learning system for large-scale data calibration
CN106980876A (en) * 2017-03-13 2017-07-25 南京邮电大学 A kind of zero sample image recognition methods learnt based on distinctive sample attribute

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A Novel Model of Selecting High Quality Pseudo-Relevance Feedback Documents using Classification Approach for Query Expansion; Aditi Sharan et al.; 2015 IEEE Workshop on Computational Intelligence; 20151231; pp. 1-5 *
Research on Label Noise Based on Ensemble Semi-Supervised Learning; Jin Long; China Masters' Theses Full-text Database (Information Science and Technology); 20131215; pp. I140-91 *

Also Published As

Publication number Publication date
CN108304876A (en) 2018-07-20

Similar Documents

Publication Publication Date Title
CN108304876B (en) Classification model training method and device and classification method and device
CN111709409B (en) Face living body detection method, device, equipment and medium
US11842487B2 (en) Detection model training method and apparatus, computer device and storage medium
CN108460415B (en) Language identification method
US10013636B2 (en) Image object category recognition method and device
CN109670504B (en) Handwritten answer recognition and correction method and device
WO2019238063A1 (en) Text detection and analysis method and apparatus, and device
CN109583429B (en) Method and device for correcting application questions in test paper in batches
CN108288051B (en) Pedestrian re-recognition model training method and device, electronic equipment and storage medium
CN107633226B (en) Human body motion tracking feature processing method
CN109712043B (en) Answer correcting method and device
CN109086654B (en) Handwriting model training method, text recognition method, device, equipment and medium
CN104463101A (en) Answer recognition method and system for textual test question
CN111950528B (en) Graph recognition model training method and device
CN105303179A (en) Fingerprint identification method and fingerprint identification device
CN107622271B (en) Handwritten text line extraction method and system
CN109993021A (en) The positive face detecting method of face, device and electronic equipment
CN108961358B (en) Method and device for obtaining sample picture and electronic equipment
CN114255159A (en) Handwritten text image generation method and device, electronic equipment and storage medium
CN112036514B (en) Image classification method, device, server and computer readable storage medium
CN108549857B (en) Event detection model training method and device and event detection method
CN112347997A (en) Test question detection and identification method and device, electronic equipment and medium
CN116091836A (en) Multi-mode visual language understanding and positioning method, device, terminal and medium
CN106709490B (en) Character recognition method and device
CN111753684B (en) Pedestrian re-recognition method using target posture for generation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 101-8, 1st floor, building 31, area 1, 188 South Fourth Ring Road West, Fengtai District, Beijing

Applicant after: Guoxin Youyi Data Co., Ltd

Address before: 100070, No. 188, building 31, headquarters square, South Fourth Ring Road West, Fengtai District, Beijing

Applicant before: SIC YOUE DATA Co.,Ltd.

GR01 Patent grant
GR01 Patent grant