WO2020147409A1 - Text classification method and apparatus, computer device, and storage medium - Google Patents

Text classification method and apparatus, computer device, and storage medium

Info

Publication number
WO2020147409A1
Authority
WO
WIPO (PCT)
Prior art keywords
classifier
sentiment
level
word vector
classifiers
Prior art date
Application number
PCT/CN2019/118342
Other languages
French (fr)
Chinese (zh)
Inventor
金戈
徐亮
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 filed Critical 平安科技(深圳)有限公司
Publication of WO2020147409A1 publication Critical patent/WO2020147409A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A text classification method and apparatus, a computer device, and a storage medium. The method comprises the following steps: S10, constructing word vectors and converting an input text into vector form; S20, separately inputting the word vectors from S10 into at least two groups of sentiment classifiers, and outputting the fully connected layer of each sentiment classifier to its own loss function, so that each sentiment classifier selects different sentiment features according to the different classification requirements of the business; and S30, cross-learning and updating the sentiment classifiers, adding the loss functions with equal weights into LOSSes as an overall loss function according to the number of sentiment classifiers, so that multi-label classification can be implemented by means of cross-learning of multiple classifiers, achieving better generalization and calibration.

Description

Text classification method, apparatus, computer device, and storage medium
This application claims priority to Chinese patent application No. CN201910038962.9, filed on January 14, 2019 and titled "Text classification method, apparatus, computer device, and storage medium", the entire content of which is incorporated herein by reference.
Technical field
This application relates to the field of natural language processing, and in particular to a text classification method, apparatus, computer device, and storage medium based on contextual word vectors and deep learning.
Background
With the rapid development of the Internet, online information has grown explosively, and sentiment analysis (opinion mining) now reaches into almost every aspect of daily life: online shopping platforms such as JD.com, Taobao, and Amazon; online music platforms; social networks such as Weibo and Twitter; news media; political elections; and more. For example, online shopping has become part of everyday life: opinion mining and sentiment analysis of user reviews on shopping websites not only help users better understand and choose products, but also help manufacturers understand user needs and improve their own products. Likewise, users' opinions and emotions about trending events on Weibo can be mined and analyzed to observe the quality of life, interests, and habits of modern people.
At present, text classification tasks such as sentiment analysis mostly handle a single classification aspect, building a separate model for each classification. Classifying along multiple aspects usually requires multiple models, or multiple fully connected layers to connect the classifiers, which leads to excessive computation, long training times, and accuracy and generalization ability that fall short of requirements.
Summary of the invention
The purpose of this application is to provide a multi-loss-function text classification method, apparatus, computer device, and storage medium that solve the problems of the prior art and provide better learning and generalization capabilities.
To achieve the above objective, this application provides a multi-loss-function text classification method, which includes the following steps:
S10: Construct word vectors, converting the input text into word vector form;
S20: Input the word vectors from S10 into at least two groups of sentiment classifiers for training; after training on the word vectors, each sentiment classifier outputs its fully connected layer to its own loss function, and each sentiment classifier selects different sentiment features according to the different classification requirements of the business;
S30: Cross-learn and update the sentiment classifiers: according to the number of sentiment classifiers, add the individual loss functions with equal weights into LOSSes as the overall loss function, and update the sentiment classifiers according to the overall loss function until the overall loss function no longer decreases.
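Stated as a formula (a restatement of S30 for clarity; N denotes the number of sentiment classifiers and is not a symbol used in the original text):
LOSSes = (Loss_1 + Loss_2 + ... + Loss_N) / N
In the two-classifier embodiment described below, this reduces to LOSSes = 0.5 * Loss_RNN + 0.5 * Loss_CNN.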
To achieve the above objective, this application also provides a text classification apparatus, which includes:
a word vector construction module, used to convert the input text into word vector form;
a word vector input module and preliminary classification module, used to input the word vectors into at least two groups of sentiment classifiers and to output each sentiment classifier's fully connected layer to its own loss function, each sentiment classifier selecting different sentiment features according to the different classification requirements of the business; and
an overall loss function acquisition and update module, used to add the loss functions with equal weights into LOSSes as the overall loss function and to update the sentiment classifiers based on the overall loss function until the overall loss function no longer decreases.
To achieve the above objective, this application also provides a computer device, including a memory, a processor, and a computer program stored in the memory and runnable on the processor, where the processor, when executing the computer program, implements the following steps of the text classification method:
a word vector construction module, used to convert the input text into word vector form;
a word vector input module and preliminary classification module, used to input the word vectors into at least two groups of sentiment classifiers and to output each sentiment classifier's fully connected layer to its own loss function, each sentiment classifier selecting different sentiment features according to the different classification requirements of the business; and
an overall loss function acquisition and update module, used to add the loss functions with equal weights into LOSSes as the overall loss function and to update the sentiment classifiers based on the overall loss function until the overall loss function no longer decreases.
To achieve the above objective, this application also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the following steps of the text classification method are implemented:
a word vector construction module, used to convert the input text into word vector form;
a word vector input module and preliminary classification module, used to input the word vectors into at least two groups of sentiment classifiers and to output each sentiment classifier's fully connected layer to its own loss function, each sentiment classifier selecting different sentiment features according to the different classification requirements of the business; and
an overall loss function acquisition and update module, used to add the loss functions with equal weights into LOSSes as the overall loss function and to update the sentiment classifiers based on the overall loss function until the overall loss function no longer decreases.
This application provides a text classification method, apparatus, computer device, and storage medium in which the input text is converted into at least two branches. The sentiment classifier of each branch determines multiple groups of different sentiment features according to the different classification requirements of the business; the sentiment classifiers finally converge at the fully connected layer and are trained with separate, independent loss functions. Through cross-learning of the multiple classifiers (that is, updating the sentiment classifiers according to the overall loss function until the overall loss function no longer decreases), the models of all channels can be updated at the same time, giving higher accuracy. At the same time, the method can predict sentiment collocations that never appeared in the training set: compared with an original model that predicts n kinds, it can predict n*n sentiment collocations, and therefore has better generalization ability.
Brief description of the drawings
FIG. 1 is a flowchart of an embodiment of the text classification method of this application;
FIG. 2 is a flowchart of another embodiment of the text classification method of this application;
FIG. 3 is a schematic diagram of the program modules of an embodiment of the text classification apparatus of this application;
FIG. 4 is a schematic diagram of the hardware structure of an embodiment of the text classification apparatus of this application.
Detailed description
To make the purpose, technical solutions, and advantages of this application clearer, the application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain this application and are not intended to limit it. Based on the embodiments of this application, all other embodiments obtained by those of ordinary skill in the art without creative work fall within the scope of protection of this application.
Embodiment 1
This application discloses a text classification method based on multiple loss functions which, as shown in FIG. 1, includes the following steps:
S10: Construct word vectors, converting the input text into vector form. In step S10, the word2vec tool is used to obtain semantic word vectors for the text. In this embodiment, the CBOW model is used, and the language prediction model is maximized by adjusting the values of the hidden matrix of the neural network.
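As an illustration only: the text specifies word2vec with the CBOW model but no particular library, so a minimal sketch using gensim is shown here; the corpus, vector size, and window are placeholder choices, not values from the original text.

```python
# Hypothetical sketch: CBOW word vectors via gensim (library choice assumed).
from gensim.models import Word2Vec

# Pre-tokenized placeholder corpus; in practice this would be the segmented input text.
sentences = [["the", "product", "arrived", "quickly"],
             ["the", "screen", "is", "dim", "and", "disappointing"]]

# sg=0 selects the CBOW architecture described in this step.
model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, sg=0)

vec = model.wv["product"]   # 100-dimensional word vector for one token
```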
S20: Input the word vectors from S10 into at least two groups of sentiment classifiers for training; after training on the word vectors, each sentiment classifier outputs its fully connected layer to its own loss function, and each sentiment classifier selects different sentiment features according to the different classification requirements of the business. The loss function of each sentiment classifier may be a cross-entropy loss function.
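For reference, the standard cross-entropy form (the general definition, which the original text does not spell out): for a predicted class distribution ŷ over C classes and a one-hot label y,
Loss = -Σ_{c=1..C} y_c * log(ŷ_c)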
S30: Cross-learn and update the sentiment classifiers: according to the number of sentiment classifiers, add the individual loss functions with equal weights into LOSSes as the overall loss function, and update the sentiment classifiers according to the overall loss function until the overall loss function no longer decreases.
In the text classification method provided by this application, the input text is converted into at least two branches. The sentiment classifier of each branch determines multiple groups of different sentiment features according to the different classification requirements of the business; the sentiment classifiers finally converge at the fully connected layer and are trained with separate, independent loss functions, and cross-learning of the multiple classifiers then achieves multi-label classification with better generalization or calibration. Specifically, the above text classification method can update the models of multiple channels at the same time and therefore has higher accuracy; at the same time, it can predict sentiment collocations that never appeared in the training set. Compared with an original model that predicts n kinds, it can predict n*n sentiment collocations, and therefore has better generalization ability.
As a preferred solution, two groups of sentiment classifiers are set up: a first-level sentiment classifier and a second-level sentiment classifier. The first-level sentiment classifier is used to classify the sentiment of the input text as positive or negative; the second-level sentiment classifier is used to classify the specific emotion type of the input text. Referring to FIG. 2, the text classification method of this embodiment includes the following steps:
S11: Construct word vectors, converting the input text into vector form.
S21: Use the word vectors from S11 as the input of the first-level and second-level sentiment classifiers, and output the fully connected layers of the first-level and second-level sentiment classifiers to their respective loss functions.
In this embodiment, the first-level sentiment classifier may be used to classify the sentiment of the input text as positive or negative, and the second-level sentiment classifier may be used to classify the specific emotion type of the input text. After the first-level and second-level sentiment classifiers are trained on the word vectors, their fully connected layers are output to their respective loss functions.
In this embodiment, in step S21, the first-level sentiment classifier is built on TextRNN combined with an attention mechanism; TextCNN or TextRCNN may also be used in place of this TextRNN-plus-attention scheme. The second-level sentiment classifier is built on TextCNN; TextRNN may also be used in place of this TextCNN scheme.
In building the first-level sentiment classifier on TextRNN with the attention mechanism, each node h_t in the TextRNN is assigned a weight α_t, updating its value to h_new,t = α_t * h_t so that the encoded word vectors receive a weighted boost. The weight α_t is:
α_t = exp(u_t^T u_w) / Σ_t' exp(u_t'^T u_w)
where u_w and u_t are weights to be set and are determined in the same way: u_t = tanh(W_w h_t + b_w), where W_w and b_w are the attention weight and bias.
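A minimal, non-authoritative sketch of this attention step follows, assuming PyTorch (the original text names no framework; the bidirectional GRU encoder, hidden size, and class count are placeholder choices):

```python
# Hypothetical sketch: TextRNN with the attention weighting described above.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextRNNAttention(nn.Module):
    def __init__(self, embed_dim=100, hidden_dim=128, num_classes=2):
        super().__init__()
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True,
                          bidirectional=True)
        # W_w and b_w: the attention weight and bias from the formula above.
        self.W_w = nn.Linear(2 * hidden_dim, 2 * hidden_dim)
        # u_w: the context weight vector used to score each node.
        self.u_w = nn.Parameter(torch.randn(2 * hidden_dim))
        self.fc = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, x):                    # x: (batch, seq, embed_dim)
        h, _ = self.rnn(x)                   # h_t for every time step
        u = torch.tanh(self.W_w(h))          # u_t = tanh(W_w h_t + b_w)
        scores = u @ self.u_w                # u_t^T u_w, shape (batch, seq)
        alpha = F.softmax(scores, dim=1)     # α_t over the sequence
        h_new = alpha.unsqueeze(-1) * h      # h_new,t = α_t * h_t
        return self.fc(h_new.sum(dim=1))     # fully connected layer output
```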
The second-level sentiment classifier is built on TextCNN. TextCNN consists of Conv, an activation function, BN, and MaxPooling, where Conv is a convolutional layer used to capture local correlations in the text, the activation function adds a nonlinear transformation to strengthen the network's generalization ability, BN prevents gradient dispersion so that the model converges better and faster, and MaxPooling maximizes local features and reduces the amount of computation. The specific steps are as follows:
(1) Use Conv to perform a convolution operation on the input word vectors; in this embodiment, 1D filters of 6 sizes (lengths) are selected for the convolution.
(2) After the convolution, perform BN, that is, batch normalization; the calculation is:
X1 = W * X
X2 = (X1 - μ) / √(σ² + ε)
X3 = γ * X2 + β
where μ and σ² in X2 are the mean and variance, that is, the mean and variance of a hidden variable over the samples selected for a given computation, ε is a small constant for numerical stability, and γ and β in X3 are the offset and scaling hyperparameters.
(3) Enter the activation function; here the ReLU activation function is selected.
(4) Enter MaxPooling, the pooling layer, which selects the largest value in the vector as the representative output.
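A minimal sketch of such a TextCNN branch under the same assumptions (PyTorch; the six filter sizes and the filter count shown are placeholders, since the embodiment selects six sizes but does not list them):

```python
# Hypothetical sketch: TextCNN with Conv -> BN -> ReLU -> MaxPooling.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextCNN(nn.Module):
    def __init__(self, embed_dim=100, num_filters=64, num_classes=4,
                 filter_sizes=(1, 2, 3, 4, 5, 6)):   # six 1D filter sizes
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, num_filters, k) for k in filter_sizes)
        self.bns = nn.ModuleList(
            nn.BatchNorm1d(num_filters) for _ in filter_sizes)
        self.fc = nn.Linear(num_filters * len(filter_sizes), num_classes)

    def forward(self, x):                  # x: (batch, seq, embed_dim)
        x = x.transpose(1, 2)              # Conv1d expects (batch, dim, seq)
        pooled = []
        for conv, bn in zip(self.convs, self.bns):
            h = F.relu(bn(conv(x)))        # steps (1) Conv, (2) BN, (3) ReLU
            pooled.append(h.max(dim=2).values)   # (4) MaxPooling over time
        return self.fc(torch.cat(pooled, dim=1))  # fully connected layer
```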
After the word vectors from S21 have each been processed as above, the fully connected layers of the first-level and second-level sentiment classifiers are output to their respective loss functions; the loss functions of both the first-level and second-level sentiment classifiers are cross-entropy loss functions.
S31: Cross-learn and update the first-level and second-level sentiment classifiers: add the loss functions of the first-level and second-level sentiment classifiers with equal weights into LOSSes as the overall loss function, and then, according to the overall loss function, update the hyperparameters of the two channels, the first-level sentiment classifier and the second-level sentiment classifier, until the overall loss function no longer decreases; at that point the model has converged and training is complete, where:
LOSSes = 0.5 * Loss_RNN + 0.5 * Loss_CNN.
That is, by adding the loss function LOSSes (the weighted sum of the two loss functions), both classifiers are updated, where Loss_RNN and Loss_CNN are the cross-entropy losses of the first-level and second-level sentiment classifiers, respectively.
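A hedged sketch of this joint update, reusing the two placeholder modules sketched above (the Adam optimizer, learning rate, and data loader are assumptions, not from the original text); the point is that a single backward pass on the combined LOSSes updates both channels at once:

```python
# Hypothetical sketch: cross-learning with the combined loss LOSSes.
import torch
import torch.nn as nn

rnn_clf = TextRNNAttention(num_classes=2)   # first level: positive/negative
cnn_clf = TextCNN(num_classes=4)            # second level: emotion type
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(
    list(rnn_clf.parameters()) + list(cnn_clf.parameters()), lr=1e-3)

def train_epoch(loader):
    total = 0.0
    for vecs, y_polarity, y_emotion in loader:   # word vectors + two labels
        loss_rnn = criterion(rnn_clf(vecs), y_polarity)
        loss_cnn = criterion(cnn_clf(vecs), y_emotion)
        losses = 0.5 * loss_rnn + 0.5 * loss_cnn  # LOSSes, equal weights
        optimizer.zero_grad()
        losses.backward()                # one step updates both channels
        optimizer.step()
        total += losses.item()
    return total   # stop training once this no longer decreases
```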
Using this method, the models of both channels can be updated at the same time, giving higher accuracy, and the method can predict sentiment collocations that never appeared in the training set. For example, suppose the training set contains 10 collocations such as 'sadness + plaintiveness', while our two models predict 4 classes each; then we can predict 4*4 = 16 collocations instead of only 10, and so obtain better generalization ability.
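To make the counting concrete, a tiny illustration with invented label names (the class names below are not from the original text):

```python
# Hypothetical illustration of the n*n collocation space.
from itertools import product

first_level = ["f1", "f2", "f3", "f4"]    # 4 classes from model one
second_level = ["s1", "s2", "s3", "s4"]   # 4 classes from model two

collocations = list(product(first_level, second_level))
print(len(collocations))   # 16 = 4*4, versus the 10 seen in training
```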
In the above embodiment, the input text is converted into two branches, one used for positive/negative sentiment classification of the text and the other for classification of the text's specific emotion type (sadness, calm, and so on). The branches finally converge at the fully connected layer and are trained with two independent loss functions, and classification is then achieved through cross-learning of the two classifiers, giving better learning and generalization ability. When the classifications of the two channels are similar, this also has a certain calibration effect, which improves the accuracy of the model.
Embodiment 2
Continuing to refer to FIG. 2, this embodiment shows a text classification apparatus. In this embodiment, the text classification apparatus 10 may include, or be divided into, one or more program modules, which are stored in a storage medium and executed by one or more processors to complete this application and implement the above text classification method. A program module in this application refers to a series of computer program instruction segments capable of completing specific functions, which describes the execution of the text classification apparatus 10 in the storage medium better than the program itself can. The following description specifically introduces the functions of each program module of this embodiment (a module skeleton is sketched after the list):
This application also discloses a text classification apparatus, including:
a word vector construction module 11, used to convert the input text into word vector form;
a word vector input module 21, a preliminary classification module, used to input the word vectors into at least two groups of sentiment classifiers and to output each sentiment classifier's fully connected layer to its own loss function, each sentiment classifier selecting different sentiment features according to the different classification requirements of the business; and
an overall loss function acquisition and update module, used to add the loss functions with equal weights into LOSSes as the overall loss function and to update the sentiment classifiers based on the overall loss function until the overall loss function no longer decreases.
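Purely as an illustration of this module split (all class and method names below are invented, and the classifier and word vector objects are assumed to behave like the placeholder sketches in Embodiment 1):

```python
# Hypothetical skeleton of the three program modules (names invented).
class WordVectorConstructionModule:
    def __init__(self, w2v_model):
        self.w2v = w2v_model                  # e.g. the CBOW model sketched above

    def build(self, tokens):
        return [self.w2v.wv[t] for t in tokens if t in self.w2v.wv]

class WordVectorInputModule:
    def __init__(self, classifiers):          # at least two sentiment classifiers
        self.classifiers = classifiers

    def forward_all(self, vectors):
        # Each classifier's fully connected layer feeds its own loss function.
        return [clf(vectors) for clf in self.classifiers]

class OverallLossModule:
    def combine(self, losses):
        # Equal-weight addition into LOSSes, per the number of classifiers.
        return sum(losses) / len(losses)
```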
The text classification apparatus provided by this application converts the input text into at least two branches. The sentiment classifier of each branch determines multiple groups of different sentiment features according to the different classification requirements of the business; the sentiment classifiers finally converge at the fully connected layer and are trained with separate, independent loss functions, and classification is then achieved through cross-learning of the multiple classifiers. The models of the two channels can be updated at the same time, giving higher accuracy; at the same time, the apparatus can predict sentiment collocations that never appeared in the training set, and compared with an original model that predicts n kinds, it can predict n*n sentiment collocations, and therefore has better generalization ability.
As a preferred solution, the word vector construction module 11 uses word2vec to construct the word vectors, and the word2vec tool is used to obtain semantic word vectors for the text. In this embodiment, the CBOW model is used, and the language prediction model is maximized by adjusting the values of the hidden matrix of the neural network.
As a preferred solution, the loss functions in the word vector input module 21 are all cross-entropy loss functions.
As a preferred solution, in the word vector input module 21, a first-level sentiment classifier and a second-level sentiment classifier are set up; the word vectors from S1 serve as the input of the first-level and second-level sentiment classifiers, and the fully connected layers of the first-level and second-level sentiment classifiers are output to their respective loss functions. In this embodiment, the first-level sentiment classifier may be used to classify the sentiment of the input text as positive or negative, and the second-level sentiment classifier may be used to classify the specific emotion type of the input text.
Further, in the word vector input module 21, TextCNN or TextRCNN may also be used in place of the above TextRNN-plus-attention scheme; the second-level sentiment classifier is built on TextCNN, and TextRNN may also be used in place of this TextCNN scheme.
In building the first-level sentiment classifier on TextRNN with the attention mechanism, each node h_t in the TextRNN is assigned a weight α_t, updating its value to h_new,t = α_t * h_t so that the encoded word vectors receive a weighted boost. The weight α_t is:
α_t = exp(u_t^T u_w) / Σ_t' exp(u_t'^T u_w)
where u_t = tanh(W_w h_t + b_w), and W_w, u_w, and b_w are the attention weights and bias.
The second-level sentiment classifier is built on TextCNN. TextCNN consists of Conv, an activation function, BN, and MaxPooling, where Conv is a convolutional layer used to capture local correlations in the text, the activation function adds a nonlinear transformation to strengthen the network's generalization ability, BN prevents gradient dispersion so that the model converges better and faster, and MaxPooling maximizes local features and reduces the amount of computation. The specific steps are as follows:
(1) Use Conv to perform a convolution operation on the input word vectors; in this embodiment, 1D filters of 6 sizes are selected for the convolution.
(2) After the convolution, perform BN, that is, batch normalization; the calculation is:
X1 = W * X
X2 = (X1 - μ) / √(σ² + ε)
X3 = γ * X2 + β
where μ and σ² in X2 are the mean and variance, ε is a small constant for numerical stability, and γ and β in X3 are the offset and scaling hyperparameters.
(3) Enter the activation function; here the ReLU activation function is selected.
(4) Enter MaxPooling, the pooling layer, which selects the largest value in the vector as the representative output.
After the word vectors have each been processed as above, the fully connected layers of the first-level and second-level sentiment classifiers are output to their respective loss functions; the loss functions of both the first-level and second-level sentiment classifiers are cross-entropy loss functions.
Correspondingly, in the overall loss function acquisition and update module 31, the first-level and second-level sentiment classifiers are cross-learned and updated: the loss functions of the first-level and second-level sentiment classifiers are added with equal weights into LOSSes as the overall loss function, where:
LOSSes = 0.5 * Loss_RNN + 0.5 * Loss_CNN.
That is, by adding the loss function LOSSes (the weighted sum of the two loss functions), both classifiers are updated.
In the above embodiment, the input text is converted into two branches, one used for positive/negative sentiment classification of the text and the other for classification of the text's specific emotion type (sadness, calm, and so on). The branches finally converge at the fully connected layer and are trained with two independent loss functions, and classification is then achieved through cross-learning of the two classifiers, giving better learning and generalization ability. When the classifications of the two channels are similar, this also has a certain calibration effect, which improves the accuracy of the model.
Embodiment 3
As shown in FIG. 3, this embodiment also provides a computer device capable of executing a program, such as a smartphone, tablet computer, notebook computer, desktop computer, rack server, blade server, tower server, or cabinet server (including an independent server, or a server cluster composed of multiple servers). The computer device 20 of this embodiment includes at least, but is not limited to, a memory 21 and a processor 22 that can be communicatively connected to each other through a system bus, as shown in FIG. 3. It should be pointed out that FIG. 3 only shows the computer device 20 with components 21-22, but it should be understood that not all of the illustrated components are required; more or fewer components may be implemented instead.
In this embodiment, the memory 21 (that is, a readable storage medium) includes flash memory, a hard disk, a multimedia card, a card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the memory 21 may be an internal storage unit of the computer device 20, such as a hard disk or internal memory of the computer device 20. In other embodiments, the memory 21 may also be an external storage device of the computer device 20, for example a plug-in hard disk, smart media card (SMC), secure digital (SD) card, or flash card equipped on the computer device 20. Of course, the memory 21 may also include both the internal storage unit of the computer device 20 and its external storage device. In this embodiment, the memory 21 is generally used to store the operating system and the various application software installed on the computer device 20, such as the program code of the text classification apparatus 10 of Embodiment 1. In addition, the memory 21 may also be used to temporarily store various types of data that have been output or are to be output.
In some embodiments, the processor 22 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 22 is generally used to control the overall operation of the computer device 20. In this embodiment, the processor 22 is used to run the program code or process the data stored in the memory 21, for example to run the text classification apparatus 10, so as to implement the text classification method of Embodiment 1.
Embodiment 4
As shown in FIG. 4, this embodiment further provides a computer-readable storage medium, such as flash memory, hard disk, multimedia card, card-type memory (e.g., SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk, server, or application store, on which a computer program is stored that implements the corresponding functions when executed by a processor. The computer-readable storage medium of this embodiment is used to store the text classification apparatus 10, which, when executed by a processor, implements the text classification method of Embodiment 1.
The sequence numbers of the above embodiments of the present application are for description only and do not indicate the relative merits of the embodiments.
From the description of the above embodiments, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, or, of course, by hardware alone, although in many cases the former is the better implementation.
The above are only preferred embodiments of the present application and do not limit the scope of its patent protection. Any equivalent structural or process transformation made using the contents of the description and drawings of this application, whether applied directly or indirectly in other related technical fields, is likewise included within the patent protection scope of this application.

Claims (20)

  1. A text classification method, comprising the following steps:
    S10: constructing word vectors by converting input text into word vector form;
    S20: inputting the word vectors of S10 into at least two sets of sentiment classifiers for training, wherein after training on the word vectors, each sentiment classifier outputs its fully connected layer to its respective loss function, and each sentiment classifier selects different sentiment features according to the different classification requirements of the business;
    S30: cross-learning and updating the sentiment classifiers: according to the number of sentiment classifiers, adding the loss functions with equal weights into LOSSes as an overall loss function, and updating the sentiment classifiers based on the overall loss function until the overall loss function no longer decreases.
  2. The text classification method according to claim 1, wherein in step S10, word2vec is used to construct the word vectors.
  3. The text classification method according to claim 1, wherein in step S20, a first-level sentiment classifier and a second-level sentiment classifier are provided, the word vectors of S10 serve as the input to the first-level sentiment classifier and the second-level sentiment classifier, and the fully connected layers of the first-level sentiment classifier and the second-level sentiment classifier are output to their respective loss functions.
  4. The text classification method according to claim 3, wherein in step S20, the first-level sentiment classifier is established based on TextRNN combined with an attention mechanism;
    and/or the second-level sentiment classifier is established based on TextCNN.
  5. The text classification method according to claim 3, wherein in the first-level sentiment classifier, each node h_t in the TextRNN is assigned a weight α_t and updated to h_newt = α_t·h_t, so as to apply a weight bonus to the encoded word vectors, the weight α_t being:
    α_t = exp(u_tᵀ·u_w) / Σ_t exp(u_tᵀ·u_w)
    where u_t = tanh(W_w·h_t + b_w), and W_w, u_w, and b_w are the weights and bias of the attention mechanism.
  6. The text classification method according to claim 3, wherein in step S30, LOSSes is:
    Losses = 0.5·Loss_RNN + 0.5·Loss_CNN.
  7. The text classification method according to claim 1 or 3, wherein each of the loss functions is a cross-entropy loss function.
  8. A text classification apparatus, comprising:
    a word vector construction module for converting input text into word vector form;
    a word vector input and preliminary classification module for inputting the word vectors into at least two sets of sentiment classifiers respectively and outputting the fully connected layer of each sentiment classifier to its respective loss function, each sentiment classifier selecting different sentiment features according to the different classification requirements of the business;
    an overall loss function acquisition and update module for adding the loss functions with equal weights to form LOSSes as an overall loss function and updating the sentiment classifiers based on the overall loss function until the overall loss function no longer decreases.
  9. The text classification apparatus according to claim 8, wherein in the word vector construction module, word2vec is used to construct the word vectors.
  10. The text classification apparatus according to claim 8, wherein the word vector input module is provided with a first-level sentiment classifier and a second-level sentiment classifier, the word vectors serve as the input to the first-level sentiment classifier and the second-level sentiment classifier, and the fully connected layers of the first-level sentiment classifier and the second-level sentiment classifier are output to their respective loss functions.
  11. The text classification apparatus according to claim 8, wherein in the word vector input module, the first-level sentiment classifier is established based on TextRNN combined with an attention mechanism;
    and/or the second-level sentiment classifier is established based on TextCNN.
  12. The text classification apparatus according to claim 11, wherein in the first-level sentiment classifier, each node h_t in the TextRNN is assigned a weight α_t and updated to h_newt = α_t·h_t, so as to apply a weight bonus to the encoded word vectors, the weight α_t being:
    α_t = exp(u_tᵀ·u_w) / Σ_t exp(u_tᵀ·u_w)
    where u_t = tanh(W_w·h_t + b_w), and W_w, u_w, and b_w are the weights and bias of the attention mechanism.
  13. The text classification apparatus according to claim 8, wherein in the overall loss function acquisition and update module, LOSSes is:
    Losses = 0.5·Loss_RNN + 0.5·Loss_CNN.
  14. The text classification apparatus according to claim 8, wherein each of the loss functions is a cross-entropy loss function.
  15. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the following steps of a text classification method:
    constructing word vectors by converting input text into word vector form;
    inputting the word vectors into at least two sets of sentiment classifiers for training, wherein after training on the word vectors, each sentiment classifier outputs its fully connected layer to its respective loss function, and each sentiment classifier selects different sentiment features according to the different classification requirements of the business;
    cross-learning and updating the sentiment classifiers: according to the number of sentiment classifiers, adding the loss functions with equal weights into LOSSes as an overall loss function, and updating the sentiment classifiers based on the overall loss function until the overall loss function no longer decreases.
  16. The computer device according to claim 15, wherein a first-level sentiment classifier and a second-level sentiment classifier are provided, the word vectors serve as the input to the first-level sentiment classifier and the second-level sentiment classifier, and the fully connected layers of the first-level sentiment classifier and the second-level sentiment classifier are output to their respective loss functions.
  17. The computer device according to claim 16, wherein in the first-level sentiment classifier, each node h_t in the TextRNN is assigned a weight α_t and updated to h_newt = α_t·h_t, so as to apply a weight bonus to the encoded word vectors, the weight α_t being:
    α_t = exp(u_tᵀ·u_w) / Σ_t exp(u_tᵀ·u_w)
    where u_t = tanh(W_w·h_t + b_w), and W_w, u_w, and b_w are the weights and bias of the attention mechanism.
  18. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the following steps of a text classification method:
    constructing word vectors by converting input text into word vector form;
    inputting the word vectors into at least two sets of sentiment classifiers for training, wherein after training on the word vectors, each sentiment classifier outputs its fully connected layer to its respective loss function, and each sentiment classifier selects different sentiment features according to the different classification requirements of the business;
    cross-learning and updating the sentiment classifiers: according to the number of sentiment classifiers, adding the loss functions with equal weights into LOSSes as an overall loss function, and updating the sentiment classifiers based on the overall loss function until the overall loss function no longer decreases.
  19. The computer-readable storage medium according to claim 18, wherein a first-level sentiment classifier and a second-level sentiment classifier are provided, the word vectors serve as the input to the first-level sentiment classifier and the second-level sentiment classifier, and the fully connected layers of the first-level sentiment classifier and the second-level sentiment classifier are output to their respective loss functions.
  20. The computer-readable storage medium according to claim 19, wherein in the first-level sentiment classifier, each node h_t in the TextRNN is assigned a weight α_t and updated to h_newt = α_t·h_t, so as to apply a weight bonus to the encoded word vectors, the weight α_t being:
    α_t = exp(u_tᵀ·u_w) / Σ_t exp(u_tᵀ·u_w)
    where u_t = tanh(W_w·h_t + b_w), and W_w, u_w, and b_w are the weights and bias of the attention mechanism.
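As a concrete reading of the attention weighting recited in claims 5, 12, 17, and 20, here is a minimal PyTorch sketch; the dimension sizes and parameter names are assumptions, and the softmax form follows the standard reconstruction of the formula above. It computes u_t = tanh(W_w·h_t + b_w), scores each node against a learned context vector u_w, normalizes the scores into α_t, and rescales each TextRNN node to h_newt = α_t·h_t.

```python
# Sketch of the attention weight bonus over TextRNN nodes (PyTorch).
# Dimension sizes and parameter names are illustrative assumptions.
import torch
import torch.nn as nn

class NodeAttention(nn.Module):
    def __init__(self, hidden=128):
        super().__init__()
        self.W_w = nn.Linear(hidden, hidden)           # W_w plus its bias b_w
        self.u_w = nn.Parameter(torch.randn(hidden))   # attention context vector u_w

    def forward(self, h):                              # h: (batch, seq_len, hidden)
        u = torch.tanh(self.W_w(h))                    # u_t = tanh(W_w h_t + b_w)
        scores = u @ self.u_w                          # u_t^T u_w -> (batch, seq_len)
        alpha = torch.softmax(scores, dim=1)           # alpha_t, sums to 1 over t
        return alpha.unsqueeze(-1) * h                 # h_newt = alpha_t * h_t

h = torch.randn(8, 20, 128)        # encoded TextRNN node states
h_new = NodeAttention()(h)         # same shape, weight-boosted nodes
```

Normalizing with softmax keeps the weights positive and summing to 1 over the sequence, so the weight bonus redistributes emphasis across nodes without changing the overall scale of the encoding.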
PCT/CN2019/118342 2019-01-14 2019-11-14 Text classification method and apparatus, computer device, and storage medium WO2020147409A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910038962.9A CN109918499A (en) 2019-01-14 2019-01-14 A kind of file classification method, device, computer equipment and storage medium
CN201910038962.9 2019-01-14

Publications (1)

Publication Number Publication Date
WO2020147409A1

Family

ID=66960273

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/118342 WO2020147409A1 (en) 2019-01-14 2019-11-14 Text classification method and apparatus, computer device, and storage medium

Country Status (2)

Country Link
CN (1) CN109918499A (en)
WO (1) WO2020147409A1 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019216504A1 (en) * 2018-05-09 2019-11-14 한국과학기술원 Method and system for human emotion estimation using deep physiological affect network for human emotion recognition
CN109918499A (en) * 2019-01-14 2019-06-21 平安科技(深圳)有限公司 A kind of file classification method, device, computer equipment and storage medium
CN110309308A (en) * 2019-06-27 2019-10-08 北京金山安全软件有限公司 Text information classification method and device and electronic equipment
CN110297907B (en) * 2019-06-28 2022-03-08 谭浩 Method for generating interview report, computer-readable storage medium and terminal device
CN110489545A (en) * 2019-07-09 2019-11-22 平安科技(深圳)有限公司 File classification method and device, storage medium, computer equipment
CN110442723B (en) * 2019-08-14 2020-05-15 山东大学 Method for multi-label text classification based on multi-step discrimination Co-Attention model
CN110837561A (en) * 2019-11-18 2020-02-25 苏州朗动网络科技有限公司 Text analysis method, text analysis device and storage medium
CN112446217B (en) * 2020-11-27 2024-05-28 广州三七互娱科技有限公司 Emotion analysis method and device and electronic equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107608956A (en) * 2017-09-05 2018-01-19 广东石油化工学院 A kind of reader's mood forecast of distribution algorithm based on CNN GRNN
CN107908635A (en) * 2017-09-26 2018-04-13 百度在线网络技术(北京)有限公司 Establish textual classification model and the method, apparatus of text classification
CN108460089A (en) * 2018-01-23 2018-08-28 哈尔滨理工大学 Diverse characteristics based on Attention neural networks merge Chinese Text Categorization
CN108595601A (en) * 2018-04-20 2018-09-28 福州大学 A kind of long text sentiment analysis method incorporating Attention mechanism
CN109918499A (en) * 2019-01-14 2019-06-21 平安科技(深圳)有限公司 A kind of file classification method, device, computer equipment and storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108108351B (en) * 2017-12-05 2020-05-22 华南理工大学 Text emotion classification method based on deep learning combination model
CN108596470A (en) * 2018-04-19 2018-09-28 浙江大学 A kind of power equipments defect text handling method based on TensorFlow frames
CN108960317B (en) * 2018-06-27 2021-09-28 哈尔滨工业大学 Cross-language text classification method based on word vector representation and classifier combined training
CN109034281A (en) * 2018-07-18 2018-12-18 中国科学院半导体研究所 The Chinese handwritten body based on convolutional neural networks is accelerated to know method for distinguishing
GB201818237D0 (en) * 2018-11-08 2018-12-26 Polyal A dialogue system, a dialogue method, a method of generating data for training a dialogue system, a system for generating data for training a dialogue system


Also Published As

Publication number Publication date
CN109918499A (en) 2019-06-21

Similar Documents

Publication Publication Date Title
WO2020147409A1 (en) Text classification method and apparatus, computer device, and storage medium
US20240078386A1 (en) Methods and systems for language-agnostic machine learning in natural language processing using feature extraction
WO2020224219A1 (en) Chinese word segmentation method and apparatus, electronic device and readable storage medium
US11727211B2 (en) Systems and methods for colearning custom syntactic expression types for suggesting next best correspondence in a communication environment
Chen et al. Mining user requirements to facilitate mobile app quality upgrades with big data
CN112231569B (en) News recommendation method, device, computer equipment and storage medium
WO2022048363A1 (en) Website classification method and apparatus, computer device, and storage medium
CN113722438B (en) Sentence vector generation method and device based on sentence vector model and computer equipment
WO2019154411A1 (en) Word vector retrofitting method and device
US10678831B2 (en) Page journey determination from fingerprint information in web event journals
US10831809B2 (en) Page journey determination from web event journals
CN111126067B (en) Entity relationship extraction method and device
Sun et al. Feature-frequency–adaptive on-line training for fast and accurate natural language processing
CN112995414B (en) Behavior quality inspection method, device, equipment and storage medium based on voice call
CN112926308B (en) Method, device, equipment, storage medium and program product for matching text
CN112287069A (en) Information retrieval method and device based on voice semantics and computer equipment
Aziguli et al. A robust text classifier based on denoising deep neural network in the analysis of big data
WO2023129339A1 (en) Extracting and classifying entities from digital content items
CN115730597A (en) Multi-level semantic intention recognition method and related equipment thereof
JP7291181B2 (en) Industry text increment method, related apparatus, and computer program product
Andriyanov Combining Text and Image Analysis Methods for Solving Multimodal Classification Problems
CN112199954A (en) Disease entity matching method and device based on voice semantics and computer equipment
CN114330296A (en) New word discovery method, device, equipment and storage medium
CN112948561A (en) Method and device for automatically expanding question-answer knowledge base
WO2021056740A1 (en) Language model construction method and system, computer device and readable storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19909782

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19909782

Country of ref document: EP

Kind code of ref document: A1