CN111199155A - Text classification method and device - Google Patents

Publication number: CN111199155A (application CN201811275675.1A; granted and published as CN111199155B)
Authority: CN (China)
Original language: Chinese (zh)
Prior art keywords: phrase, target text, vector, model, classification
Inventors: 王超, 李修鹏, 田文宝, 赵欣莅, 赵东伟, 张志朋, 樊锐强, 刘庆标, 尹学正, 温连魁
Applicant and current assignee: Feihu Information Technology Tianjin Co Ltd
Legal status: Active, granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)

Classifications

    • G06F18/24 Pattern recognition; analysing; classification techniques
    • G06N3/045 Neural networks; combinations of networks
    • G06N3/084 Learning methods; backpropagation, e.g. using gradient descent
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The application provides a text classification method based on a multi-dimensional convolutional neural network model. Each phrase in a target text is converted into a corresponding phrase semantic representation vector, where each phrase comprises a plurality of words semantically represented by word-embedding vectors. Each phrase semantic representation vector in the target text is then input into a multi-granularity long and short term memory model for processing, and the average of the output vectors of the hidden layers of that model is determined as the hierarchical semantic vector of the target text. Finally, the hierarchical semantic vector is input into a classification model, yielding a probability distribution of the target text over a preset set of types, and the type with the maximum probability value is taken as the type of the target text. By combining the local semantics between words with the global semantics between phrases, the invention enhances the understanding of natural language and thereby improves the accuracy of text classification.

Description

Text classification method and device
Technical Field
The invention relates to the technical field of data processing, in particular to a text classification method and a text classification device.
Background
The internet currently holds a vast amount of text data, and in order for users to efficiently obtain the text they want to browse according to its type, that data must be classified accurately.
Existing text classification methods fall mainly into dictionary-based methods, traditional machine-learning methods, and deep-learning methods. Dictionary-based methods build a series of dictionaries and rules and use lexical and syntactic analysis of the text as the basis for classification; the result depends on how the rules and the sentence-splitting method are constructed, so the approach does not generalize across methods. Traditional machine-learning methods manually label the training text and then run a supervised learning process; the classification result depends on the choice of feature representation, so the approach does not generalize across data. In recent years, deep learning has been favored by many researchers for its efficiency, plasticity and universality, and applying it to natural language processing and related fields has produced remarkable results.
The existing text classification method based on deep learning has achieved remarkable results, but from the aspects of word feature extraction method, text semantic representation and the like, the following problems mainly exist:
(1) Most existing word-vector acquisition methods extract word features according to word frequency alone, losing the word-order and semantic information of the words, so the resulting representations cannot meet the needs of semantic analysis.
Frequency-based feature methods assume that a text can be treated as a mere set or combination of words, ignoring word order and syntax, with each word occurring independently of whether any other word occurs; equivalently, that when an author writes an article, the word chosen at any position is selected independently, uninfluenced by the preceding sentence. Although this simplifies natural language processing and makes modeling convenient, the purpose of text semantic analysis is to derive the category of the whole article from the properties of its words by computational means, and the order and semantic information of the words are important factors in that analysis, so the assumption is unreasonable.
(2) When the whole text is modeled, the structure of the article is not fully considered, and the relationship between the local and global semantics of the text is ignored.
How to model the logical relationships among the sentences of a document is an urgent problem in text analysis. A text has a "word / sentence / chapter" compositional structure, but most existing text analysis methods ignore this hierarchy and model directly with the word as the basic unit. Words describe the basic information of a language, but a single word lacks context: different combinations of the same words yield different semantics, so taking the word as the sole granularity of text analysis is clearly unreasonable.
Disclosure of Invention
In view of this, the present invention provides a text classification method and device, which combine local semantics between words and global semantics between phrases to enhance understanding of natural language and further improve accuracy of text classification.
In order to achieve the above purpose, the invention provides the following specific technical scheme:
a method of text classification, comprising:
splitting a target text into a plurality of phrases, and splitting each phrase into a plurality of words;
respectively converting each phrase in the target text into a corresponding phrase semantic representation vector based on a multi-dimensional convolutional neural network model, wherein each phrase comprises a plurality of words semantically represented by word embedding vectors;
inputting each phrase semantic expression vector in the target text into a multi-granularity long and short term memory model for processing, and determining the average value of the output vectors of each hidden layer in the multi-granularity long and short term memory model as the hierarchical semantic vector of the target text;
and inputting the hierarchical semantic vector of the target text into a classification model for classification processing to obtain probability distribution of the target text in a preset type set, and taking the type corresponding to the maximum probability value as the type of the target text.
Optionally, the converting each phrase in the target text into a corresponding phrase semantic representation vector based on the multidimensional convolutional neural network model includes:
respectively connecting a plurality of word embedded vectors corresponding to each phrase in a target text in series to obtain a serial vector of each phrase in the target text, wherein the quantity value of words in each phrase in the target text is the same as the width value of a convolution kernel window of a multi-dimensional convolution neural network model;
and respectively inputting the serial vectors of each phrase in the target text into the multi-dimensional convolutional neural network model for processing, performing average sampling on the output vectors of the convolutional layers to obtain input vectors of the pooling layers, performing average folding on the output vectors of the pooling layers, and generating phrase semantic representation vectors corresponding to each phrase in the target text.
Optionally, the inputting each phrase semantic representation vector in the target text into a multi-granularity long and short term memory model for processing includes:
inputting each phrase semantic representation vector in the target text into a multi-granularity long-short term memory model for processing to obtain an output vector of a first hidden layer;
and for each hidden layer except the first hidden layer, performing non-relevant forgetting operation and relevant updating operation on the input vector corresponding to the hidden layer and the output vector of the previous hidden layer to obtain the output vector of each hidden layer.
Optionally, before the step of converting each phrase in the target text into a corresponding phrase semantic representation vector based on the multidimensional convolutional neural network model, the method further includes:
constructing a training set, a test set and a word vector matrix;
initializing parameters of a multi-dimensional convolutional neural network model, a multi-granularity long and short term memory model and a classification model;
training a multi-dimensional convolutional neural network model, a multi-granularity long and short term memory model and a classification model according to the training set, and adjusting parameters of the multi-dimensional convolutional neural network model, the multi-granularity long and short term memory model and the classification model and the word vector matrix by using a back propagation algorithm;
and testing the trained multidimensional convolution neural network model, the multi-granularity long and short term memory model and the classification model by using the test set, and stopping training when the test result meets the accuracy requirement of text classification.
A text classification apparatus comprising:
the target text splitting unit is used for splitting the target text into a plurality of phrases and splitting each phrase into a plurality of words;
the phrase semantic representation unit is used for respectively converting each phrase in the target text into corresponding phrase semantic representation vectors based on the multidimensional convolutional neural network model, wherein each phrase comprises a plurality of words which are semantically represented by word embedding vectors;
the hierarchical semantic expression unit is used for inputting each phrase semantic expression vector in the target text into a multi-granularity long and short term memory model for processing, and determining the average value of the output vector of each hidden layer in the multi-granularity long and short term memory model as the hierarchical semantic vector of the target text;
and the classification processing unit is used for inputting the hierarchical semantic vector of the target text into a classification model for classification processing to obtain the probability distribution of the target text in a preset type set, and taking the type corresponding to the maximum probability value as the type of the target text.
Optionally, the phrase semantic representation unit is specifically configured to respectively concatenate a plurality of word embedding vectors corresponding to each phrase in the target text to obtain a concatenated vector of each phrase in the target text, where a number value of a word in each phrase in the target text is the same as a width value of a convolution kernel window of the multidimensional convolution neural network model; and respectively inputting the serial vectors of each phrase in the target text into the multi-dimensional convolutional neural network model for processing, performing average sampling on the output vectors of the convolutional layers to obtain input vectors of the pooling layers, performing average folding on the output vectors of the pooling layers, and generating phrase semantic representation vectors corresponding to each phrase in the target text.
Optionally, the hierarchical semantic representation unit is specifically configured to input each phrase semantic representation vector in the target text into a multi-granularity long and short term memory model for processing, so as to obtain an output vector of a first hidden layer; and for each hidden layer except the first hidden layer, performing non-relevant forgetting operation and relevant updating operation on the input vector corresponding to the hidden layer and the output vector of the previous hidden layer to obtain the output vector of each hidden layer.
Optionally, the apparatus further comprises:
the building unit is used for building a training set, a test set and a word vector matrix;
the initialization unit is used for initializing parameters of the multi-dimensional convolutional neural network model, the multi-granularity long and short term memory model and the classification model;
the model training unit is used for training the multidimensional convolutional neural network model, the multi-granularity long and short term memory model and the classification model according to the training set and adjusting the parameters of the multidimensional convolutional neural network model, the multi-granularity long and short term memory model and the classification model and the word vector matrix by using a back propagation algorithm;
and the model testing unit is used for testing the trained multidimensional convolutional neural network model, the multi-granularity long and short term memory model and the classification model by using the test set, and stopping training when the test result meets the accuracy requirement of text classification.
Compared with the prior art, the invention has the following beneficial effects:
the invention discloses a text classification method, which is characterized in that semantic representation is carried out on words in each phrase in a target text by utilizing word embedding vectors, and each phrase in the target text is converted into corresponding phrase semantic representation vectors respectively based on a multi-dimensional convolutional neural network model so as to represent the local semantic relationship between words in the phrases. And processing all phrase semantic expression vectors in the target text by using the multi-granularity long and short term memory model, determining the average value of the output vectors of each hidden layer in the multi-granularity long and short term memory model as the layered semantic vector of the target text, and realizing semantic expression between phrases at different intervals in the target text. And the hierarchical semantic vectors obtained by combining the local semantics among the words and the global semantics among the phrases are used as the feature vectors for classification to perform classification processing, so that the understanding of natural language is enhanced, and the accuracy of text classification is further improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
FIG. 1 is a diagram illustrating text classification performed by a hierarchical neural network model according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a text classification method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a multidimensional convolutional neural network model disclosed in an embodiment of the present invention;
FIG. 4-a is a schematic diagram of a classical LSTM model disclosed in an embodiment of the present invention;
FIG. 4-b is a schematic diagram of a multi-granularity long-short term memory model according to an embodiment of the present invention;
FIG. 5 is a schematic flow chart of a model construction method disclosed in the embodiments of the present invention;
fig. 6 is a schematic structural diagram of a text classification device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The existing deep-learning text classification methods suffer from the problems described above: most word-vector acquisition methods extract word features according to word frequency, losing word-order and syntactic semantic information, so the results cannot meet the needs of semantic analysis; and when the whole text is modeled, the structure of the article is not fully considered and the relationship between the local and global semantics of the text is ignored. The invention therefore provides a text classification method that splits a text hierarchically (word, phrase, text) and classifies the resulting components using a hierarchical neural network model comprising a multidimensional convolutional neural network model, a multi-granularity long and short term memory model and a classification model. Referring to fig. 1, fig. 1 is a schematic diagram of text classification by the hierarchical neural network model: operation 1 is performed by the multidimensional convolutional neural network model and converts a phrase into a phrase semantic representation vector; operation 2 is performed by the multi-granularity long and short term memory model and converts the phrase semantic representation vectors into the hierarchical semantic vector of the text.
Specifically, please refer to fig. 2, this embodiment discloses a text classification method, which specifically includes the following steps:
S101: splitting a target text into a plurality of phrases, and splitting each phrase into a plurality of words;
specifically, the method for splitting the target text into a plurality of phrases may be any one of the existing text splitting methods based on semantics, and the method for splitting the phrases into a plurality of words may be any one of the existing word splitting methods.
S102: respectively converting each phrase in the target text into corresponding phrase semantic expression vectors based on a multi-dimensional convolutional neural network model;
wherein each phrase comprises a plurality of words semantically represented by word-embedding vectors. Each word is represented as a low-dimensional, continuous, real-valued vector, and all vectors are stored in a word-vector matrix L ∈ R^(dim×|V|), where dim is the word-embedding dimension and V is the dictionary.
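A minimal sketch of the embedding lookup, under the assumption that out-of-vocabulary words map to a zero vector (the patent does not specify OOV handling); `embed_phrase`, `L` and `vocab` are illustrative names:

```python
def embed_phrase(words, L, vocab):
    # L is the word-vector matrix, dim rows x |V| columns; vocab maps a
    # word to its column index. Each word becomes its dim-length column.
    dim = len(L)
    zero = [0.0] * dim
    return [[L[r][vocab[w]] for r in range(dim)] if w in vocab else zero
            for w in words]
```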
A convolutional neural network (CNN) is currently a relatively advanced semantic model for natural language processing: it can learn a fixed-length vector from a variable-length phrase, and its computation depends mainly on the order of words within the phrase rather than on a syntax tree. The multidimensional convolutional neural network model is an improved CNN whose input is the concatenated vector of each phrase in the target text, obtained by concatenating the word-embedding vectors of the words in that phrase.
Referring to fig. 3, fig. 3 is a schematic diagram of the multidimensional convolutional neural network model. In the first lookup layer, each word in each phrase of the target text is semantically represented by its word-embedding vector; the second-layer convolutional layer convolves the word vectors with kernels of different window sizes; the convolutional-layer outputs are then average-sampled to form the pooling-layer input, and the pooling-layer outputs are average-folded to generate the phrase semantic representation vector corresponding to each phrase in the target text.
Convolution kernels with different window sizes capture context semantics at different granularities, and the semantic representation of the phrase is generated from them. This approach has proven effective in sentiment analysis and annotation. In this embodiment, kernels with window widths 3, 4 and 5 are used to capture the semantics of 3-grams, 4-grams and 5-grams.
Given a phrase containing n words w_1, w_2, w_3, …, w_n, let l_cf be the window width of a convolution kernel cf, and let W_cf and b_cf be the shared parameters of the linear part of cf. Each word w_i in the phrase is mapped through the word-vector matrix L ∈ R^(dim×|V|) to its word-embedding vector we_i ∈ R^dim, where dim is the embedding dimension and |V| is the size of the lexicon. The input to the convolutional layer is the concatenation of the embeddings of the l_cf words in the kernel window, i.e. I_cf = [we_i; we_{i+1}; …; we_{i+l_cf-1}] ∈ R^(dim·l_cf). The output of the convolutional layer is thus expressed as:

O_cf = tanh(W_cf · I_cf + b_cf),

where W_cf ∈ R^(lo_cf × dim·l_cf) and b_cf ∈ R^(lo_cf), lo_cf is the length of the convolutional-layer output, and tanh adds a non-linearity to the convolution operation.
After that, the convolutional-layer outputs are averaged to obtain global semantics, and the outputs of all convolution kernels are then merged through one average folding layer to obtain the final phrase semantic representation.
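The convolution, average sampling, and average folding steps can be sketched in plain Python. This is a minimal sketch: randomly initialized weights stand in for the learned parameters W_cf and b_cf, and the output length is illustrative:

```python
import math
import random

def conv_phrase_vector(word_vecs, window, out_len, rng):
    # One convolution kernel cf: slide a window of `window` words,
    # concatenate their embeddings into I_cf, apply tanh(W_cf . I_cf + b_cf),
    # then average-sample the outputs over all window positions.
    dim = len(word_vecs[0])
    in_len = dim * window
    W = [[rng.uniform(-0.1, 0.1) for _ in range(in_len)] for _ in range(out_len)]
    b = [0.0] * out_len
    outs = []
    for i in range(len(word_vecs) - window + 1):
        I = [x for wv in word_vecs[i:i + window] for x in wv]  # concatenation
        outs.append([math.tanh(sum(W[r][j] * I[j] for j in range(in_len)) + b[r])
                     for r in range(out_len)])
    return [sum(o[r] for o in outs) / len(outs) for r in range(out_len)]

def phrase_semantic_vector(word_vecs, windows=(3, 4, 5), out_len=8, seed=0):
    # Average folding: merge the vectors produced by the kernels of
    # widths 3, 4 and 5 into one phrase semantic representation vector.
    rng = random.Random(seed)
    vecs = [conv_phrase_vector(word_vecs, w, out_len, rng)
            for w in windows if len(word_vecs) >= w]
    return [sum(v[r] for v in vecs) / len(vecs) for r in range(out_len)]
```

In the trained model, W_cf and b_cf would be adjusted by backpropagation rather than drawn at random.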
S103: inputting each phrase semantic expression vector in the target text into a multi-granularity long and short term memory model for processing, and determining the average value of the output vectors of each hidden layer in the multi-granularity long and short term memory model as the hierarchical semantic vector of the target text;
referring to FIG. 4-a, the output of the last hidden layer in the classical LSTM model is usually used as the final text representation. Referring to fig. 4-b, the multi-granularity Long-Short Term Memory model is an improvement of the classic LSTM (Long Short-Term Memory) model, and averages the outputs of all hidden layers to obtain the final text semantic representation. In this way, we can consider the semantic and emotional logical relations between different intervals of phrases.
Specifically, the transfer function of the multi-granularity long and short term memory model in this embodiment is as follows:

f_t = δ(W_f · [h_{t-1}; x_t] + b_f)
i_t = δ(W_i · [h_{t-1}; x_t] + b_i)
C′_t = tanh(W_C · [h_{t-1}; x_t] + b_C)
C_t = f_t ⊙ C_{t-1} + i_t ⊙ C′_t
h_t = δ(W_h · [h_{t-1}; x_t] + b_h) ⊙ tanh(C_t)

where:
x_t is the input vector of the LSTM at step t, i.e. the phrase semantic representation vector of the t-th phrase in the target text;
f_t, i_t, W_f, W_i, b_f and b_i are responsible for the forgetting and updating operations on the hidden vector and the input vector;
W_C and b_C generate the candidate vector C′_t;
h_{t-1} is the hidden vector of the LSTM, representing historical information and storing the knowledge accumulated over the previous t-1 steps;
C_{t-1} and C′_t are, at step t, the previous state vector and the candidate vector of the neuron;
W_h and b_h are used to update the hidden vector from the previous hidden vector, the input vector and the neuron state vector, producing the new hidden vector;
⊙ denotes the element-wise multiplication of two vectors;
h_t is the output vector of the hidden layer at step t.
The average of the hidden-layer output vectors over all steps (i.e. of h_1, h_2, …, h_n) is determined as the hierarchical semantic vector of the target text.
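A scalar toy version of the transfer function and the hidden-output averaging can be sketched as follows. The sigmoid gates, parameter names and dictionary layout are assumptions for illustration; this is a sketch of the idea, not the trained model:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, C_prev, p):
    # One step of the transfer function: forget gate f, input gate i,
    # candidate C', new state C, and the gated output h (scalar toy case).
    f = sigmoid(p["Wf"][0] * h_prev + p["Wf"][1] * x + p["bf"])
    i = sigmoid(p["Wi"][0] * h_prev + p["Wi"][1] * x + p["bi"])
    Cc = math.tanh(p["WC"][0] * h_prev + p["WC"][1] * x + p["bC"])
    C = f * C_prev + i * Cc
    o = sigmoid(p["Wh"][0] * h_prev + p["Wh"][1] * x + p["bh"])
    return o * math.tanh(C), C

def hierarchical_semantic(xs, p):
    # The multi-granularity twist: average ALL hidden outputs h_1..h_n
    # instead of keeping only the last one, as the classic LSTM does.
    h, C, hs = 0.0, 0.0, []
    for x in xs:
        h, C = lstm_step(x, h, C, p)
        hs.append(h)
    return sum(hs) / len(hs)
```

In the real model, x, h and C are vectors and each phrase semantic representation vector is one x_t.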
S104: and inputting the hierarchical semantic vector of the target text into a classification model for classification processing to obtain probability distribution of the target text in a preset type set, and taking the type corresponding to the maximum probability value as the type of the target text.
In particular, the classification model may be a softmax classifier.
For example, the conditional probability that the i-th text c^(i) should be assigned to class e_k in the preset type set (k = 1, 2, …, |E|) can be calculated by the softmax function:

p(e_k | c^(i)) = exp(ω_k · x^(i)) / Σ_{j=1}^{|E|} exp(ω_j · x^(i)),

where c^(i) is the i-th text, x^(i) is the hierarchical semantic representation vector of c^(i), E is the preset type set, Ω is the transition matrix from the hierarchical semantic representation vector x^(i) to a real-valued vector over E, and ω_j can be regarded as the combination coefficient of x^(i) for type e_j.
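A minimal sketch of the classification step, assuming a plain softmax over the dot products ω_j · x^(i); the max-subtraction is a standard numerical-stability trick not mentioned in the patent, and the function and argument names are illustrative:

```python
import math

def softmax_classify(x, Omega, types):
    # p(e_k | c) = exp(omega_k . x) / sum_j exp(omega_j . x); the type
    # with the maximum probability is returned as the predicted type.
    scores = [sum(w * v for w, v in zip(row, x)) for row in Omega]
    m = max(scores)                    # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    Z = sum(exps)
    probs = [e / Z for e in exps]
    return types[probs.index(max(probs))], probs
```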
In the text classification method disclosed in this embodiment, semantic representation is performed on words in each phrase in a target text by using word embedding vectors, and each phrase in the target text is converted into a corresponding phrase semantic representation vector based on a multidimensional convolutional neural network model, so as to represent a local semantic relationship between words in the phrases. And processing all phrase semantic expression vectors in the target text by using the multi-granularity long and short term memory model, determining the average value of the output vectors of each hidden layer in the multi-granularity long and short term memory model as the layered semantic vector of the target text, and realizing semantic expression between phrases at different intervals in the target text. And the hierarchical semantic vectors obtained by combining the local semantics among the words and the global semantics among the phrases are used as the feature vectors for classification to perform classification processing, so that the understanding of natural language is enhanced, and the accuracy of text classification is further improved.
It should be noted that before text classification, a multi-dimensional convolutional neural network model, a multi-granularity long-short term memory model, and a classification model need to be constructed, please refer to fig. 5, where fig. 5 is a schematic flow chart of the model construction method, and specifically includes the following steps:
S401: constructing a training set, a test set and a word vector matrix;
S402: initializing parameters of a multi-dimensional convolutional neural network model, a multi-granularity long and short term memory model and a classification model;
s403: training a multi-dimensional convolutional neural network model, a multi-granularity long and short term memory model and a classification model according to the training set, and adjusting parameters of the multi-dimensional convolutional neural network model, the multi-granularity long and short term memory model and the classification model and the word vector matrix by using a back propagation algorithm;
The text data in the training set, i.e. data obtained by splitting the texts into phrases and words, are input into the multi-dimensional convolutional neural network model, the multi-granularity long and short term memory model and the classification model for processing. Specifically, each phrase in a training text is converted into a corresponding phrase semantic representation vector based on the multi-dimensional convolutional neural network model, where each phrase comprises a plurality of words semantically represented by word-embedding vectors; each phrase semantic representation vector in the training text is input into the multi-granularity long and short term memory model for processing, and the average of the hidden-layer output vectors of that model is determined as the hierarchical semantic vector of the training text; and the hierarchical semantic vector of the training text is input into the classification model for classification processing, yielding the probability distribution of the training text over the preset type set, with the type corresponding to the maximum probability value taken as the type of the training text.
S404: and testing the trained multidimensional convolution neural network model, the multi-granularity long and short term memory model and the classification model by using the test set, and stopping training when the test result meets the accuracy requirement of text classification.
Specifically, a threshold of accuracy of the text classification may be preset, and the training may be stopped when the accuracy of the test result reaches the threshold of accuracy of the preset text classification.
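The stopping rule just described reduces to a one-line check; the 0.95 threshold below is a hypothetical example, not a value taken from this text:

```python
# Minimal sketch of the S404 stopping rule: stop training once test-set
# accuracy reaches a preset threshold. The default threshold is illustrative.
def should_stop_training(test_accuracy: float, accuracy_threshold: float = 0.95) -> bool:
    return test_accuracy >= accuracy_threshold
```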
It should be noted that the text classification method disclosed in this embodiment may be applied to text classification in any field. To further explain the model construction method and the text classification method disclosed in this embodiment, a scenario embodiment is described below.
At present, internet video platforms carry a large number of borderline videos and clickbait. The content of such a video does not involve sensitive material such as pornography, but its title and cover image are vulgar and misleading, so that the title does not match the content. This harms the user experience and misleads users into clicking and watching, which in turn degrades video understanding and recommendation; a method for detecting borderline video titles is therefore needed. When the text classification method is applied to borderline-title detection, the model construction method is as follows:
Input: title training set C_train, title test set C_test, and word vector matrix L ∈ R^(dim×|V|)
Output: borderline class labels for the test set C_test
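Putting steps S401 to S404 together, a minimal end-to-end forward pass can be sketched as follows. All sizes, the mean pooling that stands in for the multi-dimensional convolutional neural network, and the plain tanh recurrence that stands in for the multi-granularity long and short term memory model are illustrative assumptions, not the patent's definitions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; none of these values come from the patent text.
dim, n_words, n_phrases, n_classes = 8, 3, 4, 2

# Stage 1: phrase semantics. Each phrase's word-embedding matrix is reduced
# to one phrase semantic representation vector (the patent uses a CNN; a
# mean is used here only to keep the sketch short).
phrases = rng.normal(size=(n_phrases, n_words, dim))
phrase_vecs = phrases.mean(axis=1)                     # (n_phrases, dim)

# Stage 2: hierarchical semantics. Run a recurrent pass over the phrase
# vectors and average the hidden states of every step, standing in for the
# multi-granularity long and short term memory model.
W = rng.normal(scale=0.1, size=(dim, dim))
h = np.zeros(dim)
hidden_states = []
for v in phrase_vecs:
    h = np.tanh(v + W @ h)
    hidden_states.append(h)
doc_vec = np.mean(hidden_states, axis=0)               # hierarchical semantic vector

# Stage 3: classification. Softmax over the preset type set; the type with
# the maximum probability is taken as the type of the text.
Wc = rng.normal(scale=0.1, size=(n_classes, dim))
logits = Wc @ doc_vec
probs = np.exp(logits - logits.max())
probs /= probs.sum()
predicted_type = int(probs.argmax())
```

In a real build, the CNN and recurrent stand-ins would be trained jointly with back-propagation against the training set, as step S403 describes.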
Referring to fig. 6, the present embodiment correspondingly discloses a text classification device, which includes:
a target text splitting unit 501, configured to split a target text into multiple phrases, and split each phrase into multiple words;
a phrase semantic representation unit 502, configured to convert each phrase in the target text into a corresponding phrase semantic representation vector respectively based on a multidimensional convolutional neural network model, where each phrase includes a plurality of words semantically represented by word-embedded vectors;
optionally, the phrase semantic representation unit 502 is specifically configured to concatenate, in series, the plurality of word embedding vectors corresponding to each phrase in the target text, to obtain a concatenated vector of each phrase in the target text, where the number of words in each phrase in the target text equals the width of the convolution kernel window of the multi-dimensional convolutional neural network model; and to input the concatenated vector of each phrase in the target text into the multi-dimensional convolutional neural network model for processing, perform average sampling on the output vectors of the convolutional layers to obtain the input vectors of the pooling layers, and perform average folding on the output vectors of the pooling layers to generate the phrase semantic representation vector corresponding to each phrase in the target text.
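A minimal sketch of the phrase-level processing described for unit 502, assuming 3-word phrases, 4-dimensional embeddings, and 6 filters (all hypothetical sizes). The word embeddings are concatenated in series, and each convolution filter spans the full window because the window width equals the phrase length; "average folding" is modelled as averaging adjacent pairs of filter outputs, which is one plausible reading since this excerpt does not define the operation:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sizes: 3 words per phrase, embedding dimension 4, 6 filters.
words_per_phrase, dim, n_filters = 3, 4, 6

# Concatenate the word embedding vectors of one phrase in series.
word_embeddings = rng.normal(size=(words_per_phrase, dim))
concat_vec = word_embeddings.reshape(-1)               # length words_per_phrase * dim

# Because the convolution window width equals the number of words in the
# phrase, each filter covers the whole concatenated vector and yields a
# single activation.
kernels = rng.normal(size=(n_filters, words_per_phrase * dim))
conv_out = np.tanh(kernels @ concat_vec)               # (n_filters,)

# "Average folding" modelled as averaging adjacent pairs of filter outputs
# (an assumption, not the patent's definition).
phrase_vec = conv_out.reshape(-1, 2).mean(axis=1)      # (n_filters // 2,)
```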
A hierarchical semantic representation unit 503, configured to input each phrase semantic representation vector in the target text into a multi-granularity long and short term memory model for processing, and determine the average value of the output vectors of the hidden layers in the multi-granularity long and short term memory model as the hierarchical semantic vector of the target text;
optionally, the hierarchical semantic representation unit 503 is specifically configured to input each phrase semantic representation vector in the target text into the multi-granularity long and short term memory model for processing to obtain the output vector of the first hidden layer; and, for each hidden layer other than the first hidden layer, to perform a non-relevant forgetting operation and a relevant updating operation on the input vector corresponding to that hidden layer and the output vector of the previous hidden layer, so as to obtain the output vector of each hidden layer.
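The "non-relevant forgetting operation" and "relevant updating operation" are not defined in this excerpt; the sketch below substitutes standard sigmoid gates — a forget gate on the previous hidden output and an update gate on a tanh candidate — purely as stand-ins, with hypothetical sizes:

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 4  # hypothetical hidden size

def gate(W, x, U, h):
    # Sigmoid gate over the current input and the previous hidden output.
    return 1.0 / (1.0 + np.exp(-(W @ x + U @ h)))

# Hypothetical parameters for the forgetting and updating operations.
Wf, Uf = rng.normal(scale=0.1, size=(2, dim, dim))
Wu, Uu = rng.normal(scale=0.1, size=(2, dim, dim))
Wc, Uc = rng.normal(scale=0.1, size=(2, dim, dim))

def hidden_layer(x, h_prev):
    """One hidden layer: keep part of the previous output (a stand-in for
    "non-relevant forgetting"), then mix in a candidate built from the
    current input (a stand-in for "relevant updating")."""
    f = gate(Wf, x, Uf, h_prev)            # how much of h_prev to keep
    u = gate(Wu, x, Uu, h_prev)            # how much new content to add
    cand = np.tanh(Wc @ x + Uc @ h_prev)
    return f * h_prev + u * cand

phrase_vecs = rng.normal(size=(5, dim))
h = np.tanh(phrase_vecs[0])                # output vector of the first hidden layer
outputs = [h]
for x in phrase_vecs[1:]:
    h = hidden_layer(x, h)
    outputs.append(h)

# Hierarchical semantic vector: mean of all hidden-layer output vectors.
hierarchical_vec = np.mean(outputs, axis=0)
```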
And a classification processing unit 504, configured to input the hierarchical semantic vector of the target text into a classification model for classification processing, to obtain probability distribution of the target text in a preset type set, and use a type corresponding to the maximum probability value as the type of the target text.
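The classification step of unit 504 — a probability distribution over the preset type set, with the type of maximum probability taken as the result — can be sketched as follows; the type names and weight shapes are illustrative assumptions:

```python
import numpy as np

def classify(hierarchical_vec, W, b, type_set):
    """Softmax over the preset type set; return the probability distribution
    and the type corresponding to the maximum probability value."""
    logits = W @ hierarchical_vec + b
    e = np.exp(logits - logits.max())      # numerically stable softmax
    probs = e / e.sum()
    return probs, type_set[int(probs.argmax())]

# Hypothetical three-type example; the type names are illustrative only.
type_set = ["normal", "borderline", "other"]
rng = np.random.default_rng(3)
W, b = rng.normal(size=(3, 4)), np.zeros(3)
probs, predicted = classify(rng.normal(size=4), W, b, type_set)
```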
Optionally, the apparatus further comprises:
the building unit is used for building a training set, a test set and a word vector matrix;
the initialization unit is used for initializing parameters of the multi-dimensional convolutional neural network model, the multi-granularity long and short term memory model and the classification model;
the model training unit is used for training the multidimensional convolutional neural network model, the multi-granularity long and short term memory model and the classification model according to the training set and adjusting the parameters of the multidimensional convolutional neural network model, the multi-granularity long and short term memory model and the classification model and the word vector matrix by using a back propagation algorithm;
and the model testing unit is used for testing the trained multidimensional convolutional neural network model, the multi-granularity long and short term memory model and the classification model by using the test set, and stopping training when the test result meets the accuracy requirement of text classification.
In the text classification device disclosed in this embodiment, word embedding vectors are used to semantically represent the words in each phrase of a target text, and each phrase in the target text is converted into a corresponding phrase semantic representation vector based on a multi-dimensional convolutional neural network model, representing the local semantic relationship between the words within a phrase. The phrase semantic representation vectors of the target text are then processed by the multi-granularity long and short term memory model, and the average value of the output vectors of its hidden layers is determined as the hierarchical semantic vector of the target text, realizing semantic representation between phrases at different distances in the target text. This hierarchical semantic vector, which combines the local semantics among words with the global semantics among phrases, is used as the feature vector for classification, which strengthens the understanding of the natural language and thereby improves the accuracy of text classification.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (8)

1. A method of text classification, comprising:
splitting a target text into a plurality of phrases, and splitting each phrase into a plurality of words;
respectively converting each phrase in the target text into a corresponding phrase semantic representation vector based on a multi-dimensional convolutional neural network model, wherein each phrase comprises a plurality of words semantically represented by word embedding vectors;
inputting each phrase semantic expression vector in the target text into a multi-granularity long and short term memory model for processing, and determining the average value of the output vectors of each hidden layer in the multi-granularity long and short term memory model as the hierarchical semantic vector of the target text;
and inputting the hierarchical semantic vector of the target text into a classification model for classification processing to obtain probability distribution of the target text in a preset type set, and taking the type corresponding to the maximum probability value as the type of the target text.
2. The method according to claim 1, wherein the converting each phrase in the target text into a corresponding phrase semantic representation vector based on the multidimensional convolutional neural network model comprises:
respectively connecting a plurality of word embedded vectors corresponding to each phrase in a target text in series to obtain a serial vector of each phrase in the target text, wherein the quantity value of words in each phrase in the target text is the same as the width value of a convolution kernel window of a multi-dimensional convolution neural network model;
and respectively inputting the serial vectors of each phrase in the target text into the multi-dimensional convolutional neural network model for processing, performing average sampling on the output vectors of the convolutional layers to obtain input vectors of the pooling layers, performing average folding on the output vectors of the pooling layers, and generating phrase semantic representation vectors corresponding to each phrase in the target text.
3. The method according to claim 1, wherein the inputting each phrase semantic representation vector in the target text into a multi-granularity long-short term memory model for processing comprises:
inputting each phrase semantic representation vector in the target text into a multi-granularity long-short term memory model for processing to obtain an output vector of a first hidden layer;
and for each hidden layer except the first hidden layer, performing non-relevant forgetting operation and relevant updating operation on the input vector corresponding to the hidden layer and the output vector of the previous hidden layer to obtain the output vector of each hidden layer.
4. The method of claim 1, wherein before the step of converting each phrase in the target text into the corresponding phrase semantic representation vector based on the multidimensional convolutional neural network model, the method further comprises:
constructing a training set, a test set and a word vector matrix;
initializing parameters of a multi-dimensional convolutional neural network model, a multi-granularity long and short term memory model and a classification model;
training a multi-dimensional convolutional neural network model, a multi-granularity long and short term memory model and a classification model according to the training set, and adjusting parameters of the multi-dimensional convolutional neural network model, the multi-granularity long and short term memory model and the classification model and the word vector matrix by using a back propagation algorithm;
and testing the trained multidimensional convolution neural network model, the multi-granularity long and short term memory model and the classification model by using the test set, and stopping training when the test result meets the accuracy requirement of text classification.
5. A text classification apparatus, comprising:
the target text splitting unit is used for splitting the target text into a plurality of phrases and splitting each phrase into a plurality of words;
the phrase semantic representation unit is used for respectively converting each phrase in the target text into corresponding phrase semantic representation vectors based on the multidimensional convolutional neural network model, wherein each phrase comprises a plurality of words which are semantically represented by word embedding vectors;
the hierarchical semantic expression unit is used for inputting each phrase semantic expression vector in the target text into a multi-granularity long and short term memory model for processing, and determining the average value of the output vector of each hidden layer in the multi-granularity long and short term memory model as the hierarchical semantic vector of the target text;
and the classification processing unit is used for inputting the hierarchical semantic vector of the target text into a classification model for classification processing to obtain the probability distribution of the target text in a preset type set, and taking the type corresponding to the maximum probability value as the type of the target text.
6. The apparatus according to claim 5, wherein the phrase semantic representation unit is specifically configured to concatenate a plurality of word embedding vectors corresponding to each phrase in the target text, respectively, to obtain a concatenated vector of each phrase in the target text, where a number value of words in each phrase in the target text is the same as a window width value of a convolution kernel of the multidimensional convolution neural network model; and respectively inputting the serial vectors of each phrase in the target text into the multi-dimensional convolutional neural network model for processing, performing average sampling on the output vectors of the convolutional layers to obtain input vectors of the pooling layers, performing average folding on the output vectors of the pooling layers, and generating phrase semantic representation vectors corresponding to each phrase in the target text.
7. The apparatus according to claim 5, wherein the hierarchical semantic representation unit is specifically configured to input each phrase semantic representation vector in the target text into a multi-granularity long-short term memory model for processing, so as to obtain an output vector of a first hidden layer; and for each hidden layer except the first hidden layer, performing non-relevant forgetting operation and relevant updating operation on the input vector corresponding to the hidden layer and the output vector of the previous hidden layer to obtain the output vector of each hidden layer.
8. The apparatus of claim 5, further comprising:
the building unit is used for building a training set, a test set and a word vector matrix;
the initialization unit is used for initializing parameters of the multi-dimensional convolutional neural network model, the multi-granularity long and short term memory model and the classification model;
the model training unit is used for training the multidimensional convolutional neural network model, the multi-granularity long and short term memory model and the classification model according to the training set and adjusting the parameters of the multidimensional convolutional neural network model, the multi-granularity long and short term memory model and the classification model and the word vector matrix by using a back propagation algorithm;
and the model testing unit is used for testing the trained multidimensional convolutional neural network model, the multi-granularity long and short term memory model and the classification model by using the test set, and stopping training when the test result meets the accuracy requirement of text classification.
CN201811275675.1A 2018-10-30 2018-10-30 Text classification method and device Active CN111199155B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811275675.1A CN111199155B (en) 2018-10-30 2018-10-30 Text classification method and device


Publications (2)

Publication Number Publication Date
CN111199155A true CN111199155A (en) 2020-05-26
CN111199155B CN111199155B (en) 2023-09-15

Family

ID=70745809

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811275675.1A Active CN111199155B (en) 2018-10-30 2018-10-30 Text classification method and device

Country Status (1)

Country Link
CN (1) CN111199155B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112052622A (en) * 2020-08-11 2020-12-08 国网河北省电力有限公司 Defect disposal method for deep multi-view semantic document representation under cloud platform
CN112215000A (en) * 2020-10-21 2021-01-12 重庆邮电大学 Text classification method based on entity replacement
CN112966103A (en) * 2021-02-05 2021-06-15 成都信息工程大学 Mixed attention mechanism text title matching method based on multi-task learning

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104834747A (en) * 2015-05-25 2015-08-12 Institute of Automation, Chinese Academy of Sciences Short text classification method based on convolutional neural network
CN106601228A (en) * 2016-12-09 2017-04-26 Baidu Online Network Technology (Beijing) Co., Ltd. Sample marking method and device based on artificial intelligence prosody prediction
CN108595632A (en) * 2018-04-24 2018-09-28 Fuzhou University Hybrid neural network text classification method fusing abstract and body features
CA3000166A1 (en) * 2017-04-03 2018-10-03 Royal Bank Of Canada Systems and methods for cyberbot network detection


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIU Jingxue: "Character-level convolutional neural network short text classification algorithm" *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112052622A (en) * 2020-08-11 2020-12-08 国网河北省电力有限公司 Defect disposal method for deep multi-view semantic document representation under cloud platform
CN112215000A (en) * 2020-10-21 2021-01-12 重庆邮电大学 Text classification method based on entity replacement
CN112215000B (en) * 2020-10-21 2022-08-23 重庆邮电大学 Text classification method based on entity replacement
CN112966103A (en) * 2021-02-05 2021-06-15 成都信息工程大学 Mixed attention mechanism text title matching method based on multi-task learning
CN112966103B (en) * 2021-02-05 2022-04-19 成都信息工程大学 Mixed attention mechanism text title matching method based on multi-task learning

Also Published As

Publication number Publication date
CN111199155B (en) 2023-09-15

Similar Documents

Publication Publication Date Title
CN106980683B (en) Blog text abstract generating method based on deep learning
CN110059188B (en) Chinese emotion analysis method based on bidirectional time convolution network
CN110321563B (en) Text emotion analysis method based on hybrid supervision model
CN109753660B (en) LSTM-based winning bid web page named entity extraction method
CN110704576B (en) Text-based entity relationship extraction method and device
CN104965822B (en) A kind of Chinese text sentiment analysis method based on Computerized Information Processing Tech
CN113591483A (en) Document-level event argument extraction method based on sequence labeling
CN108062388A (en) Interactive reply generation method and device
CN106569998A (en) Text named entity recognition method based on Bi-LSTM, CNN and CRF
WO2023108993A1 (en) Product recommendation method, apparatus and device based on deep clustering algorithm, and medium
CN112001186A (en) Emotion classification method using graph convolution neural network and Chinese syntax
CN112328900A (en) Deep learning recommendation method integrating scoring matrix and comment text
WO2023134083A1 (en) Text-based sentiment classification method and apparatus, and computer device and storage medium
CN113704546A (en) Video natural language text retrieval method based on space time sequence characteristics
CN110750648A (en) Text emotion classification method based on deep learning and feature fusion
WO2023159758A1 (en) Data enhancement method and apparatus, electronic device, and storage medium
CN111078833A (en) Text classification method based on neural network
CN111199155B (en) Text classification method and device
CN111666752B (en) Circuit teaching material entity relation extraction method based on keyword attention mechanism
CN112364168A (en) Public opinion classification method based on multi-attribute information fusion
CN111581392B (en) Automatic composition scoring calculation method based on statement communication degree
CN111984782A (en) Method and system for generating text abstract of Tibetan language
CN113343690A (en) Text readability automatic evaluation method and device
CN116049387A (en) Short text classification method, device and medium based on graph convolution
CN115759119A (en) Financial text emotion analysis method, system, medium and equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant