CN112070139A - Text classification method based on BERT and improved LSTM - Google Patents

Text classification method based on BERT and improved LSTM

Info

Publication number
CN112070139A
Authority
CN
China
Prior art keywords: text, bert, vector, lstm, gate
Prior art date
Legal status
Granted
Application number
CN202010898906.5A
Other languages
Chinese (zh)
Other versions
CN112070139B (en)
Inventor
戚力鑫
万书振
Current Assignee
China Three Gorges University CTGU
Original Assignee
China Three Gorges University CTGU
Priority date
Filing date
Publication date
Application filed by China Three Gorges University CTGU
Priority to CN202010898906.5A
Publication of CN112070139A
Application granted
Publication of CN112070139B
Legal status: Active


Classifications

    • G06F18/2415 Classification techniques relating to the classification model based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06N3/045 Combinations of networks
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N3/08 Learning methods


Abstract

The invention belongs to the field of text recognition and discloses a text classification method based on BERT and an improved LSTM, which comprises the following steps: preprocessing the input text data; inputting the preprocessed text data into a BERT model to obtain a word vector sequence; deep-encoding the vector sequence with an improved LSTM network to obtain a feature vector; reducing the dimension of the feature vector with a fully connected layer; and classifying the dimension-reduced feature vector with a classifier. The improved LSTM distinguishes the importance of the words in the text, improving the learning quality and efficiency of the neurons, so the text classification model fits quickly and classifies well; the BERT component captures context information, making ambiguous words easier to identify, laying a good foundation for feature extraction and helping to improve the text classification accuracy.

Description

Text classification method based on BERT and improved LSTM
Technical Field
The invention belongs to the field of text recognition, and particularly relates to a text classification method based on BERT and improved LSTM.
Background
Text classification is mainly applied to microblog sentiment analysis, user comment mining, information retrieval, newsgroup classification, word sense disambiguation, and the like. Before the 1990s, automatic text classification mainly followed a knowledge-engineering approach, i.e., manual classification by professionals, which was costly, time-consuming and labor-intensive. Since the 1990s, researchers have applied various statistical and machine learning methods to automatic text classification, such as the Support Vector Machine (SVM), AdaBoost, naive Bayes, KNN, and logistic regression. In recent years, with the rapid development of deep learning and neural network models, text classification methods based on deep learning have attracted close attention and research in academia and industry, and the recurrent neural networks LSTM and GRU and the convolutional neural network CNN are widely applied to text classification.
In current text classification methods, the input is usually a static word or character vector that cannot change with its context, so the information it covers is relatively limited; the feature extractors are mostly the CNN and RNN models of deep learning, which lack fine-grained adjustment of the different importance levels of the input information stream along the input dimension.
Disclosure of Invention
The invention aims to solve the above problems and provides a text classification method based on BERT and an improved LSTM, which adds a contribution gate to the existing LSTM unit to attend to the importance of different elements of the text's vector sequence, improving the learning efficiency of the neurons, accelerating the fitting of the text classification model, and improving its classification effect.
The technical scheme of the invention is a text classification method based on BERT and an improved LSTM. The text classification model comprises BERT, the improved LSTM and a classifier connected in sequence, and the text classification method comprises the following steps:
step 1: preprocessing input text data;
step 2: inputting the preprocessed text data into a BERT model for processing to obtain a word vector sequence;
step 3: deep-encoding the vector sequence with an improved LSTM network to obtain a feature vector;
step 4: reducing the dimension of the feature vector with a fully connected layer;
step 5: classifying the dimension-reduced feature vector with a classifier.
Further, in step 1, the preprocessing of the text data comprises punctuation filtering, abbreviation expansion, space deletion and illegal-character filtering.
Further, the step 2 specifically includes:
1) using the trained BERT model to segment the text of the preprocessed text data set T', obtaining the word vector set T'' = {t_1'', t_2'', ..., t_n''}, in which each text of the data set is converted into a word vector t_i'' = {w_1, w_2, ..., w_L} of fixed length;
2) inputting the word vector set T'' into the Token embedding layer, Segment embedding layer and Position embedding layer of BERT to obtain the vector encoding V_1, the sentence encoding V_2 and the position encoding V_3, respectively;
3) adding V_1, V_2 and V_3 and inputting the sum into the bidirectional Transformer of BERT to obtain the word vector sequence S = {s_1, s_2, ..., s_n}.
Further, the LSTM unit of the improved LSTM network comprises a contribution gate, a forget gate, an input gate and an output gate. From the cell state c_{t-1} and hidden state h_{t-1} of the previous time step and the input information of the current time step, the contribution gate generates an attention vector a_t with the same dimension as the input vector x_t; a_t is combined with x_t to obtain the optimized input vector x_t', which serves as the input to the forget gate, the input gate and the output gate.
Further, step 5 performs probability classification on the dimension-reduced feature vector of step 4 with a softmax classifier and outputs the probability prediction vector P = {p_1, p_2, ..., p_C}, where p_i (i = 1, 2, ..., C) denotes the probability that the text belongs to class i and C is the total number of classes; the class corresponding to the largest p_i is determined as the category of the text.
Compared with the prior art, the invention has the following beneficial effects:
1) the text classification method distinguishes the importance of the words in the text through the improved LSTM, improving the learning quality and efficiency of the neurons; the text classification model therefore fits quickly and classifies well;
2) the BERT component of the text classification model captures context information, making ambiguous words easier to identify, laying a good foundation for feature extraction and helping to improve the text classification accuracy;
3) the text classification model combining BERT with the improved LSTM generalizes well and is suitable for text classification in different technical fields.
Drawings
The invention is further illustrated by the following figures and examples.
Fig. 1 is a flowchart of the text classification method according to an embodiment of the invention.
Fig. 2 is a schematic structural diagram of a text classification model according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of an LSTM unit in accordance with an embodiment of the present invention.
FIG. 4 compares the validation accuracy of the improved LSTM with that of sRNN, LSTM and GRU.
FIG. 5 compares the validation loss of the improved LSTM with that of sRNN, LSTM and GRU.
Detailed Description
The embodiment selects the Chinese dataset THUCNews for the classification test. As shown in FIG. 2, the text classification model of the embodiment comprises BERT, the improved LSTM and a classifier, the classifier consisting of a fully connected layer and a Softmax layer.
As shown in FIG. 1, the text classification method based on BERT and the improved LSTM comprises the following steps:
step 1: inputting a text data set T and preprocessing the sentences in the text, including punctuation filtering, abbreviation expansion, space deletion and illegal-character filtering; a sentence-length threshold is determined from the length distribution and mean square deviation of the sentences in the data set so as to unify the sentence length, finally yielding the text data set T';
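As an illustration of step 1, the following Python sketch implements the four cleaning operations and the length unification; the abbreviation table, the filtering regular expressions, and the threshold of 128 characters are assumptions for illustration (the patent derives the threshold from the corpus length distribution and mean square deviation):

```python
import re

# Hypothetical abbreviation table; the patent does not list one.
ABBREVIATIONS = {"can't": "cannot", "won't": "will not", "it's": "it is"}

def preprocess(sentence: str, length_threshold: int = 128) -> str:
    for abbr, full in ABBREVIATIONS.items():
        sentence = sentence.replace(abbr, full)          # abbreviation expansion
    sentence = re.sub(r"[^\w\s]", "", sentence)          # punctuation / illegal-character filtering
    sentence = re.sub(r"\s+", " ", sentence).strip()     # space deletion
    return sentence[:length_threshold]                   # unify sentence length at the threshold

print(preprocess("it's a   test!!  sentence..."))        # -> "it is a test sentence"
```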
step 2: vectorizing the text data set T' by using the trained BERT model;
segmenting the text in T' with the trained BERT model to obtain the data T'' = {t_1'', t_2'', ..., t_n''}, where each text is converted into a word vector t_i'' = {w_1, w_2, ..., w_L} of fixed length L;
putting the word vectors in T'' into the Token Embedding layer, Segment Embedding layer and Position Embedding layer of BERT to obtain the vector encoding V_1, the sentence encoding V_2 and the position encoding V_3, respectively;
adding V_1, V_2 and V_3 and inputting the sum into the bidirectional Transformer of BERT, which outputs the word vector sequence S = {s_1, s_2, ..., s_n} corresponding to T'', where each word vector subsequence s_i is composed of the word vectors v(w_j) of the i-th text, j indexing the words of the text;
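A minimal sketch of step 2 using the Hugging Face transformers package (an assumption; the patent only references Google's BERT, and the bert-base-chinese checkpoint stands in for the trained model). Note that BERT's embedding layer internally performs the V_1 + V_2 + V_3 sum before the bidirectional Transformer:

```python
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")  # assumed checkpoint
bert = BertModel.from_pretrained("bert-base-chinese")

# Tokenize to a fixed length L (here L = 128) and encode.
encoded = tokenizer("这是一条待分类的新闻文本", return_tensors="pt",
                    padding="max_length", truncation=True, max_length=128)
with torch.no_grad():
    outputs = bert(**encoded)

S = outputs.last_hidden_state   # word vector sequence S = {s_1, ..., s_n}
print(S.shape)                  # torch.Size([1, 128, 768])
```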
the BERT model of the examples was referred to the BERT disclosed in the paper "Pre-training of Deep biological transformations for Language Understanding", published by the Google research and development team 2018.
And step 3: the improved LSTM network is used for carrying out feature learning on the word vector sequence S, potential features F are extracted, and the structure of an LSTM unit of the improved LSTM network is shown in figure 3;
and 4, step 4: reducing the dimension of the feature vector F by using a full connection layer;
and 5: performing probability classification on the feature vector of the dimensionality reduction step 4 by using a softmax classifier, and outputting a probability prediction vector P ═ P1,p2,...,pC},piI 1,2, C denotes the probability that the text belongs to a particular class, C being the total number of classes; and determining the corresponding classification with the maximum probability value as the text category.
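Steps 4 and 5 amount to a linear projection followed by softmax; a minimal sketch (the feature dimension 256 and the class count C = 10 are illustrative assumptions, not values fixed by the patent):

```python
import torch
import torch.nn as nn

C = 10                            # total number of classes (assumed)
fc = nn.Linear(256, C)            # dimension-reducing fully connected layer
F = torch.randn(1, 256)           # stand-in for the feature vector from step 3

P = torch.softmax(fc(F), dim=-1)  # probability prediction vector P = {p_1, ..., p_C}
category = P.argmax(dim=-1)       # class with the largest p_i
print(P, category)
```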
The LSTM unit of the improved LSTM network comprises a contribution gate, a forget gate, an input gate and an output gate. From the cell state c_{t-1} and hidden state h_{t-1} of the previous time step and the input information of the current time step, the contribution gate generates an attention vector a_t with the same dimension as the input vector x_t; a_t is combined with x_t to obtain the optimized input vector x_t', which serves as the input to the forget gate, the input gate and the output gate:
a_t = σ_a(W_a x_t + U_a h_{t-1} + M_a c_{t-1} + b_a)
x_t' = (x_t + h_{t-1}) ∘ a_t
Forget gate:
f_t = σ_g(W_f x_t' + b_f)
Input gate:
i_t = σ_g(W_i x_t' + b_i)
Output gate:
o_t = σ_g(W_o x_t' + b_o)
Cell state:
c_t = f_t ∘ c_{t-1} + i_t ∘ σ_c(W_c x_t' + b_c)
Hidden state:
h_t = o_t ∘ σ_h(c_t)
where h_t is the hidden state at the current time t and c_t is the cell state at the current time t; W_a, U_a, M_a, W_f, W_i, W_o, W_c are weight matrices; b_a, b_f, b_i, b_o, b_c are bias terms; σ_a, σ_g, σ_c, σ_h are activation functions; and ∘ denotes element-wise (Hadamard) multiplication.
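A PyTorch sketch of one time step of the improved LSTM unit, written directly from the equations above. Where the patent leaves details unspecified, the sketch assumes sigmoid for σ_a and σ_g, tanh for σ_c and σ_h, and equal input and hidden dimensions (required by the sum x_t + h_{t-1}):

```python
import torch
import torch.nn as nn

class ImprovedLSTMCell(nn.Module):
    """One step of the LSTM unit with a contribution gate (a sketch)."""

    def __init__(self, size: int):
        super().__init__()
        # Contribution gate: a_t = sigma_a(W_a x_t + U_a h_{t-1} + M_a c_{t-1} + b_a)
        self.W_a = nn.Linear(size, size)              # carries bias b_a
        self.U_a = nn.Linear(size, size, bias=False)
        self.M_a = nn.Linear(size, size, bias=False)
        # Gate and candidate parameters, all applied to the optimized input x_t'
        self.W_f = nn.Linear(size, size)              # forget gate
        self.W_i = nn.Linear(size, size)              # input gate
        self.W_o = nn.Linear(size, size)              # output gate
        self.W_c = nn.Linear(size, size)              # cell candidate

    def forward(self, x_t, h_prev, c_prev):
        a_t = torch.sigmoid(self.W_a(x_t) + self.U_a(h_prev) + self.M_a(c_prev))
        x_opt = (x_t + h_prev) * a_t                  # x_t' = (x_t + h_{t-1}) ∘ a_t
        f_t = torch.sigmoid(self.W_f(x_opt))          # forget gate
        i_t = torch.sigmoid(self.W_i(x_opt))          # input gate
        o_t = torch.sigmoid(self.W_o(x_opt))          # output gate
        c_t = f_t * c_prev + i_t * torch.tanh(self.W_c(x_opt))  # cell state
        h_t = o_t * torch.tanh(c_t)                   # hidden state
        return h_t, c_t

# Encoding a sequence: iterate the cell over the word vectors and keep the
# final hidden state as the feature vector F.
cell = ImprovedLSTMCell(size=768)
S = torch.randn(128, 1, 768)                          # word vector sequence from BERT
h = c = torch.zeros(1, 768)
for s_t in S:
    h, c = cell(s_t, h, c)
F = h                                                 # latent feature vector
```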
Compared with models such as word2vec, whose word vectors are fixed once trained, BERT assigns polysemous words different representations according to their contextual information, generating more accurate feature representations and thereby improving model performance.
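This context dependence can be checked empirically; the following sketch (again assuming the transformers package and the bert-base-chinese checkpoint) compares the vectors BERT produces for the polysemous word 苹果 ("apple" the fruit vs. the company) in two different sentences:

```python
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
bert = BertModel.from_pretrained("bert-base-chinese")

def char_vector(sentence: str, char: str) -> torch.Tensor:
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = bert(**enc).last_hidden_state[0]
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
    return hidden[tokens.index(char)]   # vector of the character's first occurrence

v_fruit = char_vector("我今天吃了一个苹果", "苹")     # fruit sense
v_brand = char_vector("苹果发布了新款手机", "苹")     # company sense
print(torch.cosine_similarity(v_fruit, v_brand, dim=0))  # noticeably below 1.0
```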
Because the improved LSTM can attend to the important elements of the input information, it directs the neurons' attention to those elements along the input dimension, which accelerates the fitting of the model and improves its training effect.
To verify the effectiveness of the improved LSTM, it was compared on the Chinese dataset THUCNews with the neural network models sRNN, LSTM and GRU. Each model was trained for 10, 30 and 50 epochs, and the model accuracy and fitting speed (convergence) were compared; the experimental results are shown in Table 1.
TABLE 1. Accuracy comparison of the improved LSTM with the sRNN, LSTM and GRU models
[Table 1 is provided as an image in the original publication; it compares the accuracy and convergence of the improved LSTM, sRNN, LSTM and GRU at 10, 30 and 50 training epochs.]
As can be seen from Table 1, the text classification model whose feature extraction layer uses the improved LSTM achieves the best accuracy at every number of training epochs: relative to LSTM, the improved LSTM improves accuracy by 1.86% after 10 epochs, by 1.14% after 30 epochs, and by 0.94% after 50 epochs.
As can be seen from FIGS. 4 and 5, after 6 iterations the validation accuracy curve of the improved LSTM no longer changes drastically and the corresponding validation loss reaches a minimum; the model is close to fitting and achieves its best effect.
Compared with the improved LSTM, the accuracy of sRNN, GRU and LSTM varies over a wider range across different numbers of training epochs. The training effect of the improved LSTM is similar under different numbers of epochs: the model accuracy is 92.02% after 10 epochs and 92.86% after 50 epochs, a difference of only 0.84%.
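For reference, a generic sketch of the evaluation protocol behind Table 1 and FIGS. 4-5: each model is trained for a fixed number of epochs while the validation accuracy and loss are recorded after every epoch. The optimizer, learning rate and data loaders are placeholders, since the patent does not specify them:

```python
import torch

def validate(model, loader, loss_fn):
    model.eval()
    correct, total, loss_sum = 0, 0, 0.0
    with torch.no_grad():
        for x, y in loader:
            logits = model(x)
            loss_sum += loss_fn(logits, y).item() * y.size(0)
            correct += (logits.argmax(dim=-1) == y).sum().item()
            total += y.size(0)
    return correct / total, loss_sum / total   # validation accuracy, validation loss

def train_and_track(model, train_loader, val_loader, epochs=50):
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)   # assumed optimizer settings
    loss_fn = torch.nn.CrossEntropyLoss()
    history = []
    for _ in range(epochs):
        model.train()
        for x, y in train_loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
        history.append(validate(model, val_loader, loss_fn))  # one point per epoch
    return history   # plot to reproduce curves like FIGS. 4 and 5
```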
The above description covers only preferred embodiments of the invention, and the protection scope of the invention is not limited thereto. Any substitution or change readily conceivable by a person skilled in the art within the technical scope disclosed by the invention, based on the technical solution and inventive concept of the invention, falls within the protection scope of the invention.

Claims (5)

1. The text classification method based on BERT and improved LSTM is characterized by comprising the following steps:
step 1: preprocessing input text data;
step 2: inputting the preprocessed text data into a BERT model for processing to obtain a word vector sequence;
step 3: deep-encoding the vector sequence with an improved LSTM network to obtain a feature vector;
step 4: reducing the dimension of the feature vector with a fully connected layer;
step 5: classifying the dimension-reduced feature vector with a classifier.
2. The text classification method based on BERT and improved LSTM according to claim 1, wherein the preprocessing of the text data in step 1 comprises punctuation filtering, abbreviation expansion, space deletion and illegal-character filtering.
3. The text classification method based on BERT and improved LSTM according to claim 1, wherein step 2 specifically comprises:
1) using the trained BERT model to segment the text of the preprocessed text data set T', obtaining the word vector set T'' = {t_1'', t_2'', ..., t_n''}, in which the text of the text data set is converted into a word vector t_i'' = {w_1, w_2, ..., w_L} of fixed length;
2) inputting the word vector set T'' into the Token embedding layer, Segment embedding layer and Position embedding layer of BERT to obtain the vector encoding V_1, the sentence encoding V_2 and the position encoding V_3, respectively;
3) adding V_1, V_2 and V_3 and inputting the sum into the bidirectional Transformer of BERT to obtain the word vector sequence S = {s_1, s_2, ..., s_n}.
4. The text classification method based on BERT and improved LSTM according to claim 1, wherein the LSTM unit of the improved LSTM network comprises a contribution gate, a forget gate, an input gate and an output gate; from the cell state c_{t-1} and hidden state h_{t-1} of the previous time step and the input information of the current time step, the contribution gate generates an attention vector a_t with the same dimension as the input vector x_t; a_t is combined with x_t to obtain the optimized input vector x_t', which serves as the input to the forget gate, the input gate and the output gate:
a_t = σ_a(W_a x_t + U_a h_{t-1} + M_a c_{t-1} + b_a)
x_t' = (x_t + h_{t-1}) ∘ a_t
forget gate:
f_t = σ_g(W_f x_t' + b_f)
input gate:
i_t = σ_g(W_i x_t' + b_i)
output gate:
o_t = σ_g(W_o x_t' + b_o)
cell state:
c_t = f_t ∘ c_{t-1} + i_t ∘ σ_c(W_c x_t' + b_c)
hidden state:
h_t = o_t ∘ σ_h(c_t)
where h_t is the hidden state at the current time t and c_t is the cell state at the current time t; W_a, U_a, M_a, W_f, W_i, W_o, W_c are weight matrices; b_a, b_f, b_i, b_o, b_c are bias terms; σ_a, σ_g, σ_c, σ_h are activation functions; and ∘ denotes element-wise multiplication.
5. The text classification method based on BERT and improved LSTM according to any one of claims 1-4, wherein step 5 performs probability classification on the dimension-reduced feature vector of step 4 with a softmax layer and outputs the probability prediction vector P = {p_1, p_2, ..., p_C}, where p_i (i = 1, 2, ..., C) denotes the probability that the text belongs to class i and C is the total number of classes; the class corresponding to the largest p_i is determined as the category of the text.
CN202010898906.5A 2020-08-31 2020-08-31 Text classification method based on BERT and improved LSTM Active CN112070139B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010898906.5A CN112070139B (en) 2020-08-31 2020-08-31 Text classification method based on BERT and improved LSTM


Publications (2)

Publication Number Publication Date
CN112070139A 2020-12-11
CN112070139B 2023-12-26

Family

ID=73665222

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010898906.5A Active CN112070139B (en) 2020-08-31 2020-08-31 Text classification method based on BERT and improved LSTM

Country Status (1)

Country Link
CN (1) CN112070139B (en)



Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106599933A (en) * 2016-12-26 2017-04-26 哈尔滨工业大学 Text emotion classification method based on the joint deep learning model
CN109101584A (en) * 2018-07-23 2018-12-28 湖南大学 A kind of sentence classification improved method combining deep learning with mathematical analysis
CN109344244A (en) * 2018-10-29 2019-02-15 山东大学 A kind of the neural network relationship classification method and its realization system of fusion discrimination information
CN109918491A (en) * 2019-03-12 2019-06-21 焦点科技股份有限公司 A kind of intelligent customer service question matching method of knowledge based library self study
CN109992783A (en) * 2019-04-03 2019-07-09 同济大学 Chinese term vector modeling method
CN109992648A (en) * 2019-04-10 2019-07-09 北京神州泰岳软件股份有限公司 The word-based depth text matching technique and device for migrating study
CN110362817A (en) * 2019-06-04 2019-10-22 中国科学院信息工程研究所 A kind of viewpoint proneness analysis method and system towards product attribute
CN110717334A (en) * 2019-09-10 2020-01-21 上海理工大学 Text emotion analysis method based on BERT model and double-channel attention
CN110781306A (en) * 2019-10-31 2020-02-11 山东师范大学 English text aspect layer emotion classification method and system

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112818123A (en) * 2021-02-08 2021-05-18 河北工程大学 Emotion classification method for text
CN113177120A (en) * 2021-05-11 2021-07-27 中国人民解放军国防科技大学 Method for quickly editing information based on Chinese text classification
CN113177120B (en) * 2021-05-11 2024-03-08 中国人民解放军国防科技大学 Quick information reorganizing method based on Chinese text classification
CN113821637A (en) * 2021-09-07 2021-12-21 北京微播易科技股份有限公司 Long text classification method and device, computer equipment and readable storage medium
CN115048447A (en) * 2022-06-27 2022-09-13 华中科技大学 Database natural language interface system based on intelligent semantic completion
CN115048447B (en) * 2022-06-27 2023-06-16 华中科技大学 Database natural language interface system based on intelligent semantic completion

Also Published As

Publication number Publication date
CN112070139B (en) 2023-12-26


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant