CN116204645B - Intelligent text classification method, system, storage medium and electronic equipment - Google Patents

Intelligent text classification method, system, storage medium and electronic equipment

Info

Publication number
CN116204645B
Authority
CN
China
Prior art keywords
reference model
training
interception
preset reference
classifier
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310227369.5A
Other languages
Chinese (zh)
Other versions
CN116204645A (en)
Inventor
戴长松
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shumei Tianxia Beijing Technology Co ltd
Beijing Nextdata Times Technology Co ltd
Original Assignee
Shumei Tianxia Beijing Technology Co ltd
Beijing Nextdata Times Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shumei Tianxia Beijing Technology Co ltd, Beijing Nextdata Times Technology Co ltd filed Critical Shumei Tianxia Beijing Technology Co ltd
Priority to CN202310227369.5A
Publication of CN116204645A
Application granted
Publication of CN116204645B
Legal status: Active
Anticipated expiration

Links

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 - Clustering; Classification
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/279 - Recognition of textual entities
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an intelligent text classification method, system, storage medium and electronic equipment, and relates to the field of natural language processing. The method comprises the following steps: acquiring interception information of the label information of text data to be classified; and classifying the text data to be classified according to a preset reference model and the interception information to obtain a classification result. Classifying with the preset reference model combined with the interception information reduces the negative influence caused by data problems and improves the label classification effect.

Description

Intelligent text classification method, system, storage medium and electronic equipment
Technical Field
The present invention relates to the field of natural language processing, and in particular, to an intelligent text classification method, system, storage medium, and electronic device.
Background
BERT (Bidirectional Encoder Representations from Transformers) is a breakthrough research advance in natural language processing. It achieves remarkable results on benchmarks such as GLUE and SQuAD and is widely applied to tasks such as text classification, natural language inference, sentiment analysis, semantic similarity, and reading comprehension. BERT obtains contextual representations by constructing a multi-layer bidirectional Transformer model and divides training into two stages, pre-training and fine-tuning: in pre-training, two unsupervised tasks, MLM (masked language modeling) and NSP (next sentence prediction), are used to train on unlabeled data; in the fine-tuning stage, the parameters of the pre-trained model are fine-tuned with the labeled data of the downstream task. In a classification task, if the labeled data suffers from problems such as dirty data or missing labels, the recognition accuracy of the model is reduced to a certain extent.
Disclosure of Invention
In view of the shortcomings of the prior art, the invention provides an intelligent text classification method, system, storage medium and electronic equipment.
The technical scheme for solving the technical problems is as follows:
an intelligent text classification method, comprising:
acquiring interception information of label information of text data to be classified;
and classifying the text data to be classified according to a preset reference model and the interception information to obtain a classification result.
The beneficial effects of the invention are as follows: classifying the text data to be classified according to the preset reference model combined with the interception information yields the classification result, reduces the negative influence caused by data problems, and improves the label classification effect.
Further, the method further comprises the following steps: replacing the softmax layer of the preset reference model with a sigmoid layer to obtain an optimized preset reference model;
the classifying the text data to be classified according to the preset reference model and the interception information specifically comprises the following steps:
and classifying the text data to be classified according to the optimized preset reference model and the interception information.
Further, the method further comprises the following steps: training the reference model through a downstream task data set to obtain a trained preset reference model;
the classifying the text data to be classified according to the preset reference model and the interception information specifically comprises the following steps:
and classifying the text data to be classified according to the trained preset reference model and the interception information.
Further, the text data to be classified includes: tag information and text content.
The other technical scheme for solving the technical problems is as follows:
an intelligent text classification system, comprising: an interception information acquisition module and a classification module;
the interception information acquisition module is used for acquiring interception information of tag information of text data to be classified;
the classification module is used for classifying the text data to be classified according to a preset reference model and the interception information to obtain a classification result.
The beneficial effects of the invention are as follows: classifying the text data to be classified according to the preset reference model combined with the interception information yields the classification result, reduces the negative influence caused by data problems, and improves the label classification effect.
Further, the method further comprises the following steps: the optimization module is used for replacing the softmax layer of the preset reference model with a sigmoid layer to obtain an optimized preset reference model;
the classification module is specifically configured to classify the text data to be classified according to the optimized preset reference model in combination with the interception information.
Further, the method further comprises the following steps: the training module is used for training the reference model through a downstream task data set to obtain a trained preset reference model;
the classification module is specifically configured to classify the text data to be classified according to the trained preset reference model in combination with the interception information.
Further, the text data to be classified includes: tag information and text content.
The other technical scheme for solving the technical problems is as follows:
a storage medium having instructions stored therein which, when read by a computer, cause the computer to perform an intelligent text classification method according to any of the above aspects.
An electronic device comprising a processor and a storage medium as described above, the processor executing the instructions in the storage medium.
Additional aspects of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
Fig. 1 is a schematic flow chart of an intelligent text classification method according to an embodiment of the present invention;
Fig. 2 is a block diagram of an intelligent text classification system according to an embodiment of the present invention;
Fig. 3 is a flowchart of an intelligent text classification method based on BERT loss masks according to other embodiments of the present invention;
Fig. 4 is a schematic structural diagram of a preset reference model according to another embodiment of the present invention.
Detailed Description
The principles and features of the present invention are described below with reference to the drawings; the illustrated embodiments are provided for illustration only and are not intended to limit the scope of the present invention.
As shown in Fig. 1, the intelligent text classification method provided by the embodiment of the invention includes:
s1, acquiring interception information of tag information of text data to be classified;
s2, classifying the text data to be classified according to a preset reference model and the interception information, and obtaining a classification result. It should be noted that, the construction flow of the model requires: data set, training resources (GPU), training code. The training code adopts a BERT model training code which is based on Tensorflow and is of Huggingface open source, and the code is modified and adapted to a certain extent. The reference model is called because it does not apply any additional techniques and is a basic, comparable, standard model. The modified content mainly comprises two points: 1, replacing a softmax layer of a model classifier with a sigmoid layer; 2, writing the acquired interception relation between the classifier and the label into the model in a code mode, wherein the specific implementation mode is to hide loss of the robbery-dividing label, so that the prediction accuracy of the classifier is improved.
In one embodiment, the specific classification process of the model may include the following steps (a minimal end-to-end sketch is given after this list):
1. Prepare an annotated classification data set (this data set is also referred to as the model's training set; a text classification data set typically contains text content and its labels);
2. Invoke GPU resources and train a BERT model (this is the reference model) on the data set of step 1;
3. Score the training set with the reference model to obtain the interception relationship between the classifiers and the labels (the specific method is explained below);
4. Modify the training code by feeding the interception relation obtained in step 3 into the model and retrain to obtain a new BERT model (the modification is described in detail below);
5. Test the reference model and the new model on the same test set; the new model achieves a better classification effect than the reference model. (This step only verifies the validity of the method and is not necessary for constructing the new model.)
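The end-to-end sketch referenced above is given here; every helper name in it (train_bert, score_training_set, extract_interception) is a hypothetical placeholder standing in for the corresponding step, not an identifier from the patent or from any library.

```python
# Hypothetical outline of the steps above. Each helper is a stub standing in
# for the corresponding step; none of these names come from the patent.
def train_bert(train_set, loss_mask_from=None):
    """Step 2 / step 4: train a BERT classifier (optionally with a loss mask)."""
    raise NotImplementedError

def score_training_set(model, train_set):
    """Step 3: score every training sample on every label with `model`."""
    raise NotImplementedError

def extract_interception(scores, threshold):
    """Step 3: map each classifier to the labels it tends to intercept."""
    raise NotImplementedError

def build_new_model(train_set, threshold=0.1):
    baseline = train_bert(train_set)                    # step 2: reference model
    scores = score_training_set(baseline, train_set)    # step 3: score training set
    relation = extract_interception(scores, threshold)  # step 3: interception relation
    return train_bert(train_set, loss_mask_from=relation)  # step 4: retrain with masked loss
```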
In one embodiment, the functionality of the model generally includes three parts: training, prediction and relation extraction.
Training constructs the model from code, computing resources and an annotated data set; prediction scores text data with unknown labels on the trained model, thereby obtaining the labels predicted by the model; the relation extraction module obtains the label interception tendencies of the classifiers from the scores of the reference BERT model on the training set. The structure of the model is shown in Fig. 4.
In some embodiments, the training process of the model may include:
1. Preparing an annotated training set (24 labels, about 3.01 million samples);
2. Preparing a GPU computing platform (NVIDIA Tesla V100) and a code training and debugging environment (TensorFlow, CUDA, etc.);
3. Preparing the BERT model training code (an open-source version provided by HuggingFace).
After all the above resources are prepared, model training can begin.
The model construction process is the model training process. Taking the invention as an example, it differs from the reference model training in that two steps are added after the reference model is trained:
1. Score the training set with the trained reference model to obtain the interception tendency relation between the classifiers and the labels;
2. Embed the interception tendencies obtained in step 1 into the model training process (the training code needs to be modified) and retrain to obtain a new model.
According to the method, classifying the text data to be classified according to the preset reference model combined with the interception information yields the classification result, reduces the negative influence caused by data problems, and improves the label classification effect.
Optionally, in some embodiments, further comprising: replacing the softmax layer of the preset reference model with a sigmoid layer to obtain an optimized preset reference model;
the classifying the text data to be classified according to the preset reference model and the interception information specifically comprises the following steps:
and classifying the text data to be classified according to the optimized preset reference model and the interception information.
Optionally, in some embodiments, the method further comprises: training the reference model through a downstream task data set to obtain a trained preset reference model. The downstream task data set here refers to a text classification data set, which generally contains label information and text content; for example, the training set in the experiments for this technique contains 24 labels and a total of about 3.01 million samples.
The classifying the text data to be classified according to the preset reference model and the interception information specifically comprises the following steps:
and classifying the text data to be classified according to the trained preset reference model and the interception information.
Optionally, in some embodiments, the text data to be classified includes: tag information and text content.
In one embodiment, an intelligent text classification method based on a BERT loss mask includes:
step 11: training a benchmark model on the downstream task data set using BERT; it should be noted that, constructing the reference model may include: the model needs to be trained on a GPU computing platform supporting parallel computing, and the bottom GPU based on the technical experiment is NVIDIA Tesla V100. Where the downstream task dataset refers to a classified text dataset, the dataset generally contains tag information and text content, for example, the training set in the present technology experiment contains 24 tags for a total of 301 ten thousand pieces of data. The construction flow of the model needs: data set, training resources (GPU), training code. The training code adopts a BERT model training code which is based on Tensorflow and is of Huggingface open source, and the code is modified and adapted to a certain extent. The reference model is called because it does not apply any additional techniques and is a basic, comparable, standard model.
Step 12: predicting the training data with the reference model to obtain the score of each label, then counting the average score of each classifier on each label's data set and setting a threshold t; when the average score is higher than the threshold, the sub-classifier is regarded as having an interception tendency toward the target label. The prediction process may include the following. Prediction is one of the basic functions of a model; here, prediction refers to using a trained model to classify data without any labels. For example, given a new piece of text (without any label information), the model scores it on all labels, thereby giving this text a predicted label.
Suppose the model has three sub-classifiers (classifier A, classifier B and classifier C) and the training set has three labels (label A, label B and label C). Predictive scoring of all samples of the training set with the reference model then yields the average score of each classifier over those samples. For example, if the average score of classifier A over all samples labeled C is S(A,C) = 0.2 and the threshold is set to t = 0.1, then S(A,C) > t, which indicates that classifier A has a tendency to intercept label C.
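A small sketch of how this average-score thresholding could be computed is given below; the array layout and the choice to exclude a classifier's own label from the relation are assumptions made for the example.

```python
# Illustrative computation of the interception relation from baseline scores.
# `scores` is an (N, L) array of the reference model's per-label scores for N
# training samples; `labels` holds each sample's true label id.
import numpy as np

def interception_relation(scores: np.ndarray, labels: np.ndarray, t: float = 0.1):
    num_labels = scores.shape[1]
    relation = {}  # classifier id -> set of label ids it tends to intercept
    for label in range(num_labels):
        sample_mask = labels == label
        if not sample_mask.any():
            continue
        # average score of every classifier over samples carrying this label
        mean_scores = scores[sample_mask].mean(axis=0)
        for clf in range(num_labels):
            if clf != label and mean_scores[clf] > t:
                relation.setdefault(clf, set()).add(label)
    return relation

# Matching the text: if classifier A's average score on label-C samples is 0.2
# and t = 0.1, classifier A is recorded as intercepting label C.
```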
It should be noted that prediction is one of the basic functions of the model, and the classifiers are the main carriers through which the model realizes prediction: if the training set has three labels, the model has three corresponding classifiers whose function is to predict the scores of a sample on those labels, and the model then judges, from these scores, which label the sample finally belongs to.
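The decision rules below are an assumed convention added for illustration; the patent only states that the model judges the final label from the classifier scores.

```python
# Illustrative decision rules for turning per-label scores into a prediction;
# the argmax / threshold conventions are assumptions, not the patent's rule.
import numpy as np

def predict_label(scores: np.ndarray) -> int:
    return int(np.argmax(scores))  # single-label decision: highest-scoring classifier wins

def predict_labels(scores: np.ndarray, threshold: float = 0.5) -> list:
    return [i for i, s in enumerate(scores) if s >= threshold]  # multi-label style decision
```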
Step 13: retraining on the same training set, except that when calculating the loss of each classifier, the loss of labels with an interception tendency is erased. It should be noted that, during training, the label information of a sample is used to construct the loss of each classifier; if classifier A has an interception tendency toward label C, then when the loss is constructed for a sample whose label is C, the loss of classifier A is set to 0 (the loss is erased). This eliminates the influence of classifiers that tend to grab the score of the current label, thereby improving the training effect for that label.
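A sketch of how such a mask could be built for a batch of samples follows; the dictionary representation of the interception relation and the helper name build_loss_mask are assumptions for the example. The resulting mask can then be fed to a per-label loss such as the masked_loss sketch shown earlier.

```python
# Sketch of building the loss mask used for retraining: for a sample whose
# true label is C, every classifier that intercepts C gets a 0 in its mask
# position, which erases that classifier's loss on this sample.
import numpy as np

def build_loss_mask(labels: np.ndarray, relation: dict, num_labels: int) -> np.ndarray:
    mask = np.ones((labels.shape[0], num_labels), dtype=np.float32)
    for i, label in enumerate(labels):
        for clf, intercepted in relation.items():
            if label in intercepted:
                mask[i, clf] = 0.0  # classifier `clf` tends to grab this label: drop its loss
    return mask
```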
Step 14: when the model outputs probabilities, the traditional softmax layer is replaced with a sigmoid layer to obtain a better label classification effect. It should be noted that during prediction the model scores samples through the classifiers, i.e. outputs a value for every label. Softmax is a function that maps these output values, transforming values of different magnitudes into probabilities between 0 and 1; sigmoid is also a function that maps values into probabilities between 0 and 1. The difference is that the probabilities produced by softmax sum to 1, whereas the sum of the probabilities after sigmoid mapping is often greater than 1; this follows from the different definitions of the two functions. The main reason for choosing sigmoid is that, unlike softmax, it imposes no constraint that the probabilities sum to 1, so the score of each classifier can be raised, the recall of the model on samples is improved, and a better classification effect is finally obtained.
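The numeric example below, with made-up logits, illustrates the difference described above.

```python
# Numeric illustration (made-up logits): softmax forces the per-label
# probabilities to compete and sum to 1, while sigmoid scores each label
# independently, so the sum can exceed 1.
import numpy as np

logits = np.array([2.0, 1.5, -1.0])

softmax = np.exp(logits) / np.exp(logits).sum()
sigmoid = 1.0 / (1.0 + np.exp(-logits))

print(softmax, softmax.sum())   # approx [0.60 0.37 0.03], sums to 1.0
print(sigmoid, sigmoid.sum())   # approx [0.88 0.82 0.27], sum is about 1.97
```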
It should be noted that steps 12, 13, 14 are in time sequence: the purpose of step 12 is to provide support for the training of the subsequent model (step 13) in order to obtain the interception tendency of the classifier; step 13, utilizing the interception tendency of the classifier obtained in the step 12 to change the training codes in a targeted manner and retraining, thereby obtaining a model embedded with priori information; step 14 is to modify the prediction process based on the trained model in step 13, so as to perfect the technical method.
The method can relieve the negative effects brought by problem data and improve the text classification effect. Mitigating the negative impact of problem data may include: for samples whose scores tend to be grabbed by other classifiers, loss masking lets the model focus on training the correct label, relieving the adverse effects caused by problem data; and switching the probability output layer to sigmoid raises the label scores of the model, avoiding the sum-to-1 probability constraint brought by softmax and relieving the score grabbing among labels.
Using sigmoid instead of softmax as the final probability output layer weakens the influence of other labels on the current label and can further improve the recall of the current label;
extracting the interception relations between the sub-classifiers and the labels from the scores of the BERT reference model on the training set, restarting training, and masking, in each sub-classifier's loss, the loss of labels with which it has an interception relation, so that the negative influence caused by data problems is reduced and the label classification effect is improved. It should be noted that extracting the interception relation between the sub-classifiers and the labels may include the following. Suppose the model has three sub-classifiers (classifier A, classifier B and classifier C) and the training set has three labels (label A, label B and label C). Predictive scoring of all samples of the training set with the reference model then yields the average score of each classifier over those samples; for example, if the average score of classifier A over all samples labeled C is S(A,C) = 0.2 and the threshold is set to t = 0.1, then S(A,C) > t, which indicates that classifier A has a tendency to intercept label C. The masking process may include: assuming classifier A has a tendency to intercept label C, when model training builds the loss for a sample of label C, the loss of classifier A is set to 0 (the loss is erased), which is also known as masking.
In one embodiment, as shown in FIG. 2, an intelligent text classification system includes: an interception information acquisition module 1101 and a classification module 1102;
the interception information acquisition module 1101 is configured to acquire interception information of tag information of text data to be classified;
the classification module 1102 is configured to classify the text data to be classified according to a preset reference model in combination with the interception information, so as to obtain a classification result.
According to the system, classifying the text data to be classified according to the preset reference model combined with the interception information yields the classification result, reduces the negative influence caused by data problems, and improves the label classification effect.
Optionally, in some embodiments, further comprising: the optimization module is used for replacing the softmax layer of the preset reference model with a sigmoid layer to obtain an optimized preset reference model;
the classification module is specifically configured to classify the text data to be classified according to the optimized preset reference model in combination with the interception information.
Optionally, in some embodiments, further comprising: the training module is used for training the reference model through a downstream task data set to obtain a trained preset reference model;
the classification module is specifically configured to classify the text data to be classified according to the trained preset reference model in combination with the interception information.
Optionally, in some embodiments, the text data to be classified includes: tag information and text content.
It is to be understood that in some embodiments, some or all of the alternatives described in the various embodiments above may be included.
It should be noted that, the foregoing embodiments are product embodiments corresponding to the previous method embodiments, and the description of each optional implementation manner in the product embodiments may refer to the corresponding description in the foregoing method embodiments, which is not repeated herein.
In one embodiment, a storage medium has instructions stored therein that, when read by a computer, cause the computer to perform a method of intelligent text classification as in any of the embodiments described above.
An electronic device comprising a processor and a storage medium of the above embodiments, the processor executing instructions in the storage medium.
The reader will appreciate that in the description of this specification, reference to the terms "one embodiment," "some embodiments," "example," "specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, the different embodiments or examples described in this specification, and their features, may be combined by those skilled in the art without contradiction.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the method embodiments described above are merely illustrative, e.g., the division of steps is merely a logical function division, and there may be additional divisions of actual implementation, e.g., multiple steps may be combined or integrated into another step, or some features may be omitted or not performed.
The above-described method, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
The present invention is not limited to the above embodiments, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the present invention, and these modifications and substitutions are intended to be included in the scope of the present invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.

Claims (6)

1. An intelligent text classification method, comprising:
acquiring interception information of label information of text data to be classified;
classifying the text data to be classified according to a preset reference model and the interception information to obtain a classification result;
further comprises: replacing the softmax layer of the preset reference model with a sigmoid layer to obtain an optimized preset reference model;
the classifying the text data to be classified according to the preset reference model and the interception information specifically comprises the following steps:
classifying the text data to be classified according to the optimized preset reference model and the interception information;
further comprises: training the reference model through a downstream task data set to obtain a trained preset reference model, wherein the training process comprises the following steps:
step 11: training a benchmark model on the downstream task data set using BERT;
step 12: predicting training data by using the reference model to obtain the score of each label, counting the average score of each classifier on each label data set, setting a threshold t, and regarding that the classifier has interception tendency on the target label when the average score is higher than the threshold;
step 13: retraining on the same training set, and erasing the label loss with interception tendency when calculating the loss of each classifier, specifically:
extracting interception relations between the classifier and the labels according to BERT reference model scoring on the training set, restarting training, and masking label loss with the interception relations on loss of the classifier;
step 14: and when the model outputs the probability, replacing the softmax layer of the preset reference model with the sigmoid layer.
2. The intelligent text classification method according to claim 1, wherein the text data to be classified comprises: tag information and text content.
3. An intelligent text classification system, comprising: an interception information acquisition module and a classification module;
the interception information acquisition module is used for acquiring interception information of tag information of text data to be classified;
the classification module is used for classifying the text data to be classified according to a preset reference model and the interception information to obtain a classification result;
further comprises: the optimization module is used for replacing the softmax layer of the preset reference model with a sigmoid layer to obtain an optimized preset reference model;
the classification module is specifically used for classifying the text data to be classified according to the optimized preset reference model and the interception information;
further comprises: the training module is used for training the reference model through a downstream task data set to obtain a trained preset reference model, and the training process comprises the following steps:
step 11: training a benchmark model on the downstream task data set using BERT;
step 12: predicting training data by using the reference model to obtain the score of each label, counting the average score of each classifier on each label data set, setting a threshold t, and regarding that the classifier has interception tendency on the target label when the average score is higher than the threshold;
step 13: retraining on the same training set, and erasing the label loss with interception tendency when calculating the loss of each classifier, specifically: extracting the interception relations between the classifiers and the labels according to the BERT reference model's scores on the training set, restarting training, and masking the label loss having the interception relation on the loss of the classifier;
step 14: and when the model outputs the probability, replacing the softmax layer of the preset reference model with the sigmoid layer.
4. An intelligent text classification system according to claim 3 wherein said text data to be classified comprises: tag information and text content.
5. A storage medium having stored therein instructions which, when read by a computer, cause the computer to perform an intelligent text classification method according to claim 1 or 2.
6. An electronic device comprising a processor and the storage medium of claim 5, the processor executing instructions in the storage medium.
CN202310227369.5A 2023-03-02 2023-03-02 Intelligent text classification method, system, storage medium and electronic equipment Active CN116204645B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310227369.5A CN116204645B (en) 2023-03-02 2023-03-02 Intelligent text classification method, system, storage medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN116204645A (en) 2023-06-02
CN116204645B (en) 2024-02-20

Family

ID=86509344

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310227369.5A Active CN116204645B (en) 2023-03-02 2023-03-02 Intelligent text classification method, system, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN116204645B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109308319A (en) * 2018-08-21 2019-02-05 深圳中兴网信科技有限公司 File classification method, document sorting apparatus and computer readable storage medium
CN111428028A (en) * 2020-03-04 2020-07-17 中国平安人寿保险股份有限公司 Information classification method based on deep learning and related equipment
CN114741503A (en) * 2022-03-07 2022-07-12 度小满科技(北京)有限公司 Text classification method, device and equipment and readable storage medium
WO2022227207A1 (en) * 2021-04-30 2022-11-03 平安科技(深圳)有限公司 Text classification method, apparatus, computer device, and storage medium


Also Published As

Publication number Publication date
CN116204645A (en) 2023-06-02

Similar Documents

Publication Publication Date Title
US11734328B2 (en) Artificial intelligence based corpus enrichment for knowledge population and query response
US20180053107A1 (en) Aspect-based sentiment analysis
US20210374347A1 (en) Few-shot named-entity recognition
CN115310425B (en) Policy text analysis method based on policy text classification and key information identification
CN109992664A (en) Mark classification method, device, computer equipment and the storage medium of central issue
CN111191275A (en) Sensitive data identification method, system and device
CN110442859B (en) Labeling corpus generation method, device, equipment and storage medium
US11900250B2 (en) Deep learning model for learning program embeddings
CN113138920B (en) Software defect report allocation method and device based on knowledge graph and semantic role labeling
CN112417132B (en) New meaning identification method for screening negative samples by using guest information
CN116432655B (en) Method and device for identifying named entities with few samples based on language knowledge learning
CN116070632A (en) Informal text entity tag identification method and device
CN114896971B (en) Method, device and storage medium for recognizing specific prefix and suffix negative words
CN113127607A (en) Text data labeling method and device, electronic equipment and readable storage medium
CN113779227B (en) Case fact extraction method, system, device and medium
Lin et al. Radical-based extract and recognition networks for Oracle character recognition
CN117725211A (en) Text classification method and system based on self-constructed prompt template
CN117520561A (en) Entity relation extraction method and system for knowledge graph construction in helicopter assembly field
CN116204645B (en) Intelligent text classification method, system, storage medium and electronic equipment
CN115936003A (en) Software function point duplicate checking method, device, equipment and medium based on neural network
CN115964484A (en) Legal multi-intention identification method and device based on multi-label classification model
CN115827871A (en) Internet enterprise classification method, device and system
CN114741512A (en) Automatic text classification method and system
CN110162629B (en) Text classification method based on multi-base model framework
WO2023035332A1 (en) Date extraction method and apparatus, computer device, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant