CN113326700A - ALBert-based complex heavy equipment entity extraction method - Google Patents

Publication number
CN113326700A
Authority
CN
China
Prior art keywords
albert
entity
model
heavy equipment
complex heavy
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110217185.1A
Other languages
Chinese (zh)
Other versions
CN113326700B (en)
Inventor
李军怀
陈苗苗
王怀军
曹霆
于蕾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian University of Technology
Original Assignee
Xian University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-02-26
Filing date
2021-02-26
Publication date
2021-08-31
Application filed by Xian University of Technology
Priority to CN202110217185.1A
Publication of CN113326700A
Application granted
Publication of CN113326700B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/242 Dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Animal Behavior & Ethology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses an ALBert-based entity extraction method for complex heavy equipment, implemented in the following steps: step 1, collecting texts in the field of complex heavy equipment and constructing a corpus; step 2, pre-training an ALBert model on the corpus obtained in step 1 to obtain a pre-trained word representation model ALBert; step 3, labeling the entity names in the corpus obtained in step 1 and converting the text into the format read by the algorithm to obtain a training set and a validation set; step 4, training the model by feeding the labeled data into the ALBert-BGRU-Attention-CRF algorithm to obtain a trained model; step 5, creating a dictionary Dict; step 6, inputting the text to be extracted into the model obtained in step 4 and combining the result with the dictionary Dict constructed in step 5 to obtain the entity extraction result. The invention can complete the entity extraction task in the field of complex heavy equipment.

Description

ALBert-based complex heavy equipment entity extraction method
Technical Field
The invention belongs to the technical field of knowledge graphs, and particularly relates to an ALBert-based entity extraction method for complex heavy equipment.
Background
Complex heavy equipment is among the most important basic equipment in the manufacturing industry. It underpins social and economic development and the national defense industry, and is regarded as equipment of national importance. As high-end equipment, it is widely used in key industries and fields such as energy, transportation, shipbuilding, engineering machinery, metallurgy, aerospace and the defense industry. Heavy equipment has a long development cycle with many complex stages, including preliminary investigation, design, manufacturing, procurement, matching, installation, commissioning, delivery, quality control and after-sales service. These stages generate a large amount of knowledge, and much of this knowledge is stored as text.
With the development of new internet technologies, effective management and reuse of knowledge in the equipment manufacturing industry can better support the whole process of design, production, operation and maintenance. A knowledge graph is an efficient way to organize and manage knowledge. Entity extraction is one of the key steps in constructing a knowledge graph, and its accuracy determines, to a certain extent, the accuracy of the resulting graph. Entity extraction from complex heavy equipment texts therefore lays the foundation for subsequent knowledge graph construction, effective knowledge management and knowledge reuse.
Disclosure of Invention
The invention aims to provide an ALBert-based entity extraction method for complex heavy equipment that can complete the entity extraction task in the field of complex heavy equipment.
The technical solution adopted by the invention is an ALBert-based entity extraction method for complex heavy equipment, implemented in the following steps:
step 1, collecting texts in the field of complex heavy equipment and constructing a corpus;
step 2, pre-training an ALBert model on the corpus obtained in step 1 to obtain a pre-trained word representation model ALBert;
step 3, labeling the entity names in the corpus obtained in step 1 and converting the text into the format read by the algorithm to obtain a training set and a validation set;
step 4, training the model by feeding the labeled data into the ALBert-BGRU-Attention-CRF algorithm to obtain a trained model;
step 5, creating a dictionary Dict;
step 6, inputting the text to be extracted into the model obtained in step 4 and combining the result with the dictionary Dict constructed in step 5 to obtain the entity extraction result.
The invention is further characterized as follows.
In step 1, the web crawler framework Scrapy is used to capture information about complex heavy equipment from web pages and store it as text files. The stored text is combined with manually collected existing documents from the complex heavy equipment field to form the data source. The data source is then processed to remove special symbols, formulas and measurement units, and the processed data is stored as a text file that serves as the corpus.
In step 2, the ALBert model takes single Chinese characters as input. A start mark [CLS] is added before the first character of each sentence and an end mark [SEP] is added at the end of each sentence. For each input character, ALBert outputs a representation vector that fuses the semantic information of the text. On the basis of the pre-trained ALBert model, the parameters of the subsequent connection layers are fine-tuned on the corpus from the data source, while the internal parameters of ALBert do not take part in training, yielding the fine-tuned ALBert model.
In step 3, entity labeling is done manually using the BIO scheme: the first character of an entity is labeled B-Type, the remaining characters of the entity are labeled I-Type, and non-entity characters and punctuation marks are labeled O, where Type denotes the entity type.
The model training in step 4 is specifically as follows:
step 4.1, inputting the training set and the validation set obtained in step 3 into the ALBert model fine-tuned in step 2 to generate word vectors;
step 4.2, inputting the word vectors generated in step 4.1 into a bidirectional gated recurrent unit (BGRU) to obtain the score of each character on every label;
step 4.3, weighting the result of step 4.2 with an Attention mechanism to obtain the weighted score of each character on every label;
step 4.4, constraining the label sequence with a conditional random field (CRF) to reduce the probability of invalid label sequences;
step 4.5, obtaining the trained entity extraction model.
Step 5 is specifically as follows:
relevant names are extracted from the detailed information table of the complex heavy equipment to form the dictionary Dict; the names include, but are not limited to, part names, assembly names and product names.
Step 6 is specifically as follows:
6.1, for a large amount of text to be extracted, all of the text is fed into the entity extraction model trained in step 4 to obtain a preliminary recognition result, and the dictionary Dict constructed in step 5 is then applied for a second extraction pass to obtain the final entity extraction result;
6.2, for entity extraction from a single sentence, online recognition is used: the sentence to be extracted is pasted into an online recognition window, and the model obtained in step 4 is called together with the dictionary Dict to give the extraction result.
The advantages of the invention are as follows. The ALBert-based complex heavy equipment entity extraction method labels, character by character, the relevant information in existing texts and web pages of the field to form the corpus, uses the fine-tuned ALBert for word embedding, and trains the deep learning algorithm BGRU-Attention-CRF to obtain the entity extraction model. To improve extraction accuracy and to account for the specialized terminology of the complex heavy equipment industry, a domain dictionary is added. When new text is input, the trained model identifies the entities in it and gives the final entity extraction result in combination with the dictionary.
Drawings
Fig. 1 is the overall flowchart of the ALBert-based complex heavy equipment entity extraction method of the invention;
Fig. 2 is a flowchart of the deep learning algorithm ALBert-BGRU-Attention-CRF used to build the entity extraction model in the ALBert-based complex heavy equipment entity extraction method.
Detailed Description
The invention is described in detail below with reference to the accompanying drawings and specific embodiments.
The flowchart of the ALBert-based complex heavy equipment entity extraction method of the invention is shown in Fig. 1. On the basis of data collection and processing, the deep-learning-based ALBert-BGRU-Attention-CRF algorithm is used to train an entity extraction model; after preliminary entity extraction on the text to be extracted, the dictionary (Dict) is applied to obtain the final extraction result. The method is implemented in the following steps:
Step 1, collecting texts in the field of complex heavy equipment and constructing a corpus;
in step 1, the web crawler framework Scrapy is used to capture information about complex heavy equipment from web pages and store it as text files. The stored text is combined with manually collected existing documents from the complex heavy equipment field to form the data source. The data source is then processed to remove special symbols, formulas and measurement units, and the processed data is stored as a text file that serves as the corpus.
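The cleaning part of step 1 can be sketched as follows. This is a minimal illustration only: the patent does not list which symbols, formulas or units are removed, so the three regular expressions and the function name `clean_text` are assumptions chosen for the example.

```python
import re

# Assumed patterns: the patent does not specify the exact rules.
# A "formula" is taken to be an ASCII expression of the form name = expression.
FORMULA_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9]*\s*=\s*[A-Za-z0-9+\-*/^(). ]+")
# A "measurement unit" is taken to be a number followed by one of a few units.
UNIT_PATTERN = re.compile(r"\d+(?:\.\d+)?\s*(?:mm|cm|kg|MPa|kN|MN|t|℃)(?![A-Za-z])")
# A "special symbol" is anything outside Chinese characters, ASCII letters and
# digits, common punctuation and whitespace.
SPECIAL_PATTERN = re.compile(
    r"[^\u4e00-\u9fffA-Za-z0-9，。；：、！？（）,.;:!?()\-\s]"
)


def clean_text(raw: str) -> str:
    """Remove formulas, measurement units, special symbols and whitespace."""
    text = FORMULA_PATTERN.sub("", raw)
    text = UNIT_PATTERN.sub("", text)
    text = SPECIAL_PATTERN.sub("", text)
    return re.sub(r"\s+", "", text)


if __name__ == "__main__":
    print(clean_text("挤压机 ★ 的压力 F = p * A 可达 125 MN。"))
```

In practice the patterns would need to be tuned to the crawled pages; the point is only that cleaning happens before the text is written to the corpus file.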
Step 2, pre-training an ALBert model on the corpus obtained in step 1 to obtain a pre-trained word representation model ALBert;
in step 2, the ALBert model takes single Chinese characters as input. A start mark [CLS] is added before the first character of each sentence and an end mark [SEP] is added at the end of each sentence. For each input character, ALBert outputs a representation vector that fuses the semantic information of the text. On the basis of the pre-trained ALBert model, the parameters of the subsequent connection layers are fine-tuned on the corpus from the data source, while the internal parameters of ALBert do not take part in training, yielding the fine-tuned ALBert model.
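The character-level input framing described in step 2 can be illustrated with a short sketch. The helper names `split_sentences` and `to_albert_tokens` are hypothetical, and splitting on the Chinese sentence-final marks 。！？ is an assumption; a real ALBert tokenizer would additionally map each token to a vocabulary ID.

```python
import re
from typing import List


def split_sentences(text: str) -> List[str]:
    """Split text into sentences, keeping the final punctuation (assumed: 。！？)."""
    parts = re.findall(r"[^。！？]+[。！？]?", text)
    return [p.strip() for p in parts if p.strip()]


def to_albert_tokens(sentence: str) -> List[str]:
    """One token per character, framed by the [CLS] and [SEP] marks."""
    return ["[CLS]"] + list(sentence) + ["[SEP]"]


if __name__ == "__main__":
    for s in split_sentences("挤压机很重要。它用于挤压加工！"):
        print(to_albert_tokens(s))
```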
Step 3, labeling the entity names in the corpus obtained in step 1 and converting the text into the format read by the algorithm to obtain a training set and a validation set;
in step 3, entity labeling is done manually using the BIO scheme: the first character of an entity is labeled B-Type, the remaining characters of the entity are labeled I-Type, and non-entity characters and punctuation marks are labeled O, where Type denotes the entity type.
For step 3, a web-based system for manual labeling and automatic format adjustment was developed. Entities are labeled manually through the developed data labeling web page.
The pseudo-code of the entity labeling and text format adjustment algorithm is as follows:
Input: text data to be labeled;
Output: labeled data;
1. Text preprocessing:
1.1. remove line feeds and spaces from the text, and display the reformatted text;
1.2. create a label array and initialize the label of every character in the text to O;
2. Entity labeling:
2.1. click a label type and select the corresponding entity; the labels of the selected entity's characters are set to that label type;
2.2. if full-text labeling is enabled, search the full text and set the labels of all entities with the same name to the selected label type;
3. Generate labeled data in the standard format: output the text character by character, appending to each character its label and a line feed;
Return: labeled data in the standard format;
Entities are labeled with the BIO scheme: the first character of an entity is labeled B-Type, the remaining characters of the entity are labeled I-Type, and non-entity characters and punctuation marks are labeled O, where Type denotes the entity type.
For example, take the corpus sentence "The metal extrusion press is the most important equipment for realizing metal extrusion processing." Labels are assigned character by character on the Chinese text: the five characters of "metal extrusion press" (金属挤压机) receive B-Product, I-Product, I-Product, I-Product, I-Product; the six characters of "metal extrusion processing" (金属挤压加工) receive B-Way, I-Way, I-Way, I-Way, I-Way, I-Way; every other character, including the final period, receives O.
Non-entity characters are labeled O. B-Product marks the first character of an entity of type Product, and I-Product marks the non-first characters of such an entity; B-Way marks the first character of an entity of type processing method, and I-Way marks the non-first characters of such an entity.
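The labeling pseudo-code and the BIO example above can be sketched in a few lines. This is an illustration under assumptions rather than the patented web system: the function names are hypothetical, the mouse selection of step 2.1 is replaced by a list of (entity name, type) pairs, and the Chinese sentence is an assumed rendering of the example sentence.

```python
from typing import List, Tuple


def bio_label(text: str, entities: List[Tuple[str, str]],
              full_text: bool = True) -> List[Tuple[str, str]]:
    """Return (character, label) pairs following the BIO scheme."""
    # 1.1 remove line feeds and spaces
    clean = "".join(c for c in text if not c.isspace())
    # 1.2 initialize every label to O
    labels = ["O"] * len(clean)
    # 2. label each selected entity (all occurrences if full_text is set)
    for name, etype in entities:
        if not name:
            continue
        start = clean.find(name)
        while start != -1:
            end = start + len(name)
            # only label spans that are not already part of another entity
            if all(label == "O" for label in labels[start:end]):
                labels[start] = "B-" + etype
                for i in range(start + 1, end):
                    labels[i] = "I-" + etype
            if not full_text:
                break
            start = clean.find(name, end)
    return list(zip(clean, labels))


def to_standard_format(pairs: List[Tuple[str, str]]) -> str:
    """3. one character per line, followed by its label and a line feed."""
    return "".join(f"{char} {label}\n" for char, label in pairs)


if __name__ == "__main__":
    sentence = "金属挤压机是实现金属挤压加工的最主要设备。"  # assumed rendering
    pairs = bio_label(sentence, [("金属挤压机", "Product"), ("金属挤压加工", "Way")])
    print(to_standard_format(pairs), end="")
```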
Step 4, training the model by feeding the labeled data into the ALBert-BGRU-Attention-CRF algorithm to obtain a trained model; the flowchart is shown in Fig. 2.
The model training in step 4 is specifically as follows:
step 4.1, inputting the training set and the validation set obtained in step 3 into the ALBert model fine-tuned in step 2 to generate word vectors;
step 4.2, inputting the word vectors generated in step 4.1 into a bidirectional gated recurrent unit (BGRU) to obtain the score of each character on every label;
step 4.3, weighting the result of step 4.2 with an Attention mechanism to obtain the weighted score of each character on every label;
step 4.4, constraining the label sequence with a conditional random field (CRF) to reduce the probability of invalid label sequences;
step 4.5, obtaining the trained entity extraction model.
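The effect of the sequence constraint in step 4.4 can be illustrated with a simplified decoder. A trained CRF learns transition scores; the sketch below instead assumes hard constraints (an I-Type label may only follow B-Type or I-Type of the same type, and a sequence may not start with I-Type) and runs Viterbi decoding over per-character label scores. The function names and the toy scores are hypothetical.

```python
from typing import Dict, List

NEG_INF = float("-inf")


def allowed(prev: str, curr: str) -> bool:
    """Assumed BIO constraint: I-X may only follow B-X or I-X."""
    if curr.startswith("I-"):
        return prev in ("B-" + curr[2:], "I-" + curr[2:])
    return True


def constrained_viterbi(scores: List[Dict[str, float]],
                        labels: List[str]) -> List[str]:
    """Highest-scoring label sequence that respects the BIO constraint."""
    # A sequence cannot start inside an entity.
    best = {l: (NEG_INF if l.startswith("I-") else scores[0][l]) for l in labels}
    back: List[Dict[str, str]] = []
    for pos in scores[1:]:
        new_best, pointers = {}, {}
        for curr in labels:
            top, arg = max((best[p], p) for p in labels if allowed(p, curr))
            new_best[curr] = top + pos[curr]
            pointers[curr] = arg
        best = new_best
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(best, key=best.get)]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


if __name__ == "__main__":
    labels = ["O", "B-Product", "I-Product"]
    scores = [
        {"O": 0.1, "B-Product": 0.2, "I-Product": 0.9},  # greedy pick is invalid
        {"O": 0.1, "B-Product": 0.1, "I-Product": 0.8},
        {"O": 0.9, "B-Product": 0.05, "I-Product": 0.05},
    ]
    print(constrained_viterbi(scores, labels))
```

Taking the best label independently at each position would start the sequence with I-Product; the constrained decoder returns B-Product, I-Product, O instead.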
The pseudo-code for training the entity extraction model is as follows:
Input: training set and validation set;
Output: entity extraction model;
1. import the training set and the validation set;
2. import the fine-tuned ALBert model;
3. feed the word vectors into BGRU-Attention-CRF;
4. specify the model parameters;
5. input the training set and the validation set and start training;
Return: the entity extraction model.
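Line 1 of the pseudo-code, importing the training and validation sets, might look like the sketch below. It assumes the standard format produced in step 3 (one character and its label per line), assumes that a sentence ends at 。, ！ or ？, and assumes an 80/20 split with a fixed seed; none of these details are stated in the patent, and the function names are hypothetical.

```python
import random
from typing import Iterable, List, Tuple

Sentence = List[Tuple[str, str]]
SENTENCE_END = {"。", "！", "？"}  # assumed sentence boundaries


def read_labeled(lines: Iterable[str]) -> List[Sentence]:
    """Group 'character label' lines into sentences."""
    sentences: List[Sentence] = []
    current: Sentence = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue
        char, label = line.split(" ", 1)
        current.append((char, label))
        if char in SENTENCE_END:
            sentences.append(current)
            current = []
    if current:
        sentences.append(current)
    return sentences


def split_train_validation(sentences: List[Sentence], ratio: float = 0.8,
                           seed: int = 0) -> Tuple[List[Sentence], List[Sentence]]:
    """Shuffle reproducibly and split into training and validation sets."""
    shuffled = list(sentences)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * ratio)
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    demo = ["挤 B-Product", "压 I-Product", "机 I-Product", "。 O", "设 O", "备 O", "。 O"]
    train, validation = split_train_validation(read_labeled(demo) * 5)
    print(len(train), len(validation))
```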
Step 5, creating a dictionary Dict;
step 5 is specifically as follows:
relevant names are extracted from the detailed information table of the complex heavy equipment to form the dictionary Dict; the names include, but are not limited to, part names, assembly names and product names.
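Building Dict from a detail table can be sketched as follows, assuming the table is available as CSV. The column names (`product`, `assembly`, `part`), the sample rows, the entity types other than Product, and the choice to map each name to an entity type are all assumptions made for the example.

```python
import csv
import io
from typing import Dict, Iterable, Mapping


def build_dict(rows: Iterable[Mapping[str, str]],
               columns: Mapping[str, str]) -> Dict[str, str]:
    """Map every non-empty name in the chosen columns to an entity type."""
    dictionary: Dict[str, str] = {}
    for row in rows:
        for column, entity_type in columns.items():
            name = (row.get(column) or "").strip()
            if name:
                # keep the first type seen for a name
                dictionary.setdefault(name, entity_type)
    return dictionary


if __name__ == "__main__":
    # Hypothetical detail table; real column names will differ.
    table = io.StringIO(
        "product,assembly,part\n"
        "金属挤压机,主缸组件,柱塞\n"
        "金属挤压机,,挤压筒\n"
    )
    columns = {"product": "Product", "assembly": "Assembly", "part": "Part"}
    print(build_dict(csv.DictReader(table), columns))
```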
Step 6, inputting the text to be extracted into the model obtained in step 4 and combining the result with the dictionary Dict constructed in step 5 to obtain the entity extraction result.
Step 6 is specifically as follows:
6.1, for a large amount of text to be extracted, all of the text is fed into the entity extraction model trained in step 4 to obtain a preliminary recognition result, and the dictionary Dict constructed in step 5 is then applied for a second extraction pass to obtain the final entity extraction result;
6.2, for entity extraction from a single sentence, online recognition is used: the sentence to be extracted is pasted into an online recognition window, and the model obtained in step 4 is called together with the dictionary Dict to give the extraction result.
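The patent does not state how the model output and the dictionary are merged in the second extraction pass. One plausible reading, sketched below as an assumption, keeps every span predicted by the model and adds dictionary matches (longest names first) only where they do not overlap an existing span. The function name, the span representation and the sample data are hypothetical.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int, str]  # (start, end, entity type), end exclusive


def merge_with_dict(text: str, model_entities: List[Span],
                    dictionary: Dict[str, str]) -> List[Span]:
    """Add non-overlapping dictionary matches to the model's predictions."""
    covered = [False] * len(text)
    result = list(model_entities)
    for start, end, _ in model_entities:
        for i in range(start, end):
            covered[i] = True
    # Longest names first, so a long name wins over a name it contains.
    for name in sorted(dictionary, key=len, reverse=True):
        if not name:
            continue
        start = text.find(name)
        while start != -1:
            end = start + len(name)
            if not any(covered[start:end]):
                result.append((start, end, dictionary[name]))
                for i in range(start, end):
                    covered[i] = True
            start = text.find(name, start + 1)
    return sorted(result)


if __name__ == "__main__":
    text = "金属挤压机的柱塞需要定期检查。"
    predicted = [(0, 5, "Product")]  # hypothetical model output
    domain_dict = {"柱塞": "Part", "挤压机": "Product"}
    print(merge_with_dict(text, predicted, domain_dict))
```

Here the dictionary entry 挤压机 is skipped because it lies inside the span the model already found, while 柱塞 is added as a new entity.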

Claims (7)

1. An ALBert-based complex heavy equipment entity extraction method, characterized by comprising the following steps:
step 1, collecting texts in the field of complex heavy equipment and constructing a corpus;
step 2, pre-training an ALBert model on the corpus obtained in step 1 to obtain a pre-trained word representation model ALBert;
step 3, labeling the entity names in the corpus obtained in step 1 and converting the text into the format read by the algorithm to obtain a training set and a validation set;
step 4, training the model by feeding the labeled data into the ALBert-BGRU-Attention-CRF algorithm to obtain a trained model;
step 5, creating a dictionary Dict;
step 6, inputting the text to be extracted into the model obtained in step 4 and combining the result with the dictionary Dict constructed in step 5 to obtain the entity extraction result.
2. The ALBert-based complex heavy equipment entity extraction method according to claim 1, wherein in step 1, the web crawler framework Scrapy is used to capture information about complex heavy equipment from web pages and store it as text files; the stored text is combined with manually collected existing documents from the complex heavy equipment field to form the data source; the data source is then processed to remove special symbols, formulas and measurement units; and the processed data is stored as a text file that serves as the corpus.
3. The ALBert-based complex heavy equipment entity extraction method according to claim 2, wherein in step 2, the ALBert model takes single Chinese characters as input, a start mark [CLS] is added before the first character of each sentence, an end mark [SEP] is added at the end of each sentence, and ALBert outputs, for each input character, a representation vector that fuses the semantic information of the text; on the basis of the pre-trained ALBert model, the parameters of the subsequent connection layers are fine-tuned on the corpus from the data source, while the internal parameters of ALBert do not take part in training, so as to obtain the fine-tuned ALBert model.
4. The ALBert-based complex heavy equipment entity extraction method according to claim 3, wherein in step 3, entity labeling is done manually using the BIO scheme: the first character of an entity is labeled B-Type, the remaining characters of the entity are labeled I-Type, and non-entity characters and punctuation marks are labeled O, where Type denotes the entity type.
5. The ALBert-based complex heavy equipment entity extraction method according to claim 4, wherein the model training in step 4 is specifically as follows:
step 4.1, inputting the training set and the validation set obtained in step 3 into the ALBert model fine-tuned in step 2 to generate word vectors;
step 4.2, inputting the word vectors generated in step 4.1 into a bidirectional gated recurrent unit (BGRU) to obtain the score of each character on every label;
step 4.3, weighting the result of step 4.2 with an Attention mechanism to obtain the weighted score of each character on every label;
step 4.4, constraining the label sequence with a conditional random field (CRF) to reduce the probability of invalid label sequences;
step 4.5, obtaining the trained entity extraction model.
6. The ALBert-based complex heavy equipment entity extraction method according to claim 5, wherein step 5 is specifically as follows:
relevant names are extracted from the detailed information table of the complex heavy equipment to form the dictionary Dict; the names include, but are not limited to, part names, assembly names and product names.
7. The ALBert-based complex heavy equipment entity extraction method according to claim 6, wherein step 6 is specifically as follows:
6.1, for a large amount of text to be extracted, all of the text is fed into the entity extraction model trained in step 4 to obtain a preliminary recognition result, and the dictionary Dict constructed in step 5 is then applied for a second extraction pass to obtain the final entity extraction result;
6.2, for entity extraction from a single sentence, online recognition is used: the sentence to be extracted is pasted into an online recognition window, and the model obtained in step 4 is called together with the dictionary Dict to give the extraction result.
CN202110217185.1A 2021-02-26 2021-02-26 ALBert-based complex heavy equipment entity extraction method Active CN113326700B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110217185.1A CN113326700B (en) 2021-02-26 2021-02-26 ALBert-based complex heavy equipment entity extraction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110217185.1A CN113326700B (en) 2021-02-26 2021-02-26 ALBert-based complex heavy equipment entity extraction method

Publications (2)

Publication Number Publication Date
CN113326700A (en) 2021-08-31
CN113326700B (en) 2024-05-14

Family

ID=77414448

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110217185.1A Active CN113326700B (en) 2021-02-26 2021-02-26 ALBert-based complex heavy equipment entity extraction method

Country Status (1)

Country Link
CN (1) CN113326700B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106815293A (en) * 2016-12-08 2017-06-09 中国电子科技集团公司第三十二研究所 System and method for constructing knowledge graph for information analysis
US20180203848A1 (en) * 2017-01-17 2018-07-19 Xerox Corporation Author personality trait recognition from short texts with a deep compositional learning approach
CN109359293A (en) * 2018-09-13 2019-02-19 内蒙古大学 Mongolian name entity recognition method neural network based and its identifying system
CN110188347A (en) * 2019-04-29 2019-08-30 西安交通大学 Relation extraction method is recognized between a kind of knowledge topic of text-oriented
CN110598203A (en) * 2019-07-19 2019-12-20 中国人民解放军国防科技大学 Military imagination document entity information extraction method and device combined with dictionary
CN110990525A (en) * 2019-11-15 2020-04-10 华融融通(北京)科技有限公司 Natural language processing-based public opinion information extraction and knowledge base generation method
CN111199152A (en) * 2019-12-20 2020-05-26 西安交通大学 Named entity identification method based on label attention mechanism
CN111444721A (en) * 2020-05-27 2020-07-24 南京大学 Chinese text key information extraction method based on pre-training language model
CN111860882A (en) * 2020-06-17 2020-10-30 国网江苏省电力有限公司 Method and device for constructing power grid dispatching fault processing knowledge graph
CN111950540A (en) * 2020-07-24 2020-11-17 浙江师范大学 Knowledge point extraction method, system, device and medium based on deep learning
CN112036185A (en) * 2020-11-04 2020-12-04 长沙树根互联技术有限公司 Method and device for constructing named entity recognition model based on industrial enterprise

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
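Wen Chaodong (温超东) et al.: "Patent text classification combining ALBERT and bidirectional gated recurrent unit" (结合ALBERT和双向门控循环单元的专利文本分类), Journal of Computer Applications (《计算机应用》), vol. 41, no. 2, 10 February 2021 (2021-02-10) *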

Also Published As

Publication number Publication date
CN113326700B (en) 2024-05-14

Similar Documents

Publication Publication Date Title
CN110598203B (en) Method and device for extracting entity information of military design document combined with dictionary
CN104050160B (en) Interpreter's method and apparatus that a kind of machine is blended with human translation
CN102262634B (en) Automatic questioning and answering method and system
CN110597997B (en) Military scenario text event extraction corpus iterative construction method and device
CN112417854A (en) Chinese document abstraction type abstract method
CN110705272A (en) Named entity identification method for automobile engine fault diagnosis
CN113987112B (en) Table information extraction method and device, storage medium and electronic equipment
CN111444704A (en) Network security keyword extraction method based on deep neural network
CN111460147B (en) Title short text classification method based on semantic enhancement
CN112051986A (en) Code search recommendation device and method based on open source knowledge
CN114443813A (en) Intelligent online teaching resource knowledge point concept entity linking method
CN111444720A (en) Named entity recognition method for English text
CN114969294A (en) Expansion method of sound-proximity sensitive words
CN113901224A (en) Knowledge distillation-based secret-related text recognition model training method, system and device
CN114298021A (en) Rumor detection method based on sentiment value selection comments
CN114239579A (en) Electric power searchable document extraction method and device based on regular expression and CRF model
CN113326700B (en) ALBert-based complex heavy equipment entity extraction method
CN116306506A (en) Intelligent mail template method based on content identification
CN114757191A9 (en) Electric power public opinion field named entity recognition method and system based on deep learning
CN115062615A (en) Financial field event extraction method and device
Sun et al. Generalized abbreviation prediction with negative full forms and its application on improving chinese web search
CN110990385A (en) Software for automatically generating news headlines based on Sequence2Sequence
Liu IntelliExtract: An End-to-End Framework for Chinese Resume Information Extraction from Document Images
CN111209404B (en) Method for generating similar question sentences based on deep learning assistance
CN113961674B (en) Semantic matching method and device for key information and public company announcement text

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant