CN113326700A - ALBert-based complex heavy equipment entity extraction method - Google Patents
- Publication number: CN113326700A
- Application number: CN202110217185.1A
- Authority: CN (China)
- Prior art keywords: albert, entity, model, heavy equipment, complex heavy
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
- G06F40/242—Dictionaries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Abstract
The invention discloses an ALBert-based entity extraction method for complex heavy equipment, implemented in the following steps: step 1, collecting texts in the field of complex heavy equipment and constructing a corpus; step 2, pre-training an ALBert model on the corpus obtained in step 1 to obtain a pre-trained word representation model ALBert; step 3, labeling the entity names in the corpus obtained in step 1 and converting the text into the format read by the algorithm to obtain a training set and a verification set; step 4, training the model by feeding the labeled data into the ALBert-BGRU-Attention-CRF algorithm to obtain a trained model; step 5, creating a dictionary Dict; and step 6, inputting the text to be extracted into the model obtained in step 4 and combining the result with the dictionary Dict constructed in step 5 to obtain the entity extraction result. The invention can complete the entity extraction task in the field of complex heavy equipment.
Description
Technical Field
The invention belongs to the technical field of knowledge graphs, and particularly relates to an ALBert-based entity extraction method for complex heavy equipment.
Background
Complex heavy equipment is among the most important basic equipment in the manufacturing industry. It is an important guarantee for social and economic development and for the national defense industry, and is therefore of particular national importance. As high-end equipment, heavy equipment is widely used in key industries and fields such as energy, transportation, shipbuilding, engineering machinery, metallurgy, aerospace and the military industry. Heavy equipment has a long development cycle with many stages, including preliminary investigation, design, manufacturing, procurement, matching, installation, commissioning, delivery, quality control and after-sales service. These stages generate a large amount of knowledge, much of which is stored as text.
With the development of new internet technologies, effective management and reuse of knowledge in the equipment manufacturing industry can better support the whole process of design, production, operation and maintenance. A knowledge graph is an efficient way to organize and manage knowledge. Entity extraction is one of the key steps in constructing a knowledge graph, and its accuracy determines, to a certain extent, the accuracy of the graph. Entity extraction from complex heavy equipment texts therefore lays a foundation for subsequent knowledge graph construction, effective knowledge management and knowledge reuse.
Disclosure of Invention
The invention aims to provide an ALBert-based entity extraction method for complex heavy equipment that can complete the entity extraction task in this field.
The technical scheme adopted by the invention is an ALBert-based entity extraction method for complex heavy equipment, implemented in the following steps:
step 1, collecting texts in the field of complex heavy equipment, and constructing a corpus;
step 2, pre-training an ALBert model on the corpus obtained in step 1 to obtain a pre-trained word representation model ALBert;
step 3, labeling the entity names in the corpus obtained in step 1, and converting the text into the format read by the algorithm to obtain a training set and a verification set;
step 4, training the model, namely feeding the labeled data into the ALBert-BGRU-Attention-CRF algorithm to obtain a trained model;
step 5, creating a dictionary Dict;
step 6, inputting the text to be extracted into the model obtained in step 4, and combining the result with the dictionary Dict constructed in step 5 to obtain the entity extraction result.
The invention is further characterized as follows.
In step 1, the web crawler framework Scrapy is used to capture information related to complex heavy equipment from web pages and store it as text files; the stored text is merged with manually collected existing documents from the complex heavy equipment field to serve as the data source; the data source is then processed to remove special symbols, formulas and measurement units; and the processed data is stored as text files to form the corpus.
In step 2, the ALBert model takes single Chinese characters as input; a start mark [CLS] is added before the first character of each sentence and an end mark [SEP] is added at the end of each sentence; the ALBert output is, for each input character, a representation vector that fuses the semantic information of the text; on the basis of the pre-trained ALBert model, the parameters of the subsequent connection layers are fine-tuned on the corpus from the data source, while the internal parameters of ALBert do not participate in training, which yields the fine-tuned ALBert model.
In step 3, entity labeling is completed manually using the BIO labeling scheme: the first character of an entity is labeled B-Type, the remaining characters of the entity are labeled I-Type, and non-entity characters and punctuation marks are labeled O, where Type denotes the entity type.
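The BIO scheme described above can be read back into entity spans with a short routine. The following is a minimal sketch, not part of the patent: the function name `bio_to_spans` and the span representation `(start, end_exclusive, type)` are assumptions made for illustration.

```python
from typing import List, Tuple


def bio_to_spans(tags: List[str]) -> List[Tuple[int, int, str]]:
    """Convert a BIO tag sequence into (start, end_exclusive, type) spans."""
    spans: List[Tuple[int, int, str]] = []
    start, current = None, None  # the entity currently being read, if any
    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            # A B- tag always opens a new entity and closes any open one.
            if current is not None:
                spans.append((start, i, current))
            start, current = i, tag[2:]
        elif tag.startswith("I-") and current == tag[2:]:
            # The I- tag continues the open entity of the same type.
            continue
        else:
            # "O", or an I- tag that does not continue the open entity.
            if current is not None:
                spans.append((start, i, current))
            start, current = None, None
    if current is not None:
        spans.append((start, len(tags), current))
    return spans
```

An I-Type tag that does not follow a matching B-Type or I-Type tag is treated as non-entity in this sketch; other policies are possible.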
The model training in step 4 is specifically as follows:
step 4.1, inputting the training set and the verification set obtained in step 3 into the ALBert model fine-tuned in step 2 to generate word vectors;
step 4.2, inputting the word vectors generated in step 4.1 into a bidirectional gated recurrent unit (BGRU) to obtain the score of each word on every label;
step 4.3, weighting the result of step 4.2 with an Attention mechanism to obtain the weighted score of each word on every label;
step 4.4, using a conditional random field (CRF) to constrain the label sequence and reduce the probability of abnormal sequences;
step 4.5, obtaining the trained entity extraction model.
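The source does not state which form of attention is used in step 4.3. As a rough illustration only, the sketch below assumes a parameter-free scaled dot-product self-attention applied to the per-position label scores from step 4.2; a trained model would normally use learned projections. The function names are hypothetical.

```python
import math
from typing import List


def softmax(xs: List[float]) -> List[float]:
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weighted_scores(scores: List[List[float]]) -> List[List[float]]:
    """scores[t][k] is the score of position t on label k.

    Each output row is a weighted average of all input rows, where the
    weights come from scaled dot-product similarity between positions.
    """
    dim = len(scores[0])
    weighted = []
    for query in scores:
        sims = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in scores
        ]
        weights = softmax(sims)
        weighted.append(
            [sum(w * row[k] for w, row in zip(weights, scores)) for k in range(dim)]
        )
    return weighted
```

The output has the same shape as the input, so it can be passed on to the CRF layer of step 4.4 unchanged.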
Step 5 is specifically as follows:
relevant names, including but not limited to part, assembly and product names, are extracted from the detailed information table of the complex heavy equipment to form the dictionary Dict.
Step 6 is specifically as follows:
step 6.1, for a large amount of text to be extracted, all of the text is fed into the entity extraction model trained in step 4 to obtain a preliminary recognition result, and the dictionary Dict constructed in step 5 is then applied for a secondary extraction on that basis to obtain the final entity extraction result;
step 6.2, for entity extraction from a single sentence, an online recognition mode is used: the sentence to be extracted is pasted into an online recognition window, and the model obtained in step 4 is called and combined with the dictionary Dict to give the extraction result.
The beneficial effects of the invention are as follows: the ALBert-based entity extraction method for complex heavy equipment labels relevant information, character by character, in existing texts and web pages from the field to form the corpus; uses the fine-tuned ALBert for word embedding; trains the deep learning algorithm BGRU-Attention-CRF to obtain the entity extraction model; and, to improve extraction accuracy and account for the specialized terms of the complex heavy equipment industry, adds a domain dictionary. When a new corpus is input, the trained model identifies the entities in it and gives the final entity extraction result in combination with the dictionary.
Drawings
Fig. 1 is the overall flowchart of the ALBert-based complex heavy equipment entity extraction method of the invention;
Fig. 2 is a flowchart of the deep learning algorithm ALBert-BGRU-Attention-CRF used to build the entity extraction model in the ALBert-based complex heavy equipment entity extraction method.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
The flowchart of the ALBert-based complex heavy equipment entity extraction method of the invention is shown in Fig. 1. On the basis of data collection and processing, the deep-learning-based ALBert-BGRU-Attention-CRF algorithm is used to train an entity extraction model; after preliminary entity extraction is performed on the corpus to be extracted, the result is combined with a dictionary (Dict) to obtain the final extraction result. The method is implemented in the following steps:
Step 1, collecting texts in the field of complex heavy equipment, and constructing a corpus.
In step 1, the web crawler framework Scrapy is used to capture information related to complex heavy equipment from web pages and store it as text files; the stored text is merged with manually collected existing documents from the complex heavy equipment field to serve as the data source; the data source is then processed to remove special symbols, formulas and measurement units; and the processed data is stored as text files to form the corpus.
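The cleaning part of step 1 could look roughly like the sketch below. The source does not list the symbols, formula syntax or units that are removed, so every pattern here is a hypothetical placeholder: formulas are assumed to be delimited by `$...$`, measurement units are assumed to be removed together with the number in front of them, and the symbol set is an arbitrary example.

```python
import re

# Hypothetical patterns; the patent does not specify the exact rules.
FORMULA_RE = re.compile(r"\$[^$]*\$")  # assumes formulas are written as $...$
UNIT_RE = re.compile(
    r"\d+(?:\.\d+)?\s*(?:mm|cm|m|kg|t|MPa|kN|MN|kW|℃)(?![A-Za-z])"
)  # a number followed by one of a few example units
SYMBOL_RE = re.compile(r"[#*@^~|<>\\★●■◆]")  # example "special symbols"


def clean_text(raw: str) -> str:
    """Remove formulas, number-plus-unit expressions and special symbols."""
    text = FORMULA_RE.sub("", raw)
    text = UNIT_RE.sub("", text)
    text = SYMBOL_RE.sub("", text)
    # Collapse the whitespace left behind by the removals.
    return re.sub(r"\s+", " ", text).strip()
```

In practice the pattern lists would be tuned to the crawled pages and the collected documents; the structure of the function stays the same.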
Step 2, pre-training an ALBert model on the corpus obtained in step 1 to obtain a pre-trained word representation model ALBert.
In step 2, the ALBert model takes single Chinese characters as input; a start mark [CLS] is added before the first character of each sentence and an end mark [SEP] is added at the end of each sentence; the ALBert output is, for each input character, a representation vector that fuses the semantic information of the text; on the basis of the pre-trained ALBert model, the parameters of the subsequent connection layers are fine-tuned on the corpus from the data source, while the internal parameters of ALBert do not participate in training, which yields the fine-tuned ALBert model.
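The input format described in step 2 can be illustrated with a few lines of code. This is a minimal sketch under the assumption that whitespace is dropped before tokenization; the function name `to_albert_tokens` is hypothetical, and a real pipeline would additionally map each token to a vocabulary id.

```python
from typing import List


def to_albert_tokens(sentence: str) -> List[str]:
    """Split a sentence into single characters and wrap it in [CLS] ... [SEP]."""
    chars = [ch for ch in sentence if not ch.isspace()]
    return ["[CLS]"] + chars + ["[SEP]"]
```

Because each Chinese character becomes its own token, the model output lines up one to one with the per-character BIO labels used in step 3.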
Step 3, labeling the entity names in the corpus obtained in step 1, and converting the text into the format read by the algorithm to obtain a training set and a verification set.
In step 3, entity labeling is completed manually using the BIO labeling scheme: the first character of an entity is labeled B-Type, the remaining characters of the entity are labeled I-Type, and non-entity characters and punctuation marks are labeled O, where Type denotes the entity type.
For step 3, a web page system for manual labeling and automatic format adjustment was developed; entity labeling is completed manually on the developed data labeling web page.
The pseudocode of the entity labeling and text format adjustment algorithm is as follows:
Input: text data to be labeled;
Output: labeled data;
1. text preprocessing:
1.1. remove line feeds and spaces from the text, and display the formatted text;
1.2. create a label array, and initialize the label of every character in the text to O;
2. entity labeling:
2.1. click a label type and select the corresponding entity in the text; the labels of the selected text are set to that label type;
2.2. if full-text labeling is enabled, search the full text and set the labels of all entities with the same name to the selected label type;
3. generate labeled data in the standard format: output the text character by character, adding after each character its label and a line feed;
Return: labeled data in the standard format.
Entities are labeled in the BIO scheme: the first character of an entity is labeled B-Type, the remaining characters of the entity are labeled I-Type, and non-entity characters and punctuation marks are labeled O, where Type denotes the entity type.
For example, take the corpus sentence "The metal extrusion press is the most important equipment for realizing metal extrusion processing." Labels are assigned character by character in the original Chinese text: the first character of the entity "metal extrusion press" is labeled B-Product and its remaining characters I-Product; the first character of the entity "metal extrusion processing" is labeled B-Way and its remaining characters I-Way; every other character, including the final full stop, is labeled O.
Here O marks non-entity information; B-Product marks the first character of an entity of type Product; I-Product marks a non-first character of an entity of type Product; B-Way marks the first character of an entity of type processing mode (Way); and I-Way marks a non-first character of an entity of that type.
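The labeling pseudocode can be sketched in code roughly as follows. This is an illustration rather than the patented web page system: the interactive selection of step 2.1 is replaced by a mapping from entity name to type, full-text labeling (step 2.2) is always on, and a single space is assumed as the separator between a character and its label. The function name `label_text` is hypothetical.

```python
from typing import Dict, List


def label_text(text: str, entities: Dict[str, str]) -> List[str]:
    """Return one "character label" line per character, in BIO format.

    entities maps an entity name to its type, e.g. {"金属挤压机": "Product"}.
    """
    # Step 1.1: remove line feeds and spaces.
    text = "".join(ch for ch in text if not ch.isspace())
    # Step 1.2: initialize every label to O.
    labels = ["O"] * len(text)
    # Step 2.2: label every occurrence; longer names are handled first so
    # that a short name cannot overwrite part of a longer one.
    for name in sorted(entities, key=len, reverse=True):
        if not name:
            continue
        start = text.find(name)
        while start != -1:
            end = start + len(name)
            if all(lab == "O" for lab in labels[start:end]):
                labels[start] = f"B-{entities[name]}"
                for i in range(start + 1, end):
                    labels[i] = f"I-{entities[name]}"
            start = text.find(name, start + 1)
    # Step 3: one character and its label per line.
    return [f"{ch} {lab}" for ch, lab in zip(text, labels)]
```

Joining the returned list with line feeds gives the character-per-line file described in step 3 of the pseudocode.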
Step 4, training the model, namely feeding the labeled data into the ALBert-BGRU-Attention-CRF algorithm to obtain a trained model; the flowchart is shown in Fig. 2.
The model training in step 4 is specifically as follows:
step 4.1, inputting the training set and the verification set obtained in step 3 into the ALBert model fine-tuned in step 2 to generate word vectors;
step 4.2, inputting the word vectors generated in step 4.1 into a bidirectional gated recurrent unit, BGRU (Bidirectional Gated Recurrent Unit), to obtain the score of each word on every label;
step 4.3, weighting the result of step 4.2 with an Attention mechanism to obtain the weighted score of each word on every label;
step 4.4, using a conditional random field, CRF (Conditional Random Field), to constrain the label sequence and reduce the probability of abnormal sequences;
step 4.5, obtaining the trained entity extraction model.
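To show what "constraining the label sequence" in step 4.4 means in practice, the sketch below decodes the best label path with Viterbi search under hard BIO rules: an I-Type label may only follow B-Type or I-Type of the same type, and a sequence may not start with I-Type. This is a simplification made for illustration; a trained CRF learns soft transition scores instead of the hard rules assumed here, and the function names are hypothetical.

```python
from typing import Dict, List

NEG_INF = float("-inf")


def allowed(prev: str, cur: str) -> bool:
    """An I-X label may only follow B-X or I-X; everything else is allowed."""
    if cur.startswith("I-"):
        return prev in (f"B-{cur[2:]}", cur)
    return True


def constrained_viterbi(scores: List[Dict[str, float]]) -> List[str]:
    """scores[t][label] is the score of position t on that label."""
    tags = list(scores[0])
    # A sequence cannot start with an I- label.
    best = {t: (NEG_INF if t.startswith("I-") else scores[0][t]) for t in tags}
    back: List[Dict[str, str]] = []
    for step in scores[1:]:
        new_best: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            # Best previous label among those allowed to precede cur.
            score, prev = max((best[p], p) for p in tags if allowed(p, cur))
            new_best[cur] = score + step[cur]
            pointers[cur] = prev
        best = new_best
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(best, key=best.get)]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]
```

With per-position argmax, a high I-Type score at the first position would produce an abnormal sequence; the constrained search instead returns the best sequence that respects the BIO rules.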
The pseudocode for training the entity extraction model is as follows:
Input: training set and verification set;
Output: entity extraction model;
1. import the training set and the verification set;
2. import the fine-tuned ALBert model;
3. feed the word vectors into BGRU-Attention-CRF;
4. specify the model parameters;
5. input the training set and the verification set and start training;
Return: the entity extraction model.
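Step 1 of the training pseudocode, importing the training and verification sets, might be sketched as below. The source does not say how sentences are separated in the labeled file or how the data is split, so two assumptions are made here: sentences are separated by blank lines, and a fixed fraction of shuffled sentences is held out for verification. All names are hypothetical.

```python
import random
from typing import List, Tuple

Sentence = Tuple[List[str], List[str]]  # (characters, labels)


def parse_labeled(data: str) -> List[Sentence]:
    """Parse "character label" lines; a blank line ends a sentence (assumed)."""
    sentences: List[Sentence] = []
    chars: List[str] = []
    tags: List[str] = []
    for line in data.splitlines():
        line = line.strip()
        if not line:
            if chars:
                sentences.append((chars, tags))
                chars, tags = [], []
            continue
        ch, tag = line.split(maxsplit=1)
        chars.append(ch)
        tags.append(tag)
    if chars:
        sentences.append((chars, tags))
    return sentences


def split_train_val(
    sentences: List[Sentence], val_ratio: float = 0.2, seed: int = 0
) -> Tuple[List[Sentence], List[Sentence]]:
    """Shuffle deterministically and hold out val_ratio of the sentences."""
    shuffled = sentences[:]
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * val_ratio))
    return shuffled[n_val:], shuffled[:n_val]
```

The resulting lists can then be converted to model inputs and passed to whichever training loop is used for the BGRU-Attention-CRF layers.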
Step 5, creating a dictionary Dict.
Step 5 is specifically as follows:
relevant names, including but not limited to part, assembly and product names, are extracted from the detailed information table of the complex heavy equipment to form the dictionary Dict.
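Building the dictionary in step 5 could be sketched as follows, assuming the detailed information table is available as CSV text. The column names and entity types in the example are invented for illustration; the source only says that part, assembly and product names, among others, are extracted.

```python
import csv
import io
from typing import Dict


def build_dict(table_csv: str, name_columns: Dict[str, str]) -> Dict[str, str]:
    """Collect names from selected columns of a CSV table.

    name_columns maps a column header to the entity type assigned to the
    names found in that column (both are hypothetical in this sketch).
    """
    dictionary: Dict[str, str] = {}
    for row in csv.DictReader(io.StringIO(table_csv)):
        for column, entity_type in name_columns.items():
            name = (row.get(column) or "").strip()
            if name:
                dictionary[name] = entity_type
    return dictionary
```

For example, a table with `product` and `part` columns would be read with `build_dict(table, {"product": "Product", "part": "Part"})`.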
Step 6, inputting the text to be extracted into the model obtained in step 4, and combining the result with the dictionary Dict constructed in step 5 to obtain the entity extraction result.
Step 6 is specifically as follows:
step 6.1, for a large amount of text to be extracted, all of the text is fed into the entity extraction model trained in step 4 to obtain a preliminary recognition result, and the dictionary Dict constructed in step 5 is then applied for a secondary extraction on that basis to obtain the final entity extraction result;
step 6.2, for entity extraction from a single sentence, an online recognition mode is used: the sentence to be extracted is pasted into an online recognition window, and the model obtained in step 4 is called and combined with the dictionary Dict to give the extraction result.
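The source does not specify how the dictionary is combined with the model output in step 6. One plausible reading, sketched below as an assumption, is that the model's preliminary spans are kept as they are and the dictionary adds any entry found in text the model left unlabeled, with longer entries matched first. The function name and the span format `(start, end_exclusive, type)` are hypothetical.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, type)


def dictionary_pass(
    text: str, model_spans: List[Span], dictionary: Dict[str, str]
) -> List[Span]:
    """Add dictionary matches that do not overlap the model's spans."""
    covered = [False] * len(text)
    for start, end, _ in model_spans:
        for i in range(start, end):
            covered[i] = True
    result = list(model_spans)
    # Longer names first, so the longest dictionary match wins.
    for name in sorted(dictionary, key=len, reverse=True):
        if not name:
            continue
        start = text.find(name)
        while start != -1:
            end = start + len(name)
            if not any(covered[start:end]):
                result.append((start, end, dictionary[name]))
                for i in range(start, end):
                    covered[i] = True
            start = text.find(name, start + 1)
    return sorted(result)
```

The same function serves both the batch mode of step 6.1 and the single-sentence mode of step 6.2, since both reduce to one call per text.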
Claims (7)
1. An ALBert-based entity extraction method for complex heavy equipment, characterized by comprising the following steps:
step 1, collecting texts in the field of complex heavy equipment, and constructing a corpus;
step 2, pre-training an ALBert model on the corpus obtained in step 1 to obtain a pre-trained word representation model ALBert;
step 3, labeling the entity names in the corpus obtained in step 1, and converting the text into the format read by the algorithm to obtain a training set and a verification set;
step 4, training the model, namely feeding the labeled data into the ALBert-BGRU-Attention-CRF algorithm to obtain a trained model;
step 5, creating a dictionary Dict;
step 6, inputting the text to be extracted into the model obtained in step 4, and combining the result with the dictionary Dict constructed in step 5 to obtain the entity extraction result.
2. The ALBert-based complex heavy equipment entity extraction method according to claim 1, wherein in step 1, the web crawler framework Scrapy is used to capture information related to complex heavy equipment from web pages and store it as text files; the stored text is merged with manually collected existing documents from the complex heavy equipment field to serve as the data source; the data source is then processed to remove special symbols, formulas and measurement units; and the processed data is stored as text files to form the corpus.
3. The ALBert-based complex heavy equipment entity extraction method according to claim 2, wherein in step 2, the ALBert model takes single Chinese characters as input; a start mark [CLS] is added before the first character of each sentence and an end mark [SEP] is added at the end of each sentence; the ALBert output is, for each input character, a representation vector that fuses the semantic information of the text; on the basis of the pre-trained ALBert model, the parameters of the subsequent connection layers are fine-tuned on the corpus from the data source, while the internal parameters of ALBert do not participate in training, so as to obtain the fine-tuned ALBert model.
4. The ALBert-based complex heavy equipment entity extraction method according to claim 3, wherein in step 3, entity labeling is completed manually using the BIO labeling scheme: the first character of an entity is labeled B-Type, the remaining characters of the entity are labeled I-Type, and non-entity characters and punctuation marks are labeled O, where Type denotes the entity type.
5. The ALBert-based complex heavy equipment entity extraction method according to claim 4, wherein the model training in step 4 is specifically as follows:
step 4.1, inputting the training set and the verification set obtained in step 3 into the ALBert model fine-tuned in step 2 to generate word vectors;
step 4.2, inputting the word vectors generated in step 4.1 into a bidirectional gated recurrent unit (BGRU) to obtain the score of each word on every label;
step 4.3, weighting the result of step 4.2 with an Attention mechanism to obtain the weighted score of each word on every label;
step 4.4, using a conditional random field (CRF) to constrain the label sequence and reduce the probability of abnormal sequences;
step 4.5, obtaining the trained entity extraction model.
6. The ALBert-based complex heavy equipment entity extraction method according to claim 5, wherein step 5 is specifically as follows:
relevant names, including but not limited to part, assembly and product names, are extracted from the detailed information table of the complex heavy equipment to form the dictionary Dict.
7. The ALBert-based complex heavy equipment entity extraction method according to claim 6, wherein step 6 is specifically as follows:
step 6.1, for a large amount of text to be extracted, all of the text is fed into the entity extraction model trained in step 4 to obtain a preliminary recognition result, and the dictionary Dict constructed in step 5 is then applied for a secondary extraction on that basis to obtain the final entity extraction result;
step 6.2, for entity extraction from a single sentence, an online recognition mode is used: the sentence to be extracted is pasted into an online recognition window, and the model obtained in step 4 is called and combined with the dictionary Dict to give the extraction result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110217185.1A CN113326700B (en) | 2021-02-26 | 2021-02-26 | ALBert-based complex heavy equipment entity extraction method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113326700A (en) | 2021-08-31 |
CN113326700B (en) | 2024-05-14 |
Family
ID=77414448
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN202110217185.1A | Active | CN113326700B (en) | 2021-02-26 | 2021-02-26 | ALBert-based complex heavy equipment entity extraction method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113326700B (en) |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106815293A (en) * | 2016-12-08 | 2017-06-09 | 中国电子科技集团公司第三十二研究所 | System and method for constructing knowledge graph for information analysis |
US20180203848A1 (en) * | 2017-01-17 | 2018-07-19 | Xerox Corporation | Author personality trait recognition from short texts with a deep compositional learning approach |
CN109359293A (en) * | 2018-09-13 | 2019-02-19 | 内蒙古大学 | Mongolian name entity recognition method neural network based and its identifying system |
CN110188347A (en) * | 2019-04-29 | 2019-08-30 | 西安交通大学 | Relation extraction method is recognized between a kind of knowledget opic of text-oriented |
CN110598203A (en) * | 2019-07-19 | 2019-12-20 | 中国人民解放军国防科技大学 | Military imagination document entity information extraction method and device combined with dictionary |
CN110990525A (en) * | 2019-11-15 | 2020-04-10 | 华融融通(北京)科技有限公司 | Natural language processing-based public opinion information extraction and knowledge base generation method |
CN111199152A (en) * | 2019-12-20 | 2020-05-26 | 西安交通大学 | Named entity identification method based on label attention mechanism |
CN111444721A (en) * | 2020-05-27 | 2020-07-24 | 南京大学 | Chinese text key information extraction method based on pre-training language model |
CN111860882A (en) * | 2020-06-17 | 2020-10-30 | 国网江苏省电力有限公司 | Method and device for constructing power grid dispatching fault processing knowledge graph |
CN111950540A (en) * | 2020-07-24 | 2020-11-17 | 浙江师范大学 | Knowledge point extraction method, system, device and medium based on deep learning |
CN112036185A (en) * | 2020-11-04 | 2020-12-04 | 长沙树根互联技术有限公司 | Method and device for constructing named entity recognition model based on industrial enterprise |
Non-Patent Citations (1)
Title |
---|
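Wen Chaodong et al.: "Patent text classification combining ALBERT and bidirectional gated recurrent unit", Journal of Computer Applications (《计算机应用》), vol. 41, no. 2, 10 February 2021 (2021-02-10) * |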
Also Published As
Publication number | Publication date |
---|---|
CN113326700B (en) | 2024-05-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110598203B (en) | Method and device for extracting entity information of military design document combined with dictionary | |
CN104050160B (en) | Interpreter's method and apparatus that a kind of machine is blended with human translation | |
CN102262634B (en) | Automatic questioning and answering method and system | |
CN110597997B (en) | Military scenario text event extraction corpus iterative construction method and device | |
CN112417854A (en) | Chinese document abstraction type abstract method | |
CN110705272A (en) | Named entity identification method for automobile engine fault diagnosis | |
CN113987112B (en) | Table information extraction method and device, storage medium and electronic equipment | |
CN111444704A (en) | Network security keyword extraction method based on deep neural network | |
CN111460147B (en) | Title short text classification method based on semantic enhancement | |
CN112051986A (en) | Code search recommendation device and method based on open source knowledge | |
CN114443813A (en) | Intelligent online teaching resource knowledge point concept entity linking method | |
CN111444720A (en) | Named entity recognition method for English text | |
CN114969294A (en) | Expansion method of sound-proximity sensitive words | |
CN113901224A (en) | Knowledge distillation-based secret-related text recognition model training method, system and device | |
CN114298021A (en) | Rumor detection method based on sentiment value selection comments | |
CN114239579A (en) | Electric power searchable document extraction method and device based on regular expression and CRF model | |
CN113326700B (en) | ALBert-based complex heavy equipment entity extraction method | |
CN116306506A (en) | Intelligent mail template method based on content identification | |
CN114757191A9 (en) | Electric power public opinion field named entity recognition method and system based on deep learning | |
CN115062615A (en) | Financial field event extraction method and device | |
Sun et al. | Generalized abbreviation prediction with negative full forms and its application on improving chinese web search | |
CN110990385A (en) | Software for automatically generating news headlines based on Sequence2Sequence | |
Liu | IntelliExtract: An End-to-End Framework for Chinese Resume Information Extraction from Document Images | |
CN111209404B (en) | Method for generating similar question sentences based on deep learning assistance | |
CN113961674B (en) | Semantic matching method and device for key information and public company announcement text |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||