CN115238029A - Construction method and device of power failure knowledge graph - Google Patents

Construction method and device of power failure knowledge graph

Info

Publication number
CN115238029A
CN115238029A
Authority
CN
China
Prior art keywords
power failure
model
data
knowledge
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210710873.6A
Other languages
Chinese (zh)
Inventor
丁一
滕飞
张磐
霍现旭
庞超
杨挺
尚学军
陈沛
吴磊
张思涵
肖文瑞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
State Grid Tianjin Electric Power Co Ltd
Electric Power Research Institute of State Grid Tianjin Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
State Grid Tianjin Electric Power Co Ltd
Electric Power Research Institute of State Grid Tianjin Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, State Grid Tianjin Electric Power Co Ltd, Electric Power Research Institute of State Grid Tianjin Electric Power Co Ltd filed Critical State Grid Corp of China SGCC
Priority to CN202210710873.6A priority Critical patent/CN115238029A/en
Publication of CN115238029A publication Critical patent/CN115238029A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3344Query execution using natural language analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/901Indexing; Data structures therefor; Storage structures
    • G06F16/9024Graphs; Linked lists
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06Energy or water supply
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Business, Economics & Management (AREA)
  • General Health & Medical Sciences (AREA)
  • Economics (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to a method and a device for constructing a power failure knowledge graph, comprising the following steps: step 1, acquiring the data to be processed: acquiring power failure preprocessing text data; step 2, performing data preprocessing; step 3, performing entity extraction on the preprocessed data with a combined BERT-BiLSTM-CRF model; step 4, identifying and extracting relations between entities with a method based on dependency analysis, analyzing the dependency relations between sentence components by identifying and locating syntactic relations; step 5, knowledge storage and semantic triple representation; step 6, constructing the power failure knowledge graph. The method and the device can improve the accuracy of Chinese entity recognition and relation extraction.

Description

Construction method and device of power failure knowledge graph
Technical Field
The invention belongs to the technical field of application of artificial intelligence algorithms in power systems, and relates to a method and a device for constructing a knowledge graph, in particular to a method and a device for constructing a power failure knowledge graph.
Background
With the continuous development of power grids, the types and functions of power equipment are more complex than ever before, so the daily operation of the equipment, such as fault diagnosis and maintenance, depends heavily on the professional power knowledge of workers. However, due to the lack of effective knowledge extraction, reasoning and application, the diagnosis of power equipment faults by operation and maintenance personnel relies mainly on their own experience. This subjective approach is not only inefficient but also makes accuracy difficult to ensure.
In fact, the power industry has accumulated a large amount of Chinese technical literature over the decades, which contains rich knowledge of electrical equipment faults. If this knowledge can be extracted accurately and presented to employees in an understandable form or through an intelligent QA system, fault diagnosis will undoubtedly become faster and more accurate. In response to this problem, some academic and applied research attempts to extract, organize and present such knowledge, providing better support for intelligent diagnosis of power equipment faults. Research on knowledge organization of global fault maintenance texts constructs a knowledge model based on various maintenance behaviors, realizes flexible and clear knowledge expression of business logic, helps improve the degree of automation of the power system, and provides global knowledge support for the smart grid. Power text is unstructured data characterized by dense knowledge and rich knowledge types. Compared with structured data with strict format specifications, the expression of power text is more flexible and harder to read and understand.
Therefore, a fault-text natural language processing method and a behavior knowledge organization method suited to the characteristics of power faults need to be explored; how to construct a knowledge graph is the key problem in natural language processing of power text.
Through searching, no prior art publication which is the same as or similar to the present invention is found.
Disclosure of Invention
The invention aims to overcome the defects of the prior art, provides a method and a device for constructing a power failure knowledge graph, and can improve the accuracy of Chinese entity identification and relation extraction.
The invention solves the practical problem by adopting the following technical scheme:
a construction method of a power failure knowledge graph comprises the following steps:
step 1, acquiring data to be processed: acquiring power failure preprocessing text data;
step 2, performing data preprocessing on the power failure preprocessing training data acquired in the step 1;
step 3, adopting a combined BERT-BiLSTM-CRF model to perform entity extraction on the preprocessed data;
step 4, identifying and extracting the relationship between entities by adopting a method based on dependency analysis, and analyzing the dependency relationship between sentence components by identifying and positioning syntactic relationships;
step 5, knowledge storage and semantic triple representation: the knowledge storage specifically comprises the step of storing the entities, the attributes and the relations extracted in the step 4 into a database, and the semantic triple representation specifically comprises the step of representing the extracted knowledge in a triple form;
and 6, constructing the power failure knowledge graph: storing the processed knowledge into a graph database to construct the power failure knowledge graph.
Further, the specific steps of step 2 include:
(1) Word segmentation adopts an HMM-CRF segmentation method: first, the preprocessed data are segmented into words, the words are sorted, and a high-frequency dictionary with characteristic word frequencies is constructed; then the processed document is segmented again with a CRF-based segmentation model and the segmented document is imported into the high-frequency dictionary; a high-precision segmentation result is finally obtained.
(2) Word vector representation uses a Word2vec model to represent the text data; synonyms are identified by computing the cosine similarity between word vectors, and the word vectors obtained from the corpus can also serve as input to the subsequent entity recognition model.
(3) Keywords are extracted and an ontology dictionary is constructed: high-frequency keywords are extracted according to the frequency weight and the mean of the average information entropy, and irrelevant words are removed through manual screening to construct the ontology dictionary.
Moreover, the combined BERT-BiLSTM-CRF model comprises:
(1) BERT layer: performs feature extraction and training through a multilayer neural network, converting the input text into word vectors so that the BiLSTM layer can learn context features; the BERT model converts the input sequence into a comprehensive embedding of three features (tokens, segments and positions), then feeds this embedding into the model for extraction, using a self-attention mechanism and fully connected layers to model the input text.
(2) BiLSTM layer: automatically extracts features of the sentence context, where the input of each BiLSTM unit is a dynamic word vector sequence; the BiLSTM unit then learns how to extract local features of the sentence; finally, the hidden state sequences output by the forward and backward LSTMs are spliced in sentence order to obtain the complete hidden state sequence; the relevant quantities are given by the formulas:
i_t = δ(W_i · [h_{t-1}, x_t] + b_i) (1)
f_t = δ(W_f · [h_{t-1}, x_t] + b_f) (2)
o_t = δ(W_o · [h_{t-1}, x_t] + b_o) (3)
C_t = f_t * C_{t-1} + i_t * tanh(W_c · [h_{t-1}, x_t] + b_c) (4)
h_t = o_t * tanh(C_t) (5)
In formulas (1)-(5), i_t, f_t and o_t denote the three gating units of each LSTM cell: the input gate, the forget gate and the output gate. C_t denotes the cell state at time t, and h_t denotes the output state of the hidden layer at time t. x_t denotes the input at time t. δ(·) is the sigmoid activation function and tanh(·) is the hyperbolic tangent activation function. W_i, W_f, W_o and W_c denote the weight matrices applied to the concatenation of the hidden state vector h_{t-1} and the input vector x_t, and b_i, b_f, b_o and b_c denote the bias vectors.
(3) CRF layer: an undirected probabilistic graphical model of the joint distribution over label sequences; it normalizes local features into global features and, by computing the probability distribution of the whole sequence, obtains the globally optimal solution and alleviates the label bias problem; meanwhile, the CRF model learns implicit constraint rules among labels from the training data.
Moreover, the specific method of the step 4 is as follows:
firstly, the subject and the core predicate are extracted through semantic role labeling; then the object and subject semantically related to the core predicate are found through dependency parsing; finally, the relevant dependency relations in the power failure text and the entity relations based on the ontology structure are obtained through dependency parsing.
A power failure knowledge graph building apparatus comprising:
the data acquisition module is used for acquiring data to be processed and acquiring a power failure preprocessing text;
the data preprocessing module is used for preprocessing the power failure text: segmenting the text, acquiring word vectors, extracting keywords and constructing an ontology dictionary;
the model training module is used for performing entity extraction and relation extraction on the power failure text to be processed: acquiring the word vectors in the preprocessed data, inputting them into a bidirectional long short-term memory (BiLSTM) network for entity extraction, and extracting entity relations through dependency analysis;
and the graph construction module is configured to generate a knowledge graph comprising the entities and the relations between them, according to the entities and relations extracted by the model training module.
The invention has the advantages and beneficial effects that:
the invention fully considers the relation between power entities and the relation between long texts, provides a method and a device for creating a power failure knowledge graph based on a novel model of BERT-BilSTM-CRF, and has the innovation point that the accuracy of Chinese entity identification and relation extraction is improved through subtle fusion. First, the language pre-training module BERT (Bi-directional Encoder reproduction from transformations) uses dynamic word vectors in preliminary entity recognition. The method not only reduces the workload of downstream tasks, but also has higher accuracy in Chinese entity recognition. This is because dynamic word vectors are more advantageous than static word vectors in chinese entity recognition. For example, a dynamic word vector may express different semantics in different contexts. In addition, the CRF (Conditional Random Field) module restrains and reverses the labeling sequence, solves part of labeling deviation, calculates the joint probability of the whole labeling sequence, and can fully ensure the accuracy of Chinese entity identification. According to the method, BERT is used for carrying out data preprocessing on the electric power text in the early stage; the processed data is utilized to carry out electric power entity identification in the middle period; and in the later stage, the CRF is utilized to constrain the labeling sequence, the joint probability is calculated, and the extraction accuracy is ensured. Through experimental data analysis, the BERT-BilSTM-CRF trained by the method aiming at the power text has extremely high accuracy, and has an obvious effect on the construction of a power failure knowledge graph.
Drawings
FIG. 1 is a power failure knowledge graph construction flow diagram of the present invention;
FIG. 2 is a schematic diagram of the CBOW model of the present invention;
FIG. 3 is a block diagram of the BERT-BiLSTM-CRF model of the present invention;
FIG. 4 is an input representation schematic of the BERT model of the present invention;
fig. 5 is a schematic structural diagram of an embodiment of a power failure knowledge graph constructing apparatus provided in the present invention.
Detailed Description
The invention is described in further detail below with reference to the following drawings:
a method for constructing a power failure knowledge graph is shown in FIG. 1, and comprises the following steps:
step 1, acquiring data to be processed: acquiring power failure preprocessing text data;
in this embodiment, the power failure preprocessing training data includes: overhaul texts, operation manuals and the like.
Step 2, performing data preprocessing on the power failure preprocessing training data acquired in the step 1;
in this embodiment, the data preprocessing step specifically includes word segmentation, word vector representation, keyword extraction, and ontology dictionary construction.
The specific steps of the step 2 comprise:
(1) Word segmentation adopts an HMM-CRF segmentation method: first, the preprocessed data are segmented into words, the words are sorted, and a high-frequency dictionary with characteristic word frequencies is constructed; then the processed document is segmented again with a CRF-based segmentation model and the segmented document is imported into the high-frequency dictionary; a high-precision segmentation result is finally obtained.
(2) Word vector representation uses a Word2vec model to represent the text data; a schematic diagram of the CBOW model is shown in fig. 2, comprising an input layer, a hidden layer and an output layer. Synonyms are identified by computing the cosine similarity between word vectors, and the word vectors obtained from the corpus can also serve as input to the subsequent entity recognition model.
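The cosine-similarity computation used to identify synonyms can be sketched as follows; the three-dimensional toy vectors and the vocabulary below are illustrative assumptions, not vectors actually trained by the CBOW model:

```python
import math

def cosine_similarity(u, v):
    # cos(u, v) = (u . v) / (|u| * |v|)
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Toy word vectors (a real system would take them from the trained Word2vec model).
vectors = {
    "breaker":    [0.9, 0.1, 0.2],
    "switchgear": [0.8, 0.2, 0.1],
    "insulation": [0.1, 0.9, 0.7],
}

def most_similar(word, vectors):
    # Rank all other words by cosine similarity to `word` and return the best.
    others = [(w, cosine_similarity(vectors[word], v))
              for w, v in vectors.items() if w != word]
    return max(others, key=lambda p: p[1])[0]
```

Here `most_similar("breaker", vectors)` returns `"switchgear"`, the near-synonym, because their vectors point in almost the same direction.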
(3) High-frequency keywords are extracted according to the frequency weight and the mean of the average information entropy, and irrelevant words are removed through manual screening to construct the ontology dictionary.
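One way to read step (3) is as a score combining each word's overall frequency with the entropy of its distribution over documents; the exact weighting below (frequency multiplied by one plus entropy) is an illustrative assumption, since the patent does not give the formula:

```python
import math
from collections import Counter

def keyword_scores(documents):
    # Frequency weight: overall relative frequency of each word.
    freq = Counter(w for doc in documents for w in doc)
    total = sum(freq.values())
    scores = {}
    for word in freq:
        # Information entropy of the word's distribution over the documents
        # it occurs in: words spread evenly across many documents score higher.
        doc_counts = [doc.count(word) for doc in documents if word in doc]
        s = sum(doc_counts)
        entropy = -sum((c / s) * math.log2(c / s) for c in doc_counts)
        scores[word] = (freq[word] / total) * (1.0 + entropy)
    return scores

# Toy segmented corpus of three short fault records.
docs = [["transformer", "fault", "oil"],
        ["transformer", "winding", "fault"],
        ["transformer", "bushing"]]
scores = keyword_scores(docs)
top = max(scores, key=scores.get)
```

With this corpus the top-scoring keyword is "transformer": it is both the most frequent word and the one spread most evenly over the documents. The manual screening step would then remove any irrelevant high scorers before the ontology dictionary is built.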
Step 3, adopting the combined BERT-BiLSTM-CRF model to perform entity extraction on the preprocessed data; the structure of the BERT-BiLSTM-CRF model is shown in figure 3;
the BERT-BilSTM-CRF combined model comprises the following components:
(1) BERT layer: the input representation of the BERT model is shown in fig. 4. Performing feature extraction and training through a multilayer neural network, converting an input text into a word vector, and enabling a BilSTM layer to learn context features; the BERT model converts an input sequence into comprehensive embedding of three characteristics of Tokens, segmentations and Positions, then inputs the comprehensive embedding into the model for extraction, and uses a self-attention mechanism and a full-link layer to model an input text.
In this embodiment, the key part of the BERT model is a deep network based on the self-attention mechanism, which adjusts the weight coefficient matrices mainly through the correlations between words in the same sentence to obtain word representations. Compared with traditional static word vector training, the dynamic word vectors trained by the BERT model contain both the meanings of words and the features of their context words, and can capture implicit sentence-level features.
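The comprehensive embedding described above (token + segment + position) can be sketched with toy lookup tables; the tiny dimension and vocabulary below are illustrative assumptions, far smaller than BERT's real learned embedding matrices:

```python
# Toy lookup tables; real BERT uses learned matrices (vocab ~30k, dim 768).
DIM = 4
token_emb = {"[CLS]": [0.1] * DIM, "fault": [0.2] * DIM, "[SEP]": [0.3] * DIM}
segment_emb = {0: [0.01] * DIM, 1: [0.02] * DIM}       # sentence A vs. sentence B
position_emb = [[0.001 * p] * DIM for p in range(16)]  # positions 0..15

def bert_input(tokens, segments):
    # Comprehensive embedding: elementwise sum of the token, segment and
    # position embeddings at every input position.
    return [[t + s + p for t, s, p in zip(token_emb[tok],
                                          segment_emb[seg],
                                          position_emb[pos])]
            for pos, (tok, seg) in enumerate(zip(tokens, segments))]

x = bert_input(["[CLS]", "fault", "[SEP]"], [0, 0, 0])
```

Each row of `x` is one input position; for example, the second row is 0.2 (token) + 0.01 (segment) + 0.001 (position 1) = 0.211 in every dimension. This summed representation is what the self-attention layers consume.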
(2) BiLSTM layer: automatically extracts features of the sentence context, where the input of each BiLSTM unit is a dynamic word vector sequence; the BiLSTM unit then learns how to extract local features of the sentence; finally, the hidden state sequences output by the forward and backward LSTMs are spliced in sentence order to obtain the complete hidden state sequence. The relevant quantities are given by the formulas:
i_t = δ(W_i · [h_{t-1}, x_t] + b_i) (1)
f_t = δ(W_f · [h_{t-1}, x_t] + b_f) (2)
o_t = δ(W_o · [h_{t-1}, x_t] + b_o) (3)
C_t = f_t * C_{t-1} + i_t * tanh(W_c · [h_{t-1}, x_t] + b_c) (4)
h_t = o_t * tanh(C_t) (5)
In formulas (1)-(5), i_t, f_t and o_t denote the three gating units of each LSTM cell: the input gate, the forget gate and the output gate. C_t denotes the cell state at time t, and h_t denotes the output state of the hidden layer at time t. x_t denotes the input at time t. δ(·) is the sigmoid activation function and tanh(·) is the hyperbolic tangent activation function. W_i, W_f, W_o and W_c denote the weight matrices applied to the concatenation of the hidden state vector h_{t-1} and the input vector x_t, and b_i, b_f, b_o and b_c denote the bias vectors.
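Formulas (1)-(5) can be implemented directly; the sketch below uses a scalar hidden size and hand-picked weights purely for illustration (δ is taken to be the sigmoid, as is standard):

```python
import math

def sigmoid(z):
    # delta(.) in formulas (1)-(3)
    return 1.0 / (1.0 + math.exp(-z))

def lstm_cell(x_t, h_prev, c_prev, W, b):
    # One LSTM step with hidden/input size 1, mirroring formulas (1)-(5);
    # W[k] holds one weight for h_{t-1} and one for x_t.
    hx = [h_prev, x_t]                               # concatenation [h_{t-1}, x_t]
    dot = lambda w: w[0] * hx[0] + w[1] * hx[1]
    i_t = sigmoid(dot(W["i"]) + b["i"])              # (1) input gate
    f_t = sigmoid(dot(W["f"]) + b["f"])              # (2) forget gate
    o_t = sigmoid(dot(W["o"]) + b["o"])              # (3) output gate
    c_t = f_t * c_prev + i_t * math.tanh(dot(W["c"]) + b["c"])  # (4) cell state
    h_t = o_t * math.tanh(c_t)                       # (5) hidden output
    return h_t, c_t

# Illustrative fixed weights; a real BiLSTM learns these during training.
W = {k: [0.5, 0.5] for k in "ifoc"}
b = {k: 0.0 for k in "ifoc"}
h_t, c_t = lstm_cell(x_t=1.0, h_prev=0.0, c_prev=0.0, W=W, b=b)
```

A BiLSTM simply runs one such cell forward over the sequence and a second cell backward, then splices the two hidden state sequences, as described above.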
(3) CRF layer: an undirected probabilistic graphical model of the joint distribution over label sequences; it normalizes local features into global features and, by computing the probability distribution of the whole sequence, obtains the globally optimal solution and alleviates the label bias problem; meanwhile, the CRF model learns implicit constraint rules among labels from the training data.
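The globally optimal label sequence that the CRF layer selects can be sketched with Viterbi decoding over emission and transition scores; the toy label set (BIO tags) and the hand-picked scores below are illustrative assumptions:

```python
def viterbi(emissions, transitions, labels):
    # emissions[t][y]: score of label y at position t;
    # transitions[(y_prev, y)]: score of moving from y_prev to y.
    # Returns the single highest-scoring label sequence (the global optimum),
    # which is how the CRF avoids purely local labeling decisions.
    best = {y: (emissions[0][y], [y]) for y in labels}
    for t in range(1, len(emissions)):
        new_best = {}
        for y in labels:
            prev, (score, path) = max(
                ((p, best[p]) for p in labels),
                key=lambda kv: kv[1][0] + transitions[(kv[0], y)])
            new_best[y] = (score + transitions[(prev, y)] + emissions[t][y],
                           path + [y])
        best = new_best
    return max(best.values(), key=lambda sp: sp[0])[1]

labels = ["B", "I", "O"]
# Transition scores encode learned label constraints, e.g. "I cannot follow O".
trans = {(a, b): 0.0 for a in labels for b in labels}
trans[("O", "I")] = -10.0          # strong penalty against the O -> I transition
emis = [{"B": 2.0, "I": 0.0, "O": 1.0},
        {"B": 0.0, "I": 2.0, "O": 0.0},
        {"B": 0.0, "I": 0.0, "O": 2.0}]
path = viterbi(emis, trans, labels)
```

With these scores the decoded path is B, I, O; the transition penalty is exactly the kind of implicit constraint rule the CRF learns from the training data.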
Step 4, identifying and extracting the relationship between entities by adopting a method based on dependency analysis, and analyzing the dependency relationship between sentence components by identifying and positioning syntactic relations;
the specific method of the step 4 comprises the following steps:
firstly, the subject and the core predicate are extracted through semantic role labeling; then the object and subject semantically related to the core predicate are found through dependency parsing; finally, the relevant dependency relations in the power failure text and the entity relations based on the ontology structure are obtained through dependency parsing.
In the present embodiment, dependency parsing is mainly used to analyze four relation structures in a sentence: the subject-verb relation (SBV), the verb-object relation (VOB), the attribute relation (ATT) and the adverbial relation (ADV).
The process of locating and extracting predicates mainly comprises the following parts: locating and extracting all entities that have an SBV structural relation with the core predicate; preferentially locating and extracting entities that have a VOB structural relation with the core predicate; and locating entities that have an ATT structural relation with the core predicate.
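This SBV/VOB/ATT extraction can be sketched over a parse represented as (head, relation, dependent) arcs; a real system would obtain these arcs from a Chinese dependency parser such as LTP (an assumption here, since the patent does not name the parser), and the English example sentence is illustrative only:

```python
def extract_triple(arcs, core_predicate):
    # arcs: list of (head_word, relation, dependent_word) from a dependency parse.
    # First take the entity in an SBV (subject-verb) relation with the core
    # predicate, then the entity in a VOB (verb-object) relation with it.
    subject = next((d for h, r, d in arcs
                    if h == core_predicate and r == "SBV"), None)
    obj = next((d for h, r, d in arcs
                if h == core_predicate and r == "VOB"), None)
    # Attach ATT (attribute) modifiers to refine the subject entity.
    attrs = [d for h, r, d in arcs if h == subject and r == "ATT"]
    entity = " ".join(attrs + [subject]) if subject else None
    return (entity, core_predicate, obj)

# Toy parse of "transformer winding suffered short-circuit".
arcs = [("suffered", "SBV", "winding"),
        ("suffered", "VOB", "short-circuit"),
        ("winding", "ATT", "transformer")]
triple = extract_triple(arcs, "suffered")
```

The result is the semantic triple ("transformer winding", "suffered", "short-circuit"), ready for the storage step below.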
Step 5, knowledge storage and semantic triple representation: the knowledge storage specifically comprises the step of storing the entities, attributes and relations extracted in the step 4 into a database, and the semantic triple representation specifically comprises the step of representing the extracted knowledge in a triple form;
And 6, constructing the power failure knowledge graph: storing the processed knowledge into a graph database to construct the power failure knowledge graph.
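In practice step 6 would target a graph database such as Neo4j; as a self-contained sketch, the minimal in-memory store below is an illustrative stand-in (its class and entity names are assumptions, not part of the patented method):

```python
class TripleStore:
    # Minimal stand-in for a graph database: stores (head, relation, tail)
    # triples and answers one-hop neighbour queries.
    def __init__(self):
        self.triples = []

    def add(self, head, relation, tail):
        self.triples.append((head, relation, tail))

    def neighbours(self, entity):
        # All (relation, tail) pairs whose head is `entity`.
        return [(r, t) for h, r, t in self.triples if h == entity]

kg = TripleStore()
kg.add("transformer", "has_component", "winding")
kg.add("winding", "has_fault", "short-circuit")
kg.add("short-circuit", "causes", "outage")
```

Querying `kg.neighbours("winding")` returns `[("has_fault", "short-circuit")]`; a graph database would answer the same query with an index-backed traversal instead of a linear scan.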
A schematic structural diagram of an embodiment of the power failure knowledge graph construction apparatus is shown in fig. 5. The apparatus comprises:
the data acquisition module is used for acquiring data to be processed and acquiring a power failure preprocessing text;
the data preprocessing module is used for preprocessing the power failure text: segmenting the text, acquiring word vectors, extracting keywords and constructing an ontology dictionary;
the model training module is used for performing entity extraction and relation extraction on the power failure text to be processed: acquiring the word vectors in the preprocessed data, inputting them into a bidirectional long short-term memory (BiLSTM) network for entity extraction, and extracting entity relations through dependency analysis;
and the graph construction module is configured to generate a knowledge graph comprising the entities and the relations between them, according to the entities and relations extracted by the model training module.
It should be emphasized that the embodiments described herein are illustrative and not restrictive, and thus the present invention includes, but is not limited to, the embodiments described in the detailed description, as well as other embodiments that can be derived by one skilled in the art from the teachings herein.

Claims (5)

1. A construction method of a power failure knowledge graph, characterized by comprising the following steps:
step 1, acquiring data to be processed: acquiring power failure preprocessing text data;
step 2, performing data preprocessing on the power failure preprocessing training data acquired in the step 1;
step 3, adopting a combined BERT-BiLSTM-CRF model to perform entity extraction on the preprocessed data;
step 4, identifying and extracting the relationship between entities by adopting a method based on dependency analysis, and analyzing the dependency relationship between sentence components by identifying and positioning syntactic relationships;
step 5, knowledge storage and semantic triple representation: the knowledge storage specifically comprises the step of storing the entities, attributes and relations extracted in the step 4 into a database, and the semantic triple representation specifically comprises the step of representing the extracted knowledge in a triple form;
and 6, constructing the power failure knowledge graph: storing the processed knowledge into a graph database to construct the power failure knowledge graph.
2. The method for constructing the power failure knowledge graph according to claim 1, characterized in that the specific steps of step 2 comprise:
(1) Word segmentation adopts an HMM-CRF segmentation method: first, the preprocessed data are segmented into words, the words are sorted, and a high-frequency dictionary with characteristic word frequencies is constructed; then the processed document is segmented again with a CRF-based segmentation model and the segmented document is imported into the high-frequency dictionary; a high-precision segmentation result is finally obtained.
(2) Word vector representation uses a Word2vec model to represent the text data; synonyms are identified by computing the cosine similarity between word vectors, and the word vectors obtained from the corpus can also serve as input to the subsequent entity recognition model.
(3) Keywords are extracted and an ontology dictionary is constructed: high-frequency keywords are extracted according to the frequency weight and the mean of the average information entropy, and irrelevant words are removed through manual screening to construct the ontology dictionary.
3. The method for constructing the power failure knowledge graph according to claim 1, characterized in that the combined BERT-BiLSTM-CRF model comprises:
(1) BERT layer: performs feature extraction and training through a multilayer neural network, converting the input text into word vectors so that the BiLSTM layer can learn context features; the BERT model converts the input sequence into a comprehensive embedding of three features (tokens, segments and positions), then feeds this embedding into the model for extraction, using a self-attention mechanism and fully connected layers to model the input text.
(2) BiLSTM layer: automatically extracts features of the sentence context, where the input of each BiLSTM unit is a dynamic word vector sequence; the BiLSTM unit then learns how to extract local features of the sentence; finally, the hidden state sequences output by the forward and backward LSTMs are spliced in sentence order to obtain the complete hidden state sequence; the relevant quantities are given by the formulas:
i_t = δ(W_i · [h_{t-1}, x_t] + b_i) (1)
f_t = δ(W_f · [h_{t-1}, x_t] + b_f) (2)
o_t = δ(W_o · [h_{t-1}, x_t] + b_o) (3)
C_t = f_t * C_{t-1} + i_t * tanh(W_c · [h_{t-1}, x_t] + b_c) (4)
h_t = o_t * tanh(C_t) (5)
In formulas (1)-(5), i_t, f_t and o_t denote the three gating units of each LSTM cell: the input gate, the forget gate and the output gate. C_t denotes the cell state at time t, and h_t denotes the output state of the hidden layer at time t. x_t denotes the input at time t. δ(·) is the sigmoid activation function and tanh(·) is the hyperbolic tangent activation function. W_i, W_f, W_o and W_c denote the weight matrices applied to the concatenation of the hidden state vector h_{t-1} and the input vector x_t, and b_i, b_f, b_o and b_c denote the bias vectors.
(3) CRF layer: an undirected probabilistic graphical model of the joint distribution over label sequences; it normalizes local features into global features and, by computing the probability distribution of the whole sequence, obtains the globally optimal solution and alleviates the label bias problem; meanwhile, the CRF model learns implicit constraint rules among labels from the training data.
4. The method for constructing the power failure knowledge graph according to claim 1, characterized in that the specific method of step 4 comprises:
firstly, the subject and the core predicate are extracted through semantic role labeling; then the object and subject semantically related to the core predicate are found through dependency parsing; finally, the relevant dependency relations in the power failure text and the entity relations based on the ontology structure are obtained through dependency parsing.
5. A power failure knowledge graph construction apparatus, characterized by comprising:
the data acquisition module is used for acquiring data to be processed and acquiring a power failure preprocessing text;
the data preprocessing module is used for preprocessing the power failure text: segmenting the text, acquiring word vectors, extracting keywords and constructing an ontology dictionary;
the model training module is used for performing entity extraction and relation extraction on the power failure text to be processed: acquiring the word vectors in the preprocessed data, inputting them into a bidirectional long short-term memory (BiLSTM) network for entity extraction, and extracting entity relations through dependency analysis;
and the graph construction module is configured to generate a knowledge graph comprising the entities and the relations between them, according to the entities and relations extracted by the model training module.
CN202210710873.6A 2022-06-22 2022-06-22 Construction method and device of power failure knowledge graph Pending CN115238029A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210710873.6A CN115238029A (en) 2022-06-22 2022-06-22 Construction method and device of power failure knowledge graph


Publications (1)

Publication Number Publication Date
CN115238029A true CN115238029A (en) 2022-10-25

Family

ID=83669662

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210710873.6A Pending CN115238029A (en) 2022-06-22 2022-06-22 Construction method and device of power failure knowledge graph

Country Status (1)

Country Link
CN (1) CN115238029A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116069955A (en) * 2023-03-06 2023-05-05 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Space-time knowledge extraction method based on MDTA model
CN116414990A (en) * 2023-06-05 2023-07-11 深圳联友科技有限公司 Vehicle fault diagnosis and prevention method
CN116414990B (en) * 2023-06-05 2023-08-11 深圳联友科技有限公司 Vehicle fault diagnosis and prevention method
CN117910567A (en) * 2024-03-20 2024-04-19 道普信息技术有限公司 Vulnerability knowledge graph construction method based on safety dictionary and deep learning network
CN117993499A (en) * 2024-04-03 2024-05-07 江西省水利科学院(江西省大坝安全管理中心、江西省水资源管理中心) Multi-mode knowledge graph construction method for four pre-platforms for flood control in drainage basin
CN117993499B (en) * 2024-04-03 2024-06-04 江西省水利科学院(江西省大坝安全管理中心、江西省水资源管理中心) Multi-mode knowledge graph construction method for four pre-platforms for flood control in drainage basin

Similar Documents

Publication Publication Date Title
CN107679039B (en) Method and device for determining statement intention
CN106407333B (en) Spoken language query identification method and device based on artificial intelligence
CN111737496A (en) Power equipment fault knowledge map construction method
CN115238029A (en) Construction method and device of power failure knowledge graph
CN111931506B (en) Entity relationship extraction method based on graph information enhancement
CN110321563B (en) Text emotion analysis method based on hybrid supervision model
CN110929030A (en) Text abstract and emotion classification combined training method
CN113642330A (en) Rail transit standard entity identification method based on catalog topic classification
CN111966812B (en) Automatic question answering method based on dynamic word vector and storage medium
CN111274790B (en) Chapter-level event embedding method and device based on syntactic dependency graph
CN112100332A (en) Word embedding expression learning method and device and text recall method and device
CN111274817A (en) Intelligent software cost measurement method based on natural language processing technology
CN113378547B (en) GCN-based Chinese complex sentence implicit relation analysis method and device
CN114706559A (en) Software scale measurement method based on demand identification
CN113946684A (en) Electric power capital construction knowledge graph construction method
CN116661805B (en) Code representation generation method and device, storage medium and electronic equipment
CN115759254A (en) Question-answering method, system and medium based on knowledge-enhanced generative language model
CN111859938A (en) Electronic medical record entity relation extraction method based on position vector noise reduction and rich semantics
CN111178080A (en) Named entity identification method and system based on structured information
CN116522165B (en) Public opinion text matching system and method based on twin structure
CN117828024A (en) Plug-in retrieval method, device, storage medium and equipment
CN116467461A (en) Data processing method, device, equipment and medium applied to power distribution network
CN114510943B (en) Incremental named entity recognition method based on pseudo sample replay
CN116484848A (en) Text entity identification method based on NLP
CN114117069B (en) Semantic understanding method and system for intelligent knowledge graph questions and answers

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination