CN114077674A - Power grid dispatching knowledge graph data optimization method and system - Google Patents

Power grid dispatching knowledge graph data optimization method and system

Info

Publication number
CN114077674A
Authority
CN
China
Prior art keywords
power grid
entity
data
knowledge
graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111279160.0A
Other languages
Chinese (zh)
Inventor
唐宁恺
陆继翔
旷文腾
谢峰
严晴
李红
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
State Grid Zhejiang Electric Power Co Ltd
Nari Technology Co Ltd
State Grid Electric Power Research Institute
Original Assignee
State Grid Corp of China SGCC
State Grid Zhejiang Electric Power Co Ltd
Nari Technology Co Ltd
State Grid Electric Power Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, State Grid Zhejiang Electric Power Co Ltd, Nari Technology Co Ltd, State Grid Electric Power Research Institute
Priority to CN202111279160.0A
Publication of CN114077674A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • General Physics & Mathematics (AREA)
  • Water Supply & Treatment (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • Human Resources & Organizations (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Public Health (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a power grid dispatching knowledge graph data optimization method and system. The method first uses deep learning to automatically mine high-quality domain phrases and thereby complete automatic identification and equivalence disambiguation of dispatching entities. It then uses deep learning to extract the global relationships between dispatching entities, completing the identification and verification of entity relationships and establishing an initial power grid dispatching knowledge graph. On this basis, newly added scheduling plan data is trained incrementally, based on timestamps, using a natural language learning knowledge fusion technique. Life cycle management of the knowledge content of the knowledge graph is introduced into each step. Together, these steps yield a dynamic knowledge graph capable of continual learning. The method ensures high accuracy of the power grid dispatching optimization decision knowledge graph, ensures dynamic updating of incremental knowledge, and reduces the computing resources and time consumed by update training.

Description

Power grid dispatching knowledge graph data optimization method and system
Technical Field
The invention belongs to the technical field of power grid dispatching, and particularly relates to a power grid dispatching knowledge graph data optimization method.
Background
With the continuous expansion of the scale of the power system and the increase of the new energy ratio, the difficulty of active scheduling is increasing day by day. As one of the most important system control means of the power system, accurate and safe scheduling not only involves the development of national economy, but also guarantees the safe and efficient operation of the whole power system.
At this stage, because the number of power grid devices and the total amount of power system knowledge are growing rapidly, traditional means of organizing and managing knowledge have long been unable to meet requirements. Compared with basic databases, knowledge bases that contain rules, such as the intelligent decision systems and transmission network planning decision systems built on expert systems and already in operation, have been widely applied in power systems because of their clear advantages. However, current knowledge bases rely on experts to extract and organize the relevant content and finally store it in a database in table form, so the storage structure is severely limited and updating requires a large investment of specialist personnel and time. In particular, for professional fields such as power dispatching, where conditions change often, rule requirements are strict, updates iterate quickly and cases are especially abundant, the industry urgently needs a more automated and intelligent system of methods for knowledge extraction, storage, management and reasoning.
In view of the above, power grid dispatching knowledge graph systems have emerged. A knowledge graph is a structured semantic knowledge base that represents and stores entities and their interrelations in graph form. In a knowledge graph, the basic units of entities and their relationships are entity-relationship-entity triples, and entity attributes are stored as attribute-value pairs. The way a knowledge graph stores facts, instances and relationships closely matches the requirements of power grid dispatching auxiliary tools, and improves their professionalism, relevance, cooperativity and constructability across the board. In addition, knowledge graph construction technology includes knowledge updating and learning capabilities based on artificial intelligence, which addresses the long-standing weakness of traditional string- and link-based databases in functions such as fuzzy search and similar-case queries.
However, existing power grid dispatching knowledge graphs still have limitations: their knowledge extraction capability needs to improve as the power grid keeps growing; they are insensitive to temporal and spatial variation in operations and the environment; and the knowledge graph must be rebuilt at every update, which wastes effort and time. These problems must be solved before a power grid dispatching knowledge graph can be put into practice.
Disclosure of Invention
The technical problem to be solved by the invention is to realize full life cycle management of a power grid dispatching knowledge graph, covering knowledge extraction, knowledge fusion, knowledge graph storage and updating, and elimination of outdated knowledge, thereby addressing the insufficient level of intelligence of dispatching systems in the prior art.
In order to solve the technical problems, the technical scheme adopted by the invention is as follows:
A power grid dispatching knowledge graph data optimization method comprises the following steps:
step one, establishing a graph-database-based power grid dispatching knowledge graph model, wherein the model comprises a power grid historical text subgraph and a power grid equipment subgraph, and records of the power grid historical text subgraph are connected with the power grid equipment subgraph through the related stations, lines and equipment;
step two, identifying and extracting power grid dispatching entities;
step three, performing Chinese word segmentation on the sentences of the extracted original corpus of power grid dispatching entity data by using an open source word segmentation tool, and evaluating the word segmentation results;
step four, initializing entity relationship triples;
step five, constructing and training a deep-neural-network-based entity recognition model and relationship recognition model, and generating accurate entities and relationships.
A power grid dispatching knowledge graph data optimization system is characterized by comprising the following program modules:
a model building module: establishing a graph database-based power grid dispatching knowledge graph model, wherein the graph database-based power grid dispatching knowledge graph model comprises power grid historical text subgraphs and power grid equipment subgraphs, and records of the power grid historical text subgraphs are connected with the power grid equipment subgraphs through relevant stations, lines and equipment;
an entity extraction module: identifying and extracting a power grid dispatching entity;
a word segmentation module: performing Chinese word segmentation on sentences by using an open source word segmentation tool on the extracted original linguistic data of the power grid dispatching entity data, and evaluating word segmentation results;
entity relationship triple module: initializing entity relation triples;
a neural network training module: constructing and training a deep-neural-network-based entity recognition model and relationship recognition model, and generating accurate entities and relationships.
The beneficial effects of the invention are as follows. The invention provides a power grid dispatching knowledge graph data optimization method. First, deep learning is used to automatically mine high-quality domain phrases, and on this basis the automatic identification and equivalence disambiguation of dispatching entities is completed. Then deep learning is used to extract the global relationships between dispatching entities, completing the identification and verification of entity relationships, generating accurate entities and relationships, and establishing an initial power grid dispatching knowledge graph. On the basis of these two steps, newly added scheduling plan data is trained incrementally, based on timestamps, using a natural language learning knowledge fusion technique. Together, these steps yield a dynamic knowledge graph capable of continual learning. The method ensures high accuracy of the power grid dispatching optimization decision knowledge graph, ensures dynamic updating of incremental knowledge, and reduces the computing resources and time consumed by update training.
Drawings
FIG. 1 is a schematic diagram of the power grid dispatching knowledge graph topology model provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of the method for automatic identification and extraction of scheduling entities provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of the method for obtaining relationships between scheduling entities provided by an embodiment of the present invention;
FIG. 4 is a schematic diagram of the timestamp-based incremental update method for the knowledge graph model provided by an embodiment of the present invention;
FIG. 5 is a schematic diagram of the storage scheme provided by an embodiment of the present invention.
Detailed Description
The invention is further described below. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present invention is not limited thereby.
Example 1
The embodiment provides a power grid dispatching knowledge graph data optimization method which comprises the following steps:
Step one: establish a graph-database-based power grid dispatching knowledge graph model comprising a power grid historical text subgraph and a power grid equipment subgraph. As shown in FIG. 1, records of the power grid historical text subgraph are connected with the power grid equipment subgraph through the related stations, lines and equipment to construct the power grid dispatching knowledge graph model.
In addition to the above intra-site connectivity and knowledge-graph structure, such data often also exhibit associations in geographic location. Many data in the system contain geographical location information, such as: company stations, substations, distribution stations, power stations, and the like.
China's administrative divisions generally have four levels. The first is the provincial level, comprising provinces, autonomous regions, municipalities directly under the central government and special administrative regions. The second is the prefecture level, comprising prefecture-level cities, autonomous prefectures, prefectures and leagues. The third is the county level, comprising counties, county-level cities, autonomous counties, special districts, forestry districts and municipal districts. The fourth is the township level, comprising townships, ethnic townships, towns, subdistrict offices and the like. Building on this administrative system, the national grid company often divides and manages its subsidiaries by large administrative region. With these factors in mind, the power grid dispatching knowledge graph model adopts a cascading geographic knowledge structure to achieve better structure and accuracy. The first level is the large region, the second is the provincial level, the third is the city level, and the fourth is a station or a transfer station. Lower-level sites are connected to their corresponding higher-level sites by a superior-subordinate relationship, so when the original corpus data contains only lower-level geographic location information, the higher-level geographic location information and other corresponding relationship attributes can be supplemented automatically for unconnected sites by following these connections. This effectively increases the associated knowledge retrieved during a query, while improving accuracy and query efficiency.
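The automatic completion of higher-level location attributes can be illustrated with a minimal sketch. The site names, the `PARENT` mapping and the function names below are hypothetical and are not taken from the patent; a real system would read the superior-subordinate links from the graph database.

```python
from typing import Dict, List, Optional

# The four cascading levels described above, from top to bottom.
LEVELS = ["region", "province", "city", "station"]

# Hypothetical superior-subordinate links: child -> parent (None marks the top level).
PARENT: Dict[str, Optional[str]] = {
    "Region East": None,
    "Province P": "Region East",
    "City C": "Province P",
    "Station S1": "City C",
}


def ancestors(site: str) -> List[str]:
    """Return the chain from the top-level region down to `site`."""
    if site not in PARENT:
        raise KeyError(site)
    chain: List[str] = []
    node: Optional[str] = site
    while node is not None:
        chain.append(node)
        node = PARENT[node]
    chain.reverse()
    return chain


def complete_location(site: str) -> Dict[str, str]:
    """Fill in every higher-level attribute for a record that names only `site`."""
    return dict(zip(LEVELS, ancestors(site)))
```

A record that mentions only `"Station S1"` would then be completed with its city, province and region by `complete_location("Station S1")`.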
Step two: identify and extract power grid dispatching entities, as follows:
scheduling optimization decisions involve multiple types of data, including structured data and text data;
the structured data comprises load prediction data, new energy prediction data and safety constraint sections; it can be exported directly from the power grid real-time database, extracted into triples using regular-expression rules, and stored in the power grid dispatching knowledge graph;
the text data comprises tie-line plans, maintenance plans, power grid operation modes and power grid abnormal events; document contents are traversed using methods such as data cleaning or interface conversion, documents are divided according to a chapter association model, and the original data formats are unified.
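As a rough illustration of turning structured records into triples, the sketch below parses one exported line with regular expressions. The record layout (`<record type> <entity> key=value ...`) and all names are assumptions made for this example; the patent does not specify the export format of the real-time database.

```python
import re
from typing import List, Tuple

Triple = Tuple[str, str, str]

# Assumed export layout: "<record type> <entity> <key>=<value> <key>=<value> ..."
RECORD = re.compile(r"^(?P<kind>\w+)\s+(?P<entity>\S+)\s+(?P<pairs>.+)$")
PAIR = re.compile(r"(?P<key>\w+)=(?P<value>\S+)")


def record_to_triples(line: str) -> List[Triple]:
    """Convert one structured record into (entity, attribute, value) triples."""
    match = RECORD.match(line.strip())
    if match is None:
        return []  # not in the assumed layout; leave it for manual handling
    entity = match.group("entity")
    triples: List[Triple] = [(entity, "record_type", match.group("kind"))]
    for pair in PAIR.finditer(match.group("pairs")):
        triples.append((entity, pair.group("key"), pair.group("value")))
    return triples
```

Each returned triple could then be written to the graph database as an attribute-value pair on the named entity.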
Step three: perform Chinese word segmentation on the sentences of the extracted original corpus of power grid dispatching entity data using an open source word segmentation tool, evaluate the word segmentation results, complete an entity object association model according to the evaluation results, and store it in the power grid dispatching knowledge graph model. The specific steps are as follows:
31) word segmentation: segmenting the original corpus data into phrases using a Chinese word segmentation method; based on core power dictionaries, such as power terminology and power grid topology models, and related extension dictionaries, the data content related to scheduling optimization decisions, including power grid operation data, scheduling rules, experience data and the like, is segmented to the maximum effective extent;
32) evaluation:
completing iterative calculation of statistical index features using a semi-supervised learning method based on a deep neural network;
performing phrase quality evaluation using the semi-supervised learning method;
iteratively mining vocabulary using the semi-supervised learning method based on a deep neural network, to discover high-quality terms and new words such as power grid entities, attributes, operation terms and limit constraints;
establishing a named entity classification system using the semi-supervised learning method;
33) performing entity recognition using an open source pretrained natural language model (such as a Transformer or BERT), classifying and clustering the scheduling optimization decision entities accordingly, extracting proper nouns for basic elements such as scheduling entities and attributes, and extracting entity information such as time;
34) using iterative training in deep learning to group and distinguish entities that share a name but differ in meaning and entities that differ in name but share a meaning, thereby achieving disambiguation.
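Dictionary-driven segmentation as in step 31) can be sketched with forward maximum matching, one common technique for this task. This is a stand-in chosen for illustration: the patent does not say which algorithm its open source tool uses, and the small dictionary below is hypothetical.

```python
from typing import List, Optional, Set


def forward_max_match(text: str, dictionary: Set[str],
                      max_len: Optional[int] = None) -> List[str]:
    """Greedily take the longest dictionary word starting at each position."""
    if max_len is None:
        max_len = max((len(word) for word in dictionary), default=1)
    tokens: List[str] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink one character at a time.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                # An unknown single character is emitted as its own token.
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical core power dictionary.
POWER_DICT = {"变电站", "主变", "检修", "计划", "线路"}
```

Calling `forward_max_match("变电站主变检修计划", POWER_DICT)` splits the sentence into the dictionary terms for substation, main transformer, maintenance and plan; enlarging the dictionary with mined new words changes the result without changing the code.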
Step four: initialize entity relationship triples, specifically comprising the following steps:
41) automatically acquiring 'entity-relation-entity' triples using a distance limit between entities and a position limit on the relation indicator word, and verifying and labeling the entities;
42) labeling relation triples as credible or not credible, and training a naive Bayes classifier to classify relation triples as credible or not credible, thereby obtaining a relation representation model;
43) combining the trained relation representation model with feature data, and performing relation identification with the trained classifier (which may be the naive Bayes classifier) to obtain candidate relation triples; the feature data comprises morphological features, semantic features, part of speech, word order and the like;
44) merging all similar candidate relation triples, and calculating the reliability of each relation triple from its statistical probability distribution.
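Steps 41) and 44) can be sketched as follows. The sketch omits the naive Bayes classification of steps 42) and 43), and it scores reliability by simple relative frequency, which is an assumption: the patent only says the score comes from the statistical probability distribution. The function names and the `max_gap` parameter are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]


def candidate_triples(tokens: List[str], entities: Set[str],
                      indicators: Set[str], max_gap: int = 3) -> List[Triple]:
    """Pair neighbouring entities that have a relation indicator between them."""
    positions = [i for i, tok in enumerate(tokens) if tok in entities]
    found: List[Triple] = []
    for a, b in zip(positions, positions[1:]):
        between = tokens[a + 1:b]
        if len(between) > max_gap:
            continue  # distance limit: the two entities are too far apart
        for tok in between:
            if tok in indicators:  # position limit: indicator lies between them
                found.append((tokens[a], tok, tokens[b]))
                break
    return found


def reliability(triples: List[Triple]) -> Dict[Triple, float]:
    """Merge identical candidates and score each by its relative frequency."""
    counts = Counter(triples)
    total = sum(counts.values())
    return {triple: count / total for triple, count in counts.items()}
```

Running `candidate_triples` over every segmented sentence and passing the concatenated results to `reliability` gives each merged triple a score that a later review step could threshold.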
Step five: construct and train the deep-neural-network-based entity recognition model and relationship recognition model, check them, and automatically generate accurate entities and relationships, specifically comprising the following steps:
51) building an entity recognition model based on a deep neural network and re-checking it after training; across the full data range of the power grid dispatching knowledge graph, labeling the entity relationships in the scheduling plan text and the specific type of each entity; then training a relationship recognition model based on a convolutional neural network using the labeled corpus;
52) identifying the scheduling entity relationships in unlabeled scheduling plan text using the trained relationship recognition model;
53) further checking the entity relationships identified in step 52) to ensure their consistency: checking against the power grid dispatching knowledge graph whether the relationship type exists in the relationship set, and prompting an audit if it does not; otherwise, using entity semantic features to search the power grid dispatching knowledge graph for the matching entity. If the identified entity-relation-entity triple is inconsistent with the triple retrieved from the power grid dispatching knowledge graph, a further review is prompted; if they are consistent, no further review is prompted.
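The decision logic of step 53) can be sketched as a small function. Two simplifications are assumed here: semantic matching of entities is replaced by a plain alias lookup, and a triple with no stored counterpart is treated as consistent because there is nothing to contradict it. The patent does not specify either point, and all names are hypothetical.

```python
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]


def check_triple(triple: Triple, relation_types: Set[str],
                 graph: Set[Triple], aliases: Dict[str, str]) -> str:
    """Return 'audit', 'review' or 'ok' for one recognized triple."""
    head, relation, tail = triple
    if relation not in relation_types:
        return "audit"  # relation type is missing from the relationship set
    # Stand-in for semantic matching: map surface names to graph entities.
    head = aliases.get(head, head)
    tail = aliases.get(tail, tail)
    stored_tails = {t for (h, r, t) in graph if h == head and r == relation}
    if stored_tails and tail not in stored_tails:
        return "review"  # contradicts what the knowledge graph already holds
    return "ok"
```

In practice the `"audit"` and `"review"` results would be queued for a human dispatcher, while `"ok"` triples pass straight through.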
A power grid dispatching knowledge graph data optimization system is characterized by comprising the following program modules:
a model building module: establishing a graph database-based power grid dispatching knowledge graph model, wherein the graph database-based power grid dispatching knowledge graph model comprises power grid historical text subgraphs and power grid equipment subgraphs, and records of the power grid historical text subgraphs are connected with the power grid equipment subgraphs through relevant stations, lines and equipment;
an entity extraction module: identifying and extracting a power grid dispatching entity;
a word segmentation module: performing Chinese word segmentation on sentences by using an open source word segmentation tool on the extracted original linguistic data of the power grid dispatching entity data, and evaluating word segmentation results;
entity relationship triple module: initializing entity relation triples;
a neural network training module: constructing and training a deep-neural-network-based entity recognition model and relationship recognition model, and generating accurate entities and relationships.
Example 2
On the basis of steps one to five of Example 1, the method further comprises the following step:
Step six: update incremental data and fuse knowledge, specifically comprising the following steps:
61) on the basis of the entity recognition model and relationship recognition model trained in step five, constructing training sets for the two models from power grid equipment information and power grid scheduling knowledge; these training sets serve as the core sets of the entity recognition model and the relationship recognition model respectively;
62) for newly added scheduling plan data, which changes continuously over time, automatically completing entity extraction and relationship extraction with the existing entity recognition model and relationship recognition model, constructing a new-data training set, and having the two models incrementally learn the new-data training set on top of the constructed core sets; the newly added scheduling plan data comprises maintenance plans, power grid operation modes and the like;
63) adopting the knowledge graph instance-layer update rules appropriate to each scenario, and using deep learning to extract key power information from the scheduling plan text to obtain entities and entity relationships;
64) for data such as incremental correction cases and regulation experience, directly constructing a subgraph to extend the original knowledge graph;
65) for time-sensitive basic data such as scheduling plans, maintenance plans and grid-connection plans, copying the power grid topology graph and overlaying the time-sensitive data to construct a new topology graph;
for example, the equipment entities and relationships extracted from a maintenance plan are applied to the power grid topology graph, updating the states and connection relationships of the topology entities;
66) after the power grid dispatching knowledge graph model has been incrementally learned and updated, performing entity alignment and attribute alignment, and completing conflict detection and conflict resolution; this specifically comprises merging concepts, merging hypernym-hyponym relationships between concepts, and merging concept attribute definitions, as follows:
aligning entities between different power knowledge graphs by calculating the semantic similarity between two power entities;
detecting and resolving conflicts using the power grid dispatching knowledge graph's own tools, which include a voting-based method and a quality-evaluation-based method.
After these steps are completed, the newly added knowledge subgraph is fused with the original knowledge and updated, and the knowledge graph's entities and the relationships between them are updated automatically.
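The alignment and voting ideas in step 66) can be sketched as below. Note the substitution: the patent calls for semantic similarity, whereas this sketch uses character-level string similarity from Python's `difflib` purely as a placeholder. The threshold and all names are hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher
from typing import Dict, List


def similarity(a: str, b: str) -> float:
    """Placeholder for semantic similarity: case-insensitive string similarity."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def align(entities_a: List[str], entities_b: List[str],
          threshold: float = 0.8) -> Dict[str, str]:
    """Map each entity in graph A to its most similar entity in graph B."""
    pairs: Dict[str, str] = {}
    for a in entities_a:
        best = max(entities_b, key=lambda b: similarity(a, b), default=None)
        if best is not None and similarity(a, best) >= threshold:
            pairs[a] = best
    return pairs


def resolve_by_vote(values: List[str]) -> str:
    """Voting-based conflict resolution: keep the most frequently reported value."""
    return Counter(values).most_common(1)[0][0]
```

Swapping `similarity` for an embedding-based measure would bring the sketch closer to the semantic comparison the text describes, without changing `align` or `resolve_by_vote`.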
A power grid dispatching knowledge graph data optimization system is characterized by comprising the following program modules:
a model building module: establishing a graph database-based power grid dispatching knowledge graph model, wherein the graph database-based power grid dispatching knowledge graph model comprises power grid historical text subgraphs and power grid equipment subgraphs, and records of the power grid historical text subgraphs are connected with the power grid equipment subgraphs through relevant stations, lines and equipment;
an entity extraction module: identifying and extracting a power grid dispatching entity;
a word segmentation module: performing Chinese word segmentation on sentences by using an open source word segmentation tool on the extracted original linguistic data of the power grid dispatching entity data, and evaluating word segmentation results;
entity relationship triple module: initializing entity relation triples;
a neural network training module: constructing and training a deep-neural-network-based entity recognition model and relationship recognition model, and generating accurate entities and relationships;
an increment module: and updating incremental data and fusing knowledge.
Example 3
On the basis of steps one to five of Example 1 or Example 2, the method further comprises the following step:
Step seven: dynamically update, store and recover knowledge based on time sections, as follows:
using an open source graph database such as Neo4j for graph storage, and marking the entities, relationships, and entity and relationship attributes in the graph database of the power grid dispatching knowledge graph with timestamps; each item carries two timestamps, a start timestamp and an end timestamp, where the start timestamp is the time at which the item was added to the knowledge graph and the end timestamp is the time at which it was deleted from the knowledge graph;
when an entity or relationship is stamped with an end timestamp, the life cycle of the related knowledge has ended;
entities and relationships deleted in bulk from the power grid dispatching knowledge graph are backed up so that the graph can later be restored and queried;
when entities, relationships and their attributes are inserted, deleted or updated, the corresponding timestamps are updated as well;
when the power grid dispatching knowledge graph at a specific historical moment needs to be restored, the database as of that moment can be recovered simply by undoing the changes made to the database after that moment.
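The start and end timestamps can be illustrated with a small in-memory sketch. It is not a Neo4j implementation: the class and method names are hypothetical, timestamps are plain integers, and deletion is modeled by setting the end timestamp so the record stays available as a backup.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]


@dataclass
class Fact:
    triple: Triple
    start: int                 # time the fact was added to the graph
    end: Optional[int] = None  # time it was deleted; None while still valid


class TemporalGraph:
    def __init__(self) -> None:
        self.facts: List[Fact] = []

    def add(self, triple: Triple, now: int) -> None:
        self.facts.append(Fact(triple, now))

    def delete(self, triple: Triple, now: int) -> None:
        # Stamp an end time instead of removing the record, so history is kept.
        for fact in self.facts:
            if fact.triple == triple and fact.end is None:
                fact.end = now

    def snapshot(self, at: int) -> Set[Triple]:
        """Return the graph as it stood at time `at`."""
        return {
            fact.triple for fact in self.facts
            if fact.start <= at and (fact.end is None or fact.end > at)
        }
```

Because nothing is physically removed, restoring a historical moment reduces to a filter on the two timestamps, which matches the idea of ignoring every change made after the chosen time.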
The power grid data has the service characteristics, the change of the core equipment of the power grid is less, and the number of stations and lines cannot be increased or decreased within a few years; the power grid fault text has the characteristics of data dispersion and rapid growth, and the storage processing of the power grid fault text data is mainly considered.
And the grid fault text data grows along with time, and is sliced according to the time information, and the fault text data in the same time period is stored in an adjacent memory space. And (4) forming a map section of a specific fault case by taking time and equipment as slices, wherein the map section comprises information of a running mode before and after a corresponding fault, handling operation, risk early warning of associated equipment, equipment operation and maintenance, live working logs, quota, out-of-limit conditions and the like.
High-availability cluster storage is built on Neo4j. Neo4j is a high-performance graph database that can be deployed as a cluster; its defining characteristic is that structured data is stored in a graph rather than in tables. As an existing, practical high-performance graph engine, it has all the characteristics of a mature database, provides the functions required to implement the invention, and ensures that the graph cluster keeps providing service even when the network or hardware fails: if one node in the cluster goes down or a network connection is broken, the cluster should continue to serve requests rather than lose service entirely. In the distributed knowledge graph environment, data can be loaded in parallel and updated in real time, and queries can run on multiple threads, so that graph query speed and the number of nodes stored in the graph can grow approximately linearly.
The high-availability Neo4j cluster uses a master-slave replication architecture and provides two key capabilities: resilience and fault tolerance in the event of hardware failure, and the ability to scale Neo4j out for read-intensive workloads.
The storage scheme is shown in Fig. 5. Newly added power grid fault data, such as fault trip logs and scheduling logs, is sharded by time, and the new data is distributed across the cluster nodes through load balancing, which makes the knowledge graph storage scalable.
Load balancing here means balancing the data: each shard is stored as two copies on different cluster nodes, and the data volume on each cluster node is kept balanced.
The load balancing method comprises the following specific steps:
1. fault knowledge is stored in shards according to the time at which the fault occurred;
2. each cluster node stores a share of the data in proportion to its memory weight;
3. the data volume stored on the cluster nodes is monitored, and if the number of graph nodes exceeds the configured node limit of the knowledge graph, cluster nodes are added so that the graph can keep growing linearly.
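The three steps above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the patented implementation and not a Neo4j API: the `ClusterNode` class, the "lowest load relative to memory weight" placement rule, and the way the node limit is counted are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterNode:
    name: str
    memory_weight: float   # relative memory capacity of this cluster node
    stored: int = 0        # graph nodes currently stored on this cluster node


def place_shard(cluster: List[ClusterNode], shard_size: int, replicas: int = 2) -> List[str]:
    """Store `replicas` copies of one time shard on distinct cluster nodes.

    Targets are the cluster nodes whose load is lowest relative to their
    memory weight, so over time each node holds a share of the data
    roughly proportional to its weight.
    """
    if len(cluster) < replicas:
        raise ValueError("not enough cluster nodes for the requested replicas")
    targets = sorted(cluster, key=lambda n: n.stored / n.memory_weight)[:replicas]
    for node in targets:
        node.stored += shard_size
    return [n.name for n in targets]


def needs_scale_out(cluster: List[ClusterNode], max_graph_nodes: int) -> bool:
    """True when the stored graph nodes exceed the configured limit.

    Assumption: the count includes replica copies; a real deployment
    might count logical graph nodes only.
    """
    return sum(n.stored for n in cluster) > max_graph_nodes
```

When `needs_scale_out` returns true, a new `ClusterNode` would be appended to the cluster; because it starts with zero load, the placement rule naturally directs subsequent shards to it.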
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
A power grid dispatching knowledge graph data optimization system is characterized by comprising the following program modules:
a model building module: establishing a graph database-based power grid dispatching knowledge graph model, wherein the graph database-based power grid dispatching knowledge graph model comprises power grid historical text subgraphs and power grid equipment subgraphs, and records of the power grid historical text subgraphs are connected with the power grid equipment subgraphs through relevant stations, lines and equipment;
an entity extraction module: identifying and extracting a power grid dispatching entity;
a word segmentation module: performing Chinese word segmentation on sentences by using an open source word segmentation tool on the extracted original linguistic data of the power grid dispatching entity data, and evaluating word segmentation results;
entity relationship triple module: initializing entity relation triples;
a neural network training module: constructing and training a deep-neural-network-based entity recognition model and relation recognition model to generate accurate entities and relationships;
an increment module: updating incremental data and fusing knowledge;
the knowledge dynamic updating module: and dynamically updating, storing and recovering the knowledge based on the time section.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting the same, and although the present invention is described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that: modifications and equivalents may be made to the embodiments of the invention without departing from the spirit and scope of the invention, which is to be covered by the claims.

Claims (10)

1. A power grid dispatching knowledge graph data optimization method is characterized by comprising the following steps:
step one: establishing a graph-database-based power grid dispatching knowledge graph model, wherein the model comprises a power grid historical text subgraph and a power grid equipment subgraph, and records in the power grid historical text subgraph are connected with the power grid equipment subgraph through the related stations, lines and equipment;
step two: identifying and extracting power grid dispatching entities;
step three: performing Chinese word segmentation on the sentences of the raw corpus of the extracted power grid dispatching entity data by using an open-source word segmentation tool, and evaluating the word segmentation results;
step four: initializing entity-relation triples;
step five: constructing and training a deep-neural-network-based entity recognition model and relation recognition model to generate accurate entities and relationships.
2. The power grid dispatching knowledge-graph data optimization method according to claim 1, characterized in that:
in the second step, the power grid dispatching entity is identified and extracted, and the method comprises the following steps:
scheduling optimization decisions relate to multiple types of data, including structured data and text-type data;
the structured data is derived from a power grid real-time database, extracted into triples by using regularization and stored into a power grid dispatching knowledge map;
for the text data, the document content is traversed by means of data cleaning or interface conversion, the document is divided according to a chapter-section association model, and the raw data format is unified.
3. The power grid dispatching knowledge-graph data optimization method according to claim 1, characterized in that:
in the third step, the method specifically comprises the following steps:
31) performing phrase segmentation on original corpus data by adopting a Chinese word segmentation method;
32) completing iterative calculation of statistical index features by using a semi-supervised learning method based on a deep neural network;
performing phrase quality evaluation by using a semi-supervised learning method;
iteratively mining the vocabulary by using a semi-supervised learning method based on a deep neural network, so as to mine high-quality vocabulary and new words;
establishing a named entity classification system by adopting a semi-supervised learning method;
33) performing entity identification by using an open source natural language preprocessing model, performing corresponding classification and clustering on scheduling optimization decision entities, extracting proper nouns of scheduling entities and attributes, and extracting entity information;
34) using iterative training in deep learning to group and distinguish entities that share a name but differ in meaning and entities that differ in name but share a meaning.
4. The power grid dispatching knowledge-graph data optimization method according to claim 1, characterized in that:
in the fourth step, the method specifically comprises the following steps:
41) automatically acquiring entity-relation-entity triples by using a distance limit between entities and a position limit on the relation indicator word, and verifying and labeling the entities;
42) labeling credible and non-credible relation triples, and training a naive Bayes classifier to classify relation triples as credible or non-credible, so as to obtain a relation representation model;
43) applying the relation representation model obtained by training, combined with additional feature data, to the trained classifier to perform relation recognition and obtain candidate relation triples;
44) merging all similar candidate relation triples, and calculating the reliability of each relation triple from its statistical probability distribution.
5. The power grid dispatching knowledge-graph data optimization method according to claim 1, characterized in that:
in the fifth step, the method specifically comprises the following steps:
51) constructing an entity recognition model based on a deep neural network, marking entity relations in a scheduling and planning text in all data ranges of a power grid scheduling knowledge graph after the entity recognition model is trained, marking specific corresponding types of entities, and then using marked corpora to train a relation recognition model based on a convolutional neural network;
52) identifying the scheduling entity relationship in the unlabeled scheduling plan text by using the trained relationship identification model;
53) further checking the entity relationship based on the recognition result of step 52) to ensure consistency of the entity relationship: checking against the power grid dispatching knowledge graph whether the relation type exists in the relation set, and prompting a review if it does not; otherwise, searching the power grid dispatching knowledge graph for the entity that matches the recognized entity by means of entity semantic features; if the recognized entity-relation-entity triple is inconsistent with the entity-relation-entity triple retrieved from the power grid dispatching knowledge graph, prompting another review, and if the two are consistent, no further review is prompted.
6. The power grid dispatching knowledge-graph data optimization method according to claim 1, characterized in that:
the method further comprises a sixth step of updating incremental data and fusing knowledge, and specifically comprises the following steps:
61) in the fifth step, on the basis of the trained entity recognition model and relationship recognition model, training sets of the entity recognition model and the relationship recognition model are constructed by using power grid equipment information and power grid scheduling knowledge, wherein the training sets are core sets of the entity recognition model and the relationship recognition model respectively;
62) for newly added scheduling-plan data that changes continuously over time, automatically completing entity extraction and relation extraction by using the existing entity recognition model and relation recognition model, constructing a training set from the newly added data, and having the entity recognition model and the relation recognition model learn this training set incrementally on top of the constructed core sets;
63) applying the knowledge graph instance-layer update rules that correspond to each scenario, and using deep learning to obtain entities and entity relations from the key power information in the scheduling plan text;
64) constructing a subgraph from the incremental correction-case and procedure experience data so as to extend the original knowledge graph;
65) for time-sensitive basic data, copying the power grid topology graph and overlaying the time-sensitive data on it to construct a new topology graph;
66) after the power grid dispatching knowledge graph model has been incrementally learned and updated, performing entity alignment and attribute alignment; completing conflict detection and resolving conflicts; and merging concepts, concept hypernym-hyponym relations, and concept attribute definitions.
7. The power grid dispatching knowledge-graph data optimization method according to claim 1, characterized in that:
the method further comprises a seventh step of dynamically updating, storing and recovering the knowledge based on the time section, and specifically comprises the following steps:
using an open-source graph database for graph storage, and first marking the entities, the relations, and the attributes of the entities and relations in the graph database of the power grid dispatching knowledge graph with time tags, wherein each time tag comprises two timestamps, a start timestamp and an end timestamp, the start timestamp being the time the item was added to the knowledge graph and the end timestamp being the time it was deleted from the knowledge graph;
when an entity or relation is stamped with an end timestamp, the whole life cycle of the related knowledge has ended;
backing up entities and relations that are deleted in bulk from the power grid dispatching knowledge graph, so that the graph can later be restored and queried;
when an entity, a relation, or an attribute of an entity or relation is inserted, deleted or updated, the corresponding timestamp is also updated;
when the power grid dispatching knowledge graph must be restored to a specific historical moment, the database state at that moment is recovered simply by undoing the changes made to the database after that moment.
8. The power grid dispatching knowledge-graph data optimization method according to claim 7, characterized in that:
newly added power grid fault data is sharded by time, and the newly added data is distributed across the cluster nodes through load balancing, so as to make the knowledge graph storage scalable.
9. The power grid dispatching knowledge-graph data optimization method according to claim 8, wherein:
the load balancing method comprises the following specific steps:
fault knowledge is stored in shards according to the time at which the fault occurred;
each cluster node stores a share of the data in proportion to its memory weight;
the data volume stored on the cluster nodes is monitored, and if the number of graph nodes exceeds the configured node limit of the knowledge graph, cluster nodes are added so that the graph can keep growing linearly.
10. A power grid dispatching knowledge graph data optimization system is characterized by comprising the following program modules:
a model building module: establishing a graph database-based power grid dispatching knowledge graph model, wherein the graph database-based power grid dispatching knowledge graph model comprises power grid historical text subgraphs and power grid equipment subgraphs, and records of the power grid historical text subgraphs are connected with the power grid equipment subgraphs through relevant stations, lines and equipment;
an entity extraction module: identifying and extracting a power grid dispatching entity;
a word segmentation module: performing Chinese word segmentation on sentences by using an open source word segmentation tool on the extracted original linguistic data of the power grid dispatching entity data, and evaluating word segmentation results;
entity relationship triple module: initializing entity relation triples;
a neural network training module: and constructing and training an entity recognition model and a relationship recognition model based on the deep neural network to generate accurate entities and relationships.
CN202111279160.0A 2021-10-31 2021-10-31 Power grid dispatching knowledge graph data optimization method and system Pending CN114077674A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111279160.0A CN114077674A (en) 2021-10-31 2021-10-31 Power grid dispatching knowledge graph data optimization method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111279160.0A CN114077674A (en) 2021-10-31 2021-10-31 Power grid dispatching knowledge graph data optimization method and system

Publications (1)

Publication Number Publication Date
CN114077674A (en) 2022-02-22

Family

ID=80283520

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111279160.0A Pending CN114077674A (en) 2021-10-31 2021-10-31 Power grid dispatching knowledge graph data optimization method and system

Country Status (1)

Country Link
CN (1) CN114077674A (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114647743A (en) * 2022-05-20 2022-06-21 国网浙江省电力有限公司 Power marketing full-service access control rule map generation and processing method and device
CN114757307A (en) * 2022-06-14 2022-07-15 中国电力科学研究院有限公司 Artificial intelligence automatic training method, system, device and storage medium
CN114757307B (en) * 2022-06-14 2022-09-06 中国电力科学研究院有限公司 Artificial intelligence automatic training method, system, device and storage medium
WO2024045186A1 (en) * 2022-09-02 2024-03-07 西门子股份公司 Method and apparatus for constructing knowledge graph, and computing device and storage medium
CN115344717A (en) * 2022-10-18 2022-11-15 国网江西省电力有限公司电力科学研究院 Method and device for constructing regulation and control operation knowledge graph for multi-type energy supply and consumption system
CN115344717B (en) * 2022-10-18 2023-02-17 国网江西省电力有限公司电力科学研究院 Method and device for constructing regulation and control operation knowledge graph for multi-type energy supply and consumption system
CN117235929A (en) * 2023-09-26 2023-12-15 中国科学院沈阳自动化研究所 Three-dimensional CAD (computer aided design) generation type design method based on knowledge graph and machine learning
CN117235929B (en) * 2023-09-26 2024-06-04 中国科学院沈阳自动化研究所 Three-dimensional CAD (computer aided design) generation type design method based on knowledge graph and machine learning
CN117725555A (en) * 2024-02-08 2024-03-19 暗物智能科技(广州)有限公司 Multi-source knowledge tree association fusion method and device, electronic equipment and storage medium
CN117725555B (en) * 2024-02-08 2024-06-11 暗物智能科技(广州)有限公司 Multi-source knowledge tree association fusion method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN114077674A (en) Power grid dispatching knowledge graph data optimization method and system
CN112612902A (en) Knowledge graph construction method and device for power grid main device
CN110674311A (en) Knowledge graph-based power asset heterogeneous data fusion method
CN109597855A (en) Domain knowledge map construction method and system based on big data driving
CN110298032A (en) Text classification corpus labeling training system
CN111552813A (en) Power knowledge graph construction method based on power grid full-service data
CN110188345B (en) Intelligent identification method and device for electric operation ticket
CN116028645B (en) Urban municipal infrastructure emergency knowledge graph determination method, system and equipment
CN104346438A (en) Data management service system based on large data
CN115357726A (en) Fault disposal plan digital model establishing method based on knowledge graph
CN115438199A (en) Knowledge platform system based on smart city scene data middling platform technology
CN112613611A (en) Tax knowledge base system based on knowledge graph
CN115470871A (en) Policy matching method and system based on named entity recognition and relation extraction model
CN115858513A (en) Data governance method, data governance device, computer equipment and storage medium
CN115757810A (en) Method for constructing standard ontology of knowledge graph
CN115033705A (en) Power grid regulation and control risk early warning information knowledge graph design method and system
CN105160046A (en) Text-based data retrieval method
CN112036179B (en) Electric power plan information extraction method based on text classification and semantic frame
CN117010373A (en) Recommendation method for category and group to which asset management data of power equipment belong
CN116757498A (en) Method, equipment and medium for pushing benefit-enterprise policy
CN115759253A (en) Power grid operation and maintenance knowledge map construction method and system
CN115937881A (en) Method for automatically identifying content of knowledge graph construction standard form
CN115827885A (en) Operation and maintenance knowledge graph construction method and device and electronic equipment
Qin et al. Construction of knowledge graph of multi-source heterogeneous distribution network systems
CN113987164A (en) Project studying and judging method and device based on domain event knowledge graph

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination