CN110851611A - Hidden danger data knowledge graph construction method, device, equipment and medium - Google Patents

Publication number
CN110851611A
Authority
CN
China
Prior art keywords
data
hidden danger
graph
classification model
relation characteristic
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910652010.6A
Other languages
Chinese (zh)
Inventor
刘鑫
庄浩
张继勇
蔡恒
喻磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huarui Xinzhi Baoding Technology Co ltd
Huarui Xinzhi Technology Beijing Co ltd
Original Assignee
Huarui Xinzhi Technology (beijing) Co Ltd
Application filed by Huarui Xinzhi Technology (beijing) Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217 Validation; Performance evaluation; Active pattern learning techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Animal Behavior & Ethology (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a method, a device, equipment and a medium for constructing a hidden danger data knowledge graph. The method comprises the following steps: acquiring hidden danger data; extracting relation characteristic data from the hidden danger data through a pre-trained classification model, wherein the relation characteristic data reflects semantic relations among multiple hidden danger attributes; generating graph node data according to the relation characteristic data; and generating a hidden danger data knowledge graph according to the graph node data. By using a machine-learning-based classification model to extract the relation characteristic data and then generating the knowledge graph, the application organizes the hidden danger data effectively, makes the internal relations of the data easy to inspect visually so that useful information can be found, provides early warning for equipment and parts where hidden dangers appear, and allows corresponding preventive measures and decisions to be prepared.

Description

Hidden danger data knowledge graph construction method, device, equipment and medium
Technical Field
The application relates to the technical field of petroleum and petrochemical industry, in particular to a method, a device, equipment and a medium for constructing a hidden danger data knowledge graph.
Background
As our country moves through a stage of accelerated industrialization, the expansion of industrial production scale and the improvement of production efficiency have brought great social benefits, but at the same time the problem of production safety has become increasingly prominent, and the consequences for enterprises have become more and more severe. To ensure safe production and reduce the accident rate, investigating hidden dangers in equipment, environment and personnel has become an important safety measure for every enterprise.
Take the petroleum and petrochemical industry, a high-risk industry, as an example. Countries, enterprises, and international or regional organizations actively summarize and explore modes and methods of enterprise safety management. When an enterprise formulates its safety production plans, targets, assessment indexes and resource allocation, it needs to refer to effective hidden danger data. Hidden danger data is a short text description of a certain operation or a certain device, recorded manually by field safety personnel during on-site safety inspections.
Petroleum and petrochemical enterprises record a large amount of hidden danger data when checking and rectifying industrial hidden dangers, but the data are stored separately and lack correlation with one another, so that they become isolated data islands from which effective information cannot be obtained in time.
Disclosure of Invention
The embodiments of the application provide a method, equipment and a medium for constructing a hidden danger data knowledge graph, to solve the following technical problem in the prior art: petroleum and petrochemical enterprises record a large amount of hidden danger data when checking and rectifying industrial hidden dangers, but the data are stored separately and lack correlation with one another, so that they become isolated data islands from which effective information cannot be obtained in time.
The embodiment of the application adopts the following technical scheme:
a method for constructing a knowledge graph of hidden danger data comprises the following steps:
acquiring hidden danger data;
extracting relation characteristic data from the hidden danger data through a pre-trained classification model, wherein the relation characteristic data reflects semantic relations among multiple hidden danger attributes;
generating graph node data according to the relation characteristic data;
and generating a hidden danger data knowledge graph according to the graph node data.
Optionally, the classification model is pre-trained as follows:
constructing a classification model based on machine learning;
acquiring sample hidden danger data and a corresponding label thereof, wherein the label indicates hidden danger attributes to which one or more words of the sample hidden danger data belong, and the hidden danger attributes comprise at least one of the following: hidden danger equipment, hidden danger positions, hidden danger states and hidden danger hazards;
and carrying out supervised training on the classification model by utilizing the sample hidden danger data and the corresponding label.
Optionally, the label further indicates grammar category data of one or more words of the sample hidden danger data, the grammar category data including at least part of speech.
Optionally, extracting relationship feature data from the hidden danger data through a pre-trained classification model, including:
segmenting the hidden danger data and converting the segmented hidden danger data into corresponding word vectors;
performing, by the pre-trained classification model: determining, according to the word vector corresponding to the hidden danger data, a plurality of data similar to the hidden danger data in the set of sample hidden danger data used for training the classification model; determining, according to the plurality of similar data, the weights of the categories to which the similar data respectively belong; classifying the hidden danger data according to the weights; and obtaining relation characteristic data in the hidden danger data according to the classification result.
Optionally, determining, according to the plurality of similar data, the weights of the categories to which the similar data respectively belong includes:
determining the categories to which the plurality of similar data respectively belong;
for each determined category, determining the number of the similar data contained in the category;
and determining the weight of each category according to the number of similar data in the category and the similarity between the similar data and the hidden danger data.
Optionally, generating graph node data according to the relationship feature data includes:
receiving completion data and correction data for the relational feature data;
and performing redundant filtering and formatting treatment on the relationship characteristic data, the completion data and the correction data to generate graph node data.
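The completion, correction, redundancy filtering and formatting described above can be illustrated with a minimal sketch. The record layout (a dict per hidden danger record with `equipment`, `position`, `state` and `hazard` fields, keyed by a record id) and the function name `build_graph_node_data` are assumptions made for illustration only; the application does not specify a data format.

```python
from typing import Dict, List

# Hypothetical attribute names, following the four labels named in the text.
ATTRIBUTES = ("equipment", "position", "state", "hazard")


def build_graph_node_data(
    extracted: Dict[int, Dict[str, str]],
    completion: Dict[int, Dict[str, str]],
    correction: Dict[int, Dict[str, str]],
) -> List[Dict[str, str]]:
    """Merge extracted records with completion and correction data, then deduplicate."""
    merged = []
    for record_id, record in extracted.items():
        row = dict(record)
        # Completion data only fills attributes that are missing or empty.
        for key, value in completion.get(record_id, {}).items():
            if not row.get(key):
                row[key] = value
        # Correction data overrides whatever was extracted.
        row.update(correction.get(record_id, {}))
        # Formatting: fixed attribute order, stripped strings, "" for gaps.
        merged.append(tuple((row.get(a) or "").strip() for a in ATTRIBUTES))
    # Redundancy filtering: drop exact duplicates, keeping first-seen order.
    unique = list(dict.fromkeys(merged))
    return [dict(zip(ATTRIBUTES, values)) for values in unique]
```

In this sketch, completion never overwrites an extracted value while correction always does; a real system might treat the two differently.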
Optionally, generating a hidden danger data knowledge graph according to the graph node data, including:
importing the graph node data into a NoSQL graph database for processing;
and acquiring the hidden danger data knowledge graph correspondingly generated by the NoSQL graph database.
A hidden danger data knowledge graph construction device comprises:
the acquisition module acquires hidden danger data;
the extraction module is used for extracting relation characteristic data from the hidden danger data through a pre-trained classification model, and the relation characteristic data reflects semantic relations among multiple hidden danger attributes;
the first generation module generates graph node data according to the relation characteristic data;
and the second generation module is used for generating the hidden danger data knowledge graph according to the graph node data.
Hidden danger data knowledge graph construction equipment comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to:
acquiring hidden danger data;
extracting relation characteristic data from the hidden danger data through a pre-trained classification model, wherein the relation characteristic data reflects semantic relations among multiple hidden danger attributes;
generating graph node data according to the relation characteristic data;
and generating a hidden danger data knowledge graph according to the graph node data.
A non-transitory computer storage medium of construction of a hidden danger data knowledge graph storing computer executable instructions configured to:
acquiring hidden danger data;
extracting relation characteristic data from the hidden danger data through a pre-trained classification model, wherein the relation characteristic data reflects semantic relations among multiple hidden danger attributes;
generating graph node data according to the relation characteristic data;
and generating a hidden danger data knowledge graph according to the graph node data.
The at least one technical scheme adopted by the embodiments of the application can achieve the following beneficial effects: a machine-learning-based classification model is used to extract the relation characteristic data from the hidden danger data, and the hidden danger data knowledge graph is then generated, so that the hidden danger data is effectively organized, the internal relations of the data are easy to inspect visually, useful information can be found, early warning is provided for equipment and parts with hidden dangers, and corresponding preventive measures and decisions can be prepared.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
Fig. 1 is a schematic flowchart of a method for constructing a hidden danger data knowledge graph according to some embodiments of the present application;
Fig. 2 is a schematic detailed flowchart of the method of Fig. 1 in an application scenario, according to some embodiments of the present application;
Fig. 3 is a schematic diagram of the business framework of a model involved in the method of Fig. 1 in an application scenario, according to some embodiments of the present application;
Fig. 4 is a schematic structural diagram of a hidden danger data knowledge graph construction device corresponding to Fig. 1, according to some embodiments of the present application;
Fig. 5 is a schematic structural diagram of hidden danger data knowledge graph construction equipment corresponding to Fig. 1, according to some embodiments of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail and completely with reference to the following specific embodiments of the present application and the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
A knowledge graph is a structured semantic knowledge base used to quickly describe concepts in the physical world and their mutual relations. By reducing the data granularity from the document level to the data level, it aggregates a large amount of knowledge and thereby enables fast response and reasoning over that knowledge. Through data mining, information processing, knowledge measurement and graph drawing, a knowledge graph can display a complex knowledge field, reveal the dynamic development rules of that field, and provide a practical and valuable reference for subject research.
To address the problems described in the background, the present application uses a machine-learning-based classification model to extract relationship feature data from the hidden danger data and generates a corresponding hidden danger data knowledge graph from it, so that the hidden danger data is effectively organized, the internal relations of the data are easy to inspect visually, useful information can be found, early warning is provided for equipment and parts with hidden dangers, and corresponding preventive measures and decisions can be prepared.
Fig. 1 is a schematic flowchart of a method for constructing a hidden danger data knowledge graph according to some embodiments of the present application. The process of Fig. 1 may be performed by one or more execution entities, such as a classification model, a NoSQL graph database, etc.
The process in fig. 1 comprises the following steps:
S100: Acquire hidden danger data.
In some embodiments of the present application, the hidden danger data may take various forms, for example text or images. In practical applications, hidden danger data is generally recorded in the form of a ledger, and the recorded data is either physical (paper) text or electronic text. Electronic text is more convenient for computer processing, so step S100 preferably acquires hidden danger data in the form of electronic text, which specifically contains the relevant description of the hidden danger.
S102: Extract relationship feature data from the hidden danger data through a pre-trained classification model, wherein the relationship feature data reflects semantic relations among multiple hidden danger attributes.
In some embodiments of the present application, the classification model is based on machine learning and is trained on sample hidden danger data whose relationship feature data has been determined in advance.
There are various hidden danger attributes, for example the hidden danger subject, hidden danger position, hidden danger state, hidden danger category, hidden danger cause, hidden danger hazard, and the like. The hidden danger subject indicates which subject, such as a piece of equipment or an operating procedure, has the current hidden danger. The hidden danger position indicates at which position of the equipment, or at which step of the operating procedure, the current hidden danger exists. In the hidden danger data, associations such as the relative positions of the content corresponding to different hidden danger attributes follow certain rules, which depend mainly on the semantic relations and grammar rules of the hidden danger attributes; the classification model mainly extracts the relationship feature data that reflects these associations.
The relationship feature data may take various forms. It may be, for example, a word or a combination of consecutive words in the hidden danger data itself, or content extracted or mapped from such a word or combination. That content may be easy for a human to understand directly, such as a summarized, approximate meaning, or difficult for a human to understand directly, such as a high-dimensional feature vector extracted by a machine learning model. Some of the following embodiments mainly take as an example the case where the relationship feature data includes a word or a combination of consecutive words in the hidden danger data itself. When the classification model is trained, the relationship feature data can be used as labels for supervised training.
A classification model trained to the desired degree (e.g., after training converges) is able to extract the relationship feature data from the hidden danger data more accurately.
S104: Generate graph node data according to the relationship feature data.
In some embodiments of the present application, the graph node data includes a plurality of nodes, and different nodes may be, for example, words included in the relationship feature data, hidden danger attributes, and the like. Edges between the nodes can be generated according to the relationship feature data; the edges can reflect semantic relations between the nodes, containment relations that may exist between the nodes, and the like.
S106: Generate a hidden danger data knowledge graph according to the graph node data.
In some embodiments of the present application, a specified graph generation algorithm may be employed to generate the hidden danger data knowledge graph.
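The node and edge structure described for step S104 can be sketched as follows. This is only an illustration under stated assumptions: the application does not name any edge types, so `HAS_POSITION`, `HAS_STATE` and `MAY_CAUSE` are hypothetical, as are the record layout and the function name `to_nodes_and_edges`.

```python
from typing import Dict, List, Tuple

Node = Tuple[str, str]            # (hidden danger attribute, word)
Edge = Tuple[Node, str, Node]     # (source node, edge type, target node)

# Hypothetical edge types between hidden danger attributes.
EDGE_TYPES = [
    ("equipment", "HAS_POSITION", "position"),
    ("equipment", "HAS_STATE", "state"),
    ("state", "MAY_CAUSE", "hazard"),
]


def to_nodes_and_edges(records: List[Dict[str, str]]) -> Tuple[List[Node], List[Edge]]:
    """Turn classified words into unique nodes and typed edges."""
    nodes = set()
    edges = set()
    for record in records:
        # Each non-empty (attribute, word) pair becomes one node.
        for attribute, word in record.items():
            if word:
                nodes.add((attribute, word))
        # An edge is created only when both of its endpoints are present.
        for src_attr, edge_type, dst_attr in EDGE_TYPES:
            src, dst = record.get(src_attr), record.get(dst_attr)
            if src and dst:
                edges.add(((src_attr, src), edge_type, (dst_attr, dst)))
    return sorted(nodes), sorted(edges)
```

Using sets means that the same equipment or hazard appearing in many records collapses into one node, which is what lets separate records become connected in the graph.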
Through the method of Fig. 1, the hidden danger data can be effectively organized: relevant information and descriptions such as the equipment, position, state and hazard of each hidden danger are extracted, and a corresponding hidden danger data knowledge graph is generated. A large amount of hidden danger data can thus be counted, sorted and analyzed to form special reports, which facilitates the analysis and study of hidden dangers and helps find the key and weak links of problems. By analyzing the equipment, positions, states and hazards across a large amount of hidden danger data, one can look through the phenomena to the underlying nature and find deep-seated problems that are tendentious, universal and regular. The laws of hidden danger investigation and treatment can thereby be mastered, so that targeted measures suited to the problem are taken at the source.
Based on the method of fig. 1, some embodiments of the present application also provide some specific schemes of the method, and related extension schemes, which are described below.
In some embodiments of the present application, the classification model may be pre-trained as follows:
A classification model based on machine learning is constructed; sample hidden danger data and its corresponding labels are acquired, wherein a label indicates the hidden danger attribute to which one or more words of the sample hidden danger data belong, and the hidden danger attributes comprise at least one of the following: hidden danger equipment, hidden danger position, hidden danger state and hidden danger hazard; and the classification model is trained in a supervised manner using the sample hidden danger data and the corresponding labels. The classification model can be implemented in various ways, for example with a neural network algorithm, a K-Nearest Neighbor (KNN) machine learning algorithm, and the like. Besides the hidden danger attributes, the labels can also indicate grammar category data of one or more words of the sample hidden danger data, the grammar category data including at least part of speech, which makes it easier to extract the semantics of the words and the semantic associations between contextual words.
After the classification model is trained, the processing inside the model when it is actually used is consistent with that during training; only the parameters are more reasonable, so that classification can be carried out more accurately.
In some embodiments of the present application, taking a classification model based on the KNN machine learning algorithm as an example, the relationship feature data may be extracted from the hidden danger data as follows: segmenting the hidden danger data into words and converting them into corresponding word vectors; and performing, by the pre-trained classification model: determining, according to the word vectors corresponding to the hidden danger data, a plurality of data similar to the hidden danger data in the set of sample hidden danger data used for training the classification model; determining, according to the similar data, the weights of the categories to which they belong; classifying the hidden danger data according to the weights; and obtaining the relationship feature data in the hidden danger data according to the classification result. Taking a classification model based on a neural network algorithm as an example, the relationship feature data in the hidden danger data can be extracted directly through a hidden layer of the neural network.
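The KNN-based procedure above can be sketched in a few lines. This is a minimal illustration, not the applicant's implementation: it assumes the text has already been segmented into tokens, uses simple term-count vectors in place of whatever word vectors the application intends, and sums cosine similarities per category as the category weight. The function names and the sample categories are hypothetical.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_classify(
    tokens: List[str],
    samples: List[Tuple[List[str], str]],
    k: int = 3,
) -> Tuple[str, Dict[str, float]]:
    """Classify tokens by similarity-weighted voting among the k nearest samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    query = Counter(tokens)
    # Similarity of the new data to every labeled sample.
    scored = [(cosine(query, Counter(words)), label) for words, label in samples]
    # Keep the k most similar samples.
    nearest = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
    # Category weight = sum of similarities of the neighbours in that category.
    weights: Dict[str, float] = defaultdict(float)
    for similarity, label in nearest:
        weights[label] += similarity
    best = max(weights, key=weights.get)
    return best, dict(weights)
```

Because each neighbour contributes its similarity rather than a flat vote, a category's weight reflects both how many of the k neighbours fall into it and how close they are.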
Some embodiments of the present application provide a detailed flow of the method of Fig. 1 in an application scenario, as shown in Fig. 2. In this application scenario, the hidden danger data is recorded in ledger form, a Neo4j graph database is used as the NoSQL graph database, and a pre-constructed and trained hidden danger labeling system is used as the classification model.
The process in fig. 2 comprises the following steps:
Data acquisition: a hidden danger data ledger is acquired, which specifically comprises information such as hidden danger content, the unit to which the hidden danger belongs, hidden danger type, hidden danger grade, hidden danger source, discovery time, hidden danger reporter, cause analysis, rectification measures, temporary measures taken before rectification, person responsible for rectification, rectification funds, rectification deadline, rectification state and the like.
Data import: a data model is established, the data to be processed is acquired from the data source, and the data is imported into the hidden danger labeling system.
Feature extraction: the hidden danger labeling system extracts relationship feature data (such as words describing semantic relations) from the hidden danger data. Specifically, it performs word segmentation and part-of-speech tagging, judging the grammar category of each word in a given sentence, determining its part of speech and labeling it; it then automatically classifies the words under the four labels of equipment, position, state and hazard, and writes them into the corresponding classification data tables.
Completion and correction: the extracted relationship feature data is completed and corrected to obtain labeled feature data; for example, a labeling person may manually upload completion data and correction data for the relationship feature data.
Processing: redundancy filtering and formatting are performed on the completed and corrected relationship feature data to generate graph node data.
Graph generation: the graph node data is imported into the Neo4j graph database for processing, and the Neo4j graph database correspondingly generates the hidden danger data knowledge graph. The process in Fig. 2 may end here.
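The final import step can be illustrated by generating parameterized Cypher `MERGE` statements from graph node data. This is a sketch under assumptions: the node labels (the capitalized attribute name), the `name` property, the edge type strings and the function name `cypher_statements` are all hypothetical, and the application does not say how the import is actually performed. The sketch only builds the statements; executing them against a Neo4j instance is left to whichever client is in use.

```python
from typing import Dict, List, Tuple

Node = Tuple[str, str]            # (hidden danger attribute, word)
Edge = Tuple[Node, str, Node]     # (source node, edge type, target node)
Statement = Tuple[str, Dict[str, str]]


def _identifier(name: str) -> str:
    """Labels and edge types cannot be query parameters, so validate them."""
    if not name.isidentifier():
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def cypher_statements(nodes: List[Node], edges: List[Edge]) -> List[Statement]:
    """Build (query, parameters) pairs that create nodes and edges idempotently."""
    statements: List[Statement] = []
    for attribute, word in nodes:
        label = _identifier(attribute).capitalize()
        # MERGE creates the node only if it does not exist yet.
        statements.append((f"MERGE (n:{label} {{name: $name}})", {"name": word}))
    for (src_attr, src), edge_type, (dst_attr, dst) in edges:
        src_label = _identifier(src_attr).capitalize()
        dst_label = _identifier(dst_attr).capitalize()
        query = (
            f"MATCH (a:{src_label} {{name: $src}}), (b:{dst_label} {{name: $dst}}) "
            f"MERGE (a)-[:{_identifier(edge_type)}]->(b)"
        )
        statements.append((query, {"src": src, "dst": dst}))
    return statements
```

Using `MERGE` rather than `CREATE` keeps repeated imports of the same ledger from duplicating nodes, which matches the redundancy filtering performed before import.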
Further, some embodiments of the present application further provide a business framework of a model related to the method for constructing a knowledge graph of hidden danger data in fig. 1 in an application scenario, as shown in fig. 3, the model is the classification model described above.
The business framework comprises a training process of the model and a classification process when the model is actually used after the training is finished.
The training process may include the steps of:
firstly, preparing training set data consisting of sample hidden danger data and classifying it manually, each piece of data being labeled with an equipment label, a position label, a state label and a hazard label (each label is a single segmented word, for example: flange, working deck, fault, potential safety hazard); these actions may be regarded as preprocessing or reprocessing;
secondly, constructing a classification model with the KNN machine learning algorithm;
and thirdly, using the classification model to classify new data, thereby testing the classification model.
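The testing step, together with the remark in the classification process that k starts from an initial value and is adjusted according to experimental results, can be sketched as a small search over candidate values of k on held-out labeled data. Everything here is an illustrative assumption: `choose_k`, the candidate values, and the toy `overlap_knn` classifier (majority vote among the samples sharing the most tokens) are hypothetical stand-ins for the real model.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[List[str], str]   # (segmented words, manually assigned label)


def overlap_knn(tokens: List[str], train: List[Sample], k: int) -> str:
    """Toy classifier: majority label among the k samples sharing most tokens."""
    ranked = sorted(train, key=lambda s: len(set(tokens) & set(s[0])), reverse=True)[:k]
    return Counter(label for _, label in ranked).most_common(1)[0][0]


def choose_k(
    classify: Callable[[List[str], List[Sample], int], str],
    train: List[Sample],
    held_out: List[Sample],
    candidates: Sequence[int] = (1, 3, 5),
) -> Tuple[int, float]:
    """Return the candidate k with the highest held-out accuracy (first wins ties)."""
    best_k, best_accuracy = candidates[0], -1.0
    for k in candidates:
        correct = sum(1 for tokens, label in held_out if classify(tokens, train, k) == label)
        accuracy = correct / len(held_out)
        if accuracy > best_accuracy:
            best_k, best_accuracy = k, accuracy
    return best_k, best_accuracy
```

Passing the classifier in as a function keeps the search independent of how similarity and weighting are actually computed.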
The classification process may include the following steps:
firstly, re-describing the vectors used to represent the sample hidden danger data according to the relationship feature data set;
secondly, after new hidden danger data arrives, segmenting the new hidden danger data into words according to the relationship feature data set and determining its vector representation;
thirdly, selecting from the set of sample hidden danger data the k data most similar to the new hidden danger data (for example, the top k by similarity), the similarity between the corresponding vectors being calculated, for example, with the cosine formula (formula image not reproduced in this text);
the value of k is generally determined by choosing an initial value and then adjusting it according to the results of experimental tests;
fourthly, among the k similar data, calculating the weight of each class in turn (formula image not reproduced in this text), where the first symbol (figure RE-GDA0002332168710000101) is the vector corresponding to the new hidden danger data, the second (figure RE-GDA0002332168710000102) is the similarity calculation formula, the same as the formula in the previous step, and the third (figure RE-GDA0002332168710000103) is the class attribute function, whose value is 1 if a similar datum belongs to class C_j and 0 otherwise;
and fifthly, comparing the weights of the classes and assigning the new hidden danger data to the class with the highest weight.
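The formula images referenced in the steps above are not reproduced in this text. A reconstruction consistent with the surrounding definitions, in the commonly used similarity-weighted KNN form, might read as follows. Apart from the class symbol C_j, which appears in the text, all symbols are assumptions: x denotes the vector of the new hidden danger data, d_i a sample vector with components d_{im}, M the vector dimension, and KNN(x) the set of the k most similar samples.

```latex
% Cosine similarity between the new data vector and a sample vector (assumed notation)
\mathrm{sim}(\vec{x}, \vec{d}_i)
  = \frac{\sum_{m=1}^{M} x_m \, d_{im}}
         {\sqrt{\sum_{m=1}^{M} x_m^{2}} \; \sqrt{\sum_{m=1}^{M} d_{im}^{2}}}

% Weight of class C_j over the k most similar samples
p(\vec{x}, C_j)
  = \sum_{\vec{d}_i \in \mathrm{KNN}(\vec{x})}
      \mathrm{sim}(\vec{x}, \vec{d}_i) \, y(\vec{d}_i, C_j)

% Class attribute function
y(\vec{d}_i, C_j) =
  \begin{cases}
    1, & \vec{d}_i \in C_j \\
    0, & \text{otherwise}
  \end{cases}
```

Under this form the new data is assigned to the class C_j that maximizes p(x, C_j), which agrees with the fifth step.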
Based on the same idea, some embodiments of the present application also provide an apparatus, a device, and a non-volatile computer storage medium corresponding to the method of fig. 1.
Fig. 4 is a schematic structural diagram of an apparatus for constructing a hidden danger data knowledge-graph corresponding to fig. 1, according to some embodiments of the present application, where the apparatus includes:
an acquisition module 400, configured to acquire hidden danger data;
an extraction module 402, configured to extract relationship feature data from the hidden danger data through a pre-trained classification model, where the relationship feature data reflects semantic relationships among multiple hidden danger attributes;
a first generating module 404, configured to generate graph node data according to the relationship feature data;
and a second generating module 406, configured to generate a hidden danger data knowledge graph according to the graph node data.
Fig. 5 is a schematic structural diagram of hidden danger data knowledge graph construction equipment corresponding to Fig. 1, provided in some embodiments of the present application, where the equipment includes:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to:
acquiring hidden danger data;
extracting relation characteristic data from the hidden danger data through a pre-trained classification model, wherein the relation characteristic data reflects semantic relations among multiple hidden danger attributes;
generating graph node data according to the relation characteristic data;
and generating a hidden danger data knowledge graph according to the graph node data.
Some embodiments of the present application provide a non-transitory computer storage medium for constructing a hidden danger data knowledge-graph corresponding to fig. 1, storing computer-executable instructions configured to:
acquiring hidden danger data;
extracting relation characteristic data from the hidden danger data through a pre-trained classification model, wherein the relation characteristic data reflects semantic relations among multiple hidden danger attributes;
generating graph node data according to the relation characteristic data;
and generating a hidden danger data knowledge graph according to the graph node data.
The embodiments in the present application are described in a progressive manner, and the same and similar parts among the embodiments can be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the apparatus, device and media embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference may be made to some descriptions of the method embodiments for relevant points.
The apparatus, the device, and the medium provided in the embodiments of the present application correspond to the method, and therefore also have advantageous technical effects similar to those of the corresponding method.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include both permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (10)

1. A construction method of a hidden danger data knowledge graph is characterized by comprising the following steps:
acquiring hidden danger data;
extracting relation characteristic data from the hidden danger data through a pre-trained classification model, wherein the relation characteristic data reflects semantic relations among multiple hidden danger attributes;
generating graph node data according to the relation characteristic data;
and generating a hidden danger data knowledge graph according to the graph node data.
2. The method of claim 1, wherein the classification model is pre-trained as follows:
constructing a classification model based on machine learning;
acquiring sample hidden danger data and a corresponding label thereof, wherein the label indicates hidden danger attributes to which one or more words of the sample hidden danger data belong, and the hidden danger attributes comprise at least one of the following: hidden danger equipment, hidden danger positions, hidden danger states and hidden danger hazards;
and carrying out supervised training on the classification model by utilizing the sample hidden danger data and the corresponding label.
3. The method of claim 2, wherein the label further indicates grammar category data for one or more words of the sample hidden danger data, the grammar category data including at least a part of speech.
4. The method of claim 2, wherein extracting relationship feature data from the hidden danger data through a pre-trained classification model comprises:
segmenting the hidden danger data and converting the segmented hidden danger data into corresponding word vectors;
performing, by the pre-trained classification model: determining, in the set of sample hidden danger data used for training the classification model, a plurality of pieces of similar data that are similar to the hidden danger data, according to the word vectors corresponding to the hidden danger data; determining, according to the plurality of pieces of similar data, the weights of the categories to which the similar data respectively belong; classifying the hidden danger data according to the weights; and obtaining the relationship feature data in the hidden danger data according to the classification result.
5. The method of claim 4, wherein determining, according to the plurality of pieces of similar data, the weights of the categories to which the similar data respectively belong comprises:
determining categories to which the plurality of similar data respectively belong;
for each determined category, determining the number of the similar data contained in the category;
and determining the weight of each category according to that number and the similarity between the similar data and the hidden danger data.
6. The method of claim 1, wherein generating graph node data from the relationship feature data comprises:
receiving completion data and correction data for the relationship feature data;
and performing redundancy filtering and formatting on the relationship feature data, the completion data and the correction data to generate graph node data.
7. The method of claim 1, wherein generating a hidden danger data knowledge graph from the graph node data comprises:
importing the graph node data into a NoSQL graph database for processing;
and acquiring the hidden danger data knowledge graph correspondingly generated by the NoSQL graph database.
8. A hidden danger data knowledge graph construction apparatus, characterized by comprising:
an acquisition module, configured to acquire hidden danger data;
an extraction module, configured to extract relationship feature data from the hidden danger data through a pre-trained classification model, where the relationship feature data reflects semantic relationships among multiple hidden danger attributes;
a first generating module, configured to generate graph node data according to the relationship feature data;
and a second generating module, configured to generate a hidden danger data knowledge graph according to the graph node data.
9. A hidden danger data knowledge graph construction device, characterized by comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to:
acquiring hidden danger data;
extracting relation characteristic data from the hidden danger data through a pre-trained classification model, wherein the relation characteristic data reflects semantic relations among multiple hidden danger attributes;
generating graph node data according to the relation characteristic data;
and generating a hidden danger data knowledge graph according to the graph node data.
10. A non-transitory computer storage medium for constructing a knowledge graph of hidden danger data, the computer storage medium having stored thereon computer-executable instructions configured to:
acquiring hidden danger data;
extracting relation characteristic data from the hidden danger data through a pre-trained classification model, wherein the relation characteristic data reflects semantic relations among multiple hidden danger attributes;
generating graph node data according to the relation characteristic data;
and generating a hidden danger data knowledge graph according to the graph node data.
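The classification steps recited in claims 4 and 5 resemble a similarity-weighted nearest-neighbour vote. The following is a minimal sketch under stated assumptions, not the claimed implementation: cosine similarity as the similarity measure, the neighbour count `k`, and the rule that a category's weight is the sum of its neighbours' similarities (so that it grows with both the number of similar samples and their similarity) are all illustrative choices that the claims do not specify.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def classify(query_vec, samples, k=3):
    """Return (best_category, weights) for one query word vector.

    samples: non-empty list of (vector, category) pairs taken from the
    sample hidden danger data used for training (hypothetical layout).
    """
    # Find the k samples most similar to the query vector.
    scored = sorted(
        ((cosine(query_vec, vec), cat) for vec, cat in samples),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    # Per category, accumulate the similarities of its neighbours. The sum
    # reflects both how many neighbours fall in the category and how
    # similar each one is (an assumed weighting rule).
    weights = defaultdict(float)
    for similarity, category in scored:
        weights[category] += similarity
    # Classify by the category with the largest weight.
    best = max(weights, key=weights.get)
    return best, dict(weights)
```

For example, with two training vectors close to the query labelled `"equipment"` and two distant ones labelled `"state"`, `classify` with `k=3` selects `"equipment"`, because that category contributes two high-similarity neighbours while `"state"` contributes one low-similarity neighbour.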
CN201910652010.6A 2019-07-18 2019-07-18 Hidden danger data knowledge graph construction method, device, equipment and medium Pending CN110851611A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910652010.6A CN110851611A (en) 2019-07-18 2019-07-18 Hidden danger data knowledge graph construction method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910652010.6A CN110851611A (en) 2019-07-18 2019-07-18 Hidden danger data knowledge graph construction method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN110851611A true CN110851611A (en) 2020-02-28

Family

ID=69595253

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910652010.6A Pending CN110851611A (en) 2019-07-18 2019-07-18 Hidden danger data knowledge graph construction method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN110851611A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112257740A (en) * 2020-09-05 2021-01-22 赛飞特工程技术集团有限公司 Knowledge graph-based image hidden danger identification method and system
CN113672741A (en) * 2021-08-19 2021-11-19 支付宝(杭州)信息技术有限公司 Information processing method, device and equipment
CN114089415A (en) * 2020-08-04 2022-02-25 中国石油天然气股份有限公司 Knowledge graph generation method and device based on seismic data processing
CN114781082A (en) * 2022-04-15 2022-07-22 广东省科学院智能制造研究所 Extrusion die design knowledge processing method, system, equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106776711A (en) * 2016-11-14 2017-05-31 浙江大学 A kind of Chinese medical knowledge mapping construction method based on deep learning
CN108460136A (en) * 2018-03-08 2018-08-28 国网福建省电力有限公司 Electric power O&M information knowledge map construction method
CN109614501A (en) * 2018-12-13 2019-04-12 浙江工商大学 A kind of industrial hidden danger standardization report method and system of knowledge based map

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114089415A (en) * 2020-08-04 2022-02-25 中国石油天然气股份有限公司 Knowledge graph generation method and device based on seismic data processing
CN114089415B (en) * 2020-08-04 2023-11-28 中国石油天然气股份有限公司 Knowledge graph generation method and device based on seismic data processing
CN112257740A (en) * 2020-09-05 2021-01-22 赛飞特工程技术集团有限公司 Knowledge graph-based image hidden danger identification method and system
CN113672741A (en) * 2021-08-19 2021-11-19 支付宝(杭州)信息技术有限公司 Information processing method, device and equipment
CN114781082A (en) * 2022-04-15 2022-07-22 广东省科学院智能制造研究所 Extrusion die design knowledge processing method, system, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20211011

Address after: 3 / F, xindongyuan North building, 3501 Chengfu Road, Haidian District, Beijing 100083

Applicant after: HUARUI XINZHI TECHNOLOGY (BEIJING) Co.,Ltd.

Applicant after: Huarui Xinzhi Baoding Technology Co.,Ltd.

Address before: 3 / F, xindongyuan North building, No. 35-1, Chengfu Road, Haidian District, Beijing 100083

Applicant before: HUARUI XINZHI TECHNOLOGY (BEIJING) Co.,Ltd.

RJ01 Rejection of invention patent application after publication

Application publication date: 20200228