CN113220896B - Multi-source knowledge graph generation method, device and terminal equipment - Google Patents

Info

Publication number
CN113220896B
Authority
CN
China
Prior art keywords
knowledge
graph
initial
candidate
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110457283.2A
Other languages
Chinese (zh)
Other versions
CN113220896A (en)
Inventor
林玥煜
邓侃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing RxThinking Ltd
Original Assignee
Beijing RxThinking Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing RxThinking Ltd
Priority to CN202110457283.2A
Publication of CN113220896A
Application granted
Publication of CN113220896B
Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H15/00ICT specially adapted for medical reports, e.g. generation or transmission thereof
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H40/00ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices
    • G16H40/20ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the management or administration of healthcare resources or facilities, e.g. managing hospital staff or surgery rooms

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Business, Economics & Management (AREA)
  • Public Health (AREA)
  • Primary Health Care (AREA)
  • Medical Informatics (AREA)
  • Epidemiology (AREA)
  • General Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Animal Behavior & Ethology (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

The embodiments of the invention disclose a multi-source knowledge graph generation method, a multi-source knowledge graph generation device, and terminal equipment. One embodiment of the method comprises: detecting whether an operation authorization signal is received from a target terminal device; in response to detecting the operation authorization signal, acquiring a medical database set; generating an initial knowledge-graph set based on the medical database set, wherein the initial knowledge-graph set comprises a second number of initial knowledge graphs, each initial knowledge graph characterizes a medical condition, and the initial knowledge-graph set characterizes the second number of medical conditions; generating a target knowledge-graph set; and pushing the target knowledge-graph set to a target terminal device with a display function and controlling the target terminal device to display the target knowledge-graph set. In this embodiment, a target knowledge-graph set representing different medical conditions is constructed from the multi-source medical information in the medical database set, so that the medical information can be used effectively and the completeness and accuracy of the target knowledge-graph set can be improved.

Description

Multi-source knowledge graph generation method, device and terminal equipment
Technical Field
The embodiment of the disclosure relates to the technical field of computers, in particular to a knowledge graph generation method, a knowledge graph generation device and terminal equipment.
Background
With the increasing living standard of people, the attention to medical health is rising year by year, and how to mine medical information contained in medical records and diagnosis results is attracting more attention. Meanwhile, the medical research level of China is continuously improved, medical researchers produce mass research documents each year, and the medical documents also contain rich professional medical knowledge. Processing such knowledge into structured information using text mining techniques can bring great progress in medical knowledge informatization. The rapid development of natural language processing has made it possible to automatically extract medical entities and relationships between entities from cases, documents. The extracted medical knowledge can be used for constructing a medical knowledge graph to promote the intelligent development of medicine.
However, when knowledge graphs are generated from case records, examination reports, medical documents, and the like, the following technical problems often arise:
First, medical databases from multiple sources such as case records, examination reports, and medical documents contain different medical knowledge data nodes and condition attribute nodes. Simply using edges in the knowledge graph to represent relationships between nodes cannot express the relationship between a medical condition and the medical knowledge data, so the knowledge graph cannot accurately represent the medical information.
Second, the diversity of the sources from which a knowledge graph is constructed causes repeated knowledge, poor quality, and inaccurate association in the knowledge graph, and directly using the medical database set to generate the initial knowledge-graph set leads to insufficient node matching and inaccurate calculation of the relation weights between nodes.
Disclosure of Invention
This section introduces, in simplified form, concepts that are described in more detail in the Detailed Description below. It is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
Some embodiments of the present disclosure propose a multi-source knowledge graph generation method, apparatus, and terminal device to solve one or more of the technical problems mentioned in the background section above.
In a first aspect, some embodiments of the present disclosure provide a multi-source knowledge-graph generation method, the method including: detecting whether an operation authorization signal is received from a target terminal device; in response to detecting the operation authorization signal, acquiring a medical database set; generating an initial knowledge-graph set based on the medical database set, wherein the initial knowledge-graph set comprises a second number of initial knowledge graphs, each initial knowledge graph characterizes a medical condition, and the initial knowledge-graph set characterizes the second number of medical conditions; generating a target knowledge-graph set; and pushing the target knowledge-graph set to a target terminal device with a display function and controlling the target terminal device to display the target knowledge-graph set.
In some embodiments, the determining, for each candidate knowledge-graph in the candidate knowledge-graph set of the initial knowledge-graph, a fusion indicator of the candidate knowledge-graph and the initial knowledge-graph includes:
generating a candidate first vector of the candidate knowledge graph based on the entity set and the attribute set of the candidate knowledge graph;
generating a candidate second vector of the candidate knowledge-graph based on the relation set of the candidate knowledge-graph;
generating an initial first vector of the initial knowledge graph based on the entity set and the attribute set of the initial knowledge graph;
generating an initial second vector of the initial knowledge-graph based on the set of relationships of the initial knowledge-graph;
based on the candidate first vector, the initial first vector, the candidate second vector and the initial second vector, calculating a fusion index of the candidate knowledge graph and the initial knowledge graph by using the following formula:
wherein v_n1 represents the candidate first vector, v_n2 represents the initial first vector, v_r1 represents the candidate second vector, v_r2 represents the initial second vector, cos(,) represents the cosine value, gamma is a harmonic parameter that may be any integer, and sim represents the fusion index.
In a second aspect, some embodiments of the present disclosure provide a multi-source knowledge-graph generation apparatus, the apparatus including: a detection unit configured to detect whether an operation authorization signal is received from the target terminal device, wherein the operation authorization signal is a signal generated by a user performing a target operation on the target control; a receiving unit configured to obtain a set of medical databases in response to detecting the operation authorization signal, wherein the set of medical databases comprises a first number of medical databases; a first generation unit configured to generate an initial knowledge-graph set based on the medical database set, wherein the initial knowledge-graph set comprises a second number of initial knowledge-graphs; a second generation unit configured to generate a target knowledge-graph set based on the initial knowledge-graph set; the control unit is configured to push the target knowledge graph set to target terminal equipment with a display function and control the target terminal equipment to display the target knowledge graph set.
In a third aspect, some embodiments of the present disclosure provide a terminal device, including: one or more processors; a storage device having one or more programs stored thereon, which when executed by one or more processors, cause the one or more processors to implement the method as in any of the first aspects.
The above embodiments of the present disclosure have the following advantages: with the multi-source knowledge graph generation method of some embodiments of the present disclosure, a target knowledge-graph set representing different medical conditions can be constructed from the multi-source medical information in a medical database set, so that the medical information can be used effectively and the completeness and accuracy of the target knowledge-graph set can be improved. In particular, the inventors found that the reason current medical knowledge graphs are incomplete and inaccurate is as follows: medical databases from multiple sources such as case records, examination reports, and medical documents contain different medical knowledge data nodes and condition attribute nodes, and simply using edges in the knowledge graph to represent relationships between nodes cannot express the relationship between a medical condition and the medical knowledge data, so the knowledge graph cannot accurately represent the medical information. Based on this, some embodiments of the present disclosure first acquire a medical database set. The medical database set comprises a first number of medical databases, and a medical database is a collection of structured medical text paragraphs. Next, an initial knowledge-graph set is generated based on the medical database set. The initial knowledge-graph set comprises a second number of initial knowledge graphs, each initial knowledge graph represents a medical condition, and the initial knowledge-graph set represents the second number of medical conditions. Specifically, the initial knowledge graphs correspond one-to-one with medical conditions. Then, a target knowledge-graph set is generated based on the initial knowledge-graph set. Specifically, knowledge fusion processing is performed on the initial knowledge-graph set to obtain the target knowledge-graph set.
By taking the symptoms as dimensions to generate the initial knowledge graph, the information in medical knowledge bases with different sources can be effectively utilized, and the initial knowledge graph is further fused to obtain a target knowledge graph set, so that the integrity of multi-source information is ensured, and the accuracy of the target knowledge graph set on the correspondence of disease symptoms in medicine is improved.
Drawings
The above and other features, advantages, and aspects of embodiments of the present disclosure will become more apparent by reference to the following detailed description when taken in conjunction with the accompanying drawings. The same or similar reference numbers will be used throughout the drawings to refer to the same or like elements. It should be understood that the figures are schematic and that elements and components are not necessarily drawn to scale.
FIG. 1 is an architecture diagram of an exemplary system in which some embodiments of the present disclosure may be applied;
FIG. 2 is a flow chart of some embodiments of a multi-source knowledge-graph generation method in accordance with the present disclosure;
FIG. 3 is an exemplary authorization prompt;
FIG. 4 is a flow chart of some embodiments of a multi-source knowledge-graph generation apparatus in accordance with the present disclosure;
FIG. 5 is a schematic structural diagram of a terminal device suitable for use in implementing some embodiments of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.
It should be noted that, for convenience of description, only the portions related to the present invention are shown in the drawings. Embodiments of the present disclosure and features of embodiments may be combined with each other without conflict.
It should be noted that the terms "first," "second," and the like in this disclosure are merely used to distinguish between different devices, modules, or units and are not used to define an order or interdependence of functions performed by the devices, modules, or units.
It should be noted that references to "a" and "a plurality of" in this disclosure are intended to be illustrative rather than limiting, and those of ordinary skill in the art will appreciate that they should be understood as "one or more" unless the context clearly indicates otherwise.
The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
FIG. 1 illustrates an exemplary system architecture 100 to which embodiments of the multi-source knowledge-graph generation method of the present disclosure may be applied.
As shown in fig. 1, a system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like. Various communication client applications, such as a knowledge graph generation application, an information generation application, a data analysis application, and the like, may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various terminal devices with display screens, including, but not limited to, smartphones, tablets, laptop computers, and desktop computers. When the terminal devices 101, 102, 103 are software, they can be installed in the terminal devices listed above. They may be implemented as multiple pieces of software or software modules (e.g., to provide medical database set input), or as a single piece of software or software module. No specific limitation is imposed here.
The server 105 may be a server providing various services, such as a server storing a set of medical databases input by the terminal devices 101, 102, 103, etc. The server may process the received medical database set and feed back the processing result (e.g., the target knowledge-graph set) to the terminal device.
It should be noted that, the multi-source knowledge graph generation method provided by the embodiment of the present disclosure may be executed by the server 105 or the terminal device.
It should be noted that the medical database set may also be stored locally on the server 105, in which case the server 105 may read the local medical database set directly and process it to obtain the target knowledge-graph set. In this case, the exemplary system architecture 100 may omit the terminal devices 101, 102, 103 and the network 104.
It should also be noted that the terminal devices 101, 102, 103 may also have a multi-source knowledge-graph generation application installed therein, and the processing method may also be executed by the terminal devices 101, 102, 103. At this point, the exemplary system architecture 100 may also not include the server 105 and the network 104.
The server 105 may be hardware or software. When the server 105 is hardware, it may be implemented as a distributed server cluster formed by a plurality of servers, or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (for example, to provide a knowledge-graph generation service), or as a single piece of software or software module. No specific limitation is imposed here.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to fig. 2, a flow 200 of some embodiments of a multi-source knowledge-graph generation method in accordance with the present disclosure is shown. The multi-source knowledge graph generation method comprises the following steps:
Step 201, detecting whether an operation authorization signal is received from the target terminal device.
In some embodiments, an execution subject of the multi-source knowledge-graph generation method (e.g., the server shown in fig. 1) detects whether an operation authorization signal is received from a target terminal device. The operation authorization signal is a signal generated by a user performing a target operation on a target control. The target terminal device may be a terminal device on which the user's account is logged in. The terminal device may be a "mobile phone" or a "computer". The target control may be included in an authorization prompt box. The authorization prompt box can be displayed on the target terminal device. The target control may be a "confirm button".
Step 202, in response to detecting the operation authorization signal, acquiring a medical database set.
In some embodiments, the execution subject of the multi-source knowledge-graph generation method (e.g., the server shown in fig. 1) may acquire a medical database set entered by a user in response to detecting the operation authorization signal. The operation authorization signal may be a signal generated when the user corresponding to the medical database set performs a target operation on a target control. The target control may be included in an authorization prompt box. The authorization prompt box can be displayed on the target terminal device. The target terminal device may be a terminal device on which the user's account is logged in. The terminal device may be a "mobile phone" or a "computer". The target operation may be a "click operation" or a "slide operation". The target control may be a "confirm button".
As an example, the authorization prompt box may be as shown in fig. 3. The authorization prompt box may include a prompt information display portion 301 and a control 302. The prompt information display portion 301 may be used to display prompt information. The prompt information may be "whether acquisition of a medical database set is allowed". The control 302 may be a "confirm button" or a "cancel button".
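The gating described in steps 201 and 202 can be sketched as follows. This is a minimal illustration, not the patented implementation: the `OperationSignal` type, its fields, and the `load_databases` callback are hypothetical names introduced here, and the sketch assumes that only a "confirm button" operation counts as authorization.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class OperationSignal:
    """Hypothetical message produced when the user operates a control."""
    control: str    # e.g. "confirm button" or "cancel button"
    operation: str  # e.g. "click operation" or "slide operation"


def acquire_if_authorized(
    signal: Optional[OperationSignal],
    load_databases: Callable[[], list],
) -> Optional[list]:
    """Return the medical database set only after authorization is detected."""
    # Assumption: only an operation on the "confirm button" authorizes access.
    authorized = signal is not None and signal.control == "confirm button"
    if not authorized:
        return None  # no data is read without the user's confirmation
    return load_databases()
```

Passing the loader as a callback keeps the database access behind the check, so nothing is read before the signal has been inspected.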
Optionally, the medical database set comprises a first number of medical databases, where a medical database is a collection of structured medical text paragraphs. The types of medical database in the set may include, but are not limited to, at least one of the following: medical books, medical dictionaries, medical literature, expert discussion data, and electronic medical records. Medical information in medical dictionaries and electronic medical records is structured and includes, but is not limited to, at least one of the following: demographic information, laboratory reports, diagnostic results, prescriptions, and medical orders. Medical information in medical books, medical papers, and expert discussion data is unstructured and consists mainly of paragraphs written in natural language. The medical literature may be a textbook, a clinical guideline, a drug package insert, a medical standard given legality and authority by law, a medical paper, or a recent medical research result. As an example, a structured medical text paragraph may be: {"breast ultrasound": {"findings": {"echogenic mass": {"position": "left breast", "nature": "solid, mixed"}}, "conclusion": "nodular breast mass"}, "spirit": "good", "physical strength": "good", "appetite": "good", "eating quality": "good", "body weight": "unchanged", "stool": "normal", "urination": "normal", "discovery time": "1 year"}.
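To make the idea of a structured medical text paragraph concrete, the sketch below represents a shortened version of the example as a nested Python dictionary and flattens it into attribute/value pairs. The nesting, the `flatten` helper, and the " / " path separator are assumptions made for illustration; the source does not specify a storage format.

```python
from typing import Any, Iterator, Tuple

# Shortened, illustrative version of the example paragraph.
record = {
    "breast ultrasound": {"conclusion": "nodular breast mass"},
    "spirit": "good",
    "appetite": "good",
    "stool": "normal",
    "discovery time": "1 year",
}


def flatten(node: Any, path: Tuple[str, ...] = ()) -> Iterator[Tuple[str, str]]:
    """Yield (slash-joined attribute path, value) pairs from a nested record."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from flatten(value, path + (key,))
    else:
        yield (" / ".join(path), str(node))


pairs = list(flatten(record))
```

Each resulting pair, such as ("spirit", "good"), has the same attribute/value shape as the initial data items described later in step 203.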
Step 203, generating an initial knowledge-graph set based on the medical database set.
In some embodiments, the executing entity generates the initial knowledge-graph set based on the set of medical databases. Wherein the initial knowledge-graph set comprises a second number of initial knowledge-graphs. The initial knowledge-graph characterizes a medical condition and the initial knowledge-graph set characterizes a second number of medical conditions.
Optionally, an initial data set and an initial relation set are generated based on the medical database set and a predetermined knowledge-graph library. The initial data set comprises a positive initial data set and a negative initial data set, where positive initial data represents information related to the medical condition and negative initial data represents information unrelated to the medical condition. For example, the initial data set may include "spirit": "good", "physical strength": "good", "appetite": "good", "age": "older than 65 years", "symptom manifestation": "cough", and "gender": "female". Here, "gender": "female" may be negative initial data, and "symptom manifestation": "cough" may be positive initial data. The initial relation set includes the correspondences between initial data. For "symptom manifestation": "cough" and "symptom manifestation": "fracture", there is no initial relationship between the two, since a fracture does not directly cause a cough. For "symptom manifestation": "cough" and "symptom manifestation": "fever", a cough and a fever are causally linked, so there is an initial relationship between the two.
The predetermined knowledge-graph library may be a knowledge-graph library compiled from medical literature and data. The predetermined nodes in the predetermined knowledge-graph library may be medical conditions, condition attributes, examination items, medicines, body parts, and operation modes compiled from medical literature and data. The predetermined edges in the predetermined knowledge-graph library may be relationships between nodes compiled from medical literature and data. The predetermined weight of an edge is the accumulated number of connections between the two nodes. Specifically, in response to the predetermined edge weight being "0", there is no relationship between the two predetermined nodes to which the predetermined edge corresponds. In response to the predetermined edge weight being "1", there is an association between the two predetermined nodes corresponding to the predetermined edge. The predetermined knowledge-graph library may include a third number of predetermined knowledge graphs. Each predetermined knowledge graph may correspond to one condition, so the predetermined knowledge-graph library includes predetermined knowledge graphs corresponding to a third number of conditions. The relationships between the initial data in the initial data set may be determined according to the predetermined knowledge-graph library, so as to update the initial relation set. For example, for "symptom manifestation": "cough" and "symptom manifestation": "fracture", a lookup in the predetermined knowledge-graph library shows that a fracture may induce infection, which in turn may cause a cough. "symptom manifestation": "cough" and "symptom manifestation": "fracture" can therefore be associated through "symptom manifestation": "fever".
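The relation update described above can be sketched as follows. The text does not say how the library lookup is performed, so this sketch assumes one simple rule: two data items become related if the library links them directly or through a single shared neighbour, mirroring the cough/fracture/fever example. `LIBRARY_EDGES`, `neighbours`, and `update_relations` are hypothetical names.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, Set

Pair = FrozenSet[str]

# Hypothetical predetermined library: undirected associations between nodes.
LIBRARY_EDGES: Set[Pair] = {
    frozenset({"fracture", "fever"}),
    frozenset({"fever", "cough"}),
}


def neighbours(node: str, edges: Set[Pair]) -> Set[str]:
    """Nodes that share a library edge with `node`."""
    return {other for edge in edges if node in edge for other in edge if other != node}


def update_relations(data: Iterable[str], relations: Set[Pair], edges: Set[Pair]) -> Set[Pair]:
    """Add a relation for every pair linked directly or via one shared neighbour."""
    updated = set(relations)
    for a, b in combinations(sorted(set(data)), 2):
        pair = frozenset({a, b})
        if pair in edges or neighbours(a, edges) & neighbours(b, edges):
            updated.add(pair)
    return updated


initial = {frozenset({"cough", "fever"})}
result = update_relations(["cough", "fever", "fracture", "female"], initial, LIBRARY_EDGES)
```

In this run, "cough" and "fracture" gain a relation through their shared neighbour "fever", while the negative item "female" stays unconnected.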
Optionally, an initial weight set is generated based on the initial data set, the initial relation set, and the predetermined knowledge-graph library. Specifically, preliminary initial weights are first generated from the initial relation set: in response to there being no initial relationship between two initial data, the initial weight between them is 0; in response to there being an initial relationship between two initial data, the initial weight between them is 1. The preliminary weights are then updated according to the predetermined knowledge-graph library to obtain the initial weight set: in response to an initial relationship being found in the predetermined knowledge-graph library, 1 is added to the initial weight corresponding to that initial relationship.
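The weight rule above (0 without a relation, 1 with a relation, plus 1 when the relation is also found in the predetermined library) can be sketched as follows. The function name and the use of `frozenset` pairs as keys are choices made for this illustration only.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, Set

Pair = FrozenSet[str]


def initial_weights(
    data: Iterable[str],
    relations: Set[Pair],
    library_relations: Set[Pair],
) -> Dict[Pair, int]:
    """Assign 0/1 from the relation set, then add 1 for relations in the library."""
    weights: Dict[Pair, int] = {}
    for a, b in combinations(sorted(set(data)), 2):
        pair = frozenset({a, b})
        weight = 1 if pair in relations else 0
        if weight and pair in library_relations:
            weight += 1  # the relation is confirmed by the predetermined library
        weights[pair] = weight
    return weights


weights = initial_weights(
    ["cough", "fever", "fracture"],
    relations={frozenset({"cough", "fever"}), frozenset({"fever", "fracture"})},
    library_relations={frozenset({"cough", "fever"})},
)
```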
Optionally, an initial knowledge-graph set is generated according to the initial data set, the initial relation set and the initial weight set. Specifically, the initial data set is determined as an initial node set in the initial knowledge-graph set, and the initial relation set is determined as an initial edge set in the initial knowledge-graph set. And determining the weight of the initial edge in the initial knowledge graph according to the initial weight set.
Optionally, the initial knowledge-graph includes an initial data set, an initial relationship set, and an initial weight set. The initial data is an initial node in the initial knowledge graph, the initial relation is an initial edge in the initial knowledge graph, the initial edge represents the relation between the initial nodes, and the weight of the initial edge is the accumulated connection times between different initial nodes.
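One way to hold the three sets together is a small container type, sketched below. The `KnowledgeGraph` class, its field names, and the example condition name are assumptions for illustration; the source only states that data become nodes, relations become edges, and weights attach to edges.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Iterable, Set

Pair = FrozenSet[str]


@dataclass
class KnowledgeGraph:
    """Hypothetical container for one condition's initial knowledge graph."""
    condition: str
    nodes: Set[str] = field(default_factory=set)
    edges: Dict[Pair, int] = field(default_factory=dict)  # edge -> weight


def build_graph(
    condition: str,
    data: Iterable[str],
    relations: Iterable[Pair],
    weights: Dict[Pair, int],
) -> KnowledgeGraph:
    """Turn the initial data, relation, and weight sets into nodes and weighted edges."""
    graph = KnowledgeGraph(condition, set(data))
    for pair in relations:
        # Assumption: a relation with no recorded weight defaults to 1.
        graph.edges[pair] = weights.get(pair, 1)
    return graph


cough_fever = frozenset({"cough", "fever"})
graph = build_graph("pneumonia", ["cough", "fever", "female"], [cough_fever], {cough_fever: 2})
```

Negative initial data such as "female" remain in the node set even though no edge touches them.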
Step 204, generating a target knowledge-graph set based on the initial knowledge-graph set.
In some embodiments, the executing entity generates the target knowledge-graph set based on the initial knowledge-graph set.
Optionally, for each initial knowledge graph in the initial knowledge-graph set, a candidate knowledge-graph set is generated for that initial knowledge graph. Specifically, for each initial knowledge graph, a copy of the initial knowledge-graph set is taken as its candidate knowledge-graph set, and the initial knowledge graph itself is deleted from that candidate set.
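The candidate construction above amounts to "every other graph in the set". A minimal sketch, identifying graphs by condition name for brevity (the names themselves are made up):

```python
from typing import Dict, List, Sequence


def candidate_sets(initial_graphs: Sequence[str]) -> Dict[str, List[str]]:
    """Map each initial graph to the initial set with that graph removed."""
    return {
        name: [other for other in initial_graphs if other != name]
        for name in initial_graphs
    }


candidates = candidate_sets(["pneumonia", "influenza", "fracture"])
```

With n initial graphs, each graph has n - 1 candidates, so the later fusion-index step is evaluated n(n - 1) times.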
Optionally, for each initial knowledge graph in the initial knowledge-graph set, the initial knowledge graph is fused with its candidate knowledge-graph set to generate a target knowledge graph, so as to obtain the target knowledge-graph set. Optionally, a fusion threshold is determined. Specifically, the fusion threshold may be a threshold for controlling the fusion of knowledge graphs, determined according to medical information processing experience. The fusion threshold may be a positive integer.
Optionally, for each candidate knowledge graph in the candidate knowledge-graph set of the initial knowledge graph, a fusion index of the candidate knowledge graph and the initial knowledge graph is determined by performing the following generation steps, so as to obtain a fusion index set.
Generation steps: a candidate first vector of the candidate knowledge graph is generated based on the candidate data set of the candidate knowledge graph. Specifically, text vectorization may be performed using a one-hot encoding method to generate the candidate first vector. One-hot encoding uses an N-bit state register to encode N states; each state has its own register bit, and only one bit is valid at any time. For example, six candidate data items are one-hot encoded: "headache", "fever", "cough", "poor spirit", "vomiting", "syncope". After one-hot encoding, their codes are 000001, 000010, 000100, 001000, 010000, and 100000, respectively.
A candidate second vector of the candidate knowledge graph is generated based on the candidate relation set and the candidate weight set of the candidate knowledge graph. Specifically, text vectorization may be performed using a one-hot encoding method to generate the candidate second vector. An initial first vector of the initial knowledge graph is generated based on the initial data set of the initial knowledge graph. Specifically, text vectorization may be performed using a one-hot encoding method to generate the initial first vector. An initial second vector of the initial knowledge graph is generated based on the initial relation set and the initial weight set of the initial knowledge graph. Specifically, text vectorization may be performed using a one-hot encoding method to generate the initial second vector.
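The six-item one-hot example can be reproduced with a few lines of Python. The helper name `one_hot_codes` is introduced here for illustration; the bit order (first item in the lowest bit) is chosen to match the codes listed in the text.

```python
from typing import Dict, Sequence


def one_hot_codes(items: Sequence[str]) -> Dict[str, str]:
    """Give each item an N-bit code with exactly one bit set (N = number of items)."""
    width = len(items)
    # Item i sets bit i, counted from the right, and is zero-padded to N bits.
    return {item: format(1 << i, f"0{width}b") for i, item in enumerate(items)}


codes = one_hot_codes(["headache", "fever", "cough", "poor spirit", "vomiting", "syncope"])
```

How the per-item codes are combined into a single vector per graph is not stated in the text; one common choice, assumed in later sketches, is to use a shared vocabulary and set the bit of every item the graph contains.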
Optionally, the fusion index of the candidate knowledge-graph and the initial knowledge-graph is calculated by using the following formula:
where $v_{n1}$ denotes the candidate first vector, $v_{n2}$ denotes the initial first vector, $v_{r1}$ denotes the candidate second vector, and $v_{r2}$ denotes the initial second vector; $\cos(\cdot,\cdot)$ denotes the cosine of the angle between two vectors; $\gamma$ is a balancing (harmonic) parameter, which may be any integer; and $sim$ denotes the fusion index.
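As an illustration only, the following Python sketch computes a fusion index from the four vectors. The formula itself is not reproduced in this text, so the combination used here, sim = cos(v_n1, v_n2) + gamma * cos(v_r1, v_r2), is an assumed form and not necessarily the formula of the disclosure. The function names, the sample vectors, and the use of summed one-hot codes as graph-level vectors are likewise hypothetical.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen in this sketch for an all-zero vector
    return dot / (norm_u * norm_v)


def fusion_index(v_n1, v_n2, v_r1, v_r2, gamma=1):
    """ASSUMED form: sim = cos(v_n1, v_n2) + gamma * cos(v_r1, v_r2)."""
    return cosine(v_n1, v_n2) + gamma * cosine(v_r1, v_r2)


# Hypothetical graph-level vectors: node vectors are sums of one-hot codes,
# relation vectors carry the edge weights.
candidate_nodes = [1, 1, 0, 1, 0, 0]
initial_nodes = [1, 1, 1, 0, 0, 0]
candidate_relations = [2, 0, 1]
initial_relations = [2, 0, 1]

sim = fusion_index(candidate_nodes, initial_nodes,
                   candidate_relations, initial_relations, gamma=1)
print(round(sim, 4))
```

Under this assumed form, a larger gamma gives the relation and weight similarity more influence relative to the node similarity.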
Optionally, for each fusion index in the fusion index set, in response to the fusion index being not smaller than the fusion threshold, the initial knowledge graph is updated according to the entity set and the attribute set in the candidate knowledge graph corresponding to the fusion index.
Optionally, for each fusion index in the fusion index set, in response to the fusion index being smaller than the fusion threshold, the candidate knowledge graph corresponding to the fusion index is merged into the initial knowledge graph.
The initial knowledge graph updated in this way is taken as the target knowledge graph. Specifically, the candidate knowledge graphs are merged according to the fusion results to obtain the target knowledge graph.
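The threshold branching described above might be sketched as follows. The text does not spell out how "updating according to the entity set and the attribute set" differs from "merging", so this sketch assumes that updating copies only entities and attributes, while merging additionally carries over edges and their weights. The dictionary layout, the function name `fuse`, and the sample graphs are all hypothetical.

```python
def fuse(initial, candidates, fusion_indices, threshold):
    """Build a target graph from an initial graph and its candidate graphs.

    Graphs are plain dicts with keys "entities" (set of names),
    "attributes" (dict: entity -> set of attributes) and
    "weights" (dict: (entity, entity) edge -> connection count).
    The initial graph is copied, not modified in place.
    """
    target = {
        "entities": set(initial["entities"]),
        "attributes": {e: set(a) for e, a in initial["attributes"].items()},
        "weights": dict(initial["weights"]),
    }
    for candidate, sim in zip(candidates, fusion_indices):
        # Both branches bring in the candidate's entities and attributes.
        target["entities"] |= candidate["entities"]
        for entity, attrs in candidate["attributes"].items():
            target["attributes"].setdefault(entity, set()).update(attrs)
        if sim < threshold:
            # ASSUMPTION: merging also accumulates the candidate's edge weights.
            for edge, weight in candidate["weights"].items():
                target["weights"][edge] = target["weights"].get(edge, 0) + weight
    return target


initial = {
    "entities": {"lung cancer", "cough"},
    "attributes": {"cough": {"positive"}},
    "weights": {("lung cancer", "cough"): 3},
}
cand_a = {
    "entities": {"lung cancer", "fever"},
    "attributes": {"fever": {"positive"}},
    "weights": {("lung cancer", "fever"): 2},
}
cand_b = {
    "entities": {"headache"},
    "attributes": {"headache": {"negative"}},
    "weights": {("lung cancer", "headache"): 1},
}
target = fuse(initial, [cand_a, cand_b], [1.5, 0.4], threshold=1)
print(sorted(target["entities"]))
```

In this example `cand_a` reaches the threshold and contributes only entities and attributes, whereas `cand_b` falls below it and is merged together with its edge weight.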
The optional content in steps 203-204 above, namely generating the target knowledge graph set through fusion processing, is an inventive point of the embodiments of the present disclosure. It addresses the second technical problem mentioned in the background: "the diversity of knowledge graph sources during construction causes repeated knowledge, poor quality, and inaccurate association in the knowledge graph, and directly generating the initial knowledge graph set from the medical database set causes insufficient node matching and inaccurate calculation of the relationship weights between nodes." The factor that causes insufficient node matching and inaccurate relationship weights between nodes is often the following: the quality of the data in the medical database set used to construct the knowledge graph is uneven, which affects the accuracy of node matching and of node relationship weight calculation. If this factor is addressed, the matching degree can be improved. To achieve this effect, the present disclosure introduces fusion processing. First, an initial knowledge graph set is generated based on the medical database set. The nodes in an initial knowledge graph include positive initial data and negative initial data, so that the medical information corresponding to a medical condition is represented more comprehensively. Then, a target knowledge graph set is generated from the initial knowledge graph set by fusion. The similarity between each candidate knowledge graph and the initial knowledge graph is judged against the empirically determined fusion threshold to generate the target knowledge graph set.
Through fusion processing, all information in the candidate knowledge graph can be utilized, and meanwhile, the similarity is judged based on the proposed fusion formula, so that the matching accuracy can be improved, and the second technical problem is solved.
Step 205, pushing the target knowledge-graph set to a target device with a display function, and controlling the target device to display the target knowledge-graph set.
In some embodiments, the execution body pushes the target knowledge graph set to the target terminal device and controls the target terminal device to perform display-related operations. The target terminal device may be a device in communication with the execution body, and may perform display-related operations according to the received target knowledge graph set. For example, the target knowledge graph set output by the execution body may be a set of knowledge graphs of diseases; specifically, a target knowledge graph may be a knowledge graph of lung cancer or a knowledge graph of breast hyperplasia. The target terminal device may issue an alert display signal prompting further diagnosis or treatment for the above diseases. Through fusion processing, a target knowledge graph set with high matching accuracy can be generated, which helps to assist subsequent medical treatment.
The embodiment illustrated in fig. 2 has the following beneficial effects. The method detects whether an operation authorization signal is received from a target terminal device; in response to detecting the operation authorization signal, acquires a medical database set; generates an initial knowledge graph set based on the medical database set, wherein the initial knowledge graph set comprises a second number of initial knowledge graphs, each initial knowledge graph characterizes a medical condition, and the initial knowledge graph set characterizes the second number of medical conditions; generates a target knowledge graph set; and pushes the target knowledge graph set to a target terminal device with a display function and controls the target terminal device to display the target knowledge graph set. In this embodiment, the target knowledge graph set characterizing different medical conditions is constructed from the multi-source medical information in the medical database set, so that the medical information can be used effectively and the completeness and accuracy of the target knowledge graph set can be improved.
With further reference to fig. 4, as an implementation of the method shown in the figures above, the present disclosure provides some embodiments of a multi-source knowledge graph generation apparatus. These apparatus embodiments correspond to the method embodiments described above with reference to fig. 2, and the apparatus may be applied to various terminal devices.
As shown in fig. 4, the multi-source knowledge graph generation apparatus 400 of some embodiments includes: a detection unit 401, a receiving unit 402, a first generation unit 403, a second generation unit 404, and a control unit 405. The detection unit 401 is configured to detect whether an operation authorization signal is received from the target terminal device, wherein the operation authorization signal is a signal generated by a user performing a target operation on a target control. The receiving unit 402 is configured to obtain a medical database set in response to detecting the operation authorization signal, wherein the medical database set comprises a first number of medical databases. The first generation unit 403 is configured to generate an initial knowledge graph set based on the medical database set, wherein the initial knowledge graph set comprises a second number of initial knowledge graphs. The second generation unit 404 is configured to generate a target knowledge graph set based on the initial knowledge graph set. The control unit 405 is configured to push the target knowledge graph set to a target device having a display function, and to control the target device to display the target knowledge graph set.
It will be appreciated that the elements described in the apparatus 400 correspond to the various steps in the method described with reference to fig. 2. Thus, the operations, features and resulting benefits described above with respect to the method are equally applicable to the apparatus 400 and the units contained therein, and are not described in detail herein.
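The division of the apparatus 400 into five units can be pictured with the following minimal Python sketch. It only shows how the units could hand data to one another in the order of the method of fig. 2. The class name, the callable-based wiring, and the stub data are hypothetical and stand in for the real unit implementations.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class MultiSourceKnowledgeGraphApparatus:
    """Hypothetical wiring of units 401-405 as plain callables."""
    detect: Callable[[], bool]                             # detection unit 401
    receive: Callable[[], List[dict]]                      # receiving unit 402
    generate_initial: Callable[[List[dict]], List[dict]]   # first generation unit 403
    generate_target: Callable[[List[dict]], List[dict]]    # second generation unit 404
    display: Callable[[List[dict]], None]                  # control unit 405

    def run(self) -> bool:
        """Run the pipeline once; return False if no authorization signal."""
        if not self.detect():
            return False
        databases = self.receive()
        initial_graphs = self.generate_initial(databases)
        target_graphs = self.generate_target(initial_graphs)
        self.display(target_graphs)
        return True


# Stub units, for illustration only.
shown = []
apparatus = MultiSourceKnowledgeGraphApparatus(
    detect=lambda: True,
    receive=lambda: [{"source": "db-1"}, {"source": "db-2"}],
    generate_initial=lambda dbs: [{"from": d["source"]} for d in dbs],
    generate_target=lambda graphs: graphs,
    display=shown.extend,
)
ran = apparatus.run()
print(ran, shown)
```

Passing the units in as callables keeps each one replaceable, which mirrors the statement that the units correspond one-to-one to the method steps.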
Referring now to FIG. 5, there is illustrated a schematic diagram of a computer system 500 suitable for use in implementing the terminal device of an embodiment of the present disclosure. The terminal device shown in fig. 5 is only one example, and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
As shown in fig. 5, the computer system 500 includes a central processing unit (CPU) 501, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 502 or a program loaded from a storage section 506 into a random access memory (RAM) 503. The RAM 503 also stores various programs and data required for the operation of the system 500. The CPU 501, ROM 502, and RAM 503 are connected to each other through a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.
The following components are connected to the I/O interface 505: a storage section 506 including a hard disk or the like; and a communication section 507 including a network interface card such as a LAN (local area network) card, a modem, or the like. The communication section 507 performs communication processing via a network such as the Internet. A drive 508 is also connected to the I/O interface 505 as needed. A removable medium 509, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 508 as needed, so that a computer program read from it can be installed into the storage section 506 as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer-readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network via the communication section 507 and/or installed from the removable medium 509. When the computer program is executed by the central processing unit (CPU) 501, the above-described functions defined in the method of the present disclosure are performed. It should be noted that the computer-readable medium described in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
In the present disclosure, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The foregoing description covers only the preferred embodiments of the present disclosure and explains the technical principles employed. It will be appreciated by those skilled in the art that the scope of the invention referred to in this disclosure is not limited to the specific combination of features described above, but also encompasses other embodiments in which the features described above or their equivalents are combined in any way without departing from the spirit of the invention. One example is an embodiment formed by replacing the features described above with technical features having similar functions disclosed in (but not limited to) the present disclosure.

Claims (9)

1. A multi-source knowledge graph generation method, comprising:
detecting whether an operation authorization signal is received from target terminal equipment, wherein the operation authorization signal is generated by a user executing target operation on a target control;
in response to detecting the operation authorization signal, obtaining a medical database set, wherein the medical database set comprises a first number of medical databases, and the medical databases are structured medical text paragraph sets;
generating an initial knowledge-graph set based on the medical database set, wherein the initial knowledge-graph set comprises a second number of initial knowledge-graphs, the initial knowledge-graph represents a medical condition, and the initial knowledge-graph set represents a second number of medical conditions;
generating a target knowledge-graph set based on the initial knowledge-graph set, comprising: generating a candidate knowledge-graph set of the initial knowledge-graph for each initial knowledge-graph in the initial knowledge-graph set to obtain the candidate knowledge-graph set; for each initial knowledge graph in the initial knowledge graph set, carrying out fusion processing on the initial knowledge graph and a candidate knowledge graph set of the initial knowledge graph to generate a target knowledge graph so as to obtain the target knowledge graph set;
pushing the target knowledge graph set to target terminal equipment with a display function, and controlling the target terminal equipment to display the target knowledge graph set.
2. The method of claim 1, wherein the initial knowledge-graph comprises an initial data set, an initial relationship set and an initial weight set, the initial data is an initial node in the initial knowledge-graph, the initial relationship is an initial edge in the initial knowledge-graph, the initial edge represents a relationship between the initial nodes, and the initial weight is a cumulative number of connections between different initial nodes.
3. The method of claim 2, wherein the generating an initial knowledge-graph set based on the set of medical databases comprises:
generating an initial data set and an initial relation set based on the medical database set and a predetermined knowledge graph library, wherein the initial data set comprises a positive initial data set and a negative initial data set, the positive initial data represents information related to medical symptoms, and the negative initial data represents information irrelevant to the medical symptoms;
generating an initial weight set based on the initial data set, the initial relation set and a predetermined knowledge-graph library;
and generating the initial knowledge graph set according to the initial data set, the initial relation set and the initial weight set.
4. A method according to claim 3, wherein said generating a candidate knowledge-graph set of the initial knowledge-graph comprises:
determining the initial knowledge-graph set as a candidate knowledge-graph set;
and deleting the initial knowledge-graph from the candidate knowledge-graph set.
5. The method of claim 4, wherein the fusing the initial knowledge-graph with the candidate knowledge-graph set of the initial knowledge-graph to generate the target knowledge-graph includes:
determining a fusion threshold;
for each candidate knowledge graph in the candidate knowledge graph set of the initial knowledge graph, determining a fusion index of the candidate knowledge graph and the initial knowledge graph to obtain a fusion index set;
for each fusion index in the fusion index set, responding to the fusion index not smaller than the fusion threshold value, and updating the initial knowledge graph according to the entity set and the attribute set in the candidate knowledge graph corresponding to the fusion index;
and updating the initial knowledge graph to obtain the target knowledge graph.
6. The method of claim 5, wherein the fusing the initial knowledge-graph with the candidate knowledge-graph set of the initial knowledge-graph to generate a target knowledge-graph further comprises:
and for each fusion index in the fusion index set, responding to the fusion index being smaller than the fusion threshold, and merging the candidate knowledge graph corresponding to the fusion index into the initial knowledge graph to obtain the target knowledge graph.
7. The method of claim 6, wherein the determining, for each candidate knowledge-graph in the candidate knowledge-graph set of the initial knowledge-graph, a fusion indicator of the candidate knowledge-graph and the initial knowledge-graph includes:
generating a candidate first vector of the candidate knowledge-graph based on the candidate data set of the candidate knowledge-graph;
generating a candidate second vector of the candidate knowledge graph based on the candidate relation set and the candidate weight set of the candidate knowledge graph;
generating an initial first vector of the initial knowledge-graph based on the initial data set of the initial knowledge-graph;
generating an initial second vector of the initial knowledge-graph based on the initial relation set and the initial weight set of the initial knowledge-graph;
and determining a fusion index of the candidate knowledge graph and the initial knowledge graph based on the candidate first vector, the initial first vector, the candidate second vector and the initial second vector.
8. A multi-source knowledge graph generation device, comprising:
a detection unit configured to detect whether an operation authorization signal is received from a target terminal device, wherein the operation authorization signal is a signal generated by a user executing a target operation on a target control;
a receiving unit configured to obtain a set of medical databases in response to detecting the operation authorization signal, wherein the set of medical databases comprises a first number of medical databases;
a first generation unit configured to generate an initial knowledge-graph set based on the medical database set, wherein the initial knowledge-graph set comprises a second number of initial knowledge-graphs;
a second generation unit configured to generate a target knowledge-graph set based on the initial knowledge-graph set, including: generating a candidate knowledge-graph set of the initial knowledge-graph for each initial knowledge-graph in the initial knowledge-graph set to obtain the candidate knowledge-graph set; for each initial knowledge graph in the initial knowledge graph set, carrying out fusion processing on the initial knowledge graph and a candidate knowledge graph set of the initial knowledge graph to generate a target knowledge graph so as to obtain the target knowledge graph set;
a control unit configured to push the target knowledge-graph set to target terminal equipment with a display function and to control the target terminal equipment to display the target knowledge-graph set.
9. A first terminal device, comprising:
one or more processors;
a storage device having one or more programs stored thereon;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110457283.2A CN113220896B (en) 2021-04-27 2021-04-27 Multi-source knowledge graph generation method, device and terminal equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110457283.2A CN113220896B (en) 2021-04-27 2021-04-27 Multi-source knowledge graph generation method, device and terminal equipment

Publications (2)

Publication Number Publication Date
CN113220896A CN113220896A (en) 2021-08-06
CN113220896B true CN113220896B (en) 2024-03-19

Family

ID=77089647

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110457283.2A Active CN113220896B (en) 2021-04-27 2021-04-27 Multi-source knowledge graph generation method, device and terminal equipment

Country Status (1)

Country Link
CN (1) CN113220896B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114400099A (en) * 2021-12-31 2022-04-26 北京华彬立成科技有限公司 Disease information mining and searching method and device, electronic equipment and storage medium

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106933994A (en) * 2017-02-27 2017-07-07 广东省中医院 A kind of core disease card relation construction method based on knowledge of TCM collection of illustrative plates
CN108447534A (en) * 2018-05-18 2018-08-24 灵玖中科软件(北京)有限公司 A kind of electronic health record data quality management method based on NLP
CN109166622A (en) * 2018-08-20 2019-01-08 重庆柚瓣家科技有限公司 The disease of knowledge based map examines system in advance
CN109241257A (en) * 2018-08-20 2019-01-18 重庆柚瓣家科技有限公司 A kind of the wisdom question answering system and its method of knowledge based map
CN109255035A (en) * 2018-08-31 2019-01-22 北京字节跳动网络技术有限公司 Method and apparatus for constructing knowledge mapping
CN110609910A (en) * 2019-09-18 2019-12-24 金色熊猫有限公司 Medical knowledge graph construction method and device, storage medium and electronic equipment
CN110782996A (en) * 2019-09-18 2020-02-11 平安科技(深圳)有限公司 Construction method and device of medical database, computer equipment and storage medium
CN110866124A (en) * 2019-11-06 2020-03-06 北京诺道认知医学科技有限公司 Medical knowledge graph fusion method and device based on multiple data sources
CN111061841A (en) * 2019-12-19 2020-04-24 京东方科技集团股份有限公司 Knowledge graph construction method and device
CN111274806A (en) * 2020-01-20 2020-06-12 医惠科技有限公司 Method and device for recognizing word segmentation and part of speech and method and device for analyzing electronic medical record
CN111291163A (en) * 2020-03-09 2020-06-16 西南交通大学 Disease knowledge graph retrieval method based on symptom characteristics
CN111708873A (en) * 2020-06-15 2020-09-25 腾讯科技(深圳)有限公司 Intelligent question answering method and device, computer equipment and storage medium
CN111986765A (en) * 2020-09-03 2020-11-24 平安国际智慧城市科技股份有限公司 Electronic case entity marking method, device, computer equipment and storage medium
CN112667773A (en) * 2020-12-23 2021-04-16 医渡云(北京)技术有限公司 Data acquisition method based on knowledge graph and related equipment

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7401057B2 (en) * 2002-12-10 2008-07-15 Asset Trust, Inc. Entity centric computer system
US10503791B2 (en) * 2017-09-04 2019-12-10 Borislav Agapiev System for creating a reasoning graph and for ranking of its nodes

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
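Knowledge graph analysis of traditional Chinese medicine treatment of pediatric anorexia; 张稳; 魏小维; 西部中医药 (04); pp. 71-74 *
Citation and hotspot analysis of research literature on bioinformatics databases; 王蕊; 胡德华; 生物信息学 (04); pp. 75-82 *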

Also Published As

Publication number Publication date
CN113220896A (en) 2021-08-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant