CN113220896A - Multi-source knowledge graph generation method and device and terminal equipment - Google Patents


Publication number
CN113220896A
Authority
CN
China
Prior art keywords
knowledge
initial
graph
target
candidate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110457283.2A
Other languages
Chinese (zh)
Other versions
CN113220896B (en)
Inventor
林玥煜
邓侃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing RxThinking Ltd
Original Assignee
Beijing RxThinking Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing RxThinking Ltd
Priority to CN202110457283.2A
Publication of CN113220896A
Application granted
Publication of CN113220896B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H15/00 ICT specially adapted for medical reports, e.g. generation or transmission thereof
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H40/00 ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices
    • G16H40/20 ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the management or administration of healthcare resources or facilities, e.g. managing hospital staff or surgery rooms

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Business, Economics & Management (AREA)
  • Public Health (AREA)
  • Primary Health Care (AREA)
  • Medical Informatics (AREA)
  • Epidemiology (AREA)
  • General Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Animal Behavior & Ethology (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

Embodiments of the present disclosure disclose a multi-source knowledge graph generation method, a multi-source knowledge graph generation apparatus, and a terminal device. One embodiment of the method comprises: detecting whether an operation authorization signal is received from a target terminal device; in response to detecting the operation authorization signal, obtaining a medical database set; generating an initial knowledge graph set based on the medical database set, wherein the initial knowledge graph set comprises a second number of initial knowledge graphs, each initial knowledge graph characterizes a medical condition, and the initial knowledge graph set characterizes the second number of medical conditions; generating a target knowledge graph set; and pushing the target knowledge graph set to a target terminal device having a display function, and controlling the target terminal device to display the target knowledge graph set. In this embodiment, a target knowledge graph set representing different medical conditions is constructed from the multi-source medical information in the medical database set, so that the medical information is used effectively and the completeness and accuracy of the target knowledge graph set are improved.

Description

Multi-source knowledge graph generation method and device and terminal equipment
Technical Field
The embodiment of the disclosure relates to the technical field of computers, in particular to a knowledge graph generation method, a knowledge graph generation device and terminal equipment.
Background
As living standards rise, attention to medical health increases year by year, and more attention is being paid to how to mine the medical information contained in medical records and diagnosis results. Meanwhile, the level of medical research in China continues to improve; medical researchers produce a large number of research documents every year, and these documents also contain rich professional medical knowledge. Processing this knowledge into structured information with text mining technology can greatly advance the informatization of medical knowledge. The rapid development of natural language processing makes it possible to automatically extract medical entities, and the relationships between them, from cases and documents. The extracted medical knowledge can be used to construct a medical knowledge graph and to promote the intelligent development of medicine.
However, when a knowledge graph is generated from cases, examination reports, medical literature, and the like, the following technical problems often arise:
First, multi-source medical databases such as cases, examination reports, and medical literature contain different medical knowledge data nodes and symptom attribute nodes, and the relationships between nodes are represented by the edges in the knowledge graph, so the relationship weights between symptoms and medical knowledge data cannot be reflected and the knowledge graph cannot accurately represent medical information.
Second, the diversity of the sources from which the knowledge graph is constructed causes problems such as duplicated knowledge, uneven quality, and inaccurate associations in the knowledge graph; and generating the initial knowledge graph set directly from the medical database set suffers from problems such as insufficient node matching and inaccurate calculation of the relationship weights between nodes.
Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
Some embodiments of the present disclosure provide a multi-source knowledge graph generation method, apparatus, and terminal device to solve one or more of the technical problems mentioned in the above background.
In a first aspect, some embodiments of the present disclosure provide a multi-source knowledge graph generation method, the method comprising: detecting whether an operation authorization signal is received from a target terminal device; in response to detecting the operation authorization signal, obtaining a medical database set; generating an initial knowledge graph set based on the medical database set, wherein the initial knowledge graph set comprises a second number of initial knowledge graphs, each initial knowledge graph characterizes a medical condition, and the initial knowledge graph set characterizes the second number of medical conditions; generating a target knowledge graph set; and pushing the target knowledge graph set to a target terminal device having a display function, and controlling the target terminal device to display the target knowledge graph set.
In some embodiments, the determining, for each candidate knowledge-graph of the set of candidate knowledge-graphs of the initial knowledge-graph, a fusion indicator of the candidate knowledge-graph and the initial knowledge-graph comprises:
generating a candidate first vector of the candidate knowledge graph based on the entity set and the attribute set of the candidate knowledge graph;
generating a candidate second vector of the candidate knowledge graph based on the set of relationships of the candidate knowledge graph;
generating an initial first vector of the initial knowledge graph based on the entity set and the attribute set of the initial knowledge graph;
generating an initial second vector of the initial knowledge-graph based on the set of relationships of the initial knowledge-graph;
calculating a fusion index of the candidate knowledge graph and the initial knowledge graph by using the following formula based on the candidate first vector, the initial first vector, the candidate second vector and the initial second vector:
[Formula image BDA0003040923530000021; not reproduced in the text]
where $v_{n1}$ denotes the candidate first vector, $v_{n2}$ denotes the initial first vector, $v_{r1}$ denotes the candidate second vector, $v_{r2}$ denotes the initial second vector, $\cos(\cdot,\cdot)$ denotes taking the cosine value, $\gamma$ is a blending parameter that may be any integer, and $\mathrm{sim}$ denotes the fusion index.
In a second aspect, some embodiments of the present disclosure provide a multi-source knowledge graph generation apparatus, the apparatus comprising: a detection unit configured to detect whether an operation authorization signal is received from a target terminal device, wherein the operation authorization signal is a signal generated when a user performs a target operation on a target control; a receiving unit configured to obtain a medical database set in response to detecting the operation authorization signal, wherein the medical database set comprises a first number of medical databases; a first generation unit configured to generate an initial knowledge graph set based on the medical database set, wherein the initial knowledge graph set comprises a second number of initial knowledge graphs; a second generation unit configured to generate a target knowledge graph set based on the initial knowledge graph set; and a control unit configured to push the target knowledge graph set to a target terminal device having a display function and to control the target terminal device to display the target knowledge graph set.
In a third aspect, some embodiments of the present disclosure provide a terminal device, including: one or more processors; a storage device having one or more programs stored thereon which, when executed by one or more processors, cause the one or more processors to implement a method as in any one of the first aspects.
The above embodiments of the present disclosure have the following beneficial effects: with the multi-source knowledge graph generation method of some embodiments, a target knowledge graph set representing different medical conditions can be constructed from the multi-source medical information in the medical database set, so that the medical information is used effectively and the completeness and accuracy of the target knowledge graph set are improved. Specifically, the inventors found that the reason for the poor completeness and accuracy of current medical knowledge graphs is as follows: multi-source medical databases such as cases, examination reports, and medical literature contain different medical knowledge data nodes and symptom attribute nodes, and the relationships between nodes are represented by the edges in the knowledge graph, so the relationship weights between symptoms and medical knowledge data cannot be reflected and the knowledge graph cannot accurately represent medical information. Based on this, first, some embodiments of the present disclosure obtain a medical database set, wherein the medical database set comprises a first number of medical databases and each medical database is a set of structured medical text paragraphs. Second, an initial knowledge graph set is generated based on the medical database set, wherein the initial knowledge graph set comprises a second number of initial knowledge graphs, each initial knowledge graph characterizes a medical condition, and the initial knowledge graph set characterizes the second number of medical conditions. Specifically, the initial knowledge graphs correspond one-to-one to medical conditions. Then, a target knowledge graph set is generated based on the initial knowledge graph set. Specifically, knowledge fusion processing is performed on the initial knowledge graph set to obtain the target knowledge graph set. Because the initial knowledge graphs are generated with the medical condition as the dimension, information in medical knowledge bases from different sources can be used effectively; the initial knowledge graphs are then fused to obtain the target knowledge graph set, which preserves the completeness of the multi-source information and improves the accuracy of the target knowledge graph set for the corresponding medical conditions.
Drawings
The above and other features, advantages and aspects of various embodiments of the present disclosure will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numbers refer to the same or similar elements. It should be understood that the drawings are schematic and that elements and features are not necessarily drawn to scale.
FIG. 1 is an architectural diagram of an exemplary system in which some embodiments of the present disclosure may be applied;
FIG. 2 is a flow diagram of some embodiments of a multi-source knowledge graph generation method according to the present disclosure;
FIG. 3 is an exemplary authorization prompt box;
FIG. 4 is a flow diagram of some embodiments of a multi-source knowledge-graph generating apparatus according to the present disclosure;
FIG. 5 is a schematic block diagram of a terminal device suitable for use in implementing some embodiments of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.
It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings. The embodiments and features of the embodiments in the present disclosure may be combined with each other without conflict.
It should be noted that the terms "first", "second", and the like in the present disclosure are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence relationship of the functions performed by the devices, modules or units.
It is noted that references to "a", "an", and "the" modifications in this disclosure are intended to be illustrative rather than limiting, and that those skilled in the art will recognize that "one or more" may be used unless the context clearly dictates otherwise.
The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
FIG. 1 illustrates an exemplary system architecture 100 to which embodiments of the multi-source knowledge-graph generation method of the present disclosure may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have installed thereon various communication client applications, such as a knowledge graph generation application, an information generation application, a data analysis application, and the like.
The terminal devices 101, 102, 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various terminal devices having a display screen, including but not limited to smart phones, tablet computers, laptop computers, desktop computers, and the like. When the terminal devices 101, 102, 103 are software, they can be installed in the terminal devices listed above. They may be implemented as multiple pieces of software or software modules (e.g., for providing medical database set input) or as a single piece of software or software module. No specific limitation is made here.
The server 105 may be a server that provides various services, such as a server that stores a set of medical databases input by the terminal devices 101, 102, 103, and the like. The server may process the received set of medical databases and feed back the processing results (e.g., the target set of knowledge-maps) to the terminal device.
It should be noted that the multi-source knowledge graph generating method provided by the embodiment of the present disclosure may be executed by the server 105 or by the terminal device.
It should be noted that the server 105 may also store the medical database set locally, and the server 105 may directly extract the local medical database set and process it to obtain the target information set. In this case, the exemplary system architecture 100 may not include the terminal devices 101, 102, 103 and the network 104.
It should be noted that the terminal apparatuses 101, 102, and 103 may also have a multi-source knowledge-graph generation application installed therein, and in this case, the processing method may also be executed by the terminal apparatuses 101, 102, and 103. At this point, the exemplary system architecture 100 may also not include the server 105 and the network 104.
The server 105 may be hardware or software. When the server 105 is hardware, it may be implemented as a distributed server cluster composed of a plurality of servers, or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (e.g., to provide a knowledge graph generation service), or as a single piece of software or software module. No specific limitation is made here.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to fig. 2, a flow 200 of some embodiments of a multi-source knowledge-graph generation method in accordance with the present disclosure is shown. The multi-source knowledge graph generation method comprises the following steps:
Step 201, detecting whether an operation authorization signal is received from a target terminal device.
In some embodiments, an execution body (e.g., the server shown in FIG. 1) of the multi-source knowledge graph generation method detects whether an operation authorization signal is received from a target terminal device. The operation authorization signal is a signal generated when a user performs a target operation on a target control. The target terminal device may be a terminal device on which an account corresponding to the user is logged in. The terminal device may be a mobile phone or a computer. The target control may be contained in an authorization prompt box, which may be displayed on the target terminal device. The target control may be a "confirm button".
Step 202, in response to detecting the operation authorization signal, obtaining a medical database set.
In some embodiments, the execution body (e.g., the server shown in FIG. 1) of the multi-source knowledge graph generation method may obtain a medical database set input by a user in response to detecting the operation authorization signal. The operation authorization signal may be a signal generated when the user corresponding to the medical database set performs a target operation on the target control. The target control may be contained in an authorization prompt box, which may be displayed on the target terminal device. The target terminal device may be a terminal device on which an account corresponding to the user is logged in. The terminal device may be a mobile phone or a computer. The target operation may be a "click operation" or a "slide operation". The target control may be a "confirm button".
As an example, the authorization prompt box described above may be as shown in fig. 3. The authorization prompt box may include: a prompt information display section 301 and a control 302. The prompt information display section 301 may be configured to display prompt information. The prompt may be "whether to allow the acquisition of the medical database collection". The control 302 may be a "confirm button" or a "cancel button".
Optionally, the medical database set comprises a first number of medical databases, wherein each medical database is a set of structured medical text paragraphs. In particular, the types of medical database included in the medical database set may include, but are not limited to: medical books, medical dictionaries, medical literature, expert discussion data, and electronic medical records. The medical information in medical dictionaries and electronic medical records is structured and includes, but is not limited to: demographic information, laboratory reports, diagnostic results, prescriptions, and medical orders. The medical information in medical books, medical papers, and expert discussion materials is unstructured and consists mainly of paragraphs written in natural language. The medical literature may be a textbook, a clinical guideline, or a drug instruction sheet; it may be a medical treatise given legal authority; and it may also draw on medical papers to incorporate the latest medical research results. As an example, a structured medical text paragraph may be: [{"breast ultrasound": {"findings": {"echogenic mass": {"position": "left breast", "nature": "solid, mixed"}}, "conclusion": "nodular breast mass"}}, {"mental state": "good", "physical strength": "good", "appetite": "good", "food intake": "good", "body weight": "unchanged", "stool": "normal", "urination": "normal", "time since discovery": "1 year"}].
Step 203, generating an initial knowledge graph set based on the medical database set.
In some embodiments, the execution body generates the initial knowledge graph set based on the medical database set, wherein the initial knowledge graph set comprises a second number of initial knowledge graphs. Each initial knowledge graph characterizes a medical condition, and the initial knowledge graph set characterizes the second number of medical conditions.
Optionally, an initial data set and an initial relationship set are generated based on the medical database set and a predetermined knowledge graph library. The initial data set comprises a positive initial data set, which characterizes information related to the medical condition, and a negative initial data set, which characterizes information unrelated to the medical condition. For example, the initial data set may include "mental state": "good", "physical strength": "good", "appetite": "good", "age": "greater than 65 years", "symptom manifestation": "cough", and "gender": "female". Here, "gender": "female" may be negative initial data and "symptom manifestation": "cough" may be positive initial data. The initial relationship set includes the correspondences between initial data. For "symptom manifestation": "cough" and "symptom manifestation": "fracture", a fracture is not a cause of cough, so there is no initial relationship between the two. For "symptom manifestation": "cough" and "symptom manifestation": "fever", the two are causally related, so there is an initial relationship between them. The predetermined knowledge graph library may be a knowledge graph library compiled from medical literature and data. The predetermined nodes in the library may be medical conditions, condition attributes, examination items, drugs, body sites, and surgical procedures compiled from medical literature and data. The predetermined edges in the library may be relationships between nodes compiled from medical literature and data. A predetermined edge weight is the cumulative number of connections between different nodes. Specifically, if a predetermined edge weight is "0", there is no relationship between the two predetermined nodes corresponding to that edge; if a predetermined edge weight is "1", there is an association between the two predetermined nodes corresponding to that edge. The predetermined knowledge graph library may include a third number of predetermined knowledge graphs. Each predetermined knowledge graph may correspond to one condition, so the library includes predetermined knowledge graphs corresponding to the third number of conditions. The relationships between the initial data in the initial data set may be determined from the predetermined knowledge graph library so as to update the initial relationship set. For example, for "symptom manifestation": "cough" and "symptom manifestation": "fracture", a lookup in the predetermined knowledge graph library shows that a fracture may induce infection, which in turn may lead to cough. "Symptom manifestation": "cough" and "symptom manifestation": "fracture" can therefore be associated through "symptom manifestation": "fever".
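The library lookup described above, in which two initial data items become associated through an intermediate node, might be sketched as follows. This is only an illustration: the function name `indirect_link` and the edge-list representation of the predetermined library are assumptions, since the disclosure does not specify a data structure.

```python
def indirect_link(library_edges, a, b):
    """Return a node adjacent to both a and b in the library, or None."""
    neighbours = {}
    for x, y in library_edges:
        neighbours.setdefault(x, set()).add(y)
        neighbours.setdefault(y, set()).add(x)
    shared = neighbours.get(a, set()) & neighbours.get(b, set())
    # min() only makes the choice deterministic when several nodes qualify.
    return min(shared) if shared else None


# Hypothetical library fragment mirroring the fracture / fever / cough example.
library_edges = [("fracture", "fever"), ("fever", "cough")]
bridge = indirect_link(library_edges, "cough", "fracture")
```

With this fragment, "cough" and "fracture" are connected through "fever", so a relationship between them could be added to the initial relationship set.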
Optionally, an initial weight set is generated based on the initial data set, the initial relationship set, and the predetermined knowledge graph library. Specifically, the initial weight set is first generated from the initial relationship set: if there is no initial relationship between two initial data, the initial weight between them is 0; if there is an initial relationship between two initial data, the initial weight between them is 1. The initial weight set may then be updated according to the predetermined knowledge graph library to obtain the updated initial weight set: in response to an initial relationship being found in the predetermined knowledge graph library, 1 is added to the initial weight corresponding to that initial relationship.
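A minimal sketch of this weighting rule is given below. It assumes relationships are undirected and stored as `frozenset` pairs; the function name `initial_weights` and the sample data are hypothetical and not taken from the disclosure.

```python
from itertools import combinations


def initial_weights(data, relations, library_relations):
    """Weight 0 without a relationship, 1 with one, +1 if the library confirms it."""
    weights = {}
    for a, b in combinations(sorted(data), 2):
        pair = frozenset((a, b))
        weight = 1 if pair in relations else 0
        if weight and pair in library_relations:
            weight += 1  # the relationship was also found in the predetermined library
        weights[pair] = weight
    return weights


# Hypothetical inputs for illustration only.
data = {"cough", "fever", "fracture"}
relations = {frozenset(("cough", "fever")), frozenset(("fever", "fracture"))}
library_relations = {frozenset(("cough", "fever"))}
weights = initial_weights(data, relations, library_relations)
```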
Optionally, an initial knowledge graph set is generated according to the initial data set, the initial relationship set, and the initial weight set. Specifically, the initial data set is determined as an initial node set in the initial knowledge-graph set, and the initial relationship set is determined as an initial edge set in the initial knowledge-graph set. Weights for an initial edge in the initial knowledge-graph are determined from the initial set of weights.
Optionally, the initial knowledge-graph includes an initial data set, an initial relationship set, and an initial weight set. The initial data is initial nodes in the initial knowledge graph, the initial relation is initial edges in the initial knowledge graph, the initial edges represent the relation between the initial nodes, and the weight of the initial edges is the accumulated connection times between different initial nodes.
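The assembly of an initial knowledge graph from the three sets can be illustrated with a small container type. The class `KnowledgeGraph`, the function `build_initial_graph`, and the condition name "pneumonia" are assumptions introduced for this sketch; the disclosure only states that data become nodes, relationships become edges, and weights are attached to the edges.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeGraph:
    condition: str                              # the medical condition this graph characterizes
    nodes: set = field(default_factory=set)     # initial data
    edges: dict = field(default_factory=dict)   # frozenset({a, b}) -> edge weight


def build_initial_graph(condition, data, relations, weights):
    """Turn an initial data set, relationship set, and weight set into one graph."""
    graph = KnowledgeGraph(condition=condition, nodes=set(data))
    for pair in relations:
        graph.edges[pair] = weights.get(pair, 0)
    return graph


pair = frozenset(("cough", "fever"))
graph = build_initial_graph("pneumonia", {"cough", "fever"}, {pair}, {pair: 2})
```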
Step 204, generating a target knowledge graph set based on the initial knowledge graph set.
In some embodiments, the execution body generates the target knowledge graph set based on the initial knowledge graph set.
Optionally, for each initial knowledge graph in the initial knowledge graph set, a candidate knowledge graph set of that initial knowledge graph is generated, yielding the candidate knowledge graph sets. Optionally, for each initial knowledge graph in the initial knowledge graph set, the initial knowledge graph set is taken as the candidate knowledge graph set, and the initial knowledge graph itself is deleted from that candidate knowledge graph set.
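Read this way, the candidate set of each initial knowledge graph is simply every other initial knowledge graph. A short sketch, with graphs identified by hypothetical condition names:

```python
def candidate_sets(initial_graphs):
    """For each graph, the candidates are all initial graphs except itself."""
    return {
        graph: [other for other in initial_graphs if other != graph]
        for graph in initial_graphs
    }


candidates = candidate_sets(["condition_a", "condition_b", "condition_c"])
```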
Optionally, for each initial knowledge graph in the initial knowledge graph set, the initial knowledge graph and the candidate knowledge graph set of the initial knowledge graph are subjected to fusion processing to generate a target knowledge graph, so as to obtain a target knowledge graph set. Optionally, a fusion threshold is determined. Specifically, the fusion threshold may be a threshold for controlling the fusion of the knowledge base determined according to medical information processing experience. The fusion threshold may be a positive integer.
Optionally, for each candidate knowledge graph in the candidate knowledge graph set of the initial knowledge graph, determining a fusion index of the candidate knowledge graph and the initial knowledge graph, and performing the following generation steps to obtain a fusion index set.
Generation step: a candidate first vector of the candidate knowledge graph is generated based on the candidate data set of the candidate knowledge graph. Specifically, text vectorization may be performed using one-hot encoding to generate the candidate first vector. One-hot encoding uses a status register to encode states; each state has its own independent register bit, and only one bit is active at any one time. For example, six candidate data are one-hot encoded: "headache", "fever", "cough", "mental retardation", "vomiting" and "syncope". The candidate first vectors after one-hot encoding are 000001, 000010, 000100, 001000, 010000 and 100000, respectively.
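The one-hot example above can be reproduced with a few lines of code. The function name `one_hot` is an assumption; the bit order (first term sets the rightmost bit) is chosen to match the codes listed in the text.

```python
def one_hot(terms):
    """Map each term to a bit string containing exactly one '1'."""
    n = len(terms)
    codes = {}
    for i, term in enumerate(terms):
        bits = ["0"] * n
        bits[n - 1 - i] = "1"  # the first term sets the rightmost bit
        codes[term] = "".join(bits)
    return codes


terms = ["headache", "fever", "cough", "mental retardation", "vomiting", "syncope"]
codes = one_hot(terms)
```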
And generating a candidate second vector of the candidate knowledge graph based on the candidate relation set and the candidate weight set of the candidate knowledge graph. Specifically, the text vectorization process may be performed using a one-hot encoding method to generate the candidate second vectors. An initial first vector of the initial knowledge-graph is generated based on the initial set of data of the initial knowledge-graph. Specifically, a text vectorization process may be performed using a one-hot encoding method to generate an initial first vector. An initial second vector of the initial knowledge-graph is generated based on the initial set of relationships and the initial set of weights of the initial knowledge-graph. Specifically, a text vectorization process may be performed using a one-hot encoding method to generate an initial second vector.
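The one-hot encoding of the six example data items can be reproduced with the short sketch below. The helper name one_hot is hypothetical; the bit order (first item in the lowest bit) is chosen to match the codes listed in the example above.

```python
def one_hot(items):
    """Assign each item an N-bit code with exactly one bit set.

    The i-th item gets bit i (counted from the right), so the first
    item maps to 0...01 and the last item maps to 10...0.
    """
    n = len(items)
    return {item: format(1 << i, f"0{n}b") for i, item in enumerate(items)}


symptoms = ["headache", "fever", "cough",
            "mental retardation", "vomiting", "syncope"]
codes = one_hot(symptoms)
for name in symptoms:
    print(name, codes[name])
```

How the per-item codes are combined into a single vector for a whole knowledge graph is not spelled out in this passage, so the sketch stops at the per-item codes.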
Optionally, the fusion index of the candidate knowledge graph and the initial knowledge graph is calculated by using the following formula:
Figure BDA0003040923530000101
where v_n1 denotes the candidate first vector, v_n2 denotes the initial first vector, v_r1 denotes the candidate second vector, and v_r2 denotes the initial second vector; cos(·,·) denotes the cosine of the angle between two vectors; γ is the harmonic parameter, which may be any integer; and sim denotes the fusion index.
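Because the formula itself survives only as a figure reference, the following sketch should be read as an illustration under a stated assumption, not as the formula of the disclosure. It assumes that the fusion index adds the cosine similarity of the two first vectors to γ times the cosine similarity of the two second vectors; the actual combination may differ. The function names cosine and fusion_index are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


def fusion_index(v_n1, v_n2, v_r1, v_r2, gamma=1):
    """ASSUMED form: sim = cos(v_n1, v_n2) + gamma * cos(v_r1, v_r2).

    The real formula is in a figure that is not reproduced in the text.
    """
    return cosine(v_n1, v_n2) + gamma * cosine(v_r1, v_r2)


# Identical node vectors and orthogonal relation vectors.
sim = fusion_index([1, 1, 0], [1, 1, 0], [1, 0], [0, 1], gamma=1)
```

Under this assumed form, γ controls how strongly agreement between relations and weights counts relative to agreement between nodes.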
Optionally, for each fusion index in the fusion index set, in response to the fusion index being not less than the fusion threshold, the initial knowledge graph is updated according to the entity set and the attribute set in the candidate knowledge graph corresponding to the fusion index.
Optionally, for each fusion index in the fusion index set, in response to the fusion index being smaller than the fusion threshold, the candidate knowledge graph corresponding to the fusion index is merged into the initial knowledge graph to obtain the target knowledge graph.
The updated initial knowledge graph is taken as the target knowledge graph. Specifically, the candidate knowledge graphs are merged according to the fusion result to obtain the target knowledge graph.
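The two threshold branches described above can be sketched as a simple partition of the candidates. The function name plan_fusion, the candidate identifiers, and the example index values are hypothetical; the sketch only decides which branch applies to each candidate and does not implement the update or merge operations, whose details are not given here.

```python
def plan_fusion(fusion_indices, threshold):
    """Split candidate ids by branch.

    fusion_indices maps a candidate id to its fusion index.
    Returns (update_ids, merge_ids):
      - update_ids: index >= threshold, so the initial graph is updated
        from the candidate's entity set and attribute set;
      - merge_ids: index < threshold, so the candidate is merged into
        the initial graph.
    """
    update_ids, merge_ids = [], []
    for cand_id, sim in fusion_indices.items():
        (update_ids if sim >= threshold else merge_ids).append(cand_id)
    return update_ids, merge_ids


# Hypothetical fusion indices; the threshold is a positive integer.
update_ids, merge_ids = plan_fusion({"a": 1.2, "b": 0.4, "c": 1.0}, threshold=1)
```

A fusion index equal to the threshold falls in the first branch, matching the "not less than" wording.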
The optional content in step 203 and step 204, namely generating the target knowledge graph set through fusion processing, is an inventive point of the embodiments of the present disclosure. It addresses technical problem two mentioned in the background: "because the sources used to construct knowledge graphs are diverse, knowledge in the knowledge graph is duplicated, its quality is uneven, and its associations are unclear; and when the initial knowledge graph set is generated directly from the medical database set, node matching is insufficient and the relationship weights between nodes are calculated inaccurately." The factor that causes insufficient node matching and inaccurate relationship weights in the knowledge graph is as follows: the quality of the data in the medical database set used to construct the knowledge graph is uneven, which affects the accuracy of node matching and of the node relationship weight calculation. Addressing this factor improves the matching degree. To achieve this effect, the present disclosure introduces a fusion processing method. First, an initial knowledge graph set is generated based on the medical database set. The nodes in an initial knowledge graph comprise positive initial data and negative initial data, so that the medical information corresponding to the symptoms of a disease is represented more comprehensively. Then, a target knowledge graph set is generated from the initial knowledge graph set by the fusion method. The similarity between a candidate knowledge graph and the initial knowledge graph is judged against the empirically determined fusion threshold so as to generate the target knowledge graph set.
Through the fusion processing, all information in the candidate knowledge graphs can be utilized; at the same time, similarity is judged based on the proposed fusion formula, so that the matching accuracy can be improved and technical problem two is solved.
Step 205, pushing the target knowledge graph set to a target device with a display function, and controlling the target device to display the target knowledge graph set.
In some embodiments, the execution subject pushes the target knowledge graph set to the target terminal device and controls the target terminal device to perform display-related operations. The target terminal device may be a device communicatively connected to the execution subject, and may perform display-related operations according to the received target knowledge graph set. For example, the target knowledge graph set output by the execution subject may be a set of knowledge graphs of diseases; specifically, a target knowledge graph may be a knowledge graph of lung cancer or a knowledge graph of hyperplasia of the mammary glands. The target terminal device may issue an alarm display signal to prompt further treatment of the disease. Through fusion processing, a target knowledge graph set with high matching accuracy can be generated to assist follow-up medical treatment.
The embodiment presented in fig. 2 has the following beneficial effects: whether an operation authorization signal is received from a target terminal device is detected; in response to detecting the operation authorization signal, a medical database set is obtained; an initial knowledge graph set is generated based on the medical database set, wherein the initial knowledge graph set comprises a second number of initial knowledge graphs, each initial knowledge graph characterizes a medical condition, and the initial knowledge graph set characterizes the second number of medical conditions; a target knowledge graph set is generated; and the target knowledge graph set is pushed to a target terminal device with a display function, and the target terminal device is controlled to display the target knowledge graph set. In this implementation, the target knowledge graph set used to represent different medical symptoms is constructed from the multi-source medical information in the medical database set, so that the medical information can be used effectively and the integrity and accuracy of the target knowledge graph set are improved.
With further reference to fig. 4, as an implementation of the methods shown in the above figures, the present disclosure provides some embodiments of a multi-source knowledge graph generation apparatus. These apparatus embodiments correspond to the method embodiments described above with reference to fig. 2, and the apparatus may be applied in various terminal devices.
As shown in fig. 4, the multi-source knowledge graph generation apparatus 400 of some embodiments includes: a detection unit 401, a receiving unit 402, a first generating unit 403, a second generating unit 404, and a control unit 405. The detection unit 401 is configured to detect whether an operation authorization signal is received from the target terminal device, where the operation authorization signal is a signal generated by a user performing a target operation on a target control. The receiving unit 402 is configured to acquire a medical database set in response to detecting the operation authorization signal, wherein the medical database set comprises a first number of medical databases. The first generating unit 403 is configured to generate an initial knowledge graph set based on the medical database set, wherein the initial knowledge graph set comprises a second number of initial knowledge graphs. The second generating unit 404 is configured to generate a target knowledge graph set based on the initial knowledge graph set. The control unit 405 is configured to push the target knowledge graph set to a target device having a display function, and control the target device to display the target knowledge graph set.
It will be understood that the units described in the apparatus 400 correspond to the respective steps of the method described with reference to fig. 2. Thus, the operations, features and resulting advantages described above with respect to the method also apply to the apparatus 400 and the units included therein, and are not described here again.
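The correspondence between the five units and the method steps can be pictured with the sketch below. It is an illustrative arrangement only: the class name KnowledgeGraphApparatus, the field names, and the stub functions are all hypothetical, and each unit is modelled as a plain callable rather than as the actual implementation.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional


@dataclass
class KnowledgeGraphApparatus:
    detect: Callable[[], bool]                           # detection unit 401
    receive: Callable[[], List[Any]]                     # receiving unit 402
    generate_initial: Callable[[List[Any]], List[Any]]   # first generating unit 403
    generate_target: Callable[[List[Any]], List[Any]]    # second generating unit 404
    push_and_display: Callable[[List[Any]], None]        # control unit 405

    def run(self) -> Optional[List[Any]]:
        """Chain the units in the order of the method steps."""
        if not self.detect():          # no operation authorization signal
            return None
        databases = self.receive()
        initial = self.generate_initial(databases)
        target = self.generate_target(initial)
        self.push_and_display(target)
        return target


# Stub units, for illustration only.
shown: List[Any] = []
apparatus = KnowledgeGraphApparatus(
    detect=lambda: True,
    receive=lambda: ["db_a", "db_b"],
    generate_initial=lambda dbs: [f"graph_from_{d}" for d in dbs],
    generate_target=lambda graphs: sorted(graphs),
    push_and_display=shown.extend,
)
result = apparatus.run()
```

Modelling each unit as an injected callable keeps the pipeline order explicit while leaving the internals of each unit open.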
Referring now to FIG. 5, shown is a block diagram of a computer system 500 suitable for use in implementing a terminal device of an embodiment of the present disclosure. The terminal device shown in fig. 5 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 5, the computer system 500 includes a Central Processing Unit (CPU) 501 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 502 or a program loaded from a storage section 506 into a Random Access Memory (RAM) 503. The RAM 503 also stores various programs and data necessary for the operation of the system 500. The CPU 501, ROM 502, and RAM 503 are connected to each other via a bus 504. An Input/Output (I/O) interface 505 is also connected to the bus 504.
The following components are connected to the I/O interface 505: a storage section 506 including a hard disk and the like; and a communication section 507 including a network interface card such as a LAN (Local Area Network) card, a modem, or the like. The communication section 507 performs communication processing via a network such as the Internet. A drive 508 is also connected to the I/O interface 505 as necessary. A removable medium 509 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory is mounted on the drive 508 as necessary, so that a computer program read from it is installed into the storage section 506 as necessary.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 507 and/or installed from the removable medium 509. The above-described functions defined in the method of the present disclosure are performed when the computer program is executed by a Central Processing Unit (CPU) 501. It should be noted that the computer readable medium in the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. 
In contrast, in the present disclosure, a computer-readable signal medium may include a propagated data signal with computer-readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++, and conventional procedural programming languages, such as the C language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The foregoing description is only exemplary of the preferred embodiments of the disclosure and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention in the present disclosure is not limited to the specific combination of the above-mentioned features, but also encompasses other embodiments in which any combination of the above-mentioned features or their equivalents is possible without departing from the inventive concept as defined above. For example, a technical solution may be formed by replacing the above features with (but not limited to) features having similar functions disclosed in this disclosure.

Claims (10)

1. A multi-source knowledge graph generation method, comprising:
detecting whether an operation authorization signal is received from a target terminal device, wherein the operation authorization signal is a signal generated by a user executing a target operation on a target control;
in response to detecting an operation authorization signal, obtaining a medical database set, wherein the medical database set comprises a first number of medical databases which are structured medical text paragraph sets;
generating an initial knowledge graph set based on the medical database set, wherein the initial knowledge graph set comprises a second number of initial knowledge graphs, each initial knowledge graph characterizes a medical condition, and the initial knowledge graph set characterizes the second number of medical conditions;
generating a target knowledge graph set based on the initial knowledge graph set;
pushing the target knowledge graph set to target terminal equipment with a display function, and controlling the target terminal equipment to display the target knowledge graph set.
2. The method of claim 1, wherein the initial knowledge-graph comprises an initial data set, an initial relationship set and an initial weight set, the initial data is an initial node in the initial knowledge-graph, the initial relationship is an initial edge in the initial knowledge-graph, the initial edge represents a relationship between the initial nodes, and the initial weight is a cumulative number of connections between different initial nodes.
3. The method of claim 2, wherein the generating an initial set of knowledge-maps based on the set of medical databases comprises:
generating an initial data set and an initial relationship set based on the medical database set and a predetermined knowledge graph library, wherein the initial data set comprises a positive initial data set and a negative initial data set, the positive initial data represents information related to the medical symptom, and the negative initial data represents information unrelated to the medical symptom;
generating an initial weight set based on the initial data set, the initial relationship set and the predetermined knowledge graph library;
and generating the initial knowledge graph set according to the initial data set, the initial relation set and the initial weight set.
4. The method of claim 3, wherein the generating a target set of knowledge-graphs based on the initial set of knowledge-graphs comprises:
for each initial knowledge graph in the initial knowledge graph set, generating a candidate knowledge graph set of the initial knowledge graph to obtain a candidate knowledge graph set;
and for each initial knowledge graph in the initial knowledge graph set, carrying out fusion processing on the initial knowledge graph and the candidate knowledge graph set of the initial knowledge graph to generate a target knowledge graph so as to obtain the target knowledge graph set.
5. The method of claim 4, wherein the generating the set of candidate knowledge-graphs of the initial knowledge-graph comprises:
determining the initial knowledge-graph set as a candidate knowledge-graph set;
deleting the initial knowledge-graph from the set of candidate knowledge-graphs.
6. The method of claim 5, wherein the fusing the initial knowledge-graph with the candidate knowledge-graph set of the initial knowledge-graph to generate the target knowledge-graph comprises:
determining a fusion threshold;
for each candidate knowledge graph in the candidate knowledge graph set of the initial knowledge graph, determining a fusion index of the candidate knowledge graph and the initial knowledge graph to obtain a fusion index set;
for each fusion index in the fusion index set, in response to the fusion index not being smaller than the fusion threshold, updating the initial knowledge graph according to the entity set and the attribute set in the candidate knowledge graph corresponding to the fusion index;
and updating the initial knowledge graph to obtain the target knowledge graph.
7. The method of claim 6, wherein the fusing the initial knowledge-graph with the candidate knowledge-graph set of the initial knowledge-graph to generate the target knowledge-graph further comprises:
and for each fusion index in the fusion index set, in response to the fusion index being smaller than the fusion threshold value, merging the candidate knowledge graph corresponding to the fusion index into the initial knowledge graph to obtain the target knowledge graph.
8. The method of claim 7, wherein determining, for each candidate knowledge-graph in the set of candidate knowledge-graphs of the initial knowledge-graph, a fusion indicator of the candidate knowledge-graph and the initial knowledge-graph comprises:
generating a candidate first vector for the candidate knowledge-graph based on the candidate data set for the candidate knowledge-graph;
generating a candidate second vector of the candidate knowledge graph based on the candidate relation set and the candidate weight set of the candidate knowledge graph;
generating an initial first vector of the initial knowledge-graph based on an initial data set of the initial knowledge-graph;
generating an initial second vector of the initial knowledge-graph based on the initial set of relationships and the initial set of weights of the initial knowledge-graph;
and determining a fusion index of the candidate knowledge-graph and the initial knowledge-graph based on the candidate first vector, the initial first vector, the candidate second vector and the initial second vector.
9. A multi-source knowledge graph generation apparatus comprising:
a detection unit configured to detect whether an operation authorization signal is received from a target terminal device, wherein the operation authorization signal is a signal generated by a user executing a target operation on a target control;
a receiving unit configured to acquire a medical database set in response to detecting an operation authorization signal, wherein the medical database set includes a first number of medical databases;
a first generating unit configured to generate an initial knowledge graph set based on the medical database set, wherein the initial knowledge graph set comprises a second number of initial knowledge graphs;
a second generating unit configured to generate a target knowledge graph set based on the initial knowledge graph set;
a control unit configured to push the target knowledge graph set to a target terminal device having a display function, and control the target terminal device to display the target knowledge graph set.
10. A first terminal device comprising:
one or more processors;
a storage device having one or more programs stored thereon;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-8.
CN202110457283.2A 2021-04-27 2021-04-27 Multi-source knowledge graph generation method, device and terminal equipment Active CN113220896B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110457283.2A CN113220896B (en) 2021-04-27 2021-04-27 Multi-source knowledge graph generation method, device and terminal equipment


Publications (2)

Publication Number Publication Date
CN113220896A true CN113220896A (en) 2021-08-06
CN113220896B CN113220896B (en) 2024-03-19

Family

ID=77089647

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110457283.2A Active CN113220896B (en) 2021-04-27 2021-04-27 Multi-source knowledge graph generation method, device and terminal equipment

Country Status (1)

Country Link
CN (1) CN113220896B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114400099A (en) * 2021-12-31 2022-04-26 北京华彬立成科技有限公司 Disease information mining and searching method and device, electronic equipment and storage medium





Also Published As

Publication number Publication date
CN113220896B (en) 2024-03-19


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant