AU2020104254A4 - Healthcare question answering (QA) method and system based on contextualized language model and knowledge embedding

Info

Publication number
AU2020104254A4
Authority
AU
Australia
Prior art keywords
healthcare
answer
candidate
question
embedding
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
AU2020104254A
Inventor
Kunhui LIN
Feng Luo
Xiaoli Wang
Qingfeng Wu
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen University
Original Assignee
Xiamen University
Application filed by Xiamen University
Publication of AU2020104254A4

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33: Querying
    • G06F16/332: Query formulation
    • G06F16/3329: Natural language query formulation or dialogue systems
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36: Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367: Ontology

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Public Health (AREA)
  • Medical Informatics (AREA)
  • Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Artificial Intelligence (AREA)
  • Primary Health Care (AREA)
  • General Health & Medical Sciences (AREA)
  • Epidemiology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The present disclosure discloses a healthcare question answering (QA) method and system based on a contextualized language model and knowledge embedding. The method includes: acquiring a healthcare question and a candidate healthcare answer in a healthcare answer database; using a trained contextualized language model to generate contextualized embedding of the healthcare question and contextualized embedding of the candidate healthcare answer; using a trained knowledge representation model to generate knowledge embedding of the healthcare question and knowledge embedding of the candidate healthcare answer; fusing the contextualized embedding and knowledge embedding of the healthcare question to obtain a feature representation of the healthcare question; fusing the contextualized embedding and knowledge embedding of the candidate healthcare answer to obtain a feature representation of the candidate healthcare answer; calculating a correlation of the candidate healthcare answer with the healthcare question according to the feature representation of the healthcare question and the feature representation of the candidate healthcare answer; and determining a healthcare answer to the healthcare question according to correlations of various candidate healthcare answers in the healthcare answer database with the healthcare question. The present disclosure can realize effective retrieval of answers and improve retrieval efficiency. 

Description

DRAWINGS

FIG. 1: Flowchart of the healthcare QA method, steps 101 to 110. After step 109, a decision box checks whether correlations have been calculated for all candidate healthcare answers in the healthcare answer database: if no, the flow returns to step 105; if yes, it proceeds to step 110.

HEALTHCARE QUESTION ANSWERING (QA) METHOD AND SYSTEM BASED ON CONTEXTUALIZED LANGUAGE MODEL AND KNOWLEDGE EMBEDDING

TECHNICAL FIELD

The present disclosure relates to the technical field of healthcare question answering (QA), and in particular, to a healthcare QA method and system based on a contextualized language model and knowledge embedding.

BACKGROUND

With the development of question answering (QA) platforms, many more users tend to obtain first-hand healthcare information through healthcare QA websites. The development of healthcare QA websites greatly facilitates people's lives and alleviates the difficulty of seeking adequate healthcare through traditional medical treatment. However, since the Internet is flooded with information resources, efficiently and accurately finding a relevant answer to a question remains a challenge for healthcare websites.

In terms of healthcare QA technology, finding a relevant answer to a question in massive text information is a huge challenge, and improving the performance of QA retrieval is still the focus of research in the field of healthcare QA. Although many QA retrieval methods have been used in healthcare QA, such as traditional text retrieval and QA retrieval methods based on deep learning, these methods still have great shortcomings in accuracy and other properties of healthcare QA. Traditional QA retrieval methods are based on keyword search and ignore the semantic information in QA. The QA retrieval methods based on deep learning utilize lexical, syntactic and semantic features, but ignore external knowledge information. In addition, most healthcare QA technologies are designed to answer factoid questions, but in reality users often ask very abstract, non-factoid questions. This makes the existing retrieval performance unsatisfactory.

SUMMARY

The present disclosure is intended to provide a healthcare QA method and system based on a contextualized language model and knowledge embedding for improving retrieval performance.

To achieve the above purpose, the present disclosure provides the following technical solutions.

A healthcare QA method based on a contextualized language model and knowledge embedding includes:

  acquiring a healthcare question;
  using a trained contextualized language model to generate contextualized embedding of the healthcare question;
  using a trained knowledge representation model to generate knowledge embedding of the healthcare question;
  fusing the contextualized embedding of the healthcare question and the knowledge embedding of the healthcare question to obtain a feature representation of the healthcare question;
  acquiring a healthcare answer in a healthcare answer database and marking the healthcare answer as a candidate healthcare answer;
  using the trained contextualized language model to generate contextualized embedding of the candidate healthcare answer;
  using the trained knowledge representation model to generate knowledge embedding of the candidate healthcare answer;
  fusing the contextualized embedding of the candidate healthcare answer and the knowledge embedding of the candidate healthcare answer to obtain a feature representation of the candidate healthcare answer;
  calculating a correlation of the candidate healthcare answer with the healthcare question according to the feature representation of the healthcare question and the feature representation of the candidate healthcare answer;
  repeating the steps of "acquiring a healthcare answer in a healthcare answer database and marking the healthcare answer as a candidate healthcare answer" to "calculating a correlation of the candidate healthcare answer with the healthcare question according to the feature representation of the healthcare question and the feature representation of the candidate healthcare answer" to obtain correlations of various candidate healthcare answers in the healthcare answer database with the healthcare question; and
  determining a healthcare answer to the healthcare question according to the correlations.

Optionally, the trained knowledge representation model is a knowledge representation model based on a knowledge graph.

Optionally, the knowledge representation model is trained by a method including: acquiring a knowledge graph; extracting a corresponding entity relationship triple from a resource description framework (RDF) file corresponding to the knowledge graph, and converting the entity relationship triple into a knowledge graph encoding file, where the entity relationship triple includes: entity, relationship, and entity-relationship pair, and the knowledge graph encoding file includes an entity number file, a relationship number file, and an entity-relationship pair number file; and training the knowledge representation model based on the knowledge graph encoding file to obtain the trained knowledge representation model.

Optionally, the contextualized language model is a BERT model.

Optionally, the knowledge representation model is a TransE model.

Optionally, the calculating a correlation of the candidate healthcare answer with the healthcare question according to the feature representation of the healthcare question and the feature representation of the candidate healthcare answer specifically includes: inputting the feature representation of the healthcare question and the feature representation of the candidate healthcare answer into a trained deep neural network (DNN) model to obtain the correlation of the candidate healthcare answer with the healthcare question.

Optionally, the DNN model is a PACRR model, a KNRM model, or a DRMMTKS model.

Optionally, the determining a healthcare answer to the healthcare question according to the correlations specifically includes: sorting the candidate healthcare answers according to the correlations, and outputting sorted candidate healthcare answers.

Optionally, the determining a healthcare answer to the healthcare question according to the correlations specifically includes: outputting a specified number of candidate healthcare answers with a relatively-high correlation as a healthcare answer to the healthcare question.

A healthcare QA system based on a contextualized language model and knowledge embedding includes:

  a healthcare question acquisition module, configured to acquire a healthcare question;
  a candidate healthcare answer acquisition module, configured to acquire a healthcare answer in a healthcare answer database and mark the healthcare answer as a candidate healthcare answer;
  a contextualized embedding generation module, configured to generate contextualized embedding of the healthcare question and contextualized embedding of the candidate healthcare answer using a trained contextualized language model;
  a knowledge embedding generation module, configured to generate knowledge embedding of the healthcare question and knowledge embedding of the candidate healthcare answer using a trained knowledge representation model;
  a feature representation generation module for healthcare questions, configured to fuse the contextualized embedding of the healthcare question and the knowledge embedding of the healthcare question to obtain a feature representation of the healthcare question;
  a feature representation generation module for candidate healthcare answers, configured to fuse the contextualized embedding of the candidate healthcare answer and the knowledge embedding of the candidate healthcare answer to obtain a feature representation of the candidate healthcare answer;
  a correlation calculation module, configured to calculate a correlation of the candidate healthcare answer with the healthcare question according to the feature representation of the healthcare question and the feature representation of the candidate healthcare answer; and
  a healthcare answer determination module, configured to determine a healthcare answer to the healthcare question according to the correlations.

Based on specific embodiments provided by the present disclosure, the present disclosure discloses the following technical effects:

The healthcare QA method and system provided by the present disclosure combine a contextualized language model and knowledge embedding to generate contextualized embeddings and knowledge embeddings of a healthcare question and a candidate healthcare answer, fuse the contextualized embedding and knowledge embedding of the healthcare question to obtain a feature representation of the healthcare question, fuse the contextualized embedding and knowledge embedding of the candidate healthcare answer to obtain a feature representation of the candidate healthcare answer, and finally determine a final healthcare answer according to the correlation of the feature representation of the candidate healthcare answer with the feature representation of the healthcare question. The present disclosure utilizes contextualized semantic information and external knowledge information to assist in improving the performance of healthcare QA retrieval, and improves the adaptability to healthcare questions.

BRIEF DESCRIPTION OF DRAWINGS

To describe the technical solutions in the embodiments of the present disclosure or in the prior art more clearly, the following briefly introduces the accompanying drawings required for describing the embodiments. Apparently, the accompanying drawings in the following description show merely some embodiments of the present disclosure, and a person of ordinary skill in the art may still derive other drawings from these accompanying drawings without creative efforts.

FIG. 1 is a schematic flowchart of the healthcare QA method based on a contextualized language model and knowledge embedding provided in embodiment 1 of the present disclosure; and

FIG. 2 is a structure diagram of the healthcare QA system based on a contextualized language model and knowledge embedding provided in embodiment 2 of the present disclosure.

DETAILED DESCRIPTION

The technical solutions in the embodiments of the present disclosure are clearly and completely described below with reference to the accompanying drawings in the embodiments of the present disclosure. Apparently, the described embodiments are merely a part rather than all of the embodiments of the present disclosure. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present disclosure without creative efforts shall fall within the protection scope of the present disclosure.

To make the foregoing objectives, features, and advantages of the present disclosure more comprehensible, the present disclosure is further described in detail below with reference to the accompanying drawings and specific implementations.

A first aspect of the present disclosure provides a healthcare QA method based on a contextualized language model and knowledge embedding. FIG. 1 is a schematic flowchart of the healthcare QA method based on a contextualized language model and knowledge embedding provided in embodiment 1 of the present disclosure. As shown in FIG. 1, the healthcare QA method provided in this embodiment includes the following steps:

  step 101: acquiring a healthcare question;
  step 102: using a trained contextualized language model to generate contextualized embedding of the healthcare question;
  step 103: using a trained knowledge representation model to generate knowledge embedding of the healthcare question;
  step 104: fusing the contextualized embedding of the healthcare question and the knowledge embedding of the healthcare question to obtain a feature representation of the healthcare question;
  step 105: acquiring a healthcare answer in a healthcare answer database and marking the healthcare answer as a candidate healthcare answer;
  step 106: using the trained contextualized language model to generate contextualized embedding of the candidate healthcare answer;
  step 107: using the trained knowledge representation model to generate knowledge embedding of the candidate healthcare answer;
  step 108: fusing the contextualized embedding of the candidate healthcare answer and the knowledge embedding of the candidate healthcare answer to obtain a feature representation of the candidate healthcare answer;
  step 109: calculating a correlation of the candidate healthcare answer with the healthcare question according to the feature representation of the healthcare question and the feature representation of the candidate healthcare answer;
  repeating steps 105 to 109 to obtain correlations of various candidate healthcare answers in the healthcare answer database with the healthcare question;
  step 110: determining a healthcare answer to the healthcare question according to the correlations.

In the embodiment, a healthcare question $q$ is first acquired, and all candidate healthcare answers $\{d_1, d_2, \ldots, d_n\}$ in the system database are read.
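
The loop over steps 105 to 109 followed by step 110 can be illustrated with a minimal Python sketch. The function names `rank_candidates` and `word_overlap`, the `top_k` parameter, and the sample texts are hypothetical and are not part of the disclosure; in particular, `word_overlap` is only a toy stand-in for the learned correlation described later in this embodiment.

```python
from typing import Callable, List, Tuple


def rank_candidates(
    question: str,
    candidates: List[str],
    score: Callable[[str, str], float],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Score every candidate against the question and return the best top_k."""
    # Steps 105 to 109, repeated once per candidate in the answer database.
    scored = [(answer, score(question, answer)) for answer in candidates]
    # Step 110: sort by correlation, highest first, and keep the best top_k.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def word_overlap(question: str, answer: str) -> float:
    """Toy stand-in for the learned correlation: Jaccard overlap of word sets."""
    q_words = set(question.lower().split())
    a_words = set(answer.lower().split())
    union = q_words | a_words
    return len(q_words & a_words) / len(union) if union else 0.0


ranked = rank_candidates(
    "what causes a persistent dry cough",
    [
        "a persistent dry cough can follow a viral infection",
        "regular exercise improves sleep quality",
        "a dry cough is one common side effect of some blood pressure drugs",
    ],
    score=word_overlap,
    top_k=2,
)
for answer, correlation in ranked:
    print(f"{correlation:.3f}  {answer}")
```

Passing the scoring function as a parameter keeps the ranking step independent of how the correlation is produced, so the toy scorer could be swapped for a trained model without changing the loop.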
After the text descriptions corresponding to the question $q$ and the candidate answers $\{d_1, d_2, \ldots, d_n\}$ are acquired, the text of a given candidate answer $d_i$ is selected as the initial input of the method.

The contextualized language model may be a BERT model. The process of generating contextualized embeddings of the healthcare question $q$ and the candidate answer $d_i$ by the BERT model is as follows:

1. WordPiece word segmentation is performed on the texts of $q$ and $d_i$.
2. The sequences $W_q$ and $W_{d_i}$ obtained from word segmentation of $q$ and $d_i$ are concatenated into a sequence $W$, and the concatenated sequence $W$ is converted into three kinds of embedding information: WordPiece token embedding, position embedding, and segment (token type) embedding, which are input into a multi-layer bidirectional Transformer network for calculation.
3. The output of each Transformer layer is acquired, and the contextualized embeddings $C_q$ and $C_{d_i}$ corresponding to $q$ and $d_i$, respectively, are extracted from the output.

The knowledge representation model may be a knowledge representation model based on a knowledge graph. As an implementation, the knowledge representation model may be trained by a method including the following steps:

  acquiring a large number of knowledge graphs;
  extracting a corresponding entity relationship triple from a resource description framework (RDF) file corresponding to the knowledge graph, and converting the entity relationship triple into a knowledge graph encoding file, where the entity relationship triple includes: entity, relationship, and entity-relationship pair, and the knowledge graph encoding file includes an entity number file, a relationship number file, and an entity-relationship pair number file; and
  training the knowledge representation model based on the knowledge graph encoding file to obtain a trained knowledge representation model.

The knowledge representation model here may be a traditional knowledge representation model, the TransE model.
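
As an illustration of the encoding step, the following Python sketch assigns integer numbers to entities and relationships and rewrites each triple in terms of those numbers. It is a sketch under stated assumptions, not the disclosed implementation: triples are assumed to be already extracted from the RDF file as (head entity, relationship, tail entity) string tuples, the names `encode_triples` and `transe_distance` and the sample triples are hypothetical, and the three returned structures stand in for the entity number file, the relationship number file, and the entity-relationship pair number file. The distance function reflects the usual TransE intuition that head + relationship should lie close to tail for a true triple.

```python
from typing import Dict, List, Sequence, Tuple

Triple = Tuple[str, str, str]  # assumed layout: (head entity, relationship, tail entity)


def encode_triples(triples: List[Triple]):
    """Number entities and relationships, then encode each triple with those numbers."""
    entity_ids: Dict[str, int] = {}
    relation_ids: Dict[str, int] = {}
    encoded: List[Tuple[int, int, int]] = []
    for head, relation, tail in triples:
        # setdefault hands out the next free number the first time a name is seen.
        h = entity_ids.setdefault(head, len(entity_ids))
        t = entity_ids.setdefault(tail, len(entity_ids))
        r = relation_ids.setdefault(relation, len(relation_ids))
        encoded.append((h, r, t))
    return entity_ids, relation_ids, encoded


def transe_distance(head: Sequence[float], relation: Sequence[float],
                    tail: Sequence[float]) -> float:
    """L1 distance between head + relation and tail; smaller means more plausible."""
    return sum(abs(h + r - t) for h, r, t in zip(head, relation, tail))


# Hypothetical sample triples, for illustration only.
entity_ids, relation_ids, encoded = encode_triples([
    ("aspirin", "treats", "headache"),
    ("ibuprofen", "treats", "headache"),
    ("aspirin", "is_a", "nsaid"),
])
print(entity_ids)
print(relation_ids)
print(encoded)
```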
The generated knowledge graph encoding file may be input into the TransE model to obtain a knowledge representation $E$ corresponding to the knowledge graph. With named entity recognition (NER), such as the NER module in the spaCy toolkit and the TagMe tool, entities included in the healthcare question $q$ and the candidate healthcare answer $d_i$ are recognized and linked to corresponding entities in the knowledge graph to generate entity sequences $E_q$ and $E_{d_i}$, respectively. According to the entity sequences $E_q$ and $E_{d_i}$ recognized from $q$ and $d_i$, the vectors of the corresponding entities are looked up in the knowledge representation $E$ and used as the knowledge embeddings $K_q$ and $K_{d_i}$.

As an implementation, the method of fusing the contextualized embedding and knowledge embedding in steps 104 and 108 may be as follows: the contextualized embedding and knowledge embedding are treated as different matrices, and the contextualized embedding and knowledge embedding of the healthcare question, and those of the candidate healthcare answer, are fused separately. For example, given $(C_q, K_q)$ and $(C_{d_i}, K_{d_i})$, transformation parameter matrices $M \in \mathbb{R}^{c \times d}$ and $N \in \mathbb{R}^{k \times d}$ ($c$ and $k$ are the vector space dimensions of the contextualized embedding and the knowledge embedding, respectively) are used to project the contextualized embedding and the knowledge embedding into the same vector space $\mathbb{R}^{d}$ to obtain new feature representations $CK_q$ and $CK_{d_i}$. A detailed calculation process is as follows:

$$CK_q = C_q \cdot M \oplus K_q \cdot N$$

$$CK_{d_i} = C_{d_i} \cdot M \oplus K_{d_i} \cdot N$$

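
The fusion step can be sketched in plain Python as below. This is only one possible reading of the formulas: the source does not define the combining operator, and the sketch assumes it stacks the projected rows, so that the fused representation has one d-dimensional row per WordPiece token followed by one per linked entity. The helper names `matmul` and `fuse` and all numeric values are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product of a (n x m) and b (m x p)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def fuse(context: Matrix, knowledge: Matrix, m: Matrix, n: Matrix) -> Matrix:
    """Project both embeddings into the same d-dimensional space and combine them.

    ASSUMPTION: the combining operator is implemented as row-wise stacking;
    the source leaves the operator undefined.
    """
    # List '+' concatenates the projected token rows and the projected entity rows.
    return matmul(context, m) + matmul(knowledge, n)


# Toy sizes: 2 tokens with c = 3, 1 entity with k = 2, shared dimension d = 2.
c_q = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]
k_q = [[2.0, 1.0]]
m = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # c x d projection
n = [[1.0, 0.0], [0.0, 3.0]]              # k x d projection
ck_q = fuse(c_q, k_q, m, n)
print(ck_q)
```

If the operator were instead element-wise addition, the two projected matrices would need the same number of rows, which would require aligning entities to tokens first; that alignment is not described in the source.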
As an implementation, step 109 may specifically include: the feature representation of the healthcare question and the feature representation of the candidate healthcare answer are input into the trained DNN model; the DNN model calculates a correlation matrix $R = CK_q \cdot CK_{d_i}^{\top}$ at word segmentation granularity; and correlation abstract feature vectors are extracted from the correlation matrix $R$ using multi-kernel convolution of a convolutional neural network (CNN) and a Gaussian filter, and are input into a multilayer perceptron (MLP) to obtain a question-answer correlation $S_i$. The DNN model may be an existing PACRR model, KNRM model or DRMMTKS model.

As an implementation, step 110 may be: sorting the candidate healthcare answers according to the correlations, and outputting sorted candidate healthcare answers. As another implementation, step 110 may be: outputting a specified number of candidate healthcare answers with a relatively-high correlation as a healthcare answer to the healthcare question.

The healthcare QA method provided in this embodiment is written in Python, uses the PyTorch deep learning framework and Nvidia GPU server computing power, and learns healthcare QA tasks through end-to-end supervised learning, which can achieve effective retrieval of answers to healthcare questions raised by users and thus improve the efficiency of healthcare QA retrieval.

The contextualized model in the present disclosure learns a large amount of text semantic information through unsupervised pretraining on a large amount of text, in combination with two tasks of text content prediction and continuous sentence prediction. Moreover, in combination with a knowledge representation model obtained by unsupervised pretraining with the knowledge graph, more external knowledge is input so that potential correlative information in the question and answer can be discovered, further improving the performance of the model.

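
To make the correlation step more concrete, the sketch below builds a correlation matrix between the rows of two fused representations and summarizes it with Gaussian kernels, in the style of kernel pooling as used by KNRM. It is a simplified illustration, not the disclosed network: rows are normalized so that each entry is a cosine similarity (an assumption; the text gives a plain matrix product), the kernel means and width are hypothetical, the CNN part is omitted, and the resulting features would still have to be passed to a trained MLP to obtain the correlation.

```python
import math
from typing import List

Matrix = List[List[float]]


def correlation_matrix(ck_q: Matrix, ck_d: Matrix) -> Matrix:
    """R[i][j] is the cosine similarity of question row i and answer row j."""
    def unit(v: List[float]) -> List[float]:
        norm = math.sqrt(sum(x * x for x in v)) or 1.0  # guard all-zero rows
        return [x / norm for x in v]

    q_rows = [unit(v) for v in ck_q]
    d_rows = [unit(v) for v in ck_d]
    return [[sum(a * b for a, b in zip(q, d)) for d in d_rows] for q in q_rows]


def kernel_pooling(r: Matrix, means: List[float], sigma: float = 0.1) -> List[float]:
    """Return one soft-match feature per Gaussian kernel."""
    features = []
    for mu in means:
        total = 0.0
        for row in r:
            # Soft count of answer positions whose similarity is close to mu.
            soft_count = sum(
                math.exp(-((s - mu) ** 2) / (2 * sigma ** 2)) for s in row
            )
            total += math.log(max(soft_count, 1e-10))  # clamp to avoid log(0)
        features.append(total)
    return features


# Toy fused representations with d = 2 (hypothetical values).
r = correlation_matrix([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [1.0, 1.0]])
features = kernel_pooling(r, means=[1.0, 0.0])
print(r)
print(features)
```

Because every entry of the matrix refers to one question row and one answer row, the matrix can be inspected directly to see which tokens or entities drive a high correlation.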
The question-answer correlation of the present disclosure is more interpretable, because it is calculated at fine granularity, at the level of word segments and entities. The method first acquires embedding representations at word segmentation granularity and entity granularity in QA through WordPiece word segmentation and NER technology. On this basis, a correlation matrix is obtained through similarity calculation, and by visualizing the correlation matrix, the question-answer correlation can be easily explained.

The effect of the method provided in the present disclosure is verified, and the verification experiment results are shown in Table 1.

Table 1: Experimental results

Experimental method                  MAP      Precision@3  Recall@3  NDCG@3
feature-based        DRMMTKS         0.7676   0.3010       0.9030    0.7868
knowledge-embedding  DRMMTKS-W       0.7762   0.3019       0.9056    0.7944
context-based        DRMMTKS-BC      0.9421   0.3310       0.9929    0.9540
CK-HQA               DRMMTKS-BC-W    0.9430   0.3310       0.9929    0.9573

MAP: mean average precision. NDCG: normalized discounted cumulative gain.

Notes: feature-based, knowledge-embedding, and context-based are benchmark models.

It can be seen from the experimental results that the method CK-HQA (a healthcare QA method based on a contextualized language model and knowledge embedding) provided by the present disclosure achieves the best result, or ties for the best, on every experimental indicator in Table 1: it clearly outperforms the feature-based and knowledge-embedding benchmarks, and it matches the context-based benchmark on Precision@3 and Recall@3 while slightly exceeding it on MAP and NDCG@3.

A second aspect of the present disclosure provides a healthcare QA system based on a contextualized language model and knowledge embedding. FIG. 2 is a structure diagram of the healthcare QA system based on a contextualized language model and knowledge embedding provided in embodiment 2 of the present disclosure. As shown in FIG. 2, the healthcare QA system provided in this embodiment includes:

  a healthcare question acquisition module 201, configured to acquire a healthcare question;
  a candidate healthcare answer acquisition module 202, configured to acquire a healthcare answer in a healthcare answer database and mark the healthcare answer as a candidate healthcare answer;
  a contextualized embedding generation module 203, configured to generate contextualized embedding of the healthcare question and contextualized embedding of the candidate healthcare answer using a trained contextualized language model, where the contextualized language model may be a BERT model;
  a knowledge embedding generation module 204, configured to generate knowledge embedding of the healthcare question and knowledge embedding of the candidate healthcare answer using a trained knowledge representation model, where the knowledge representation model may be a TransE model;
  a feature representation generation module 205 for healthcare questions, configured to fuse the contextualized embedding of the healthcare question and the knowledge embedding of the healthcare question to obtain a fused feature representation of the healthcare question;
  a feature representation generation module 206 for candidate healthcare answers, configured to fuse the contextualized embedding of the candidate healthcare answer and the knowledge embedding of the candidate healthcare answer to obtain a fused feature representation of the candidate healthcare answer;
  a correlation calculation module 207, configured to calculate a correlation of the candidate healthcare answer with the healthcare question according to the feature representation of the healthcare question and the feature representation of the candidate healthcare answer, where the correlation may be calculated by using a trained DNN model, and the DNN model may be a PACRR model, a KNRM model, or a DRMMTKS model; and
  a healthcare answer determination module 208, configured to determine a healthcare answer corresponding to the healthcare question according to the correlations.

The healthcare QA system provided in this embodiment can achieve effective retrieval of answers to healthcare questions raised by users and improve the retrieval efficiency of healthcare QA.

Each embodiment of the present specification is described in a progressive manner, each embodiment focuses on the difference from other embodiments, and the same and similar parts among the embodiments may refer to each other. For a system disclosed in the embodiments, since the system corresponds to the method disclosed in the embodiments, the description is relatively simple, and reference can be made to the method description.

Specific examples are used herein for illustration of the principles and implementations of the present disclosure. The description of the embodiments is used to help understand the method and its core principles of the present disclosure. In addition, those skilled in the art can make various modifications to specific implementations and application scope in accordance with the teachings of the present disclosure. In conclusion, the content of the present specification shall not be construed as a limitation to the present disclosure.
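
For readers who want to reproduce the kind of ranking metrics reported in Table 1, the following Python sketch shows the standard definitions of Precision@k, Recall@k, average precision, and NDCG@k for binary relevance labels. It is not the evaluation code behind Table 1, which the source does not give; the function names and the sample ranking are hypothetical. MAP is the mean of the average precision over all questions.

```python
import math
from typing import List


def precision_at_k(relevance: List[int], k: int) -> float:
    """Fraction of the top k ranked answers that are relevant."""
    return sum(relevance[:k]) / k


def recall_at_k(relevance: List[int], k: int) -> float:
    """Fraction of all relevant answers that appear in the top k."""
    total = sum(relevance)
    return sum(relevance[:k]) / total if total else 0.0


def average_precision(relevance: List[int]) -> float:
    """Mean of the precision values at each rank where a relevant answer occurs."""
    hits, score = 0, 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            score += hits / rank
    return score / hits if hits else 0.0


def ndcg_at_k(relevance: List[int], k: int) -> float:
    """Discounted cumulative gain at k, normalized by the ideal ordering."""
    def dcg(rels: List[int]) -> float:
        return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(rels, start=1))

    ideal = dcg(sorted(relevance, reverse=True)[:k])
    return dcg(relevance[:k]) / ideal if ideal else 0.0


# Relevance labels of one ranked answer list, best-scored answer first
# (1 = relevant, 0 = not relevant); the values are made up for illustration.
ranking = [0, 1, 0, 0]
print(precision_at_k(ranking, 3), recall_at_k(ranking, 3))
print(average_precision(ranking), ndcg_at_k(ranking, 3))
```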

Claims (5)

What is claimed is:
  1. A healthcare question answering (QA) method based on a contextualized language model and knowledge embedding, comprising: acquiring a healthcare question; using a trained contextualized language model to generate contextualized embedding of the healthcare question; using a trained knowledge representation model to generate knowledge embedding of the healthcare question; fusing the contextualized embedding of the healthcare question and the knowledge embedding of the healthcare question to obtain a feature representation of the healthcare question; acquiring a healthcare answer in a healthcare answer database and marking the healthcare answer as a candidate healthcare answer; using the trained contextualized language model to generate contextualized embedding of the candidate healthcare answer; using the trained knowledge representation model to generate knowledge embedding of the candidate healthcare answer; fusing the contextualized embedding of the candidate healthcare answer and the knowledge embedding of the candidate healthcare answer to obtain a feature representation of the candidate healthcare answer; calculating a correlation of the candidate healthcare answer with the healthcare question according to the feature representation of the healthcare question and the feature representation of the candidate healthcare answer; repeating the steps of "acquiring a healthcare answer in a healthcare answer database and marking the healthcare answer as a candidate healthcare answer" to "calculating a correlation of the candidate healthcare answer with the healthcare question according to the feature representation of the healthcare question and the feature representation of the candidate healthcare answer" to obtain correlations of various candidate healthcare answers in the healthcare answer database with the healthcare question; and determining a healthcare answer to the healthcare question according to the correlations.
  2. The healthcare QA method based on a contextualized language model and knowledge embedding according to claim 1, wherein, the trained knowledge representation model is a knowledge representation model based on a knowledge graph; wherein, the knowledge representation model is trained by a method comprising: acquiring a knowledge graph; extracting a corresponding entity relationship triple from a resource description framework (RDF) file corresponding to the knowledge graph, and converting the entity relationship triple into a knowledge graph encoding file, wherein, the entity relationship triple comprises: entity, relationship, and entity-relationship pair, and the knowledge graph encoding file comprises an entity number file, a relationship number file, and an entity-relationship pair number file; and training the knowledge representation model based on the knowledge graph encoding file to obtain the trained knowledge representation model.
  3. The healthcare QA method based on a contextualized language model and knowledge embedding according to claim 1, wherein, the contextualized language model is a BERT model; wherein, the knowledge representation model is a TransE model; wherein, the calculating a correlation of the candidate healthcare answer with the healthcare question according to the feature representation of the healthcare question and the feature representation of the candidate healthcare answer specifically comprises: inputting the feature representation of the healthcare question and the feature representation of the candidate healthcare answer into a trained deep neural network (DNN) model to obtain the correlation of the candidate healthcare answer with the healthcare question; wherein, the DNN model is a PACRR model, a KNRM model, or a DRMMTKS model.
  4. The healthcare QA method based on a contextualized language model and knowledge embedding according to claim 1, wherein, the determining a healthcare answer to the healthcare question according to the correlations specifically comprises: sorting the candidate healthcare answers according to the correlations, and outputting sorted candidate healthcare answers; wherein, the determining a healthcare answer to the healthcare question according to the correlations specifically comprises: outputting a specified number of candidate healthcare answers with a relatively-high correlation as a healthcare answer to the healthcare question.
  5. A healthcare QA system based on a contextualized language model and knowledge embedding, comprising: a healthcare question acquisition module, configured to acquire a healthcare question; a candidate healthcare answer acquisition module, configured to acquire a healthcare answer in a healthcare answer database and mark the healthcare answer as a candidate healthcare answer; contextualized embedding generation module, configured to generate contextualized embedding of the healthcare question and contextualized embedding of the candidate healthcare answer using a trained contextualized language model; knowledge embedding generation module, configured to generate knowledge embedding of the healthcare question and knowledge embedding of the candidate healthcare answer using a trained knowledge representation model; a feature representation generation module for healthcare questions, configured to fuse the contextualized embedding of the healthcare question and the knowledge embedding of the healthcare question to obtain a feature representation of the healthcare question; a feature representation generation module for candidate healthcare answers, configured to fuse the contextualized embedding of the candidate healthcare answer and the knowledge embedding of the candidate healthcare answer to obtain a feature representation of the candidate healthcare answer; a correlation calculation module, configured to calculate a correlation of the candidate healthcare answer with the healthcare question according to the feature representation of the healthcare question and the feature representation of the candidate healthcare answer; and a healthcare answer determination module, configured to determine a healthcare answer to the healthcare question according to the correlations.
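As an illustration of the knowledge graph encoding step recited in claim 2, the following Python sketch numbers entities and relationships and rewrites each triple in terms of those numbers. It rests on several assumptions not stated in the claims: the triples are already extracted from the RDF file as (head entity, relationship, tail entity) strings, numbering is by order of first appearance, and each "file" is represented as a list of tab-separated text lines. The names `encode_triples` and `to_lines` and the sample triples are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]        # (head entity, relationship, tail entity)
EncodedTriple = Tuple[int, int, int]  # (head number, relationship number, tail number)


def encode_triples(
    triples: Iterable[Triple],
) -> Tuple[Dict[str, int], Dict[str, int], List[EncodedTriple]]:
    """Assign integer numbers to entities and relationships, then encode triples."""
    entity_ids: Dict[str, int] = {}
    relation_ids: Dict[str, int] = {}
    encoded: List[EncodedTriple] = []
    for head, relation, tail in triples:
        # setdefault keeps an existing number or assigns the next free one.
        h = entity_ids.setdefault(head, len(entity_ids))
        t = entity_ids.setdefault(tail, len(entity_ids))
        r = relation_ids.setdefault(relation, len(relation_ids))
        encoded.append((h, r, t))
    return entity_ids, relation_ids, encoded


def to_lines(mapping: Dict[str, int]) -> List[str]:
    """Render a name-to-number mapping as 'name<TAB>number' lines."""
    return [f"{name}\t{number}" for name, number in mapping.items()]


# Invented example triples; a real run would read them from the RDF file.
sample = [
    ("aspirin", "treats", "headache"),
    ("ibuprofen", "treats", "headache"),
]
entity_ids, relation_ids, encoded = encode_triples(sample)
entity_file = to_lines(entity_ids)      # contents of the entity number file
relation_file = to_lines(relation_ids)  # contents of the relationship number file
triple_file = [f"{h}\t{r}\t{t}" for h, r, t in encoded]
print(entity_file, relation_file, triple_file)
```

The three resulting line lists correspond loosely to the entity number file, the relationship number file, and the entity-relationship pair number file. The exact on-disk layout expected by a given TransE implementation may differ and should be checked against that implementation.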
AU2020104254A 2020-04-23 2020-12-23 Healthcare question answering (qa) method and system based on contextualized language model and knowledge embedding Ceased AU2020104254A4 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010326646.4A CN111524593B (en) 2020-04-23 2020-04-23 Medical question-answering method and system based on context language model and knowledge embedding
CN202010326646.4 2020-04-23

Publications (1)

Publication Number Publication Date
AU2020104254A4 true AU2020104254A4 (en) 2021-03-11

Family

ID=71904081

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2020104254A Ceased AU2020104254A4 (en) 2020-04-23 2020-12-23 Healthcare question answering (qa) method and system based on contextualized language model and knowledge embedding

Country Status (2)

Country Link
CN (1) CN111524593B (en)
AU (1) AU2020104254A4 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116975241A (en) * 2023-09-20 2023-10-31 广东技术师范大学 Liver cancer auxiliary diagnosis and question-answering method, system and medium based on large language model

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112380325B (en) * 2020-08-15 2022-05-31 电子科技大学 Knowledge graph question-answering system based on joint knowledge embedded model and fact memory network
CN112784600B (en) * 2021-01-29 2024-01-16 北京百度网讯科技有限公司 Information ordering method, device, electronic equipment and storage medium
CN112800203B (en) * 2021-02-05 2021-12-07 江苏实达迪美数据处理有限公司 Question-answer matching method and system fusing text representation and knowledge representation
CN113032531B (en) * 2021-05-21 2021-11-30 北京金山数字娱乐科技有限公司 Text processing method and device
CN114547312B (en) * 2022-04-07 2022-08-16 华南师范大学 Emotional analysis method, device and equipment based on common sense knowledge graph
CN117575020A (en) * 2023-11-14 2024-02-20 平安创科科技(北京)有限公司 Intelligent question-answering method, device, equipment and medium based on artificial intelligence

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108647233B (en) * 2018-04-02 2020-11-17 北京大学深圳研究生院 Answer sorting method for question-answering system
CN110543557B (en) * 2019-09-06 2021-04-02 北京工业大学 Construction method of medical intelligent question-answering system based on attention mechanism
CN110737763A (en) * 2019-10-18 2020-01-31 成都华律网络服务有限公司 Chinese intelligent question-answering system and method integrating knowledge map and deep learning

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116975241A (en) * 2023-09-20 2023-10-31 广东技术师范大学 Liver cancer auxiliary diagnosis and question-answering method, system and medium based on large language model
CN116975241B (en) * 2023-09-20 2024-01-09 广东技术师范大学 Liver cancer auxiliary diagnosis and question-answering method, system and medium based on large language model

Also Published As

Publication number Publication date
CN111524593A (en) 2020-08-11
CN111524593B (en) 2022-08-16

Similar Documents

Publication Publication Date Title
AU2020104254A4 (en) Healthcare question answering (qa) method and system based on contextualized language model and knowledge embedding
Wang et al. An overview of image caption generation methods
Zhu et al. Knowledge-based question answering by tree-to-sequence learning
Qin et al. A stacking gated neural architecture for implicit discourse relation classification
CN104049755B (en) Information processing method and device
CN111552821B (en) Legal intention searching method, legal intention searching device and electronic equipment
CN114722839B (en) Man-machine cooperative dialogue interaction system and method
CN115048447B (en) Database natural language interface system based on intelligent semantic completion
CN113779190B (en) Event causal relationship identification method, device, electronic equipment and storage medium
Wang et al. A template-guided hybrid pointer network for knowledge-based task-oriented dialogue systems
CN117556276B (en) Method and device for determining similarity between text and video
Mrini et al. Recursive tree-structured self-attention for answer sentence selection
CN115114419A (en) Question and answer processing method and device, electronic equipment and computer readable medium
CN114490954A (en) Document level generation type event extraction method based on task adjustment
CN114723013A (en) Multi-granularity knowledge enhanced semantic matching method
Adewoyin et al. RSTGen: imbuing fine-grained interpretable control into long-FormText generators
Zhang et al. Refsql: A retrieval-augmentation framework for text-to-sql generation
CN117009456A (en) Medical query text processing method, device, equipment, medium and electronic product
Huang et al. Text sentiment analysis based on Bert and Convolutional Neural Networks
CN115203388A (en) Machine reading understanding method and device, computer equipment and storage medium
CN115130461A (en) Text matching method and device, electronic equipment and storage medium
Liu et al. The BERT-BiLSTM-CRF question event information extraction method
Xu et al. AHRNN: attention‐based hybrid robust neural network for emotion recognition
CN113919338A (en) Method and device for processing text data
Hu et al. Service-oriented Text-to-SQL Parsing

Legal Events

Date Code Title Description
FGI Letters patent sealed or granted (innovation patent)
MK22 Patent ceased section 143a(d), or expired - non payment of renewal fee or expiry