CN117077786A - Knowledge graph-based data knowledge dual-drive intelligent medical dialogue system and method - Google Patents
- Publication number: CN117077786A (application CN202310829332.XA)
- Authority: CN (China)
- Prior art keywords: knowledge, medical, entity, patient, entities
- Legal status: Pending
Classifications
- G06N5/041: Inference or reasoning models; Abduction
- G06F16/3329: Natural language query formulation or dialogue systems
- G06F16/3334: Selection or weighting of terms from queries, including natural language queries
- G06F16/367: Creation of semantic tools; Ontology
- G06F40/216: Parsing using statistical methods
- G06F40/295: Named entity recognition
- G06N3/0442: Recurrent networks characterised by memory or gating, e.g. LSTM or GRU
- G06N3/045: Combinations of networks
- G06N3/047: Probabilistic or stochastic networks
- G06N3/048: Activation functions
- G06N3/084: Backpropagation, e.g. using gradient descent
- G06N3/088: Non-supervised learning, e.g. competitive learning
- G06N3/092: Reinforcement learning
- G06N3/0985: Hyperparameter optimisation; Meta-learning; Learning-to-learn
- G06N5/022: Knowledge engineering; Knowledge acquisition
- G16H20/00: ICT specially adapted for therapies or health-improving plans
- G16H50/20: ICT specially adapted for computer-aided diagnosis, e.g. based on medical expert systems
Abstract
The invention discloses a knowledge graph-based, data knowledge dual-drive intelligent medical dialogue system and method. The system extracts medical named entities from the patient's question via a medical named entity recognition module and feeds them to a medical knowledge graph matching module, which matches knowledge entities to obtain professional medical background knowledge. A knowledge entity sampling module then selects the knowledge entities most relevant to the question, reducing the influence of irrelevant knowledge entities on answer generation. Next, the question and the sampled knowledge entities are input together into a large language model for fine-tuning training, and the answer is finally output by a dialogue generation module. The method achieves higher scores on the BLEU (bilingual evaluation understudy) and ROUGE (recall-oriented understudy for gisting evaluation) metrics, and the generated answers are closer to the level of human doctors. The invention significantly improves the practicality of medical dialogue systems.
Description
Technical Field
The invention relates to a knowledge graph-based, data knowledge dual-drive intelligent medical dialogue system and method, and belongs to the technical field at the intersection of healthcare informatics and artificial intelligence.
Background
Large Language Models (LLMs) are becoming an important tool in the field of intelligent medicine. These advanced models are trained with powerful computing power on massive data, giving machines the ability to understand and generate human language. In smart medicine, they can help doctors diagnose and predict diseases and provide patient consultation and health management services. As the demand for intelligent medical diagnosis continues to grow, large language models are becoming increasingly important for diagnosing diseases accurately and efficiently and for reducing human error.
Large language models are built by learning statistical regularities of language from large amounts of data. The medical field, however, contains many unique terms, jargon, and text formats, so a large language model must be fine-tuned to achieve optimal performance there.
Recently, some researchers have used medical corpora to fine-tune large language models, adjusting their weight parameters to better understand the language of the medical field. Current medical dialogue systems are mainly optimized for English, and their Chinese question-answering ability is weak. To address this, Honglin Xiong, Sheng Wang, Yitao Zhu et al., "DoctorGLM: Fine-tuning your Chinese Doctor is not a Herculean Task" (arXiv preprint arXiv:2305.07340, 2023), collected a large Chinese medical dialogue dataset with the help of ChatGPT and fine-tuned the ChatGLM-6B model on a single A100 80G GPU in 13 hours, making it easier to deploy a Chinese large language dialogue model for medical use. The paper "ChatDoctor: A Medical Chat Model Fine-Tuned on a Large Language Model Meta-AI (LLaMA) Using Medical Domain Knowledge" (Cureus 15(6): e40895, 2023) by Li Y, Li Z, Zhang K et al. uses 700 collected diseases with their corresponding symptoms and 5,000 generated doctor-patient dialogues to support medical test and drug recommendation. In addition, that work acquired 200,000 real doctor-patient dialogues from an online medical consultation website. By fine-tuning the large language model (LLaMA) on 205,000 doctor-patient dialogues, the model gains the ability to understand patient needs, provide intelligent advice, and offer valuable assistance in a variety of medical fields. The paper "HuaTuo (Hua Tuo): Tuning LLaMA Model with Chinese Medical Knowledge" (arXiv preprint arXiv:2304.06975, 2023) by Haochun Wang, Chi Liu, Nuwa Xi et al. likewise uses LLaMA as the base model and fine-tunes it on a Chinese medical dialogue dataset with the instruct-tuning technique, an effective fine-tuning method.
Its principle is to guide the behavior of the language model by providing explicit instructions or examples and to fine-tune it for specific tasks or fields. To avoid monotony of instructions or examples, that paper introduces a medical knowledge graph to construct eight thousand instructions. Beyond instruction or example sets, Chang Shu, Baian Chen, Fangyu Liu et al., "Visual Med-Alpaca: A Parameter-Efficient Biomedical LLM with Visual Capabilities," integrate LLaMA models with medical vision models for multimodal biomedical tasks. Their model can efficiently perform various medical tasks by means of a few hours of instruction tuning and pluggable vision modules.
Fine-tuning has been demonstrated to significantly improve the performance of large language models on medical tasks. However, existing tuning techniques based only on corpus-style dialogue datasets are not ideal and may produce misleading or inaccurate answers due to the lack of specialized medical knowledge. To better meet the requirements of the medical field, fine-tuning must be combined with professional medical knowledge, so that the large language model is more accurate and reliable when providing medical consultation and advice.
Disclosure of Invention
Addressing the defects or shortcomings of the prior art, the invention provides a knowledge graph-based, data knowledge dual-drive intelligent medical dialogue system and method that combine the advantages of knowledge graphs, large language models, deep reinforcement learning, and other technologies. The approach solves the medical dialogue system's lack of specialized knowledge, so that the answers generated by the large language model come closer to the level of human doctors; specifically, higher BLEU and ROUGE scores are achieved.
The technical scheme adopted to solve this problem is as follows: a knowledge graph-based, data knowledge dual-drive intelligent medical dialogue system comprising a medical named entity recognition module, a medical knowledge graph matching module, a knowledge entity sampling module, a large language model fine-tuning module, and a dialogue generation module.
The medical named entity recognition module extracts medical named entities from patient questions, including disease names, body parts, medical procedures, drugs, and departments. This module is the basis for the subsequent medical knowledge graph matching module.
After the relevant medical named entities (disease names, medical procedures, and the like) are obtained, the medical knowledge graph matching module matches them against nodes in the medical knowledge graph to obtain relevant professional background knowledge.
The number of knowledge entities obtained by the knowledge graph matching module is often large, and many of them are irrelevant to the question raised by the patient. The knowledge entity sampling module samples the knowledge entities obtained by the medical knowledge graph matching module to obtain those best suited to answering the patient's question.
The large language model fine-tuning module inputs the patient's question together with the knowledge entities obtained by the knowledge entity sampling module into the large language model for fine-tuning training, so that the large language model can generate answers close to the level of a human doctor.
The dialogue generation module is the system's output module. Using the model fine-tuned by the large language model fine-tuning module, it outputs the generated answer for the input patient question and related knowledge entities.
The invention also provides a method for implementing the knowledge graph-based, data knowledge dual-drive intelligent medical dialogue system, comprising the following steps:
Step 1: dataset acquisition and knowledge graph preparation. A patient-doctor question-and-answer dataset and a medical knowledge graph are acquired.
Step 2: medical named entity recognition. A BERT-BiLSTM-CRF model is used to extract medical named entities from the patient's question.
Step 3: medical knowledge graph matching. The extracted medical named entities are matched against the head entities of triples in the medical knowledge graph. If a match succeeds, the tail entities corresponding to all matched head entities are taken as knowledge entities.
Step 4: knowledge entity sampling. A sampler based on deep reinforcement learning samples the knowledge entities matched in step 3 to obtain those best suited to answering the patient's question.
Step 5: large language model fine-tuning. The knowledge entities sampled in step 4 and the patient's original question are input into the large language model for fine-tuning training.
Step 6: question-answer generation. Answers are generated with the model parameters obtained from the fine-tuning in step 5.
Step 7: model evaluation. The quality of the answers generated by the model is verified on the validation set.
Beneficial effects:
1. The invention introduces a medical knowledge graph into the fine-tuning of large language models for medical dialogue, supplying domain expertise to the large language model. Through fine-tuning, the model better understands medical concepts and terminology, improving the accuracy and professionalism of answer generation and the practicality of the medical dialogue system.
2. The invention designs a sampler based on deep reinforcement learning that samples the knowledge entities, reducing the influence of irrelevant knowledge entities on generation quality and making the answers generated by the model more targeted.
3. Compared with conventional fine-tuned large language models in the medical field, introducing the knowledge graph through the deep reinforcement learning sampler yields large improvements on the BLEU and ROUGE metrics: BLEU improves by 7.1% and ROUGE by 2.7%.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Fig. 2 compares the performance of the invention's fine-tuned large language model with other fine-tuned large language models in the medical field on the BLEU and ROUGE metrics.
Description of the drawings: Fig. 2a compares the BLEU scores of the invention's method and other methods; Fig. 2b compares the ROUGE scores.
FIG. 3 compares generation examples of the invention's fine-tuned large language model with other fine-tuned large language models in the medical field.
Detailed Description
The invention is described in further detail below with reference to the drawings.
It should be understood that the description is only illustrative and is not intended to limit the scope of the invention. In addition, in the following description, descriptions of well-known structures and techniques are omitted so as not to unnecessarily obscure the present invention.
As shown in fig. 1, the invention provides a method for implementing a data knowledge dual-drive intelligent medical dialogue system based on a knowledge graph, which comprises the following steps:
Step 1: dataset acquisition and knowledge graph preparation.
The invention first collects a patient-doctor dialogue dataset and a medical knowledge graph. Taking the Chinese Medical Dialogue dataset (CMD) and the Chinese Medical Multi-modal Knowledge Graph (CM3KG) as examples: the Chinese medical dialogue dataset consists of 792,099 question-answer pairs covering andrology, internal medicine, obstetrics and gynecology, oncology, pediatrics, and surgery. The Chinese medical multimodal knowledge graph contains 8,808 symptom nodes, 3,353 medical examination nodes, 17,318 drug nodes, and 366 food nodes. Table 1 gives an example of a patient-doctor dialogue, and Table 2 an example of nodes in the knowledge graph and their relationships.
Table 1 patient-doctor dialogue example
Table 2 node in knowledge graph and relationship example thereof
Step 2: medical named entity recognition. After the patient-doctor dialogue dataset and medical knowledge graph are acquired in step 1, a BERT-BiLSTM-CRF model extracts medical named entities from the patient's question. The medical named entity recognition task can be viewed as a character-level classification problem, in which each character of the input sequence carries a label indicating whether it belongs to a medical named entity. These labels are typically assigned with the BIO tagging scheme, which marks the beginning (B-), inside (I-), and outside (O) positions of named entities in the sequence. For example, consider the input sequence "感冒可以服用感冒灵颗粒" ("For a cold, you may take Ganmaoling granules"), where "感冒灵颗粒" (Ganmaoling granules) is a drug. Under the BIO scheme, the first character of the drug name is tagged B-MEDICINE to mark the beginning of the drug entity, its remaining characters are tagged I-MEDICINE to mark the inside of the entity, and all other characters are tagged O.
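The BIO labeling described above can be sketched as follows (a hypothetical helper for illustration, not the patent's implementation):

```python
# Assign B-/I-/O labels to a character sequence given known entity spans.
# spans: list of (start, end, type) with end exclusive.

def bio_tags(chars, spans):
    tags = ["O"] * len(chars)          # default: outside any entity
    for start, end, etype in spans:
        tags[start] = f"B-{etype}"     # first character of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{etype}"     # remaining characters are inside
    return tags

# In "感冒可以服用感冒灵颗粒" the drug entity 感冒灵颗粒 occupies characters 6..10.
chars = list("感冒可以服用感冒灵颗粒")
tags = bio_tags(chars, [(6, 11, "MEDICINE")])
print(tags[6], tags[7], tags[0])  # B-MEDICINE I-MEDICINE O
```

The same helper works for any span-annotated sequence, which is what makes character-level NER reducible to per-character classification.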
To achieve character-level classification, each character of the sentence is first embedded into a continuous latent space. The invention uses a combination of BERT and LSTM as the encoder for this purpose. BERT uses a bidirectional Transformer architecture pre-trained on large amounts of unannotated text with two unsupervised learning objectives: masked language modeling and next-sentence prediction. This pre-training lets BERT learn rich contextual representations of words and sentences, which can then be fine-tuned for downstream tasks such as named entity recognition. In addition to BERT, the encoder architecture uses an LSTM as a sequential modeling component; the LSTM captures dependencies between tokens across the entire sequence. Specifically, a bidirectional LSTM encodes the forward and backward context of each token, allowing the model to use information from both directions for the classification task. To further strengthen the model's ability to capture global dependencies between labels, the invention adds a conditional random field (CRF) layer above the BERT-LSTM encoder. The CRF layer models sequential dependencies between label assignments via transition probabilities between labels, helping optimize the output sequence produced by the classifier. Specifically, the CRF layer receives the encoded feature sequence from the BERT-LSTM encoder as input and outputs a label sequence corresponding to each token of the input sentence.
Assuming q is the input sequence, the predicted tag sequence $\hat{y}$ can be expressed as:

$$\hat{y} = \arg\max_{y} P(y \mid q)$$
the loss function of the medical named entity recognition network is defined as:
wherein N represents the number of samples, M represents the number of categories, y ij Indicating whether the real label of sample i is of category j,the probability that the model predicts sample i as class j is represented.
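The loss above is a standard character-level cross-entropy; a minimal pure-Python sketch with illustrative values (not from the patent's training data):

```python
import math

# y:     one-hot gold labels, shape [N][M]
# y_hat: predicted class probabilities, shape [N][M]
def ner_loss(y, y_hat):
    n = len(y)
    return -sum(
        y[i][j] * math.log(y_hat[i][j])
        for i in range(n)
        for j in range(len(y[i]))
    ) / n

y = [[1, 0], [0, 1]]              # gold: class 0, then class 1
y_hat = [[0.9, 0.1], [0.2, 0.8]]  # model probabilities
print(round(ner_loss(y, y_hat), 4))  # → 0.1643
```

Only the terms where $y_{ij}=1$ contribute, so the loss is the average negative log-probability assigned to the correct label.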
Step 3: medical knowledge graph matching. The medical named entities extracted in step 2 are matched against the head entities of triples in the medical knowledge graph. If a medical named entity coincides with the head entity of a triple in the knowledge graph, the tail entity corresponding to that head entity is taken as a knowledge entity.
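Under the assumption that the knowledge graph is stored as (head, relation, tail) triples, the matching in this step can be sketched as follows (the entity and relation names are illustrative, not taken from the CM3KG graph):

```python
# Return the tail entities of all triples whose head matches an
# extracted medical named entity.
def match_knowledge(entities, triples):
    heads = {}
    for h, r, t in triples:
        heads.setdefault(h, []).append(t)   # index triples by head entity
    matched = []
    for e in entities:
        matched.extend(heads.get(e, []))    # collect tails for each match
    return matched

kg = [
    ("cold", "recommended_drug", "Ganmaoling granules"),
    ("cold", "common_symptom", "runny nose"),
    ("gastritis", "recommended_drug", "omeprazole"),
]
print(match_knowledge(["cold"], kg))  # tails of all triples headed by "cold"
```

In practice a real graph store would replace the dictionary, but the head-entity lookup is the same operation.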
Step 4: knowledge entity sampling. The number of knowledge entities matched in step 3 is often large, so a sampler based on deep reinforcement learning is designed to sample them and obtain the knowledge entities best suited to answering the patient's question. The state, action, reward, and optimization of the deep reinforcement learning are modeled as follows:

State: the matched knowledge entities are encoded with BERT to obtain their hidden representations, which are then average-pooled to form the current state representation. Mathematically, the current state has the form:

$$s = \frac{1}{n}\sum_{k=1}^{n} \mathrm{BERT}(e_k)$$

where s is the current state, n is the number of matched knowledge entities, and $e_k$ is the k-th knowledge entity.
Action: to determine the probability of selecting each knowledge entity, the invention employs a three-layer multi-layer perceptron (MLP) as the policy network. The input layer of the MLP contains 768 neurons, matching the size of the state representation. The output layer contains n neurons with a softmax activation, where n is the number of matched knowledge entities; each output neuron gives the probability of selecting the corresponding knowledge entity. Mathematically:
p=softmax(MLP(s))
where s is the current state and $p[k] \in p$ is the probability of selecting the k-th knowledge entity. The policy network then samples each knowledge entity according to $p[k]$. The action for the k-th entity is denoted $a[k] \in \{0,1\}$, drawn with probability $p[k]$: $a[k]=1$ means the k-th knowledge entity is selected, and $a[k]=0$ means it is not. The joint probability density of the policy network's output can therefore be expressed as:

$$\pi_\theta(a \mid s) = \prod_{k=1}^{n} p[k]^{a[k]}\,\bigl(1-p[k]\bigr)^{1-a[k]}$$

where θ denotes the parameters of the policy network.
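A minimal pure-Python sketch of this action step, with the MLP replaced by a stand-in logits vector (an assumption for brevity; the patent computes logits from the state via the policy network):

```python
import math
import random

def softmax(logits):
    m = max(logits)                          # subtract max for stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def sample_actions(p, rng):
    # each entity k is independently kept with probability p[k]
    return [1 if rng.random() < pk else 0 for pk in p]

def joint_prob(p, a):
    # product of p[k]^a[k] * (1 - p[k])^(1 - a[k])
    prob = 1.0
    for pk, ak in zip(p, a):
        prob *= pk if ak == 1 else (1.0 - pk)
    return prob

rng = random.Random(0)
p = softmax([2.0, 1.0, 0.1])   # stand-in for MLP(s)
a = sample_actions(p, rng)
assert abs(sum(p) - 1.0) < 1e-9
print(p, a, joint_prob(p, a))
```

The joint probability is exactly the product form given above, which is what makes the log-probability (and hence the policy gradient) easy to compute.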
Reward: the loss value of the large language model is used to construct the reward function, defined as:

$$r = -L_{llm} + c$$

where r is the reward, $L_{llm}$ is the loss of the large language model (defined in the fine-tuning step below), and c is a hyperparameter.
Optimization: the goal is to optimize the parameter θ by maximizing the expected cumulative reward over all possible policy trajectories. The optimal θ can be expressed as:

$$\theta^{*} = \arg\max_{\theta}\; J(\theta)$$

where τ is a policy trajectory consisting of states s, actions a, and rewards r. The expected cumulative reward of a trajectory can be expressed as:

$$J(\theta) = \mathbb{E}_{\tau \sim \pi_\theta}\left[\sum_{m=1}^{B} r_m\right]$$

where B is the total number of states and $r_m$ is the reward obtained by taking action $a_m$ in the m-th state $s_m$.
The invention uses gradient-based updates of the parameter θ to maximize J(θ). To prevent excessively large updates caused by large reward values, a baseline is subtracted during the update, giving the gradient:

$$\nabla_\theta J(\theta) = \mathbb{E}\left[\sum_{m=1}^{B} A_m\, \nabla_\theta \log \pi_\theta(a_m \mid s_m)\right]$$

where $A_m = r_m - \bar{r}_m$ is the advantage in reinforcement learning, and $\bar{r}_m$ is the reward expectation, computable from $s_m$ and $a_m$. The action $a_m$ is drawn directly from the output policy distribution p:

$$a_m \sim \pi_\theta(\cdot \mid s_m)$$
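The update can be illustrated with a toy REINFORCE-style step. For a Bernoulli selection policy, the score function has the closed form d log π / d p_k = a_k/p_k - (1 - a_k)/(1 - p_k); the sketch below applies it directly to the probabilities, which is a simplification (the patent updates the MLP parameters θ, not p itself):

```python
# One advantage-weighted policy-gradient step for a Bernoulli selection
# policy (illustrative only, not the patent's network).
def policy_gradient_step(p, a, reward, baseline, lr=0.01):
    advantage = reward - baseline              # A_m = r_m - baseline
    new_p = []
    for pk, ak in zip(p, a):
        score = ak / pk - (1 - ak) / (1 - pk)  # d log pi / d p_k
        pk = pk + lr * advantage * score       # gradient ascent on J
        new_p.append(min(max(pk, 1e-6), 1 - 1e-6))  # keep a valid probability
    return new_p

p = [0.5, 0.5]
# the action kept entity 0 only, and the reward beat the baseline:
print(policy_gradient_step(p, [1, 0], reward=1.0, baseline=0.2))
```

With a positive advantage, the probability of the taken action (keep entity 0) rises and that of the untaken one falls, which is exactly the direction the gradient formula prescribes.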
and 5, fine tuning the large language model. By using the medical named entity recognition network and the knowledge entity sampling network, i.e. step 2, step 3 and step 4, the knowledge entity most suitable for answering the patient's questions can be obtained. The invention uses the sampled knowledge entity and the original patient questions as the input of the large language model for fine adjustment, thus realizing accurate answer to the patient questions. The invention adopts ChatGLM-6B as a basic model. The ChatGLM-6B model is an open bilingual language model based on the Generic Language Model (GLM) framework, with 62 billion parameters. The invention adopts a parameter adjustment (p-turn) technique. The technique only trims 0.1% of the parameters and can achieve good performance. The loss function of the ChatGLM-6B fine tuning network is defined as:
where cross sentropy represents the cross entropy loss function, z represents the actual answer,representing the answer generated by ChatGLM-6B.
Step 6: question-answer generation. Answers are generated with the model parameters obtained from the fine-tuning in step 5.
Step 7: testing the model's effectiveness.
In order to test the model effect, 80% of patient-doctor data are randomly selected as a training set, the rest 20% of data are selected as a test set, and 10% of data are randomly selected from the training set as a verification set to adjust the model super-parameters. The test model of the invention is used for testing the performance of bilingual evaluation auxiliary indexes (BLEU), retrospective oriented generalized evaluation auxiliary indexes (ROUGE) and other indexes. The calculation formula of these indexes is as follows:
wherein BP is a brevity penalty term that penalizes generated text shorter than the reference text, p_n measures the proportion of n-grams in the generated text that also appear in the reference text, N is the maximum n-gram length, G is the generated sentence, S is the reference sentence, Count_S(w) represents the number of occurrences of word w in the reference sentence S, Count_G(w) represents the number of occurrences of word w in the generated sentence G, and β² is a constant, typically set to 1.2.
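A minimal sketch of a BLEU-style score along these lines is shown below: clipped n-gram precision combined with a brevity penalty, for a single sentence pair. It is a simplified illustration rather than the exact formula above, and the example sentences are hypothetical:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(generated, reference, max_n=2):
    """Sentence-level BLEU sketch: geometric mean of clipped n-gram
    precisions p_n, multiplied by the brevity penalty BP."""
    log_p = 0.0
    for n in range(1, max_n + 1):
        gen_counts = Counter(ngrams(generated, n))
        ref_counts = Counter(ngrams(reference, n))
        # Clip each n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in gen_counts.items())
        total = max(sum(gen_counts.values()), 1)
        log_p += math.log(max(clipped, 1e-9) / total) / max_n
    # Brevity penalty: penalize generated text shorter than the reference.
    bp = 1.0 if len(generated) >= len(reference) else \
        math.exp(1 - len(reference) / max(len(generated), 1))
    return bp * math.exp(log_p)

reference = "take the medicine twice daily after meals".split()
perfect = bleu(reference, reference)            # identical sentences → 1.0
partial = bleu("take medicine".split(), reference)
```

ROUGE differs mainly in being recall-oriented: it counts overlapping units relative to the reference length rather than the generated length.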
The effects of the present invention will be described in further detail with reference to simulation experiments.
1. Simulation conditions and parameter settings:
the simulation experiments of the invention are carried out on a simulation platform of Python 3.9.0, PyTorch 1.11 and CUDA 11.3. The CPU of the computer is an Intel Core i9-12900K, and the GPU is an NVIDIA GeForce RTX 3090. The learning rate is set to 2e-5.
2. The simulation content:
fig. 2 shows the performance comparison of the technical solution of the present invention with other fine-tuned large language models in the medical field on the Bilingual Evaluation Understudy (BLEU) and Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metrics. The abscissa shows the different technical schemes. In fig. 2(a) the ordinate is BLEU, and in fig. 2(b) the ordinate is ROUGE. The comparison shows that the answers generated by the method are closer to the level of a real doctor.
FIG. 3 shows a comparison of the technical solution of the present invention with a general-domain large language model on a question-answer example. The comparison shows that the answers generated by the method are more specific, while the answers generated by other large language models are more vague.
In summary of the simulation results and analysis, the knowledge-graph-based data knowledge dual-driven intelligent medical dialogue system provided by the invention achieves better performance on the Bilingual Evaluation Understudy (BLEU) and Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metrics. Compared with question-answer examples of other large language models, the answers generated by the method are more targeted. This shows that the answers generated by the method of the present invention are closer to the level of a real doctor, so that the invention can be better applied in actual medical dialogue scenarios.
It is to be understood that the above-described embodiments of the present invention are merely illustrative of or explanation of the principles of the present invention and are in no way limiting of the invention. Accordingly, any modification, equivalent replacement, improvement, etc. made without departing from the spirit and scope of the present invention should be included in the scope of the present invention. Furthermore, the appended claims are intended to cover all such changes and modifications that fall within the scope and boundary of the appended claims, or equivalents of such scope and boundary.
Claims (9)
1. The data knowledge double-drive intelligent medical dialogue system based on the knowledge graph is characterized by comprising a medical named entity recognition module, a medical knowledge graph matching module, a knowledge entity sampling module, a large language model fine tuning module and a dialogue generating module;
the medical named entity recognition module is used for extracting medical named entities in the patient's questions, including disease names, body parts, medical procedures, drugs and departments, and is the basis of the subsequent medical knowledge graph matching module;
the medical knowledge graph matching module is used for matching the extracted medical named entities, such as disease names and medical procedures, with the nodes in the medical knowledge graph so as to obtain relevant professional background knowledge;
the number of knowledge entities obtained by the knowledge graph matching module is huge and includes knowledge entities unrelated to the question raised by the patient; the knowledge entity sampling module samples the knowledge entities obtained by the medical knowledge graph matching module to obtain the knowledge entities best able to answer the patient's question;
the large language model fine tuning module inputs the questions raised by the patient and the knowledge entity obtained by the knowledge entity sampling module into the large language model for fine tuning training, so that the large language model can generate answers close to the level of human doctors;
the dialogue generating module is the output module of the system; using the model trained by the large language model fine-tuning module, it outputs the generated answer for the input patient question and the related knowledge entities.
2. The method for realizing the data knowledge dual-drive intelligent medical dialogue system based on the knowledge graph is characterized by comprising the following steps of:
step 1: acquiring a data set and preparing a related knowledge graph;
acquiring a patient-doctor question and answer data set and a medical knowledge graph;
step 2: medical named entity identification;
extracting the medical named entities in the patient questions by using a BERT-BiLSTM-CRF model;
step 3: medical knowledge graph matching;
matching the extracted medical named entity with head entities of the triples in the medical knowledge graph, and taking tail entities corresponding to all the head entities as knowledge entities if the matching is successful;
step 4: sampling a knowledge entity;
sampling the knowledge entity matched in the step 3 by using a sampler based on deep reinforcement learning to obtain the knowledge entity most suitable for answering the patient questions;
step 5: fine tuning of a large language model;
inputting the knowledge entity obtained by sampling in the step 4 and the original problem of the patient into a large language model for fine tuning training;
step 6: generating questions and answers;
generating an answer according to the model parameters obtained by the fine tuning training in the step 5;
step 7: testing the model effect;
the quality of the model generated answer is verified on the verification set.
3. The method for implementing a knowledge-based data knowledge dual-driven intelligent medical dialogue system according to claim 2, wherein the step 1 comprises: a patient-physician session dataset and a medical knowledge-graph are collected.
4. The method for implementing a knowledge-graph-based data knowledge dual-driven intelligent medical dialogue system according to claim 2, wherein the step 2 comprises: after the patient-physician dialogue dataset and the medical knowledge graph are acquired according to step 1, to achieve character-level classification, each character of the sentence is first embedded into a continuous latent space, using the combination of BERT and LSTM as the encoder; BERT utilizes a bidirectional Transformer architecture that is pre-trained on large amounts of unannotated text data using two unsupervised learning objectives, masked language modeling and next-sentence prediction; this pre-training enables BERT to learn rich word and sentence context representations and to be fine-tuned for the downstream named entity recognition task; in addition to BERT, the encoder architecture uses LSTM as a sequential modeling component; the LSTM captures dependencies across the whole sequence, and a bidirectional LSTM is used to encode the forward and backward context of each token, enabling the model to use past and future information in the classification task; a Conditional Random Field (CRF) layer is added on top of the BERT-LSTM encoder; the CRF layer models sequential dependencies between label assignments using transition probabilities between labels, receives the encoded feature sequence obtained from the BERT-LSTM encoder as input, and outputs a label corresponding to each token in the input sentence;
assuming q is the input sequence, the predicted tag sequence ŷ is expressed as:
the loss function of the medical named entity recognition network is defined as:
wherein N represents the number of samples, M represents the number of categories, y_ij indicates whether the true label of sample i is category j, and ŷ_ij represents the probability that the model predicts sample i as category j.
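The role of the CRF layer described in this claim — choosing the tag sequence that maximizes the emission scores from the encoder plus the tag-to-tag transition scores — can be illustrated with a small Viterbi decoder. The tag set and all score values below are hypothetical toy numbers, not outputs of the actual BERT-BiLSTM encoder:

```python
def viterbi_decode(emissions, transitions):
    """CRF-style decoding: find the tag sequence that maximizes the sum of
    per-token emission scores and tag-to-tag transition scores.
    emissions: [T][K] encoder scores for each of T tokens over K tags;
    transitions: [K][K] score of moving from tag i to tag j."""
    T, K = len(emissions), len(emissions[0])
    score = list(emissions[0])          # best score ending in each tag
    back = []                           # backpointers per step
    for t in range(1, T):
        new_score, ptr = [], []
        for j in range(K):
            best_i = max(range(K), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emissions[t][j])
            ptr.append(best_i)
        score = new_score
        back.append(ptr)
    # Trace the best path backwards through the stored pointers.
    best = max(range(K), key=lambda j: score[j])
    path = [best]
    for ptr in reversed(back):
        best = ptr[best]
        path.append(best)
    return list(reversed(path))

# Hypothetical tags: 0 = O (outside), 1 = B-Disease, 2 = I-Disease.
transitions = [
    [0.0, 0.0, -10.0],   # O -> I is strongly penalized (invalid BIO move)
    [0.0, 0.0, 1.0],     # B -> I is favored
    [0.0, 0.0, 1.0],     # I -> I is favored
]
emissions = [
    [1.0, 2.0, 1.5],     # token 1: encoder favors B-Disease
    [1.0, 0.0, 1.2],     # token 2: ambiguous between O and I-Disease
    [2.0, 0.0, 0.0],     # token 3: clearly O
]
best_path = viterbi_decode(emissions, transitions)  # → [1, 2, 0] (B, I, O)
```

The transition matrix is what lets the CRF rule out invalid label sequences (such as an I tag with no preceding B) even when the per-token scores are ambiguous.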
5. The method for implementing a knowledge-graph-based data knowledge dual-driven intelligent medical dialogue system according to claim 2, wherein the step 3 comprises: matching the medical named entity extracted in step 2 with the head entities of the triples in the medical knowledge graph, and if the medical named entity is consistent with a head entity of a triple in the knowledge graph, taking the corresponding tail entity as a knowledge entity.
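The head-entity matching in this step amounts to collecting the tail entities of all triples whose head equals an extracted named entity. A minimal sketch follows; the entity names and triples form a hypothetical mini knowledge graph, not the actual one used by the invention:

```python
def match_knowledge_entities(named_entities, triples):
    """Collect tail entities of all (head, relation, tail) triples whose
    head entity matches one of the extracted medical named entities."""
    return [tail for head, relation, tail in triples if head in named_entities]

# Hypothetical mini medical knowledge graph as (head, relation, tail) triples.
triples = [
    ("diabetes", "common_symptom", "excessive thirst"),
    ("diabetes", "recommended_drug", "metformin"),
    ("hypertension", "common_symptom", "headache"),
]
matched = match_knowledge_entities({"diabetes"}, triples)
# → ["excessive thirst", "metformin"]
```

As the claim notes, this matching is exhaustive and typically over-generates, which is why the sampling step of claim 6 is needed to keep only the entities relevant to the patient's question.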
6. The method for implementing a knowledge-based data knowledge dual-driven intelligent medical dialogue system according to claim 2, wherein the step 4 comprises: sampling the knowledge entity matched in the step 3 by using a sampler based on deep reinforcement learning to obtain the knowledge entity most suitable for answering the patient questions, wherein the state, action, rewards and optimization modeling of the deep reinforcement learning comprises:
status: according to the matched knowledge entities, all the knowledge entities are encoded using BERT to obtain their hidden representations, which are then subjected to an averaging pooling operation to obtain a current state representation in the form of:
where s represents the current state, n is the number of matched knowledge entities, and e_k represents the k-th knowledge entity;
action: to determine the probability of selecting each knowledge entity, a three-layer multi-layer perceptron (MLP) is employed as the policy network; the input layer of the MLP contains 768 neurons, corresponding to the size of the state representation; the output layer contains n neurons and is activated using a softmax function, where n is the number of matched knowledge entities; each neuron in the output layer represents the probability of selecting the corresponding knowledge entity, expressed as:
p=softmax(MLP(s))
where s is the current state and p[k] ∈ p represents the probability of selecting the k-th knowledge entity; the policy network samples knowledge entities according to p[k]; the action corresponding to the k-th entity is denoted a[k] ∈ {0, 1}, where a[k] = 1 denotes selecting the k-th knowledge entity and a[k] = 0 denotes not selecting it; the joint probability density function of the policy network output is expressed as:
wherein θ is a parameter of the policy network;
reward: a reward function is constructed using the loss value of the large language model, defined as:
r=-L llm +c
where r is the reward, L_llm is the loss of the large language model, and c is a hyperparameter;
optimization: the goal is to optimize the parameter θ by maximizing the expected cumulative reward over all possible policy trajectories; the optimal θ is expressed as:
where τ is a policy trajectory consisting of states s, actions a and rewards r;
the expected cumulative reward of a trajectory is expressed as:
where B is the total number of states, and r_m is the reward obtained by taking action a_m in the m-th state s_m;
the parameter θ is updated using gradient-based optimization to maximize J(θ); in order to prevent large reward values from over-updating the network, a baseline value is subtracted during the update, and the gradient is expressed as:
wherein A_m = r_m − r̄_m is the advantage in reinforcement learning, r̄_m is the reward expectation, calculated from s_m and a_m, and the action a_m directly selects a knowledge entity according to the output policy distribution p; a_m is expressed as:
7. The method for implementing a knowledge-graph-based data knowledge dual-driven intelligent medical dialogue system according to claim 2, wherein the step 5 comprises: the sampled knowledge entities and the original patient question are used as the input of the large language model for fine-tuning, achieving accurate answers to the patient's questions; ChatGLM-6B, an open bilingual language model based on the General Language Model (GLM) framework with 6.2 billion parameters, is used as the base model; the P-Tuning technique is adopted; and the loss function of the ChatGLM-6B fine-tuning network is defined as:
where CrossEntropy represents the cross-entropy loss function, z represents the ground-truth answer, and ẑ represents the answer generated by ChatGLM-6B.
8. The method for implementing a knowledge-based data knowledge dual-driven intelligent medical dialogue system according to claim 2, wherein the step 6 comprises: and (5) generating an answer according to the model parameters obtained by the fine tuning training in the step (5).
9. The method for implementing a knowledge-graph-based data knowledge dual-driven intelligent medical dialogue system according to claim 2, wherein the step 7 comprises: randomly selecting 80% of the patient-doctor data as the training set and the remaining 20% as the test set, and randomly selecting 10% of the data from the training set as a validation set to tune the model hyperparameters; the model is evaluated on the Bilingual Evaluation Understudy (BLEU) and Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metrics, whose calculation formulas are as follows:
wherein BP is a brevity penalty term that penalizes generated text shorter than the reference text, p_n measures the proportion of n-grams in the generated text that also appear in the reference text, N is the maximum n-gram length, G is the generated sentence, S is the reference sentence, Count_S(w) represents the number of occurrences of word w in the reference sentence S, Count_G(w) represents the number of occurrences of word w in the generated sentence G, and β² is a constant set to 1.2.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310829332.XA CN117077786A (en) | 2023-07-07 | 2023-07-07 | Knowledge graph-based data knowledge dual-drive intelligent medical dialogue system and method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310829332.XA CN117077786A (en) | 2023-07-07 | 2023-07-07 | Knowledge graph-based data knowledge dual-drive intelligent medical dialogue system and method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117077786A true CN117077786A (en) | 2023-11-17 |
Family
ID=88715994
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310829332.XA Pending CN117077786A (en) | 2023-07-07 | 2023-07-07 | Knowledge graph-based data knowledge dual-drive intelligent medical dialogue system and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117077786A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117520508A (en) * | 2023-11-20 | 2024-02-06 | 广州方舟信息科技有限公司 | Medical dialogue answer generation method, device, electronic equipment and storage medium |
CN117573843A (en) * | 2024-01-15 | 2024-02-20 | 图灵人工智能研究院(南京)有限公司 | Knowledge calibration and retrieval enhancement-based medical auxiliary question-answering method and system |
CN117709441A (en) * | 2024-02-06 | 2024-03-15 | 云南联合视觉科技有限公司 | Method for training professional medical large model through gradual migration field |
CN117933364A (en) * | 2024-03-20 | 2024-04-26 | 烟台海颐软件股份有限公司 | Power industry model training method based on cross-language knowledge migration and experience driving |
CN117995426A (en) * | 2024-04-07 | 2024-05-07 | 北京惠每云科技有限公司 | Medical knowledge graph construction method and device, electronic equipment and storage medium |
CN118116620A (en) * | 2024-04-28 | 2024-05-31 | 支付宝(杭州)信息技术有限公司 | Medical question answering method and device and electronic equipment |
CN118133883A (en) * | 2024-05-06 | 2024-06-04 | 杭州海康威视数字技术股份有限公司 | Graph sampling method, graph prediction method, and storage medium |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117520508A (en) * | 2023-11-20 | 2024-02-06 | 广州方舟信息科技有限公司 | Medical dialogue answer generation method, device, electronic equipment and storage medium |
CN117520508B (en) * | 2023-11-20 | 2024-05-28 | 广州方舟信息科技有限公司 | Medical dialogue answer generation method, device, electronic equipment and storage medium |
CN117573843A (en) * | 2024-01-15 | 2024-02-20 | 图灵人工智能研究院(南京)有限公司 | Knowledge calibration and retrieval enhancement-based medical auxiliary question-answering method and system |
CN117573843B (en) * | 2024-01-15 | 2024-04-02 | 图灵人工智能研究院(南京)有限公司 | Knowledge calibration and retrieval enhancement-based medical auxiliary question-answering method and system |
CN117709441A (en) * | 2024-02-06 | 2024-03-15 | 云南联合视觉科技有限公司 | Method for training professional medical large model through gradual migration field |
CN117709441B (en) * | 2024-02-06 | 2024-05-03 | 云南联合视觉科技有限公司 | Method for training professional medical large model through gradual migration field |
CN117933364A (en) * | 2024-03-20 | 2024-04-26 | 烟台海颐软件股份有限公司 | Power industry model training method based on cross-language knowledge migration and experience driving |
CN117933364B (en) * | 2024-03-20 | 2024-06-04 | 烟台海颐软件股份有限公司 | Power industry model training method based on cross-language knowledge migration and experience driving |
CN117995426A (en) * | 2024-04-07 | 2024-05-07 | 北京惠每云科技有限公司 | Medical knowledge graph construction method and device, electronic equipment and storage medium |
CN118116620A (en) * | 2024-04-28 | 2024-05-31 | 支付宝(杭州)信息技术有限公司 | Medical question answering method and device and electronic equipment |
CN118133883A (en) * | 2024-05-06 | 2024-06-04 | 杭州海康威视数字技术股份有限公司 | Graph sampling method, graph prediction method, and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN117077786A (en) | Knowledge graph-based data knowledge dual-drive intelligent medical dialogue system and method | |
Van Aken et al. | Clinical outcome prediction from admission notes using self-supervised knowledge integration | |
CN110705293A (en) | Electronic medical record text named entity recognition method based on pre-training language model | |
Liu et al. | Medical-vlbert: Medical visual language bert for covid-19 ct report generation with alternate learning | |
CN108628824A (en) | A kind of entity recognition method based on Chinese electronic health record | |
CN109378066A (en) | A kind of control method and control device for realizing disease forecasting based on feature vector | |
CN113688248B (en) | Medical event identification method and system under condition of small sample weak labeling | |
CN109003677B (en) | Structured analysis processing method for medical record data | |
CN112420191A (en) | Traditional Chinese medicine auxiliary decision making system and method | |
CN111651991A (en) | Medical named entity identification method utilizing multi-model fusion strategy | |
CN115293128A (en) | Model training method and system based on multi-modal contrast learning radiology report generation | |
CN112182168B (en) | Medical record text analysis method and device, electronic equipment and storage medium | |
Colla et al. | Semantic coherence markers: The contribution of perplexity metrics | |
Hsu et al. | Multi-label classification of ICD coding using deep learning | |
CN114417836A (en) | Deep learning-based Chinese electronic medical record text semantic segmentation method | |
Melnyk et al. | Generative artificial intelligence terminology: a primer for clinicians and medical researchers | |
CN116403706A (en) | Diabetes prediction method integrating knowledge expansion and convolutional neural network | |
CN111222325A (en) | Medical semantic labeling method and system of bidirectional stack type recurrent neural network | |
Zhang et al. | Bert with enhanced layer for assistant diagnosis based on Chinese obstetric EMRs | |
Mou et al. | Named entity recognition based on transformer encoder in the medical field | |
CN113643825A (en) | Medical case knowledge base construction method and system based on clinical key characteristic information | |
Dao et al. | Patient Similarity using Electronic Health Records and Self-supervised Learning | |
Jiang et al. | Dual memory network for medical dialogue generation | |
Kong et al. | TCM disease diagnosis based on convolutional cyclic neural network algorithm | |
CN117194604B (en) | Intelligent medical patient inquiry corpus construction method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination |