WO2023124837A1 - Medical inquiry processing method, apparatus, device, and storage medium - Google Patents

Medical inquiry processing method, apparatus, device, and storage medium

Info

Publication number
WO2023124837A1
Authority
WO
WIPO (PCT)
Prior art keywords
candidate
symptom
diagnosis
information
disease
Prior art date
Application number
PCT/CN2022/137008
Other languages
English (en)
French (fr)
Inventor
黄亮
郭旭炀
刘慧�
康西龙
李鑫
Original Assignee
北京京东拓先科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京京东拓先科技有限公司
Publication of WO2023124837A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

Definitions

  • the present application relates to the technical field of artificial intelligence, and in particular to a method, device, equipment and storage medium for medical inquiry processing.
  • Inquiry is a diagnostic method in which the patient or the patient's companions are asked about the disease and the patient's subjective symptoms, in order to understand the patient's discomforts and the onset, development, diagnosis, and treatment of the disease.
  • users can interact with the consultation system through smart terminals to obtain preliminary diagnosis and treatment results and suggestions for diagnosis and treatment given by the system.
  • Embodiments of the present application provide a method, device, equipment, and storage medium for medical inquiry processing, wherein:
  • the first aspect of the embodiments of the present application provides a method for processing medical inquiries, including:
  • the symptom information including at least one symptom data input by a user
  • the first candidate disease set includes a plurality of candidate diseases
  • the first candidate symptom set includes all candidate symptoms of the plurality of candidate diseases
  • determining, by probabilistic analysis, diagnosis and treatment information corresponding to the symptom information as of the current inquiry round from the first candidate disease set, the diagnosis and treatment information being used to indicate at least one candidate disease;
  • according to any one of a preset number of inquiry rounds, the information entropy of the at least one candidate disease in the diagnosis and treatment information, and a pre-trained diagnosis decision model, determining whether to output the diagnosis and treatment information or to perform a next round of inquiry.
  • the determining the diagnosis and treatment information corresponding to the symptom information up to the current inquiry round from the first candidate disease set by probabilistic analysis includes:
  • the diagnosis and treatment information includes a preset number of candidate diseases, taken in descending order of score from the plurality of candidate diseases.
  • the obtaining the score of each candidate disease in the first candidate disease set includes:
  • the first candidate disease is any one of the plurality of candidate diseases
  • the score of the first candidate disease is determined according to the contribution of all candidate symptoms of the first candidate disease to the first candidate disease.
  • the determining the score of the first candidate disease according to the contribution of all candidate symptoms of the first candidate disease to the first candidate disease includes:
  • the score of the first candidate disease is determined according to the contribution of all candidate symptoms of the first candidate disease to the first candidate disease and noise parameters.
  • determining to output the diagnosis and treatment information or to perform a next round of inquiry includes:
  • determining, according to the pre-trained diagnosis decision model, to output the diagnosis and treatment information or to conduct the next round of inquiry includes:
  • the diagnosis decision model is obtained by training a fully connected neural network on a plurality of sample sequences using a reinforcement learning algorithm; each sample sequence includes at least one symptom data and a decision result, and the decision result indicates whether to output the diagnosis and treatment information or to proceed to the next round of inquiry.
  • the determining to output the diagnosis and treatment information or to conduct the next round of inquiry according to the output value of the diagnosis decision model includes:
  • if the diagnosis decision model outputs the first value, determining to output the diagnosis and treatment information
  • if the diagnosis decision model outputs the second value, determining to perform a next round of inquiry.
  • the method further includes:
  • the inquiry information is used to inquire whether the user has the target inquiry symptoms
  • the determining the target query symptom from the first candidate symptom set includes at least one of the following:
  • the candidate symptom with the fastest decrease in the overall information entropy of the first candidate disease set is used as the target query symptom.
  • the overall information entropy of the first candidate disease set with respect to a first candidate symptom in the first candidate symptom set is determined by the ratio of the sum of scores of all candidate diseases in a second candidate disease set in the next round of inquiry to the sum of scores of all candidate diseases in the first candidate disease set;
  • the second candidate disease set is determined according to the first candidate disease set and the first candidate symptom, and the first candidate symptom is any candidate symptom in the first candidate symptom set.
  • the consultation knowledge map includes nodes for diseases, symptoms, disease causes, disease department information, differential symptoms, complications, and course of disease;
  • the acquiring the first candidate disease set corresponding to the symptom information based on the preset consultation knowledge map includes: obtaining, according to the at least one symptom data, the candidate diseases corresponding to each symptom data from the consultation knowledge map, to obtain the first candidate disease set.
  • the second aspect of the embodiment of the present application provides a medical inquiry processing device, including:
  • a receiving module configured to receive symptom information from the client, where the symptom information includes at least one symptom data input by the user;
  • an acquisition module configured to acquire a first candidate disease set and a first candidate symptom set corresponding to the symptom information based on a preset consultation knowledge map, where the first candidate disease set includes a plurality of candidate diseases and the first candidate symptom set includes all candidate symptoms of the plurality of candidate diseases;
  • a processing module configured to determine, through probabilistic analysis, the diagnosis and treatment information corresponding to the symptom information as of the current inquiry round from the first candidate disease set, the diagnosis and treatment information being used to indicate at least one candidate disease;
  • according to any one of a preset number of inquiry rounds, the information entropy of the at least one candidate disease in the diagnosis and treatment information, and a pre-trained diagnosis decision model, it is determined whether to output the diagnosis and treatment information or to perform a next round of inquiry.
  • a third aspect of the embodiments of the present application provides an electronic device, including: a memory, a processor, and a computer program; wherein the computer program is stored in the memory and is configured to be executed by the processor to implement the method of any one of the first aspect.
  • a fourth aspect of the embodiments of the present application provides a computer-readable storage medium, on which a computer program is stored, and the computer program is executed by a processor to implement the method according to any one of the first aspect.
  • a fifth aspect of the embodiments of the present application provides a computer program product, including a computer program, and when the computer program is executed by a processor, the method described in any one of the first aspects is implemented.
  • FIG. 1 is a schematic diagram of a scenario of a medical inquiry processing method provided in an embodiment of the present application.
  • FIG. 2 is a first interaction diagram of the inquiry processing method provided by the embodiment of the present application.
  • FIG. 3 is a second interaction diagram of the inquiry processing method provided by the embodiment of the present application.
  • FIG. 4 is a first structural diagram of the medical inquiry processing device provided by the embodiment of the present application.
  • FIG. 5 is a second structural diagram of the medical inquiry processing device provided by the embodiment of the present application.
  • FIG. 6 is a hardware structural diagram of an electronic device provided by an embodiment of the present application.
  • the term "corresponding" may indicate a direct or indirect correspondence between two items, an association between them, or a relationship such as indicating and being indicated, or configuring and being configured.
  • Knowledge map: combines the theories and methods of applied mathematics, graphics, information visualization technology, information science, and other disciplines with methods such as bibliometric citation analysis and co-occurrence analysis, and uses visual maps to vividly display the core structure, development history, frontier fields, and overall knowledge structure of a subject; it is a modern theory that achieves multidisciplinary integration. It can provide practical and valuable references for subject research.
  • Named entity generally refers to an entity with specific meaning or strong reference in the text.
  • entities involved in the medical field include names of people, department names, dates and times, and medical proper nouns (such as disease names, abbreviations, and treatments).
  • Named Entity Recognition (NER).
  • Relation extraction: used to judge whether a relationship exists between two entities in a sentence, and the type of that relationship. For example, in "I don't have a fever", "not" is a negation word, "fever" is a symptom word, and the two stand in a modification relationship.
  • Natural language processing (NLP) algorithms include lexical analysis, syntactic analysis, semantic analysis, etc.
  • lexical analysis includes word segmentation, part-of-speech tagging, entity recognition, spelling check, etc.
  • the basic task of syntactic analysis is to determine the syntactic structure of a sentence or the dependency relationship between words in a sentence.
  • Semantic analysis mainly includes semantic disambiguation and semantic representation.
  • Deep Q Network is a deep reinforcement learning model.
  • Information entropy: a concept borrowed from thermodynamics by C. E. Shannon to solve the problem of quantitatively measuring information.
  • Entropy in thermodynamics is a physical quantity that represents the degree of disorder of molecular states, and Shannon uses the concept of information entropy to describe the uncertainty of information. The greater the information entropy, the greater the uncertainty and the smaller the probability.
  • the following two interactive consultation methods are generally used to determine the diagnosis and treatment information: first, relying mainly on the doctor's personal experience, obtaining the user's symptom information through online or offline inquiry and determining the diagnosis and treatment information; second, using questionnaire scales designed in advance by experts, with corresponding jump paths, to determine the diagnosis and treatment information.
  • the first method above uses purely manual consultation, which is time-consuming and labor-intensive, and wastes a lot of high-quality medical resources.
  • the second method above adopts existing scales and answer-sheet templates.
  • its inquiry path is the same for every user, it covers only a limited number of departments, and it relies heavily on the quality of the templates customized by medical experts.
  • the embodiment of the present application proposes a method of consultation and processing.
  • the main idea of the invention is as follows. First, a complete consultation knowledge map is constructed, covering 200+ common diseases, 400+ symptoms, and nodes such as disease causes, disease department information, differential symptoms, complications, and course of disease; reasoning over the symptoms entered by the user with this knowledge map yields a candidate disease set and a candidate symptom set. Second, the score (i.e. probability value) of each candidate disease in the candidate disease set is determined through a probabilistic graphical model, and a preset number of candidate diseases, taken in descending order of score, are obtained as the diseases the user most likely has. Finally, preset rules or a reinforcement learning algorithm are used to judge whether the consultation should end.
  • the consultation knowledge map in the above scheme covers multi-disciplinary data in the medical field. Obtaining candidate diseases and candidate symptoms from the consultation knowledge map can effectively reduce the misdiagnosis rate of the consultation system and improve the accuracy of the diagnosis and treatment information.
  • FIG. 1 is a schematic diagram of a scenario of a method for processing a medical inquiry provided by an embodiment of the present application.
  • this scenario includes a first terminal device 11, a second terminal device 12, and a consultation server 13 (also referred to as a consultation service terminal or a consultation platform).
  • the first terminal device 11 and the second terminal device 12 are each communicatively connected to the consultation server 13.
  • the application program (APP) of the consultation server 13 is pre-installed on the first terminal device 11 and the second terminal device 12, and users of the first terminal device 11 or the second terminal device 12 can access the consultation server 13 through this application program.
  • users using the first terminal device 11 or the second terminal device 12 can also access the consultation service terminal 13 through channels such as webpages and applets.
  • the first terminal device 11 may be a patient terminal device, such as a patient's smart phone, tablet computer, notebook computer, desktop computer, or other terminal device, or, for example, a fixed or mobile terminal set up in a hospital public area (such as a fixed or mobile intelligent robot).
  • the second terminal device 12 may be a terminal device at the doctor's end, such as a doctor's smart phone, tablet computer, notebook computer, desktop computer and other terminal devices.
  • the doctor can access the consultation service terminal 13 through the second terminal device 12, and obtain the diagnosis and treatment information given by the consultation service terminal 13, which can be used to assist the doctor in medical diagnosis.
  • the consultation server 13 has a built-in processing device, and the processing device is used to execute the method steps of the embodiment of the present application.
  • the storage space of the medical consultation server 13 stores a medical consultation knowledge map.
  • the processing device may also be integrated with the intelligent robot, so that the intelligent robot executes the method steps of the embodiments of the present application.
  • the storage space of the intelligent robot stores a consultation knowledge map.
  • the user can directly interact with the intelligent robot through voice or text input to obtain medical information or inquire about information.
  • alternatively, the intelligent robot acts as an ordinary terminal and feeds back diagnosis and treatment information or inquiry information to the user by interacting with the consultation server (considering the large memory space occupied by the knowledge map, the intelligent robot may not store the consultation knowledge map).
  • the client can correspond to any terminal device shown in FIG. 1
  • the consultation server can correspond to the consultation system shown in FIG. 1 .
  • FIG. 2 is the first interactive schematic diagram of the consultation processing method provided by the embodiment of the present application.
  • the inquiry processing method of the present embodiment includes:
  • Step 201 Receive symptom information from a client, where the symptom information includes at least one symptom data input by a user.
  • the user accesses the consultation server through the client and can describe one or more symptoms on the client by voice or text; based on the user's voice or text description, the client sends one or more symptom data to the consultation server.
  • the client can use existing Named Entity Recognition (NER) and Relation Extraction (RE) algorithms to extract the main symptom data described by the user.
  • Step 202 Obtain the first candidate disease set and the first candidate symptom set corresponding to the symptom information based on the preset consultation knowledge map.
  • the first set of candidate diseases includes multiple candidate diseases, and the first set of candidate symptoms includes all candidate symptoms of the multiple candidate diseases.
  • the consultation knowledge map includes nodes for diseases, symptoms, disease causes, disease department information, differential symptoms, complications, and course of disease. Acquiring the first candidate disease set corresponding to the symptom information based on the preset consultation knowledge map specifically includes: according to the at least one symptom data, obtaining the candidate diseases corresponding to each symptom data from the consultation knowledge map, to obtain the first candidate disease set. The first candidate disease set can be denoted as $V_{cand}$.
  • the first candidate symptom set may be denoted as $P_{cand}$.
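  • Step 202 can be illustrated with a minimal Python sketch. The dictionary `DISEASE_SYMPTOMS`, its entries, and the function name `get_candidates` are hypothetical stand-ins for the consultation knowledge map; they are assumptions made for illustration, not the data or interface of the application.

```python
# Hypothetical toy knowledge map: disease -> set of associated symptoms.
DISEASE_SYMPTOMS = {
    "cold": {"runny nose", "cough", "fever"},
    "influenza": {"fever", "cough", "muscle ache"},
    "allergic rhinitis": {"runny nose", "sneezing"},
}

def get_candidates(symptoms, graph=DISEASE_SYMPTOMS):
    """Return (V_cand, P_cand) for the symptoms reported by the user."""
    reported = set(symptoms)
    # A disease is a candidate if it is linked to at least one reported symptom.
    v_cand = {d for d, syms in graph.items() if syms & reported}
    # The candidate symptom set is the union of all symptoms of the candidate diseases.
    p_cand = set().union(*(graph[d] for d in v_cand))
    return v_cand, p_cand

v_cand, p_cand = get_candidates(["runny nose"])
```

  • With the toy data above, reporting only "runny nose" keeps "cold" and "allergic rhinitis" as candidates, and the candidate symptom set collects every symptom linked to either of them.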
  • Step 203 Determine the diagnosis and treatment information corresponding to the symptom information up to the current query round from the first candidate disease set by probabilistic analysis, and the diagnosis and treatment information is used to indicate at least one candidate disease.
  • diagnosis and treatment information can be determined by the following probability analysis method:
  • Step 2031 Obtain the score of each candidate disease in the first candidate disease set, and the score of each candidate disease is used to indicate the probability value of the user having the candidate disease.
  • Step 2032 according to the scores of multiple candidate diseases in the first candidate disease set, determine the diagnosis and treatment information corresponding to the symptom information up to the current inquiry round.
  • the diagnosis and treatment information includes a preset number of candidate diseases with scores ranging from high to low among the plurality of candidate diseases. It should be noted that this embodiment does not specifically limit the preset number, which can be reasonably set according to actual needs.
  • for example, if the preset number is 10, the 10 disease data items with the highest scores are obtained, and the diagnosis and treatment information corresponding to the symptom information as of the current inquiry round includes at least these 10 disease data items (i.e. diagnostic data).
  • diagnosis and treatment information may also include a treatment suggestion for each candidate disease among the preset number of candidate diseases with higher scores.
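  • The selection in step 2032 amounts to a top-k query over the scores. The sketch below assumes the scores are already available as a dictionary; the function name `top_candidates` and the example values are hypothetical.

```python
import heapq

def top_candidates(scores, preset_number=10):
    """Return the `preset_number` highest-scoring (disease, score) pairs, best first."""
    return heapq.nlargest(preset_number, scores.items(), key=lambda kv: kv[1])

# Hypothetical scores for three candidate diseases.
scores = {"cold": 0.82, "influenza": 0.64, "allergic rhinitis": 0.31}
diagnosis = top_candidates(scores, preset_number=2)
```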
  • step 2031 specifically includes: obtaining the score of each candidate disease in the first candidate disease set through a probability graphical model.
  • the probabilistic graphical model may be a noisy-or probabilistic graphical model. It should be noted that this embodiment does not specifically limit the probabilistic graphical model; besides the noisy-or model, other probabilistic graphical models can also be used to determine the score of each candidate disease.
  • the following takes the noisy-or probability graphical model as an example to describe in detail how to obtain the score of each candidate disease in the first candidate disease set.
  • obtaining the score of each candidate disease in the first candidate disease set through a probabilistic graphical model includes the following steps:
  • Step 1 Obtain the contribution of each symptom of the first candidate disease to the first candidate disease.
  • the contribution indicates the sample statistical probability that the first candidate disease is accompanied by the candidate symptom.
  • the first candidate disease is any one of the multiple candidate diseases.
  • the sample statistical probability value is determined based on a large number of statistical samples.
  • the contribution of the j-th symptom ($Sym_j$) to the i-th disease ($Dis_i$) can be determined by the following formula:
  • where #occurrence($Sym_j$) indicates the number of occurrences of the j-th symptom in the statistical sample
  • and #co_occurrence($Dis_i$, $Sym_j$) indicates the number of simultaneous occurrences of the j-th symptom and the i-th disease in the statistical sample.
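  • One reading consistent with the two counts defined above is their ratio; this is an assumption, since the formula itself is not reproduced in this text:

$$\mathrm{contribution}(Dis_i, Sym_j) = \frac{\#\mathrm{co\_occurrence}(Dis_i, Sym_j)}{\#\mathrm{occurrence}(Sym_j)}$$

  • Under that assumption, the contributions can be estimated from a statistical sample as sketched below. The record format `(disease, symptoms)`, the function name `contributions`, and the sample data are hypothetical.

```python
from collections import Counter

def contributions(records):
    """records: iterable of (disease, symptoms) pairs from a statistical sample."""
    occurrence = Counter()     # number of records in which each symptom appears
    co_occurrence = Counter()  # number of records with (disease, symptom) together
    for disease, symptoms in records:
        for s in set(symptoms):
            occurrence[s] += 1
            co_occurrence[(disease, s)] += 1
    # Assumed ratio: co-occurrence count divided by the symptom's occurrence count.
    return {pair: n / occurrence[pair[1]] for pair, n in co_occurrence.items()}

sample = [
    ("cold", ["runny nose", "cough"]),
    ("cold", ["runny nose"]),
    ("allergic rhinitis", ["runny nose", "sneezing"]),
    ("influenza", ["cough", "fever"]),
]
contrib = contributions(sample)
```

  • In this toy sample "runny nose" occurs three times, twice together with "cold", so its contribution to "cold" is 2/3.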
  • Step 2. Determine the score of the first candidate disease according to the contribution of all candidate symptoms of the first candidate disease to the first candidate disease.
  • the score of the first candidate disease is determined according to the contribution of all candidate symptoms of the first candidate disease to the first candidate disease and noise parameters.
  • This step can also be expressed as: input the contribution of all candidate symptoms of the first candidate disease to the noisy-or probability graphical model to obtain the score of the first candidate disease.
  • noisy-or probability graphical model can be expressed by the following formula:
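  • The formula is not reproduced in this text. A commonly used form of the noisy-or model with a noise (leak) term, given here as an assumption rather than as the application's exact expression, is:

$$S_i = 1 - (1 - \epsilon)\prod_{j}\bigl(1 - \mathrm{contribution}(Dis_i, Sym_j)\bigr)$$

  • Here $\epsilon$ is an assumed symbol for the noise parameter, and the product runs over the symptoms taken into account for disease $i$. A minimal sketch of this form follows; the function name `noisy_or_score` and the default noise value are hypothetical.

```python
def noisy_or_score(symptom_contributions, noise=0.01):
    """Leaky noisy-or: 1 - (1 - noise) * prod(1 - c_j) over the given contributions."""
    not_caused = 1.0 - noise
    for c in symptom_contributions:
        # Each symptom independently fails to indicate the disease with probability 1 - c.
        not_caused *= 1.0 - c
    return 1.0 - not_caused

score = noisy_or_score([0.5, 0.5], noise=0.0)
```

  • Each additional symptom with a non-zero contribution can only raise the score, which matches the intuition that more supporting symptoms make a candidate disease more likely.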
  • Step 204 according to any one of the preset number of inquiries, the information entropy of at least one candidate disease in the diagnosis and treatment information, and the pre-trained diagnosis decision model, determine to output the diagnosis and treatment information or conduct the next round of inquiry.
  • determining to output the diagnosis and treatment information may be understood as determining to stop inquiry and output the diagnosis and treatment information. It should be understood that, if it is determined to perform the next round of inquiry, it is not necessary to output the diagnosis and treatment information determined in the current round of inquiry.
  • determining to output the diagnosis and treatment information or to perform the next round of inquiry includes: if the sum of the information entropy of all candidate diseases in the at least one candidate disease is less than a preset threshold, determining to output the diagnosis and treatment information; or, if the sum of the information entropy of all candidate diseases in the at least one candidate disease is greater than or equal to the preset threshold, determining to proceed to the next round of inquiry.
  • the information entropy of each candidate disease can be determined by the following formula:
  • where entropy(v) represents the information entropy of candidate disease v
  • and $S_v$ represents the score of candidate disease v
  • Information entropy is used to describe the uncertainty of information.
  • for example, if the preset number is 10, the information entropy of each of the 10 candidate diseases is determined. If the sum of the information entropy of the 10 candidate diseases is less than the preset threshold, the uncertainty of the 10 candidate disease data determined in step 203 is low (that is, the accuracy is high), so the diagnosis and treatment information including the 10 candidate disease data can be output. If the sum of the information entropy of these 10 candidate diseases is greater than or equal to the preset threshold, the uncertainty of the 10 candidate disease data determined in step 203 is high (that is, the accuracy is low), so it is necessary to proceed to the next round of inquiry.
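  • The entropy formula is not reproduced in this text. The sketch below assumes the usual Shannon term $\mathrm{entropy}(v) = -S_v \log S_v$ and applies the threshold rule described above; the function names and the threshold value are hypothetical.

```python
import math

def entropy(score):
    """Assumed per-disease term -S_v * log(S_v); taken as 0 when S_v is 0."""
    return 0.0 if score <= 0.0 else -score * math.log(score)

def should_output(scores, threshold):
    """Stop and output when the summed entropy is below the preset threshold."""
    return sum(entropy(s) for s in scores) < threshold

confident = should_output([1.0, 0.0], threshold=0.1)   # scores are decisive
uncertain = should_output([0.5, 0.4], threshold=0.1)   # scores are ambiguous
```

  • Scores close to 0 or 1 contribute little entropy, so a decisive score distribution ends the inquiry, while mid-range scores keep it going.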
  • determining to output the diagnosis and treatment information or to conduct the next round of inquiry includes: judging whether the current inquiry round has reached the preset number of inquiry rounds; if the current inquiry round has reached the preset number, determining to output the diagnosis and treatment information; or, if the current inquiry round has not reached the preset number, determining to proceed to the next round of inquiry.
  • for example, when the fifth round is the round at which the preset number of inquiry rounds is reached, the consultation server outputs the diagnosis and treatment information determined in the fifth round. It should be understood that the consultation server determines the diagnosis and treatment information in every round of inquiry, and the accuracy of the diagnosis and treatment information continues to improve as the number of inquiry rounds increases.
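  • The round-limit rule can be sketched as a simple loop. The callback `answer_round`, which stands for one full round of inquiry and scoring, and the default of five rounds are hypothetical.

```python
def run_inquiry(answer_round, max_rounds=5):
    """Repeat inquiry rounds; output the latest diagnosis once max_rounds is reached."""
    diagnosis = None
    for round_no in range(1, max_rounds + 1):
        # Diagnosis and treatment information is re-determined in every round;
        # only the result of the final round is returned.
        diagnosis = answer_round(round_no)
    return diagnosis

result = run_inquiry(lambda n: f"diagnosis after round {n}", max_rounds=5)
```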
  • determining to output the diagnosis and treatment information or to conduct the next round of inquiry includes: inputting the symptom information into the pre-trained diagnosis decision model, and determining, according to the output value of the diagnosis decision model, whether to output the diagnosis and treatment information.
  • if the diagnosis decision model outputs the first value, it is determined to output the diagnosis and treatment information.
  • if the diagnosis decision model outputs the second value, it is determined to perform the next round of inquiry.
  • for example, the first value is 1 and the second value is 0: when the model outputs 1, it is determined to output the diagnosis and treatment information; when the model outputs 0, it is determined to perform the next round of inquiry.
  • specific numerical values of the first value and the second value are not specifically limited, as long as the two decision results can be distinguished.
  • the diagnosis decision model is obtained by training a fully connected neural network on multiple sample sequences using a reinforcement learning algorithm.
  • each sample sequence includes at least one symptom data and a decision result.
  • the decision result indicates whether to output the diagnosis and treatment information or to proceed to the next round of inquiry.
  • the construction of the sample sequences of the diagnosis decision model includes: using NLP algorithms to structure the dialogue data between the client and the consultation server to obtain a dialogue sample {"symptom 1", "symptom 2", ..., "symptom k"}, labeling the decision action "decision 1" corresponding to the dialogue sample, and obtaining a sample sequence {"symptom 1", "symptom 2", ..., "symptom k", "decision 1"}.
  • the DQN algorithm can be used to build a sequence decision model:
  • $a_t$ indicates the action at time t.
  • $a_t$ can be understood as the decision-making action of the current round, and its action space has size 2.
  • $s_t$ represents the state at time t, and is a one-hot vector of dimension D.
  • D is the total number of all candidate symptoms in the sample sequences.
  • MLP is a fully connected neural network.
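  • The DQN formula is not reproduced in this text. A reading consistent with the symbols above, stated as an assumption, is that the network maps the state to one Q value per action and the greedy action is chosen: $Q(s_t, \cdot) = \mathrm{MLP}(s_t)$, $a_t = \arg\max_a Q(s_t, a)$. The sketch below shows only this forward pass and action choice; the network size, random weights, and all function names are hypothetical, and the reinforcement learning update itself is not shown.

```python
import random

def encode_state(confirmed, all_symptoms):
    """Vector of dimension D with a 1 for every symptom confirmed so far."""
    return [1.0 if s in confirmed else 0.0 for s in all_symptoms]

def init_mlp(d_in, d_hidden, d_out, seed=0):
    """Randomly initialized weights; a trained model would load learned values."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(d_in)] for _ in range(d_hidden)]
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(d_hidden)] for _ in range(d_out)]
    return w1, w2

def q_values(state, params):
    """Forward pass of a one-hidden-layer fully connected network with ReLU."""
    w1, w2 = params
    hidden = [max(0.0, sum(w * x for w, x in zip(row, state))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

def decide(state, params):
    """Greedy action: 1 = output diagnosis and treatment information, 0 = ask again."""
    q = q_values(state, params)
    return max(range(len(q)), key=q.__getitem__)

ALL_SYMPTOMS = ["runny nose", "cough", "fever", "sneezing"]  # D = 4
params = init_mlp(d_in=len(ALL_SYMPTOMS), d_hidden=8, d_out=2)
s_t = encode_state({"runny nose", "fever"}, ALL_SYMPTOMS)
a_t = decide(s_t, params)
```

  • Because the weights here are random, the chosen action is meaningless; the point is only the shape of the computation: a D-dimensional state goes in, two Q values come out, and the larger one selects the decision.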
  • in the consultation processing method shown in this embodiment, symptom information is received from the client; a candidate disease set corresponding to the symptom information is first obtained based on the preset consultation knowledge map, and the score of each candidate disease in the candidate disease set is determined through probabilistic analysis, where a higher score means a greater probability that the user has the candidate disease. Then, according to the scores of the multiple candidate diseases in the candidate disease set, the diagnosis and treatment information corresponding to the symptom information collected up to the current inquiry round is determined, and the diagnosis and treatment information includes at least one candidate disease with a higher score.
  • finally, according to any one of the preset number of inquiry rounds, the information entropy of at least one candidate disease in the diagnosis and treatment information, and the pre-trained diagnosis decision model, it is determined whether to output the diagnosis and treatment information or to proceed to the next round of inquiry.
  • the above scheme, combined with the consultation knowledge map, can effectively reduce the misdiagnosis rate of the consultation system and improve the accuracy of diagnosis and treatment;
  • in addition, the length of the inquiry can be optimized, which improves the user's consultation experience.
  • FIG. 3 is the second interactive schematic diagram of the consultation processing method provided by the embodiment of the present application.
  • the inquiry processing method of the present embodiment also includes:
  • Step 301 if it is determined to conduct the next round of inquiry, determine the target inquiry symptom from the first candidate symptom set.
  • if it is determined to conduct the next round of inquiry, the target inquiry symptom can be determined from the first candidate symptom set through any one of the following implementations:
  • in an optional implementation, from the candidate disease with the highest score among the multiple candidate diseases in the first candidate disease set, the detailed symptoms of at least one candidate symptom confirmed by the user in the current inquiry round are selected as the target inquiry symptom.
  • for example, if candidate disease 1 has the highest score among the 10 higher-scoring candidate diseases determined in the current inquiry round, the user is most likely to have candidate disease 1, and the consultation server can further ask the user for details of one or more symptoms of candidate disease 1 that the user has already confirmed. For example, if candidate disease 1 is "cold" and the symptom confirmed by the user is "runny nose", the consultation server can further ask the user "What color is the nasal discharge?" (that is, a detailed symptom of the runny nose symptom) and, after receiving the user's reply data, determine and send the corresponding diagnosis result and/or treatment suggestion, or continue to ask questions.
  • in an optional implementation, from the candidate disease with the highest score among the multiple candidate diseases in the first candidate disease set, symptoms other than those already confirmed by the user are selected as the target inquiry symptom. For example, if candidate disease 1 has the highest score among the 10 higher-scoring candidate diseases determined in the current inquiry round, the user is most likely to have candidate disease 1, and the consultation server can further ask the user about one or more symptoms of candidate disease 1 that the user has not yet confirmed.
  • for example, candidate disease 1 is "cold", the symptom confirmed by the user is "runny nose", and other symptoms of "cold" include "fever", "dizziness", "loss of appetite", and so on.
  • the consultation server can further ask the user, for example, "Do you have a fever?" (that is, another symptom of the cold that the user has not confirmed) and, after receiving the user's reply data, determine and send the corresponding diagnosis result and/or treatment suggestion, or continue to ask questions.
  • the overall information entropy of a first candidate symptom in the first candidate symptom set with respect to the first candidate disease set is determined by the ratio of the sum of the scores of all candidate diseases in the second candidate disease set of the next round of inquiry to the sum of the scores of all candidate diseases in the first candidate disease set.
  • the second candidate disease set is determined according to the first candidate disease set and the first candidate symptom.
  • the first candidate symptom is any candidate symptom in the first candidate symptom set.
  • the overall information entropy of each candidate symptom of each candidate disease in the first candidate disease set with respect to the first candidate disease set can be determined by the following formulas: entropy(p) = -prob(p) * log(prob(p)), with prob(p) = ( Σ_{v ∈ V_cand ∩ V_p} S_v ) / ( Σ_{v ∈ V_cand} S_v ).
  • entropy(p) represents the overall information entropy of the candidate symptom p (that is, the first candidate symptom) with respect to the candidate disease set V_cand (that is, the first candidate disease set); prob(p) represents, if the candidate symptom p is the target inquiry symptom (that is, the symptom asked about in the next round of inquiry), the ratio of the sum of the scores of the new candidate disease set V_cand ∩ V_p (that is, the second candidate disease set) to the sum of the scores of the candidate disease set V_cand; S_v represents the score of candidate disease v.
  • by determining the candidate symptom whose information entropy decreases fastest, the most informative symptom can be located quickly and accurately, which improves the efficiency of identifying a disease and speeds up the consultation on the consultation server.
  • Step 302. Send inquiry information to the client, where the inquiry information is used to inquire whether the user has the target inquiry symptoms.
  • Step 303 receiving reply information from the client.
  • the reply information of the user includes confirmation information or non-confirmation information.
  • the inquiry information of the consultation server is "do you have any symptoms of fever?", and the user's reply information is "yes" or "no".
  • Step 304 update the diagnosis and treatment information based on the reply information, and determine to output the updated diagnosis and treatment information or conduct the next round of inquiry.
  • in this step, the symptom information is updated based on the reply information; the candidate disease set and the candidate symptom set corresponding to the updated symptom information are obtained based on the preset consultation knowledge graph; the diagnosis and treatment information corresponding to the symptom information as of the current inquiry round is determined from the updated candidate disease set through probability analysis; and it is determined whether to output the diagnosis and treatment information or to continue the inquiry.
  • the specific implementation process is similar to steps 202 to 204 of the embodiment shown in FIG. 2; refer to the embodiment above, and details are not repeated here.
  • the updated symptom information is determined based on the reply information of the user.
  • the updated symptom information may be to add at least one new symptom data on the basis of the original symptom data, or to exclude at least one symptom data on the basis of the original symptom data.
  • the consultation server obtains the set of candidate diseases corresponding to the updated symptom information. It should be understood that as the symptom data changes, the number of candidate diseases in the candidate disease set may increase, decrease or remain unchanged.
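  • in the consultation processing method shown in this embodiment, on the basis of the previous embodiment, if it is determined that a next round of inquiry is needed, a preset questioning strategy is used to select a target symptom based on the multiple candidate diseases in the current candidate disease set, and the user is asked again. A new round of data processing and analysis is carried out based on the reply information from the client, and it is finally determined whether to output the diagnosis and treatment information or to continue asking questions. Conducting targeted inquiries based on the preset questioning strategy of this embodiment can improve both the efficiency of the consultation system and the accuracy of diagnosis and treatment.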
  • the embodiment of the present application can divide the functional modules of the consultation processing device according to the above method embodiments.
  • each functional module can be divided corresponding to each function, or two or more functions can be integrated into one processing module.
  • the above-mentioned integrated modules can be implemented in the form of hardware or in the form of software function modules.
  • FIG. 4 is a first structural schematic diagram of a medical inquiry processing device provided by an embodiment of the present application.
  • the medical inquiry processing device 400 of this embodiment includes: a receiving module 401 , an acquiring module 402 and a processing module 403 .
  • a receiving module 401 configured to receive symptom information from a client, where the symptom information includes at least one symptom data input by a user;
  • an acquisition module 402, configured to acquire a first candidate disease set and a first candidate symptom set corresponding to the symptom information based on a preset consultation knowledge graph, where the first candidate disease set includes a plurality of candidate diseases, and the first candidate symptom set includes all candidate symptoms of the plurality of candidate diseases;
  • a processing module 403 configured to determine from the first set of candidate diseases through probability analysis the diagnosis and treatment information corresponding to the symptom information as of the current inquiry round, the diagnosis and treatment information being used to indicate at least one candidate disease;
  • the information entropy of the at least one candidate disease in the diagnosis and treatment information, and the pre-trained diagnosis decision model it is determined to output the diagnosis and treatment information or to perform a next round of inquiry.
  • the acquisition module 402 is configured to obtain a score of each candidate disease in the first candidate disease set, where the score indicates the probability value that the user has the candidate disease;
  • the processing module 403 is configured to determine the diagnosis and treatment information corresponding to the symptom information as of the current inquiry round according to the scores of the multiple candidate diseases in the first candidate disease set, the diagnosis and treatment information including a preset number of candidate diseases taken in descending order of score from the multiple candidate diseases.
  • the acquisition module 402 is configured to acquire the contribution of each candidate symptom of the first candidate disease to the first candidate disease, where the contribution indicates the sample statistical probability value that the first candidate disease is accompanied by the candidate symptom, and the first candidate disease is any one of the plurality of candidate diseases;
  • the processing module 403 is configured to determine the score of the first candidate disease according to the contributions of all candidate symptoms of the first candidate disease to the first candidate disease.
  • the processing module 403 is configured to determine the score of the first candidate disease according to the contributions of all candidate symptoms of the first candidate disease to the first candidate disease and a noise parameter.
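  • the processing module 403 is configured to: determine to output the diagnosis and treatment information if the sum of the information entropies of all candidate diseases in the at least one candidate disease is less than a preset threshold; or determine to perform a next round of inquiry if that sum is greater than or equal to the preset threshold.
  • the processing module 403 is configured to: judge whether the current inquiry round has reached the preset number of inquiries; determine to output the diagnosis and treatment information if it has; or determine to perform a next round of inquiry if it has not.
  • the processing module 403 is configured to: input the symptom information into the pre-trained diagnosis decision model, and determine, according to the output value of the diagnosis decision model, to output the diagnosis and treatment information or to perform a next round of inquiry.
  • the diagnosis decision model is obtained by training a fully connected neural network on a plurality of sample sequences using a reinforcement learning algorithm; each sample sequence includes at least one piece of symptom data and a decision result, and the decision result indicates whether to output diagnosis and treatment information or to proceed to the next round of inquiry.
  • the processing module 403 is configured to: determine to output the diagnosis and treatment information if the diagnosis decision model outputs the first value; or determine to perform a next round of inquiry if the diagnosis decision model outputs the second value.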
  • FIG. 5 is a second structural schematic diagram of the medical inquiry processing device provided by the embodiment of the present application.
  • on the basis of the device shown in FIG. 4, the medical inquiry processing device 400 of this embodiment further includes: a sending module 404 .
  • a processing module 403 configured to determine the target query symptom from the first candidate symptom set if it is determined to conduct the next round of query;
  • a sending module 404 configured to send inquiry information to the client, where the inquiry information is used to inquire whether the user has the target inquiry symptom;
  • the receiving module 401 is configured to receive the reply information from the client, and the processing module 403 is configured to update the diagnosis and treatment information based on the reply information, and determine to output the updated diagnosis and treatment information or conduct a next round of inquiry.
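  • the processing module 403 is configured to perform at least one of the following:
  • from the candidate disease with the highest score among the multiple candidate diseases in the first candidate disease set, select the detailed symptoms of at least one candidate symptom confirmed by the user in the current inquiry round as the target inquiry symptom; or, from that candidate disease, select symptoms other than those already confirmed by the user as the target inquiry symptom; or, from all candidate symptoms of the multiple candidate diseases in the first candidate disease set, select the candidate symptom that makes the overall information entropy of the first candidate disease set decrease fastest as the target inquiry symptom.
  • the overall information entropy of a first candidate symptom in the first candidate symptom set with respect to the first candidate disease set is determined by the ratio of the sum of the scores of all candidate diseases in the second candidate disease set of the next round of inquiry to the sum of the scores of all candidate diseases in the first candidate disease set;
  • the second candidate disease set is determined according to the first candidate disease set and the first candidate symptom, and the first candidate symptom is any candidate symptom in the first candidate symptom set.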
  • the consultation knowledge graph includes nodes for diseases, symptoms, disease causes, disease department information, differential symptoms, complications, and disease course;
  • the acquisition module 402 is configured to acquire, according to the at least one piece of symptom data, the candidate disease set corresponding to each piece of symptom data from the consultation knowledge graph, and thereby obtain the first candidate disease set.
  • the medical inquiry processing device provided in this embodiment can implement the technical solution of any one of the above method embodiments, and its implementation principle and technical effect are similar, and will not be repeated here.
  • FIG. 6 is a hardware structural diagram of an electronic device provided by an embodiment of the present application. As shown in FIG. 6, the electronic device 500 provided in this embodiment includes: a memory 501, a processor 502, and a computer program.
  • the computer program is stored in the memory 501 and is configured to be executed by the processor 502 to implement the technical solutions of any of the above method embodiments.
  • the implementation principles and technical effects are similar and will not be repeated here.
  • the memory 501 can be independent or integrated with the processor 502 .
  • the electronic device 500 further includes: a bus 503 for connecting the memory 501 and the processor 502 .
  • the embodiment of the present application also provides a computer-readable storage medium, on which a computer program is stored, and the computer program is executed by the processor 502 to implement the technical solution of any one of the foregoing method embodiments.
  • An embodiment of the present application provides a computer program product, including a computer program, and when the computer program is executed by a processor, the technical solution of any one of the foregoing method embodiments is implemented.
  • the embodiment of the present application also provides a chip, including: a processing module and a communication interface, where the processing module can execute the technical solution of any one of the foregoing method embodiments.
  • the chip also includes a storage module (such as a memory); the storage module is used to store instructions, the processing module is used to execute the instructions stored in the storage module, and executing those instructions causes the processing module to perform the technical solution of any one of the foregoing method embodiments.
  • the processor can be a central processing unit (CPU), and can also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), etc.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like. The steps of the method disclosed in conjunction with the invention can be directly implemented by a hardware processor, or implemented by a combination of hardware and software modules in the processor.
  • the memory may include a high-speed RAM, and may also include non-volatile memory (NVM), such as at least one disk storage device; it may also be a USB flash drive, a removable hard disk, a read-only memory, a magnetic disk, or an optical disk.
  • the bus can be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, or an Extended Industry Standard Architecture (EISA) bus, etc.
  • the bus can be divided into address bus, data bus, control bus and so on.
  • the buses in the drawings of the present application are not limited to only one bus or one type of bus.
  • the above-mentioned storage medium can be realized by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.
  • a storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.
  • an exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium.
  • the storage medium may also be a component of the processor.
  • the processor and the storage medium may be located in an application-specific integrated circuit (ASIC).
  • the processor and the storage medium can also exist in the electronic device as discrete components.


Abstract

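The present application provides a consultation processing method, apparatus, device, and storage medium. The method includes: receiving symptom information from a client; obtaining, based on a consultation knowledge graph, a candidate disease set corresponding to the symptom information; determining the score of each candidate disease in the candidate disease set through probability analysis; and determining, according to the scores of the multiple candidate diseases in the candidate disease set, the diagnosis and treatment information corresponding to the symptom information collected up to the current inquiry round, the diagnosis and treatment information including at least one candidate disease with a higher score. Finally, whether to output the diagnosis and treatment information or to conduct the next round of inquiry is determined according to any one of a preset number of inquiries, the information entropy of the at least one candidate disease in the diagnosis and treatment information, and a diagnosis decision model.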

Description

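Consultation processing method, apparatus, device, and storage medium
This application claims priority to Chinese patent application No. 202111662512.0, entitled "Consultation processing method, apparatus, device, and storage medium", filed with the China Patent Office on December 30, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of artificial intelligence technology, and in particular to a consultation processing method, apparatus, device, and storage medium.
Background
The rapid development of information technology and the vigorous promotion of the "Internet Plus" concept have opened a new direction of exploration for the reform of the medical industry. Against this background, the development of computer-aided consultation systems has become a research hotspot.
Consultation is a diagnostic method in which the patient, or the person accompanying the patient, is asked about information related to the disease and about the patient's subjective symptoms, in order to understand the patient's discomforts and the onset, development, diagnosis, and treatment of the disease. With the development of "Internet Plus", a user can interact with a consultation system in question-and-answer form through a smart terminal and thereby obtain the preliminary diagnosis result and treatment suggestions given by the system.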
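Summary
Embodiments of the present application provide a consultation processing method, apparatus, device, and storage medium, in which:
A first aspect of the embodiments of the present application provides a consultation processing method, including:
receiving symptom information from a client, the symptom information including at least one piece of symptom data input by a user;
obtaining, based on a preset consultation knowledge graph, a first candidate disease set and a first candidate symptom set corresponding to the symptom information, the first candidate disease set including multiple candidate diseases, and the first candidate symptom set including all candidate symptoms of the multiple candidate diseases;
determining, through probability analysis, from the first candidate disease set the diagnosis and treatment information corresponding to the symptom information as of the current inquiry round, the diagnosis and treatment information indicating at least one candidate disease; and
determining, according to any one of a preset number of inquiries, the information entropy of the at least one candidate disease in the diagnosis and treatment information, and a pre-trained diagnosis decision model, to output the diagnosis and treatment information or to conduct the next round of inquiry.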
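In an optional embodiment of the present application, determining, through probability analysis, from the first candidate disease set the diagnosis and treatment information corresponding to the symptom information as of the current inquiry round includes:
obtaining a score of each candidate disease in the first candidate disease set, the score indicating the probability value that the user has the candidate disease; and
determining, according to the scores of the multiple candidate diseases in the first candidate disease set, the diagnosis and treatment information corresponding to the symptom information as of the current inquiry round, the diagnosis and treatment information including a preset number of candidate diseases taken in descending order of score from the multiple candidate diseases.
In an optional embodiment of the present application, obtaining the score of each candidate disease in the first candidate disease set includes:
obtaining the contribution of each candidate symptom of a first candidate disease to the first candidate disease, the contribution indicating the sample statistical probability value that the first candidate disease is accompanied by the candidate symptom, the first candidate disease being any one of the multiple candidate diseases; and
determining the score of the first candidate disease according to the contributions of all candidate symptoms of the first candidate disease to the first candidate disease.
In an optional embodiment of the present application, determining the score of the first candidate disease according to the contributions of all candidate symptoms of the first candidate disease to the first candidate disease includes:
determining the score of the first candidate disease according to the contributions of all candidate symptoms of the first candidate disease to the first candidate disease and a noise parameter.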
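In an optional embodiment of the present application, determining, according to the information entropy of the at least one candidate disease in the diagnosis and treatment information, to output the diagnosis and treatment information or to conduct the next round of inquiry includes:
if the sum of the information entropies of all candidate diseases in the at least one candidate disease is less than a preset threshold, determining to output the diagnosis and treatment information; or
if the sum of the information entropies of all candidate diseases in the at least one candidate disease is greater than or equal to the preset threshold, determining to conduct the next round of inquiry.
In an optional embodiment of the present application, determining, according to the preset number of inquiries, to output the diagnosis and treatment information or to conduct the next round of inquiry includes:
judging whether the current inquiry round has reached the preset number of inquiries; and
if the current inquiry round has reached the preset number of inquiries, determining to output the diagnosis and treatment information; or, if the current inquiry round has not reached the preset number of inquiries, determining to conduct the next round of inquiry.
In an optional embodiment of the present application, determining, according to the pre-trained diagnosis decision model, to output the diagnosis and treatment information or to conduct the next round of inquiry includes:
inputting the symptom information into the pre-trained diagnosis decision model, and determining, according to the output value of the diagnosis decision model, to output the diagnosis and treatment information or to conduct the next round of inquiry;
wherein the diagnosis decision model is obtained by training a fully connected neural network on multiple sample sequences using a reinforcement learning algorithm, each sample sequence includes at least one piece of symptom data and a decision result, and the decision result indicates whether to output diagnosis and treatment information or to conduct the next round of inquiry.
In an optional embodiment of the present application, determining, according to the output value of the diagnosis decision model, to output the diagnosis and treatment information or to conduct the next round of inquiry includes:
if the diagnosis decision model outputs a first value, determining to output the diagnosis and treatment information; or
if the diagnosis decision model outputs a second value, determining to conduct the next round of inquiry.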
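In an optional embodiment of the present application, the method further includes:
if it is determined to conduct the next round of inquiry, determining a target inquiry symptom from the first candidate symptom set, and sending inquiry information to the client, the inquiry information being used to ask whether the user has the target inquiry symptom; and
receiving reply information from the client, updating the diagnosis and treatment information based on the reply information, and determining to output the updated diagnosis and treatment information or to conduct the next round of inquiry.
In an optional embodiment of the present application, determining the target inquiry symptom from the first candidate symptom set includes at least one of the following:
from the candidate disease with the highest score among the multiple candidate diseases in the first candidate disease set, selecting the detailed symptoms of at least one candidate symptom confirmed by the user in the current inquiry round as the target inquiry symptom; or
from the candidate disease with the highest score among the multiple candidate diseases in the first candidate disease set, selecting symptoms other than those already confirmed by the user as the target inquiry symptom; or
from all candidate symptoms of the multiple candidate diseases in the first candidate disease set, selecting the candidate symptom that makes the overall information entropy of the first candidate disease set decrease fastest as the target inquiry symptom.
In an optional embodiment of the present application, the overall information entropy of a first candidate symptom in the first candidate symptom set with respect to the first candidate disease set is determined according to the ratio of the sum of the scores of all candidate diseases in a second candidate disease set for the next round of inquiry to the sum of the scores of all candidate diseases in the first candidate disease set;
wherein the second candidate disease set is determined according to the first candidate disease set and the first candidate symptom, and the first candidate symptom is any candidate symptom in the first candidate symptom set.
In an optional embodiment of the present application, the consultation knowledge graph includes nodes for diseases, symptoms, disease causes, disease department information, differential symptoms, complications, and disease course;
obtaining, based on the preset consultation knowledge graph, the first candidate disease set corresponding to the symptom information includes: obtaining, according to the at least one piece of symptom data, the candidate disease set corresponding to each piece of symptom data from the consultation knowledge graph, to obtain the first candidate disease set.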
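A second aspect of the embodiments of the present application provides a consultation processing apparatus, including:
a receiving module, configured to receive symptom information from a client, the symptom information including at least one piece of symptom data input by a user;
an acquisition module, configured to obtain, based on a preset consultation knowledge graph, a first candidate disease set and a first candidate symptom set corresponding to the symptom information, the first candidate disease set including multiple candidate diseases, and the first candidate symptom set including all candidate symptoms of the multiple candidate diseases; and
a processing module, configured to determine, through probability analysis, from the first candidate disease set the diagnosis and treatment information corresponding to the symptom information as of the current inquiry round, the diagnosis and treatment information indicating at least one candidate disease;
and to determine, according to any one of a preset number of inquiries, the information entropy of the at least one candidate disease in the diagnosis and treatment information, and a pre-trained diagnosis decision model, to output the diagnosis and treatment information or to conduct the next round of inquiry.
A third aspect of the embodiments of the present application provides an electronic device, including: a memory, a processor, and a computer program, wherein the computer program is stored in the memory and is configured to be executed by the processor to implement the method according to any one of the first aspect.
A fourth aspect of the embodiments of the present application provides a computer-readable storage medium on which a computer program is stored, the computer program being executed by a processor to implement the method according to any one of the first aspect.
A fifth aspect of the embodiments of the present application provides a computer program product, including a computer program that, when executed by a processor, implements the method according to any one of the first aspect.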
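Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below show some embodiments of the present application, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a scenario of the consultation processing method provided by an embodiment of the present application;
FIG. 2 is a first interaction diagram of the consultation processing method provided by an embodiment of the present application;
FIG. 3 is a second interaction diagram of the consultation processing method provided by an embodiment of the present application;
FIG. 4 is a first schematic structural diagram of the consultation processing apparatus provided by an embodiment of the present application;
FIG. 5 is a second schematic structural diagram of the consultation processing apparatus provided by an embodiment of the present application;
FIG. 6 is a hardware structural diagram of the electronic device provided by an embodiment of the present application.
The above drawings show specific embodiments of the present application, which are described in more detail below. These drawings and the written description are not intended to limit the scope of the concept of the present application in any way, but to explain the concept of the present application to those skilled in the art with reference to specific embodiments.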
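Detailed Description
To make the purposes, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. Evidently, the described embodiments are some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without creative effort fall within the protection scope of the present application.
The terms "first", "second", and the like in the description, the claims, and the above drawings are used to distinguish similar objects and do not necessarily describe a specific order or sequence. It should be understood that data so used may be interchanged where appropriate, so that the embodiments described here can be implemented in an order other than that illustrated or described here.
It should be understood that the terms "include" and "have", and any variants of them, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, and may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
In the description of the embodiments of the present application, the term "corresponding" may indicate a direct or indirect correspondence between two items, an association between them, or a relationship such as indicating and being indicated, or configuring and being configured.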
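First, the terms involved in the embodiments of the present application are briefly introduced.
Knowledge graph: a modern theory that combines the theories and methods of applied mathematics, graphics, information visualization technology, information science, and other disciplines with methods such as bibliometric citation analysis and co-occurrence analysis, and uses visual graphs to display the core structure, development history, frontier fields, and overall knowledge architecture of a discipline, so as to achieve multidisciplinary integration. It can provide practical and valuable references for disciplinary research.
Named entity: generally an entity in text that has a specific meaning or strong referential value. For example, entities in the medical field include person names, department names, dates and times, and medical terms (such as disease names, abbreviations, and treatments).
Named entity recognition (NER): extracts the above entities from unstructured input text, and can recognize further entities according to business needs, such as drug names, batch numbers, and prices in the medical field.
Relation extraction (RE): used to determine whether a relation exists between two entities in a sentence, and the type of that relation. For example, in "I do not have a fever", "do not have" is a negation word, "fever" is a symptom word, and "do not have" modifies "fever".
Natural language processing (NLP) algorithms: include lexical analysis, syntactic analysis, semantic analysis, and so on. Lexical analysis includes word segmentation, part-of-speech tagging, entity recognition, spell checking, and so on; the basic task of syntactic analysis is to determine the syntactic structure of a sentence or the dependency relations between the words in it; semantic analysis mainly includes word sense disambiguation and semantic representation.
Deep Q Network (DQN): a deep reinforcement learning model.
Information entropy: a concept borrowed from thermodynamics by C. E. Shannon to solve the problem of quantifying information. In thermodynamics, thermal entropy is a physical quantity that expresses the degree of disorder of molecular states; Shannon used the concept of information entropy to describe the uncertainty of information. The larger the information entropy, the greater the uncertainty and the smaller the probability.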
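At present, diagnosis and treatment information is generally determined through one of two modes of consultation interaction. The first relies mainly on the doctor's personal experience: the user's symptom information is obtained through online or offline questioning, and the diagnosis and treatment information is determined from it. The second uses questionnaire scales designed in advance by experts, with corresponding jump paths, to determine the diagnosis and treatment information. The first mode is purely manual, time-consuming, and labor-intensive, and wastes a large amount of high-quality medical resources. The second mode relies on existing scale-like and questionnaire templates, so the consultation paths are all alike and cover only a limited number of departments, and it depends heavily on the quality of the templates customized by medical experts. Because doctors are tied to their own departments (a doctor is usually familiar only with the diagnosis and consultation of diseases in his or her own department), diseases of all departments cannot be taken into account during consultation, and the diagnosis and treatment information may be biased. In addition, the questionnaire scales formulated by doctors are difficult to merge and extend.
With the development of the Internet, the health industry has become increasingly integrated with the Internet, and online consultation has gradually become a fast health consultation channel available to users. The online consultation scenario differs greatly from consultation by a doctor, whether online or offline: the consultation system analyzes the online consultation dialogue data and gives diagnosis and treatment information such as diagnostic suggestions and treatment suggestions. Existing consultation systems are not yet mature; they mainly target common diseases or diseases with mild symptoms, and fall short in objectivity, accuracy, and standardization.
In view of the above problems, the embodiments of the present application propose a consultation processing method, whose main inventive idea is as follows. First, a complete consultation knowledge graph is constructed, covering nodes such as 200+ common diseases, 400+ symptoms, disease causes, disease department information, differential symptoms, complications, and disease course; the symptoms input by the user are reasoned over in combination with the consultation knowledge graph to obtain a candidate disease set and a candidate symptom set. Second, the score (that is, the probability value) of each candidate disease in the candidate disease set is determined through a probabilistic graphical model, and a preset number of candidate diseases are taken in descending order of score as the diseases the user is most likely to have. Finally, a preset rule or a reinforcement learning algorithm is used to judge whether the consultation should end. If the consultation ends, diagnosis and treatment information such as the diseases the user is most likely to have is output; if further inquiry is needed, a target symptom is selected from the candidate symptom set according to a preset questioning strategy, and the user is asked again.
The consultation knowledge graph in the above solution covers multidisciplinary data in the medical field. Obtaining candidate diseases and candidate symptoms in combination with the consultation knowledge graph can effectively reduce the misdiagnosis rate of the consultation system and improve the accuracy of the diagnosis and treatment information.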
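Before the consultation processing method provided by the present application is introduced, its application scenario is briefly described.
FIG. 1 is a schematic diagram of a scenario of the consultation processing method provided by an embodiment of the present application. As shown in FIG. 1, the scenario includes a first terminal device 11, a second terminal device 12, and a consultation server 13 (also called a consultation service end or consultation platform). The first terminal device 11 and the second terminal device 12 are each communicatively connected to the consultation server 13.
In an optional implementation, the application (APP) of the consultation server 13 is pre-installed on the first terminal device 11 and the second terminal device 12, and a user of the first terminal device 11 or the second terminal device 12 can access the consultation server 13 through the application.
In an optional implementation, a user of the first terminal device 11 or the second terminal device 12 can also access the consultation server 13 through channels such as web pages and mini programs.
As an example, the first terminal device 11 may be a patient-side terminal device, such as the patient's smartphone, tablet, laptop, or desktop computer, or a fixed or mobile terminal set up in a public area of a hospital (such as a fixed or mobile intelligent robot).
As an example, the second terminal device 12 may be a doctor-side terminal device, such as the doctor's smartphone, tablet, laptop, or desktop computer. For example, during consultation the doctor can access the consultation server 13 through the second terminal device 12 and obtain the diagnosis and treatment information given by the consultation server 13, which can be used to assist the doctor in making a medical diagnosis.
In an optional implementation, the consultation server 13 has a built-in processing apparatus, which is used to execute the method steps of the embodiments of the present application. Optionally, the consultation knowledge graph is stored in the storage space of the consultation server 13.
In an optional implementation, if the first terminal device 11 is an intelligent robot, the processing apparatus can also be integrated into the intelligent robot so that the robot executes the method steps of the embodiments of the present application. Optionally, the consultation knowledge graph is stored in the storage space of the intelligent robot. For example, the user can interact directly with the intelligent robot through voice or text input to obtain diagnosis and treatment information or inquiry information.
Optionally, the intelligent robot acts as an ordinary terminal and feeds back diagnosis and treatment information or inquiry information to the user by interacting with the consultation server (since the knowledge graph occupies a large amount of memory, the intelligent robot need not store the consultation knowledge graph).
Based on the above scenario, the technical solutions provided by the embodiments of the present application are described in detail below through specific embodiments. The following embodiments take the interaction between a client and the consultation server as an example, where the client may correspond to any terminal device shown in FIG. 1 and the consultation server may correspond to the consultation server shown in FIG. 1.
It should be noted that the technical solutions provided by the embodiments of the present application may include some or all of the following content. The specific embodiments below can be combined with one another, and the same or similar concepts or processes may not be repeated in some embodiments.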
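FIG. 2 is a first interaction diagram of the consultation processing method provided by an embodiment of the present application. As shown in FIG. 2, the consultation processing method of this embodiment includes:
Step 201: receive symptom information from a client, the symptom information including at least one piece of symptom data input by a user.
In this embodiment, the user accesses the consultation server through the client and can describe one or more symptoms on the client by voice or text; based on the user's voice or text description, the client sends one or more pieces of symptom data to the consultation server.
Optionally, the client can use existing named entity recognition (NER) and relation extraction (RE) algorithms to extract the main symptom data described by the user.
Step 202: obtain, based on a preset consultation knowledge graph, a first candidate disease set and a first candidate symptom set corresponding to the symptom information.
The first candidate disease set includes multiple candidate diseases, and the first candidate symptom set includes all candidate symptoms of the multiple candidate diseases.
In this embodiment, the consultation knowledge graph includes nodes for diseases, symptoms, disease causes, disease department information, differential symptoms, complications, and disease course. Obtaining the first candidate disease set corresponding to the symptom information based on the preset consultation knowledge graph specifically includes: obtaining, according to the at least one piece of symptom data, the candidate disease set corresponding to each piece of symptom data from the consultation knowledge graph, to obtain the first candidate disease set. The first candidate disease set can be denoted V_cand.
Optionally, for each disease in the first candidate disease set, all symptom data of that candidate disease are then obtained based on the preset consultation knowledge graph to generate the first candidate symptom set. The first candidate symptom set can be denoted P_cand.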
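As an illustration of step 202, the following minimal Python sketch looks up candidate diseases and candidate symptoms in a toy graph. The two dictionaries, the function name `candidate_sets`, and the choice to merge the per-symptom disease sets by union are assumptions made for illustration only; the application does not specify how the knowledge graph is stored or how the per-symptom sets are combined.

```python
# Toy stand-in for the consultation knowledge graph (hypothetical data).
SYMPTOM_TO_DISEASES = {
    "runny nose": {"cold", "rhinitis"},
    "cough": {"cold", "flu"},
    "fever": {"flu"},
}
DISEASE_TO_SYMPTOMS = {
    "cold": {"runny nose", "cough"},
    "flu": {"fever", "cough"},
    "rhinitis": {"runny nose", "sneezing"},
}


def candidate_sets(symptoms, symptom_to_diseases, disease_to_symptoms):
    """Return (V_cand, P_cand) for the symptoms reported so far."""
    v_cand = set()
    for symptom in symptoms:
        # Assumption: per-symptom candidate disease sets are merged by union.
        v_cand |= symptom_to_diseases.get(symptom, set())
    p_cand = set()
    for disease in v_cand:
        # P_cand collects every symptom of every candidate disease.
        p_cand |= disease_to_symptoms.get(disease, set())
    return v_cand, p_cand


v_cand, p_cand = candidate_sets(
    ["runny nose"], SYMPTOM_TO_DISEASES, DISEASE_TO_SYMPTOMS
)
```

A production system would presumably query a graph database instead of in-memory dictionaries; the set logic would stay the same.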
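Step 203: determine, through probability analysis, from the first candidate disease set the diagnosis and treatment information corresponding to the symptom information as of the current inquiry round, the diagnosis and treatment information indicating at least one candidate disease.
In an optional embodiment, the diagnosis and treatment information can be determined through the following probability analysis method:
Step 2031: obtain the score of each candidate disease in the first candidate disease set; the score of each candidate disease indicates the probability value that the user has that candidate disease.
Step 2032: determine, according to the scores of the multiple candidate diseases in the first candidate disease set, the diagnosis and treatment information corresponding to the symptom information as of the current inquiry round.
The diagnosis and treatment information includes a preset number of candidate diseases taken in descending order of score from the multiple candidate diseases. It should be noted that this embodiment does not specifically limit the preset number, which can be set reasonably according to actual needs.
For example, assuming the preset number is 10, the 10 highest-scoring diseases are obtained according to the scores of the multiple candidate diseases in the first candidate disease set, and the diagnosis and treatment information corresponding to the symptom information of the current inquiry round includes at least these 10 highest-scoring diseases (that is, the diagnosis data).
Optionally, the diagnosis and treatment information may also include a treatment suggestion for each of the preset number of higher-scoring candidate diseases.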
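Step 2032 amounts to a top-N selection over the scored candidates. The sketch below shows one way to do this with the standard library; the function name `top_candidates` and the sample scores are hypothetical.

```python
import heapq


def top_candidates(scores, preset_number):
    """Return the preset number of (disease, score) pairs, highest score first."""
    return heapq.nlargest(preset_number, scores.items(), key=lambda item: item[1])


# Hypothetical scores for three candidate diseases.
diagnosis = top_candidates({"cold": 0.6, "flu": 0.3, "rhinitis": 0.1}, 2)
```

`heapq.nlargest` avoids sorting the whole candidate set when the preset number is small relative to the number of candidate diseases.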
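In an optional embodiment, step 2031 specifically includes: obtaining the score of each candidate disease in the first candidate disease set through a probabilistic graphical model.
The probabilistic graphical model may be a Noisy-or probabilistic graphical model. It should be noted that this embodiment does not specifically limit the probabilistic graphical model; besides the Noisy-or model, other probabilistic graphical models can also be used to determine the score of each candidate disease.
For ease of understanding, the following takes the Noisy-or probabilistic graphical model as an example to describe in detail how to obtain the score of each candidate disease in the first candidate disease set.
As an example, obtaining the score of each candidate disease in the first candidate disease set through the probabilistic graphical model includes the following steps:
Step 1: obtain the contribution of each symptom of the first candidate disease to the first candidate disease, where the contribution indicates the sample statistical probability value that the first candidate disease is accompanied by the candidate symptom, and the first candidate disease is any one of the multiple candidate diseases.
The sample statistical probability value is determined based on a large number of statistical samples.
Specifically, the contribution λ_j of the j-th symptom (Sym) to the i-th disease (Dis) can be determined by the following formula:
λ_j = P(Dis_i | Sym_j) = #co_occurrence(Dis_i, Sym_j) / #occurrence(Sym_j)
Here, #occurrence(Sym_j) is the number of times the j-th symptom appears in the statistical samples, and #co_occurrence(Dis_i, Sym_j) is the number of times the j-th symptom and the i-th disease appear together in the statistical samples.
The calculation of the sample statistical probability value is illustrated by an example below.
Suppose that in 20000 consultation records (that is, consultation samples) "cough" appears 100 times, and "cold" and "cough" appear together (co_occurrence) in the same consultation record 10 times; then the contribution of "cough" to "cold" is: P(cold | cough) = 10/100 = 0.1.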
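The counting in the example above can be sketched as follows. The record layout (a dict with a `disease` label and a `symptoms` set) and the function name `contribution` are assumptions for illustration; the toy records reproduce the 10-out-of-100 ratio from the text rather than the full 20000 records.

```python
def contribution(records, disease, symptom):
    """Estimate P(disease | symptom) from co-occurrence counts in the records."""
    occurrence = sum(1 for r in records if symptom in r["symptoms"])
    co_occurrence = sum(
        1 for r in records if symptom in r["symptoms"] and r["disease"] == disease
    )
    # Guard against symptoms that never appear in the samples.
    return co_occurrence / occurrence if occurrence else 0.0


# Hypothetical records: "cough" appears 100 times, 10 of them together with "cold".
records = (
    [{"disease": "cold", "symptoms": {"cough"}}] * 10
    + [{"disease": "flu", "symptoms": {"cough"}}] * 90
    + [{"disease": "cold", "symptoms": {"runny nose"}}] * 50
)
cough_to_cold = contribution(records, "cold", "cough")
```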
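Step 2: determine the score of the first candidate disease according to the contributions of all candidate symptoms of the first candidate disease to the first candidate disease.
Specifically, the score of the first candidate disease is determined according to the contributions of all candidate symptoms of the first candidate disease to the first candidate disease and a noise parameter.
This step can also be expressed as: inputting the contributions of all candidate symptoms of the first candidate disease to the first candidate disease into the Noisy-or probabilistic graphical model to obtain the score of the first candidate disease.
The Noisy-or probabilistic graphical model can be expressed by the following formula:
[Formula: Figure PCTCN2022137008-appb-000002]
Here, the left-hand side of the formula (Figure PCTCN2022137008-appb-000003) denotes the score of the first candidate disease v_1; p_j denotes the j-th symptom of the first candidate disease, j ∈ [1, k], where k is a positive integer; λ_0 denotes the noise parameter; and λ_j denotes the sample statistical probability value that the first candidate disease is accompanied by the j-th symptom.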
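The formula image is not reproduced here, so the following sketch assumes the textbook noisy-or form with a leak term, score = 1 - (1 - λ_0) · Π_j (1 - λ_j), which matches the symbols defined above but may differ in detail from the formula in the application. The function name `noisy_or_score` and the sample values are hypothetical.

```python
def noisy_or_score(contributions, noise):
    """Textbook noisy-or with a leak (noise) term.

    contributions: iterable of lambda_j values in [0, 1]
    noise: the noise parameter lambda_0 in [0, 1]
    """
    # Probability that neither the noise term nor any symptom "fires".
    p_none = 1.0 - noise
    for lam in contributions:
        p_none *= 1.0 - lam
    return 1.0 - p_none


# Hypothetical contributions of two confirmed symptoms, no noise.
score = noisy_or_score([0.1, 0.5], 0.0)
```

Under this form each additional supporting symptom can only raise the score, and with no symptoms the score falls back to the noise parameter.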
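Step 204: determine, according to any one of the preset number of inquiries, the information entropy of the at least one candidate disease in the diagnosis and treatment information, and the pre-trained diagnosis decision model, to output the diagnosis and treatment information or to conduct the next round of inquiry.
In this step, determining to output the diagnosis and treatment information can be understood as determining to stop the inquiry and output the diagnosis and treatment information. It should be understood that if it is determined to conduct the next round of inquiry, the diagnosis and treatment information determined in the current inquiry round need not be output.
Whether to continue the inquiry is described in detail below through several specific implementations.
In an optional implementation, determining, according to the information entropy of the at least one candidate disease in the diagnosis and treatment information, to output the diagnosis and treatment information or to conduct the next round of inquiry includes: if the sum of the information entropies of all candidate diseases in the at least one candidate disease is less than a preset threshold, determining to output the diagnosis and treatment information; or, if the sum is greater than or equal to the preset threshold, determining to conduct the next round of inquiry.
As an example: obtain the scores of a preset number of candidate diseases taken in descending order of score from the multiple candidate diseases; determine the information entropy of each of these candidate diseases; if the sum of the information entropies of all of these candidate diseases is less than the preset threshold, determine to output the diagnosis and treatment information; or, if the sum is greater than or equal to the preset threshold, determine to conduct the next round of inquiry.
In this implementation, the information entropy of each candidate disease can be determined by the following formula:
entropy(v) = -S_v * log(S_v)
Here, entropy(v) is the information entropy of candidate disease v, and S_v is the score of candidate disease v. Information entropy describes the uncertainty of information.
For example, assuming the preset number is 10, after the scores of these 10 candidate diseases are obtained, the information entropy of each of them is determined. If the sum of the information entropies of these 10 candidate diseases is less than the preset threshold, the uncertainty of the 10 candidate diseases determined in step 203 is low (in other words, their accuracy is high), so diagnosis and treatment information including these 10 candidate diseases can be output. If the sum is greater than or equal to the preset threshold, the uncertainty of the 10 candidate diseases determined in step 203 is high (in other words, their accuracy is low), so the next round of inquiry is needed.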
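The entropy-based stopping rule can be sketched as below. The natural logarithm, the convention that a zero score contributes zero entropy, the threshold value, and the function names are assumptions; the application does not state the log base or how the threshold is chosen.

```python
import math


def disease_entropy(score):
    """entropy(v) = -S_v * log(S_v); a score of 0 is treated as contributing 0."""
    return -score * math.log(score) if score > 0 else 0.0


def should_output(top_scores, threshold):
    """True: stop and output the diagnosis. False: ask another question."""
    return sum(disease_entropy(s) for s in top_scores) < threshold


# Hypothetical scores: one dominant candidate versus three similar candidates.
confident = should_output([0.9, 0.05], threshold=0.5)
uncertain = should_output([0.4, 0.3, 0.3], threshold=0.5)
```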
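In an optional implementation, determining, according to the preset number of inquiries, to output the diagnosis and treatment information or to conduct the next round of inquiry includes: judging whether the current inquiry round has reached the preset number of inquiries; if it has, determining to output the diagnosis and treatment information; or, if it has not, determining to conduct the next round of inquiry.
For example, assuming the preset number is 5, after the client and the Q&A server have interacted 5 times, the Q&A server outputs the diagnosis and treatment information determined in the fifth round. It should be understood that the Q&A server determines the diagnosis and treatment information of each round of inquiry, and that the accuracy of the diagnosis and treatment information keeps improving as the number of inquiries increases.
In an optional implementation, determining, according to the pre-trained diagnosis decision model, to output the diagnosis and treatment information or to conduct the next round of inquiry includes: inputting the symptom information into the pre-trained diagnosis decision model, and determining, according to the output value of the diagnosis decision model, whether to output the diagnosis and treatment information.
In one case, if the diagnosis decision model outputs a first value, it is determined to output the diagnosis and treatment information.
In another case, if the diagnosis decision model outputs a second value, it is determined to conduct the next round of inquiry.
For example, when the first value is 1, it is determined to output the diagnosis and treatment information; when the second value is 0, it is determined to conduct the next round of inquiry. This embodiment does not specifically limit the values of the first value and the second value, as long as the two decision results can be distinguished.
The diagnosis decision model is obtained by training a fully connected neural network on multiple sample sequences using a reinforcement learning algorithm; each sample sequence includes at least one piece of symptom data and a decision result, and the decision result indicates whether to output diagnosis and treatment information or to conduct the next round of inquiry.
The construction of the sample sequences for the diagnosis decision model includes: using an NLP algorithm to structure the dialogue data between the client and the Q&A server to obtain a dialogue sample {"symptom 1", "symptom 2", ..., "symptom k"}; labeling the decision action "decision 1" corresponding to the dialogue sample; and obtaining a sample sequence {"symptom 1", "symptom 2", ..., "symptom k", "decision 1"}.
Once enough sample sequences have been collected, the DQN algorithm can be used to build a sequence decision model:
a_t = MLP(s_t)
Here, a_t is the action at time t. In this embodiment, a_t can be understood as the decision action of the current round, and its action space has size 2; for example, a_t = 0 means "continue asking" and a_t = 1 means "give a diagnosis". s_t is the state at time t and is a one-hot vector of dimension D; in this embodiment, D is the total number of all candidate symptoms in the sample sequences. MLP is a fully connected neural network.
When the accuracy of the predicted value a_t output by the MLP network reaches a preset threshold, the training process of the diagnosis decision model ends.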
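To make a_t = MLP(s_t) concrete, the sketch below encodes the confirmed symptoms as a D-dimensional 0/1 state vector and runs a one-hidden-layer fully connected network whose two outputs are read as Q-values for the two actions. This shows inference only: the weights are hand-set for illustration rather than learned with DQN, and the symptom vocabulary, layer sizes, and function names are all hypothetical.

```python
SYMPTOMS = ["fever", "cough", "runny nose"]  # hypothetical vocabulary, D = 3


def encode_state(confirmed, all_symptoms):
    """Encode the symptoms confirmed so far as a D-dimensional 0/1 vector."""
    return [1.0 if s in confirmed else 0.0 for s in all_symptoms]


def mlp_q_values(state, w1, b1, w2, b2):
    """One hidden ReLU layer followed by a linear layer with one output per action."""
    hidden = [
        max(0.0, sum(w * x for w, x in zip(row, state)) + b)
        for row, b in zip(w1, b1)
    ]
    return [sum(w * h for w, h in zip(row, hidden)) + b for row, b in zip(w2, b2)]


def decide(state, params):
    """Return a_t: 0 = continue asking, 1 = give a diagnosis (argmax over Q-values)."""
    q = mlp_q_values(state, *params)
    return max(range(len(q)), key=q.__getitem__)


# Hand-set, untrained weights: diagnose once two or more symptoms are confirmed.
PARAMS = ([[1.0, 1.0, 1.0]], [0.0], [[-1.0], [1.0]], [2.5, 0.0])

action_one_symptom = decide(encode_state({"cough"}, SYMPTOMS), PARAMS)
action_two_symptoms = decide(encode_state({"cough", "fever"}, SYMPTOMS), PARAMS)
```

In a trained model the weights would come from the reinforcement learning procedure described above, typically implemented with a deep learning framework rather than plain lists.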
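In the consultation processing method shown in this embodiment, symptom information is received from the client; a candidate disease set corresponding to the symptom information is first obtained based on the preset consultation knowledge graph, and the score of each candidate disease in the candidate disease set is determined through probability analysis, where a higher score indicates a greater probability that the user has the candidate disease. Then, according to the scores of the multiple candidate diseases in the candidate disease set, the diagnosis and treatment information corresponding to the symptom information collected up to the current inquiry round is determined; it includes at least one candidate disease with a higher score. Finally, according to any one of the preset number of inquiries, the information entropy of the at least one candidate disease in the diagnosis and treatment information, and the pre-trained diagnosis decision model, it is determined whether to output the diagnosis and treatment information or to conduct the next round of inquiry. On the one hand, combining the above solution with the consultation knowledge graph can effectively reduce the misdiagnosis rate of the consultation system and improve the accuracy of diagnosis and treatment; on the other hand, deciding whether to stop the inquiry by analyzing the information entropy of the candidate diseases in the diagnosis and treatment information can shorten the inquiry and improve the user's inquiry experience.
FIG. 3 is a second interaction diagram of the consultation processing method provided by an embodiment of the present application. On the basis of the embodiment shown in FIG. 2, as shown in FIG. 3, the consultation processing method of this embodiment further includes:
Step 301: if it is determined to conduct the next round of inquiry, determine a target inquiry symptom from the first candidate symptom set.
In this embodiment, if it is determined to conduct the next round of inquiry, the target inquiry symptom can be determined from the first candidate symptom set through any one of the following implementations:
In an optional implementation, from the candidate disease with the highest score among the multiple candidate diseases in the first candidate disease set, the detailed symptoms of at least one candidate symptom confirmed by the user in the current inquiry round are selected as the target inquiry symptom.
For example, if candidate disease 1 has the highest score among the 10 higher-scoring candidate diseases determined in the current inquiry round, the user is most likely to have candidate disease 1, and the consultation server can further ask the user for details of one or more symptoms of candidate disease 1 that the user has already confirmed. For example, if candidate disease 1 is "cold" and the symptom confirmed by the user is "runny nose", the consultation server can further ask the user "What color is the nasal discharge?" (that is, a detailed symptom of the runny nose symptom) and, after receiving the user's reply data, determine and send the corresponding diagnosis result and/or treatment suggestion, or continue to ask questions.
In an optional implementation, from the candidate disease with the highest score among the multiple candidate diseases in the first candidate disease set, symptoms other than those already confirmed by the user are selected as the target inquiry symptom.
For example, if candidate disease 1 has the highest score among the 10 higher-scoring candidate diseases determined in the current inquiry round, the user is most likely to have candidate disease 1, and the consultation server can further ask the user about one or more symptoms of candidate disease 1 that the user has not yet confirmed. For example, if candidate disease 1 is "cold", the symptom confirmed by the user is "runny nose", and other symptoms of "cold" include "fever", "dizziness", "loss of appetite", and so on, the consultation server can further ask the user, for example, "Do you have a fever?" (that is, another symptom of the cold that the user has not confirmed) and, after receiving the user's reply data, determine and send the corresponding diagnosis result and/or treatment suggestion, or continue to ask questions.
In an optional implementation, from all candidate symptom data of the multiple candidate diseases in the first candidate disease set, the candidate symptom that makes the overall information entropy of the first candidate disease set decrease fastest is selected as the target inquiry symptom.
In this implementation, the overall information entropy of a first candidate symptom in the first candidate symptom set with respect to the first candidate disease set is determined according to the ratio of the sum of the scores of all candidate diseases in the second candidate disease set of the next round of inquiry to the sum of the scores of all candidate diseases in the first candidate disease set. The second candidate disease set is determined according to the first candidate disease set and the first candidate symptom, and the first candidate symptom is any candidate symptom in the first candidate symptom set.
As an example, the overall information entropy of each candidate symptom of each candidate disease in the first candidate disease set with respect to the first candidate disease set can be determined by the following formulas:
entropy(p) = -prob(p) * log(prob(p))
prob(p) = ( Σ_{v ∈ V_cand ∩ V_p} S_v ) / ( Σ_{v ∈ V_cand} S_v )
Here, entropy(p) is the overall information entropy of the candidate symptom p (that is, the first candidate symptom) with respect to the candidate disease set V_cand (that is, the first candidate disease set); prob(p) is, if the candidate symptom p is the target inquiry symptom (that is, the symptom asked about in the next round of inquiry), the ratio of the sum of the scores of the new candidate disease set V_cand ∩ V_p (that is, the second candidate disease set) to the sum of the scores of the candidate disease set V_cand; S_v is the score of candidate disease v.
For example, if 10 higher-scoring candidate diseases are determined in the current inquiry round, the candidate symptoms of these 10 candidate diseases can be obtained, and for all of the obtained candidate symptoms, the overall information entropy of each candidate symptom with respect to the current candidate disease set (that is, these 10 candidate diseases) is determined. It should be understood that the smaller the information entropy, the smaller the uncertainty and the larger the probability value. Therefore, from all candidate symptoms of the 10 higher-scoring candidate diseases, the candidate symptom with the smallest information entropy can be selected as the target symptom to ask the user about in the next round of inquiry.
In this implementation, by determining the candidate symptom whose information entropy decreases fastest, the most informative symptom can be located quickly and accurately, which improves the efficiency of identifying a disease and speeds up the consultation on the consultation server.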
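The entropy-based choice of the target inquiry symptom can be sketched as follows. The sketch applies the two formulas literally and picks the smallest entropy, as the worked example describes; the natural logarithm, the handling of prob(p) = 0, the exclusion of symptoms already asked, the alphabetical tie-break, and all names and sample data are assumptions.

```python
import math


def symptom_entropy(symptom, scores, disease_symptoms):
    """entropy(p) = -prob(p) * log(prob(p)), with prob(p) the score mass kept by p."""
    total = sum(scores.values())
    kept = sum(
        s for d, s in scores.items() if symptom in disease_symptoms.get(d, set())
    )
    prob = kept / total if total > 0 else 0.0
    if prob <= 0.0:
        return 0.0  # assumption: treat 0 * log(0) as 0
    return -prob * math.log(prob)


def pick_target_symptom(scores, disease_symptoms, already_asked):
    """Pick the not-yet-asked candidate symptom with the smallest entropy."""
    candidates = set()
    for disease in scores:
        candidates |= disease_symptoms.get(disease, set())
    candidates -= set(already_asked)
    if not candidates:
        return None
    # sorted() makes ties deterministic (alphabetical order).
    return min(
        sorted(candidates),
        key=lambda p: symptom_entropy(p, scores, disease_symptoms),
    )


# Hypothetical scores and symptom lists for three candidate diseases.
scores = {"cold": 0.6, "flu": 0.3, "rhinitis": 0.1}
disease_symptoms = {
    "cold": {"runny nose", "cough"},
    "flu": {"fever", "cough"},
    "rhinitis": {"runny nose", "sneezing"},
}
target = pick_target_symptom(scores, disease_symptoms, {"runny nose"})
```

Note that under the literal formula a symptom shared by every candidate disease has prob(p) = 1 and entropy 0, so a practical system might add a rule to skip such uninformative symptoms; that refinement is not described in the source.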
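Step 302: send inquiry information to the client, the inquiry information being used to ask the user whether they have the target inquiry symptom.
Step 303: receive reply information from the client.
In this step, the user's reply information includes confirmation or non-confirmation. For example, if the consultation server's inquiry information is "Do you have a fever?", the user's reply information is "yes" or "no".
Step 304: update the diagnosis and treatment information based on the reply information, and determine whether to output the updated diagnosis and treatment information or proceed to the next inquiry round.
In this step, the symptom information is updated based on the reply information; the candidate disease set and candidate symptom set corresponding to the updated symptom information are obtained from the preset consultation knowledge graph; the diagnosis and treatment information corresponding to the symptom information up to the current inquiry round is determined from the updated candidate disease set through probability analysis; and it is determined whether to output the diagnosis and treatment information or continue asking. The specific process is similar to steps 202 to 204 of the embodiment shown in FIG. 2; refer to the embodiment above, which is not repeated here.
It should be noted that the updated symptom information is determined from the user's reply information. The updated symptom information may add at least one new item of symptom data to the original symptom data, or exclude at least one item of symptom data from it. The consultation server obtains the candidate disease set corresponding to the updated symptom information from the preset consultation knowledge graph. It should be understood that, as the symptom data changes, the number of candidate diseases in the candidate disease set may increase, decrease, or stay the same. A probabilistic graphical model is used to determine the score of each candidate disease in the updated candidate disease set; from these scores, the diagnosis and treatment information corresponding to the symptom information up to the current inquiry round (the updated diagnosis and treatment information) is determined; and the preset rules or the reinforcement learning algorithm of the embodiment above determine whether to output the updated diagnosis and treatment information.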
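The symptom-information update in step 304 can be illustrated with a small state object. This is only a sketch under assumptions: the `SymptomState` type, the split into confirmed and excluded symptoms, and the boolean reply are hypothetical simplifications of the confirmation / non-confirmation reply described above.

```python
from typing import FrozenSet, NamedTuple


class SymptomState(NamedTuple):
    confirmed: FrozenSet[str]  # symptoms the user said they have
    excluded: FrozenSet[str]   # symptoms the user said they do not have


def apply_reply(state: SymptomState, symptom: str, has_symptom: bool) -> SymptomState:
    """Return a new state with the asked symptom added or excluded."""
    if has_symptom:
        return SymptomState(state.confirmed | {symptom}, state.excluded - {symptom})
    return SymptomState(state.confirmed - {symptom}, state.excluded | {symptom})


# Hypothetical dialogue: the user confirms "fever" and denies "cough".
state = SymptomState(frozenset({"runny nose"}), frozenset())
state = apply_reply(state, "fever", True)
state = apply_reply(state, "cough", False)
print(sorted(state.confirmed), sorted(state.excluded))
```

After each update, the candidate disease set and the scores would be recomputed from the new state before deciding whether to output or ask again.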
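In the consultation processing method of this embodiment, on the basis of the previous embodiment, if it is determined that another inquiry round is needed, a target symptom is selected using a preset questioning strategy based on the multiple candidate diseases in the current candidate disease set, and the user is asked again. A new round of data processing and analysis is performed on the client's reply information, and it is finally determined whether to output the diagnosis and treatment information or continue asking. The preset questioning strategy of this embodiment makes the inquiries targeted, which can improve both the inquiry efficiency of the consultation system and the accuracy of diagnosis and treatment.
The consultation processing method provided by the embodiments of this application has been described above; the consultation processing apparatus provided by the embodiments of this application is described below.
In the embodiments of this application, the consultation processing apparatus may be divided into functional modules according to the method embodiments above. For example, one functional module may be defined for each function, or two or more functions may be integrated into one processing module. The integrated module may be implemented in hardware or as a software functional module.
It should be noted that the division into modules in the embodiments of this application is illustrative and is only a division by logical function; other divisions are possible in an actual implementation. The following description uses one functional module per function as an example.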
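FIG. 4 is a first schematic structural diagram of the consultation processing apparatus provided by an embodiment of this application. As shown in FIG. 4, the consultation processing apparatus 400 of this embodiment includes a receiving module 401, an obtaining module 402, and a processing module 403.
The receiving module 401 is configured to receive symptom information from a client, the symptom information including at least one item of symptom data input by a user;
the obtaining module 402 is configured to obtain, based on a preset consultation knowledge graph, a first candidate disease set and a first candidate symptom set corresponding to the symptom information, the first candidate disease set including multiple candidate diseases, and the first candidate symptom set including all candidate symptoms of the multiple candidate diseases;
the processing module 403 is configured to determine, through probability analysis, from the first candidate disease set, diagnosis and treatment information corresponding to the symptom information up to the current inquiry round, the diagnosis and treatment information indicating at least one candidate disease;
and to determine, according to any one of a preset number of inquiries, the information entropy of the at least one candidate disease in the diagnosis and treatment information, or a pre-trained diagnosis decision model, whether to output the diagnosis and treatment information or proceed to the next inquiry round.
In an optional implementation of this embodiment, the obtaining module 402 is configured to obtain a score for each candidate disease in the first candidate disease set, the score indicating the probability that the user has the candidate disease;
the processing module 403 is configured to determine, according to the scores of the multiple candidate diseases in the first candidate disease set, the diagnosis and treatment information corresponding to the symptom information up to the current inquiry round, the diagnosis and treatment information including a preset number of candidate diseases taken in descending order of score from the multiple candidate diseases.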
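Selecting "a preset number of candidate diseases in descending order of score" amounts to a top-k selection. The following sketch uses Python's standard `heapq.nlargest`; the function name `top_candidates` and the example scores are assumptions made for illustration.

```python
import heapq
from typing import Dict, List, Tuple


def top_candidates(scores: Dict[str, float], k: int) -> List[Tuple[str, float]]:
    """Return the k highest-scoring (disease, score) pairs, best first."""
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


# Hypothetical scores for three candidate diseases.
scores = {"cold": 0.62, "rhinitis": 0.30, "flu": 0.08}
diagnosis_info = top_candidates(scores, k=2)
print(diagnosis_info)  # [('cold', 0.62), ('rhinitis', 0.3)]
```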
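In an optional implementation of this embodiment, the obtaining module 402 is configured to obtain the contribution of each candidate symptom of the first candidate disease to the first candidate disease, the contribution indicating the sample-statistics probability that the first candidate disease is accompanied by the candidate symptom, and the first candidate disease being any one of the multiple candidate diseases;
the processing module 403 is configured to determine the score of the first candidate disease according to the contributions of all candidate symptoms of the first candidate disease to the first candidate disease.
In an optional implementation of this embodiment, the processing module 403 is configured to determine the score of the first candidate disease according to the contributions of all candidate symptoms of the first candidate disease to the first candidate disease and a noise parameter.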
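This passage does not state how the contributions and the noise parameter are combined. Purely as a hedged illustration, the sketch below swaps in a noisy-OR style combination, which is one common choice in probabilistic graphical models. The function name, the restriction to user-confirmed symptoms, and the default noise value are assumptions and should not be read as the application's formula.

```python
from typing import Dict, Set


def noisy_or_score(
    contributions: Dict[str, float],
    confirmed: Set[str],
    noise: float = 0.01,
) -> float:
    """Combine per-symptom contributions with a noise (leak) term, noisy-OR style.

    contributions: symptom -> probability that the disease is accompanied by it.
    confirmed: symptoms the user has confirmed (assumption: only these count).
    noise: small baseline probability that applies even with no confirmed symptom.
    """
    miss = 1.0 - noise
    for symptom, contribution in contributions.items():
        if symptom in confirmed:
            miss *= 1.0 - contribution
    return 1.0 - miss


# Hypothetical contributions for the "cold" example.
cold = {"runny nose": 0.7, "fever": 0.5}
one_symptom = noisy_or_score(cold, {"runny nose"}, noise=0.0)
two_symptoms = noisy_or_score(cold, {"runny nose", "fever"}, noise=0.0)
print(round(one_symptom, 3), round(two_symptoms, 3))
```

With this combination the score grows as more of the disease's symptoms are confirmed, and the noise term keeps the score above zero when none are.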
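In an optional implementation of this embodiment, the processing module 403 is configured to:
if the sum of the information entropies of all candidate diseases among the at least one candidate disease is less than a preset threshold, determine to output the diagnosis and treatment information; or
if the sum of the information entropies of all candidate diseases among the at least one candidate disease is greater than or equal to the preset threshold, determine to proceed to the next inquiry round.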
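The entropy-threshold rule above can be sketched as follows. The sketch assumes that the information entropy of each candidate disease is -p·log(p), with p its score normalized over the candidates in the diagnosis and treatment information; that definition, the function name, and the threshold value are assumptions for illustration.

```python
import math
from typing import Sequence


def should_output_diagnosis(scores: Sequence[float], threshold: float) -> bool:
    """True if the summed entropy of the candidate diseases is below the threshold."""
    total = sum(scores)
    if total <= 0.0:
        return False  # nothing to report yet, keep asking
    entropy_sum = 0.0
    for score in scores:
        p = score / total
        if p > 0.0:
            entropy_sum -= p * math.log(p)
    return entropy_sum < threshold


# One dominant candidate: low entropy, so the result can be output.
confident = should_output_diagnosis([0.9, 0.05, 0.05], threshold=0.5)
# Evenly spread candidates: high entropy, so another round is needed.
uncertain = should_output_diagnosis([0.4, 0.3, 0.3], threshold=0.5)
print(confident, uncertain)  # True False
```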
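In an optional implementation of this embodiment, the processing module 403 is configured to:
determine whether the current inquiry round has reached the preset number of inquiries;
if the current inquiry round has reached the preset number of inquiries, determine to output the diagnosis and treatment information; or, if the current inquiry round has not reached the preset number of inquiries, determine to proceed to the next inquiry round.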
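In an optional implementation of this embodiment, the processing module 403 is configured to:
input the symptom information into the pre-trained diagnosis decision model, and determine, according to the output value of the diagnosis decision model, whether to output the diagnosis and treatment information or proceed to the next inquiry round;
wherein the diagnosis decision model is obtained by training a fully connected neural network with a reinforcement learning algorithm on multiple sample sequences, each sample sequence including at least one item of symptom data and a decision result, and the decision result indicating whether to output diagnosis and treatment information or proceed to the next inquiry round.
In an optional implementation of this embodiment, the processing module 403 is configured to:
if the diagnosis decision model outputs a first value, determine to output the diagnosis and treatment information; or
if the diagnosis decision model outputs a second value, determine to proceed to the next inquiry round.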
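To make the "first value / second value" decision concrete, the sketch below runs inference through a tiny fully connected network. Everything in it is hypothetical: the symptom vocabulary, the hand-picked weights (a real model would be trained with reinforcement learning as described above), and the encoding of the first value as 0 and the second value as 1.

```python
from typing import List, Sequence, Set

VOCAB = ["runny nose", "fever", "cough"]  # hypothetical symptom vocabulary
OUTPUT_DIAGNOSIS, ASK_AGAIN = 0, 1        # assumed "first value" / "second value"

# Hand-picked illustrative weights, not trained parameters.
W1 = [[1.0, 1.0, 1.0], [-1.0, -1.0, -1.0]]
B1 = [0.0, 2.0]
W2 = [[1.0, 0.0], [0.0, 1.0]]
B2 = [-1.5, 0.0]


def encode(symptoms: Set[str]) -> List[float]:
    """Multi-hot encode the confirmed symptoms over the vocabulary."""
    return [1.0 if name in symptoms else 0.0 for name in VOCAB]


def dense(x: Sequence[float], weights: Sequence[Sequence[float]], bias: Sequence[float]) -> List[float]:
    """One fully connected layer: weights @ x + bias."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def decide(symptoms: Set[str]) -> int:
    """Return OUTPUT_DIAGNOSIS or ASK_AGAIN for the given symptom information."""
    hidden = [max(0.0, h) for h in dense(encode(symptoms), W1, B1)]  # ReLU
    logits = dense(hidden, W2, B2)
    return OUTPUT_DIAGNOSIS if logits[0] >= logits[1] else ASK_AGAIN


few = decide({"runny nose"})
many = decide({"runny nose", "fever", "cough"})
print(few, many)  # 1 0
```

With these toy weights the network asks again when only one symptom is known and outputs once several are confirmed; a trained model would learn its own decision boundary.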
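FIG. 5 is a second schematic structural diagram of the consultation processing apparatus provided by an embodiment of this application. On the basis of the apparatus shown in FIG. 4, and as shown in FIG. 5, the consultation processing apparatus 400 of this embodiment further includes a sending module 404.
The processing module 403 is configured to determine, if it is determined to proceed to the next inquiry round, a target inquiry symptom from the first candidate symptom set;
the sending module 404 is configured to send inquiry information to the client, the inquiry information being used to ask the user whether they have the target inquiry symptom;
the receiving module 401 is configured to receive reply information from the client, and the processing module 403 is configured to update the diagnosis and treatment information based on the reply information and determine whether to output the updated diagnosis and treatment information or proceed to the next inquiry round.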
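In an optional implementation of this embodiment, the processing module 403 is configured to perform at least one of the following:
from the highest-scoring candidate disease among the multiple candidate diseases in the first candidate disease set, select a detailed symptom of at least one candidate symptom confirmed by the user in the current inquiry round as the target inquiry symptom; or
from the highest-scoring candidate disease among the multiple candidate diseases in the first candidate disease set, select a symptom other than those the user has already confirmed as the target inquiry symptom; or
from all candidate symptoms of the multiple candidate diseases in the first candidate disease set, select the candidate symptom that makes the overall information entropy of the first candidate disease set fall fastest as the target inquiry symptom.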
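In an optional implementation of this embodiment, the overall information entropy of a first candidate symptom in the first candidate symptom set with respect to the first candidate disease set is determined from the ratio of the sum of the scores of all candidate diseases in the second candidate disease set of the next inquiry round to the sum of the scores of all candidate diseases in the first candidate disease set;
wherein the second candidate disease set is determined from the first candidate disease set and the first candidate symptom, and the first candidate symptom is any candidate symptom in the first candidate symptom set.
In an optional implementation of this embodiment, the consultation knowledge graph includes nodes for diseases, symptoms, disease causes, disease department information, differential symptoms, complications, and disease course;
the obtaining module 402 is configured to obtain, according to the at least one item of symptom data, the candidate disease set corresponding to each item of symptom data from the consultation knowledge graph, to obtain the first candidate disease set.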
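The knowledge-graph lookup described above can be sketched with two plain dictionaries standing in for the graph's symptom-disease edges. The text does not say whether the per-symptom candidate sets are merged by union or intersection; the sketch assumes a union, and the function name and toy graph are likewise assumptions.

```python
from typing import Dict, Iterable, Set, Tuple


def candidate_sets(
    symptoms: Iterable[str],
    symptom_to_diseases: Dict[str, Set[str]],
    disease_to_symptoms: Dict[str, Set[str]],
) -> Tuple[Set[str], Set[str]]:
    """Return (candidate diseases, candidate symptoms) for the input symptoms."""
    diseases: Set[str] = set()
    for symptom in symptoms:
        # Assumption: merge the per-symptom candidate disease sets by union.
        diseases |= symptom_to_diseases.get(symptom, set())
    cand_symptoms: Set[str] = set()
    for disease in diseases:
        cand_symptoms |= disease_to_symptoms.get(disease, set())
    return diseases, cand_symptoms


# Toy graph with hypothetical edges.
symptom_to_diseases = {
    "runny nose": {"cold", "rhinitis"},
    "fever": {"cold", "flu"},
}
disease_to_symptoms = {
    "cold": {"runny nose", "fever"},
    "rhinitis": {"runny nose", "sneezing"},
    "flu": {"fever", "muscle ache"},
}
diseases, cand_symptoms = candidate_sets(
    ["runny nose"], symptom_to_diseases, disease_to_symptoms
)
print(sorted(diseases), sorted(cand_symptoms))
```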
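The consultation processing apparatus provided by this embodiment can execute the technical solution of any of the method embodiments above; its implementation principle and technical effects are similar and are not repeated here.
FIG. 6 is a hardware structure diagram of the electronic device provided by an embodiment of this application. As shown in FIG. 6, the electronic device 500 provided by this embodiment includes:
a memory 501;
a processor 502; and
a computer program;
wherein the computer program is stored in the memory 501 and is configured to be executed by the processor 502 to implement the technical solution of any of the method embodiments above; its implementation principle and technical effects are similar and are not repeated here.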
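Optionally, the memory 501 may be independent or integrated with the processor 502. When the memory 501 is a device independent of the processor 502, the electronic device 500 further includes a bus 503 for connecting the memory 501 and the processor 502.
An embodiment of this application further provides a computer-readable storage medium storing a computer program, the computer program being executed by the processor 502 to implement the technical solution of any of the foregoing method embodiments.
An embodiment of this application provides a computer program product including a computer program which, when executed by a processor, implements the technical solution of any of the foregoing method embodiments.
An embodiment of this application further provides a chip including a processing module and a communication interface, the processing module being able to execute the technical solution of any of the foregoing method embodiments.
Further, the chip also includes a storage module (for example, a memory) for storing instructions; the processing module executes the instructions stored in the storage module, and executing those instructions causes the processing module to perform the technical solution of any of the foregoing method embodiments.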
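It should be understood that the processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), and so on. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in this application may be carried out directly by a hardware processor, or by a combination of hardware and software modules in the processor.
The memory may include high-speed RAM and may also include non-volatile memory (NVM), such as at least one disk memory; it may also be a USB flash drive, a removable hard disk, a read-only memory, a magnetic disk, or an optical disc.
The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. Buses may be divided into address buses, data buses, control buses, and so on. For ease of representation, the bus in the drawings of this application is not limited to a single bus or a single type of bus.
The storage medium may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random-access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disc. The storage medium may be any available medium that a general-purpose or special-purpose computer can access.
An exemplary storage medium is coupled to the processor so that the processor can read information from, and write information to, the storage medium. The storage medium may also be an integral part of the processor. The processor and the storage medium may reside in an application-specific integrated circuit (ASIC), or they may exist as discrete components in the electronic device.
Finally, it should be noted that the embodiments above are only intended to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, or make equivalent replacements of some or all of the technical features; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of this application.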

Claims (16)

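  1. A consultation processing method, comprising:
    receiving symptom information from a client, the symptom information comprising at least one item of symptom data input by a user;
    obtaining, based on a preset consultation knowledge graph, a first candidate disease set and a first candidate symptom set corresponding to the symptom information, the first candidate disease set comprising multiple candidate diseases, and the first candidate symptom set comprising all candidate symptoms of the multiple candidate diseases;
    determining, through probability analysis, from the first candidate disease set, diagnosis and treatment information corresponding to the symptom information up to the current inquiry round, the diagnosis and treatment information indicating at least one candidate disease;
    determining, according to any one of a preset number of inquiries, the information entropy of the at least one candidate disease in the diagnosis and treatment information, or a pre-trained diagnosis decision model, whether to output the diagnosis and treatment information or proceed to the next inquiry round.
  2. The method according to claim 1, wherein determining, through probability analysis, from the first candidate disease set, the diagnosis and treatment information corresponding to the symptom information up to the current inquiry round comprises:
    obtaining a score for each candidate disease in the first candidate disease set, the score indicating the probability that the user has the candidate disease;
    determining, according to the scores of the multiple candidate diseases in the first candidate disease set, the diagnosis and treatment information corresponding to the symptom information up to the current inquiry round, the diagnosis and treatment information comprising a preset number of candidate diseases taken in descending order of score from the multiple candidate diseases.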
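  3. The method according to claim 2, wherein obtaining the score for each candidate disease in the first candidate disease set comprises:
    obtaining the contribution of each candidate symptom of the first candidate disease to the first candidate disease, the contribution indicating the sample-statistics probability that the first candidate disease is accompanied by the candidate symptom, and the first candidate disease being any one of the multiple candidate diseases;
    determining the score of the first candidate disease according to the contributions of all candidate symptoms of the first candidate disease to the first candidate disease.
  4. The method according to claim 3, wherein determining the score of the first candidate disease according to the contributions of all candidate symptoms of the first candidate disease to the first candidate disease comprises:
    determining the score of the first candidate disease according to the contributions of all candidate symptoms of the first candidate disease to the first candidate disease and a noise parameter.
  5. The method according to any one of claims 1-4, wherein determining, according to the information entropy of the at least one candidate disease in the diagnosis and treatment information, whether to output the diagnosis and treatment information or proceed to the next inquiry round comprises:
    if the sum of the information entropies of all candidate diseases among the at least one candidate disease is less than a preset threshold, determining to output the diagnosis and treatment information; or
    if the sum of the information entropies of all candidate diseases among the at least one candidate disease is greater than or equal to the preset threshold, determining to proceed to the next inquiry round.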
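  6. The method according to any one of claims 1-5, wherein determining, according to the preset number of inquiries, whether to output the diagnosis and treatment information or proceed to the next inquiry round comprises:
    determining whether the current inquiry round has reached the preset number of inquiries;
    if the current inquiry round has reached the preset number of inquiries, determining to output the diagnosis and treatment information; or, if the current inquiry round has not reached the preset number of inquiries, determining to proceed to the next inquiry round.
  7. The method according to any one of claims 1-6, wherein determining, according to the pre-trained diagnosis decision model, whether to output the diagnosis and treatment information or proceed to the next inquiry round comprises:
    inputting the symptom information into the pre-trained diagnosis decision model, and determining, according to the output value of the diagnosis decision model, whether to output the diagnosis and treatment information or proceed to the next inquiry round;
    wherein the diagnosis decision model is obtained by training a fully connected neural network with a reinforcement learning algorithm on multiple sample sequences, each sample sequence comprising at least one item of symptom data and a decision result, and the decision result indicating whether to output diagnosis and treatment information or proceed to the next inquiry round.
  8. The method according to claim 7, wherein determining, according to the output value of the diagnosis decision model, whether to output the diagnosis and treatment information or proceed to the next inquiry round comprises:
    if the diagnosis decision model outputs a first value, determining to output the diagnosis and treatment information; or
    if the diagnosis decision model outputs a second value, determining to proceed to the next inquiry round.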
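  9. The method according to any one of claims 1-8, wherein the method further comprises:
    if it is determined to proceed to the next inquiry round, determining a target inquiry symptom from the first candidate symptom set, and sending inquiry information to the client, the inquiry information being used to ask the user whether they have the target inquiry symptom;
    receiving reply information from the client, updating the diagnosis and treatment information based on the reply information, and determining whether to output the updated diagnosis and treatment information or proceed to the next inquiry round.
  10. The method according to claim 9, wherein determining the target inquiry symptom from the first candidate symptom set comprises at least one of the following:
    from the highest-scoring candidate disease among the multiple candidate diseases in the first candidate disease set, selecting a detailed symptom of at least one candidate symptom confirmed by the user in the current inquiry round as the target inquiry symptom; or
    from the highest-scoring candidate disease among the multiple candidate diseases in the first candidate disease set, selecting a symptom other than those the user has already confirmed as the target inquiry symptom; or
    from all candidate symptoms of the multiple candidate diseases in the first candidate disease set, selecting the candidate symptom that makes the overall information entropy of the first candidate disease set fall fastest as the target inquiry symptom.
  11. The method according to claim 10, wherein the overall information entropy of a first candidate symptom in the first candidate symptom set with respect to the first candidate disease set is determined from the ratio of the sum of the scores of all candidate diseases in the second candidate disease set of the next inquiry round to the sum of the scores of all candidate diseases in the first candidate disease set;
    wherein the second candidate disease set is determined from the first candidate disease set and the first candidate symptom, and the first candidate symptom is any candidate symptom in the first candidate symptom set.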
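  12. The method according to any one of claims 1-11, wherein the consultation knowledge graph comprises nodes for diseases, symptoms, disease causes, disease department information, differential symptoms, complications, and disease course;
    obtaining, based on the preset consultation knowledge graph, the first candidate disease set corresponding to the symptom information comprises: obtaining, according to the at least one item of symptom data, the candidate disease set corresponding to each item of symptom data from the consultation knowledge graph, to obtain the first candidate disease set.
  13. A consultation processing apparatus, comprising:
    a receiving module, configured to receive symptom information from a client, the symptom information comprising at least one item of symptom data input by a user;
    an obtaining module, configured to obtain, based on a preset consultation knowledge graph, a first candidate disease set and a first candidate symptom set corresponding to the symptom information, the first candidate disease set comprising multiple candidate diseases, and the first candidate symptom set comprising all candidate symptoms of the multiple candidate diseases;
    a processing module, configured to determine, through probability analysis, from the first candidate disease set, diagnosis and treatment information corresponding to the symptom information up to the current inquiry round, the diagnosis and treatment information indicating at least one candidate disease;
    and to determine, according to any one of a preset number of inquiries, the information entropy of the at least one candidate disease in the diagnosis and treatment information, or a pre-trained diagnosis decision model, whether to output the diagnosis and treatment information or proceed to the next inquiry round.
  14. An electronic device, comprising: a memory, a processor, and a computer program; wherein the computer program is stored in the memory and is configured to be executed by the processor to implement the method according to any one of claims 1-12.
  15. A computer-readable storage medium storing a computer program, the computer program being executed by a processor to implement the method according to any one of claims 1-12.
  16. A computer program product, comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-12.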
PCT/CN2022/137008 2021-12-30 2022-12-06 Consultation processing method, apparatus, device, and storage medium WO2023124837A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111662512.0A CN114300127A (zh) 2021-12-30 2021-12-30 Consultation processing method, apparatus, device, and storage medium
CN202111662512.0 2021-12-30

Publications (1)

Publication Number Publication Date
WO2023124837A1 true WO2023124837A1 (zh) 2023-07-06

Family

ID=80973176

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/137008 WO2023124837A1 (zh) 2021-12-30 2022-12-06 Consultation processing method, apparatus, device, and storage medium

Country Status (2)

Country Link
CN (1) CN114300127A (zh)
WO (1) WO2023124837A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114300127A (zh) * 2021-12-30 2022-04-08 北京京东拓先科技有限公司 Consultation processing method, apparatus, device, and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
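CN110504028A (zh) * 2019-08-22 2019-11-26 上海软中信息系统咨询有限公司 Disease consultation method, apparatus, system, computer device, and storage medium
CN111984771A (zh) * 2020-07-17 2020-11-24 北京欧应信息技术有限公司 Automatic consultation system based on intelligent dialogue
CN112037880A (zh) * 2020-08-31 2020-12-04 康键信息技术(深圳)有限公司 Medication recommendation method, apparatus, device, and storage medium
CN114300127A (zh) * 2021-12-30 2022-04-08 北京京东拓先科技有限公司 Consultation processing method, apparatus, device, and storage medium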

Also Published As

Publication number Publication date
CN114300127A (zh) 2022-04-08

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22914092

Country of ref document: EP

Kind code of ref document: A1