CN108461110B - Medical information processing method, device and equipment

Info

Publication number: CN108461110B
Application number: CN201710093245.7A
Authority: CN
Inventors: 宣森炎; 郑重; 李楠
Original assignee: Alibaba Group Holding Ltd
Current assignee: Alibaba Group Holding Ltd
Priority date: 2017-02-21
Filing date: 2017-02-21
Publication date: 2021-07-23
Anticipated expiration: 2037-02-21
Also published as: CN108461110A

Abstract

The application provides a medical information processing method, device and equipment. The method comprises: obtaining in advance, from at least one electronic medical record, disease type information, symptom description information, and the time nodes at which symptoms occur, wherein a time node describes the length of time elapsed since the onset of the disease; integrating in advance the results obtained from the multiple electronic medical records of each disease type to obtain, for each disease type, a correspondence model between symptom description information and time nodes; and, after target symptom information and its time node are obtained, searching the pre-constructed correspondence model for a correspondence that matches the target symptom information and its time node, and performing disease analysis according to the search result. Applying this scheme can improve the efficiency and accuracy of disease analysis.

Description

Medical information processing method, device and equipment

Technical Field

The present application relates to the field of computer technologies, and in particular, to a method, an apparatus, and a device for processing medical information.

Background

In the medical industry, doctors perform disease analysis, such as diagnosing diseases and predicting how a disease will develop, mainly on the basis of experience. A doctor's experience is subjective and hard to quantify, and it is acquired only through long clinical practice, communication, and summarization. Disease analysis that relies on manual experience is therefore inefficient, and because that experience is subjective and difficult to acquire, its accuracy is also low.

Disclosure of Invention

The application provides a medical information processing method, a medical information processing device and medical information processing equipment, and aims to solve the problems of low disease analysis efficiency and low accuracy in the prior art.

According to a first aspect of embodiments of the present application, there is provided a medical information processing method, the method including:

obtaining in advance, from at least one electronic medical record, disease type information, symptom description information, and the time nodes at which symptoms occur, wherein a time node describes the length of time elapsed since the onset of the disease;

integrating in advance the results obtained from the multiple electronic medical records of each disease type to obtain, for each disease type, a correspondence model between symptom description information and time nodes;

after target symptom information and its time node are obtained, searching the pre-constructed correspondence model for a correspondence that matches the target symptom information and its time node, and performing disease analysis according to the search result.

According to a second aspect of embodiments of the present application, there is provided a medical information processing apparatus including:

a model building module, configured to obtain in advance, from at least one electronic medical record, disease type information, symptom description information, and the time nodes at which symptoms occur, wherein a time node describes the length of time elapsed since the onset of the disease, and to integrate the results obtained from the multiple electronic medical records of each disease type to obtain, for each disease type, a correspondence model between symptom description information and time nodes;

and an information analysis module, configured to, after target symptom information and its time node are obtained, search the pre-constructed correspondence model for a correspondence that matches the target symptom information and its time node, and to perform disease analysis according to the search result.

According to a third aspect of embodiments of the present application, there is provided an electronic apparatus, including:

a processor; a memory for storing the processor-executable instructions;

wherein the processor is configured to:

after target symptom information and its time node are obtained, search a pre-constructed correspondence model for a correspondence that matches the target symptom information and its time node, and perform disease analysis according to the search result;

wherein the correspondence model is constructed by:

obtaining, from at least one electronic medical record, disease type information, symptom description information, and the time nodes at which symptoms occur, wherein a time node describes the length of time elapsed since the onset of the disease;

and integrating the results obtained from the multiple electronic medical records of each disease type to obtain, for each disease type, a correspondence model between symptom description information and time nodes.

With the medical information processing method, device and equipment of the present application, disease type information, symptom description information, and the time nodes at which symptoms occur can be obtained from at least one electronic medical record by natural language processing, and a correspondence model between symptom description information and time nodes can be built for each disease type. After target symptom information and its time node are obtained, the pre-built correspondence model is searched for a matching correspondence, and disease analysis is performed according to the search result. Because each disease is characterized by the electronic medical records of multiple patients, covering multiple time nodes and the symptoms that appear at each of them, the reference data is comprehensive, which improves the accuracy of the analysis result; and because the analysis is automatic, its efficiency is improved as well.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.

Fig. 1 is a flowchart of an embodiment of a medical information processing method according to the present application.

Fig. 2 is a flowchart of an embodiment of symptom normalization in the medical information processing method of the present application.

Fig. 3 is a flowchart of an embodiment of constructing a sequelae model in the medical information processing method of the present application.

Fig. 4 is a flowchart of another embodiment of the medical information processing method of the present application.

Fig. 5 is a hardware configuration diagram of an electronic device in which the medical information processing apparatus according to the present application is located.

Fig. 6 is a block diagram of an embodiment of a medical information processing apparatus according to the present application.

Detailed Description

Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.

It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information, such information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present application. Depending on the context, the word "if" as used herein may be interpreted as "upon", "when", or "in response to determining".

Currently, in the medical industry, doctors perform disease analysis, such as disease diagnosis and prediction of disease development trends, mainly on the basis of personal experience. Doctors differ in practical experience and learning ability and therefore accumulate different experience, and less experienced doctors often analyze a disease from the symptoms observed at a single point in time, so the data they rely on is one-sided and the analysis inaccurate. Moreover, analysis that relies on experience is inefficient.

To avoid the low accuracy and low efficiency of such disease analysis, the present application provides a medical information processing method that can be divided into a construction stage and an application stage of the correspondence model. In one example, the construction stage and the application stage are performed by the same electronic device. In another example, the two stages are performed by different devices: the construction stage involves big data analysis and calls for a device with high processing capability, whereas the application stage places relatively low demands on the device. Once the correspondence model has been built, different electronic devices can share it, which avoids the waste of resources that would result from each device building its own model.

Fig. 1 is a flowchart of an embodiment of the medical information processing method of the present application. The method may include the following steps 101 to 103, where steps 101 and 102 form the stage in which the correspondence model is constructed in advance, and step 103 is the stage in which the correspondence model is applied to perform disease analysis.

In step 101, disease type information, symptom description information, and the time nodes at which symptoms occur are obtained in advance from at least one electronic medical record, where a time node describes the length of time elapsed since the onset of the disease.

In step 102, the results obtained from the multiple electronic medical records of each disease type are integrated in advance to obtain, for each disease type, a correspondence model between symptom description information and time nodes.

In step 103, after target symptom information and its time node are obtained, the pre-constructed correspondence model is searched for a correspondence that matches the target symptom information and its time node, and disease analysis is performed according to the search result.

Electronic medical records usually contain medical information such as a patient's symptoms, the times at which the symptoms appeared, and the diagnosis result. The present application performs big data analysis on the medical information in electronic medical records to obtain the symptoms each disease exhibits at different time nodes, that is, the course of the disease. For each disease type, these symptoms are assembled into a correspondence model between symptom description information and time nodes, which is then used for disease analysis.
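Steps 101 to 103 can be illustrated with a minimal Python sketch. The data layout (a nested mapping from disease type to time node to a set of symptoms), the function names `build_model` and `lookup`, and the sample records are assumptions made for illustration only; the application does not prescribe a concrete data structure.

```python
from collections import defaultdict

def build_model(records):
    """Integrate (disease, time_node, symptom) triples into a nested mapping.

    The triple format is a hypothetical stand-in for the results
    extracted from electronic medical records in steps 101 and 102.
    """
    model = defaultdict(lambda: defaultdict(set))
    for disease, time_node, symptom in records:
        model[disease][time_node].add(symptom)
    return model

def lookup(model, target_symptom, target_time_node):
    """Step 103: return the disease types whose model contains the
    target symptom at the target time node."""
    return sorted(
        disease
        for disease, timeline in model.items()
        if target_symptom in timeline.get(target_time_node, ())
    )

# Illustrative records; time nodes are days since onset.
records = [
    ("wind-cold type cold", 1, "sneezing"),
    ("wind-cold type cold", 3, "runny nose"),
    ("wind-heat type cold", 1, "sore throat"),
    ("wind-heat type cold", 3, "yellow phlegm"),
]
model = build_model(records)
matches = lookup(model, "runny nose", 3)
```

In this sketch a match requires the symptom and the time node to agree exactly; a practical system would presumably tolerate nearby time nodes and rank the candidate disease types.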

As can be seen from the above embodiment, natural language processing can be used to obtain disease type information, symptom description information, and the time nodes at which symptoms occur from unstructured text in electronic medical records, such as descriptions of the patient's condition and medical history, and a correspondence model between symptom description information and time nodes can then be built for each disease type. After target symptom information and its time node are obtained, the pre-constructed correspondence model is searched for a matching correspondence, and disease analysis is performed according to the search result. Because each disease is characterized by the electronic medical records of multiple patients, covering multiple time nodes and the symptoms that appear at each of them, the reference data is comprehensive, which improves the accuracy of the analysis result; and because the analysis is automatic, its efficiency is improved as well.

The disease type information may be identification information that indicates a disease type, such as the name of the disease type. Disease types may be divided coarsely or finely as required. For example, the disease type information may include cold, arrhythmia, coronary heart disease, cerebral hemorrhage, leukemia, diabetes, and the like. To obtain more precise disease types, they may also be subdivided; for example, colds may be divided into wind-cold type colds and wind-heat type colds. In an electronic medical record, the disease type is usually recorded in the diagnosis result, so the disease type information can be extracted directly from the diagnosis result.

The symptom description information is information that describes symptoms; each disease may present one or more symptoms. For example, a patient with wind-cold type cold often has symptoms such as watery nasal discharge, sneezing, a thin and white tongue coating, and aversion to cold, whereas a patient with wind-heat type cold often has symptoms such as sweating, sore throat, dry mouth and tongue, yellow phlegm, and nasal obstruction.

In one example, if the electronic medical record has a dedicated symptom recording area, the symptom description information can be extracted directly from that area to obtain the symptoms of the case.

In another example, if the electronic medical record has no dedicated symptom recording area and the symptom description information is recorded together with other information, the symptom description information must be identified among that other information.

To identify the symptom description information among other information, the characters of a pre-stored symptom description pattern may be matched in the at least one electronic medical record, where a symptom description pattern comprises characters that may appear in the context of symptom description information and the positional relationship between the symptom description information and those characters; the symptom description information is then obtained from the context of the matched text according to that positional relationship.

Symptoms are often described using conventional phrasings, such as "appears … symptom", "appears …", and "accompanied by … phenomenon"; these phrasings may be referred to as symptom description patterns. A symptom description pattern records characters that may appear in the context of symptom description information, such as "appears", "symptom", "accompanied by", and "phenomenon", together with the positional relationship between the symptom description information and those characters, so that the position of the symptom description information can be determined from the positions of the characters. For example, in the "appears … symptom" pattern the symptom description information lies between the characters "appears" and "symptom", and in the "appears …" pattern it follows the characters "appears". Therefore, once the characters of a pre-stored symptom description pattern are matched in the electronic medical record, their positions in the record are known, and the position of the symptom description information can be inferred from those positions and the positional relationship, so that the symptom description information can be obtained from the context of the matched text.
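The pattern-based extraction described above can be sketched in Python as follows. Representing a pattern as a (left context, right context) pair, the English phrasings in `PATTERNS`, and the sample sentence are all assumptions chosen for illustration; the application does not specify how patterns are stored.

```python
import re

# Each hypothetical pattern is (left context, right context).
# The symptom description lies between the two contexts; an empty
# right context means the symptom follows the left context and runs
# up to the next punctuation mark.
PATTERNS = [
    ("symptoms of ", " appear"),
    ("accompanied by ", ""),
]

def extract_symptoms(text, patterns=PATTERNS):
    """Return the text found at the position each pattern prescribes."""
    found = []
    for left, right in patterns:
        if right:
            regex = re.escape(left) + r"(.+?)" + re.escape(right)
        else:
            regex = re.escape(left) + r"([^,.;]+)"
        found.extend(m.group(1).strip() for m in re.finditer(regex, text))
    return found

record = ("On day three symptoms of severe sneezing appear, "
          "accompanied by aversion to cold.")
symptoms = extract_symptoms(record)
```

Here the context characters locate the match, and the positional relationship (between the contexts, or after the left context) determines which span is returned as the symptom description.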

A symptom description pattern may be entered manually or learned from big data by pattern learning.

In one example, the symptom description pattern is determined by:

taking known symptom description information as seeds, and extracting the seeds from at least one electronic medical record using a matching algorithm;

based on the extracted seeds, extracting characters from the context of each seed in the electronic medical record, and identifying the positional relationship between the seed and the characters;

and determining the symptom description pattern according to the frequency of occurrence of the extracted characters and the identified positional relationships.

The known symptom description information may include manually input symptom description information, and may also include symptom description information determined by using the scheme of the present application. For example, after the symptom description pattern is determined, the symptom description information is extracted using the symptom description pattern, and the extracted symptom description information may be used as the known symptom description information in the next round of pattern training.

In this embodiment, known symptom description information is taken as seeds (samples), and the seeds are extracted from at least one electronic medical record using a matching algorithm. The seeds are extracted in order to locate them in the electronic medical record, so that characters can be extracted from the context of each seed, the positional relationship between the seed and the characters can be identified, and the frequency of occurrence of the characters and positional relationship can be used to decide whether they form a symptom description pattern.

In this embodiment, the seeds are matched in the electronic medical record using a matching algorithm. In one example, the matching algorithm may be forward maximum matching, which can improve the accuracy of seed extraction.
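As a minimal sketch of forward maximum matching, the following Python function scans the text from left to right and, at each position, takes the longest entry of the seed lexicon that starts there. The function name, the English lexicon, and the sample text are illustrative assumptions; the original records would be matched character by character in Chinese.

```python
def forward_max_match(text, lexicon):
    """Return (position, seed) pairs found by forward maximum matching."""
    max_len = max(map(len, lexicon), default=0)
    hits = []
    i = 0
    while i < len(text):
        # Try the longest possible candidate first, then shorter ones.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in lexicon:
                hits.append((i, candidate))
                i += size
                break
        else:
            # No seed starts at this position; move one character on.
            i += 1
    return hits

seeds = {"nose", "runny nose", "sneezing"}
hits = forward_max_match("patient has runny nose and sneezing", seeds)
```

Because the longest candidate is tried first, "runny nose" is reported as a single seed and its substring "nose" is not reported separately, which is the property that makes the extracted seed positions more reliable.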

Regarding the frequency of occurrence: many characters can be extracted and many positional relationships identified, but not every extracted character and identified positional relationship constitutes a symptom description pattern. The more often a given combination of characters and positional relationship recurs among all the extracted and identified information, the more likely it is a conventional pattern for describing symptoms. The frequency of occurrence of each combination among all the extracted and identified information is therefore calculated, and the symptom description pattern is determined from it.

For example, after the seeds are extracted, the character strings near each extracted seed and their positional relationships to the seed are enumerated to determine preliminary symptom description patterns. The ratio of the number of occurrences of a preliminary pattern to the number of all preliminary patterns is taken as its frequency of occurrence, the preliminary patterns are scored using this frequency as the index, and the preliminary patterns with high scores are determined to be symptom description patterns.
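The frequency-based selection of patterns can be sketched as below. Several simplifications are assumed here and are not taken from the application: the text is split into whitespace-separated words, the context is limited to one word on each side of the seed, "the number of all preliminary patterns" is read as the total number of context occurrences, and the cutoff `min_freq` is a hypothetical parameter.

```python
from collections import Counter

def learn_patterns(records, seeds, min_freq=0.5):
    """Count (left word, right word) contexts around seeds and keep
    those whose relative frequency reaches min_freq."""
    counts = Counter()
    for text in records:
        words = text.split()
        for i, word in enumerate(words):
            if word in seeds:
                left = words[i - 1] if i > 0 else ""
                right = words[i + 1] if i + 1 < len(words) else ""
                counts[(left, right)] += 1
    total = sum(counts.values())
    # Frequency = occurrences of this context / occurrences of all contexts.
    return {p: c / total for p, c in counts.items() if c / total >= min_freq}

records = [
    "patient shows sneezing symptoms today",
    "patient shows fever symptoms today",
    "he reports cough daily",
]
patterns = learn_patterns(records, {"sneezing", "fever", "cough"})
```

In this toy corpus the context ("shows", "symptoms") accounts for two of the three seed occurrences and is kept, while the context seen only once is discarded.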

After the symptom description patterns are obtained, in one example, if the electronic medical records describe symptoms in a uniform way, the characters of the pre-stored symptom description patterns are matched in the at least one electronic medical record, and the symptom description information is extracted directly from the context of the matched text according to the positional relationship between the symptom description information and the characters.

In this way, symptom description information is extracted directly from the electronic medical record using the symptom description patterns, which improves the efficiency of obtaining it.

In another example, the same symptom may be described in different ways. To facilitate integrating the results obtained from the multiple electronic medical records of each disease type, the symptoms need to be standardized, which may be done by clustering. Specifically, obtaining the symptom description information from the context of the matched text according to the positional relationship between the symptom description information and the characters includes: extracting original symptom description information from the context of the matched text according to that positional relationship; and normalizing the original symptom description information of the same symptom into the same symptom description information.

In this embodiment, the information extracted from the context of the matched text is treated as original symptom description information, and the original symptom description information of the same symptom is normalized into the same symptom description information, which improves the efficiency of subsequent data integration. The normalization may be performed each time a piece of original symptom description information is obtained, or once after all the original symptom description information has been obtained; in either case, the original descriptions of the same symptom are normalized into standard symptom description information.

For the symptom normalization operation, in one example, standard symptom description information and, for each piece of standard symptom description information, a library of original symptom descriptions in which it may appear are preset. The obtained original symptom description information is matched against the entries of these libraries, and when the degree of matching meets a preset requirement, the original symptom description information is normalized into the standard symptom description information corresponding to the matching library.

In another example, the present application further provides a symptom normalization method. Fig. 2 is a flowchart of an embodiment of symptom normalization in the medical information processing method of the present application. Normalizing the original symptom description information of the same symptom into the same symptom description information includes steps 201 to 204:

In step 201, for part of the original symptom description information among the extracted information, the original symptom descriptions that appear at the same time in the same disease are placed in different clusters.

The extracted information is the original symptom description information extracted from the context of the matched text, and part of it is used as the basis for the initial division into clusters. Because the symptoms described for the same disease at the same time differ from one another, all the original symptom descriptions that appear at the same time in the same disease can be placed in different clusters. Dividing the clusters preliminarily in this way can improve the efficiency and accuracy of the clustering.

In step 202, the similarity between the current original symptom description and the original symptom descriptions in a cluster is calculated, and according to the similarity it is determined whether to add the current description to that cluster or to create a new cluster and add the current description to it.

Here, the current original symptom description is a description among the extracted information that has not yet been added to any cluster.

If the similarity is higher than a preset threshold, the current original symptom description is added to the cluster; if the similarity is lower than the preset threshold, a new cluster is created and the current description is added to the new cluster.

In one example, the similarity between two original symptom descriptions can be expressed as the ratio of the number of identical characters to the length of the shorter character string.
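One possible reading of this similarity measure is sketched below in Python. Counting "identical characters" as the size of the multiset intersection of the two strings' characters is an assumption; the application does not state whether repeated characters or character order are taken into account.

```python
from collections import Counter

def similarity(a, b):
    """Shared character count divided by the length of the shorter string."""
    if not a or not b:
        return 0.0
    # Counter intersection keeps, per character, the smaller of the two counts.
    shared = sum((Counter(a) & Counter(b)).values())
    return shared / min(len(a), len(b))

score = similarity("sneezing", "sneeze")
```

Dividing by the shorter length means that a short description fully contained in a longer one scores 1.0, which favors grouping abbreviated and full forms of the same symptom.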

In this step, every original symptom description that has not yet been added to a cluster is thus added either to an existing cluster or to a newly created one.

In step 203, after all the original symptom descriptions have been added to clusters, whether two clusters are to be merged is decided according to the highest similarity between the original symptom descriptions of the different clusters, and the corresponding processing is performed.

Once all the original symptom descriptions have been added to clusters, the clustering of the original symptom descriptions is complete. Because different clusters may still represent the same symptom, whether clusters are to be merged can be decided according to the highest similarity between their original symptom descriptions. Take a first cluster and a second cluster as an example: each original symptom description in the first cluster is compared with each original symptom description in the second cluster to obtain a similarity value. Once all the pairwise similarity values have been obtained, the highest of them is determined and compared with a preset similarity threshold; if it is greater than the threshold, the first cluster and the second cluster are merged, and otherwise they are not.

In step 204, after the merge decision and the corresponding processing have been completed for all clusters, the original symptom descriptions in the same cluster are unified into the same symptom description information.

At that point each cluster represents a different symptom, so the original symptom descriptions in the same cluster can be unified into the same symptom description information; that is, one piece of standard symptom description information is assigned to each cluster. The standard symptom description information may be assigned manually by naming the cluster, or the symptom description that occurs most frequently in the cluster may be designated as the standard symptom description information, and so on.

Because a cluster contains the different original descriptions of one symptom, when an original symptom description is found to belong to a cluster, it is converted into the standard symptom description information of that cluster.

As can be seen from the above embodiment, normalizing symptoms by clustering is easy to implement.
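Steps 201 to 204 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the implementation of the application: the similarity is the character-overlap ratio read as a multiset intersection, a description joins the most similar existing cluster when the similarity reaches the threshold (the application leaves the case of equality open), the threshold value 0.8 is arbitrary, and the standard description is chosen as the most frequent member of each cluster.

```python
from collections import Counter

def similarity(a, b):
    shared = sum((Counter(a) & Counter(b)).values())
    return shared / min(len(a), len(b)) if a and b else 0.0

def best_link(c1, c2):
    """Highest pairwise similarity between two clusters."""
    return max(similarity(x, y) for x in c1 for y in c2)

def normalize(initial, remaining, threshold=0.8):
    # Step 201: descriptions seen at the same time in the same disease
    # start out in separate clusters.
    clusters = [[d] for d in initial]
    # Step 202: add each remaining description to the most similar
    # cluster, or open a new cluster if none is similar enough.
    for desc in remaining:
        best = max(clusters, key=lambda c: best_link(c, [desc]), default=None)
        if best is not None and best_link(best, [desc]) >= threshold:
            best.append(desc)
        else:
            clusters.append([desc])
    # Step 203: merge clusters whose highest cross-cluster similarity
    # exceeds the threshold, repeating until no merge applies.
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if best_link(clusters[i], clusters[j]) > threshold:
                    clusters[i].extend(clusters.pop(j))
                    merged = True
                    break
            if merged:
                break
    # Step 204: map every description to its cluster's most frequent member.
    mapping = {}
    for cluster in clusters:
        standard = Counter(cluster).most_common(1)[0][0]
        for desc in cluster:
            mapping[desc] = standard
    return mapping

mapping = normalize(
    initial=["sneezing", "runny nose"],
    remaining=["sneezing", "sneezin", "runny nose", "cough"],
)
```

With these inputs the truncated description "sneezin" is mapped to "sneezing", while "cough" is not similar enough to either initial cluster and forms a cluster of its own.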

In an optional implementation, to improve the accuracy of the symptom description information, the information extracted from the context of the matched text may be treated as candidate symptoms. Each candidate symptom is scored by its string similarity to the seeds; candidates whose scores meet the requirement are taken as original symptom description information, and candidates whose scores do not are discarded. The string similarity may include the similarity of English characters, of Chinese characters, of digits, and so on.

Screening the information extracted from the context of the matched text in this way helps ensure that what remains is original symptom description information, which improves its accuracy.

A time node describes the length of time elapsed since the onset of the disease. Its purpose is to normalize the times at which symptoms appear under the same disease into relative times measured from a common reference, which makes it feasible and efficient to integrate the results obtained from the multiple electronic medical records of each disease type. A time node takes the onset time as its starting time; for example, time nodes may be "the first day", "the second day", "the third day", and so on, or "after one day", "after five days", "after thirty days", and so on.

In one example, if the time of occurrence of the symptom in the electronic medical record is described in a time node manner, the time node can be directly extracted from the context of the symptom description information of at least one piece of electronic medical record by using a pre-stored time expression, so that the efficiency of obtaining the time node is improved.

The time expression is time description information that does not contain a specific time value, for example, "day N", "month N", and the like, and is used for describing time nodes.

For example, an electronic medical record reads: slight sneezing appeared on the first day; severe sneezing and rhinorrhea appeared on the third day; on the fourth day, severe sneezing and rhinorrhea appeared, accompanied by chills and a thin white tongue coating; the patient came to our hospital and was diagnosed with a wind-cold type common cold.

As can be seen, symptom description patterns such as "... appeared", "symptoms of ... appeared", and "accompanied by symptoms of ..." can be used to extract the symptom description information; the time nodes "day one", "day three", and "day four" can be extracted directly from the medical record using a time expression of the form "day N".

In another example, if the time of occurrence of a symptom in an electronic medical record is described not in the form of a time node but in another form, the time information needs to be normalized to a relative time that takes the onset time as the reference time, so as to obtain the time node at which the symptom appears.

Specifically, obtaining a time node of occurrence of a symptom from at least one electronic medical record in advance includes:

extracting time information from the context of the symptom description information of at least one electronic medical record by utilizing a pre-stored time expression;

and normalizing the extracted time information into relative time taking the onset time as reference time to obtain a time node where the symptom appears.

The time expression is time description information that does not contain a specific time value, and is a phrase describing the symptom occurrence time in the electronic medical record. For example, the time expression may be: days, months, years, the rest of the days, etc. The time expression may be determined by means of manual input. For example, the user enters the time expression directly into the system. For another example, the user inputs commonly used time samples (e.g., three days ago, one month ago, ten days ago, one week ago, etc.) into the system, and the system replaces the Chinese or Arabic numerals in each time sample with a fixed character, thereby obtaining the time expressions.
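As a rough illustration of deriving time expressions from user-supplied time samples, the sketch below replaces the number in each sample with a fixed placeholder character and then compiles the resulting templates into one search pattern. It works on English text for readability; the placeholder `#`, the list of number words, and all function names are assumptions made for this example only.

```python
import re

# Assumed, deliberately small vocabulary of spelled-out numbers.
NUMBER_WORDS = ["one", "two", "three", "four", "five",
                "six", "seven", "eight", "nine", "ten"]
NUMBER = r"(?:\d+|" + "|".join(NUMBER_WORDS) + ")"
PLACEHOLDER = "#"  # the "fixed character" that stands in for any number


def to_time_expression(sample):
    """Replace the number in a time sample with the placeholder."""
    return re.sub(r"\b" + NUMBER + r"\b", PLACEHOLDER, sample.lower())


def compile_time_expressions(samples):
    """Build the distinct time expressions and a regex that matches them."""
    templates = sorted({to_time_expression(s) for s in samples})
    parts = [re.escape(t).replace(re.escape(PLACEHOLDER), NUMBER)
             for t in templates]
    pattern = re.compile(r"\b(?:" + "|".join(parts) + ")", re.IGNORECASE)
    return templates, pattern


samples = ["three days ago", "one month ago", "ten days ago", "one week ago"]
templates, pattern = compile_time_expressions(samples)
found = pattern.findall("Weakness began 3 days ago; dizziness one week ago.")
```

Because the number is abstracted away, the two samples "three days ago" and "ten days ago" collapse into a single time expression, and the compiled pattern also matches values that never appeared in the samples.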

After the time expression is obtained, traversal search can be performed in the electronic medical record, so that time information can be extracted. In the electronic medical record, the sequence of the time information generally follows the time sequence of the actual occurrence of symptoms. The time information segments the electronic medical record into descriptions of different time periods, and the extracted symptoms in each period represent detailed symptoms of the current time period.

By extracting time information from the context of the symptom description information of at least one electronic medical record using a pre-stored time expression, the correspondence between the time information and the symptom description information can be obtained, where the extracted time information is the occurrence time of the symptom described by that symptom description information.

After extracting time information from the context of symptom description information using a pre-stored time expression, the extracted time information may be normalized to a relative time with the onset time as a reference time, to obtain a time node at which a symptom appears.

Because the sequence of the time information in the electronic medical record generally follows the time sequence of the actual occurrence of symptoms, the first time information can be determined as the disease occurrence time. After determining the onset time, all time information may be normalized to a relative time with the onset time as a reference time, thereby obtaining a time node at which symptoms appear.

The present application also provides a specific normalization method, where the normalization of the extracted time information into a relative time with the onset time as a reference time to obtain a time node at which a symptom appears includes:

extracting a number in the time information, and taking the number as an absolute value of time;

extracting a time unit in the time information, and converting an absolute time value into a time value of a unified time unit according to the time unit;

extracting information used for describing relative relation of time in the time information;

determining the onset time of the disease in the electronic medical record;

and normalizing the time information into a relative time taking the onset time as the reference time according to the time value, the information for describing the relative relation of time, and the onset time, to obtain the time node at which the symptom appears.

Because the time information within the same electronic medical record may be described using different time units, the number in the time information can be extracted first and taken as the absolute time value; the time unit in the time information is then extracted, and the absolute time value is converted into a time value in a unified time unit according to that time unit. The unified time unit may be designated in advance, for example, as a day, a week, etc. The conversion relationship between the different time units and the unified time unit can be established in advance; for example, 1 week is converted to 7 days, one month to 30 days, one year to 365 days, and the like.

In addition to the number and the time unit, the time information may also include information describing a relative time relationship, and this information may appear after the time unit, such as "after" and "before". Therefore, the information for describing the relative relationship of time in the time information can be extracted.

Because the sequence of the time information in the electronic medical record generally follows the time sequence of the actual occurrence of symptoms, the first time information can be determined as the disease occurrence time.

After the time value, the information for describing the relative relationship between times, and the onset time are determined, the time information may be normalized to a relative time with the onset time as a reference time according to the time value, the information for describing the relative relationship between times, and the onset time, so as to obtain a time node where symptoms appear.

For example, an electronic medical record reads: 3 months ago, the patient developed slightly impaired movement of both lower limbs; 1 week ago, the patient had no weakness, no emaciation, no dizziness or headache, no pale complexion, no fever or chills, and no slurred speech; today, the patient came to our hospital for further diagnosis and treatment.

Therefore, the time information can be extracted from the electronic medical record by using the pre-stored time expression: 3 months ago, 1 week ago, today. By using the scheme of the application, the time information can be converted into: day 1, day 83, day 90.
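The normalization steps above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation of the present application: it handles only English phrases of the form "N unit(s) ago" plus "today", uses the day as the unified time unit with the conversion factors mentioned above, and treats the first time information as the onset time. The names `DAYS_PER_UNIT`, `parse_days_ago`, and `normalize` are hypothetical.

```python
import re

# Conversion to the unified time unit (days), as in the example factors above.
DAYS_PER_UNIT = {"day": 1, "week": 7, "month": 30, "year": 365}
TIME_RE = re.compile(r"(\d+)\s*(day|week|month|year)s?\s+(ago)")


def parse_days_ago(time_info):
    """Convert one piece of time information into 'days before the visit'."""
    text = time_info.strip().lower()
    if text == "today":
        return 0
    match = TIME_RE.fullmatch(text)
    if match is None:
        raise ValueError("unsupported time information: " + time_info)
    # number -> absolute time value; unit -> conversion; relation -> direction
    number, unit, _relation = match.groups()
    return int(number) * DAYS_PER_UNIT[unit]


def normalize(time_infos):
    """Return days elapsed since onset for each piece of time information."""
    days_ago = [parse_days_ago(t) for t in time_infos]
    onset = days_ago[0]  # the first time information is taken as the onset
    return [onset - d for d in days_ago]


nodes = normalize(["3 months ago", "1 week ago", "today"])  # [0, 83, 90]
```

The sketch reports elapsed days, so the onset itself comes out as 0; the example above labels the onset as day 1 and keeps 83 and 90 for the later nodes, which is a labeling choice on top of the same arithmetic (90 - 7 = 83, 90 - 0 = 90).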

It can be understood that other normalization manners may also be adopted to normalize the extracted time information into a relative time using the onset time as a reference time, and obtain a time node where a symptom appears, which is not described in detail herein.

After the disease type information, the symptom description information and the time node of the occurrence of the symptom are obtained, the obtained results of the plurality of electronic medical records of each disease type can be integrated, and the corresponding relation model of the symptom description information and the time node of each disease type information is obtained. Because the time nodes in each electronic medical record are relative time taking the disease occurrence time as reference time, and the disease occurrence time of the same disease is mostly the same, the symptom description information corresponding to each disease type and the time nodes with the symptoms are integrated to obtain a corresponding relation model of the symptom description information of each disease type and the time nodes, and the corresponding relation model records the symptoms of the disease appearing at different time nodes, so that the corresponding relation model can be used for disease analysis and the like.

In one example, for each electronic medical record, the symptom description information of the disease appearing at different time nodes in that electronic medical record is integrated according to the obtained disease type information, symptom description information, and time nodes at which the symptoms appear; the corresponding relation model of the symptom description information and the time nodes of each disease type information is then integrated according to the symptom description information of the disease appearing at different time nodes in each electronic medical record.

In this embodiment, the sequence in which the time nodes occur in an electronic medical record generally follows the time sequence in which the symptoms actually occurred, the time nodes divide the electronic medical record into descriptions of different time periods, and the symptoms extracted in each period represent the detailed symptoms of that time period. Therefore, for each electronic medical record, the symptom description information can be sorted according to the sequence of the time nodes and the correspondence between the time nodes and the symptom description information, so that the symptom description information of the disease appearing at different time nodes in that electronic medical record is integrated. Because the electronic medical record also records the disease type information, the disease-symptom-time correspondence of the electronic medical record can be obtained.

After the disease-symptom-time corresponding relation of each electronic medical record is determined, a corresponding relation model of symptom description information and time nodes of each disease type information can be integrated.

For example, the symptom-time correspondences are classified by disease type according to the disease-symptom-time correspondences: the symptom-time correspondences of the same disease type are placed in the same class, and those of different disease types are placed in different classes. For the same disease type, it is first judged whether the symptoms corresponding to the onset time in the different symptom-time correspondences are the same. If they are the same, the time nodes in the different symptom-time correspondences use the same time as the reference time, and the different symptom-time correspondences are integrated. If they are different, the time nodes in the different symptom-time correspondences are not relative times that take the same time as the reference time. As one processing means, the symptom-time correspondences whose symptoms at the onset time are the same can be extracted and integrated, and the symptom-time correspondences that are not extracted are not integrated. As another processing means, the different symptom-time correspondences are compared to estimate the actual onset time and the symptom at the onset time, the time nodes in the different symptom-time correspondences are normalized to relative times based on the estimated onset time, and the normalized correspondences are then integrated.
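The first processing means can be sketched as below. The sketch assumes each electronic medical record has already been reduced to a disease type plus a mapping from time node to symptoms, and it assumes that the group to integrate is the one whose onset symptoms are most common for that disease; that selection rule, the data layout, and the function name `integrate` are illustrative assumptions, and the sample records are invented.

```python
from collections import Counter, defaultdict


def integrate(records):
    """Merge per-record symptom timelines into one model per disease type.

    records: iterable of (disease_type, {time_node: [symptom, ...]}).
    Returns {disease_type: {time_node: {symptom: count}}}.
    """
    by_disease = defaultdict(list)
    for disease, timeline in records:
        by_disease[disease].append(timeline)

    model = {}
    for disease, timelines in by_disease.items():
        # Symptoms at the earliest node stand for the symptoms at onset.
        onset_sets = Counter(frozenset(t[min(t)]) for t in timelines)
        reference_onset, _ = onset_sets.most_common(1)[0]  # assumed rule
        merged = defaultdict(Counter)
        for t in timelines:
            if frozenset(t[min(t)]) != reference_onset:
                continue  # different reference time: not integrated
            for node, symptoms in t.items():
                merged[node].update(symptoms)
        model[disease] = {node: dict(c) for node, c in sorted(merged.items())}
    return model


# Invented records, for illustration only.
records = [
    ("cold", {0: ["sneezing"], 2: ["runny nose"]}),
    ("cold", {0: ["sneezing"], 2: ["runny nose", "chills"], 3: ["fever"]}),
    ("cold", {0: ["fever"], 1: ["cough"]}),
]
model = integrate(records)
```

The third record starts with a different onset symptom, so it is left out of the integrated result, while the first two records contribute counts for each symptom at each time node.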

As can be seen from the above embodiments, in the present embodiment, the symptom description information of the disease appearing at different time nodes in each electronic medical record is integrated, and then the corresponding relationship model between the symptom description information of each disease type information and the time nodes is integrated according to the symptom description information of the disease appearing at different time nodes in each electronic medical record, so that the integration efficiency can be improved, and the implementation is easy.

After the correspondence model is constructed, disease analysis can be performed using the constructed correspondence model. For example, after obtaining the target symptom information and the time node thereof, the corresponding relationship matching the target symptom information and the time node thereof is searched in a corresponding relationship model constructed in advance, and the disease analysis is performed according to the search result.

The target symptom information may be symptom information that needs to be queried and is input by a user, and the time node is a time node at which the target symptom appears. Because the pre-constructed corresponding relation model comprises the corresponding relation between the symptom description information of each disease type information and the time node, the corresponding relation matched with the target symptom information and the time node thereof can be searched in the corresponding relation model, and the disease analysis is carried out according to the searching result.

Since the correspondence model includes the correspondence between the symptom description information and the time node for each disease type information, a variety of disease analyses can be performed. In one example, the disease analysis may include disease diagnosis, and the type of disease corresponding to the target symptom may be determined according to the search result. In another example, the disease analysis may further include disease prediction, i.e. determining a disease type corresponding to the target symptom and a symptom that may appear after the time node according to the search result. Therefore, the information such as symptoms, tendency and the like of the disease can be obtained by utilizing the corresponding relation model, and the method can be applied to the fields of medical education, monitoring, clinical decision support and the like.
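A minimal sketch of such a lookup is given below. It assumes the correspondence model is held as a mapping from disease type to time node to symptoms, and it matches the target time node exactly; a practical system would likely need tolerant matching and ranking. The function name `analyze`, the model layout, and the second disease entry are hypothetical.

```python
def analyze(model, target_symptom, time_node):
    """Find disease types matching a symptom at a time node.

    Returns {disease_type: {later_node: symptoms}}: the keys are the
    candidate diagnoses, the values are symptoms that may appear later.
    """
    results = {}
    for disease, timeline in model.items():
        if target_symptom in timeline.get(time_node, ()):
            results[disease] = {
                node: symptoms
                for node, symptoms in sorted(timeline.items())
                if node > time_node
            }
    return results


# Toy model; the first entry mirrors the wind-cold example record above.
model = {
    "wind-cold common cold": {
        1: {"slight sneezing"},
        3: {"severe sneezing", "rhinorrhea"},
        4: {"chills", "thin white tongue coating"},
    },
    "hypothetical disease B": {1: {"headache"}, 2: {"fever"}},
}
result = analyze(model, "slight sneezing", 1)
```

Here the keys of `result` serve the disease diagnosis case, and the later time nodes returned for each key serve the disease prediction case.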

It is to be understood that the disease analysis may be other analysis, as long as the analysis is performed depending on the correspondence between the symptom description information of the disease type information and the time node, and is not listed here.

In an alternative implementation, the correlation between the time of seeking medical treatment after symptoms appear and the resulting sequelae can be analyzed by combining the sequelae data of diseases, so that the probability of suffering from sequelae can be estimated. In this example, a sequelae model is constructed in advance. As shown in fig. 3, fig. 3 is a flowchart of an embodiment of constructing a sequelae model in the medical information processing method of the present application, including the following steps 301 to 303:

in step 301, sequela information and symptom description information appearing at the time of first visit are obtained while disease type information is obtained from at least one electronic medical record.

When a disease has corresponding sequelae, the sequelae information is often recorded in the electronic medical record, so the sequelae information can be extracted from the electronic medical record. When the electronic medical record includes both disease type information and sequelae information, the disease type is the result of the initial diagnosis, and the sequelae are the disease symptoms that remain after the disease has substantially improved. In this embodiment, the time of the medical visit may be determined first, and the symptoms corresponding to that time are then determined as the symptom description information appearing at the time of the medical visit. A specific example follows:

For example, an electronic medical record reads: one and a half years ago, for no obvious reason, the patient developed slightly impaired movement of both lower limbs; more than 1 year ago, the patient had difficulty walking, with dizziness and facial distortion, and was admitted to our hospital and diagnosed with "cerebral infarction"; the condition improved slightly after circulation-improving treatment with ginkgo-dipyridamole injection, ozagrel injection, and the like. One month ago, the patient had slightly impaired movement of both lower limbs; today, the patient has impaired movement of both lower limbs and weakness, but no emaciation, no dizziness or headache, no pale complexion, no fever or chills, and no slurred speech, and is admitted with "sequelae of cerebral infarction" for further diagnosis and treatment.

In this electronic medical record, the disease type information is: cerebral infarction; the time of the first medical visit is: more than 1 year ago; the symptom description information appearing at the time of the visit is: impaired movement of both lower limbs, difficulty walking, dizziness, and a crooked mouth corner; the sequelae information of the cerebral infarction is: impaired movement of both lower limbs and weakness. In addition, symptoms that did not occur can also be recorded as a supplement, such as no emaciation, no dizziness or headache, no pale complexion, no fever or chills, and no slurred speech.

In step 302, a corresponding relationship between symptom description information appearing at the time of medical treatment and sequelae information for each disease type information is established based on disease type information, sequelae information, and symptom description information appearing at the time of first medical treatment obtained from a plurality of electronic medical records.

After the disease type information, sequelae information and symptom description information appearing when the patient first visits a doctor in each electronic medical record are obtained, the corresponding relation between the symptom description information appearing when the patient visits a doctor and the sequelae information of each disease type information can be established.

In step 303, the probability of sequelae suffered by the patient is calculated according to the occurrence frequency of the corresponding relationship, and an sequelae model of the corresponding relationship between symptom description information and sequelae probability occurring when the patient is hospitalized is constructed according to the calculation result.

In one example, the occurrence frequency of the correspondence may be generated based on the ratio of the number of occurrences of the correspondence among the disease type information, the symptom description information appearing at the time of medical treatment, and the sequelae information to the number of occurrences of the correspondence between the disease type information and the symptom description information appearing at the time of medical treatment.

The occurrence frequency of the correspondence is taken as a factor of the sequelae probability, so that the probability of the patient suffering from sequelae is calculated, and a sequelae model is constructed that records, for each disease type information, the correspondence between the symptom description information appearing at the time of medical treatment and the sequelae probability.
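Steps 301 to 303 can be illustrated with a small sketch. It assumes the sequelae probability is estimated directly as a relative frequency: the number of records with a given disease type, set of symptoms at the first visit, and sequela, divided by the number of records with that disease type and set of symptoms. This reading of the frequency, the function name `build_sequelae_model`, and the invented sample records are assumptions for illustration.

```python
from collections import Counter


def build_sequelae_model(records):
    """Estimate sequelae probabilities from extracted record triples.

    records: iterable of (disease_type, symptoms_at_first_visit, sequelae),
    where the last two items are iterables of strings.
    Returns {(disease_type, frozenset(symptoms), sequela): probability}.
    """
    pair_counts = Counter()    # (disease, symptoms) occurrences
    triple_counts = Counter()  # (disease, symptoms, sequela) occurrences
    for disease, symptoms, sequelae in records:
        key = (disease, frozenset(symptoms))
        pair_counts[key] += 1
        for sequela in set(sequelae):
            triple_counts[key + (sequela,)] += 1
    return {k: n / pair_counts[k[:2]] for k, n in triple_counts.items()}


# Invented records, for illustration only.
visit = ["difficulty walking", "dizziness"]
records = [
    ("cerebral infarction", visit, ["lower-limb weakness"]),
    ("cerebral infarction", visit, ["lower-limb weakness"]),
    ("cerebral infarction", visit, ["lower-limb weakness"]),
    ("cerebral infarction", visit, []),
]
sequelae_model = build_sequelae_model(records)
probability = sequelae_model[
    ("cerebral infarction", frozenset(visit), "lower-limb weakness")
]
```

With these four invented records, three of which report the sequela, the estimated probability is 0.75. Querying the model later amounts to looking up the same key built from the target disease type and the symptoms at the time of the visit.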

After the sequelae model is obtained, the sequelae model can be used to predict the probability of the patient suffering from sequelae. Specifically, after obtaining the target disease type information and the symptom description information appearing at the time of hospitalizing, the corresponding relation matched with the target disease type information and the symptom description information appearing at the time of hospitalizing is searched in a pre-constructed sequelae model, and the probability of the patient suffering from sequelae is determined according to the search result.

According to the embodiment, because the probability of the sequelae is different due to different hospitalization time after symptoms appear, in the method, after the sequelae model is built, the corresponding relation matched with the type information of the target disease and the symptom description information appearing during hospitalization is searched in the pre-built sequelae model, and the probability of the sequelae suffered by the patient is determined according to the searching result, so that the probability of the sequelae suffered by the patient is predicted.

The technical features in the above embodiments can be combined arbitrarily as long as the combination involves no conflict or contradiction. For reasons of space the combinations are not described one by one; nevertheless, any combination of the technical features in the above embodiments also falls within the scope disclosed in this specification.

One of the combinations is exemplified below. Fig. 4 is a flowchart of another embodiment of the medical information processing method according to the present application. The flowchart mainly describes the process of constructing the correspondence model.

41. Time extraction. Time information is extracted from the context of the symptom description information of at least one electronic medical record by using a pre-stored time expression, wherein the time expression is time description information which does not contain specific time. In the electronic medical record, the sequence of the time expressions generally follows the time sequence of the actual occurrence of symptoms. The time expression divides the medical record text into descriptions of different time periods, and the extracted symptoms in each period represent detailed symptoms of the current time period.

42. Symptom extraction.

421. Seed extraction. The seed database records known symptom description information (seeds), and the seeds are extracted from at least one electronic medical record by using a matching algorithm.

422. Pattern learning. Based on the extracted seeds, characters are extracted from the context of the seeds in the electronic medical record, and the position relation between the seeds and the characters is identified; a symptom description mode is determined according to the extracted characters and the occurrence frequency of the identified position relation.

423. Pattern matching. Characters in a symptom description mode are matched in at least one electronic medical record, wherein the symptom description mode comprises characters appearing in the context of symptom description information and the position relation between the symptom description information and the characters; symptom description information is obtained from the context of the matching information according to the position relation between the symptom description information and the characters.

In order to determine the corresponding relationship between the symptom description information and the time node where the symptom appears, the electronic medical record is firstly segmented into descriptions in different time periods by using a time expression, and when the patterns are matched, character matching can be performed from the segmented descriptions by using a symptom description pattern, so that the corresponding relationship between the obtained symptom description information and the time period is established, namely the time period is the symptom appearing time of the symptom description information.

424. Seed acquisition. The obtained symptom description information is taken as new seeds, and the steps of pattern learning, pattern matching, and seed acquisition are repeated until no new symptom description information is found.

43. Symptom & time normalization. The different original symptom description information of the same symptom is normalized into the same symptom description information. The extracted time information is normalized into relative time taking the onset time as the reference time, to obtain the time node at which the symptom appears.

44. Corresponding relation model. The results obtained from the multiple electronic medical records of each disease type are integrated to obtain a corresponding relation model of the symptom description information and the time nodes of each disease type information.

45. Sequelae model. The construction steps of the sequelae model comprise: acquiring sequelae information and symptom description information appearing at the first medical visit while acquiring disease type information from at least one electronic medical record; establishing, for each disease type information, a corresponding relation between the symptom description information appearing at the time of the medical visit and the sequelae information, based on the disease type information, sequelae information, and symptom description information appearing at the first medical visit obtained from a plurality of electronic medical records; and calculating the probability of the patient suffering from sequelae according to the occurrence frequency of the corresponding relation, and constructing, according to the calculation result, a sequelae model of the corresponding relation between the symptom description information appearing at the time of the medical visit and the sequelae probability.

46. Disease analysis. After the target symptom information and its time node are obtained, the corresponding relation matching the target symptom information and its time node is searched for in the corresponding relation model constructed in advance, and disease analysis is carried out according to the search result. After the target disease type information and the symptom description information appearing at the time of the medical visit are obtained, the corresponding relation matching the target disease type information and the symptom description information appearing at the time of the medical visit is searched for in the pre-constructed sequelae model, and the probability of the patient suffering from sequelae is determined according to the search result.
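The seed-driven loop of steps 421 to 424 above can be sketched in a heavily simplified form. The sketch assumes English text, treats the single word immediately before a seed as the learned context characters with a fixed "symptom follows the word" position relation, and accepts a context word once it has been seen at least twice. These simplifications, the function names, and the sample records are assumptions for illustration, not the symptom description modes of the present application.

```python
import re
from collections import Counter


def clauses(text):
    """Split a record into clauses at simple punctuation."""
    return [c.strip() for c in re.split(r"[,;.]", text) if c.strip()]


def learn_patterns(records, seeds, min_count=2):
    """Pattern learning: count the word that directly precedes each seed."""
    counts = Counter()
    for record in records:
        for clause in clauses(record):
            for seed in seeds:
                index = clause.find(seed)
                before = clause[:index].split() if index > 0 else []
                if before:
                    counts[before[-1]] += 1
    return {word for word, n in counts.items() if n >= min_count}


def match_patterns(records, patterns):
    """Pattern matching: take the text following a learned context word."""
    found = set()
    for record in records:
        for clause in clauses(record):
            words = clause.split()
            for i, word in enumerate(words[:-1]):
                if word in patterns:
                    found.add(" ".join(words[i + 1:]))
    return found


def bootstrap(records, seeds, min_count=2):
    """Repeat learning and matching until no new symptom text is found."""
    seeds = set(seeds)
    while True:
        patterns = learn_patterns(records, seeds, min_count)
        new = match_patterns(records, patterns) - seeds
        if not new:
            return seeds
        seeds |= new


# Invented records and seeds, for illustration only.
records = [
    "day 1 developed sneezing, day 3 developed runny nose",
    "day 1 developed headache, day 2 developed fever",
]
symptoms = bootstrap(records, {"sneezing", "headache"})
```

Starting from two seeds, the loop learns the context word "developed", uses it to find "runny nose" and "fever", adds them as seeds, and stops on the next pass because nothing new is found.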

In the embodiment of the application, the electronic medical records of all departments of different hospitals can be uploaded to a cloud server, for example, to Alibaba Cloud. The corresponding relation model and the sequelae model are established by using the above scheme and the data on the cloud server. If a disease analysis request sent by a client is received, target symptom information and its time node are obtained according to the disease analysis request, a corresponding relation matching the target symptom information and its time node is searched for in the corresponding relation model constructed in advance, disease analysis is carried out according to the search result, and the analysis result is returned to the client. If a sequelae probability query request sent by a client is received, target disease type information and symptom description information appearing at the time of the medical visit are obtained according to the sequelae probability query request, a corresponding relation matching the target disease type information and the symptom description information appearing at the time of the medical visit is searched for in the pre-constructed sequelae model, and the probability of the patient suffering from sequelae is determined according to the search result; the probability of the patient suffering from sequelae is then sent to the client.

In the present application, data from all of these sources are aggregated to obtain a relatively complete electronic medical record library. The larger the data volume in the electronic medical record library, the more accurate the disease analysis performed with the corresponding relation model and the sequelae model constructed from the electronic medical records in that library.

Corresponding to the embodiment of the medical information processing method, the application also provides embodiments of a medical information processing device, a readable medium and an electronic device.

The present application provides one or more machine-readable media having instructions stored thereon that, when executed by one or more processors, cause a terminal device to perform a medical information processing method as described above.

The embodiment of the medical information processing device can be applied to various electronic devices; for example, the electronic devices can comprise mobile phones, tablet computers, PCs, and the like. The embodiments of the apparatus may be implemented by software, by hardware, or by a combination of hardware and software. Taking a software implementation as an example, the apparatus, as a logical device, is formed by the processor of the electronic device in which it is located reading the corresponding computer program instructions from the nonvolatile memory into the memory and running them. In terms of hardware, fig. 5 shows a hardware structure diagram of the electronic device in which the medical information processing apparatus 531 is located. In addition to the processor 510, the memory 530, the network interface 540, and the nonvolatile memory 520 shown in fig. 5, the electronic device in which the apparatus is located may also include other hardware according to the actual functions of the device, which is not shown in fig. 5.

Referring to fig. 6, a block diagram of an embodiment of a medical information processing apparatus according to the present application is shown:

the device includes: a model building module 610 and an information analysis module 620.

The model building module 610 is used for obtaining disease type information, symptom description information and time nodes of occurrence of symptoms from at least one electronic medical record in advance, wherein the time nodes are used for describing the time length elapsed from the onset of disease; and integrating the obtained results of the multiple electronic medical records of each disease type to obtain a corresponding relation model of the symptom description information and the time node of each disease type information.

And the information analysis module 620 is configured to search a corresponding relationship matching the target symptom information and the time node thereof in a pre-constructed corresponding relationship model after obtaining the target symptom information and the time node thereof, and perform disease analysis according to a search result.

In an alternative implementation, the model building module 610 includes (not shown in fig. 6):

the information matching module is used for matching characters in a prestored symptom description mode in at least one electronic medical record, wherein the symptom description mode comprises characters which can appear in the context of symptom description information and the position relation between the symptom description information and the characters.

And the symptom obtaining module is used for obtaining the symptom description information from the context of the matching information according to the position relation between the symptom description information and the characters.

In an alternative implementation, the model building module 610 further includes a mode determination module (not shown in fig. 6) for:

In an alternative implementation, the symptom obtaining module is specifically configured to:

extracting symptom original description information from the context of the matched information according to the positional relation between the symptom description information and the characters;

normalizing the symptom original description information of the same symptom into the same symptom description information.

for part of the symptom original description information in the extracted information, dividing the symptom original description information that appears at the same time in the same disease into different clusters;

calculating the similarity between the current symptom original description information and the symptom original description information in each cluster, and determining, according to the similarity, whether to add the current symptom original description information to that cluster or to create a new cluster and add it there, wherein the current symptom original description information is symptom original description information in the extracted information that has not yet been added to a cluster;

after all the symptom original description information has been added to a corresponding cluster, judging whether to merge clusters according to the highest similarity value between the symptom original description information of different clusters, and executing the corresponding processing;

and after all the clusters have undergone the merging judgment and processing, unifying the symptom original description information of the same cluster into the same symptom description information.
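The clustering steps above can be sketched as follows. The similarity measure (the `difflib.SequenceMatcher` ratio), the two thresholds and the choice of the first cluster member as the unified description are all assumptions for illustration; the application does not fix any of them.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    # Assumed measure: character-level ratio in [0, 1].
    return SequenceMatcher(None, a, b).ratio()


def normalize(seeds, remaining, join_threshold=0.7, merge_threshold=0.7):
    """Map every original description to one unified description."""
    # Step 1: descriptions seen at the same time in the same disease are
    # assumed to be distinct symptoms, so each seeds its own cluster.
    clusters = [[s] for s in seeds]

    # Step 2: add each remaining description to its most similar cluster,
    # or open a new cluster when nothing is similar enough.
    for desc in remaining:
        scored = [
            (max(similarity(desc, member) for member in cluster), i)
            for i, cluster in enumerate(clusters)
        ]
        score, idx = max(scored) if scored else (0.0, -1)
        if score >= join_threshold:
            clusters[idx].append(desc)
        else:
            clusters.append([desc])

    # Step 3: merge cluster pairs whose closest members are similar enough,
    # repeating until no pair qualifies.
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                top = max(similarity(a, b) for a in clusters[i] for b in clusters[j])
                if top >= merge_threshold:
                    clusters[i].extend(clusters.pop(j))
                    merged = True
                    break
            if merged:
                break

    # Step 4: unify each cluster to its first member (an arbitrary choice).
    return {desc: cluster[0] for cluster in clusters for desc in cluster}


mapping = normalize(["headache", "fever"], ["headaches", "feverish", "rash"])
print(mapping)
```

Seeding the clusters from co-occurring descriptions keeps symptoms that are known to be different from being merged merely because their wording is close.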

an information extraction module, configured to extract time information from the context of the symptom description information of at least one electronic medical record by using a pre-stored time expression, wherein the time expression is time description information which does not contain a specific time; and

a time normalization module, configured to normalize the extracted time information into a relative time that takes the disease onset time as the reference time, so as to obtain the time node at which the symptom appears.
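A minimal sketch of the two modules follows. It assumes that the time expressions are templates such as "N days ago", that they are stated relative to the date of the visit, and that the onset date of the disease is already known; the `TIME_EXPRESSIONS` table and the function name `time_node` are hypothetical.

```python
import re
from datetime import date, timedelta

# Hypothetical time expressions: templates in which the concrete value is
# left open, each paired with a function turning a match into an offset.
TIME_EXPRESSIONS = [
    (re.compile(r"(\d+) days? ago"), lambda m: timedelta(days=int(m.group(1)))),
    (re.compile(r"(\d+) weeks? ago"), lambda m: timedelta(weeks=int(m.group(1)))),
    (re.compile(r"yesterday"), lambda m: timedelta(days=1)),
]


def time_node(context, visit_date, onset_date):
    """Return the days elapsed from disease onset to the moment described
    in the context, or None when no time expression matches."""
    for pattern, to_offset in TIME_EXPRESSIONS:
        match = pattern.search(context)
        if match:
            # Assumption: the expression counts back from the visit date.
            absolute = visit_date - to_offset(match)
            return (absolute - onset_date).days
    return None


visit = date(2017, 2, 21)
onset = date(2017, 2, 14)
print(time_node("cough started 3 days ago", visit, onset))
```

Expressing every symptom as an offset from onset is what makes records written on different calendar dates comparable.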

In an optional implementation manner, the time normalization module is specifically configured to:

determining the onset time of the disease in the electronic medical record;

In an alternative implementation, the model building module 610 includes an information integration module (not shown in fig. 6) for:

integrating symptom description information of diseases appearing at different time nodes in each electronic medical record according to the acquired disease type information, symptom description information and time nodes of appearing symptoms;

and integrating a corresponding relation model of the symptom description information and the time nodes of each disease type information according to the symptom description information of the diseases in each electronic medical record at different time nodes.
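The two integration steps can be sketched as below: first each record's symptoms are grouped by time node, then the per-record timelines are merged per disease type. The input tuple layout and the use of occurrence counts in the merged model are assumptions for illustration.

```python
from collections import Counter, defaultdict


def record_timeline(observations):
    """Step 1: group one record's (symptom, time node) pairs by time node."""
    timeline = defaultdict(set)
    for symptom, node in observations:
        timeline[node].add(symptom)
    return dict(timeline)


def integrate(records):
    """Step 2: merge per-record timelines into one model per disease type,
    counting in how many records each symptom appears at each time node."""
    model = defaultdict(lambda: defaultdict(Counter))
    for disease, observations in records:
        for node, symptoms in record_timeline(observations).items():
            model[disease][node].update(symptoms)
    return {d: {n: dict(c) for n, c in nodes.items()} for d, nodes in model.items()}


# Hypothetical records: (disease type, [(symptom description, time node), ...])
records = [
    ("influenza", [("fever", 1), ("cough", 3)]),
    ("influenza", [("fever", 1), ("chills", 1)]),
    ("measles", [("rash", 4)]),
]
model = integrate(records)
print(model["influenza"])
```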

In an optional implementation manner, the apparatus further includes a probability analysis module configured to:

after obtaining the target disease type information and symptom description information appearing when seeking medical advice, searching a corresponding relation matched with the target disease type information and the symptom description information appearing when seeking medical advice in a pre-constructed sequelae model, and determining the probability of suffering sequelae of a patient according to the searching result;

the model building module is further configured to:

acquiring, while acquiring disease type information from at least one electronic medical record, sequelae information and the symptom description information appearing when the patient first seeks medical advice;

establishing, for each disease type, a correspondence between the symptom description information appearing when the patient seeks medical advice and the sequelae information, based on the disease type information, the sequelae information and the symptom description information appearing at the first visit that are obtained from a plurality of electronic medical records;

and calculating the probability that a patient suffers sequelae according to the occurrence frequency of the correspondence, and constructing, according to the calculation result, a sequelae model of the correspondence between the symptom description information appearing when the patient seeks medical advice and the sequelae probability.
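The frequency-based construction of the sequelae model and its lookup can be sketched as follows. The record layout, the use of the exact symptom set as the key and the sample values are assumptions for illustration; the probability is taken as the share of matching records in which a given sequela was recorded.

```python
from collections import Counter, defaultdict


def build_sequelae_model(records):
    """records: (disease type, first-visit symptoms, sequela or None).
    Returns {(disease, symptom set): {sequela: probability}}."""
    totals = Counter()
    with_sequela = defaultdict(Counter)
    for disease, symptoms, sequela in records:
        key = (disease, frozenset(symptoms))
        totals[key] += 1
        if sequela is not None:
            with_sequela[key][sequela] += 1
    return {
        key: {s: n / totals[key] for s, n in with_sequela[key].items()}
        for key in totals
    }


def sequela_probability(model, disease, symptoms):
    """Look up the sequela probabilities for a target disease and symptoms."""
    return model.get((disease, frozenset(symptoms)), {})


# Hypothetical sample data.
records = [
    ("stroke", ["limb weakness"], "hemiplegia"),
    ("stroke", ["limb weakness"], None),
    ("stroke", ["limb weakness"], "hemiplegia"),
    ("stroke", ["limb weakness"], "aphasia"),
    ("stroke", ["dizziness"], None),
]
sequelae_model = build_sequelae_model(records)
print(sequela_probability(sequelae_model, "stroke", ["limb weakness"]))
```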

Based on this, the present application also provides an electronic device, comprising:

a processor; a memory for storing the processor-executable instructions;

wherein the processor is configured to:

The implementation process of the functions and actions of each module in the above device is specifically described in the implementation process of the corresponding step in the above method, and is not described herein again.

Since the apparatus embodiments substantially correspond to the method embodiments, reference may be made to the corresponding parts of the description of the method embodiments for relevant points. The apparatus embodiments described above are merely illustrative: the modules described as separate parts may or may not be physically separate, and the parts displayed as modules may or may not be physical modules; they may be located in one place or distributed over a plurality of network modules. Some or all of the modules can be selected according to actual needs to achieve the purpose of the scheme of the application. One of ordinary skill in the art can understand and implement it without inventive effort.

Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.

It will be understood that the present application is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims

1. A medical information processing method, characterized in that the method comprises:

calculating the probability of sequelae of the patient according to the occurrence frequency of the corresponding relation, and constructing a sequelae model of the corresponding relation between symptom description information and sequelae probability appearing when the patient is hospitalized according to the calculation result;

after the target disease type information and the symptom description information appearing when the patient is hospitalized are obtained, the corresponding relation matched with the target disease type information and the symptom description information appearing when the patient is hospitalized is searched in a pre-constructed sequelae model, and the probability of the patient suffering from sequelae is determined according to the searching result.

2. The method of claim 1, wherein obtaining symptom description information from at least one electronic medical record in advance comprises:

matching characters in a prestored symptom description mode in at least one electronic medical record, wherein the symptom description mode comprises characters appearing in a symptom description information context and a position relation between the symptom description information and the characters;

and obtaining symptom description information from the context of the matching information according to the position relation between the symptom description information and the characters.

3. The method of claim 2, wherein the step of determining the symptom description mode comprises:

4. The method according to claim 2, wherein obtaining symptom description information from the context of matching information according to the position relationship between the symptom description information and the character comprises:

5. The method of claim 4, wherein normalizing original description information of symptoms of the same symptom to the same symptom description information comprises:

6. The method of claim 1, wherein obtaining the time node of occurrence of the symptom from at least one electronic medical record in advance comprises:

extracting time information from the context of symptom description information of at least one electronic medical record by using a pre-stored time expression, wherein the time expression is time description information which does not contain specific time;

7. The method of claim 6, wherein normalizing the extracted time information into a relative time with a disease onset time as a reference time to obtain a time node at which a symptom occurs comprises:

determining the disease occurrence time of the diseases in the electronic medical record;

8. The method according to any one of claims 1 to 7, wherein the integrating the results obtained from the plurality of electronic medical records of each disease type in advance to obtain the model of the correspondence between the symptom description information and the time node of each disease type information comprises:

9. A medical information processing apparatus characterized by comprising:

the information analysis module is used for searching a corresponding relation matched with the target symptom information and the time node thereof in a pre-constructed corresponding relation model after obtaining the target symptom information and the time node thereof, and analyzing diseases according to a searching result;

the apparatus further comprises a probability analysis module to:

the model building module is further configured to:

10. The apparatus of claim 9, wherein the model building module comprises:

the information matching module is used for matching characters in a prestored symptom description mode in at least one electronic medical record, wherein the symptom description mode comprises characters which can appear in the context of symptom description information and the position relation between the symptom description information and the characters;

11. The apparatus of claim 10, wherein the model building module further comprises a mode determination module configured to:

12. The apparatus of claim 10, wherein the symptom acquisition module is specifically configured to:

13. The apparatus of claim 12, wherein the symptom acquisition module is specifically configured to:

14. The apparatus of claim 9, wherein the model building module comprises:

the information extraction module is used for extracting time information from the context of the symptom description information of at least one electronic medical record by utilizing a pre-stored time expression, wherein the time expression is time description information which does not contain specific time;

15. The apparatus of claim 14, wherein the time normalization module is specifically configured to:

16. The apparatus of any one of claims 9 to 15, wherein the model building module comprises an information integration module configured to:

17. An electronic device, comprising:

a processor; a memory for storing the processor-executable instructions;

wherein the processor is configured to:

integrating the obtained results of the multiple electronic medical records of each disease type to obtain a corresponding relation model of the symptom description information and the time node of each disease type information;