CN115424618A - Electronic medical record voice interaction equipment based on machine learning - Google Patents

Electronic medical record voice interaction equipment based on machine learning

Info

Publication number
CN115424618A
CN115424618A (Application CN202210930450.5A)
Authority
CN
China
Prior art keywords
voice
medical record
electronic medical
model
machine learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210930450.5A
Other languages
Chinese (zh)
Inventor
李迦南
罗震宇
王莉娟
裘向军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Taiyi Ruijing Computer Technology Co ltd
Original Assignee
Shanghai Taiyi Ruijing Computer Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Taiyi Ruijing Computer Technology Co ltd filed Critical Shanghai Taiyi Ruijing Computer Technology Co ltd
Priority to CN202210930450.5A priority Critical patent/CN115424618A/en
Publication of CN115424618A publication Critical patent/CN115424618A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/26: Speech to text systems
    • G10L15/08: Speech classification or search
    • G10L15/18: Speech classification or search using natural language modelling
    • G10L17/00: Speaker identification or verification techniques
    • G10L17/22: Interactive procedures; Man-machine interfaces
    • G10L25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03: characterised by the type of extracted parameters
    • G10L25/24: the extracted parameters being the cepstrum
    • G10L25/27: characterised by the analysis technique
    • G10L25/30: using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Signal Processing (AREA)
  • Evolutionary Computation (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention discloses an electronic medical record voice interaction device based on machine learning, comprising a voice input device, a client, and a server. An electronic medical record system is deployed on the client; a voice word segmentation system, an input sampling system, and a machine learning system are deployed on the server. The voice input device collects voice information; the voice word segmentation system converts the voice information into text, processes the text to generate an electronic medical record, and sends it to the electronic medical record system; the electronic medical record system displays the record on the client for the doctor to correct; the input sampling system collects the corrected records and, after processing by the voice word segmentation system, supplies them to the machine learning system for model training; the machine learning system in turn provides the acoustic model, language model, and writing model used by the voice word segmentation system. Based on automatic machine learning, the invention generates electronic medical records accurately and efficiently, improving both the doctor's consultation efficiency and the accuracy of the records.

Description

Electronic medical record voice interaction equipment based on machine learning
Technical Field
The invention relates to the field of electronic medical records, in particular to electronic medical record voice interaction equipment based on machine learning.
Background
During a consultation, doctors prefer face-to-face communication with patients, as it supports better clinical service. However, the growth of electronic medical records and hospital regulatory requirements force doctors to enter increasingly complicated records, so the share of consultation time spent facing a computer keeps rising. A less intrusive and more convenient way of recording medical records is urgently needed, so that more of the consultation can return to face-to-face communication.
As speech recognition and machine learning technologies mature, it has become feasible to combine high-accuracy speech recognition with context-aware artificial intelligence to write the medical record automatically while the doctor talks with the patient.
In the prior art, CN107564571A uses a microphone device to dictate medical records into a structured template, and CN113724695A infers identities and generates records automatically from the voice conversation between doctor and patient alone. Both use speech recognition to help write the record while doctor and patient talk, but they have the following defects:
1) Speech recognition only assists transcription, not the consultation itself: the doctor must still read out, word by word, exactly what is to be recorded;
2) Judging identity from the doctor-patient conversation alone has a high error rate and cannot reliably handle a third person breaking into the conversation, which is common during consultations; this remains a key obstacle to automatically entering medical records by voice during the consultation;
3) Doctors differ in accent, word choice, and abbreviation habits, yet the learning algorithm that improves recognition accuracy is trained on the corrections of all doctors, so for any individual doctor voice entry still has a high error rate;
4) Although supervised learning improves the algorithm faster, it requires an additional process in which someone marks results as right or wrong, which is costly.
Disclosure of Invention
The invention aims to provide an electronic medical record voice interaction device that, based on automatic machine learning, generates electronic medical records accurately and efficiently.
The technical solution adopted by the invention is an electronic medical record voice interaction device based on machine learning, comprising a voice input device, a client, and a server. An electronic medical record system is deployed on the client; a voice word segmentation system, an input sampling system, and a machine learning system are deployed on the server. The voice input device collects voice information; the voice word segmentation system converts the voice information into text, processes the text to generate an electronic medical record, and sends it to the electronic medical record system; the electronic medical record system displays the record on the client for correction; the input sampling system collects the corrected records, sends them to the voice word segmentation system for processing, and then supplies them to the machine learning system for model training; the machine learning system provides the acoustic model, language model, and writing model used by the voice word segmentation system. The server also hosts a diagnosis and treatment voice database, a diagnosis and treatment text database, a dictionary, a medical term knowledge base, a dialect language base, and a standard medical record template library.
Further, the voice word segmentation system comprises a voice recognition module and an NLP module. The voice recognition module obtains voice information, covering both the consultation dialogue and dictated medical records, through the voice input device, and obtains the doctor's and patient's identity information through an identity information acquisition device connected to the client. Using this identity information, it converts consultation dialogue speech into doctor-patient dialogue text carrying identity and time information, and converts dictated medical record speech into dictated medical record text carrying identity and time information. The NLP module processes the doctor-patient dialogue text and/or the dictated medical record text to generate the electronic medical record and sends it to the electronic medical record system.
Further, the voice recognition module converts the voice information into vectors and feeds them, in sequence, to the acoustic model for acoustic recognition and the language model for language recognition, producing a preliminary dialogue text. Meanwhile, the client monitors whether the identity information request changes via the connected identity information acquisition device, and the electronic medical record system monitors whether the patient record is switched; combined with the preliminary dialogue text, the module decides whether the speaker has changed, then recognizes and removes any interjected content irrelevant to the consultation, yielding the doctor-patient dialogue text with identity and time information.
Further, the diagnosis and treatment voice database stores historical consultation speech; the acoustic model is obtained by the machine learning system fetching this speech, extracting voice features, and training the model. The diagnosis and treatment text database stores historical doctor-patient dialogue texts and dictated medical record texts, from which the machine learning system trains the language model. The dictionary links the acoustic model and the language model and stores the correspondence between speech and text.
Further, the NLP module segments the doctor-patient dialogue text to identify its keywords, converts the keywords into the doctor's customary writing style through the writing model, fills in the objective-fact sections of the electronic medical record according to its writing requirements, and sends the result to the electronic medical record system; the objective-fact sections comprise the chief complaint, the history of present illness, and the past medical history.
Further, the NLP module preprocesses and segments the doctor-patient dialogue text against the diagnosis and treatment text database and the medical term knowledge base to obtain keywords; performs medical named entity recognition on the keywords using rules written in a rule description language; decides, with an entropy-based term extraction algorithm, whether each identified keyword is a single or a compound term; and finally filters out common words against a general lexicon, extracts the electronic medical record tags, constructs tag vectors, and fills each section of the record from those tags through the writing model.
Further, the machine learning system comprises an acoustic model learning module, a language model learning module, and a writing model learning module, and hosts the medical term knowledge base, the dialect language base, and the standard medical record template library. The acoustic model learning module uses the voice information collected by the voice input device to generate a personalized acoustic model associated with identity information; the language model learning module adjusts the language model by comparing the generated doctor-patient dialogue text with the corrected electronic medical record; the writing model learning module adjusts the writing model by comparing the generated electronic medical record with the corrected one.
Further, the acoustic model learning module obtains voice information, distinguishes the persons it contains by their voice features, and builds a preliminary acoustic model; at the same time it obtains the identity information of those persons; it then performs semantic reasoning over the text produced by the voice word segmentation system to bind the voice to each person's identity; finally it learns from the historical speech stored in the diagnosis and treatment voice database, updates each person's personalized voice features, and generates a personalized acoustic model.
Further, the language model learning module builds a language model from the historical consultation speech and the historical doctor-patient dialogue texts, combined with the medical term knowledge base and the dialect language base and keyed by the time and identity information attached to the speech, and updates the personalized features of the model by comparing the dictated medical record text with the corrected record text.
Further, the writing model learning module builds a personalized writing model from the records in the electronic medical record system, the standard medical record template library, and the medical term knowledge base, and updates and optimizes the model from the record-modification history captured by the input sampling system.
Compared with the prior art, the invention has the following beneficial effects. In the electronic medical record voice interaction device based on machine learning, the voice word segmentation system combines the doctor-patient conversation content, doctor-patient identity information, and personalized voice features, so it identifies the speaking person more accurately, distinguishes a third party, eliminates interference, and enters the speech into the medical record automatically. The machine learning system builds acoustic and language models personalized to each doctor, recognizing the doctor's speech more accurately and avoiding entry errors caused by differences in accent, word choice, and abbreviation habits. The machine learning system learns automatically without requiring extra judgments from the doctor, so the acoustic model for speech recognition, the language model for language organization, and the writing model for record generation all grow more accurate with use. Consultation efficiency and the accuracy of the electronic medical record are both improved.
Drawings
FIG. 1 is a schematic structural diagram of a machine learning-based electronic medical record voice interaction device according to an embodiment of the present invention;
FIG. 2 is a flowchart of the operation of a speech recognition module according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a speech recognition module according to an embodiment of the present invention;
FIG. 4 is a flowchart of the operation of the NLP module according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of an NLP module according to an embodiment of the present invention;
FIG. 6 is a flowchart of the acoustic model learning module operation according to an embodiment of the present invention;
FIG. 7 is a diagram of an acoustic model learning module learning process according to an embodiment of the present invention;
FIG. 8 is a flowchart of the language model learning module according to an embodiment of the present invention;
FIG. 9 is a diagram of a language model learning module learning process according to an embodiment of the present invention;
FIG. 10 is a flowchart of the writing model learning module according to an embodiment of the present invention.
In the figure:
1. a client; 2. a server; 3. a voice input device; 4. an electronic medical record system; 5. a voice word segmentation system; 6. an input sampling system; 7. a machine learning system.
Detailed Description
The invention is further described below with reference to the figures and examples.
Fig. 1 is a schematic structural diagram of a machine learning-based electronic medical record voice interaction device according to an embodiment of the present invention.
Referring to fig. 1, the electronic medical record voice interaction device based on machine learning according to the embodiment of the present invention includes a voice input device 3, a client 1, and a server 2. An electronic medical record system 4 is deployed on the client 1; a voice word segmentation system 5, an input sampling system 6, and a machine learning system 7 are deployed on the server 2. The voice input device 3 collects voice information; the voice word segmentation system 5 converts the voice information into text, processes it to generate an electronic medical record, and sends it to the electronic medical record system 4; the electronic medical record system 4 displays the record on the client 1 for correction; the input sampling system 6 collects the corrected records, sends them to the voice word segmentation system 5 for processing, and then supplies them to the machine learning system 7 for model training; the machine learning system 7 provides the acoustic model, language model, and writing model used by the voice word segmentation system 5.
The server 2 hosts a diagnosis and treatment voice database, a diagnosis and treatment text database, a dictionary, a medical term knowledge base, a dialect language base, and a standard medical record template library.
The functions and working processes of the parts are as follows:
1. voice input device
The voice input device 3 connects over a cable (for example, USB) to the client 1 and/or server 2 hardware (for example, a computer or mobile device) to exchange data with the systems running on them. A microphone in the voice input device 3 picks up the conversation between doctor and patient, and the captured audio is processed by the voice word segmentation system 5.
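As a concrete illustration, the sketch below shows one way the client side might capture and frame audio for the server; the sounddevice library, the 16 kHz mono PCM format, and the frame size are assumptions for illustration, since the patent names no capture API.

```python
# Minimal sketch (assumptions: sounddevice, 16 kHz mono int16 PCM): read
# audio from the USB microphone and queue half-second frames for upload
# to the server-side voice word segmentation system.
import queue

import sounddevice as sd  # third-party audio capture library (assumed)

SAMPLE_RATE = 16000                      # Hz, a common rate for speech
FRAME_SAMPLES = SAMPLE_RATE // 2         # half-second frames

frames: "queue.Queue[bytes]" = queue.Queue()

def on_audio(indata, frame_count, time_info, status):
    """Driver callback: push each captured block onto the upload queue."""
    if status:
        print(status)                    # report over/underruns, if any
    frames.put(bytes(indata))            # raw little-endian int16 PCM

stream = sd.InputStream(samplerate=SAMPLE_RATE, channels=1, dtype="int16",
                        blocksize=FRAME_SAMPLES, callback=on_audio)
stream.start()                           # frames accumulate until stream.stop()
```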
2. Electronic medical record system
The electronic medical record system 4 is deployed on the client 1 and operated by the doctor through it. The doctor can review records already entered for a patient and check whether the electronic medical record generated automatically by the voice word segmentation system 5 is accurate; the system displays the automatically generated record side by side with the corresponding doctor-patient dialogue text and lets the doctor correct the generated record in place.
The corrected electronic medical record is transmitted through an interface to the input sampling system 6 on the server 2 and, after processing by the voice word segmentation system 5, becomes the learning data required by the machine learning system 7.
During the consultation, the electronic medical record system 4 monitors in real time whether the patient record is switched or the identity information request changes, giving the voice word segmentation system 5 a log basis for judging whether another person has entered the dialogue.
3. Voice word segmentation system
The voice word segmentation system 5 comprises a voice recognition module and an NLP (Natural Language Processing) module.
The voice word segmentation system 5 obtains the doctor-patient dialogue speech from the voice input device 3 and, through the voice recognition module, converts it (after vector conversion, acoustic model recognition, and language model recognition) into doctor-patient dialogue text; both the acoustic model and the language model are personalized by the machine learning system 7 to the voice characteristics of the doctor and the patient. After the NLP module segments the dialogue text, it obtains the doctor's and patient's identity information through an interface, identifies the keywords in the conversation, and fills in the electronic medical record from them; when the consultation ends, the doctor can review the automatically entered record in the electronic medical record system 4.
Speech recognition module
The voice recognition module converts the captured voice information into doctor-patient dialogue text carrying identity and time information; the flow is shown in fig. 2. The module obtains the doctor's and patient's speech from the voice input device 3 and, together with an identity information acquisition device (such as a card reader) connected to the client 1 interface, obtains their identities in the clinic. After vector conversion, the speech is fed into the acoustic model and language model personalized by the machine learning system 7 to form a preliminary dialogue text. Combining the voice features and the conversational context with the electronic medical record system 4's real-time monitoring of patient record switches and identity request changes, the module analyzes whether the speaker has changed (for example, whether the previous patient's consultation has ended, or whether a patient or staff member has suddenly broken in) and removes interjected content irrelevant to the consultation. Finally it generates the doctor-patient dialogue text with time and identity information.
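The fusion of speaker labels and record-switch events described above can be sketched as follows; the data shapes, function names, and the rule of cutting at the first record switch are illustrative assumptions, not details from the patent.

```python
# Hypothetical sketch: keep only utterances attributable to the enrolled
# doctor and the currently registered patient; drop interjections from
# third parties and anything spoken after the record was switched.
from dataclasses import dataclass

@dataclass
class Utterance:
    speaker_id: str    # label assigned by the personalized acoustic model
    start: float       # seconds from session start
    text: str

def filter_dialogue(utterances, doctor_id, patient_id, record_switch_times):
    """utterances are assumed time-ordered; record_switch_times are the
    moments the EMR system reported a patient-record switch."""
    session_end = min(record_switch_times, default=float("inf"))
    kept = []
    for u in utterances:
        if u.start >= session_end:
            break                                  # next patient's dialogue
        if u.speaker_id in (doctor_id, patient_id):
            kept.append(u)                         # genuine doctor-patient turn
    return kept
```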
The working principle of the voice recognition module is shown in fig. 3. The diagnosis and treatment voice database stores historical consultation speech; the machine learning system 7 fetches this speech, extracts voice features, and trains the acoustic model. The diagnosis and treatment text database stores historical doctor-patient dialogue texts and dictated medical record texts, from which the machine learning system 7 trains the language model. The dictionary links the acoustic model and the language model and stores the correspondence between speech and text.
NLP module
The NLP module processes the dialogue text output by the voice recognition module and fills it into the electronic medical record automatically; the flow is shown in fig. 4. The module takes the doctor-patient dialogue text, segments it using the medical term lexicon in the machine learning system 7, recognizes the keywords, and obtains the raw key dialogue material. The writing model, personalized and optimized by the machine learning system 7, converts this material into text in the doctor's customary writing style, while some keywords are converted into standard dictionary entries as required by the electronic medical record. The generated text and entries are filled automatically into the fields of the record, covering the objective-fact sections that need no doctor's conclusion, such as the chief complaint, history of present illness, and past medical history. When the consultation ends, the doctor reviews the auto-filled record in the electronic medical record system 4 and corrects and adjusts it.
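One way to picture the auto-fill step is the minimal sketch below, which routes tagged keyword phrases into the objective-fact sections before the writing model rephrases them; the section names and the (section, phrase) pairing are assumptions for illustration.

```python
# Hypothetical sketch: route extracted keyword phrases into the objective
# sections of the electronic medical record to form a draft for the
# writing model to rephrase.
EMR_SECTIONS = ("chief_complaint", "present_illness", "past_history")

def fill_record(tagged_keywords):
    """tagged_keywords: iterable of (section, phrase) pairs produced by
    the NLP module; returns one draft text block per section."""
    record = {section: [] for section in EMR_SECTIONS}
    for section, phrase in tagged_keywords:
        record.setdefault(section, []).append(phrase)
    return {section: "; ".join(phrases) for section, phrases in record.items()}

draft = fill_record([("chief_complaint", "cough for three days"),
                     ("present_illness", "low-grade fever at night")])
```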
In addition, the corrected electronic medical record that the input sampling system 6 supplies to the machine learning system 7 as learning material is likewise segmented by the NLP module first.
The working principle of the NLP module is shown in fig. 5. The doctor-patient dialogue text is first preprocessed and segmented against the diagnosis and treatment text database and the medical term lexicon; medical named entity recognition is then applied using rules written in a rule description language; an entropy-based term extraction algorithm decides whether each identified keyword is a single or a compound term; finally, common words are filtered out against a general lexicon and the tags are extracted.
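The entropy criterion for single versus compound terms can be sketched with branching entropy: a candidate whose right-hand neighbours in the corpus vary widely is likely a complete term, while low entropy suggests it is a fragment of a longer compound. The function below is a minimal illustration of that idea; the threshold and the character-level treatment are assumptions, not values from the patent.

```python
# Hypothetical sketch of the entropy criterion for term extraction.
import math
from collections import Counter

def right_branching_entropy(candidate: str, corpus: str) -> float:
    """Entropy of the character distribution immediately following
    `candidate` across all of its occurrences in `corpus`."""
    followers = Counter()
    start = corpus.find(candidate)
    while start != -1:
        end = start + len(candidate)
        if end < len(corpus):
            followers[corpus[end]] += 1
        start = corpus.find(candidate, start + 1)
    total = sum(followers.values())
    if total == 0:
        return 0.0
    return -sum(c / total * math.log2(c / total) for c in followers.values())

def is_complete_term(candidate: str, corpus: str, threshold: float = 1.5) -> bool:
    # Below the threshold, the candidate is treated as part of a compound term.
    return right_branching_entropy(candidate, corpus) >= threshold
```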
In the tag extraction step, the keywords obtained from entity recognition are combined with the medical term lexicon to extract tags and construct tag vectors, which are quantized; relevance is then computed between tags and descriptions and among the tags themselves to build the patient case. Finally, the writing model fills the electronic medical record automatically.
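The relevance computation between tags and descriptions could, for example, use cosine similarity over bag-of-terms vectors; the sketch below assumes that simple representation, which the patent does not specify.

```python
# Hypothetical sketch: score tag-to-description relevance with cosine
# similarity over bag-of-terms vectors.
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

tag_vec = Counter({"cough": 2, "fever": 1})
description_vec = Counter({"cough": 1, "three": 1, "days": 1})
relevance = cosine(tag_vec, description_vec)
```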
4. Input sampling system
The input sampling system 6 samples the content entered by the doctor, including patient records typed by the doctor and corrections made after automatic generation (including text copied and pasted while comparing against the original in the electronic medical record system 4), and records the corresponding creation and modification times, which serve as key markers for the machine learning system 7 when it learns the associations between text and speech. The sampled text is segmented by the NLP module of the voice word segmentation system 5 and then supplied to the machine learning system 7.
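A minimal sketch of such a correction sample follows; the class layout is an assumption, and difflib stands in for whatever alignment the system actually uses to expose changed spans to the learning modules.

```python
# Hypothetical sketch: store each correction as a (generated, corrected)
# pair with a timestamp, and expose the changed spans as supervision
# signals for the learning modules.
import difflib
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class CorrectionSample:
    generated: str
    corrected: str
    created_at: datetime = field(default_factory=datetime.now)

    def changed_spans(self):
        """Yield (original_fragment, replacement_fragment) pairs."""
        matcher = difflib.SequenceMatcher(None, self.generated, self.corrected)
        for op, i1, i2, j1, j2 in matcher.get_opcodes():
            if op != "equal":
                yield self.generated[i1:i2], self.corrected[j1:j2]

sample = CorrectionSample("patient reports coughing fits",
                          "patient reports paroxysmal cough")
pairs = list(sample.changed_spans())
```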
5. Machine learning system
The machine learning system 7 supplies the voice word segmentation system 5 with models personalized for each doctor and each previously seen patient. Training draws on the voice information, the doctor-patient dialogue texts produced by speech conversion, and the data entered into the electronic medical records. The personalized models are the acoustic model, the language model, and the writing model; all of them follow the principle of silent automatic learning, minimizing the cost of manual intervention. The machine learning system 7 comprises an acoustic model learning module, a language model learning module, and a writing model learning module.
Acoustic model learning module
The acoustic model learning module uses the voice information collected by the voice input device 3 to generate personalized acoustic models for doctors and previously seen patients, which the voice recognition module uses during acoustic recognition. The workflow is shown in fig. 6. The module first distinguishes which different persons appear in the voice information and builds a preliminary acoustic model. It simultaneously gathers identity information from the consultation scene: the doctor's identity from the account logged into at the current station, and the patient's identity from behaviors such as card-swipe records on the card reader and electronic medical record switches. It then performs semantic reasoning over the doctor-patient dialogue text output by the voice recognition module, automatically determines each person's specific identity, and builds a voice history. Finally it learns from the historical consultation speech, adds or updates the speaker's personalized voice features, and generates the personalized acoustic model. Under this automatic learning principle, the more speech collected for a given person, the more accurate that person's acoustic model becomes.
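Binding a detected voice to an enrolled identity might, for instance, be a nearest-neighbour match between a voice embedding and stored voiceprints; the sketch below assumes such an embedding space and a similarity threshold, neither of which the patent specifies.

```python
# Hypothetical sketch: bind a diarized speaker to an enrolled identity by
# nearest-neighbour search over stored voiceprint embeddings.
import numpy as np

def identify_speaker(embedding, profiles, threshold=0.75):
    """profiles: dict mapping identity -> stored voiceprint vector.
    Returns the best-matching identity, or None for an unknown speaker."""
    best_id, best_score = None, threshold
    v = embedding / np.linalg.norm(embedding)
    for identity, ref in profiles.items():
        score = float(np.dot(v, ref / np.linalg.norm(ref)))
        if score > best_score:
            best_id, best_score = identity, score
    return best_id
```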
The learning process of the acoustic model learning module is shown in fig. 7: first, features are extracted from the voice information and the audio is stored; voice feature parameters are then extracted as MFCCs (Mel-frequency cepstral coefficients); monophone and triphone training is performed on the MFCC features; forced alignment with the Viterbi algorithm is followed by deep neural network training; and the processed features finally yield a DNN-based acoustic model.
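The MFCC front end of that pipeline can be illustrated with a short sketch; librosa is an assumed stand-in, as the patent names no toolkit, and the 13-coefficient plus delta and delta-delta layout is a common convention rather than a stated requirement.

```python
# Minimal sketch of the MFCC front end (librosa assumed): 13 cepstral
# coefficients plus delta and delta-delta features, a common input
# layout for DNN acoustic models.
import librosa
import numpy as np

y, sr = librosa.load("consultation.wav", sr=16000, mono=True)
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)        # shape (13, n_frames)
delta = librosa.feature.delta(mfcc)                       # first derivative
delta2 = librosa.feature.delta(mfcc, order=2)             # second derivative
features = np.vstack([mfcc, delta, delta2]).T             # shape (n_frames, 39)
```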
Language model learning module
The language model learning module compares the generated doctor-patient dialogue text with the final record produced by the doctor's edits, original-text comparisons, and copy-paste operations on the automatically generated record, and adjusts the language model accordingly. The workflow is shown in fig. 8: the module first uses the historical doctor-patient dialogue texts and historical consultation speech, combined with the medical term knowledge base and dialect language base built into the machine learning system 7, to separate the voices of different persons along the timeline and by identity, and builds a preliminary language model; it then compares the generated dialogue text with the doctor's final corrected record and updates the personalized features of the model. With this automatic supervised learning, the language model grows more accurate as the doctor's corrections and original-text comparisons accumulate. The learning process is shown in fig. 9.
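As a toy illustration of folding corrected text back into a per-doctor language model, the sketch below maintains bigram counts with add-one smoothing; the model form and vocabulary size are assumptions, since the patent does not describe the language model's internals.

```python
# Hypothetical sketch: fold doctor-corrected text into a per-doctor
# bigram model so the language model drifts toward the doctor's actual
# vocabulary and phrasing.
from collections import Counter, defaultdict

class PersonalBigramModel:
    def __init__(self):
        self.counts = defaultdict(Counter)   # prev token -> next-token counts

    def update(self, corrected_text: str):
        tokens = ["<s>"] + corrected_text.split() + ["</s>"]
        for prev, cur in zip(tokens, tokens[1:]):
            self.counts[prev][cur] += 1

    def prob(self, prev: str, cur: str, vocab_size: int = 50000) -> float:
        # Add-one smoothing keeps unseen continuations non-zero.
        c = self.counts[prev]
        return (c[cur] + 1) / (sum(c.values()) + vocab_size)

model = PersonalBigramModel()
model.update("cough for three days worse at night")
```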
Writing model learning module
The writing model learning module performs automatic supervised learning on the doctor's modification records of the automatically generated electronic medical record and adjusts the writing model. The workflow is shown in fig. 10: the module first builds a personalized preliminary writing model for the doctor from the records the doctor has written, the standard medical record template library in the machine learning system 7, and the medical term knowledge base; it then updates the personalized model from the modification records captured by the input sampling system 6. With automatic supervised learning, the writing model grows more accurate as the doctor's corrections accumulate.
In actual use, the doctor plugs in the voice input device 3 before the consultation and logs into an account in the electronic medical record system 4 to start seeing patients, at which point the voice input device 3 begins collecting voice information. When a patient arrives, the patient's identity is confirmed by card swipe; the voice word segmentation system 5 analyzes the doctor's and patient's speech, converts it into doctor-patient dialogue text, and generates a conforming electronic medical record in the doctor's personalized writing style. After the consultation, the doctor corrects the record. Throughout this process the input sampling system 6 samples the content the doctor enters and the modification records and supplies them to the machine learning system 7, which further optimizes the models used for speech recognition and record generation.
In deployment, a tuning period can be run first: only the voice input device 3 is plugged in to capture doctor-patient speech, while the doctor still writes records manually in the electronic medical record system 4; the input sampling system 6 samples the entered results, and the machine learning system 7 pre-trains the models. After ten cases have been learned, automatic generation of the electronic medical record can be switched on.
Preferably, an identity authentication system can be added, and the external voice input hardware can be extended beyond voice input with a key interaction module, a camera module, and the like.
Doctor authentication then no longer needs the electronic medical record login interface: voice and voiceprint authentication can be used to log in. A voice recognition library is also built for all patients to be seen and, combined with a camera watching people entering and leaving the room, further reduces misrecognized dialogue caused by someone accidentally breaking in during the consultation, greatly improving recording accuracy.
The doctor can use the key interaction module or natural-language instructions (e.g., "wait while I look at a previous laboratory report") to make the electronic medical record system 4 quickly open a given type of record, or a specific record of the patient, for review, improving the user experience.
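Such instruction handling might be as simple as pattern matching on the recognized text; the command patterns and action names in the sketch below are illustrative assumptions only.

```python
# Hypothetical sketch: map a spoken instruction onto an EMR navigation
# action; unmatched utterances fall through to normal dictation handling.
import re

COMMANDS = [
    (re.compile(r"(open|show).*(laboratory|lab) report", re.I),
     "open_lab_report"),
    (re.compile(r"(open|show).*previous (record|visit)", re.I),
     "open_previous_record"),
]

def parse_instruction(utterance: str):
    for pattern, action in COMMANDS:
        if pattern.search(utterance):
            return action
    return None

action = parse_instruction("show me the previous laboratory report")
```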
In summary, in the electronic medical record voice interaction device based on machine learning of the embodiment, the voice word segmentation system 5 combines the doctor-patient conversation content, identity information, and personalized voice features to identify the speaker more accurately, distinguish a third party, eliminate interference, and enter the speech into the medical record automatically. The machine learning system 7 builds acoustic and language models personalized to the doctor, recognizing the doctor's speech more accurately and avoiding entry errors caused by differences in accent, word choice, and abbreviation habits. The machine learning system 7 learns automatically without extra judgments from the doctor, and with use the acoustic model for speech recognition, the language model for language organization, and the writing model for record generation all grow more accurate; consultation efficiency and the accuracy of the electronic medical record are both improved.
Although the present invention has been described with respect to the preferred embodiments, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. An electronic medical record voice interaction device based on machine learning is characterized by comprising a voice input device, a client and a server, wherein an electronic medical record system is deployed at the client, and a voice word segmentation system, an input sampling system and a machine learning system are deployed at the server;
the voice input device collects voice information;
the voice word segmentation system converts voice information into text information, processes the text information to generate an electronic medical record and sends the electronic medical record to the electronic medical record system;
the electronic medical record system displays the electronic medical record through the client and corrects the electronic medical record;
the input sampling system collects the corrected electronic medical records, sends the electronic medical records to the voice word segmentation system for processing, and then provides the electronic medical records to the machine learning system for model training;
the machine learning system provides an acoustic model, a language model and a writing model for the voice word segmentation system;
the server is internally provided with a diagnosis and treatment voice database, a diagnosis and treatment text database, a dictionary, a medical term knowledge base, a dialect language base and a standard medical record template library.
2. The machine learning-based electronic medical record voice interaction device of claim 1, wherein the voice word segmentation system comprises a voice recognition module and an NLP module; the voice recognition module obtains voice information, comprising consultation dialogue and dictated medical records, through the voice input device; the voice recognition module obtains the doctor's and patient's identity information through an identity information acquisition device connected to the client; according to the identity information, the voice recognition module converts consultation dialogue speech into doctor-patient dialogue text carrying identity and time information, and converts dictated medical record speech into dictated medical record text carrying identity and time information; and the NLP module processes the doctor-patient dialogue text and/or the dictated medical record text to generate the electronic medical record and sends it to the electronic medical record system.
3. The machine learning-based electronic medical record voice interaction device of claim 2, wherein the voice recognition module converts the voice information into vectors and feeds them, in sequence, to the acoustic model for acoustic recognition and the language model for language recognition to generate a preliminary dialogue text; meanwhile, the client monitors whether the identity information request changes via the connected identity information acquisition device, and the electronic medical record system monitors whether the patient record is switched; combined with the preliminary dialogue text, the module judges whether the speaker has changed, and recognizes and removes interjected content irrelevant to the consultation, generating the doctor-patient dialogue text with identity and time information.
4. The machine learning-based electronic medical record voice interaction device of claim 2, wherein the diagnosis and treatment voice database stores historical consultation speech, and the acoustic model is obtained by the machine learning system fetching that speech, extracting voice features, and training the model; the diagnosis and treatment text database stores historical doctor-patient dialogue texts and dictated medical record texts, from which the machine learning system trains the language model; and the dictionary links the acoustic model and the language model and stores the correspondence between speech and text.
5. The machine learning-based electronic medical record voice interaction device of claim 2, wherein the NLP module segments the doctor-patient dialogue text to identify its keywords, converts the keywords into the doctor's customary writing style through the writing model, fills in the objective-fact sections of the electronic medical record according to its writing requirements, and sends the result to the electronic medical record system; the objective-fact sections comprise a chief complaint part, a history of present illness part, and a past medical history part.
6. The machine learning-based electronic medical record voice interaction device of claim 5, wherein the NLP module preprocesses and segments the doctor-patient dialogue text against the diagnosis and treatment text database and the medical term knowledge base to obtain keywords; performs medical named entity recognition on the keywords using rules written in a rule description language; decides, with an entropy-based term extraction algorithm, whether each identified keyword is a single or a compound term; and finally filters out common words against a general lexicon, extracts the electronic medical record tags, constructs tag vectors, and fills each section of the record from those tags through the writing model.
7. The machine learning-based electronic medical record voice interaction device of claim 1, wherein the machine learning system comprises an acoustic model learning module, a language model learning module and a writing model learning module; the medical term knowledge base, the dialect language base and the standard medical record template library are built into the machine learning system;
the acoustic model learning module uses the voice information collected by the voice input device to generate a personalized acoustic model associated with identity information;
the language model learning module adjusts the language model by comparing the generated doctor-patient dialogue text with the corrected electronic medical record;
and the writing model learning module adjusts the writing model by comparing the generated electronic medical record with the corrected electronic medical record.
8. The machine learning-based electronic medical record voice interaction device of claim 7, wherein the acoustic model learning module obtains voice information, distinguishes the persons it contains by their voice features, and builds a preliminary acoustic model; simultaneously obtains the identity information of those persons; then performs semantic reasoning over the text converted by the voice word segmentation system to bind the voice information to each person's identity; and learns from the historical consultation speech stored in the diagnosis and treatment voice database, updates each person's personalized voice features, and generates a personalized acoustic model.
9. The machine learning-based electronic medical record voice interaction device of claim 7, wherein the language model learning module builds a language model from the historical consultation speech and historical doctor-patient dialogue texts, combined with the medical term knowledge base and the dialect language base and keyed by the time and identity information attached to the speech, and updates the personalized features of the language model by comparing the dictated medical record text with the corrected record text.
10. The machine learning-based electronic medical record voice interaction device of claim 7, wherein the writing model learning module builds a personalized writing model from the records of the electronic medical record system, the standard medical record template library, and the medical term knowledge base, and updates and optimizes the writing model from the record-modification history captured by the input sampling system.
CN202210930450.5A 2022-08-03 2022-08-03 Electronic medical record voice interaction equipment based on machine learning Pending CN115424618A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210930450.5A CN115424618A (en) 2022-08-03 2022-08-03 Electronic medical record voice interaction equipment based on machine learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210930450.5A CN115424618A (en) 2022-08-03 2022-08-03 Electronic medical record voice interaction equipment based on machine learning

Publications (1)

Publication Number Publication Date
CN115424618A true CN115424618A (en) 2022-12-02

Family

ID=84196207

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210930450.5A Pending CN115424618A (en) 2022-08-03 2022-08-03 Electronic medical record voice interaction equipment based on machine learning

Country Status (1)

Country Link
CN (1) CN115424618A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116913450A (en) * 2023-09-07 2023-10-20 北京左医科技有限公司 Method and device for generating medical records in real time
CN116913450B (en) * 2023-09-07 2023-12-19 北京左医科技有限公司 Method and device for generating medical records in real time
CN118072901A (en) * 2024-04-18 2024-05-24 中国人民解放军海军青岛特勤疗养中心 Outpatient electronic medical record generation method and system based on voice recognition

Similar Documents

Publication Publication Date Title
US11646032B2 (en) Systems and methods for audio processing
CN109313892B (en) Robust speech recognition method and system
Mariooryad et al. Compensating for speaker or lexical variabilities in speech for emotion recognition
CN115424618A (en) Electronic medical record voice interaction equipment based on machine learning
WO2021047319A1 (en) Voice-based personal credit assessment method and apparatus, terminal and storage medium
CN109192194A (en) Voice data mask method, device, computer equipment and storage medium
CN109346086A (en) Method for recognizing sound-groove, device, computer equipment and computer readable storage medium
CN113724695B (en) Electronic medical record generation method, device, equipment and medium based on artificial intelligence
CN112233680B (en) Speaker character recognition method, speaker character recognition device, electronic equipment and storage medium
WO2006097975A1 (en) Voice recognition program
CN104299623A (en) Automated confirmation and disambiguation modules in voice applications
CN110265008A (en) Intelligence pays a return visit method, apparatus, computer equipment and storage medium
Kopparapu Non-linguistic analysis of call center conversations
CN110782902A (en) Audio data determination method, apparatus, device and medium
CN117637097A (en) Method and system for generating electronic medical record based on outpatient service dialogue of large model
Alghifari et al. On the use of voice activity detection in speech emotion recognition
Rosenberg Speech, prosody, and machines: Nine challenges for prosody research
Kanabur et al. An extensive review of feature extraction techniques, challenges and trends in automatic speech recognition
US10872615B1 (en) ASR-enhanced speech compression/archiving
Kumar A Comprehensive Analysis of Speech Recognition Systems in Healthcare: Current Research Challenges and Future Prospects
CN117877660A (en) Medical report acquisition method and system based on voice recognition
Huang et al. A review of automated intelligibility assessment for dysarthric speakers
Debnath et al. Study of speech enabled healthcare technology
CN113990288B (en) Method for automatically generating and deploying voice synthesis model by voice customer service
CN113436617B (en) Voice sentence breaking method, device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination