KR20210067010A

KR20210067010A - Unity Plug-in for Image and Voice Recognition and Patient Information Processing for Surgery Using AR Glass and Deep Learning

Info

Publication number: KR20210067010A
Application number: KR1020190156159A
Authority: KR
Inventors: 장용석; 육창근
Original assignee: (주)다울디엔에스
Priority date: 2019-11-28
Filing date: 2019-11-28
Publication date: 2021-06-08

Abstract

The present invention relates to a Unity plug-in for image and voice recognition and patient information processing for surgery using AR glasses and deep learning. The Unity plug-in comprises: a medical information support unit that collects and stores medical information on a subject and provides the collected medical information; a head-mounted display unit that is worn on the head of an examiner, displays the medical information on the subject provided by the medical information support unit as an augmented screen before both eyes of the examiner, and transmits image and voice information data acquired from the examiner during examination of the subject; a learning unit that analyzes and classifies, through a deep learning method, the image and voice information source data transmitted from the head-mounted display unit, and queries and confirms the resulting recognition information against the medical information support unit; and a medical information communication module that requests the medical information support unit to query and confirm the recognition information classified by the learning unit, and transmits the recognition information and analysis information returned by the medical information support unit to the head-mounted display unit. Accordingly, deep learning on contaminated images and distorted voices in a surgical environment increases the success rate of command recognition. In addition, a patient's medical history and diagnosis results can be checked through real-time queries, without relying on a medical chart, even while directly facing the patient, which makes a chart-free medical environment possible and improves the convenience of medical treatment.

Description

Unity Plug-in for Image and Voice Recognition and Patient Information Processing for Surgery Using AR Glass and Deep Learning

The present invention relates to a Unity plug-in for image and voice recognition and patient information processing for surgery using AR glasses and deep learning. The plug-in raises the success rate of command recognition by applying deep learning to contaminated images and distorted voices in a surgical environment. It also lets the examiner check a patient's medical history, diagnosis results, and the like through real-time queries, without relying on a medical chart, even while directly facing the patient, thereby enabling a chart-free medical environment and making medical treatment more convenient.

In a conventional medical information support system, a doctor (the examiner) who wants to look up existing patient information must connect to a system based on the Digital Imaging and Communications in Medicine (DICOM) standard, and can do so only in front of a PC. It has therefore been difficult to recheck a patient's information during surgery.

In addition, because it is difficult to use a keyboard to look up patient information during surgery, commands must instead be recognized through a camera and voice input, and the patient's information requested from the medical information support unit, that is, the hospital information system (HIS). Many algorithms and training data sets exist for gesture recognition and voice recognition in ordinary environments. In an operating room, however, gesture recognition is difficult because of low illumination and because the hands may be contaminated during surgery or covered by surgical gloves, and voice recognition is difficult because the voice is distorted by a surgical mask.

Korean Laid-open Patent Publication No. 10-2019-0108923 (published September 25, 2019)

The present invention has been devised to solve the problems described above. Its object is to provide a Unity plug-in for image and voice recognition and patient information processing for surgery using AR glasses and deep learning that raises the success rate of command recognition by applying deep learning to contaminated images and distorted voices in a surgical environment, and that lets the examiner check a patient's medical history, diagnosis results, and the like through real-time queries, without relying on a medical chart, even while directly facing the patient, thereby enabling a chart-free medical environment and making medical treatment more convenient.

The Unity plug-in for image and voice recognition and patient information processing for surgery using AR glasses and deep learning according to the present invention comprises: a medical information support unit for collecting and storing medical information of a subject and providing the collected medical information; a head-mounted display unit that is worn on the head of an examiner, displays the medical information of the subject provided by the medical information support unit as an augmented screen before both eyes of the examiner, and transmits image and voice information data acquired from the examiner during examination of the subject; a learning unit that analyzes and classifies, through a deep learning method, the image and voice information source data transmitted from the head-mounted display unit, and queries and confirms the resulting recognition information against the medical information support unit; and a medical information communication module that requests the medical information support unit to query and confirm the recognition information classified by the learning unit, and transmits the recognition information and analysis information returned by the medical information support unit to the head-mounted display unit.

In the present invention, the head-mounted display unit comprises: a source data transfer unit for transmitting, to the learning unit, image information source data captured by a camera and voice information source data picked up by a microphone; an augmented screen composition unit for classifying and sorting the data transmitted from the medical information support unit and the learning unit for on-screen display; an augmented screen display unit for displaying the screen composed by the augmented screen composition unit; and a communication unit for sending out the image and voice data of the source data transfer unit, issuing information query requests to the medical information support unit via the learning unit, and receiving the requested information.

The head-mounted display unit according to the present invention further comprises a protocol analysis unit that uses a parser to classify and sort the data received through the communication unit and passes the result to the augmented screen composition unit.

The learning unit according to the present invention comprises an image learning unit and a voice learning unit which, through deep-learning-based classification training on the image and voice information source data transmitted from the head-mounted display unit, extract and classify feature data from image information and voice information distorted by the surgical environment.

The image learning unit according to the present invention comprises: a feature extraction unit for extracting features of the examiner's hand shape or motion during surgery; a feature vector conversion unit for converting the feature data extracted by the feature extraction unit into a feature vector; and a vector classification unit for recognizing a command from the converted feature vector according to a classification factor.

The voice learning unit according to the present invention comprises: a modulation feature extraction unit for extracting features of the examiner's altered voice during surgery; a modulation feature vector conversion unit for converting the feature data extracted by the modulation feature extraction unit into a feature vector; and a modulation vector classification unit for recognizing a command from the converted feature vector according to a classification factor.

The Unity plug-in for image and voice recognition and patient information processing for surgery using AR glasses and deep learning according to the present invention can raise the success rate of command recognition by applying deep learning to contaminated images and distorted voices in a surgical environment.

In addition, the present invention lets the examiner check a patient's medical history, diagnosis results, and the like through real-time queries, without relying on a medical chart, even while directly facing the patient, which enables a chart-free medical environment and makes medical treatment more convenient.

In addition, because the present invention takes the form of a Unity plug-in, it can easily be integrated into other systems, which can increase the adoption of augmented-reality medical support systems and make medical treatment more convenient.

FIG. 1 is a conceptual diagram of the Unity plug-in for image and voice recognition and patient information processing for surgery using AR glasses and deep learning according to the present invention.
FIG. 2 is a detailed block diagram of the Unity plug-in for image and voice recognition and patient information processing for surgery using AR glasses and deep learning according to the present invention.
FIG. 3 is a perspective view of the head-mounted display unit according to the present invention.
FIG. 4 is a conceptual diagram of the Unity plug-in according to the present invention.

To explain the present invention, its operational advantages, and the objects achieved by practicing it, preferred embodiments of the present invention are illustrated and described below.

First, the terms used in this application are used only to describe specific embodiments and are not intended to limit the present invention. A singular expression includes the plural unless the context clearly indicates otherwise. In this application, terms such as "comprise" or "have" specify that the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification are present, and should be understood not to exclude in advance the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

In describing the present invention, detailed descriptions of related known configurations or functions are omitted where they might obscure the gist of the invention.

As shown in FIGS. 1 to 3, the Unity plug-in for image and voice recognition and patient information processing for surgery using AR glasses and deep learning according to the present invention comprises a medical information support unit (10), a head-mounted display unit (20), a learning unit (30), and a medical information communication module (40).

As shown in FIGS. 1 to 3, the medical information support unit (10) according to the present invention collects and stores the medical information of the subject and provides the collected medical information.

To collect and store the medical information of the subject, that is, the patient, the medical information support unit (10) comprises an RIS (11), a PACS (13), an EMR (15), an OCS (17), and the like.

The RIS (11) (Radiology Information System) is a radiology department information system that stores and transmits a patient's radiological treatment and diagnosis information. The PACS (13) (Picture Archiving and Communications System) is a system for storing and transmitting medical images. The EMR (15) (Electronic Medical Record) holds the patient's electronic medical records, and the OCS (17) (Order Communication System) is a system for communicating a doctor's prescriptions.

Through these systems, the medical information support unit (10) collects and stores the medical information of the subject, that is, the patient. When the head-mounted display unit (20) or the learning unit (30), both described later, requests or queries information, the medical information support unit (10) sends the result and the corresponding information to the head-mounted display unit (20), which displays it in real time as an augmented screen to the examiner, that is, the doctor treating the patient.

The medical information support unit (10) configured in this way comprehensively collects and stores in-hospital treatment status, pharmaceutical information, administrative information, and the like. When information is queried or requested as needed, for example during general treatment or surgery, the unit sends that information to the head-mounted display unit (20) for display as an augmented screen.

As shown in FIGS. 1 to 3, the head-mounted display unit (20) according to the present invention is worn on the examiner's head, displays the medical information of the subject provided by the medical information support unit (10) as an augmented screen before both eyes of the examiner, and sends out the image and voice information data acquired from the examiner during examination of the subject.

For this purpose, the head-mounted display unit (20) comprises a source data transfer unit (21), an augmented screen composition unit (22), an augmented screen display unit (23), and a communication unit (24).

The head-mounted display unit (20) has a wearing band (27) so that it can be worn on the examiner's head, and a see-through glass (26) at the front.

The source data transfer unit (21) of the head-mounted display unit (20) consists of a camera (21a) provided at the front of the wearing band (27) and a microphone (21b) connected to the side of the wearing band (27). After the doctor puts on the head-mounted display unit (20), the source data transfer unit (21) treats the image information captured by the camera (21a) and the voice information picked up by the microphone (21b) as source data, and transmits this source data through the communication unit (24) to the learning unit (30) described later.

The augmented screen composition unit (22) of the head-mounted display unit (20) is built into the device and sorts and arranges the data received through the communication unit (24) for display on the screen.

The data sorted and arranged by the augmented screen composition unit (22) is then presented as an augmented screen on the see-through glass (26) by the augmented screen display unit (23) of the head-mounted display unit (20).

The communication unit (24) of the head-mounted display unit (20) is arranged on one side of the wearing band (27). It sends the image data and voice data acquired by the source data transfer unit (21) to the learning unit (30), and it issues information query requests and receives the requested information through the learning unit (30) or the medical information communication module (40). The received information can then be displayed by the augmented screen composition unit (22) and the augmented screen display unit (23).

The communication unit (24) may communicate wirelessly, for example over Wi-Fi or another wireless network connection.

The head-mounted display unit (20) also has a guidance speaker (28) on one side of the wearing band (27), which conveys information needed during treatment or a procedure to the doctor by voice.

Furthermore, the head-mounted display unit (20) according to the present invention may additionally include a protocol analysis unit (25) that uses a parser to classify and sort the data received through the communication unit (24) and passes the result to the augmented screen composition unit (22).
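
The disclosure does not specify the message format handled by the protocol analysis unit (25). As a minimal sketch, assuming each received message is a JSON object with a hypothetical `kind` field and a `body` field, a parser could sort incoming data into groups for the augmented screen composition unit (22) roughly as follows. The kind names and the function `parse_messages` are illustrative assumptions, not part of the patent.

```python
import json
from collections import defaultdict

# Hypothetical message kinds; the patent does not define a wire format.
KNOWN_KINDS = {"patient_record", "image_ref", "recognition_result"}


def parse_messages(raw_messages):
    """Group decoded message bodies by their 'kind' field.

    Malformed JSON and messages with an unknown kind are kept under
    'unclassified' so that nothing received is silently dropped.
    """
    grouped = defaultdict(list)
    for raw in raw_messages:
        try:
            msg = json.loads(raw)
        except json.JSONDecodeError:
            grouped["unclassified"].append({"raw": raw})
            continue
        kind = msg.get("kind") if isinstance(msg, dict) else None
        if kind in KNOWN_KINDS:
            grouped[kind].append(msg.get("body"))
        else:
            grouped["unclassified"].append(msg)
    return dict(grouped)
```

Keeping unparseable input in a separate group, rather than raising, is a design choice for this sketch: a display device in an operating room would presumably prefer to show what it can and report the rest.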

As shown in FIGS. 1 to 3, the learning unit (30) according to the present invention analyzes and classifies, through a deep learning method, the image and voice information source data transmitted from the head-mounted display unit (20), so that the resulting recognition information can be queried and confirmed against the medical information support unit (10).

Because it is difficult to use a keyboard to look up patient information during surgery, commands must instead be recognized through the camera (21a) and voice input, and the patient's information requested from the medical information support unit (10), that is, the hospital information system (HIS). Many algorithms and training data sets exist for gesture recognition and voice recognition in ordinary environments. In an operating room, however, gesture recognition is difficult because of low illumination and because the hands may be contaminated during surgery or covered by surgical gloves, and voice recognition is difficult because the voice is distorted by a surgical mask.

Accordingly, the present invention uses deep learning to classify and analyze image information, such as gestures, and voice information that are hard to recognize during surgery because of contamination or because surgical gloves and a mask are worn. The resulting training data is then used to raise the success rate of recognizing the doctor's commands to query and request information during surgery.

For this purpose, the learning unit (30) comprises an image learning unit (31) and a voice learning unit (33) which, through deep-learning-based classification training on the image and voice information source data transmitted from the head-mounted display unit (20), extract and classify feature data from image information and voice information distorted by the surgical environment.

The image learning unit (31) of the learning unit (30) generates training data through image recognition based on the open-source deep learning project 'YOLO', and the voice learning unit (33) generates training data through voice recognition based on the open-source deep learning project 'KALDI'. Here, training data means data learned by analyzing and classifying the doctor's gestures (hand shapes or motions) and voice during surgery.

The image learning unit (31) consists of a feature extraction unit (31a), a feature vector conversion unit (31b), and a vector classification unit (31c), in order to raise the gesture recognition success rate under the low illumination of an operating room and when the hands are contaminated during surgery or covered by surgical gloves.

The feature extraction unit (31a) extracts features of the hand shape or motion of the examiner, that is, the doctor, during surgery. The feature data extracted by the feature extraction unit (31a) is converted into a feature vector by the feature vector conversion unit (31b). The vector classification unit (31c) then recognizes, from the converted feature vector and according to a classification factor, a command to query or request patient information.
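
The three stages of the image learning unit (31) can be illustrated with a toy pipeline. This is only a sketch under stated assumptions: the disclosure names YOLO-based deep learning but gives no feature definitions, so the brightness features, the nearest-centroid classifier, and the command labels below are hypothetical stand-ins chosen to keep the example self-contained. It assumes a non-empty rectangular grayscale frame.

```python
import math


def extract_features(frame):
    """Stage 31a (stand-in): raw features from a 2D list of 0..255 grayscale values."""
    height, width = len(frame), len(frame[0])
    total = sum(sum(row) for row in frame)
    top = sum(sum(row) for row in frame[: height // 2])
    left = sum(sum(row[: width // 2]) for row in frame)
    return {
        "mean": total / (height * width * 255),
        # Fall back to a neutral 0.5 when the frame is completely black.
        "top_ratio": top / total if total else 0.5,
        "left_ratio": left / total if total else 0.5,
    }


def to_vector(features):
    """Stage 31b (stand-in): fix the feature order so vectors are comparable."""
    return [features["mean"], features["top_ratio"], features["left_ratio"]]


def classify(vector, centroids):
    """Stage 31c (stand-in): return the label of the nearest centroid."""
    return min(centroids, key=lambda label: math.dist(vector, centroids[label]))


# Hypothetical command labels with hand-picked centroids.
CENTROIDS = {
    "open_record": [0.5, 1.0, 0.5],  # brightness concentrated in the top half
    "next_image": [0.5, 0.0, 0.5],   # brightness concentrated in the bottom half
}

frame = [[255, 255], [0, 0]]
command = classify(to_vector(extract_features(frame)), CENTROIDS)
```

In a real system the centroids would be replaced by a trained model; the point of the sketch is only the hand-off from extraction to vector to classification.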

The voice learning unit (33) comprises a modulation feature extraction unit (33a), a modulation feature vector conversion unit (33b), and a modulation vector classification unit (33c), in order to raise the recognition success rate for a voice distorted by a surgical mask.

The modulation feature extraction unit (33a) extracts features of the altered voice of the examiner, that is, the doctor, during surgery. The feature data extracted by the modulation feature extraction unit (33a) is converted into a feature vector by the modulation feature vector conversion unit (33b). The modulation vector classification unit (33c) then recognizes, from the converted feature vector and according to a classification factor, a command to query or request patient information.
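
The same three-stage hand-off can be sketched for the voice learning unit (33). The disclosure names Kaldi but gives no feature details, so everything below is an illustrative assumption rather than Kaldi's method: per-frame energy stands in for the extracted features, unit-length normalization stands in for the vector conversion, and cosine matching against hypothetical command templates stands in for the classifier. A uniform volume drop is used here as a deliberately simplified model of a mask; a real mask also changes the spectrum.

```python
import math


def frame_energies(samples, n_frames=4):
    """Stage 33a (stand-in): sum of squared samples in each of n_frames equal slices."""
    size = len(samples) // n_frames
    return [
        sum(s * s for s in samples[i * size:(i + 1) * size])
        for i in range(n_frames)
    ]


def to_unit_vector(values):
    """Stage 33b (stand-in): scale to unit length so overall loudness cancels out."""
    norm = math.sqrt(sum(v * v for v in values))
    return [v / norm for v in values] if norm else list(values)


def classify(vector, templates):
    """Stage 33c (stand-in): pick the template with the highest dot product."""
    return max(
        templates,
        key=lambda label: sum(a * b for a, b in zip(vector, templates[label])),
    )


# Hypothetical command templates: energy at the start vs. at the end of the utterance.
TEMPLATES = {
    "show_history": [1.0, 0.0, 0.0, 0.0],
    "show_images": [0.0, 0.0, 0.0, 1.0],
}

clear = [1.0] * 4 + [0.1] * 12
muffled = [0.3 * s for s in clear]  # same utterance, uniformly quieter
label_clear = classify(to_unit_vector(frame_energies(clear)), TEMPLATES)
label_muffled = classify(to_unit_vector(frame_energies(muffled)), TEMPLATES)
```

Because the vector is normalized before matching, the quieter copy of the signal maps to the same command as the original, which is the kind of robustness the voice learning unit is meant to provide.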

As described above, the image learning unit (31) and the voice learning unit (33) of the learning unit (30) use deep learning to extract feature data in the surgical environment and provide training data for raising the recognition success rate.

Using the training data that the learning unit (30) generates by analyzing and classifying contaminated or distorted gesture and voice data from the surgical environment, the recognition success rate is raised, so that accurate patient information can be displayed on the head-mounted display unit (20).

As shown in FIGS. 1 to 3, the medical information communication module (40) according to the present invention requests the medical information support unit (10) to query and confirm the recognition information classified by the learning unit (30), and transmits the recognition information and analysis information returned by the medical information support unit (10) to the head-mounted display unit (20).

In other words, once the training data analyzed and classified by the learning unit (30), that is, a command given by the doctor's gesture or voice in the surgical environment, has been recognized, the medical information communication module (40) sends a query or request for patient information or other needed information to the medical information support unit (10) according to that command. In response, the medical information support unit (10) sends the recognition information or analysis information through the medical information communication module (40) to the head-mounted display unit (20), which presents it to the doctor as an augmented screen.
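
The request and response flow through the medical information communication module (40) can be sketched as a small dispatcher. This is a minimal illustration under stated assumptions: the command names, the mapping from commands to the RIS, PACS, EMR, and OCS subsystems, and the in-memory stores are all hypothetical, since the disclosure does not define an interface between the units.

```python
# Hypothetical routing of recognized commands to subsystems of unit (10).
COMMAND_ROUTES = {
    "show_history": "EMR",
    "show_images": "PACS",
    "show_orders": "OCS",
    "show_radiology": "RIS",
}


class MedicalInfoSupportUnit:
    """In-memory stand-in for unit (10): {subsystem: {patient_id: data}}."""

    def __init__(self, stores):
        self.stores = stores

    def query(self, subsystem, patient_id):
        return self.stores.get(subsystem, {}).get(patient_id)


class CommunicationModule:
    """Stand-in for module (40): forwards commands to (10) and results to (20)."""

    def __init__(self, support_unit, send_to_display):
        self.support_unit = support_unit
        self.send_to_display = send_to_display  # callable standing in for unit (20)

    def handle(self, command, patient_id):
        subsystem = COMMAND_ROUTES.get(command)
        if subsystem is None:
            message = {"status": "unknown_command", "command": command}
        else:
            data = self.support_unit.query(subsystem, patient_id)
            message = {
                "status": "ok" if data is not None else "not_found",
                "source": subsystem,
                "data": data,
            }
        # Every outcome is sent on, so the wearer always gets feedback.
        self.send_to_display(message)
        return message


displayed = []
unit = MedicalInfoSupportUnit({"EMR": {"P1": "appendectomy, 2018"}})
module = CommunicationModule(unit, displayed.append)
module.handle("show_history", "P1")
```

Here `displayed.append` plays the role of the head-mounted display unit (20); in the plug-in described by the patent, that hand-off would go over the wireless link of the communication unit (24).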

Furthermore, in ordinary cases the medical information support unit and the head-mounted display unit according to the present invention can also request and receive image and voice information through the DICOM system, without going through the learning unit, and display it on the augmented screen.

Meanwhile, as shown in FIG. 4, the medical information support system according to the present invention may be packaged as a Unity plug-in (.so or DLL), which makes it easy to integrate into other systems. This can increase the adoption of the augmented-reality medical support system according to the present invention and thereby help make medical treatment more convenient.

Although the present invention has been described with reference to an embodiment shown in the drawings, this embodiment is merely exemplary, and those of ordinary skill in the art will understand that various modifications and other equivalent embodiments are possible therefrom.

Therefore, the true technical scope of protection of the present invention should be determined by the technical spirit of the appended claims.

10: medical information support unit
11: RIS
13: PACS
15: EMR
17: OCS
20: head-mounted display unit
21: source data transfer unit
21a: camera
21b: microphone
22: augmented screen composition unit
23: augmented screen display unit
24: communication unit
25: protocol analysis unit
26: see-through glass
27: wearing band
28: guidance speaker
30: learning unit
31: image learning unit
31a: feature extraction unit
31b: feature vector conversion unit
31c: vector classification unit
33: voice learning unit
33a: modulation feature extraction unit
33b: modulation feature vector conversion unit
33c: modulation vector classification unit
40: medical information communication module

Claims

A Unity plug-in for image and voice recognition and patient information processing for surgery using AR glasses and deep learning, comprising:
a medical information support unit (10) for collecting and storing medical information on a subject and providing the collected medical information;
a head-mounted display unit (20) mounted on the head of an examiner, for displaying the subject's medical information provided from the medical information support unit (10) as an augmented screen before both eyes of the examiner, and for transmitting image and voice information data acquired from the examiner during examination of the subject;
a learning unit (30) for inquiring and confirming, with the medical information support unit (10), recognition information analyzed and classified by a deep learning method from the image and voice information source data transmitted from the head-mounted display unit (20); and
a medical information communication module (40) for requesting the medical information support unit (10) to look up and confirm the recognition information classified by the learning unit (30), and for transmitting the recognition information and analysis information sent from the medical information support unit (10) to the head-mounted display unit (20).

The Unity plug-in for image and voice recognition and patient information processing for surgery using AR glasses and deep learning of claim 1, wherein the head-mounted display unit (20) comprises:
a source data transfer unit (21) for transmitting image information source data captured by a camera (21a) and voice information source data picked up by a microphone (21b) to the learning unit (30);
an augmented screen composition unit (22) for classifying and sorting data transmitted from the medical information support unit (10) and the learning unit (30) for screen display;
an augmented screen display unit (23) for displaying the screen composed by the augmented screen composition unit (22); and
a communication unit (24) for sending out the image and voice data of the source data transfer unit (21), and for requesting an information inquiry from the medical information support unit through the learning unit (30) and receiving the requested information.

The Unity plug-in for image and voice recognition and patient information processing for surgery using AR glasses and deep learning of claim 2, wherein the head-mounted display unit (20) further comprises a protocol analysis unit (25) for classifying and sorting, by means of a parser, the data transmitted through the communication unit (24) and passing it to the augmented screen composition unit (22).

The Unity plug-in for image and voice recognition and patient information processing for surgery using AR glasses and deep learning of claim 1, wherein the learning unit (30) comprises an image learning unit (31) and a voice learning unit (33) for extracting and classifying feature data from image information and voice information modulated in a surgical environment, through deep-learning classification training on the image and voice information source data transmitted from the head-mounted display unit (20).

The Unity plug-in for image and voice recognition and patient information processing for surgery using AR glasses and deep learning of claim 4, wherein the image learning unit (31) comprises:
a feature extraction unit (31a) for extracting features of the examiner's hand shape or motion during surgery;
a feature vector conversion unit (31b) for converting the feature data extracted by the feature extraction unit (31a) into a feature vector; and
a vector classification unit (31c) for recognizing a command by classifying, according to a classification factor, the feature vector converted by the feature vector conversion unit (31b).

The Unity plug-in for image and voice recognition and patient information processing for surgery using AR glasses and deep learning of claim 4, wherein the voice learning unit (33) comprises:
a modulation feature extraction unit (33a) for extracting features of the examiner's altered voice during surgery;
a modulation feature vector conversion unit (33b) for converting the feature data extracted by the modulation feature extraction unit (33a) into a feature vector; and
a modulation vector classification unit (33c) for recognizing a command by classifying, according to a classification factor, the feature vector converted by the modulation feature vector conversion unit (33b).
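
For illustration only, the three stages recited for the image learning unit (31) and the voice learning unit (33), namely feature extraction, feature vector conversion, and vector classification, can be sketched in Python as below. This is a simplified sketch under stated assumptions, not the claimed deep-learning method: a nearest-centroid rule is substituted for the trained classifier, the three summary statistics are an arbitrary choice of features, and the command labels and signal values are hypothetical.

```python
import math


def extract_features(samples):
    """Feature extraction (31a / 33a): summary statistics of a raw 1-D signal."""
    n = len(samples)
    return {
        "mean": sum(samples) / n,
        "energy": sum(s * s for s in samples) / n,
        "peak": max(abs(s) for s in samples),
    }


def to_feature_vector(features):
    """Feature vector conversion (31b / 33b): fixed ordering of the features."""
    return [features["mean"], features["energy"], features["peak"]]


def classify(vector, centroids):
    """Vector classification (31c / 33c): label of the nearest stored centroid."""
    return min(centroids, key=lambda label: math.dist(vector, centroids[label]))


# Hypothetical reference signals, one per command label.
centroids = {
    "next_image": to_feature_vector(extract_features([0.1, 0.2, 0.1, 0.2])),
    "zoom_in": to_feature_vector(extract_features([1.0, -1.0, 1.0, -1.0])),
}

# A noisy input that resembles the "zoom_in" reference signal.
noisy_input = [0.9, -1.1, 1.0, -0.8]
command = classify(to_feature_vector(extract_features(noisy_input)), centroids)
```

The recognized label would then be handed to the medical information communication module (40) as the command to act on.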