KR20230001968A

KR20230001968A - Voice and gesture integrating device of vehicle

Info

Publication number: KR20230001968A
Application number: KR1020210085139A
Authority: KR
Inventors: 이동석; 이재환; 오경민
Original assignee: 혜윰기술 주식회사
Priority date: 2021-06-29
Filing date: 2021-06-29
Publication date: 2023-01-05
Also published as: KR102492229B1

Abstract

The present invention relates to a voice and gesture recognition device for a vehicle that extracts only voice commands and accurately recognizes a user's hand gestures. According to one embodiment of the present invention, the voice and gesture recognition device comprises: a transmitting pad emitting an electric field to detect gestures of a user's hand; a receiving pad detecting distortion of the electric field emitted from the transmitting pad; a gesture recognizer recognizing the gesture from the distortion of the electric field at the receiving pad; a microphone unit detecting the user's voice signal; a voice preprocessor performing preprocessing, including noise removal, on the voice signal detected by the microphone unit; a voice recognizer recognizing the voice signal preprocessed by the voice preprocessor; and a communicator transmitting signals for the gestures recognized by the gesture recognizer and the voice signals recognized by the voice recognizer to an electronic control unit of the vehicle.

Description

Vehicle voice and gesture recognition device {VOICE AND GESTURE INTEGRATING DEVICE OF VEHICLE}

The present invention relates to a voice and gesture recognition device for a vehicle, and more particularly, to a voice and gesture recognition device for a vehicle that allows a driver to control the vehicle's equipment while driving.

While a vehicle is in operation, the driver needs to keep looking ahead at all times. To control other functions of the vehicle during this time, the driver must operate a physical button or switch for the function in question. Operating the vehicle's various functions in this way while driving can prevent the driver from concentrating on driving.

Accordingly, research is being conducted on operating the various functions of a vehicle through voice recognition or through gestures such as the driver's hand motions. Conventionally, voice recognition can generally be divided into two stages. The first stage requires preprocessing technology that cleanly separates the driver's voice regardless of surrounding noise. The second stage must recognize the exact voice command and, taking the vehicle's driving situation and safety into account, minimize interaction with the driver and produce a conservative result that gives as much weight to safety as possible.

However, conventional voice recognition technology has a fixed directivity angle so that it recognizes only the voice of a person in a specific position (e.g., the driver), which makes it difficult for a front passenger or a rear-seat occupant to use voice recognition. In addition, products that recognize voice on a server require an Internet connection.

Furthermore, as described above, when vehicle functions are controlled through voice recognition alone, it is not easy to control the vehicle's various functions in fine detail.

Republic of Korea Patent No. 10-1650769 (2016.08.18.)
Republic of Korea Patent No. 10-1260053 (2013.04.25.)

The problem to be solved by the present invention is to provide a voice and gesture recognition device for a vehicle that can separate noise generated around the user from the voice so as to extract only the exact voice command, and that can accurately recognize the user's hand gestures.

A voice and gesture recognition device according to an embodiment of the present invention may include: a transmitting pad that emits an electric field to detect a gesture of a user's hand; a receiving pad that detects distortion of the electric field emitted from the transmitting pad; a gesture recognizer that recognizes the gesture from the distortion of the electric field at the receiving pad; a microphone unit that detects the user's voice signal; a voice preprocessor that performs preprocessing, including noise removal, on the voice signal detected by the microphone unit; a voice recognizer that recognizes the voice signal preprocessed by the voice preprocessor; and a communicator that transmits the gesture signal and the voice signal recognized by the gesture recognizer and the voice recognizer to an electronic control unit of the vehicle.

The device may further include a gesture detector that detects the depth and direction of the gesture from the distortion of the electric field at the receiving pad, and the gesture recognizer may recognize the gesture using the depth and direction detected by the gesture detector.

The gesture detector may detect a gesture pattern by detecting the depth and direction of the gesture and compare the detected gesture pattern with a preset gesture pattern, and the gesture recognizer may recognize the gesture when the gesture pattern detected by the gesture detector matches the preset gesture pattern.

The microphone unit may include a plurality of microphones, and the plurality of microphones may be disposed at different positions inside the vehicle.

The voice preprocessor may include: a noise removal unit that removes noise from the voice signal detected by the microphone unit; a reverberation removal unit that removes reverberation from the voice signal from which the noise removal unit has removed noise; a direction and position detection unit that detects the direction and position of the voice signal from the voice signal from which the reverberation removal unit has removed reverberation; and a beam forming unit that forms a beam for the voice signal whose direction and position have been detected by the direction and position detection unit.

The noise removal unit may remove noise by removing signals at frequencies above the frequency range of the user's voice signal and signals at frequencies below that range.

The device may further include a voice signal detection unit that detects, from the voice signal preprocessed by the voice preprocessor, a voice signal that takes the direction and position of the user's voice signal into account, and the voice recognizer may recognize the voice signal for which the voice signal detection unit has taken the direction and position into account.

The device may further include a voice pattern detection unit that compares the voice signal detected by the voice signal detection unit with preset voice signal patterns and outputs the result, and the voice recognizer may recognize a detected voice signal that the voice pattern detection unit has found to correspond to a preset voice signal pattern.

According to the present invention, the captured voice passes through a preprocessing stage that separates noise from the voice and forms a beam, so that the voice signal can be detected and the user's voice can be recognized accurately.

In addition, because the user's gesture is recognized using an electric field, the gesture can be recognized in three dimensions and therefore recognized accurately.

FIG. 1 is a block diagram illustrating a voice and gesture recognition device according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating a receiving pad of a voice and gesture recognition device according to an embodiment of the present invention.
FIG. 3 is a diagram illustrating a gesture detector of a voice and gesture recognition device according to an embodiment of the present invention.
FIG. 4 is a diagram illustrating a microphone unit of a voice and gesture recognition device according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating a voice preprocessor of a voice and gesture recognition device according to an embodiment of the present invention.
FIG. 6 is a diagram illustrating a voice pattern detection unit of a voice and gesture recognition device according to an embodiment of the present invention.
FIG. 7 is a diagram for explaining an example of a gesture operation of a voice and gesture recognition device according to an embodiment of the present invention.
FIG. 8 is a diagram for explaining an example of a gesture operation of a voice and gesture recognition device according to an embodiment of the present invention.
FIG. 9 is a diagram for explaining an example of a gesture operation of a voice and gesture recognition device according to an embodiment of the present invention.
FIG. 10 is a diagram for explaining an example of a gesture operation of a voice and gesture recognition device according to an embodiment of the present invention.
FIG. 11 is a diagram for explaining a gesture recognition method of a voice and gesture recognition device according to an embodiment of the present invention.
FIG. 12 is a diagram for explaining a voice recognition method of a voice and gesture recognition device according to an embodiment of the present invention.

Hereinafter, specific embodiments for implementing the present invention will be described in detail with reference to the drawings.

In describing the present invention, detailed descriptions of related known configurations or functions are omitted where they could obscure the gist of the present invention.

In addition, when a component is said to be 'connected to', 'supported by', 'joined to', 'supplied to', 'delivered to', or 'in contact with' another component, it should be understood that it may be directly connected to, supported by, joined to, supplied to, delivered to, or in contact with that other component, but that other components may also be present in between.

The terms used in this specification are used only to describe specific embodiments and are not intended to limit the present invention. Singular expressions include plural expressions unless the context clearly indicates otherwise.

In addition, expressions such as upper, lower, and side in this specification are based on what is shown in the drawings, and may be expressed differently if the orientation of the object changes. For the same reason, some components in the accompanying drawings are exaggerated, omitted, or shown schematically, and the size of each component does not entirely reflect its actual size.

In addition, terms including ordinal numbers, such as first and second, may be used to describe various components, but the components are not limited by these terms. These terms are used only to distinguish one component from another.

As used in this specification, "comprising" specifies particular characteristics, regions, integers, steps, operations, elements, and/or components, and does not exclude the presence or addition of other particular characteristics, regions, integers, steps, operations, elements, components, and/or groups.

FIG. 1 is a block diagram illustrating a voice and gesture recognition device according to an embodiment of the present invention. FIG. 2 is a diagram illustrating a receiving pad of the voice and gesture recognition device, and FIG. 3 is a diagram illustrating a gesture detector of the voice and gesture recognition device. FIG. 4 is a diagram illustrating a microphone unit of the voice and gesture recognition device, and FIG. 5 is a diagram illustrating a voice preprocessor of the voice and gesture recognition device. FIG. 6 is a diagram illustrating a voice pattern detection unit of the voice and gesture recognition device. FIGS. 7 to 10 are diagrams for explaining examples of gesture operations of the voice and gesture recognition device according to an embodiment of the present invention.

A voice and gesture recognition device 100 according to an embodiment of the present invention will be described with reference to FIGS. 1 to 10. The voice and gesture recognition device 100 is used by a user to operate various functions of a vehicle while the vehicle is being driven. The voice and gesture recognition device 100 includes a transmitting pad 110, a receiving pad 120, a gesture detector 130, a gesture recognizer 140, a microphone unit 150, a voice preprocessor 160, a voice signal detection unit 170, a voice pattern detection unit 180, a voice recognizer 190, and a communicator 200.

The transmitting pad 110 serves to recognize the user's hand gestures. The transmitting pad 110 emits a predetermined electric field, and the emitted electric field may be distributed over a predetermined region above the transmitting pad 110. Accordingly, when the user's hand or the like is placed within the electric field emitted from the transmitting pad 110, or the hand makes a specific gesture, the electric field is distorted. The transmitting pad 110 transmits the signal generated by this distortion of the electric field to the receiving pad 120. To this end, the transmitting pad 110 may be electrically connected to the receiving pad 120, or the signal may be delivered through wireless communication.

The receiving pad 120 receives the signal transmitted from the transmitting pad 110 and senses the user's hand motion from the electric-field distortion in the received signal. To this end, the receiving pad 120 includes a central portion 121, an eastern portion 123, a western portion 125, a southern portion 127, and a northern portion 129.

The central portion 121 is arranged to have a predetermined width. The central portion 121 can recognize whether the user's gesture stays fixed at a given position, without moving, for a predetermined time or longer.

The eastern portion 123 is disposed to the right of the central portion 121 and has a predetermined width. The eastern portion 123 can recognize the user's gesture when it moves to the right (x-axis direction) relative to the central portion 121.

The western portion 125 is disposed to the left of the central portion 121 and has a predetermined width. The western portion 125 can recognize the user's gesture when it moves to the left (-x-axis direction) relative to the central portion 121.

The southern portion 127 is disposed below the central portion 121 and has a predetermined width. The southern portion 127 can recognize the user's gesture when it moves downward (-y-axis direction) relative to the central portion 121.

The northern portion 129 is disposed above the central portion 121 and has a predetermined width. The northern portion 129 can recognize the user's gesture when it moves upward (y-axis direction) relative to the central portion 121.

In addition, the receiving pad 120 can recognize the user's gesture moving across the central portion 121, eastern portion 123, western portion 125, southern portion 127, and northern portion 129 in combination, and can thereby recognize compound motions (e.g., drawing a circle).

The gesture detector 130 detects the depth and direction of the user's gesture motion recognized by the receiving pad 120. To this end, the gesture detector 130 includes a depth detection unit 131 and a direction detection unit 133.

The depth detection unit 131 detects the depth of the gesture made by the user from the distance between the user's hand and the transmitting pad 110.

The direction detection unit 133 detects the direction using the user's gesture as recognized at each of the central portion 121, eastern portion 123, western portion 125, southern portion 127, and northern portion 129 of the receiving pad 120.
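The following is a minimal sketch of how a sequence of activated pad regions might be turned into a direction or a compound motion. The region labels, the `classify_path` function, and the rule that a full circle visits four outer regions in order are all assumptions made for illustration; the patent does not specify the algorithm.

```python
# Outer regions listed in clockwise order (north is +y, east is +x).
CLOCKWISE = ["north", "east", "south", "west"]

def classify_path(zones):
    """Classify a time-ordered list of activated region names (hypothetical labels)."""
    # Collapse consecutive repeats so only region changes remain.
    path = [z for i, z in enumerate(zones) if i == 0 or z != zones[i - 1]]
    if path == ["center"]:
        return "hold"                      # hand stayed over the central portion
    ring = [z for z in path if z != "center"]
    if len(ring) == 1:
        return "swipe_" + ring[0]          # single move toward one outer region
    if len(ring) >= 4:
        steps = [(CLOCKWISE.index(b) - CLOCKWISE.index(a)) % 4
                 for a, b in zip(ring, ring[1:])]
        if all(s == 1 for s in steps):
            return "circle_cw"
        if all(s == 3 for s in steps):
            return "circle_ccw"
    return "unknown"
```

For instance, `classify_path(["center", "center", "east"])` yields `"swipe_east"` under these assumptions.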

In this way, the depth detection unit 131 and the direction detection unit 133 each detect the depth and direction of the user's gesture at a predetermined time interval (e.g., 0.1 ms), and the pattern in which the detected gesture distorts the electric field is compared with preset electric-field distortion patterns.

The preset electric-field distortion patterns may be distortion patterns learned and stored using an artificial neural network. That is, the way the electric field is distorted by the user's gestures is learned, and the preset distortion patterns can be continuously updated with the learned results.

Accordingly, the gesture detector 130 compares the user's gesture detected through the depth detection unit 131 and the direction detection unit 133 with the preset electric-field distortion patterns and, when it matches a specific pattern, passes the result that the specific pattern was detected to the gesture recognizer 140.

The gesture recognizer 140 recognizes the user's gesture when the gesture detector 130 reports that the gesture corresponds to a specific pattern, and also checks whether the recognized gesture matches a vehicle control command. If the gesture matches a vehicle control command, the gesture recognizer 140 transmits the corresponding signal to the communicator; if it does not, the gesture recognizer 140 does not forward the signal and instead recognizes the user's gesture again.

For example, among user gestures that match vehicle control commands, when the palm moves sideways as shown in FIG. 7, moving the palm to the left (-x-axis direction) may turn the driver's seat light on or off, and moving the palm to the right (x-axis direction) may turn the passenger seat light on or off. In addition, when the palm is held without moving for a predetermined time or longer (e.g., 5 seconds or longer), the hazard lights may be turned on.

In addition, as shown in FIG. 8, moving the palm north (y-axis direction) or south (-y-axis direction) may be a gesture that partially opens or partially closes the sunroof (e.g., by 10%), and moving the palm north (y-axis direction) or south (-y-axis direction) twice may be a gesture that fully opens or fully closes the sunroof.

As shown in FIG. 9, rotating a finger clockwise may be a gesture that opens the sunroof in increments (e.g., 5%), opening it a little further with each rotation for as long as the rotation continues. Likewise, as shown in FIG. 10, rotating a finger counterclockwise may be a gesture that closes the sunroof in increments (e.g., 5%), closing it a little further with each rotation for as long as the rotation continues.
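The gesture-to-command matching illustrated in FIGS. 7 to 10 can be pictured as a simple lookup. The sketch below is illustrative only: the gesture labels and command names are hypothetical, and the assignment of north to opening and south to closing is an assumption, since the text does not say which direction does which.

```python
from typing import Optional

# Hypothetical labels; the patent describes the behaviours but names no identifiers.
GESTURE_COMMANDS = {
    "swipe_west": "toggle_driver_light",        # palm moves left (-x), FIG. 7
    "swipe_east": "toggle_passenger_light",     # palm moves right (+x), FIG. 7
    "hold": "hazard_lights_on",                 # palm held still, e.g. 5 s or longer
    "swipe_north": "sunroof_open_partial",      # assumed: north opens, e.g. 10%
    "swipe_south": "sunroof_close_partial",     # assumed: south closes, e.g. 10%
    "double_swipe_north": "sunroof_open_full",
    "double_swipe_south": "sunroof_close_full",
    "circle_cw": "sunroof_open_step",           # e.g. 5% per rotation, FIG. 9
    "circle_ccw": "sunroof_close_step",         # e.g. 5% per rotation, FIG. 10
}

def command_for(gesture: str) -> Optional[str]:
    """Return the vehicle command for a gesture, or None when nothing matches.

    None corresponds to the case where no signal is sent to the communicator
    and the gesture is simply recognized again.
    """
    return GESTURE_COMMANDS.get(gesture)
```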

As shown in FIG. 4, the microphone unit 150 includes a first microphone 151, a second microphone 153, and a third microphone 155.

The first microphone 151, the second microphone 153, and the third microphone 155 are each installed inside the vehicle and may be installed at different positions. For example, the first microphone 151 may be placed adjacent to the driver's seat, the second microphone 153 adjacent to the front passenger seat, and the third microphone 155 between the front and rear seats.

The microphone unit 150 picks up voice signals through each of the first microphone 151, the second microphone 153, and the third microphone 155. Of the sound picked up by these microphones, the microphone unit 150 detects only signals at or above a predetermined level in dB. That is, the microphone unit 150 ignores noise such as quiet sounds below a certain dB level at each of the first microphone 151, the second microphone 153, and the third microphone 155, and picks up only voice signals at or above that level.
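A level gate of this kind could look like the sketch below. It assumes samples normalized to the range -1 to 1, measures level as RMS in dB relative to full scale, and uses a -40 dB threshold; the threshold value, the function names, and the microphone keys are all hypothetical, as the patent states only that a predetermined dB level is used.

```python
import math

def level_db(samples):
    """RMS level in dB relative to full scale; -inf for silence or an empty frame."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")

def gate(frames_by_mic, threshold_db=-40.0):
    """Keep only the microphone frames whose level reaches the assumed threshold."""
    return {mic: frame for mic, frame in frames_by_mic.items()
            if level_db(frame) >= threshold_db}
```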

The voice preprocessor 160 preprocesses the voice signal picked up by the microphone unit 150. To this end, the voice preprocessor 160 includes a noise removal unit 161, a reverberation removal unit 163, a direction and position detection unit 165, and a beam forming unit 167.

The noise removal unit 161 removes noise from the voice signal picked up by the microphone unit 150 by removing its high-frequency and low-frequency components. The noise removal unit 161 sets a frequency range corresponding to the user's voice, and removes high-frequency signals above the set range and low-frequency signals below it.
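Removing everything outside a set voice band is a band-pass operation. The sketch below does this by zeroing discrete Fourier transform bins outside the band. The 300 to 3400 Hz band is an assumed example (the patent gives no numbers), and the direct O(n²) transform is chosen for self-containment rather than speed; it is not presented as the patent's method.

```python
import cmath
import math

def band_pass(samples, sample_rate, low_hz=300.0, high_hz=3400.0):
    """Zero all DFT bins outside [low_hz, high_hz] and transform back."""
    n = len(samples)
    # Forward DFT (direct evaluation; fine for short frames).
    spectrum = [sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)) for k in range(n)]
    for k in range(n):
        # Bins k and n-k represent the same physical frequency.
        freq = min(k, n - k) * sample_rate / n
        if freq < low_hz or freq > high_hz:
            spectrum[k] = 0
    # Inverse DFT; the result is real because the mask is symmetric.
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real for t in range(n)]
```

With an 8 kHz sample rate and an 80-sample frame, a 1 kHz tone passes through while 100 Hz and 3.9 kHz components are removed.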

The reverberation removal unit 163 removes reverberation, that is, sound that lingers after the original sound has stopped, from the voice signal within the frequency range set by the noise removal unit 161. Removing the reverberation in this way leaves only the user's voice clearly in the signal picked up by the microphone unit 150.

The direction and position detection unit 165 detects the direction and position of the user's voice signal by determining at which of the first microphone 151, the second microphone 153, and the third microphone 155 of the microphone unit 150 the dereverberated voice signal was picked up most loudly.

For example, when the voice signal is picked up most loudly at the first microphone 151 and progressively more quietly at the second microphone 153 and then the third microphone 155, the direction and position of the voice signal are detected as originating between the first microphone 151 and the second microphone 153 or at a position adjacent to the first microphone 151. Accordingly, when the first microphone 151 is placed close to the driver's seat, the direction and position detection unit 165 can detect that the voice signal originated from the driver's seat.
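A rough sketch of this loudest-microphone rule follows. The microphone keys, the seat labels, and the use of RMS as the loudness measure are assumptions for illustration; the sketch picks only the single loudest microphone and does not attempt the finer "between two microphones" estimate mentioned above.

```python
import math

# Hypothetical mapping from microphone to the nearest seating area.
SEAT_BY_MIC = {"mic1": "driver", "mic2": "front_passenger", "mic3": "between_rows"}

def rms(samples):
    """Root-mean-square amplitude of a frame (0.0 for an empty frame)."""
    return math.sqrt(sum(s * s for s in samples) / len(samples)) if samples else 0.0

def locate_speaker(frames_by_mic):
    """Return the seating area of the microphone that picked up the loudest signal."""
    loudest = max(frames_by_mic, key=lambda mic: rms(frames_by_mic[mic]))
    return SEAT_BY_MIC[loudest]
```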

The beam forming unit 167 forms a beam for the voice signal using the direction and position detected by the direction and position detection unit 165. The beam formed by the beam forming unit 167 is directed toward where the user's voice originated, so that the voice signal arriving from the direction of interest can be acquired selectively.

즉, 운전석에서 음성 신호가 발생한 경우, 빔 형성부(167)를 거침에 따라 운전석에서 발생된 음성 신호에 대해 보다 명확하게 음성 신호를 처리할 수 있다.That is, when a voice signal is generated in the driver's seat, as it passes through the beam forming unit 167, the voice signal can be more clearly processed for the voice signal generated in the driver's seat.
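
The specification does not state which beamforming method the beam forming unit 167 uses. As one possible illustration, the following Python sketch applies delay-and-sum beamforming, a common technique chosen here as an assumption: each microphone channel is shifted by the arrival delay implied by the detected direction, and the aligned channels are averaged. The function name `delay_and_sum` and the integer sample delays are hypothetical.

```python
def delay_and_sum(channels, delays):
    """Align each channel by its integer sample delay and average them.

    channels: list of equal-length sample lists, one per microphone.
    delays:   arrival delay of the target voice at each microphone, in samples.
    """
    length = len(channels[0]) - max(delays)
    output = []
    for n in range(length):
        # Reading channel m at index n + delays[m] undoes its arrival delay.
        total = sum(ch[n + d] for ch, d in zip(channels, delays))
        output.append(total / len(channels))
    return output


# Toy example: the same pulse reaches the three microphones 0, 1 and 2 samples late.
pulse = [0.0, 1.0, 0.0, -1.0, 0.0, 0.0, 0.0]
channels = [
    pulse,
    [0.0] + pulse[:-1],
    [0.0, 0.0] + pulse[:-2],
]
steered = delay_and_sum(channels, delays=[0, 1, 2])
unsteered = delay_and_sum(channels, delays=[0, 0, 0])
print(steered)
print(unsteered)
```

With the matching delays the pulse is recovered at full amplitude, whereas averaging without steering smears it across samples and lowers its peak. This is the sense in which a beam favors sound from the direction of interest.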

As described above, because the voice preprocessor 160 preprocesses the voice signal, the voice signal output from the voice preprocessor 160 is conditioned so that an optimal voice signal can be detected from it.

The voice signal detection unit 170 receives the voice signal preprocessed by the voice preprocessor 160 and may detect an optimal voice signal from it. For example, when the voice signal originated from the driver's seat, the voice signal detection unit 170 may detect the voice signal in a manner optimized for a voice signal originating from the driver's seat.

The voice pattern detection unit 180 compares the voice signal detected by the voice signal detection unit 170 with preset voice patterns and determines whether the detected voice signal corresponds to one of them. For this purpose, the voice pattern detection unit 180 includes a voice memory 181, in which the preset voice patterns may be stored.

The preset voice patterns stored in the voice memory 181 may be, for example: open the sunroof, sunroof open, please open the sunroof, close the sunroof, sunroof close, please close the sunroof, open the sunroof to stage 2, close the sunroof to stage 2, driver's seat reading light on, driver's seat light on, driver's seat lighting on, left light on, turn on the left lighting, turn on the left-side lighting, left-side light on, driver's seat reading light off, driver's seat light off, driver's seat lighting off, left light off, turn off the left lighting, turn off the left-side lighting, left-side light off, passenger seat reading light on, passenger seat light on, passenger seat lighting on, right light on, turn on the right lighting, turn on the right-side lighting, right-side light on, passenger seat reading light off, passenger seat light off, passenger seat lighting off, right light off, turn off the right lighting, turn off the right-side lighting, right-side light off, interior light on, interior lighting on, all lights on, all lighting on, turn all the lights on, turn every light on, interior light off, interior lighting off, all lights off, all lighting off, turn all the lights off, turn every light off, door interlock on, door interlock off, send SOS, SOS off, send SOS Test, SOS Test Off, save seat position 1, save seat position 2, set seat position 1, seat position 1 mode, change to seat position 1, set seat position 2, seat position 2 mode, change to seat position 2, seat relax mode, cooling seat high, turn the cooling seat up high, cooling seat medium, cooling seat low, turn the cooling seat on low, cool the seat, cool the chair, turn off the cooling seat, heated seat high, turn the heated seat up high, heated seat medium, heated seat low, set the heated seat to low, warm the seat, warm the chair, turn off the heated seat, thermostat on, set temperature xx, air conditioner on, air conditioner off, heater on, heater off, outside air intake mode, and outside air blocking mode. Several of these patterns are alternative phrasings of the same command.

The voice recognizer 190 checks, based on the comparison with the preset voice patterns, whether the detected voice signal matches a control command of the vehicle. When the voice signal detected by the voice signal detection unit 170 matches a control command of the vehicle, the voice recognizer 190 transmits the corresponding signal to the communication unit 200. When it does not match, the voice recognizer 190 does not forward the voice signal to the communication unit 200 and performs voice recognition again.
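
The match-then-forward behavior described above can be sketched in Python as a table lookup. This is an illustration under stated assumptions, not the disclosed implementation: the sketch works on already-transcribed text, the command identifiers such as `SUNROOF_OPEN` are hypothetical, and `send` stands in for whatever interface the communication unit 200 exposes.

```python
# A few of the preset patterns, keyed by the normalized utterance. The command
# identifiers on the right are hypothetical; the source does not define them.
VOICE_PATTERNS = {
    "open the sunroof": "SUNROOF_OPEN",
    "sunroof open": "SUNROOF_OPEN",
    "close the sunroof": "SUNROOF_CLOSE",
    "air conditioner on": "AIRCON_ON",
    "air conditioner off": "AIRCON_OFF",
}


def normalize(utterance):
    """Lower-case and collapse whitespace so spacing variants compare equal."""
    return " ".join(utterance.lower().split())


def recognize(utterance, send):
    """Forward the matching command through send(); return it, or None if no match."""
    command = VOICE_PATTERNS.get(normalize(utterance))
    if command is not None:
        send(command)  # only matched commands reach the communication side
    return command


sent = []
print(recognize("Open the  Sunroof", sent.append))
print(recognize("play some music", sent.append))
print(sent)
```

Storing several phrasings that map to one command mirrors the way the pattern list contains multiple wordings for the same function. An utterance with no entry is simply dropped, and the caller can listen again.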

The communication unit 200 transmits the gesture signal recognized by the gesture recognizer 140 and the voice signal recognized by the voice recognizer 190 to the vehicle electronic control unit (ECU) 10. The vehicle electronic control unit 10 then controls the functions of the vehicle in accordance with the gesture signal and the voice signal received from the communication unit 200.

FIG. 11 is a diagram for explaining a gesture recognition method of the voice and gesture recognition device according to an embodiment of the present invention.

Referring to FIG. 11, the gesture recognition method of the voice and gesture recognition device 100 will be described.

An electric field is generated at the transmission pad 110 (S101).

The transmission pad 110 emits a predetermined electric field in order to recognize gestures made by the user's hand. Accordingly, the transmission pad 110 conveys an electrical signal corresponding to the user's hand gesture to the receiving pad 120 through the distortion of the electric field.

The receiving pad 120 detects the sensed electrical signal (S103).

The receiving pad 120 sequentially detects the electrical signals sensed at the central part 121, the eastern part 123, the western part 125, the southern part 127, and the northern part 129. The order in which the signals of these five parts are detected may be changed.

The position of the detected electrical signal is recorded over time (S105).

The gesture detector 130 detects the depth and direction of the user's gesture at a predetermined time interval (e.g., 0.1 ms). The depth of the gesture is measured from the distance between the user's hand and the transmission pad 110, and the direction of the gesture is detected from the direction in which the user's hand moves in the plane.
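
The specification does not give formulas for deriving depth and direction from the five parts of the receiving pad 120. The following Python sketch shows one simple possibility under assumptions that are not in the source: each part reports a non-negative distortion magnitude that grows as the hand approaches it, the planar position is estimated from the east/west and north/south imbalance, and the summed magnitude serves as an uncalibrated proximity value standing in for depth. The function names are hypothetical.

```python
def hand_position(reading):
    """Estimate planar position (x, y) and a proximity value from one scan.

    reading: dict with keys "center", "east", "west", "south", "north" holding
    non-negative distortion magnitudes (larger = hand closer to that part).
    """
    total = sum(reading.values())
    if total <= 0:
        raise ValueError("no distortion sensed on any part of the pad")
    x = (reading["east"] - reading["west"]) / total
    y = (reading["north"] - reading["south"]) / total
    # total is only a relative proximity; turning it into a distance would
    # need calibration that the source does not describe.
    return x, y, total


def movement_direction(previous, current):
    """Return the dominant direction of motion between two successive scans."""
    x0, y0, _ = hand_position(previous)
    x1, y1, _ = hand_position(current)
    dx, dy = x1 - x0, y1 - y0
    if dx == 0 and dy == 0:
        return "none"
    if abs(dx) >= abs(dy):
        return "east" if dx > 0 else "west"
    return "north" if dy > 0 else "south"


scan_a = {"center": 4.0, "east": 1.0, "west": 3.0, "south": 1.0, "north": 1.0}
scan_b = {"center": 4.0, "east": 3.0, "west": 1.0, "south": 1.0, "north": 1.0}
print(movement_direction(scan_a, scan_b))
```

Repeating this comparison at each sampling interval yields the time series of positions that step S105 records.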

The electric field distortion caused by the detected gesture is compared with preset electric field patterns (S107).

The pattern in which the electric field is distorted by the detected gesture is compared with the preset electric field distortion patterns, and when it matches a specific pattern, a result indicating that the specific pattern was detected is output.

It is checked whether the electric field distortion pattern of the detected gesture matches a preset electric field distortion pattern (S109).

The preset electric field distortion patterns may be patterns learned with an artificial neural network and stored. The detected gesture of the user is compared with these preset patterns, and when it matches a specific pattern, a result indicating that the specific pattern was detected is output.
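
The comparison in steps S107 and S109 can be pictured with a small Python sketch. The source states that the preset patterns may be learned with an artificial neural network; the sketch below deliberately substitutes a much simpler stand-in, nearest-template matching with a distance threshold, purely to show the match-or-reject logic. The template names, the trajectory encoding (five (x, y) samples flattened into one vector), and the threshold value are all assumptions.

```python
import math

# Hypothetical templates: five (x, y) positions per gesture, flattened.
# Real patterns would come from training data, not hand-written values.
TEMPLATES = {
    "swipe_east": [-0.4, 0.0, -0.2, 0.0, 0.0, 0.0, 0.2, 0.0, 0.4, 0.0],
    "swipe_west": [0.4, 0.0, 0.2, 0.0, 0.0, 0.0, -0.2, 0.0, -0.4, 0.0],
    "swipe_north": [0.0, -0.4, 0.0, -0.2, 0.0, 0.0, 0.0, 0.2, 0.0, 0.4],
}


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def match_gesture(trajectory, templates=TEMPLATES, threshold=0.3):
    """Return the name of the closest template, or None if none is close enough."""
    best_name, best_dist = None, float("inf")
    for name, template in templates.items():
        d = distance(trajectory, template)
        if d < best_dist:
            best_name, best_dist = name, d
    return best_name if best_dist <= threshold else None


observed = [-0.38, 0.02, -0.21, 0.01, 0.03, -0.01, 0.19, 0.0, 0.41, 0.02]
print(match_gesture(observed))
print(match_gesture([0.0] * 10))
```

The threshold plays the role of the "matches a specific pattern" test: a trajectory that resembles no stored pattern yields no result, so nothing is passed on for recognition.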

The hand gesture is recognized (S111).

Recognition of the hand gesture is performed by the gesture recognizer 140, which recognizes the user's gesture when the gesture detector 130 reports that the gesture corresponds to a specific pattern.

It is checked whether the recognized hand gesture matches a control command of the vehicle (S113).

This check is also performed by the gesture recognizer 140, which determines whether the recognized gesture of the user is one that corresponds to a control command of the vehicle.

A communication command is transmitted (S115).

When the recognized gesture of the user matches a control command of the vehicle, the gesture recognizer 140 transmits the corresponding command to the communication unit 200.

FIG. 12 is a diagram for explaining a voice recognition method of the voice and gesture recognition device according to an embodiment of the present invention.

Voice signals are sequentially received (S201).

Voice signals are received sequentially by the first microphone 151, the second microphone 153, and the third microphone 155 installed inside the vehicle. The first microphone 151, the second microphone 153, and the third microphone 155 may be installed at different positions inside the vehicle.

Among the received voice signals, those at or above a specific dB level are detected (S203).

To minimize the reception of noise, the first microphone 151, the second microphone 153, and the third microphone 155 detect only voice signals corresponding to sounds at or above a predetermined dB level.
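
The level gate of step S203 can be sketched as follows. The specification gives neither the threshold nor the reference level, so the sketch assumes samples normalized to the range -1 to 1, measures each frame in decibels relative to full scale, and uses a hypothetical threshold of -30 dB. The function names `level_db` and `passes_gate` are likewise illustrative.

```python
import math


def level_db(samples, reference=1.0):
    """Frame level in decibels relative to `reference` (full scale by default)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # digital silence
    return 10.0 * math.log10(mean_square / (reference * reference))


def passes_gate(samples, threshold_db=-30.0):
    """True when the frame is at or above the (assumed) threshold level."""
    return level_db(samples) >= threshold_db


loud = [0.5, -0.5, 0.5, -0.5]        # about -6 dB relative to full scale
quiet = [0.01, -0.01, 0.01, -0.01]   # about -40 dB relative to full scale
print(passes_gate(loud), passes_gate(quiet))
```

Frames that fail the gate would be discarded before noise removal, so low-level background sound never enters the rest of the chain.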

Noise is removed from the detected voice signal (S205).

Noise here means the components of the signal that lie outside the predetermined frequency range corresponding to the user's voice. The noise removal unit 161 can therefore remove noise by removing frequencies above that range and frequencies below it.
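
Removing everything outside a voice band is a band-pass operation. The Python sketch below shows one way to do it, by zeroing the out-of-band bins of a discrete Fourier transform; the specification does not say how the noise removal unit 161 is built, and the 300 Hz to 3400 Hz band, the 8 kHz sample rate, and the function names are assumptions chosen for illustration.

```python
import cmath
import math


def band_pass(samples, sample_rate, low_hz, high_hz):
    """Zero every DFT bin outside [low_hz, high_hz] and transform back."""
    n = len(samples)
    # Forward DFT (naive O(n^2); adequate for a short illustrative frame).
    spectrum = [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in range(n):
        # Bins above n/2 mirror negative frequencies, so fold them back.
        freq = min(k, n - k) * sample_rate / n
        if not (low_hz <= freq <= high_hz):
            spectrum[k] = 0
    # Inverse DFT; the imaginary part is numerical noise for real input.
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


RATE, N = 8000, 160  # 20 ms frame, 50 Hz per DFT bin


def tone(freq_hz, amplitude=1.0):
    """One frame of a sine tone at the given frequency."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * t / RATE) for t in range(N)]


voice = tone(1000)  # in-band component standing in for speech
noisy = [v + h + w for v, h, w in zip(voice, tone(50, 0.5), tone(3800, 0.5))]
cleaned = band_pass(noisy, RATE, 300, 3400)
error = max(abs(c - v) for c, v in zip(cleaned, voice))
print(error < 1e-6)
```

The test tones fall exactly on DFT bins, so the 50 Hz and 3800 Hz components are removed cleanly. With real speech a windowed or time-domain filter would normally be preferred to avoid block-edge artifacts.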

Reverberation is removed from the noise-removed voice signal (S207).

The reverberation removal unit 163 removes reverberation from the noise-removed voice signal so that only the signal corresponding to the user's voice remains.

The direction and position of the voice signal from which reverberation has been removed are detected (S209).

The direction and position detection unit 165 detects the direction and position of the voice signal from which reverberation has been removed. It does so by determining which of the first microphone 151, the second microphone 153, and the third microphone 155 of the microphone unit 150 picked up the voice signal most loudly.

A beam is formed for the voice signal whose direction and position have been detected (S211).

The beam forming unit 167 forms a beam for the voice signal by using the direction and position detected by the direction and position detection unit 165.

An optimal voice signal is detected from the beamformed voice signal (S213).

The voice signal detection unit 170 receives the voice signal preprocessed by the voice preprocessor 160 and may detect an optimal voice signal from it.

The detected voice signal is compared with the voice patterns (S215).

The voice pattern detection unit 180 compares the voice signal detected by the voice signal detection unit 170 with the preset voice patterns and determines whether the detected voice signal corresponds to one of them.

It is checked whether the voice signal whose pattern has been confirmed is a command for vehicle control (S217).

The voice recognizer 190 checks, based on the comparison with the preset voice patterns, whether the detected voice signal matches a control command of the vehicle.

The confirmed voice signal is transmitted (S219).

When the voice signal detected by the voice signal detection unit 170 matches a control command of the vehicle, the voice recognizer 190 transmits the corresponding signal to the communication unit 200. When it does not match, the voice recognizer 190 does not forward the voice signal to the communication unit 200 and performs voice recognition again.

As described above, the present invention has been described in detail through embodiments with reference to the accompanying drawings. However, the embodiments described above are merely preferred examples of the present invention, so the present invention should not be understood as limited to those embodiments. The scope of the present invention should be understood from the claims below and their equivalents.

100: voice and gesture recognition device
110: transmission pad
120: receiving pad
121: central part
123: eastern part
125: western part
127: southern part
129: northern part
130: gesture detector
131: depth detection unit
133: direction detection unit
140: gesture recognizer
150: microphone unit
151: first microphone
153: second microphone
155: third microphone
160: voice preprocessor
161: noise removal unit
163: reverberation removal unit
165: direction and position detection unit
167: beam forming unit
170: voice signal detection unit
180: voice pattern detection unit
181: voice memory
190: voice recognizer
200: communication unit
10: vehicle electronic control unit

Claims

A voice and gesture recognition device, comprising:
a transmission pad that emits an electric field to sense a gesture of a user's hand;
a receiving pad that detects distortion of the electric field emitted from the transmission pad;
a gesture recognizer that recognizes the gesture from the distortion of the electric field at the receiving pad;
a microphone unit that detects a voice signal of the user;
a voice preprocessor that performs preprocessing, including noise removal, on the voice signal detected by the microphone unit;
a voice recognition unit that recognizes the voice signal preprocessed by the voice preprocessor; and
a communicator that transmits a signal for the gesture recognized by the gesture recognizer and the voice signal recognized by the voice recognition unit to an electronic control unit of a vehicle.

The voice and gesture recognition device of claim 1,
further comprising a gesture detector that detects the depth and direction of the gesture from the distortion of the electric field at the receiving pad,
wherein the gesture recognizer recognizes the gesture by using the depth and direction detected by the gesture detector.

The voice and gesture recognition device of claim 2,
wherein the gesture detector detects a gesture pattern by detecting the depth and direction of the gesture, and compares the detected gesture pattern with a preset gesture pattern, and
wherein the gesture recognizer recognizes the gesture when the gesture pattern detected by the gesture detector matches the preset gesture pattern.

The voice and gesture recognition device of claim 1,
wherein the microphone unit includes a plurality of microphones, and
wherein the plurality of microphones are disposed at different positions inside the vehicle.

The voice and gesture recognition device of claim 4,
wherein the voice preprocessor comprises:
a noise removal unit that removes noise from the voice signal detected by the microphone unit;
a reverberation removal unit that removes reverberation from the voice signal from which the noise has been removed by the noise removal unit;
a direction and position detection unit that detects the direction and position of the voice signal from which the reverberation has been removed by the reverberation removal unit; and
a beam forming unit that forms a beam for the voice signal whose direction and position have been detected by the direction and position detection unit.

The voice and gesture recognition device of claim 5,
wherein the noise removal unit removes noise by removing signals at frequencies higher than the frequency range of the user's voice signal and signals at frequencies lower than that range.

The voice and gesture recognition device of claim 5,
further comprising a voice signal detection unit that detects, from the voice signal preprocessed by the voice preprocessor, a voice signal in which the direction and position of the user's voice signal have been taken into account,
wherein the voice recognition unit recognizes the voice signal in which the direction and position have been taken into account by the voice signal detection unit.

The voice and gesture recognition device of claim 7,
further comprising a voice pattern detection unit that compares the voice signal detected by the voice signal detection unit with a preset voice signal pattern and outputs the result,
wherein the voice recognition unit recognizes the detected voice signal that the voice pattern detection unit has found to correspond to the preset voice signal pattern.