KR101653235B1

KR101653235B1 - Apparatus and method for recognizing gesture

Info

Publication number: KR101653235B1
Application number: KR1020160029869A
Authority: KR
Inventors: 김도형; 윤우한; 이재연; 김혜진; 윤영우; 윤호섭; 지수영
Original assignee: 한국전자통신연구원
Priority date: 2016-03-11
Filing date: 2016-03-11
Publication date: 2016-09-12
Also published as: KR20160034275A

Abstract

A gesture recognition apparatus is disclosed. The apparatus includes a human detection unit that detects a user's face region from an input image; a gesture region setting unit that sets, based on the detected face region, a gesture region in which a gesture of the user's arm occurs; an arm detection unit that detects the user's arm region within the gesture region; and a gesture determination unit that determines the user's target gesture by analyzing the position, movement direction, and shape information of the arm region within the gesture region. Such an apparatus can serve as a useful means of human-robot interaction at long distances where it is difficult for the robot to recognize the user's voice.

Description

APPARATUS AND METHOD FOR RECOGNIZING GESTURE

The present invention relates to an apparatus for recognizing a user's gestures, and more particularly to a gesture recognition apparatus that can recognize a user's gestures in an everyday environment in which the user behaves freely, without imposing any constraints on the user for the sake of gesture recognition.

Humans can exchange a great deal of information through non-verbal means (hereinafter, gestures) such as facial expressions, hand movements, gaze direction, and head motions. Applying gesture-based information exchange to Human-Robot Interaction (HRI) technology makes a more human-friendly HRI possible. From this perspective, gesture recognition is one of the most closely watched technologies in HRI.

Gesture recognition technologies can be classified, by data acquisition method, into sensor-based gesture recognition, in which sensors are attached to the body, and vision-based gesture recognition, which uses a video camera. Vision-based gesture recognition is further classified as 2D or 3D recognition according to the dimensionality of the input data, or as hand gesture recognition, upper-body recognition, whole-body motion recognition, and so on according to the body range to be recognized.

However, existing HRI technologies based on gesture recognition have the following limitations as methods of recognizing gestures for long-distance interaction between a robot and a human.

First, existing technologies neither propose meaningful gestures for long-distance interaction between a robot and a human nor provide a method of recognizing them. That is, existing technologies recognize only hand motions at close range, and at long range they focus on situation awareness through recognition of upper-body and whole-body motions rather than on robot-human interaction. Consequently, there have been few attempts at vision-based gesture recognition for HRI at long distances.

Second, existing technologies require high-resolution input images to recognize the upper-body or whole-body gestures of a distant person, or require two or more cameras and corresponding equipment to acquire 3D information. It is therefore difficult to implement a low-cost system built around a single camera.

Third, in systems that use only a single camera, existing technologies mostly fix the camera in place so that the input image can be extracted easily. It is therefore difficult to apply gesture-recognition-based HRI technology to a robot platform on which the camera moves.

Fourth, existing technologies impose many constraints on the user to secure the stability of the gesture recognition system. For example, the user often has to wear aids such as gloves or clothes of a specific color so that the system can tell when a gesture starts and ends. In an everyday robot service environment where users behave freely, however, such cooperation for the sake of stable recognition cannot be expected.

Existing gesture-recognition-based HRI technologies are therefore limited in their ability to provide a meaningful gesture recognition method for long-distance interaction between a robot and a human.

An object of the present invention is to provide a gesture recognition apparatus that can recognize a user's gestures from low-resolution images when the robot and the user are far apart, and a gesture recognition method using a robot system.

To achieve the above object, a gesture recognition apparatus according to one aspect of the present invention includes: a gesture region setting unit that sets, based on a user's face region detected from an input image, a gesture region in which a gesture of the user's arm occurs; an arm detection unit that detects the user's arm region within the gesture region; and a gesture determination unit that determines the user's target gesture, the target gestures including a Waving gesture, a Calling gesture, a Raising gesture, and a Stopping gesture, by analyzing the position, movement direction, and shape information of the arm region within the gesture region. The gesture determination unit includes: a region analysis unit that determines whether the user's arm region is located within the gesture region and, according to the result, distinguishes the target gesture from noise gestures corresponding to the user's everyday behavior; a motion analysis unit that analyzes the movement direction of the user's arm region to determine the Waving gesture and the Calling gesture; and a shape analysis unit that analyzes the shape information, which includes the relative length ratio and angle of the user's two arms, to determine the Raising gesture and the Stopping gesture.

A gesture recognition method using a robot system according to another aspect of the present invention includes: detecting a user's face region from an input image; calculating, at a predetermined ratio according to the position and size of the detected face region, the size of a gesture region in which a gesture of the user's arm occurs; acquiring a background-separated image that contains the user's arm region within the calculated gesture region; detecting the user's arm region within the gesture region using the acquired background-separated image; and determining the user's target gesture, the target gestures including a Waving gesture, a Calling gesture, a Raising gesture, and a Stopping gesture, by analyzing the position, movement direction, and shape information of the arm region within the gesture region. The step of determining the user's target gesture includes: determining whether the user's arm region is located within the gesture region and, according to the result, distinguishing the target gesture from noise gestures corresponding to the user's everyday behavior; analyzing the movement direction of the user's arm region to determine the Waving gesture and the Calling gesture; and analyzing the shape information, which includes the relative length ratio and angle of the user's two arms, to determine the Raising gesture and the Stopping gesture.

According to the present invention, four gestures for long-distance interaction (Waving, Calling, Raising, and Stopping) can be recognized even when the robot and the user are far apart.

In addition, recognizing these gestures imposes no constraints on the user, and the four defined gestures can be distinguished from everyday actions that the user may perform.

The present invention can therefore serve as a useful means of human-robot interaction at long distances (about 4-5 m) where speech recognition is difficult.

FIG. 1 shows an example of a user's target gestures recognized by a gesture recognition apparatus according to an embodiment of the present invention.
FIG. 2 shows an example of a user's noise gestures recognized by a gesture recognition apparatus according to an embodiment of the present invention.
FIG. 3 is an overall block diagram of a gesture recognition apparatus according to an embodiment of the present invention.
FIG. 4 shows the region in which a target gesture can occur according to an embodiment of the present invention.
FIGS. 5A to 5C show the process of separating the user's arm region within the ROI using a background separation technique according to an embodiment of the present invention.
FIG. 6 is a flowchart of the operation of the background image acquisition unit shown in FIG. 3.
FIG. 7 is a lookup table used to distinguish target gestures from noise gestures according to an embodiment of the present invention.
FIG. 8A is a flowchart of the motion gesture analysis performed by the motion analysis unit shown in FIG. 3.
FIG. 8B shows the y-coordinate of the fingertip in the fingertip-coordinate detection step of FIG. 8A.
FIGS. 8C to 8E show input images that illustrate the movement direction of the fingertip.
FIG. 9A shows a movement direction code table for analyzing the movement direction of the fingertip according to an embodiment of the present invention.
FIG. 9B is a graph of the cumulative histogram of movement directions used to analyze the movement direction of the fingertip according to an embodiment of the present invention.
FIG. 9C shows the frequency of code values and the fingertip displacement as a function of image acquisition speed according to an embodiment of the present invention.
FIG. 10 is a flowchart of the process by which the shape analysis unit shown in FIG. 3 determines the Raising gesture and the Stopping gesture.
FIG. 11 illustrates the relative length analysis of the two arm regions shown in FIG. 10.
FIG. 12 shows the determination conditions for the Raising gesture according to an embodiment of the present invention.
FIG. 13 shows the determination conditions for the Stopping gesture according to an embodiment of the present invention.

The gesture recognition apparatus of the present invention can be applied in various technical fields that involve recognizing a user, such as intelligent robots and security surveillance systems. In this embodiment, the gesture recognition apparatus is assumed to be mounted on an intelligent robot that has a means of locomotion.

At close range, the robot can recognize the user's voice, so the robot and the user can communicate by speech. At long range, where speech recognition is difficult, recognizing the user's gestures can be a useful means of communication. The gesture recognition apparatus of the present invention therefore proposes a way to recognize the user's intent through the user's gestures (for example, arm gestures).

The gesture recognition apparatus of the present invention also provides a way to recognize the user's gestures for robot-user interaction when the robot and the person are about 4-5 m apart, and at the same time a way to recognize those gestures from low-resolution input images acquired from a single camera.

Furthermore, the gesture recognition apparatus of the present invention does not require the user to accept any constraints for gesture recognition, and it presents a way to distinguish the gestures it recognizes from the gestures that occur in the user's everyday behavior.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings.

FIG. 1 shows the user's target gestures recognized by a gesture recognition apparatus according to an embodiment of the present invention, and FIG. 2 shows noise gestures of the user, which are not target gestures recognized by the apparatus.

Referring to FIG. 1, a gesture recognition apparatus according to an embodiment of the present invention sets specific user gestures as recognition targets for human-robot interaction at long distances. In this embodiment, the gesture recognition apparatus 100 recognizes a total of four target gestures: motion gestures, comprising a Waving gesture 12 and a Calling gesture 14, and non-motion gestures, comprising a Raising gesture 16 and a Stopping gesture 18. The motion gestures, that is, the Waving gesture 12 and the Calling gesture 14, may be performed with either the right arm or the left arm depending on the user's habit. In this embodiment, the Waving gesture 12 and the Calling gesture 14 are defined as gestures made with the right hand.

The Waving gesture 12 is the act of the user waving the right arm from side to side, and is a gesture for attracting the attention of a distant robot. For example, the Waving gesture 12 can be used to convey a service request to a distant robot (for example, "LOOK AT ME") or to say no (for example, "NO"). That is, the Waving gesture 12 is a gesture by which the user's intent is confirmed before the robot provides an active service to the user.

The Calling gesture 14 is a gesture in which the user waves the right arm up and down. For example, the Calling gesture 14 can be used to tell a distant robot to come closer to the user (for example, "COME TO ME") or to follow the user (for example, "FOLLOW ME").

The Raising gesture 16 is a gesture in which the user raises the right arm and holds it still for a certain time (about 2-3 seconds). For example, the Raising gesture 16 can be used by the user to make the robot recognize the user (for example, "IT'S ME"), or to answer yes to the robot (for example, "YES") before the robot provides an active service to the user.

The Stopping gesture 18 is a gesture in which the user raises both arms to face height and holds them still for a certain time (about 2-3 seconds). For example, while a distant robot is providing some service for the user, the Stopping gesture 18 can be used to ask the robot to stop that service (for example, "STOP IT").

The four target gestures defined so far (the Waving gesture 12, the Calling gesture 14, the Raising gesture 16, and the Stopping gesture 18) are gestures by which the user expresses intent to the robot. It is therefore desirable for the user to face the robot. However, facing the robot should be understood only as a natural action that occurs when the user begins one of the gestures the robot can recognize, and not as a constraint imposed on the user.
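The four target gestures and the example meanings given above can be collected in a small lookup structure. The following is only an illustrative sketch: the names `TargetGesture`, `kind`, and `meanings` are hypothetical and do not appear in the patent.

```python
from enum import Enum


class TargetGesture(Enum):
    """The four target gestures, with the example meanings from the text.

    Each value is (kind, meanings); Enum unpacks the tuple into __init__.
    """
    WAVING = ("motion", ("LOOK AT ME", "NO"))
    CALLING = ("motion", ("COME TO ME", "FOLLOW ME"))
    RAISING = ("non-motion", ("IT'S ME", "YES"))
    STOPPING = ("non-motion", ("STOP IT",))

    def __init__(self, kind, meanings):
        self.kind = kind          # "motion" or "non-motion"
        self.meanings = meanings  # example intents given in the description


# List the gestures that are judged from arm shape rather than arm motion.
static_gestures = [g.name for g in TargetGesture if g.kind == "non-motion"]
```

Grouping the gestures by `kind` mirrors the split described in the text between motion gestures and non-motion gestures.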

As described above, the gesture recognition apparatus according to an embodiment of the present invention recognizes the four target gestures in a robot environment in which the user behaves freely. The user may, however, perform actions that correspond not to the four target gestures but to everyday gestures (noise gestures). The gesture recognition apparatus according to an embodiment of the present invention therefore provides a way to distinguish the four target gestures from the six everyday gestures (21, 22, 23, 24, 25, 26) shown in FIG. 2.

FIG. 3 is an overall block diagram of a gesture recognition apparatus according to an embodiment of the present invention.

Referring to FIG. 3, a gesture recognition apparatus according to an embodiment of the present invention includes an image input unit 110, a human detection unit 120, a gesture region setting unit 130, a background image acquisition unit 140, an arm detection unit 150, and a gesture determination unit 160.

The image input unit 110 receives low-resolution video from a single camera installed in an arbitrary environment or on a robot, and sequentially generates a plurality of image frames. The human detection unit 120 detects whether a user is present in each sequentially generated image frame and, if a user is present, detects the position and size of the user's face. The gesture region setting unit 130 sets a gesture region, in which the user's gestures can occur, based on the position of the user's face region detected by the human detection unit 120. The background image acquisition unit 140 acquires a background image in order to obtain the user's arm region within the set gesture region. The arm detection unit 150 detects the arm region present in the background image acquired by the background image acquisition unit 140. The gesture determination unit 160 analyzes the region in which the arm is located, the motion of the arm, and the shape of the arm within the arm region detected by the arm detection unit 150, and recognizes the result of the analysis as the user's gesture.

Hereinafter, the gesture recognition apparatus 100 according to an embodiment of the present invention is described in more detail.

The image input unit 110 generates low-resolution images with a pixel resolution of 320x240 through a single built-in camera. Here, 320 is the number of horizontal pixels and 240 is the number of vertical pixels.

The human detection unit 120 receives the image frames in succession and, when a user is present in an image frame, detects the position of the user's face. The human detection unit 120 is a module for reliably detecting and tracking the face of a distant user; it continuously computes the position of the user's face region by combining the results of Mean Shift color tracking, short-range face detection, omega detection, and long-range face detection.

Specifically, the human detection unit 120 includes a face tracking initialization unit 122 and a long-distance face tracking unit 124.

The face tracking initialization unit 122 operates as follows. First, it receives a plurality of image frames 11 in succession, frame by frame, through the image input unit 110. A motion region is then detected from the difference image between consecutively generated image frames.
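The frame-differencing step can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes grayscale frames stored as nested lists, a hypothetical intensity `threshold`, and a bounding box as the representation of the motion region, none of which are specified in the source.

```python
def motion_bounding_box(prev_frame, curr_frame, threshold=25):
    """Return (left, top, right, bottom) enclosing every pixel whose absolute
    grayscale difference between two frames exceeds `threshold`.

    Returns None when no pixel changed by more than the threshold.
    The threshold value and the bounding-box output are assumptions.
    """
    xs, ys = [], []
    for y, (prev_row, curr_row) in enumerate(zip(prev_frame, curr_frame)):
        for x, (p, c) in enumerate(zip(prev_row, curr_row)):
            if abs(c - p) > threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


# Two tiny 4x6 frames: two pixels brighten between them.
previous = [[0] * 6 for _ in range(4)]
current = [row[:] for row in previous]
current[1][2] = 200
current[2][4] = 200
box = motion_bounding_box(previous, current)
```

Because only a pair of consecutive frames is compared, a sketch like this does not depend on a fixed background model.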

Within the detected motion region, an upper-body region, in which a face and an omega shape are likely to be present, is set according to a predetermined ratio. The omega shape is so named because the contour joining the user's head and shoulders resembles the letter omega (Ω).

Face detection is performed within the upper-body region to verify whether the moving object is a person.

The face is a characteristic feature that distinguishes whether an object is a person. In this embodiment, face detection is performed using the AdaBoost technique. The AdaBoost-based face detector of this embodiment is trained so that faces with a resolution of at least 20 x 20 pixels are detected reliably.

Face detection using the AdaBoost technique is described in detail in B. Jun and D. Kim, "Robust real-time face detection using face certainty map," Proceedings of the 2nd Int'l Conf. on Biometrics, vol. 4642, pp. 29-38, 2007, so a detailed description is omitted here.

When the AdaBoost-based face detection succeeds, the detected face region is set as the tracking start region.

The long-distance face tracking unit 124 performs tracking starting from the tracking start region thus set. If face detection fails, detection of the omega (Ω) shape is performed within the upper-body region set above.

The omega shape is detected less accurately than the face. However, it can be detected even when a person in front of the camera is facing away, that is, when the face is not visible, and it is a feature larger than the face. It can therefore be detected even at long distances.

In this embodiment, the AdaBoost technique can also be used for the omega detection described above. The omega detector of this embodiment is trained so that omega shapes with a resolution of at least 24 x 24 pixels can be detected.

When the omega shape is detected successfully, that is, when the moving object is verified to be a person, the face (or back-of-head) region is estimated proportionally from the omega region.

The estimated face (or back-of-head) region is set as the tracking start region provided to the long-distance face tracking unit 124.

In this way, the human detection unit 120 of this embodiment restricts the search region to the upper-body region when detecting the face region and the omega region. As a result, detection speed is improved and the likelihood of false detections is reduced as far as possible.

If detection of both the face region and the omega region fails, the next image frame is received and the sequence of verifying whether a person is present and setting the tracking start region is repeated. Once the face tracking initialization unit 122 has confirmed that a user is present and set the tracking start region accordingly, the long-distance face tracking unit 124 performs a series of face tracking steps, starting with the frame after the current image frame in which the tracking start region was set.
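The initialization logic described above (try face detection, fall back to omega detection, otherwise move on to the next frame) can be sketched as a simple loop. This is an illustration under stated assumptions: `detect_face` and `detect_omega` are hypothetical callables standing in for the trained detectors, boxes are `(x, y, width, height)` tuples, and the proportions used in `face_from_omega` are invented for the example because the patent does not state them.

```python
def face_from_omega(omega):
    """Estimate a face (or back-of-head) box from an omega box.

    Assumed proportions: the face is half as wide as the omega shape,
    horizontally centred, square, and aligned with the top of the omega box.
    """
    x, y, w, h = omega
    face_w = w // 2
    return (x + (w - face_w) // 2, y, face_w, face_w)


def initialize_tracking(frames, detect_face, detect_omega):
    """Return (frame_index, tracking_start_region), or None if no person is found.

    For each frame, face detection is tried first; omega detection is used
    only when face detection fails; if both fail, the next frame is examined.
    """
    for index, frame in enumerate(frames):
        face = detect_face(frame)
        if face is not None:
            return index, face
        omega = detect_omega(frame)
        if omega is not None:
            return index, face_from_omega(omega)
    return None


# Stub detectors: no face is ever found; an omega shape appears in frame "f1".
result = initialize_tracking(
    ["f0", "f1", "f2"],
    detect_face=lambda frame: None,
    detect_omega=lambda frame: (10, 20, 48, 48) if frame == "f1" else None,
)
```

Tracking would then begin from the frame following the returned index, using the returned region as the tracking start region.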

Based on the region set by the face tracking initialization unit 122, the long-distance face tracking unit 124 continuously computes the position of the tracked face region by combining the results of Mean Shift color tracking, short-range face detection, omega detection, and long-range face detection.

Specifically, when the remote face tracking unit 124 receives the tracking start region and an image frame, it first performs color tracking using the Mean Shift method (D. Comaniciu and P. Meer, "Mean shift: a robust approach toward feature space analysis," IEEE Trans. on PAMI, vol. 24, no. 5, May 2002).

Color tracking requires an initial color model of the target to be tracked. The color model is built from the colors inside the tracking start region set by the face tracking initialization unit 122. That is, if the face tracking initialization unit 122 detected a face, the initial model is built from the colors in the detected face region; if the face (or back-of-head) region was estimated by omega detection, the initial model is built from the colors in the estimated region. Mean Shift based color tracking returns the region whose colors are most similar to the color model. Because it relies on color information, it is vulnerable to changes in illumination, and its accuracy is low when the background contains colors similar to the tracked color. In this embodiment, therefore, Mean Shift based color tracking is not used to detect the face (or head) itself but to set the search region for face shape detection and omega shape detection. Once the face search region is set, face detection is performed within it. This face detection can find face regions as small as 20 x 20 pixels, the same as in the face tracking initialization unit 122.

Continuing with FIG. 3, when the human detection unit 120 detects the user's face region, the gesture region setting unit 130 uses the detected face region as a reference to set regions of interest (hereinafter, ROI regions) in which the aforementioned target gestures (12, 14, 16, 18) are likely to occur.

FIG. 4 is a diagram showing the regions in which a target gesture can occur according to an embodiment of the present invention.

Referring to FIG. 4, the gesture region setting unit 130 sets a head region (HR) that contains the face region (FR) detected by the human detection unit 120, and then sets a peripheral region adjacent to the head region.

The peripheral region is the region in which a target gesture is likely to occur, and it includes five ROI regions.

Specifically, the peripheral region includes five ROI regions: a left upper region 31 (hereinafter, LU region), a right upper region 32 (hereinafter, RU region), a center upper region 33 (hereinafter, CU region) located between the LU region and the RU region, a left lower region 34 (hereinafter, LL region) adjacent to the bottom of the LU region 31, and a right lower region 35 (hereinafter, RL region) adjacent to the bottom of the RU region 32.

The size of each of the regions 31 to 35 is calculated at a preset ratio according to the position and size of the detected face region. This embodiment assumes that the four target gestures appear only within the ROI regions that have been set and never outside them.

Referring again to FIG. 3, once the gesture region setting unit 130 has set the ROI regions 31 to 35, the background image acquisition unit 140 acquires a background image for each of the ROI regions 31 to 35.

FIGS. 5A to 5C are diagrams showing the process of separating the user's arm region within an ROI region by the background subtraction technique.

Referring to FIGS. 5A to 5C, the background image acquisition unit 140 uses the background subtraction technique to check whether the user's arm is present in the ROI regions. Because background subtraction is a well-known technique, a detailed description of it is omitted.

As shown in FIG. 5A, if the user's arm is not in the ROI regions, the current image becomes the updated background image. If the user's arm is present in the ROI region 34, as shown in FIG. 5B, the background subtraction technique produces a final image in which the arm region is separated, as shown in FIG. 5C. That is, subtracting the gray value of each pixel of the background image of FIG. 5A from the gray value of the corresponding pixel of the image of FIG. 5B yields a final image, as in FIG. 5C, in which only the arm region appears.
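The pixel-wise subtraction described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: images are plain nested lists of gray values, and the function name `subtract_background` and the `threshold` parameter are assumptions introduced here (the text only describes taking the difference of gray values).

```python
def subtract_background(current, background, threshold=30):
    """Return a binary mask that is 1 where the gray values differ by more than threshold.

    `threshold` is an assumed noise margin; it is not specified in the description.
    """
    return [
        [1 if abs(c - b) > threshold else 0 for c, b in zip(cur_row, bg_row)]
        for cur_row, bg_row in zip(current, background)
    ]


# Tiny 2 x 3 grayscale example: the middle column is covered by an "arm".
background = [[10, 10, 10], [10, 10, 10]]   # plays the role of FIG. 5A (no arm)
current = [[10, 200, 10], [10, 210, 12]]    # plays the role of FIG. 5B (arm present)
arm_mask = subtract_background(current, background)
print(arm_mask)  # [[0, 1, 0], [0, 1, 0]]
```

The resulting mask corresponds to the kind of arm-only image shown in FIG. 5C.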

FIG. 6 is a flowchart showing the operation of the background image acquisition unit shown in FIG. 3.

Referring to FIG. 6, when the background image acquisition unit 140 receives from the gesture region setting unit 130 an image frame in which the regions of interest and the face region have been set, it first determines whether the user is stationary or moving (S610).

In this embodiment, the gesture recognition apparatus 100 of FIG. 1 recognizes gestures that the user makes toward the robot while standing still, so if the user is moving, the previously acquired background image is deleted (S620).

The user is judged to be moving when the area of overlap between the head region set in the current image and the head region in the background image is smaller than a preset threshold.
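A minimal sketch of this overlap test is given below. The rectangle format `(x, y, width, height)`, the function names, and the example threshold are assumptions made for illustration; the description only states that the overlap area is compared with a preset threshold.

```python
def overlap_area(a, b):
    """Area of intersection of two axis-aligned rectangles given as (x, y, width, height)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    w = min(ax + aw, bx + bw) - max(ax, bx)
    h = min(ay + ah, by + bh) - max(ay, by)
    return max(w, 0) * max(h, 0)


def user_is_moving(head_now, head_in_background, min_overlap):
    """The user counts as moving when the two head regions overlap by less than min_overlap."""
    return overlap_area(head_now, head_in_background) < min_overlap


# Head barely shifted: large overlap, so the user is treated as stationary.
print(user_is_moving((0, 0, 10, 10), (1, 1, 10, 10), min_overlap=50))    # False
# Head far from where it was when the background was stored: treated as moving.
print(user_is_moving((0, 0, 10, 10), (20, 20, 10, 10), min_overlap=50))  # True
```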

When the user is stationary, the ROI regions are set based on the detected face region. If the user's face is at the edge of the screen, or the user is too close to the robot, the ROI regions extend beyond the image, and it cannot be determined whether the user's arm is making a gesture. Therefore, if the size of the ROI region is smaller than a preset threshold (V), the background image acquisition step is skipped and the background image acquisition unit 140 receives a new full image (S630).

Next, when the user is stationary and the ROI regions are stably secured, the background image acquisition unit 140 checks whether an arm is present in the ROI regions. To do so, a difference image analysis and a background subtraction image analysis are performed (S640, S650).

The difference image analysis determines whether motion has occurred from the pixel differences between the current image and the previous image. Here, the pixel difference is the difference between the gray value of a pixel in the current image and the gray value of the corresponding pixel in the previous image (S640).

If no motion occurs, it can be assumed that there is no arm in the ROI region. However, if an arm that was moving within the ROI region has stopped and is holding still, difference image analysis alone cannot tell whether the arm is present. In this case, the presence of a motionless arm can be confirmed by analyzing the background subtraction image, which is the pixel difference between the previously set background image and the current image (S670).

If there is no arm, either moving or stationary, in the ROI region, the current image is finally set (updated) as the background image (S680).
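The decision in steps S640 to S680 can be sketched roughly as below. This is an illustrative simplification under stated assumptions: images are nested lists of gray values for a single ROI region, and the names `changed_pixels` and `next_background` as well as the parameters `pixel_threshold` and `min_pixels` are hypothetical and do not come from the description.

```python
def changed_pixels(img_a, img_b, pixel_threshold):
    """Count pixels whose gray values differ by more than pixel_threshold."""
    return sum(
        1
        for row_a, row_b in zip(img_a, img_b)
        for a, b in zip(row_a, row_b)
        if abs(a - b) > pixel_threshold
    )


def next_background(current, previous, background, pixel_threshold=30, min_pixels=2):
    """Return the background to keep after seeing `current`.

    Frame differencing (current vs. previous) catches a moving arm;
    background subtraction (current vs. background) catches an arm held still.
    The background is replaced by the current image only when neither fires.
    """
    moving_arm = changed_pixels(current, previous, pixel_threshold) >= min_pixels
    still_arm = (
        background is not None
        and changed_pixels(current, background, pixel_threshold) >= min_pixels
    )
    if moving_arm or still_arm:
        return background
    return current


empty = [[10, 10, 10], [10, 10, 10]]
arm = [[10, 200, 10], [10, 210, 10]]
quiet = [[11, 10, 10], [10, 10, 10]]

print(next_background(arm, empty, empty) is empty)    # True: arm entering, background kept
print(next_background(arm, arm, empty) is empty)      # True: motionless arm, background kept
print(next_background(quiet, empty, empty) is quiet)  # True: no arm, background updated
```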

Once the background image has been acquired, the arm region is detected by the background subtraction technique, as in FIG. 5C.

Conventional techniques acquire the background image only once, at the start, from a full image with no user in it, so they cannot be applied in a robot environment where the camera moves. In contrast, this embodiment checks for the presence of an arm only in the ROI regions set relative to the user's face region, and uses the result to update the background image adaptively.

Therefore, the robot and the user may move freely, and as long as the robot stays still at the moment the user makes a gesture, the gesture recognition apparatus 100 shown in FIG. 1 can also be applied in a robot environment.

Referring again to FIG. 3, once an arm region in one of the ROI regions has been stably secured, as in FIG. 5C, the gesture judging unit 160 determines whether a gesture has occurred. For this purpose, the gesture judging unit 160 includes a region analyzing unit 162, a motion analyzing unit 164, and a shape analyzing unit 166.

The region analyzing unit 162 analyzes in which of the five ROI regions 31 to 35 set in FIG. 4 the detected arm region (or arm blob) is located, and uses a lookup table such as the one shown in FIG. 7 to filter out noise gestures, which are ordinary user behavior rather than target gestures.

FIG. 7 is the lookup table used to distinguish target gestures from noise gestures. In the table, the symbol '○' indicates that an arm blob is present in the corresponding region of interest, and the symbol '×' indicates that no arm blob is present there. The symbol '－' indicates that an arm blob may or may not be present in the region; that is, it denotes a don't-care state.

Referring to FIGS. 7 and 4, when the Waving gesture (12 in FIG. 1) is made with the right hand, an arm blob always appears in the LL region (34 in FIG. 4) and does not appear in the RU region (32 in FIG. 4) or the RL region (35 in FIG. 4).

In the LU region (31 in FIG. 4) or the CU region (33 in FIG. 4), an arm blob may or may not appear, depending on the user's habits.

When an arm blob is detected in this way, the ROI regions are analyzed, and an arm blob that satisfies none of the conditions for the four gestures specified in the lookup table is judged to be a noise gesture.
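Matching against a lookup table with don't-care entries can be sketched as follows. Only the Waving row is filled in, because it is the only row spelled out in the text; the rows for the other three gestures appear in FIG. 7 and are deliberately left out here. The dictionary layout and the function names are assumptions for illustration.

```python
# True = arm blob required ('○'), False = arm blob forbidden ('×'), None = don't care ('－').
# Only the Waving row is described in the text; other rows would be added from FIG. 7.
LOOKUP_TABLE = {
    "waving": {"LU": None, "CU": None, "RU": False, "LL": True, "RL": False},
}


def matches(pattern, observed):
    """Check one table row against the observed blob presence per ROI region."""
    return all(
        required is None or observed[region] == required
        for region, required in pattern.items()
    )


def candidate_gestures(observed, table=LOOKUP_TABLE):
    """Return the gestures whose row is satisfied, or ['noise'] if no row is satisfied."""
    names = [name for name, pattern in table.items() if matches(pattern, observed)]
    return names or ["noise"]


blob_in_ll_and_lu = {"LU": True, "CU": False, "RU": False, "LL": True, "RL": False}
blob_in_ru = {"LU": False, "CU": False, "RU": True, "LL": True, "RL": False}
print(candidate_gestures(blob_in_ll_and_lu))  # ['waving']
print(candidate_gestures(blob_in_ru))         # ['noise']
```

As the description notes, this stage only screens out noise; deciding which target gesture actually occurred is left to the later motion and shape analysis.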

Referring again to FIG. 3, the region analyzing unit 162 described above cannot by itself identify which of the four target gestures presented in this embodiment has occurred; it only distinguishes target gestures from noise gestures. Nevertheless, including the region analyzing unit 162 keeps the overall system from performing unnecessary analysis, which prevents misrecognition of gestures in advance and minimizes unnecessary computation in gesture recognition.

The motion analyzing unit 164 analyzes the direction of movement of the detected arm blob to determine whether the Waving gesture 12 or the Calling gesture 14, which are the motion gestures among the four target gestures (12, 14, 16, 18 shown in FIG. 1), has occurred. As mentioned above, the Waving gesture 12 and the Calling gesture 14 are defined as gestures made with the user's right arm. The motion analyzing unit 164 therefore checks whether the user's right arm is moving repeatedly left and right or up and down, and uses the result to determine whether a motion gesture has occurred.

FIG. 8A is a flowchart showing the motion gesture analysis performed by the motion analyzing unit shown in FIG. 3, FIG. 8B is a diagram showing the y coordinate of the fingertip in the fingertip coordinate detection step of FIG. 8A, and FIGS. 8C to 8E are input images showing the direction of fingertip movement.

Referring to FIG. 8A, when an input image containing the right arm region separated by the background subtraction technique is received, the coordinates of the right fingertip are detected in the input image (S810). The coordinates consist of the y coordinate and the x coordinate of the fingertip.

The y coordinate of the fingertip is assigned the y coordinate of the detected right arm region.

As shown in FIG. 8B, the x coordinate of the fingertip is assigned the centroid value (40) of the right-hand part of the arm blob that lies within the upper portion (1/5 h) of the total height (h) of the right arm region.
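A possible reading of this fingertip rule is sketched below on a binary arm mask. Two points are assumptions rather than statements from the description: the "y coordinate of the arm region" is taken to be its top edge, and the upper fifth of the region is rounded down to whole rows with a minimum of one row. The function name `fingertip` is hypothetical.

```python
def fingertip(mask):
    """Estimate (x, y) of the fingertip from a binary arm mask (list of rows of 0/1).

    y: top row of the arm region (assumed interpretation).
    x: mean column of the blob pixels within the top 1/5 of the region height.
    Returns None when the mask contains no arm pixels.
    """
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    top, bottom = rows[0], rows[-1]
    height = bottom - top + 1
    band = max(1, height // 5)  # number of rows forming the upper 1/5 h
    xs = [x for y in range(top, top + band) for x, v in enumerate(mask[y]) if v]
    return (sum(xs) / len(xs), top)


arm_mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],  # hand at the top of the raised arm
    [0, 1, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
]
print(fingertip(arm_mask))  # (2.5, 1)
```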

Once the fingertip coordinates are detected, the direction of fingertip movement can be readily detected across consecutive images, as shown in FIGS. 8C to 8E. In FIG. 8C, the fingertip coordinate (C1) detected in the previous image is marked with a circle, and in FIG. 8D, the fingertip coordinate (C2) detected in the current image is marked with a square. FIG. 8E shows the resulting direction of fingertip movement with an arrow.

Continuing with FIG. 8A, once the direction of fingertip movement is detected (S820), it is analyzed (S830). The analysis uses a movement direction code table and a cumulative movement direction histogram, which are described in detail with reference to FIGS. 9A and 9B.

FIG. 9A is a diagram showing an example of the movement direction code table proposed in this embodiment for analyzing the direction of fingertip movement, FIG. 9B is a graph showing the cumulative movement direction histogram proposed in this embodiment for the same purpose, and FIG. 9C is a diagram showing the frequency of code values and the fingertip displacement according to the image acquisition speed.

The direction of fingertip movement is assigned one of four code values by the movement direction code table shown in FIG. 9A. For example, when the fingertip coordinates move from left to right as in FIG. 8E, the angle of the movement direction lies in the range from 315 degrees through 0 degrees to 45 degrees, so the movement direction is assigned 'code 1'.
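A four-way direction code of this kind can be sketched as follows. The text fixes only code 1 (315 to 45 degrees) and implies that code 3 is the opposite direction; the assignment of codes 2 and 4 to the two vertical sectors, the handling of sector boundaries, and the angle convention (counterclockwise, with image y pointing down) are assumptions made here, and FIG. 9A may differ.

```python
import math


def direction_code(prev, curr):
    """Map the fingertip displacement from prev to curr, both (x, y), to a code 1..4.

    Returns None when the fingertip did not move.
    Assumed sectors: 1 = 315..45 deg, 2 = 45..135 deg, 3 = 135..225 deg, 4 = 225..315 deg.
    """
    dx = curr[0] - prev[0]
    dy = curr[1] - prev[1]
    if dx == 0 and dy == 0:
        return None
    # Image y grows downward, so negate dy to get a conventional counterclockwise angle.
    angle = math.degrees(math.atan2(-dy, dx)) % 360
    if angle >= 315 or angle <= 45:
        return 1
    if angle < 135:
        return 2
    if angle <= 225:
        return 3
    return 4


print(direction_code((0, 0), (5, 0)))  # 1: left to right, as in FIG. 8E
print(direction_code((5, 0), (0, 0)))  # 3: right to left
```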

As input images arrive continuously, movement direction code values are computed continuously, and the computed code values can be accumulated per code value into a histogram, as shown in FIG. 9B.

Analyzing the histogram of accumulated movement direction code values makes it possible to determine whether the Waving gesture (12 in FIG. 1) or the Calling gesture (14 in FIG. 1) has occurred.

The Waving gesture is a repeated left-right swing of the right arm, so code 1 and code 3 are the code values that mainly occur in the cumulative movement direction histogram.

Therefore, when code 1 (or code 3) exceeds a preset threshold (T1) and the code for the opposite direction, code 3 (or code 1), exceeds a threshold (T2), the Waving gesture (12 in FIG. 1) can be judged to have occurred. In other words, when a particular code value exceeds the threshold T1, the system checks whether the code value for the opposite direction exceeds the threshold T2, and in this way determines whether the Waving gesture (12 in FIG. 1) or the Calling gesture (14 in FIG. 1) has occurred.
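The two-threshold test can be sketched as below. The pairing of codes 1 and 3 with Waving follows the text; pairing codes 2 and 4 with Calling is an assumption based on Calling being an up-down motion, and the function name and example threshold values are hypothetical.

```python
def detect_motion_gesture(histogram, t1, t2):
    """Return 'waving', 'calling', or None from accumulated values per direction code.

    A gesture fires when one code exceeds t1 and its opposite code exceeds t2.
    """
    opposite_pairs = {"waving": (1, 3), "calling": (2, 4)}  # (2, 4) for calling is assumed
    for gesture, (a, b) in opposite_pairs.items():
        va, vb = histogram.get(a, 0), histogram.get(b, 0)
        if (va > t1 and vb > t2) or (vb > t1 and va > t2):
            return gesture
    return None


print(detect_motion_gesture({1: 5.0, 3: 4.0}, t1=4.5, t2=3.0))  # waving
print(detect_motion_gesture({1: 5.0, 3: 1.0}, t1=4.5, t2=3.0))  # None: only one direction seen
```

Requiring both directions is what separates a repeated swing from a single one-way arm movement.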

A cumulative histogram ordinarily accumulates the frequency of occurrence of each movement direction code value. In this embodiment, however, a weight (W) based on the speed of fingertip movement is computed as in Equation 1 below and accumulated, instead of the frequency, to build the histogram.

Referring to FIG. 9C, according to Equation 1 above, the larger the fingertip displacement between adjacent images, that is, the faster the fingertip moves, the larger the weight (W); the smaller the displacement, that is, the slower the fingertip moves, the smaller the weight (W).

Here, x_f and y_f are the fingertip coordinates detected in the previous image frame, and x_(f+1) and y_(f+1) are the fingertip coordinates detected in the current image frame. w_LL and h_LL denote the width and height, respectively, of the LL region (34 in FIG. 4).

This embodiment builds the cumulative histogram from the speed-based weight (W), rather than from the frequency of movement code values, because the image acquisition speed differs from one image acquisition system (for example, a camera) to another. The aim is to recognize the user's gesture stably and with a uniform response time regardless of the acquisition speed.

For example, when an image acquisition system such as a camera captures images slowly, few input frames are obtained per second, so movement direction code values are computed less often. If the cumulative histogram were built from the frequency of code values in this case, the accumulated value of a given code would take longer to exceed the threshold (T1). The system would then need a long time to recognize the user's gesture, and the user would have the inconvenience of repeating the same gesture for a long time.

Conversely, when the camera captures images quickly, enough movement direction code values are produced even if the user waves a hand left and right only once or twice, so the system responds faster in recognizing the user's gesture as a target gesture. In this case, the likelihood also rises that ordinary behavior that is not an actual Waving gesture (12 in FIG. 1) is misrecognized as the Waving gesture (12 in FIG. 1).

However, the cumulative histogram weighted by movement speed that is proposed in this embodiment solves this problem, as described below with reference to FIG. 9C.

FIG. 9C is a diagram showing the frequency of movement code values and the fingertip displacement according to the image capture speed of the image acquisition system.

Referring to FIG. 9C, according to Equation 1 above, the larger the fingertip displacement between adjacent images (that is, the faster the fingertip moves), the larger the weight (W), and the smaller the displacement (that is, the slower the fingertip moves), the smaller the weight (W).

If the acquisition speed of the image acquisition system is slow, fewer images are input and movement direction code values occur less often, but the fingertip displacement between consecutive images is larger, so the weight (W) is larger. Thus, although the code occurs infrequently, its accumulated value is still substantial because each weight (W) is large.

If the image acquisition speed of the system is fast, code values occur more often, but the fingertip displacement between consecutive adjacent images is smaller. Each weight (W) is therefore smaller, and the accumulated value of the code for that movement direction is only the sum of the individual weights.

Consequently, the system can recognize the user's gesture stably and at a uniform speed regardless of the image acquisition speed.

Each weight (W) added to the cumulative histogram in this embodiment carries a time stamp recording when it was added to the histogram; it remains in the histogram only for a fixed period (for example, 5 seconds) and is then deleted.
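The speed-weighted, time-limited histogram can be sketched as follows. Equation 1 is not reproduced in this text, so the weight used here, the fingertip displacement divided by the diagonal of the LL region, is only an assumed stand-in built from the symbols the description defines (x_f, y_f, x_(f+1), y_(f+1), w_LL, h_LL). The class and function names are hypothetical, and entries are assumed to be added in time order.

```python
import math
from collections import deque


def assumed_weight(prev, curr, w_ll, h_ll):
    """ASSUMED form of W: displacement normalized by the LL-region diagonal (not Equation 1)."""
    return math.hypot(curr[0] - prev[0], curr[1] - prev[1]) / math.hypot(w_ll, h_ll)


class WeightedDirectionHistogram:
    """Accumulates weights per direction code and drops entries older than `lifetime` seconds."""

    def __init__(self, lifetime=5.0):
        self.lifetime = lifetime
        self.entries = deque()  # (timestamp, code, weight), oldest first

    def add(self, timestamp, code, weight):
        self.entries.append((timestamp, code, weight))

    def totals(self, now):
        # Discard expired entries, then sum the remaining weights per code.
        while self.entries and now - self.entries[0][0] > self.lifetime:
            self.entries.popleft()
        sums = {}
        for _, code, weight in self.entries:
            sums[code] = sums.get(code, 0.0) + weight
        return sums


# The same 60-pixel sweep to the right, seen by a slow camera (2 frames) and a fast one (6 frames).
slow, fast = WeightedDirectionHistogram(), WeightedDirectionHistogram()
for i in range(2):
    slow.add(i * 0.5, 1, assumed_weight((30 * i, 0), (30 * (i + 1), 0), w_ll=30, h_ll=40))
for i in range(6):
    fast.add(i / 6, 1, assumed_weight((10 * i, 0), (10 * (i + 1), 0), w_ll=30, h_ll=40))
print(slow.totals(now=1.0), fast.totals(now=1.0))  # both accumulate about 1.2 for code 1
```

With a plain frequency count the fast camera would have accumulated three times as much as the slow one for the same hand movement; with a displacement-based weight the two totals agree.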

Referring again to FIG. 3, the shape analyzing unit 166 analyzes shape information, including size information, of the detected arm region to identify the Raising gesture 16 and the Stopping gesture 18, which are the non-motion gestures among the four target gestures (12, 14, 16, 18 in FIG. 1).

When an image containing both arm regions is input, the shape analyzing unit 166 analyzes the relative length ratio of the two arm regions to determine whether the gesture is the Raising gesture 16 or the Stopping gesture 18; if it is neither, the gesture is judged to be a noise gesture. The method of identifying the Raising gesture 16 and the Stopping gesture 18 is described in detail below with reference to FIG. 10.

FIG. 10 is a flowchart showing how the shape analyzing unit shown in FIG. 3 identifies the Raising gesture 16 and the Stopping gesture 18, FIG. 11 is a diagram provided to aid understanding of the analysis of the relative lengths of the two arm regions shown in FIG. 10, FIG. 12 is a diagram showing the conditions for identifying the Raising gesture, and FIG. 13 is a diagram showing the conditions for identifying the Stopping gesture.

Referring to FIGS. 10 and 11, when an image is input, the relative length ratio of the two arm regions is analyzed first (S911).

In the input image, if there is no left arm (L) region, or if the right arm (R) region is at least twice as long as the left arm region, the user's gesture is classified as the Raising gesture (16 in FIG. 1), as shown in FIG. 11(A) (S913). If there is no right arm (R) region, or if the left arm region is at least twice as long as the right arm region, the user's gesture is classified as a noise gesture, as shown in FIG. 11(C) (S915).

In the input image, if the length of the right arm (R) region and the length of the left arm (L) region are nearly equal, the user's gesture is classified as the Stopping gesture (18 in FIG. 1), as shown in FIG. 11(B) (S917).
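The length-ratio branching of step S911 can be sketched as below. The factor of 2 comes from the text; the tolerance that defines "nearly equal", the handling of the case where neither arm is present, and the "undetermined" result for ratios that fall between the stated cases are assumptions added to make the sketch total. The function name is hypothetical.

```python
def classify_by_length_ratio(right_len, left_len, equal_tolerance=0.2):
    """Route a frame to the Raising check, the Stopping check, or noise.

    Lengths are arm-region lengths in pixels; 0 means the arm region is absent.
    `equal_tolerance` (relative difference) is an assumed definition of "nearly equal".
    """
    if right_len == 0:
        return "noise"              # no right arm (also covers no arm at all, by assumption)
    if left_len == 0 or right_len >= 2 * left_len:
        return "raising_candidate"  # FIG. 11(A), checked further in S913
    if left_len >= 2 * right_len:
        return "noise"              # FIG. 11(C), S915
    if abs(right_len - left_len) / max(right_len, left_len) <= equal_tolerance:
        return "stopping_candidate" # FIG. 11(B), checked further in S917
    return "undetermined"           # not covered by the description


print(classify_by_length_ratio(100, 0))    # raising_candidate
print(classify_by_length_ratio(100, 95))   # stopping_candidate
print(classify_by_length_ratio(40, 100))   # noise
```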

For the Raising gesture (16 in FIG. 1), whether the gesture is the Raising gesture or a noise gesture can be determined simply by analyzing the length and angle of the right arm region (S913).

As shown in FIG. 12, in this embodiment the gesture is judged to be the Raising gesture when the length of the right arm (R) region is greater than approximately 1.3 times the vertical height of the user's head region (HR) and the angle of the right arm (R) is between a first angle (for example, 60 degrees) and a second angle (for example, 135 degrees) (S913). Any user gesture that does not satisfy these conditions is judged to be a noise gesture (S915).
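The Raising condition can be written as a short predicate. The values 1.3, 60 degrees, and 135 degrees are the example values from the text; treating the angle bounds as inclusive and the function name `is_raising` are assumptions for illustration.

```python
def is_raising(arm_length, arm_angle_deg, head_height,
               min_length_ratio=1.3, min_angle=60.0, max_angle=135.0):
    """True when the right arm is long enough relative to the head and points upward enough."""
    long_enough = arm_length > min_length_ratio * head_height
    angle_ok = min_angle <= arm_angle_deg <= max_angle  # inclusive bounds are assumed
    return long_enough and angle_ok


print(is_raising(arm_length=140, arm_angle_deg=90, head_height=100))  # True
print(is_raising(arm_length=120, arm_angle_deg=90, head_height=100))  # False: arm too short
print(is_raising(arm_length=140, arm_angle_deg=30, head_height=100))  # False: angle too low
```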

For the Stopping gesture, the length, angle, and position of both arm (R, L) regions are analyzed to determine whether the gesture is the Stopping gesture or a noise gesture (S915, S917).

First, the user's gesture is judged to be a noise gesture (S915) if the upper ends of both arm regions are close to the top of the user's head region (HR), which corresponds to the ordinary behavior of touching the head with both hands, or if there is an arm whose region is longer than 1.6 times the vertical height of the user's head region (HR).

Given the structure of the human body, when the angle of both hands is less than or equal to 90 degrees, the user's gesture is likely to be a Stopping gesture.

Conversely, the further the angle exceeds 90 degrees, the more likely the gesture is an everyday motion (for example, stretching) rather than a Stopping gesture.

In addition, as shown in FIG. 13, the user's gesture is most likely to be a Stopping gesture when the user's hand regions are located on both sides of the head region (HR), each separated from it by the horizontal width of the head region, and the likelihood decreases as the displacement from this position grows. That is, when the angle of the left arm (L) exceeds 90 degrees and the angle of the right arm (R) is less than 90 degrees, the user's gesture is most likely not a Stopping gesture.

Accordingly, as in the equations below, the probability values that the gesture is a Stopping gesture (P_d and P_p) are obtained from the angle and position of the hand region. If the final probability value (P) derived from these values is higher than a specified threshold, the gesture is determined to be a Stopping gesture; otherwise, it is determined to be a noise gesture.

Here, in Equation 2, P_d is the probability, based on angle information, that the current arm region is a Stopping gesture; d is the angle of the arm region; and d_l and d_h are the allowable gesture angles, 90 degrees and 120 degrees, respectively.

In Equation 3, P_p is the probability, based on position information, that the current arm region is a Stopping gesture; x is the X coordinate of the arm region; X_hl is the X coordinate of the left boundary of the head region; and W_h is the width of the head region. In Equation 4, P is the final probability that the current arm region is a Stopping gesture, and α is a weight.

In conclusion, according to Equation 2, when the angle of the right-arm region is 90 degrees or less and the angle of the left-arm region is 90 degrees or more, the probability of a Stopping gesture (P_d) increases; when the angle of the right-arm region exceeds 90 degrees and the angle of the left-arm region is less than 90 degrees, the probability of a Stopping gesture (P_d) decreases.

According to Equation 3, when the right-arm region and the left-arm region are each separated from the head region (HR) by the width of the head region (HR), the probability of a Stopping gesture (P_p) is largest, and P_p decreases as the displacement from that position increases.

According to Equation 4, the final probability of a Stopping gesture (P) can be calculated from the angle and position of each of the right-arm region and the left-arm region.
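The final decision can be sketched as follows. Equations 2 to 4 are not reproduced in this text, so the combination rule here is an assumption made only for illustration: P is taken to be a convex combination of P_d and P_p weighted by α, which may differ from the actual Equation 4. The function name and the default values for `alpha` and `threshold` are likewise hypothetical.

```python
def is_stopping_gesture(p_d: float,
                        p_p: float,
                        alpha: float = 0.5,
                        threshold: float = 0.5) -> bool:
    """Combine the angle and position probabilities and apply a threshold.

    ASSUMPTION: P = alpha * P_d + (1 - alpha) * P_p. The source only states
    that alpha is a weight, so this form is illustrative.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    p = alpha * p_d + (1.0 - alpha) * p_p
    # Stopping gesture when P exceeds the threshold, noise gesture otherwise.
    return p > threshold
```

Taking P_d and P_p as inputs keeps the sketch independent of the exact forms of Equations 2 and 3.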

Subsequently, when the number of occurrences of the Raising gesture and the Stopping gesture exceeds a specified threshold (K) (S919), the system finally determines that a non-motion gesture has occurred (S921).
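The occurrence-count check of steps S919 and S921 can be sketched as below. The text does not say whether occurrences are counted per frame or whether the two gestures share one counter, so this sketch assumes one label per frame and a separate counter for each gesture. The label strings and the function name are hypothetical.

```python
from typing import Iterable, Optional


def detect_non_motion_gesture(frame_labels: Iterable[str], k: int) -> Optional[str]:
    """Return the first non-motion gesture whose count exceeds k, else None.

    Each element of frame_labels is the per-frame result of the shape
    analysis: "raising", "stopping", or any other string for noise.
    """
    counts = {"raising": 0, "stopping": 0}
    for label in frame_labels:
        if label in counts:
            counts[label] += 1
            # S919: the occurrence count must exceed the threshold K.
            if counts[label] > k:
                return label  # S921: a non-motion gesture has occurred.
    return None
```

Requiring repeated detections before reporting a gesture suppresses single-frame misclassifications.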

As described above, the present invention has been described with reference to an embodiment shown in the drawings, but this embodiment is merely illustrative, and those of ordinary skill in the art will understand that various modifications and equivalent other embodiments are possible. Accordingly, the true technical scope of protection of the present invention should be determined by the technical idea of the appended claims.

Claims

A gesture recognition apparatus comprising:
a gesture region setting unit that sets, based on a face region of a user detected from an input image, a gesture region in which a gesture of the user's arm occurs;
an arm detecting unit that detects an arm region of the user present in the gesture region; and
a gesture determining unit that analyzes position, movement direction, and shape information of the arm region present in the gesture region to determine a target gesture of the user, the target gesture including a Waving gesture, a Calling gesture, a Raising gesture, and a Stopping gesture,
wherein the gesture determining unit comprises:
a region analyzing unit that determines whether the position of the user's arm region is within the gesture region and, according to the result, distinguishes the target gesture from a noise gesture corresponding to the user's everyday behavior;
a motion analyzing unit that analyzes the movement direction of the user's arm region to determine the Waving gesture and the Calling gesture; and
a shape analyzing unit that analyzes the shape information, including the relative length ratio and the angles of the user's two arms, to determine the Raising gesture and the Stopping gesture.

The gesture recognition apparatus of claim 1, wherein the arm detecting unit detects the arm region of the user present in the gesture region by using a background-separated image separated from the input image.

The gesture recognition apparatus of claim 1, wherein the gesture region setting unit sets the size of the gesture region according to the size and position of the detected face region.

The gesture recognition apparatus of claim 1, wherein the region analyzing unit determines the target gesture and the noise gesture by using a lookup table in which the target gestures and the gesture regions are organized as entries.

The gesture recognition apparatus of claim 1, wherein the motion analyzing unit, with reference to the user's right arm,
detects fingertip coordinates of the user's arm region, analyzes the movement direction and movement speed of the detected fingertip coordinates to set a movement direction code value, and determines the Waving gesture and the Calling gesture based on an accumulated value of the set movement direction code values.

The gesture recognition apparatus of claim 1, wherein the gesture region setting unit sets, with reference to the position of a head region that includes the face region, the gesture region comprising:
a left upper region located at the upper left of the head region;
a right upper region located at the upper right of the head region;
a central upper region adjacent to the head region and located between the left upper region and the right upper region;
a left lower region adjacent to the head region and located below the left upper region; and
a right lower region adjacent to the head region and located below the right upper region.

The gesture recognition apparatus of claim 6, wherein, when the user expresses the Raising gesture using the right arm, the shape analyzing unit:
determines the user's gesture to be the Raising gesture if no left-arm region is present in the right lower region and the length of the right-arm region extending across the left lower region and the left upper region is at least twice the length of the left-arm region; and
determines the user's gesture to be the noise gesture if no right-arm (R) region is present in the left lower region and the length of the left-arm region present in the right upper region is equal to or greater than the length of the right-arm region.

The gesture recognition apparatus of claim 6, wherein the shape analyzing unit
determines the user's gesture to be the Raising gesture when the length of the right-arm region present in the left lower region is greater than 1.3 times the vertical height of the head region and the angle of the right arm is between 60 degrees and 135 degrees.

The gesture recognition apparatus of claim 6, wherein the shape analyzing unit:
determines the user's gesture to be the Stopping gesture if the length of the user's right-arm region present in the left lower region and the length of the left-arm region present in the right lower region are substantially the same; and
determines the user's gesture to be the noise gesture if the length of the right-arm region and the length of the left-arm region are longer than a preset multiple of the height of the head region.

The gesture recognition apparatus of claim 1, wherein, in the shape analyzing unit,
the probability value of the gesture being determined as the Stopping gesture is largest when the angle of the user's right-arm region is 90 degrees or less and the angle of the user's left-arm region is 90 degrees or more, and the probability value decreases when the angle of the right-arm region exceeds 90 degrees and the angle of the left-arm region is less than 90 degrees.

The gesture recognition apparatus of claim 1, wherein, in the shape analyzing unit,
the probability value of the gesture being determined as the Stopping gesture is largest when the positions of the user's right-arm region and the user's left-arm region are each separated from the head region by the width of the head region, and the probability value decreases as the displacement from that position increases.

The gesture recognition apparatus of claim 1, wherein the shape analyzing unit
calculates the probability value of a Stopping gesture according to the angle and position of each of the user's right-arm region and the user's left-arm region.

A method of recognizing a user's gesture using a robot system that has a moving means and implements Human-Robot Interaction (HRI) technology, the method comprising:
detecting a face region of the user from an input image;
calculating, at a predetermined ratio according to the position and size of the detected face region, the size of a gesture region in which a gesture of the user's arm occurs;
obtaining a background-separated image that includes the user's arm region present in the calculated gesture region;
detecting the arm region of the user present in the gesture region by using the obtained background-separated image; and
determining a target gesture of the user, the target gesture including a Waving gesture, a Calling gesture, a Raising gesture, and a Stopping gesture, by analyzing position, movement direction, and shape information of the arm region present in the gesture region,
wherein determining the target gesture of the user comprises:
determining whether the position of the user's arm region is within the gesture region and, according to the result, distinguishing the target gesture from a noise gesture corresponding to the user's everyday behavior;
analyzing the movement direction of the user's arm region to determine the Waving gesture and the Calling gesture; and
analyzing the shape information, including the relative length ratio and the angles of the user's two arms, to determine the Raising gesture and the Stopping gesture.

The method of claim 13, wherein determining the target gesture of the user
further comprises detecting fingertip coordinates of the user's arm region within the gesture region, and
wherein determining the Waving gesture and the Calling gesture comprises
analyzing the movement direction and movement speed of the detected fingertip coordinates to set a movement direction code value, and determining the Waving gesture and the Calling gesture based on an accumulated value of the set movement direction code values.