KR20230021566A

KR20230021566A - Method and apparatus for matching object based on re-identification

Info

Publication number: KR20230021566A
Application number: KR1020220042242A
Authority: KR
Inventors: 김기진; 김현수; 박성혁; 김준호
Original assignee: 국민대학교산학협력단
Priority date: 2021-08-05
Filing date: 2022-04-05
Publication date: 2023-02-14

Abstract

A re-identification-based object matching method and apparatus are disclosed. The method according to an embodiment of the present disclosure may include: acquiring a first image of an object for object re-identification when an event occurs in a specific space associated with an object matching device; extracting unique identification information of the object corresponding to the specific space; and, when a pre-stored second image corresponding to the unique identification information is detected, determining whether the first image and the second image show the same object by using a pre-trained re-identification metric learning model.

Description

Re-identification-based object matching method and apparatus {METHOD AND APPARATUS FOR MATCHING OBJECT BASED ON RE-IDENTIFICATION}

The present disclosure relates to a re-identification-based object matching method and apparatus that enable security management in non-face-to-face or unmanned surveillance spaces by using a metric-learning-based person re-identification algorithm.

In general, automated teller machine (ATM) service is a financial service operated without face-to-face contact in an open environment, and it is a field in which the importance of video security is emphasized.

The ATM sector is expanding unmanned environments and unmanned devices to increase efficiency and improve customer accessibility in non-face-to-face settings, and the number of security incidents is increasing accordingly.

At present, withdrawing cash from an ATM requires an account number, a bankbook, or a cash-withdrawal card together with the card's password. This leaves a security loophole: if the card is stolen or the password is exposed, it can easily be misused for crime. To compensate, additional personal identification information such as biometric information has been used as an additional authentication method, and monitoring by installing multiple CCTV cameras on ATM devices and in service spaces has been introduced and is in actual use.

However, attempts to introduce biometric information or additional personal information as an additional authentication method face difficulties in data collection and storage, recognition errors, and concerns about exposure or hacking. They also reduce convenience, so full-scale adoption has been difficult in practice.

In addition, monitoring by installing multiple CCTV cameras on ATM devices and in service spaces is mostly used for follow-up measures after an incident has occurred, so it plays little role in advance detection and response.

For example, when a customer reports an incident to a call center, a security guard or contract worker is dispatched to the site to secure the CCTV footage, and a person then analyzes the footage directly to determine whether an incident occurred and under what circumstances.

Statistically, for such after-the-fact responses, the dispatch alone takes 6 to 8 hours in the metropolitan area, depending on on-site conditions. For special locations such as provincial areas or military bases, it can take 24 hours or more on weekdays and about 3 to 4 days when a weekend or public holiday intervenes. After dispatch and data collection, follow-up tasks such as data analysis and corrective action are carried out, and the total time to resolve an incident can be at least 15 to 30 days.

That is, as described above, the CCTV currently installed and operated on ATM devices and in service spaces is used for follow-up measures after an incident, and action is taken by having a person review the CCTV footage from before and after the incident. In this case, a person must check the circumstances before and after the incident, secure the CCTV footage for that period, and then determine whether the reported incident is genuine. This process can take a considerable amount of time.

To compensate for the considerable time required for follow-up measures, much research and development is under way on large-scale solutions that analyze CCTV footage to detect abnormal behavior in advance or immediately and to track it. However, simple CCTV video analysis has the limitation that it is difficult to determine whether a person using an ATM is using their own card or someone else's. Moreover, even if CCTV footage were closely linked with card information or personal information, the risk of information exposure or hacking makes such a scheme difficult to adopt in practice.

In financial services, the speed of the follow-up response is directly tied to the size of the monetary loss. Identifying signs of an incident in advance, or quickly determining whether an incident has occurred and responding promptly to minimize monetary loss, is therefore one of the essential factors for maintaining the service.

Security measures such as installing multiple CCTV cameras on ATM devices and in service spaces already exist, but in the currently prevailing setup most of them are used for follow-up measures after an incident. With the recent expansion of unmanned ATMs and non-face-to-face environments, however, this added convenience can itself become a security loophole, so an additional system that can compensate for it is required.

The foregoing background art is technical information that the inventors possessed in order to derive the present invention or acquired in the course of deriving it, and it is not necessarily known art disclosed to the general public before the filing of the present invention.

Prior Art 1: Korean Patent Registration No. 10-1185525 (2012.09.18)
Prior Art 2: Korean Patent Laid-Open Publication No. 10-2019-0068000 (2019.06.18)

An object of an embodiment of the present disclosure is to enable security management in non-face-to-face or unmanned surveillance spaces by using a metric-learning-based person re-identification algorithm.

An object of an embodiment of the present disclosure is to provide a security service that, amid the growth of unmanned branches and the accompanying environmental changes, can detect when a personal card suspected of being stolen is used by someone other than its owner. The service uses only a hash value generated from the card the user used and an image of the user, without installing additional security equipment on ATM devices and services and without using additional personal information.

An object of an embodiment of the present disclosure is, in providing a security service in an open environment, to provide it as an auxiliary system that is additionally mounted on and applied to an existing system to cover the loopholes of the existing security system, so that privacy is not infringed. In contrast to person recognition services that look up an individual's personal details, the security service is provided without access to those details, thereby minimizing the exposure of personally identifiable information that could be leaked.

An object of an embodiment of the present disclosure is to perform person re-identification from images so that, when a user recognized as different from the card's previous user attempts to withdraw cash, advance measures such as displaying a warning message or reconfirming the transaction can be taken using nothing more than an image.

The objects of the embodiments of the present disclosure are not limited to those mentioned above. Other objects and advantages of the present invention that are not mentioned can be understood from the following description and will be understood more clearly from the embodiments of the present invention. It will also be appreciated that the objects and advantages of the present invention can be realized by the means indicated in the claims and combinations thereof.

A re-identification-based object matching method according to an embodiment of the present disclosure may include: acquiring a first image of an object for object re-identification when an event occurs in a specific space associated with an object matching device; extracting unique identification information of the object corresponding to the specific space; and, when a pre-stored second image corresponding to the unique identification information is detected on the basis of that information, determining whether the first image and the second image show the same object by using a pre-trained re-identification metric learning model.
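The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the names `match_on_event`, `embed`, `gallery`, and `threshold` are hypothetical, the stored images are kept in a plain dictionary keyed by the unique identification information, and a Euclidean distance between feature vectors stands in for the output of the pre-trained re-identification metric learning model.

```python
import math
from typing import Callable, Dict, List, Optional

Image = List[float]      # placeholder type for an image (assumption)
Feature = List[float]    # feature vector produced by the model


def match_on_event(
    first_image: Image,
    unique_id: str,
    gallery: Dict[str, Image],
    embed: Callable[[Image], Feature],
    threshold: float,
) -> Optional[bool]:
    """Compare the newly captured image with the stored one for unique_id.

    Returns True if the two images are judged to show the same object,
    False if they are judged different, and None if no second image is
    stored for this unique_id (nothing to compare against).
    """
    second_image = gallery.get(unique_id)
    if second_image is None:
        return None
    # Hypothetical decision rule: small feature distance means "same object".
    distance = math.dist(embed(first_image), embed(second_image))
    return distance <= threshold
```

A result of `False` would correspond to the case where the current user is recognized as different from the user previously seen with the same identifier.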

In addition, other methods and other systems for implementing the present invention, and a computer-readable recording medium storing a computer program for executing the method, may be further provided.

Aspects, features, and advantages other than those described above will become apparent from the following drawings, claims, and detailed description of the invention.

According to an embodiment of the present disclosure, in providing a security service in an open environment, the security service is provided as an auxiliary system that is additionally mounted on and applied to an existing system to cover the loopholes of the existing security system. In contrast to person recognition services that look up an individual's personal details, the security service is provided without access to those details, which can minimize the exposure of personally identifiable information that could be leaked.

In addition, according to an embodiment of the present disclosure, person re-identification from images makes it possible, when a user recognized as different from the card's previous user attempts to withdraw cash, to take advance measures such as displaying a warning message or reconfirming the transaction using nothing more than an image.

In addition, according to an embodiment of the present disclosure, a learning-based deep learning model that performs person re-identification on a face image of the device user accurately determines in real time whether that person matches other person images, enabling detection of fraudulent activity and an immediate response.

In addition, according to an embodiment of the present disclosure, the owner of a card is distinguished using information learned from user images under limited conditions, without directly linking to or using personal information registered online. This can provide a solution that supports both advance response and rapid follow-up.

In addition, according to an embodiment of the present disclosure, person re-identification technology can distinguish users at a site-based (local) ATM by recognizing and re-identifying people, without using personal information registered online.

In addition, according to an embodiment of the present disclosure, using a metric-learning-based person re-identification algorithm makes it possible to catch signs of an incident in advance or to take rapid follow-up action, without installing additional equipment in the existing security solution or using additional personal identification information.

In addition, according to an embodiment of the present disclosure, deep learning technology can be used to automatically detect people in video and to quickly search for a person of interest. This can significantly reduce the review time previously spent by human operators and thereby improve effectiveness.

The effects of the present disclosure are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the description below.

FIG. 1 is a diagram schematically illustrating a re-identification-based object matching system according to an embodiment.
FIG. 2 is a diagram schematically illustrating metric learning and a Siamese network according to an embodiment.
FIG. 3 is a diagram schematically illustrating a triplet network for metric learning according to an embodiment.
FIG. 4 is a block diagram schematically illustrating an object matching device according to an embodiment.
FIGS. 5 and 6 are diagrams illustrating the overall pipeline of a re-identification-based object matching system according to an embodiment.
FIG. 7 is an exemplary diagram illustrating the network structure of a re-identification metric learning model of an object matching device according to an embodiment.
FIG. 8 is an exemplary diagram schematically illustrating collaborative learning of an object matching device according to an embodiment.
FIG. 9 is an exemplary view of a result screen of an object matching device according to an embodiment.
FIG. 10 is a diagram illustrating an object matching method for an occluded face image according to an embodiment.
FIG. 11 is a flowchart illustrating an object matching method according to an embodiment.

The advantages and features of the present disclosure, and the methods of achieving them, will become clear from the embodiments described in detail below in conjunction with the accompanying drawings.

However, the present disclosure is not limited to the embodiments presented below. It may be implemented in various different forms and should be understood to include all modifications, equivalents, and substitutes falling within its spirit and technical scope. The embodiments presented below are provided so that the present disclosure is complete and so that the scope of the disclosure is fully conveyed to those of ordinary skill in the art to which it pertains. In describing the present disclosure, detailed descriptions of related known technologies are omitted where they are judged likely to obscure the gist of the disclosure.

The terms used in this application are used only to describe specific embodiments and are not intended to limit the present disclosure. Singular expressions include plural expressions unless the context clearly indicates otherwise. In this application, terms such as "include" or "have" are intended to specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood not to preclude the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof. Terms such as "first" and "second" may be used to describe various components, but the components should not be limited by these terms, which are used only to distinguish one component from another.

Hereinafter, embodiments according to the present disclosure are described in detail with reference to the accompanying drawings. In the description, identical or corresponding components are given the same reference numerals, and redundant descriptions of them are omitted.

FIG. 1 is a diagram schematically illustrating a re-identification-based object matching system according to an embodiment.

Referring to FIG. 1, the re-identification-based object matching system 1 may include an object matching device 100, a user terminal 200, a server 300, and a network 400.

The re-identification-based object matching system 1 is intended to use metric-learning-based person re-identification technology in unmanned devices (for example, automated teller machines (ATMs)) and service environments, so that signs of an incident can be identified in advance or the authenticity of a reported incident can be determined quickly. In an embodiment, the unmanned device may be an ATM device, a video surveillance device, a security device, or the like; the description below uses an ATM device as the example.

Person re-identification technology compares a person appearing in an image captured by a camera with a person appearing in a previously stored image and determines whether they are the same person. In addition, by combining recognition and analysis techniques with security camera video, person re-identification can be linked to various application services such as object tracking and actions triggered by detected changes.

In an embodiment, the metric-learning-based person re-identification technology can provide security service technology as a complementary means added to an existing security solution, and can provide a service that operates around a specific site even in an offline environment, without collecting additional personal information.

That is, the core of the re-identification-based object matching system 1 is to use photo data of the user's upper body and a hash value generated from the card the user used, without using sensitive personal information that has already been registered online or would need to be additionally registered. In view of the ATM installation environment and its characteristics, the system can be configured to operate even in a site-based offline environment.
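As an illustration of how a card-derived hash value could serve as a lookup key without storing the card number itself, consider the sketch below. The text does not specify the hash function or its inputs; the use of HMAC-SHA-256, the per-site secret `site_secret`, and the function name `card_key` are all assumptions made for this example.

```python
import hashlib
import hmac


def card_key(card_number: str, site_secret: bytes) -> str:
    """Derive an opaque lookup key from a card number.

    Assumption: a keyed hash (HMAC-SHA-256) with a secret held at the
    site, so the key cannot be recomputed from the card number alone.
    The card number itself is never stored.
    """
    digest = hmac.new(site_secret, card_number.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()
```

The same card always yields the same key at a given site, so a stored user image can be retrieved on later use while the stored data carries no directly readable card or personal information.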

Through this, the re-identification-based object matching system 1 can detect abnormal situations, such as a person other than the card's previous user using the card or collecting cash, so that advance action can be taken before an incident occurs. It can also identify warning signs immediately, so that a faster follow-up response is possible when an incident does occur.

Because this service must be provided in an open environment, it is important to differentiate it from person recognition services that look up an individual's personal details, so that privacy is not infringed.

Accordingly, the re-identification-based object matching system 1 provides the service without access to an individual's personal details, which can minimize the exposure of personally identifiable information that could be leaked. That is, the system 1 performs person re-identification from images so that, when a user recognized as different from the card's previous user attempts to withdraw cash, advance measures such as displaying a warning message or reconfirming the transaction can be taken using nothing more than an image.

The re-identification-based object matching system 1 can provide a variety of application solutions. These include access control for spaces that require a security device as an auxiliary means and, by re-identifying people within the same space, management of visitor re-entry and confirmation of passage through entrances inside a space to which admission has been granted, all within a specific site-based space and on condition that privacy is not infringed. In addition, the system 1 can be applied to monitoring situation changes with an unmanned surveillance device and to monitoring facilities or resources that require it, and can be used for auxiliary security measures based on object recognition and re-identification without existing history information.

Meanwhile, re-identification means finding, for an object whose identity (ID) has once been recognized in camera footage, the same object when it is observed again after an interval in time or space. Attempts to apply re-identification technology to CCTV systems, autonomous vehicles, and the like, and to deploy it in service environments, have been increasing recently.

In addition, for security purposes, continuously recognizing and tracking specific people, vehicles, and the like around the clock and all year in public places such as roads, airports, and railways can replace costly human labor. Re-identification technology thus has a wide range of applications and offers convenience to users.

Furthermore, deep-learning-based person re-identification is being actively considered for a variety of cases that require security, and it therefore lends itself to diverse applications. It can be applied to service infrastructure for identifying and tracking specific individuals, such as criminal suspects or missing children, in site-based (local) environments such as industrial facilities and major national facilities, schools and school routes, and apartment crime prevention. It can also be extended to city-scale video-based security infrastructure and to nationwide wide-area suspect search service infrastructure. In practice, there are cases in which it has been applied to video security infrastructure, expanded operation of video security equipment, and crime prevention in places where terrorism or crime may occur, such as subway stations, roads, and airports.

However, some problems must be solved before re-identification technology can be used freely. In an unconstrained space, even the same object may be re-identified as a different object depending on the conditions of observation, such as weather, lighting, camera angle, overlap, and deformation of the object. Conversely, different objects may be re-identified as the same object when resolution is degraded for various reasons. It is therefore very important in re-identification to reduce the differences between instances of the same object and to increase the differences between different objects.

Accordingly, in an embodiment, the above problems can be addressed by performing re-identification using deep learning.

FIG. 2 is a diagram schematically illustrating metric learning and a Siamese network according to an embodiment, and FIG. 3 is a diagram schematically illustrating a triplet network for metric learning according to an embodiment.

As shown in FIG. 2(a), a re-identification method using deep learning may use metric learning. Training is driven by the distances between feature information output from different deep neural networks, and at test time the feature information output from the deep neural network is used to compute the similarity between objects, with the difference expressed as a distance.
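At test time, expressing similarity as a distance means that stored images can simply be ranked by how close their features are to the query. The sketch below assumes the feature vectors have already been produced by a trained network and uses Euclidean distance; the function name `rank_by_distance` and the dictionary layout are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def rank_by_distance(
    query: List[float], gallery: Dict[str, List[float]]
) -> List[Tuple[str, float]]:
    """Return (name, distance) pairs sorted from most to least similar.

    A smaller distance between feature vectors is read as a higher
    similarity between the underlying objects.
    """
    scored = [(name, math.dist(query, feat)) for name, feat in gallery.items()]
    return sorted(scored, key=lambda item: item[1])
```

With a well-trained model, features of the same person captured under different conditions should appear near the top of this ranking.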

One approach proposed for building a robust, general-purpose model for distinguishing particular objects or people is to focus on the fact that the characteristics of two subjects are 'different'.

Accordingly, for recognizing the faces of people as objects, metric learning embeds images into a particular space according to the features of many individuals, so that people can be distinguished by those features. With metric learning, a service that responds to a specific person can be developed, and an artificial intelligence model trained on the faces of many subjects can be applied to a variety of face recognition problems, such as comparing or recognizing specific people.

Even for the same object, when images captured under different conditions are learned, simple 'classification' may fail to assign them to the same class, so the same object may be treated as different objects. For this reason, in one embodiment, rather than simply extracting features and performing 'classification', objects may be grouped and distinguished using a metric that computes a distance representing the similarity between objects.

In other words, metric learning is a training scheme that uses convolutional neural networks and may be implemented with, for example, a triplet network or a Siamese network.

As shown in FIG. 2(b), a Siamese network is a learning model composed of two neural networks that share parameters.

For two input images, the Siamese network assigns a label of '1' or '0' depending on whether they are images of the same object or of different objects, passes them through the neural networks, and maps the images into the space by adjusting the distance between them according to the extracted feature vectors.

Such a Siamese network requires careful selection of training data, and a minimum threshold distance must be set to separate data belonging to different classes. However, it has the advantages that it can be trained with only a small amount of data and that a training data set can be built and used easily without a separate categorizing step.
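As an illustration of the pair labels and the minimum threshold distance described above, the following minimal sketch computes a contrastive loss for one image pair. The loss form, the function name, and the default `margin` are assumptions made for illustration; the disclosure does not state which loss the Siamese network uses.

```python
import math

def contrastive_loss(emb1, emb2, same, margin=1.0):
    """Loss for one pair of embeddings (hypothetical helper).

    same   -- 1 if both images show the same object, 0 otherwise
    margin -- assumed minimum distance required between different objects
    """
    d = math.dist(emb1, emb2)  # Euclidean distance between the two embeddings
    if same == 1:
        # Same object: any remaining distance is penalized.
        return d ** 2
    # Different objects: penalized only while they are closer than the margin.
    return max(0.0, margin - d) ** 2
```

Under these assumptions, a pair of different objects that is already farther apart than `margin` contributes no loss, which is the role the minimum threshold distance plays in the text.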

In comparison, referring to FIG. 3, a triplet network performs metric learning using a triplet loss. The loss is called a triplet loss because it is computed from three values: an anchor image as the reference, a positive image (an image of the same object), and a negative image (an image of a different object).

The triplet network minimizes the distance to the positive image when the input is true and maximizes the distance to the negative image when the input is false; as training proceeds, it estimates the distances between the images in the embedding space.
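The relationship among anchor, positive, and negative can be sketched as follows. This is a commonly used margin-based form of the triplet loss, shown under the assumption that Euclidean distance is used; the function name and the `margin` value are hypothetical and are not taken from the disclosure.

```python
import math

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Margin-based triplet loss for one (anchor, positive, negative) triple."""
    d_pos = math.dist(anchor, positive)  # distance to the same-object image
    d_neg = math.dist(anchor, negative)  # distance to the different-object image
    # Zero once the negative is at least `margin` farther away than the positive.
    return max(0.0, d_pos - d_neg + margin)
```

Minimizing this value pulls the positive toward the anchor and pushes the negative away, which matches the behavior described for FIG. 3.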

To embed the data while considering the anchor-positive pairs and anchor-negative pairs across the entire data set, the triplet network is trained using a list file that links anchors, positives, and negatives.

In the list file, the anchors consist of the list of reference image files, the positives are drawn at random from the reference image file list, and the negatives consist of images different from the anchor's image; these are used for training.
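One way such a list could be assembled is sketched below. The input layout (a mapping from an identity key to that identity's image files), the function name, and the sampling policy are all assumptions for illustration; the disclosure does not describe the list file format.

```python
import random

def build_triplet_list(files_by_identity, rng):
    """Return (anchor, positive, negative) rows of image file names."""
    rows = []
    identities = sorted(files_by_identity)
    for identity in identities:
        files = files_by_identity[identity]
        # Other identities that have at least one image to serve as a negative.
        others = [i for i in identities if i != identity and files_by_identity[i]]
        if len(files) < 2 or not others:
            continue  # a positive needs a second image; a negative needs another identity
        for anchor in files:
            positive = rng.choice([f for f in files if f != anchor])
            negative = rng.choice(files_by_identity[rng.choice(others)])
            rows.append((anchor, positive, negative))
    return rows

# Hypothetical file names grouped by an identity key.
data = {"card_a": ["a1.jpg", "a2.jpg"], "card_b": ["b1.jpg", "b2.jpg"], "card_c": ["c1.jpg"]}
rows = build_triplet_list(data, random.Random(0))
```

Each row could then be written as one line of the list file. Identities with a single image are skipped as anchors here because no positive exists for them.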

Because this method considers pairs across the entire data set, it finds fine-grained features among data of the same class and gives good results when placing images by feature in a suitable embedding space using relative distances. Its drawback is that it requires many computations and a long training time.

The re-identification-based object matching system 1 is intended to provide a practical service that complements existing security systems by using metric-learning-based person re-identification under the constraints of ATM devices and their environment. A practical service here means, for example, configuring the service so that, using only a hash value generated from the card used and the user's image, cases in which a personal card suspected of being stolen is used by someone other than its owner are detected and preemptive measures can be taken.

One example of the constraints of ATM devices and their environment is the limited computing resources used to operate existing devices. One embodiment takes into account that, if training or preprocessing requires too many resources or too much time, operation at a local site may be difficult.

In addition, in one embodiment, the network may be selected and configured considering that the device operates at a local site and must also be able to run on its own computing resources when the network connection to the central server is poor.

From this perspective, the re-identification-based object matching system 1 uses, as an embodiment, a Siamese network among metric learning methods. The Siamese network is a suitable choice because it can operate even when few training images of existing users are available.

Meanwhile, in one embodiment, users may access an application or website implemented on the user terminal 200 to create and train the network of the object matching device 100, check object matching results, and so on.

The user terminal 200 may be a user-operated desktop computer, smartphone, notebook computer, tablet PC, smart TV, mobile phone, personal digital assistant (PDA), laptop, media player, micro server, global positioning system (GPS) device, e-book reader, digital broadcast terminal, navigation device, kiosk, MP3 player, digital camera, home appliance, or other mobile or non-mobile computing device, but is not limited thereto.

In addition, the user terminal 200 may be a wearable terminal with communication and data processing functions, such as a watch, glasses, a hair band, or a ring. The user terminal 200 is not limited to the above, and any terminal capable of web browsing may be used without limitation.

In one embodiment, the object matching device may be implemented as an internal component 100 of an ATM device, a video surveillance device, a security device, or the like; in another embodiment, part or all of it may be implemented in the server 300.

When the device is implemented in the server 300, a plurality of images captured by the ATM device, video surveillance device, security device, or the like, or object matching results, are transmitted to the server 300, and the server 300 may perform object matching based on the plurality of images or correct the detected results. The embodiments below are described on the premise that the re-identification-based object matching device is implemented as an internal component 100 of an ATM device, video surveillance device, security device, or the like.

That is, in one embodiment, the re-identification-based object matching system 1 may be implemented by the object matching device 100 and/or the server 300.

In other words, in one embodiment, the object matching device 100 may be implemented in the server 300. In this case, the server 300 may be a server for operating the re-identification-based object matching system 1 that includes the object matching device 100, or a server that implements part or all of the object matching device 100.

In one embodiment, the server 300 may be a server that controls the operation of the object matching device 100 for the overall process of acquiring an image of a user of the ATM device and performing object matching based on the re-identification metric learning model, so that it can be inferred whether the user currently using the ATM device is using his or her own card.

The server 300 may also be a database server that provides data for operating the object matching device 100. In addition, the server 300 may include a web server, an application server, or a server that provides a deep learning network.

The server 300 may further include a big data server and an AI server needed to apply various artificial intelligence algorithms, a computation server that performs the operations of various algorithms, and the like.

In this embodiment, the server 300 may include the servers described above or may be networked with them. That is, the server 300 may include the web server and AI server described above or may be networked with them.

In the re-identification-based object matching system 1, the object matching device 100 and the server 300 may be connected by a network 400. The network 400 may encompass wired networks such as local area networks (LANs), wide area networks (WANs), metropolitan area networks (MANs), and integrated services digital networks (ISDNs), as well as wireless networks such as wireless LANs, CDMA, Bluetooth, and satellite communication, but the scope of the present disclosure is not limited thereto. The network 400 may also transmit and receive information using short-range and/or long-range communication.

The network 400 may also include connections among network elements such as hubs, bridges, routers, switches, and gateways. The network 400 may include one or more connected networks, for example a multi-network environment, including a public network such as the Internet and a private network such as a secure corporate network. Access to the network 400 may be provided through one or more wired or wireless access networks. Furthermore, the network 400 may support an Internet of Things (IoT) network, in which distributed components such as things exchange and process information, and/or 5G communication.

FIG. 4 is a block diagram schematically illustrating an object matching device according to an embodiment.

Referring to FIG. 4, the object matching device 100 may include a communication unit 110, a camera 120, a user interface 130, a memory 140, and a processor 150.

The communication unit 110 may operate in conjunction with the network 400 to provide the communication interface needed to carry signals exchanged with external devices in the form of packet data. The communication unit 110 may also be a device that includes the hardware and software needed to transmit and receive signals, such as control signals or data signals, over wired or wireless connections with other network devices.

That is, the processor 150 may receive various data or information from an external device connected through the communication unit 110, and may also transmit various data or information to the external device.

The camera 120 may be mounted on the object matching device 100 or installed separately at the location where the object matching device 100 is provided. The camera 120 is not limited to a particular type and may be implemented with any of various devices capable of capturing images.

The camera 120 may be, for example, a pinhole camera mounted in an ATM device and may be implemented to capture images at SD resolution, a size of 320x240, and a field of view of up to 160°. Photos captured by the camera 120 may be encrypted and stored on the local disk of the ATM device and set to be transmitted to the central server at fixed time intervals.

In one embodiment, the user interface 130 may include an input interface for entering user requests and commands that control the operation of the object matching device 100 (for example, changing parameters of the re-identification metric learning model or changing its training conditions). The user interface 130 may also include, for example, an input interface for entering user requests and commands for using the ATM device.

In one embodiment, the user interface 130 may include an output interface that outputs the object matching result or outputs a warning screen when a fraudulent user is detected. The user interface 130 may also include, for example, an output interface that displays the transaction screen of the ATM device. That is, the user interface 130 may output results according to user requests and commands. The input interface and the output interface of the user interface 130 may be implemented in the same interface.

The memory 140 stores various information needed to control (compute) the operation of the object matching device 100 and may store control software; it may include a volatile or non-volatile recording medium.

The memory 140 is connected to one or more processors 150 electrically or through an internal communication interface, and may store code that, when executed by the processor 150, causes the processor 150 to control the object matching device 100.

Here, the memory 140 may include a non-transitory storage medium such as magnetic storage media or flash storage media, or a transitory storage medium such as RAM, but the scope of the present invention is not limited thereto. The memory 140 may include internal memory and/or external memory, and may include volatile memory such as DRAM, SRAM, or SDRAM; non-volatile memory such as one-time programmable ROM (OTPROM), PROM, EPROM, EEPROM, mask ROM, flash ROM, NAND flash memory, or NOR flash memory; an SSD; a flash drive such as a compact flash (CF) card, SD card, Micro-SD card, Mini-SD card, xD card, or memory stick; or a storage device such as an HDD.

The memory 140 may store information related to the algorithm for performing training according to the present disclosure. In addition, various information needed within the scope of achieving the object of the present disclosure may be stored in the memory 140, and the information stored in the memory 140 may be updated as it is received from a server or external device or entered by a user.

The processor 150 may control the overall operation of the object matching device 100. Specifically, the processor 150 is connected to the components of the object matching device 100, including the memory 140, and may control the overall operation of the object matching device 100 by executing at least one instruction stored in the memory 140.

The processor 150 may be implemented in various ways. For example, the processor 150 may be implemented as at least one of an application-specific integrated circuit (ASIC), an embedded processor, a microprocessor, hardware control logic, a hardware finite state machine (FSM), and a digital signal processor (DSP).

The processor 150, as a kind of central processing unit, may control the operation of the object matching device 100 by running the control software loaded in the memory 140. The processor 150 may include any type of device capable of processing data. Here, a 'processor' may refer to, for example, a data processing device embedded in hardware that has physically structured circuitry for performing functions expressed as code or instructions contained in a program.

FIGS. 5 and 6 are diagrams illustrating the overall pipeline of the re-identification-based object matching system according to an embodiment.

Referring to FIGS. 5 and 6, the processor 150 may acquire user image data captured by the camera 120, pass the acquired user image data through a deep learning neural network, and infer whether it matches the data set of the deep learning neural network. The user image may be an upper-body image that includes the face.

This deep learning neural network may be trained to map images of the same person to nearby regions of a multidimensional space, and is hereinafter referred to as the re-identification metric learning model.

That is, based on upper-body data of users who have previously used the object matching device 100, the processor 150 may randomly extract two photos from among the several images captured of one person and input them to the re-identification metric learning model.

Based on the re-identification metric learning model, the processor 150 may then infer that the persons in the two different photos are the same person. Here, the processor 150 may place images captured of the same person in nearby regions of the space so that the distance between the two photos is small.

Conversely, for images captured of two different users selected at random, the processor 150 may infer that the persons in the two images are different people. The processor 150 may therefore place the two photos far apart so that the distance between them becomes large, training the deep neural network to recognize the features of each person.
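Once same-person images lie close together and different-person images lie far apart, a match decision can be reduced to a distance comparison. The sketch below assumes that embeddings are plain lists of floats and that a fixed `threshold` separates the two cases; both the function name and the threshold value are hypothetical.

```python
import math

def is_same_person(new_embedding, stored_embedding, threshold=0.5):
    """Return True when two embeddings are close enough to be treated as one person."""
    return math.dist(new_embedding, stored_embedding) <= threshold
```

In practice the threshold would have to be tuned on held-out pairs, since it trades missed detections against false alarms.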

In one embodiment, the object matching device 100 may be provided in an ATM device. Given that an ATM device is equipped with a CPU, memory, and a CCTV camera by default, it may be designed to operate at the local site without a connection to the main server, even offline when the network is down, based on the already trained re-identification metric learning model and the usage history remaining in memory.

That is, because security must function without problems even when the device operates independently at the local site, one embodiment may minimize data preprocessing.

The processor 150 may acquire a user image when the user uses the object matching device 100 or an ATM device equipped with the object matching device 100.

That is, when an ATM device equipped with the object matching device 100 is used, the processor 150 captures the entire process from the start of the user's transaction through cash deposit and withdrawal.

For example, a user using an ATM device equipped with the object matching device 100 may mean that the entire process is carried out with a card owned by the individual: card insertion, amount selection, PIN entry, communication with the bank, cash receipt, and completion of the transaction.

In one embodiment, photos of the user of the ATM device may be captured automatically by the camera 120 during the above process. As described above, the camera 120 may be a pinhole camera built into the upper part of the ATM device.

For example, given the design of the ATM device, the processor 150 captures the user's upper body with the pinhole camera built into the upper part of the device, and may be programmed in advance so that a photo is taken when a specific event occurs according to pre-stored conditions. Fewer than 10 photos may be taken in a single transaction cycle.

Meanwhile, once a captured user image is acquired, the processor 150 may apply image-crop preprocessing to remove background that is unnecessary for distinguishing people, extracting only the specific, detailed features of the face.
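The crop step can be pictured with the minimal sketch below, which treats an image as a list of pixel rows. The bounding box is assumed to be supplied by some earlier step that is not described in the disclosure; the function name and box values are hypothetical.

```python
def crop(image, box):
    """Crop a row-major image (list of rows).

    box -- (left, top, right, bottom) in pixels; right and bottom are exclusive
    """
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]

# A blank 320x240 frame, matching the camera size mentioned in the description.
frame = [[0] * 320 for _ in range(240)]
face_region = crop(frame, (100, 40, 220, 200))  # hypothetical face/upper-body box
```

With this example box the cropped region holds 120 x 160 pixels, a quarter of the original 320 x 240 frame, which illustrates the data size reduction that cropping provides.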

That is, in the algorithm for the re-identification metric learning model, if unnecessary regions are excluded and more specific features of the user are extracted to distinguish people, more precise identification becomes possible with less data.

In addition, by significantly reducing data size through crop preprocessing, the processor 150 can increase processing capacity and speed, and may additionally perform model optimization suited to the constrained environment at the local site.

When the user image is acquired, the processor 150 may save the user's photo file with a file name that includes card information such as the card number of the inserted card, the transaction amount, transaction date, transaction time, transaction ID, device information, and photo index. As described above, the photo file may be stored in the server 300 and/or the object matching device 100.
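A file name carrying these fields might be composed and parsed as in the sketch below. The field order, the underscore separator, the `.jpg` extension, and all names are assumptions for illustration; the disclosure lists the fields but not the naming format.

```python
from typing import NamedTuple

class PhotoRecord(NamedTuple):
    card_key: str        # card information, e.g. a value derived from the card number
    amount: str          # transaction amount
    date: str            # transaction date
    time: str            # transaction time
    transaction_id: str  # transaction ID
    device_id: str       # device information
    index: str           # photo index within the transaction

def build_photo_filename(record):
    """Join the fields with underscores; fields must not contain underscores."""
    return "_".join(record) + ".jpg"

def parse_photo_filename(name):
    """Recover the fields from a name produced by build_photo_filename."""
    stem = name.rsplit(".", 1)[0]
    return PhotoRecord(*stem.split("_"))

record = PhotoRecord("ab12cd", "50000", "20220405", "093015", "T0001", "B012-D03", "02")
name = build_photo_filename(record)
```

Because every field is recoverable from the name alone, the label for an image can be derived without consulting a separate database.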

In one embodiment, the data stored in this way may be used automatically for image labeling and the like. That is, in one embodiment, on the assumption that each person uses only one card, software may be built to label images automatically by card number.

In one embodiment, without additional work such as transmitting the photos to the server 300 after capture for data preprocessing, labeling and classification may be performed automatically by the processor 150 and the memory 140 in the ATM device through the construction of a folder tree.
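Labeling through a folder tree can be sketched as follows, assuming that each photo is stored under a directory named after its card key so that the parent directory name serves as the class label. The directory layout and function name are hypothetical.

```python
from collections import defaultdict
from pathlib import PurePosixPath

def group_by_folder(paths):
    """Map each parent folder name (the label) to the file names it contains."""
    groups = defaultdict(list)
    for p in paths:
        path = PurePosixPath(p)
        groups[path.parent.name].append(path.name)
    return dict(groups)

labels = group_by_folder([
    "photos/key1/a.jpg",
    "photos/key1/b.jpg",
    "photos/key2/c.jpg",
])
```

No image content is inspected here; the label comes entirely from where the file is placed, which keeps the work on the device small.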

In one embodiment, the data set collected as described above may define each person as one class. For a pair of images of the same person or of different people, the processor 150 may assign a label of 1 if the classes are the same and 0 if they differ. The processor 150 may then represent the two input images as a tuple of the form (image1, image2, class). This is a key part of the Siamese network: it places images of the same class close together and images of different classes far apart.
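The (image1, image2, class) tuples can be generated from a class-grouped data set as in this minimal sketch. Enumerating every pair is an assumption made to keep the example deterministic; a real pipeline might sample pairs instead, and the function name is hypothetical.

```python
from itertools import combinations

def make_pairs(images_by_class):
    """Return (image1, image2, label) tuples; label is 1 for same class, else 0."""
    labeled = [(img, cls) for cls in sorted(images_by_class) for img in images_by_class[cls]]
    return [
        (a, b, 1 if class_a == class_b else 0)
        for (a, class_a), (b, class_b) in combinations(labeled, 2)
    ]

pairs = make_pairs({"key1": ["a1.jpg", "a2.jpg"], "key2": ["b1.jpg"]})
# pairs == [("a1.jpg", "a2.jpg", 1), ("a1.jpg", "b1.jpg", 0), ("a2.jpg", "b1.jpg", 0)]
```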

Meanwhile, so that customers' sensitive personal information generated at the ATM device can be transmitted to the server 300 safely and used for artificial intelligence (AI) training without security concerns, in one embodiment the images may be transformed at the local ATM device so that they are not identifiable to a person but remain identifiable to the AI process.

That is, the processor 150 may perform hash encryption. A hash, also called a hash value, is a value obtained by mapping data of varying length to data of fixed length. Using it, the index or position in an array can be stored or looked up from the value of the data to be entered. Whereas conventional data structures can take linear time for search or insertion, a hash allows the location to store or look up to be referenced immediately, so processing is faster.

Hashing refers to converting particular data into shorter data that represents it. The representative data changes markedly even when the original data changes only slightly, which helps greatly in preserving integrity. For example, the strings 'A' and 'B' differ by only one letter, but their hash results are completely different strings.

In addition, a hash fundamentally cannot be decrypted. Because the set of possible inputs is larger than the set of possible outputs, the input cannot be determined from a given output. That is, the number of input values that can produce the same output value is mathematically infinite. If a hash result could be decrypted, that would amount to a compression algorithm with an infinite compression ratio. Hashes are designed so that they cannot be decrypted, and in practice they provide strong security in that an attacker cannot easily recover the input.

Hash encryption can convert plaintext into ciphertext but cannot convert ciphertext back into plaintext. Because only encryption is possible, it can be called one-way encryption. Typically, a hash encryption scheme based on a cryptographic hash function may be used.

A hash function is a function that converts data of arbitrary length into data of a fixed length. The output always has the same fixed length regardless of the length of the input, and the same input always produces the same output.

Hash functions are divided into cryptographic hash functions and non-cryptographic hash functions. Representative cryptographic hash function algorithms include MD5 and SHA.
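The properties described above (fixed-length output, determinism, and a completely different result for a one-letter change) can be illustrated with a minimal sketch. It assumes SHA-256 from Python's standard `hashlib` module; the disclosure names only MD5 and SHA in general, so the specific algorithm and the helper name `sha256_hex` are illustrative choices.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of `text` as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


digest_a = sha256_hex("A")
digest_b = sha256_hex("B")
digest_long = sha256_hex("A" * 10_000)

# Fixed-length output: both digests have the same number of hex characters.
print(len(digest_a), len(digest_long))
# Determinism: hashing the same input again yields the same digest.
print(digest_a == sha256_hex("A"))
# A one-letter change in the input yields a different digest.
print(digest_a != digest_b)
```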

In one embodiment, an artificial intelligence model is downloaded from the artificial intelligence server 300 and trained by machine learning on large-scale training data (photos, card numbers, transaction amounts, etc.) generated by ATM devices, which handle security-sensitive personal user data, and the training results are periodically delivered to the server to improve performance.

That is, at daily opening (initialization), the processor 150 may download the artificial intelligence model from the artificial intelligence server 300, run machine learning based on data associated with the unique device number (branch number/device number) of the ATM device, and update the artificial intelligence model/feature map (processing method) that is optimal for each device. The unique device number may refer to the unique number of the branch where the device is installed and the unique number of the device itself.

The processor 150 may then share the portions of the model changed by the financial training data with the artificial intelligence server 300, where they are aggregated to update the shared model. Thereafter, the processor 150 may download the artificial intelligence model to the ATM device again and repeat the process, thereby improving overall artificial intelligence performance.

Accordingly, in one embodiment, each ATM device learns locally to produce an improved artificial intelligence model and shares only that improvement, which makes it possible to provide personalized artificial intelligence services that also make use of data containing extensive personal information or data that cannot be transmitted externally.
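The share-and-aggregate cycle described above can be sketched as follows. The disclosure does not state how the server combines the changed model portions; this sketch assumes a simple element-wise mean of the per-device parameter changes, and the names `parameter_delta` and `aggregate` are hypothetical.

```python
from typing import Dict, List

Params = Dict[str, List[float]]


def parameter_delta(global_params: Params, local_params: Params) -> Params:
    """Return only the change a device made to each parameter."""
    return {
        name: [local - shared for local, shared in zip(local_params[name], values)]
        for name, values in global_params.items()
    }


def aggregate(global_params: Params, deltas: List[Params]) -> Params:
    """Add the element-wise mean of the device deltas to the shared model (assumed rule)."""
    updated: Params = {}
    for name, values in global_params.items():
        updated[name] = [
            value + sum(delta[name][i] for delta in deltas) / len(deltas)
            for i, value in enumerate(values)
        ]
    return updated


shared = {"w": [0.0, 1.0]}
device_a = {"w": [0.2, 1.0]}  # parameters after local training on device A
device_b = {"w": [0.0, 1.4]}  # parameters after local training on device B
deltas = [parameter_delta(shared, device_a), parameter_delta(shared, device_b)]
new_shared = aggregate(shared, deltas)
print(new_shared)
```

Only the deltas leave each device in this sketch, which mirrors the point that the underlying personal data is not transmitted.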

FIG. 7 is an exemplary diagram illustrating the network structure of the re-recognition distance learning model of an object matching device according to an embodiment, and FIG. 8 is an exemplary diagram schematically illustrating collaborative learning of the object matching device according to an embodiment.

In one embodiment, a Siamese network, which is one type of distance learning neural network, is trained so that images are placed in a multidimensional space according to person similarity and label values. When a new photo arrives, the same person is expected to be embedded in a nearby region of that space, which makes it possible to check whether the photo shows the same person.

The Siamese network-based re-recognition algorithm may be structured as shown in FIG. 7. The re-recognition algorithm of an embodiment may consist of two networks that share weights.

The convolutional neural network of each network may consist of two convolutional layers and three fully connected (FC) layers. The two convolutional layers may consist of four 3 × 3 kernels and eight 3 × 3 kernels, respectively.

ReLU (Rectified Linear Unit) may be used as the activation function in every layer. After the two convolutional layers, 1,800,000 feature values are obtained for each image, and passing through the three FC layers reduces these to 500 and then 5 feature values.

That is, in the Siamese network-based re-recognition algorithm, two images form a pair that serves as the input data. Each image is input to one of two sub-networks that share weights, and the distance between the two embedding vectors output by the sub-networks is computed in order to map the images into a multidimensional space.
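The pairing structure described above can be sketched in a few lines. The sub-network here is a toy stand-in (one linear layer followed by ReLU), not the convolutional network of FIG. 7, and the function names are hypothetical; the point is only that both images pass through the same weights before the distance is computed.

```python
import math
from typing import List, Sequence


def embed(image: Sequence[float], weights: Sequence[Sequence[float]]) -> List[float]:
    """Toy stand-in for a sub-network: one linear layer followed by ReLU."""
    return [max(0.0, sum(w * x for w, x in zip(row, image))) for row in weights]


def euclidean_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pair_distance(image_a, image_b, weights) -> float:
    """Embed both images with the SAME weights, then measure their distance."""
    return euclidean_distance(embed(image_a, weights), embed(image_b, weights))


shared_weights = [[1.0, 0.0], [0.0, 1.0]]
print(pair_distance([3.0, 0.0], [0.0, 4.0], shared_weights))
```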

As shown in FIG. 8, the processor 150 may transmit the training data of the two weight-sharing networks described above to the central server 300, either periodically or whenever new data is added, so that the central data is refreshed.

That is, in one embodiment, in conjunction with the training data of the central server 300, each ATM device can produce the same inference result through the training data of its networks, which share the weights of the central server 300.

In addition, in one embodiment, based on the same inference result, images are learned so as to be placed in the multidimensional space of the central server 300 according to person similarity and label values. When a new photo arrives, the same person is expected to be embedded in a nearby region of that space, which makes it possible to check whether the photo shows the same person.

The re-recognition algorithm of an embodiment may use a contrastive loss as its loss function. The contrastive loss may be expressed as in Equation 1 below.

The contrastive loss operates on a tuple (x_i, x_j, y_ij) formed from the two input images x_i and x_j, and is defined with respect to the embedding network f as in Equation 1.

y_ij may be defined as 1 if x_i and x_j belong to the same class and 0 if they belong to different classes. That is, if the two data items belong to the same class, only the first term of the expression remains, and if they belong to different classes, only the second term remains. The value a, a hyperparameter also called the margin, causes two data items that belong to different classes to be separated by a distance of at least a. For example, a may be 2.
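For reference, one widely used form of the contrastive loss that matches this description is sketched below. The symbol d_ij and the squared terms are assumptions taken from the standard formulation, and the exact expression of Equation 1 may differ.

```latex
d_{ij} = \lVert f(x_i) - f(x_j) \rVert_2, \qquad
L(x_i, x_j, y_{ij}) = y_{ij}\, d_{ij}^{2} + \left(1 - y_{ij}\right) \max\!\left(0,\; a - d_{ij}\right)^{2}
```

With y_ij = 1 only the first term remains, and with y_ij = 0 only the second term remains, which becomes zero once d_ij is at least the margin a.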

That is, the contrastive loss pulls images of the same class closer together and pushes images of different classes farther apart.
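A minimal numerical sketch of this behavior follows. It assumes the standard squared form of the contrastive loss, y·d² + (1 − y)·max(0, a − d)², which may differ in detail from Equation 1; the function name and the default margin of 2 (the example value given above) are illustrative.

```python
def contrastive_loss(distance: float, same_class: bool, margin: float = 2.0) -> float:
    """Standard squared contrastive loss for one pair (assumed form).

    Same class: the loss grows with the distance, pulling the pair together.
    Different class: the loss is positive only while the distance is below
    the margin, pushing the pair apart until it is at least `margin` away.
    """
    y = 1.0 if same_class else 0.0
    return y * distance ** 2 + (1.0 - y) * max(0.0, margin - distance) ** 2


print(contrastive_loss(0.5, same_class=True))    # same class, small distance
print(contrastive_loss(0.5, same_class=False))   # different class, too close
print(contrastive_loss(3.0, same_class=False))   # different class, beyond the margin
```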

In one embodiment, the data may be arbitrarily split into non-overlapping training data and test data, which are used for learning and verification, respectively.

Meanwhile, in one embodiment, null hypothesis testing, which is a statistical technique, may be applied as shown in Table 1 below to determine the accuracy at each Euclidean distance.

An inference is judged 'correct' when images of the same user are inferred to be the same user and images of different users are inferred to be different users. It is judged 'incorrect' when images of the same user are inferred to be different users or images of different users are inferred to be the same user.

In one embodiment, accuracy by distance may be defined as the number of successful inferences on pairs of images randomly selected from the test data, divided by the total number of cases and expressed as a percentage.

In one embodiment, when the re-recognition distance learning model is constructed, a threshold on the Euclidean distance between the two input images is set: a distance no greater than a certain value is treated as 'matched (inferred to be the same person)', and a distance of at least a certain value is treated as 'not matched (inferred to be different people)'. The accuracy can then be computed while varying the threshold.

For example, with the Euclidean distance in the range of 0.25 to 0.50, the accuracy, that is, the proportion of inferences judged correct, may reach about 90%.
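The accuracy-by-distance computation described above can be sketched as follows. The pair data is made up purely for illustration, the function names are hypothetical, and a distance equal to the threshold is treated here as 'not matched', which is one possible reading of the boundary case.

```python
from typing import Dict, List, Sequence, Tuple

# (Euclidean distance between the two images, whether they show the same user)
Pair = Tuple[float, bool]


def accuracy_at_threshold(pairs: List[Pair], threshold: float) -> float:
    """Percentage of pairs whose matched / not-matched inference is correct."""
    correct = sum((distance < threshold) == same_user for distance, same_user in pairs)
    return 100.0 * correct / len(pairs)


def sweep(pairs: List[Pair], thresholds: Sequence[float]) -> Dict[float, float]:
    """Accuracy by distance for each candidate threshold."""
    return {t: accuracy_at_threshold(pairs, t) for t in thresholds}


# Illustrative test pairs only, not measured data.
pairs = [(0.1, True), (0.3, True), (0.6, True), (0.4, False), (1.2, False), (2.0, False)]
for threshold, accuracy in sweep(pairs, [0.25, 0.35, 0.5, 1.0]).items():
    print(threshold, round(accuracy, 1))
```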

FIG. 9 is an exemplary view of a result screen of an object matching device according to an embodiment.

Referring to FIG. 9, a result screen of the object matching device 100 based on the re-recognition distance learning model can be seen.

For example, when the threshold is set to 1.0 and the object matching result (the comparison between the previously stored user image and the image of the user currently using the ATM device) yields a distance of 3.316, the distance is greater than the threshold of 1.0, so the result may be determined to be 'not matched (inferred to be different people)'.

Likewise, when the object matching result yields a distance of 0.174, the distance is smaller than the threshold of 1.0, so the result may be determined to be 'matched (inferred to be the same person)'.
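The two cases above amount to a single comparison against the threshold, sketched below. The function name is hypothetical, and treating a distance equal to the threshold as 'not matched' is an assumption about the boundary case.

```python
def match_decision(distance: float, threshold: float = 1.0) -> str:
    """Compare an embedding distance with the threshold."""
    return "not matched" if distance >= threshold else "matched"


print(match_decision(3.316))  # distance above the threshold of 1.0
print(match_decision(0.174))  # distance below the threshold of 1.0
```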

That is, accuracy by distance may vary with the type and size of the data, and an appropriate threshold distance (threshold over Euclidean distance) can be set according to the object discrimination accuracy required by the application service, which allows optimization for that service.

For example, a person re-recognition algorithm based on the re-recognition distance learning model can be applied in a variety of situations that require security.

For instance, in one embodiment, where the security level is low, applying too high an accuracy impairs ease of use, so an appropriate threshold may be applied for inference in consideration of both accuracy and convenience. Conversely, where high security is required, two images must not be judged to show the same person if there is even a small discrepancy, so the threshold may be set low so that the same person is inferred only with high accuracy.

In addition, in one embodiment, to verify the re-recognition distance learning model, the k-NN (k-nearest neighbor) algorithm may be used to check whether images of the same person have been mapped to nearby regions of the space.

The k-NN algorithm outputs the k images with the smallest Euclidean distance to an arbitrary image, which makes it possible to check whether images of the same person have been mapped to nearby regions of the space.
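The k-NN check described above can be sketched as a brute-force search over stored embeddings. The embeddings and identifiers below are invented for illustration, and `k_nearest` is a hypothetical helper; `math.dist` (Python 3.8+) computes the Euclidean distance.

```python
import math
from typing import Dict, List, Sequence, Tuple


def k_nearest(
    query_id: str, embeddings: Dict[str, Sequence[float]], k: int
) -> List[Tuple[str, float]]:
    """Return the k images closest to `query_id` by Euclidean distance."""
    query = embeddings[query_id]
    distances = [
        (other_id, math.dist(query, vector))
        for other_id, vector in embeddings.items()
        if other_id != query_id
    ]
    return sorted(distances, key=lambda item: item[1])[:k]


# Illustrative 2-D embeddings; a trained model would produce these vectors.
embeddings = {
    "alice_1": (0.0, 0.0),
    "alice_2": (0.1, 0.0),
    "alice_3": (0.0, 0.2),
    "bob_1": (3.0, 3.0),
    "bob_2": (3.1, 2.9),
}
neighbors = k_nearest("alice_1", embeddings, k=2)
print([image_id for image_id, _ in neighbors])
```

If the model has learned a useful embedding, the neighbors of an image should mostly be other images of the same person, as in this toy data.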

Meanwhile, FIG. 10 is a diagram for explaining an object matching method for an occluded face image according to an embodiment. In one embodiment, when a user image occluded by a mask or the like is acquired, identification can still be forced if the face of the cardholder is known.

In one embodiment, the ATM device may have at least two built-in pinhole cameras: a face camera and a receiving-section camera. The face camera may be installed near a location, such as the card slot, that the user inevitably looks at while performing an action required to use the ATM device, so that capture of the face is optimized.

In addition, in one embodiment, the time at which each camera captures an image may be provided as a setting. By default, capture of the person may be set to begin when a card is inserted and the transaction starts.

More specifically, once the process starts and the camera is activated, Face & Body Landmark Detection (an algorithmic model that detects facial and body landmarks) is run first, and a photo may be set to be taken when a person is fully detected according to the reference points (head, eyes, ears, nose, mouth, shoulders, etc.) from which a human shape can be recognized.

Here, in one embodiment, even if not all face and body landmarks are detected and detection of some landmarks fails, it may be set so that a person is considered detected and the photo is taken, provided that the person can be inferred from the other landmarks.

In one embodiment, the photo taken in this way is checked to identify whether the eyes, ears, nose, and mouth of the face have all been detected, and is thereby classified as either a photo in which part of the face is covered by a mask or other accessory or a photo in which the entire face is exposed.
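The classification step described above can be sketched as a set comparison over the detected landmark names. The landmark detector itself is outside the scope of this sketch; the label strings and the name `classify_face` are hypothetical.

```python
from typing import Set

# Facial landmarks that must all be found for a face to count as fully exposed.
FACE_LANDMARKS = frozenset({"eyes", "ears", "nose", "mouth"})


def classify_face(detected: Set[str]) -> str:
    """Classify a photo from the landmark names a detector reported."""
    missing = FACE_LANDMARKS - detected
    return "fully exposed" if not missing else "partially occluded"


print(classify_face({"head", "eyes", "ears", "nose", "mouth", "shoulders"}))
print(classify_face({"head", "eyes", "ears", "shoulders"}))  # e.g. a mask hides nose and mouth
```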

In one embodiment, a photo in which the entire face is exposed, so that person recognition is straightforward, may be sent directly to the Siamese network-based re-recognition distance learning model to determine whether it shows the same person as the existing photo data of the card user.

However, in one embodiment, a photo determined to have part of the face covered is sent to a Face Inpainting GAN (Generative Adversarial Network), a deep learning neural network separate from the re-recognition distance learning model.

The Face Inpainting GAN generates an image of the inferred full face by compositing, onto the newly input partially occluded image, a partial image generated from the existing image of the card user.

In one embodiment, the image completed by inference in this way may be input to the Siamese network-based re-recognition distance learning model to determine whether it shows the same person.

In this case, because the occluded part was generated from the existing image, that part may cause the measured distance to the existing user image to come out smaller.

Therefore, in one embodiment, the threshold of the re-recognition distance learning model may be lowered in inverse proportion to the area of the image filled in by inference, so that dissimilar parts are discriminated more sensitively.
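One possible way to lower the threshold as the inpainted area grows is sketched below. The disclosure does not give a formula; dividing the base threshold by (1 + sensitivity × filled fraction) is an assumed scaling rule, and `adjusted_threshold`, `filled_fraction`, and `sensitivity` are hypothetical names.

```python
def adjusted_threshold(
    base_threshold: float, filled_fraction: float, sensitivity: float = 1.0
) -> float:
    """Lower the matching threshold as more of the face is filled in by inference.

    `filled_fraction` is the inpainted area divided by the face area (0 to 1).
    The scaling rule is an assumption chosen so that no inpainting leaves the
    threshold unchanged and more inpainting makes the match stricter.
    """
    if not 0.0 <= filled_fraction <= 1.0:
        raise ValueError("filled_fraction must be between 0 and 1")
    return base_threshold / (1.0 + sensitivity * filled_fraction)


for fraction in (0.0, 0.25, 0.5, 1.0):
    print(fraction, adjusted_threshold(1.0, fraction))
```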

That is, in one embodiment, even for a user image in which the face is covered by a mask or other accessory, passing the image through the Face & Body Landmark Detection layer and the Face Inpainting GAN layer makes person matching and identification possible.

FIG. 11 is a flowchart illustrating an object matching method according to an embodiment.

Referring to FIG. 11, in step S100, the processor 150 acquires a first image of an object for object re-recognition when an event occurs in a specific space for the object matching device 100.

In one embodiment, the object matching device 100 may be provided inside an ATM device or may be the ATM device itself. The specific space for the object matching device 100 may refer to the area where the ATM device is installed. The occurrence of an event in the specific space for the object matching device 100 may refer, for example, to the entire process in which a user inserts a card into the ATM device and completes a transaction.

However, embodiments are not limited thereto. An image of a vehicle may be acquired in order to re-recognize a vehicle passing through a specific space, or images of people may be acquired for re-recognition for admission or security management in a multi-use facility.

In step S200, the processor 150 extracts unique identification information of the object in the first image corresponding to the specific space.

Here, the unique identification information may include, for example, card information such as the card number of the card inserted by the user of the ATM device, the transaction amount, transaction date, transaction time, transaction identification number, and device information. A user image captured of the user may also be included in the unique identification information.

After extracting the unique identification information, the processor 150 may encrypt it to generate a hash value of the unique identification information. The processor 150 may then store the first image with the hash value of the unique identification information as the label of the first image.
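The labeling step described above can be sketched as follows. This is a minimal illustration under stated assumptions: SHA-256 is used as the hash, only fields that stay constant across transactions (a card number and a device identifier, both made-up values) are hashed so that the same card yields the same label, and the in-memory dictionary stands in for whatever storage the device uses.

```python
import hashlib
import json
from typing import Dict


def identification_label(info: Dict[str, str]) -> str:
    """Hash the identification fields into a fixed-length label."""
    # Sorting the keys makes the label independent of field order.
    canonical = json.dumps(info, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def store_image(store: Dict[str, bytes], info: Dict[str, str], image: bytes) -> str:
    """Store the image under the hash label instead of the raw card data."""
    label = identification_label(info)
    store[label] = image
    return label


image_store: Dict[str, bytes] = {}
info = {"card_number": "0000-1111-2222-3333", "device_id": "branch-001/atm-07"}
label = store_image(image_store, info, b"<first image bytes>")
print(label in image_store, len(label))
```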

In step S300, based on the unique identification information of the object in the first image corresponding to the specific space, when a pre-stored second image corresponding to the unique identification information is detected, the processor 150 uses the pre-trained re-recognition distance learning model to determine whether the first image and the second image show the same object.

That is, in one embodiment, the unique identification information of the user (card information, etc.) is checked, and if there is a history of the ATM device having been used with that card, distance learning-based re-recognition is performed on the previously captured image (second image) and the currently captured image (first image).

Here, the re-recognition distance learning model may have been trained to map images into a space by adjusting the distances between the feature information output by at least two different deep neural networks that share parameters.

Based on a preset threshold for determining whether objects are the same, the processor 150 may determine that the objects are different if the distance between the embedding vectors of the first image and the second image is greater than or equal to the threshold. Here, the threshold may be set lower the higher the accuracy required for object matching.

In step S400, when the objects in the first image and the second image are determined to be different objects (No in step S300), the processor 150 generates a warning event.
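Steps S300 and S400 can be sketched as a lookup followed by a distance comparison. The sketch assumes the images have already been converted to embedding vectors by the trained model and that stored embeddings are keyed by the hash label; the function name and the returned strings are hypothetical.

```python
import math
from typing import Dict, Sequence


def match_object(
    first_embedding: Sequence[float],
    label: str,
    stored_embeddings: Dict[str, Sequence[float]],
    threshold: float = 1.0,
) -> str:
    """Compare the current image with the stored image that has the same label."""
    second_embedding = stored_embeddings.get(label)
    if second_embedding is None:
        # No usage history for this label: keep the first image for later comparison.
        stored_embeddings[label] = first_embedding
        return "no prior image"
    distance = math.dist(first_embedding, second_embedding)
    if distance >= threshold:
        return "warning"  # S400: judged to be a different object
    return "same object"


history: Dict[str, Sequence[float]] = {}
print(match_object((0.0, 0.0), "label-1", history))  # first use of this card
print(match_object((0.1, 0.0), "label-1", history))  # close to the stored embedding
print(match_object((3.0, 3.0), "label-1", history))  # far from the stored embedding
```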

Meanwhile, in one embodiment, the pre-trained re-recognition distance learning model may be trained through a training phase. The training phase may include: randomly extracting, from the collected object images, as many images as there are sub-networks in the re-recognition distance learning model; assigning a label value according to whether the objects in the extracted images are the same object; forming the extracted images into one group and generating one tuple value that includes the label value; and inputting the tuple value, computing the distance between the embedding vectors inferred by the respective sub-networks, and mapping the images into a multidimensional space.

In addition, in one embodiment, the mapping step of the training phase may include placing an image group labeled as the same object in a nearby region where the distances between the images are small, and placing an image group labeled as different objects far apart so that the distances between the images increase.
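The tuple-generation part of the training phase can be sketched as follows for a model with two sub-networks. The image identifiers and identities are invented, `sample_tuple` is a hypothetical helper, and the label convention (1 for the same object, 0 otherwise) follows the definition of y_ij given earlier.

```python
import random
from typing import List, Sequence, Tuple

Sample = Tuple[str, str]  # (image identifier, object identity)


def sample_tuple(samples: Sequence[Sample], rng: random.Random) -> Tuple[str, str, int]:
    """Randomly draw two distinct images and label the pair 1 (same object) or 0."""
    (image_i, identity_i), (image_j, identity_j) = rng.sample(list(samples), 2)
    label = 1 if identity_i == identity_j else 0
    return image_i, image_j, label


samples: List[Sample] = [
    ("img_001", "user_a"),
    ("img_002", "user_a"),
    ("img_003", "user_b"),
    ("img_004", "user_c"),
]
rng = random.Random(0)  # fixed seed so the sketch is repeatable
training_tuples = [sample_tuple(samples, rng) for _ in range(5)]
for image_i, image_j, label in training_tuples:
    print(image_i, image_j, label)
```

Each tuple would then be passed through the two weight-sharing sub-networks, and the resulting embedding distance and label would drive the loss.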

In one embodiment, the re-recognition distance learning model may be updated on a per-site basis, so that updates optimized for a specific area and/or for a device installed in that area are performed.

Upon initialization of the object matching device, the processor 150 may receive the pre-trained re-recognition distance learning model from the server and perform inference with the pre-trained re-recognition distance learning model based on the unique information data assigned to the specific space for the object matching device.

The processor 150 may then share with the server the parameters of the pre-trained re-recognition distance learning model that were changed by data newly acquired in the specific space for the object matching device, so that the pre-trained re-recognition distance learning model is updated.

Meanwhile, in one embodiment, re-recognition may also be performed for a person whose face is partially covered, for example by a mask.

To this end, after acquiring the first image, the processor 150 may detect the reference points from which a human shape can be recognized using a face and body landmark detector, and, if some of the reference points of the full face in the first image are not identifiable, may infer and restore the occluded part using a GAN (Generative Adversarial Network) learning model.

Here, the processor 150 may generate an image of the inferred full face based on the first image containing the occluded part and the pre-stored second image corresponding to the unique identification information of the first image for the specific space. In one embodiment, so that parts made dissimilar by the inference and restoration are discriminated sensitively, the filled-in area of the inferred and restored image may be calculated, and the preset threshold used by the re-recognition distance learning model to determine whether objects are the same may be lowered in inverse proportion to that area.

The embodiments according to the present disclosure described above may be implemented in the form of a computer program executable on a computer through various components, and such a computer program may be recorded on a computer-readable medium. The medium may include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory.

Meanwhile, the computer program may be specially designed and configured for the present disclosure, or may be known and available to those skilled in the art of computer software. Examples of computer programs include not only machine code such as that produced by a compiler but also high-level language code that can be executed by a computer using an interpreter or the like.

In the specification of the present disclosure (particularly in the claims), the term "the" and similar referring terms may correspond to both the singular and the plural. In addition, where a range is described in the present disclosure, it includes inventions to which the individual values within the range are applied (unless stated otherwise), and is equivalent to stating each individual value constituting the range in the detailed description.

Unless an order is explicitly stated for the steps constituting the method according to the present disclosure, or stated to the contrary, the steps may be performed in any suitable order. The present disclosure is not necessarily limited by the order in which the steps are described. The use of any examples or exemplary terms (for example, "etc.") in the present disclosure is merely to describe the present disclosure in detail, and the scope of the present disclosure is not limited by those examples or terms unless limited by the claims. In addition, those skilled in the art will appreciate that various modifications, combinations, and changes may be made according to design conditions and factors within the scope of the appended claims or their equivalents.

Accordingly, the spirit of the present disclosure should not be limited to the embodiments described above, and not only the claims set forth below but also all scopes equivalent to, or equivalently modified from, those claims fall within the scope of the spirit of the present disclosure.

1: Re-recognition-based object matching system
100: Object matching device
110: Communication unit
120: Camera
130: User interface
140: Memory
150: Processor
200: User terminal
300: Server
400: Network

Claims

A re-identification-based object matching method for performing re-identification for a specific area based on metric learning, in which at least part of each step is performed by a processor, the method comprising:
obtaining a first image of an object for object re-identification when an event occurs in a specific space associated with an object matching device;
extracting unique identification information of the object corresponding to the specific space; and
when a pre-stored second image corresponding to the unique identification information is detected based on the unique identification information of the object corresponding to the specific space, determining whether the first image and the second image show the same object using a pre-trained re-identification metric learning model.

The object matching method of claim 1, further comprising:
after extracting the unique identification information, generating a hash value for the unique identification information by encrypting the unique identification information; and
storing the first image with the hash value for the unique identification information as a label of the first image.
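
The hashing and labeling steps of the preceding claim can be sketched as follows. This is a minimal illustration only: the claim does not name a hash algorithm or a storage layer, so SHA-256 and the in-memory dictionary `store` are assumptions, and `label_for` and `store_first_image` are hypothetical names.

```python
import hashlib


def label_for(unique_id: str) -> str:
    """Return a hex digest of the unique identification information.

    SHA-256 is an assumption; the claim only requires "a hash value".
    """
    return hashlib.sha256(unique_id.encode("utf-8")).hexdigest()


def store_first_image(store: dict, unique_id: str, image_bytes: bytes) -> str:
    """Store the image under the hash label and return that label."""
    label = label_for(unique_id)
    # Several images may accumulate under the same label over time.
    store.setdefault(label, []).append(image_bytes)
    return label


store = {}
label = store_first_image(store, "example-id-0001", b"raw image bytes")
```

Because the label is a digest, the stored images are indexed without keeping the raw identifier next to them. For low-entropy identifiers, a keyed hash such as HMAC may be preferable to a plain digest.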

The object matching method of claim 1, further comprising generating a warning event when, as a result of determining whether the first image and the second image show the same object, the objects are determined to be different.

The object matching method of claim 1, wherein the re-identification metric learning model is trained to map images onto a space by adjusting the distances between pieces of feature information output from at least two different deep neural networks that share parameters.

The object matching method of claim 4, wherein determining whether the first image and the second image show the same object comprises determining, based on a preset threshold for determining whether objects are the same, that the objects are different when the distance between the embedding vectors of the first image and the second image is greater than or equal to the threshold, and
wherein the threshold is set lower as higher accuracy is required for object matching.
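
The threshold decision in the preceding claim can be sketched as below. This is an illustrative assumption rather than the claimed implementation: the claim does not specify the distance measure, so Euclidean distance is assumed, and `is_same_object` is a hypothetical name.

```python
import math


def is_same_object(emb_a, emb_b, threshold):
    """Return True when two embedding vectors are closer than the threshold.

    A distance greater than or equal to the threshold means "different object".
    Euclidean distance is an assumption; the claim only says "distance".
    """
    distance = math.dist(emb_a, emb_b)
    return distance < threshold


# A lower threshold is stricter: the same pair can pass at 5.1 but fail at 5.0.
loose = is_same_object([0.0, 0.0], [3.0, 4.0], threshold=5.1)
strict = is_same_object([0.0, 0.0], [3.0, 4.0], threshold=5.0)
```

Lowering the threshold reduces the chance of accepting a different person as the same one, at the cost of rejecting more genuine matches.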

The object matching method of claim 4, wherein the pre-trained re-identification metric learning model is trained through a training phase, the training phase comprising:
randomly extracting, from among collected object images, a number of images equal to the number of sub-networks of the re-identification metric learning model;
assigning a label value according to whether the objects in the extracted images are the same object;
generating one tuple value that includes the label value by grouping the extracted images into one group; and
inputting the tuple value, calculating the distance between the embedding vectors inferred by the respective sub-networks, and mapping the images onto a multidimensional space.
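
The sampling, labeling, and tuple-building steps of the training phase above might look like the following sketch. It assumes two sub-networks, a label of 1 for "same object" and 0 otherwise, and a dataset given as `(image, object_id)` pairs. None of these conventions are stated in the claim, and `make_training_tuple` is a hypothetical name.

```python
import random


def make_training_tuple(images, num_subnetworks=2, rng=random):
    """Build one training tuple (image_1, ..., image_n, label).

    images: list of (image, object_id) pairs (assumed data layout).
    Label convention (assumed): 1 if all sampled images show the same
    object, 0 otherwise.
    """
    # Draw as many images as there are sub-networks, without replacement.
    picked = rng.sample(images, num_subnetworks)
    first_id = picked[0][1]
    same = all(obj_id == first_id for _, obj_id in picked)
    label = 1 if same else 0
    return tuple(img for img, _ in picked) + (label,)


same_pair = make_training_tuple([("a.png", "p1"), ("b.png", "p1")])
diff_pair = make_training_tuple([("a.png", "p1"), ("c.png", "p2")])
```

Each tuple would then be fed to the sub-networks, one image per sub-network, with the label driving how the embedding distance is adjusted.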

The object matching method of claim 6, wherein the mapping comprises placing an image group labeled as the same object in a nearby region of the space so that the distance between its images is small, and placing an image group labeled as different objects in a distant region so that the distance between its images becomes large.
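
One common way to obtain the pull-together and push-apart behavior described above is a contrastive loss. The claim does not name a loss function, so the sketch below is an assumption: `contrastive_loss`, the `margin` parameter, and the label convention (1 for same object, 0 for different) are illustrative choices.

```python
def contrastive_loss(distance, label, margin=1.0):
    """Contrastive loss for one pair of embeddings.

    label == 1 (same object): penalize any distance, pulling the pair together.
    label == 0 (different objects): penalize only distances below the margin,
    pushing the pair apart until it is at least `margin` away.
    """
    pull = label * distance ** 2
    push = (1 - label) * max(0.0, margin - distance) ** 2
    return pull + push


near_same = contrastive_loss(0.5, 1)   # same object, still some loss to reduce
near_diff = contrastive_loss(0.5, 0)   # different objects that are too close
far_diff = contrastive_loss(2.0, 0)    # different objects beyond the margin
```

Minimizing this quantity over many labeled tuples moves same-object images closer and different-object images farther apart in the embedding space.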

The object matching method of claim 1, further comprising:
receiving the pre-trained re-identification metric learning model from a server when the object matching device is initialized;
performing inference with the pre-trained re-identification metric learning model based on unique information data assigned to the specific space associated with the object matching device; and
updating the pre-trained re-identification metric learning model by sharing with the server the parameters of the pre-trained re-identification metric learning model that have changed due to data newly obtained in the specific space associated with the object matching device.

The object matching method of claim 1, further comprising:
after obtaining the first image, detecting, through a face and body landmark detector, reference points from which the shape of a person can be determined; and
when some of the reference points of the entire face in the first image are not identifiable, inferring and restoring the occluded portion through a generative adversarial network (GAN) learning model.

The object matching method of claim 9, wherein inferring and restoring the occluded portion comprises generating an image in which the entire face is inferred, based on the first image having the occluded portion and on the pre-stored second image corresponding to the unique identification information of the first image for the specific space.

The object matching method of claim 10, further comprising calculating the filled-in area of the image on which the inference and restoration were performed, and setting the preset threshold, which the re-identification metric learning model uses to determine whether objects are the same, lower in inverse proportion to the area, so that portions that are dissimilar as a result of the inference and restoration can be discriminated sensitively.
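
The claim above does not give a formula for lowering the threshold, so the following is only one possible reading. It assumes the filled-in area is expressed as a ratio of the face region in [0, 1] and divides the base threshold by a term that grows with that ratio. `adjusted_threshold` and `sensitivity` are hypothetical names.

```python
def adjusted_threshold(base_threshold, filled_ratio, sensitivity=1.0):
    """Lower the matching threshold as more of the face was synthesized.

    filled_ratio: restored area divided by face area, in [0, 1] (assumed).
    sensitivity: how strongly the restored area tightens the threshold (assumed).
    With filled_ratio == 0 the base threshold is returned unchanged.
    """
    if not 0.0 <= filled_ratio <= 1.0:
        raise ValueError("filled_ratio must be between 0 and 1")
    return base_threshold / (1.0 + sensitivity * filled_ratio)


unchanged = adjusted_threshold(1.0, 0.0)
tightened = adjusted_threshold(1.0, 0.5, sensitivity=2.0)
```

The larger the restored region, the smaller the distance that is still accepted as a match, which offsets the extra similarity that restoration from the stored second image may introduce.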

A re-identification-based object matching device for performing re-identification for a specific area based on metric learning, the device comprising:
a memory; and
at least one processor connected to the memory and configured to execute computer-readable instructions contained in the memory,
wherein the at least one processor is configured to perform:
an operation of obtaining a first image of an object for object re-identification when an event occurs in a specific space associated with the object matching device;
an operation of extracting unique identification information of the object corresponding to the specific space; and
an operation of, when a pre-stored second image corresponding to the unique identification information is detected based on the unique identification information of the object corresponding to the specific space, determining whether the first image and the second image show the same object using a pre-trained re-identification metric learning model.

The object matching device of claim 12, wherein the at least one processor is further configured to perform:
an operation of, after extracting the unique identification information, generating a hash value for the unique identification information by encrypting the unique identification information; and
an operation of storing the first image with the hash value for the unique identification information as a label of the first image.

The object matching device of claim 12, wherein the at least one processor is further configured to perform an operation of generating a warning event when, as a result of determining whether the first image and the second image show the same object, the objects are determined to be different.

The object matching device of claim 12, wherein the re-identification metric learning model is trained to map images onto a space by adjusting the distances between pieces of feature information output from at least two different deep neural networks that share parameters.

The object matching device of claim 15, wherein the operation of determining whether the first image and the second image show the same object comprises an operation of determining, based on a preset threshold for determining whether objects are the same, that the objects are different when the distance between the embedding vectors of the first image and the second image is greater than or equal to the threshold, and
wherein the threshold is set lower as higher accuracy is required for object matching.

The object matching device of claim 15, wherein the pre-trained re-identification metric learning model is trained through a training phase, the training phase comprising:
randomly extracting, from among collected object images, a number of images equal to the number of sub-networks of the re-identification metric learning model;
assigning a label value according to whether the objects in the extracted images are the same object;
generating one tuple value that includes the label value by grouping the extracted images into one group; and
inputting the tuple value, calculating the distance between the embedding vectors inferred by the respective sub-networks, and mapping the images onto a multidimensional space.

The object matching device of claim 17, wherein the mapping comprises an operation of placing an image group labeled as the same object in a nearby region of the space so that the distance between its images is small, and placing an image group labeled as different objects in a distant region so that the distance between its images becomes large.

The object matching device of claim 12, wherein the at least one processor is further configured to perform:
an operation of receiving the pre-trained re-identification metric learning model from a server when the object matching device is initialized;
an operation of performing inference with the pre-trained re-identification metric learning model based on unique information data assigned to the specific space associated with the object matching device; and
an operation of updating the pre-trained re-identification metric learning model by sharing with the server the parameters of the pre-trained re-identification metric learning model that have changed due to data newly obtained in the specific space associated with the object matching device.

The object matching device of claim 12, wherein the at least one processor is further configured to perform:
an operation of, after obtaining the first image, detecting, through a face and body landmark detector, reference points from which the shape of a person can be determined; and
an operation of, when some of the reference points of the entire face in the first image are not identifiable, inferring and restoring the occluded portion through a generative adversarial network (GAN) learning model.