KR20240069908A

KR20240069908A - Apparatus and method for virtualization multi-infrastructure camera

Info

Publication number: KR20240069908A
Application number: KR1020220150865A
Authority: KR
Inventors: 송석일; 김성종; 이하은
Original assignee: 주식회사 위드라이브
Priority date: 2022-11-11
Filing date: 2022-11-11
Publication date: 2024-05-21

Abstract

The present invention relates to a multi-infrastructure camera virtualization device and method, and more specifically, to a device and method that virtualize multiple infrastructure cameras so that the same object can be tracked continuously, as if it were being detected by a single infrastructure camera.

Description

Apparatus and method for virtualizing multiple infrastructure cameras {APPARATUS AND METHOD FOR VIRTUALIZATION MULTI-INFRASTRUCTURE CAMERA}

C-ITS (Cooperative-Intelligent Transport System) uses V2X communication technology to provide real-time road situation information to autonomous vehicles and human drivers, with the aim of helping vehicles respond quickly to unexpected situations and continue driving safely.

In Korea, research is under way to extend the perception range of autonomous vehicles: road infrastructure sensors (cameras, lidar, etc.) recognize objects on the road and share them with autonomous vehicles in real time to enable Level 4 autonomous driving.

Some studies have developed technology that detects road hazards in real time from the video of infrastructure cameras connected to the edge and provides the information to vehicles using V2X (Chen, 2017). A road infrastructure camera has a limited detection area, so multiple infrastructure sensors may be installed at geographically adjacent locations to cover a wide range.

When multiple infrastructure cameras are installed along a road, each sensor may recognize the same object as a different object. FIG. 3 shows a real example in which infrastructure camera 1 and infrastructure camera 2 recognize the same object. The object is first recognized in the detection area of infrastructure camera 1, keeps moving, and is recognized by infrastructure camera 2 after a certain time. If the two detections can be recognized as the same object, the object's trajectory can be predicted, and its intention inferred, without interruption even during the handover from infrastructure camera 1 to infrastructure camera 2.

Republic of Korea Patent No. 10-2302132 (2012.02.02)

Accordingly, the present invention has been proposed to address the problem described above, and its purpose is to provide a multi-infrastructure camera virtualization device and method that virtualize multiple infrastructure cameras so that the same object can be tracked continuously, as if it were being detected by a single infrastructure camera.

The objects of the present invention are not limited to those mentioned above, and other objects not mentioned will be clearly understood from the description below by those of ordinary skill in the art to which the present invention pertains.

To achieve the above object, a method by which a multi-infrastructure camera virtualization device according to an embodiment of the present invention continuously tracks the same object may include: calculating an IoU (Intersection over Union) between the MBB (Minimum Bounding Box) of an object detected by a current infrastructure camera at a first time point and the MBB of an object at the position predicted for the first time point based on a previous infrastructure camera and a trajectory prediction model; calculating, based on an SNN (Siamese Neural Network), a similarity between the MBB of the object last detected by the previous infrastructure camera before the first time point and the MBB of the object when it is first detected by the current infrastructure camera at the first time point; determining the weights assigned to the IoU and to the similarity according to the separation distance between the previous infrastructure camera and the current infrastructure camera; and determining, based on the weighted IoU and the weighted similarity, whether the object detected by the previous infrastructure camera and the object detected by the current infrastructure camera are the same object.
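The IoU step above can be illustrated with a minimal Python sketch. It assumes that both MBBs are axis-aligned rectangles expressed in one shared coordinate frame, which the description does not state explicitly; the names `MBB`, `area`, and `iou` are hypothetical and chosen only for this illustration.

```python
from typing import NamedTuple


class MBB(NamedTuple):
    """Axis-aligned minimum bounding box (assumed: shared coordinate frame)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def area(box: MBB) -> float:
    """Area of a box; a degenerate or inverted box counts as zero area."""
    width = max(0.0, box.x_max - box.x_min)
    height = max(0.0, box.y_max - box.y_min)
    return width * height


def iou(detected: MBB, predicted: MBB) -> float:
    """Intersection over Union of the detected MBB and the predicted MBB."""
    # The intersection rectangle is bounded by the innermost edges of both boxes.
    intersection = MBB(
        max(detected.x_min, predicted.x_min),
        max(detected.y_min, predicted.y_min),
        min(detected.x_max, predicted.x_max),
        min(detected.y_max, predicted.y_max),
    )
    inter_area = area(intersection)
    union_area = area(detected) + area(predicted) - inter_area
    return inter_area / union_area if union_area > 0 else 0.0
```

Under these assumptions, a value near 1 would mean that the object seen by the current camera lies almost exactly where the trajectory prediction model placed the object last seen by the previous camera, and a value of 0 would mean the two boxes do not overlap at all.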

A multi-infrastructure camera virtualization device according to another embodiment may include: a memory unit that stores information about a pre-trained artificial intelligence model; a database that stores a trained trajectory prediction model and a trained SNN (Siamese Neural Network); and a processor that executes an operation of calculating an IoU (Intersection over Union) between the MBB (Minimum Bounding Box) of an object detected by a current infrastructure camera at a first time point and the MBB of an object at the position predicted for the first time point based on a previous infrastructure camera and the trajectory prediction model, an operation of calculating, based on the SNN, a similarity between the MBB of the object last detected by the previous infrastructure camera before the first time point and the MBB of the object when it is first detected by the current infrastructure camera at the first time point, an operation of determining the weights assigned to the IoU and to the similarity according to the separation distance between the previous infrastructure camera and the current infrastructure camera, and an operation of determining, based on the weighted IoU and the weighted similarity, whether the object detected by the previous infrastructure camera and the object detected by the current infrastructure camera are the same object.
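The weighting and decision operations can be sketched as follows. The text only states that the weights depend on the separation distance between the two cameras; the exponential decay, the 50 m scale, the 0.5 threshold, and the direction of the dependence (less trust in the predicted position as the gap grows) are all assumptions made for this illustration, and every function name is hypothetical.

```python
import math
from typing import Tuple


def weights_for_distance(distance_m: float, scale_m: float = 50.0) -> Tuple[float, float]:
    """Return (w_iou, w_similarity), which always sum to 1.

    Assumption: the IoU weight decays exponentially with camera separation,
    so the appearance similarity dominates for widely separated cameras.
    """
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    w_iou = math.exp(-distance_m / scale_m)
    return w_iou, 1.0 - w_iou


def is_same_object(iou_value: float, similarity: float,
                   distance_m: float, threshold: float = 0.5) -> bool:
    """Fuse IoU and similarity with distance-dependent weights, then threshold."""
    w_iou, w_sim = weights_for_distance(distance_m)
    score = w_iou * iou_value + w_sim * similarity
    return score >= threshold
```

With these assumed settings, adjacent cameras (distance close to 0) decide almost entirely on the IoU, while cameras hundreds of meters apart decide almost entirely on the SNN similarity. A deployed system would need to choose the weighting function and threshold from its own data.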

According to the multi-infrastructure camera virtualization device and method of an embodiment of the present invention, multiple infrastructure cameras are virtualized so that the same object can be tracked continuously, as if it were being detected by a single infrastructure camera.

The effects of the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood from the description below by those of ordinary skill in the art to which the present invention pertains.

FIG. 1 is a block diagram showing the configuration of a multi-infrastructure camera virtualization device according to an embodiment of the present invention.
FIG. 2 is a structural diagram illustrating the process by which a multi-infrastructure camera virtualization device according to an embodiment of the present invention tracks the same object.
FIG. 3 is a photograph showing an example in which two infrastructure cameras recognize the same object.
FIGS. 4 to 6 are conceptual diagrams for explaining IoU.
FIG. 7 is a conceptual diagram showing the process and concept of determining whether objects are identical using an SNN.
FIG. 8 shows an example of the detection areas of the previous infrastructure camera and the current infrastructure camera.
FIG. 9 is a flowchart for explaining a multi-infrastructure camera virtualization method according to an embodiment of the present invention.

The objects and effects of the present invention, and the technical configurations for achieving them, will become clear with reference to the embodiments described in detail below together with the accompanying drawings. In describing the present invention, detailed descriptions of well-known functions or configurations are omitted where they would unnecessarily obscure the gist of the invention. The terms used below are defined in consideration of their structure, role, and function in the present invention, and may vary according to the intention or custom of a user or operator.

However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. The embodiments are provided only so that the disclosure of the present invention is complete and so that those of ordinary skill in the art are fully informed of the scope of the invention; the present invention is defined only by the scope of the claims. The definitions of the terms should therefore be made based on the content of this specification as a whole.

Throughout the specification, when a part is said to “include” a component, this means that it may further include other components, rather than excluding them, unless specifically stated otherwise.

Hereinafter, preferred embodiments of the present invention will be described in more detail with reference to the accompanying drawings.

Hereinafter, the present invention will be described in detail with reference to the accompanying drawings.

FIG. 1 is a block diagram showing the configuration of a multi-infrastructure camera virtualization device 100 according to an embodiment of the present invention.

The multi-infrastructure camera virtualization device 100 is a device capable of performing machine learning using training data, and may include a device that learns using a model composed of an artificial neural network.

That is, the multi-infrastructure camera virtualization device 100 may be configured to receive, classify, store, and output information to be used for data mining, data analysis, intelligent decision-making, and machine learning algorithms. Here, the machine learning algorithms may include deep learning algorithms.

The multi-infrastructure camera virtualization device 100 may communicate with at least one external device (not shown) or terminal (not shown), and may derive results by analyzing or learning from data on behalf of, or in support of, the external device. Here, supporting another device may mean distributing computing power through distributed processing.

The multi-infrastructure camera virtualization device 100 is any of various devices for training an artificial neural network; it may typically be a server, and may be referred to as a neural network learning device or a neural network learning server.

In particular, the multi-infrastructure camera virtualization device 100 may be implemented not only as a single server but also as a set of multiple servers, a cloud server, or a combination thereof.

That is, multiple multi-infrastructure camera virtualization devices 100 may together form a neural network learning device set (or cloud server), and at least one multi-infrastructure camera virtualization device 100 in the set may derive results by analyzing or learning from data through distributed processing.

The multi-infrastructure camera virtualization device 100 may transmit a model trained by machine learning or deep learning to an external device (not shown) periodically or upon request.

Referring to FIG. 1, the multi-infrastructure camera virtualization device 100 may include a communication unit 110, an input unit 120, a memory 130, a learning processor 140, a power supply unit 150, and a processor 160.

The communication unit 110 may refer to a configuration that includes a wireless communication unit (not shown) and an interface unit (not shown). That is, the communication unit 110 may transmit and receive data to and from other devices through wired or wireless communication or through an interface.

The input unit 120 may acquire training data for model training, or input data from which an output is to be obtained using a trained model.

The input unit 120 may also acquire raw input data, in which case the learning processor 140 or the processor 160 may preprocess the acquired data to generate training data or preprocessed input data that can be fed to model training.

Here, the preprocessing of input data performed by the input unit 120 may mean extracting input features from the input data.

The input unit 120 may also acquire data by receiving it through the communication unit 110.

The memory 130 may store a model trained by the learning processor 140 or by the multi-infrastructure camera virtualization device 100.

Where necessary, the memory 130 may store the trained model as multiple versions, distinguished by training time point or training progress.
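One way to picture version-wise storage is the small Python sketch below. It is not taken from the description: the `ModelVersion` and `ModelStore` names, the use of ISO-8601 timestamps, and the representation of training progress as a fraction are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ModelVersion:
    trained_at: str   # assumed: ISO-8601 UTC timestamp, e.g. "2022-11-11T09:00:00Z"
    progress: float   # assumed: fraction of planned training completed, 0.0 to 1.0
    params: bytes     # serialized model parameters (format left open)


@dataclass
class ModelStore:
    """Keeps every saved version of each named model."""
    versions: Dict[str, List[ModelVersion]] = field(default_factory=dict)

    def save(self, name: str, version: ModelVersion) -> None:
        self.versions.setdefault(name, []).append(version)

    def latest(self, name: str) -> Optional[ModelVersion]:
        """Most recently trained version, or None if the model is unknown."""
        candidates = self.versions.get(name, [])
        # Same-format ISO-8601 strings sort chronologically as plain text.
        return max(candidates, key=lambda v: v.trained_at) if candidates else None
```

Keeping all versions rather than overwriting would allow an earlier checkpoint to be restored if a later training run degrades tracking quality, although the description does not say how the stored versions are used.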

The memory 130 may also store the input data acquired by the input unit 120, the training data used for model training, the training history of the model, and the like.

The input data stored in the memory 130 may be not only data processed to suit model training but also the raw input data itself.

The memory 130 may include a model storage unit 131 and a database 132.

The model storage unit 131 stores a neural network model (or artificial neural network, 131a) that is being trained or has been trained by the learning processor 140, and stores the updated model whenever the model is updated through training. Where necessary, the model storage unit 131 may store the trained model as multiple versions, distinguished by training time point or training progress.

The artificial neural network 131a shown in FIG. 1 is only one example of an artificial neural network that includes a plurality of hidden layers, and the artificial neural network of the present invention is not limited to it.

The artificial neural network 131a may be implemented in hardware, software, or a combination of the two. When part or all of the artificial neural network 131a is implemented in software, the one or more instructions that constitute the artificial neural network 131a may be stored in the memory 130.

The database 132 may store the input data acquired by the input unit 120, the training data used for model training, the training history of the model, and the like.

The input data stored in the database 132 may be not only data processed to suit model training but also the raw input data itself.

In one embodiment, the database 132 may store a trained trajectory tracking model and a trained SNN model.

The learning processor 140 may train the artificial neural network 131a using training data or a training set.

The learning processor 140 may train the artificial neural network 131a either by directly receiving data that the processor 160 has preprocessed from the input data acquired through the input unit 120, or by retrieving preprocessed input data stored in the database 132.

Specifically, the learning processor 140 may determine optimized model parameters of the artificial neural network 131a by training it repeatedly using the various learning techniques described above.

In this specification, an artificial neural network whose parameters have been determined by training on training data may be referred to as a learning model or a trained model.

The learning model may infer result values while mounted on the artificial neural network learning device 100, or it may be transmitted through the communication unit 110 to another device, such as a terminal or an external device, and mounted there.

When the learning model is updated, the updated learning model may likewise be transmitted through the communication unit 110 to another device, such as a terminal or an external device, and mounted there.

The learning model may also be used to infer result values for new input data rather than training data.

The learning processor 140 may be configured to receive, classify, store, and output information to be used for data mining, data analysis, intelligent decision-making, and machine learning algorithms and techniques.

The learning processor 140 may include a memory that is integrated into or implemented in the multi-infrastructure camera virtualization device 100. In some embodiments, the learning processor 140 may be implemented using the memory 130.

Alternatively or additionally, the learning processor 140 may be implemented using a memory maintained in a cloud computing environment, or another remote memory location that a terminal can access through a communication method such as a network.

The learning processor 140 may generally be configured to store data in one or more databases in order to identify, index, categorize, manipulate, store, retrieve, and output data for use in supervised or unsupervised learning, data mining, predictive analytics, or other machines. Here, the database may be implemented using the memory 130, a memory maintained in a cloud computing environment, or another remote memory location that a terminal can access through a communication method such as a network.

The information stored in the learning processor 140 may be used by the processor 160 with any of a variety of data analysis algorithms and machine learning algorithms.

Examples of such algorithms include k-nearest neighbor systems, fuzzy logic, neural networks, Boltzmann machines, vector quantization, pulsed neural networks, support vector machines, maximum margin classifiers, hill climbing, inductive logic systems, Bayesian networks, Petri nets (e.g., finite state machines, Mealy machines, Moore finite state machines), classifier trees (e.g., perceptron trees, support vector trees, Markov trees, decision tree forests, random forests), reading models and systems, artificial fusion, sensor fusion, image fusion, reinforcement learning, augmented reality, pattern recognition, and automated planning.

The processor 160 may determine or predict at least one executable operation of the multi-infrastructure camera virtualization device 100 based on information determined or generated using data analysis and machine learning algorithms. To this end, the processor 160 may request, retrieve, receive, or use the data of the learning processor 140, and may control the multi-infrastructure camera virtualization device 100 to execute, among the at least one executable operation, the operation that is predicted or the operation that is judged to be desirable.

The processor 160 may perform various functions that implement intelligent emulation (i.e., knowledge-based systems, inference systems, and knowledge acquisition systems). This may be applied to various types of systems (e.g., fuzzy logic systems), including adaptive systems, machine learning systems, and artificial neural networks.

The processor 160 may also include submodules that enable operations involving speech and natural language processing, such as an I/O processing module, an environmental condition module, a speech-to-text (STT) processing module, a natural language processing module, a workflow processing module, and a service processing module.

Each of these submodules may have access to one or more systems, or to data and models, in the terminal, or to a subset or superset of them. Each of these submodules may also provide various functions, including a lexical index, user data, a workflow model, a service model, and an automatic speech recognition (ASR) system.

In other embodiments, the processor 160 or other aspects of the multi-infrastructure camera virtualization device 100 may be implemented with the submodules, systems, or data and models described above.

In some examples, based on the data of the learning processor 140, the processor 160 may be configured to detect and sense a requirement based on a contextual condition or on the user's intent as expressed in user input or natural language input.

The processor 160 may actively elicit and obtain the information needed to fully determine the requirement based on the contextual condition or the user's intent. For example, the processor 160 may actively elicit the information needed to determine the requirement by analyzing past data, including historical inputs and outputs, pattern matching, disambiguated words, and input intent.

The processor 160 may determine a task flow for executing a function that responds to the requirement based on the contextual condition or the user's intent.

프로세서(160)는 러닝 프로세서(140)에서 프로세싱 및 저장을 위한 정보를 수집하기 위해, 단말기에서 하나 이상의 감지 컴포넌트를 통해 분석 및 기계 학습 작업에 사용되는 신호 또는 데이터를 수집, 감지, 추출, 검출 및/또는 수신하도록 구성될 수 있다.Processor 160 collects, detects, extracts, detects, and detects signals or data used for analysis and machine learning tasks through one or more sensing components in the terminal to collect information for processing and storage in learning processor 140. /or may be configured to receive.

정보 수집은 센서를 통해 정보를 감지하는 것, 메모리(130)에 저장된 정보를 추출하는 것 또는 통신 수단을 통해 외부 단말기, 엔티티 또는 외부 저장 장치로부터 정보를 수신하는 것을 포함할 수 있다.Information collection may include detecting information through a sensor, extracting information stored in memory 130, or receiving information from an external terminal, entity, or external storage device through communication means.

프로세서(160)는 다중 인프라 카메라 가상화 장치(100)에서 사용 히스토리 정보를 수집하여 메모리(130)에 저장할 수 있다.The processor 160 may collect usage history information from the multi-infrastructure camera virtualization device 100 and store it in the memory 130.

프로세서(160)는 저장된 사용 히스토리 정보 및 예측 모델링을 사용하여 특정 기능을 실행하기 위한 최상의 매치를 결정할 수 있다.Processor 160 may use stored usage history information and predictive modeling to determine the best match to execute a particular function.

프로세서(160)는 입력부로부터 이미지 정보(또는 해당 신호), 오디오 정보(또는 해당 신호), 데이터 또는 사용자 입력 정보를 수신할 수 있다.The processor 160 may receive image information (or a corresponding signal), audio information (or a corresponding signal), data, or user input information from the input unit.

프로세서(160)는 정보를 실시간으로 수집하고, 정보(예를 들어, 지식 그래프, 명령 정책, 개인화 데이터베이스, 대화 엔진 등)를 처리 또는 분류하고, 처리된 정보를 메모리(130) 또는 러닝 프로세서(140)에 저장할 수 있다.The processor 160 collects information in real time, processes or classifies the information (e.g., knowledge graph, command policy, personalization database, conversation engine, etc.), and stores the processed information in memory 130 or learning processor 140. ) can be saved in .

다중 인프라 카메라 가상화 장치(100)의 동작이 데이터 분석 및 기계 학습 알고리즘 및 기술에 기초하여 결정될 때, 프로세서(160)는 결정된 동작을 실행하기 위해 다중 인프라 카메라 가상화 장치(100)의 구성 요소를 제어할 수 있다. 그리고, 프로세서(160)는 제어 명령에 따라 다중 인프라 카메라 가상화 장치(100)를 제어하여 결정된 동작을 수행할 수 있다.When the operation of the multi-infrastructure camera virtualization device 100 is determined based on data analysis and machine learning algorithms and techniques, the processor 160 controls the components of the multi-infrastructure camera virtualization device 100 to execute the determined operation. You can. Additionally, the processor 160 may control the multi-infrastructure camera virtualization device 100 according to the control command to perform the determined operation.

When a specific operation is performed, the processor 160 may analyze history information indicating the execution of that operation through data analysis and machine learning algorithms and techniques, and may update previously learned information based on the analyzed information.

Accordingly, the processor 160, together with the learning processor 140, may improve the accuracy of the future performance of the data analysis and machine learning algorithms and techniques based on the updated information.

The power supply unit 150 includes a device that, under the control of the processor 160, receives external power and internal power and supplies power to each component included in the multi-infrastructure camera virtualization device 100.

The power supply unit 150 also includes a battery, which may be a built-in battery or a replaceable battery.

Specifically, a method of searching for similar images in consideration of the object relationship features learned by the learning processor 140 is described below.

FIG. 2 is a structural diagram illustrating a process in which a multi-infrastructure camera virtualization device according to an embodiment of the present invention tracks the same object, and FIG. 3 is a photograph showing an example in which two infrastructure cameras recognize the same object.

Referring to FIG. 2, the method of tracking the same object through the multi-infrastructure camera virtualization device according to an embodiment of the present invention broadly includes a process of calculating an IoU and a process of calculating a similarity, and may include a process of determining whether objects are the same object based on the calculated IoU and similarity.

In one embodiment, the method of tracking the same object through the multi-infrastructure camera virtualization device 100 may be performed at the time the current infrastructure camera detects an object that has entered its detection area for the first time.

First, as the method of calculating the IoU (Intersection over Union) shown at the top of FIG. 2, the processor 160 detects an object that has entered the detection area of the current infrastructure camera for the first time (Cropped object Image 2 by current infra camera).

Next, according to the position of the object detected by the current infrastructure camera, the processor 160 may identify the previous infrastructure camera, that is, the camera whose detection area is adjacent in the corresponding direction.

In other words, the processor 160 may select the previous infrastructure camera that, at an earlier time point, covered the nearest area lying either along the same line as the direction in which the object detected by the current infrastructure camera is moving or along the direction in which the object's movement continues.

Next, from among the objects that moved from the selected previous infrastructure camera toward the detection area of the current infrastructure camera, the processor 160 may select those that have not yet been recognized as the same object through the current infrastructure camera.

Next, for each selected object, the processor 160 may input the object's past N pieces of position information (a list) into a trajectory prediction manager to predict the object's position at the time the current infrastructure camera detected the object.

Here, the trajectory prediction manager may be a trajectory prediction model based on LSTM (Long Short-Term Memory). The past N pieces of position information of the object may be obtained from the previous infrastructure camera.

Here, LSTM is a special kind of RNN (Recurrent Neural Network) that is capable of learning long-term dependencies; it was explicitly designed to avoid the long-term dependency problem. An RNN loops over itself so that information obtained in earlier steps persists. Because of this chain-like nature, RNNs have been used successfully in fields such as speech recognition, language modeling, translation, and image captioning. The key to these successes is the use of the "Long Short-Term Memory network" (hereinafter, LSTM). Every RNN has the form of a chain of repeating neural network modules. In a basic RNN, the repeating module has a very simple structure, for example a single tanh layer.

An LSTM has the same chain-like structure, but its repeating module is structured differently. Instead of a single neural network layer, it has four layers that exchange information with one another in a particular way. Since LSTM-based trajectory prediction models are well known, a detailed description is omitted here.
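For reference, the four interacting layers mentioned above are usually written as three sigmoid gates and one tanh layer. The following is the standard textbook formulation, not something specified in this document; the symbols (input x_t, hidden state h_t, cell state C_t, weights W and biases b) are the conventional ones and are assumptions here.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) && \text{(forget gate)}\\
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) && \text{(input gate)}\\
\tilde{C}_t &= \tanh\left(W_C\,[h_{t-1}, x_t] + b_C\right) && \text{(candidate cell state)}\\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) && \text{(output gate)}\\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t\\
h_t &= o_t \odot \tanh(C_t)
\end{aligned}
```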

In other words, based on the past N pieces of position information, the processor 160 may use the trajectory prediction model to predict the position of the object at the first time point, that is, the time at which the current infrastructure camera first detected the object.
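The interface of this prediction step can be sketched as follows. The sketch deliberately substitutes a constant-velocity extrapolation for the LSTM-based model, purely to show the inputs (past N timestamped positions from the previous camera) and the output (the position at the first time point). The function name `predict_position` and the `(t, x, y)` sample layout are hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[float, float, float]  # (timestamp, x, y); layout is an assumption


def predict_position(history: List[Sample], t_target: float) -> Tuple[float, float]:
    """Extrapolate the position at t_target from the past N samples.

    Stand-in for the LSTM trajectory model: uses the average velocity
    between the oldest and newest samples (constant-velocity assumption).
    """
    if len(history) < 2:
        raise ValueError("need at least two past positions")
    t_old, x_old, y_old = history[0]
    t_new, x_new, y_new = history[-1]
    dt = t_new - t_old
    if dt <= 0:
        raise ValueError("timestamps must be strictly increasing")
    vx = (x_new - x_old) / dt
    vy = (y_new - y_old) / dt
    lead = t_target - t_new  # how far ahead of the last observation we predict
    return x_new + vx * lead, y_new + vy * lead
```

The longer `lead` becomes, the less reliable any such extrapolation is, which is the motivation the description gives for pairing prediction with image similarity.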

After the prediction is complete, the processor 160 calculates the IoU (Intersection over Union) between the MBB (Minimum Bounding Box) of each object at its predicted position and the MBB of the object detected by the current infrastructure camera.

Here, the IoU is the area of the intersection of the MBB at the predicted object position and the MBB detected by the current infrastructure camera, divided by the area of their union.

Specifically, the concept of IoU is as follows.

In FIG. 4, the blue rectangle is the bounding box indicating the ground truth, which is the MBB of the vehicle detected by the current infrastructure camera. The yellow rectangle is the MBB of the object predicted by the prediction model. The region where the ground truth and the prediction overlap is called the intersection. FIG. 5 shows this in simplified form.

Expressed as a formula, the IoU is as shown in FIG. 6: the union of the ground-truth bounding box and the predicted bounding box is the denominator, and their intersection is the numerator.
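Since FIG. 6 is not reproduced in the text, the same definition can be written out as follows. The symbols B_gt (the MBB detected by the current infrastructure camera) and B_pred (the MBB at the predicted position) are notation introduced here for illustration, and |·| denotes area.

```latex
\mathrm{IoU}(B_{\mathrm{gt}}, B_{\mathrm{pred}})
  = \frac{\lvert B_{\mathrm{gt}} \cap B_{\mathrm{pred}} \rvert}
         {\lvert B_{\mathrm{gt}} \cup B_{\mathrm{pred}} \rvert},
\qquad 0 \le \mathrm{IoU} \le 1
```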

Here, the larger the overlap (the intersection), the larger the IoU. Conversely, if the boxes do not overlap at all, the numerator is 0 and so the IoU is 0.

It follows that the IoU ranges from 0 to 1.
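A minimal sketch of this calculation for axis-aligned boxes is given below. The `(x_min, y_min, x_max, y_max)` box layout and the function name `iou` are assumptions made for illustration; the description does not specify how the MBB is encoded.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed layout


def iou(box_a: Box, box_b: Box) -> float:
    """Intersection over Union of two axis-aligned bounding boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; clamped to 0 when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0
```

For example, two 2×2 boxes offset by one unit in each direction overlap in a 1×1 square, giving an IoU of 1/7; identical boxes give 1 and disjoint boxes give 0.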

Meanwhile, in addition to calculating the IoU described above, the processor 160 must at the same time calculate a similarity; the specific process is as follows.

First, the processor 160 trains an SNN (Siamese Neural Network) by treating as the same object the MBB (Minimum Bounding Box) image of an object last detected by the previous infrastructure camera and the MBB image of that same object when it is first detected by the current infrastructure camera.

Then, using the trained SNN model, the processor 160 extracts features of the images whose identity is to be determined and compares their embeddings to calculate the similarity. This process is shown at the bottom of FIG. 2.

Here, the images whose identity is to be determined are the MBB of the object last detected by the previous infrastructure camera immediately before the first time point and the MBB of the object first detected by the current infrastructure camera.
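The comparison of the two embeddings can be sketched as below. The description does not name the comparison metric, so cosine similarity is used here as an assumption, rescaled to the range 0 to 1 so that it is on the same scale as the IoU. The function name `embedding_similarity` is hypothetical, and the embeddings are taken as plain lists of floats already produced by the trained SNN.

```python
import math
from typing import Sequence


def embedding_similarity(emb_prev: Sequence[float], emb_curr: Sequence[float]) -> float:
    """Cosine similarity of two embeddings, rescaled from [-1, 1] to [0, 1].

    emb_prev: embedding of the last MBB image from the previous camera.
    emb_curr: embedding of the first MBB image from the current camera.
    """
    if len(emb_prev) != len(emb_curr):
        raise ValueError("embeddings must have the same length")
    dot = sum(a * b for a, b in zip(emb_prev, emb_curr))
    norm_prev = math.sqrt(sum(a * a for a in emb_prev))
    norm_curr = math.sqrt(sum(b * b for b in emb_curr))
    if norm_prev == 0.0 or norm_curr == 0.0:
        return 0.0  # a zero vector carries no direction; treat as dissimilar
    cosine = dot / (norm_prev * norm_curr)
    return (cosine + 1.0) / 2.0
```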

Finally, whether the two objects are the same can be determined using the IoU and the similarity, whose weights are set so that they sum to a constant.

If the gap between the detection areas of the two infrastructure cameras (the previous infrastructure camera and the current infrastructure camera) is small, that is, if the separation between them is within a preset distance so that trajectory prediction accuracy is sufficiently high, the weight of the IoU may be made relatively higher when determining, based on the IoU and the similarity, whether the two objects are the same.

Conversely, if the detection areas of the two infrastructure cameras (the previous infrastructure camera and the current infrastructure camera) are far apart, that is, if the separation exceeds the preset distance so that prediction accuracy is expected to drop, the weight of the similarity may be made relatively higher when determining, based on the IoU and the similarity, whether the two objects are the same.

In FIG. 2, w denotes a weight with a value between 0 and 1. Objects are judged to be the same when the result calculated by applying the weights to the IoU and the similarity is at or above a certain value. In one embodiment, when the weighted sum of the IoU and the similarity is greater than or equal to a preset value, the object detected by the previous infrastructure camera and the object detected by the current infrastructure camera may be determined to be the same object.
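The weighted decision can be sketched as follows, assuming that the single weight w of FIG. 2 is applied to the IoU and 1 - w to the similarity. That split is consistent with the weights summing to a constant but is not spelled out in the text. The function names and the default threshold of 0.5 are hypothetical.

```python
def fused_score(iou_value: float, similarity: float, w: float) -> float:
    """Weighted sum of IoU and similarity; w weights the IoU, (1 - w) the similarity."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie between 0 and 1")
    return w * iou_value + (1.0 - w) * similarity


def is_same_object(iou_value: float, similarity: float, w: float,
                   threshold: float = 0.5) -> bool:
    """Judge two detections to be the same object when the fused score reaches the threshold."""
    return fused_score(iou_value, similarity, w) >= threshold
```

With `w = 1.0` the decision reduces to an IoU-only test, and with `w = 0.0` to a similarity-only test, which are the two extremes of the trade-off described above.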

Here, the present invention uses a basic LSTM-based trajectory prediction method. However, as mentioned in the background art, the accuracy of trajectory prediction tends to fall as the time horizon to be predicted grows, so a complementary method is needed. The present invention therefore adopts a method of determining whether objects are the same using an SNN (Siamese Neural Network), a kind of CNN (Convolutional Neural Network).

FIG. 7 is a conceptual diagram illustrating the process and concept of determining whether objects are the same using an SNN.

Referring to FIG. 7, depending on its detection direction, an infrastructure camera may capture the rear or the front of the same object. To determine whether such objects are the same, two images of the same object taken from different directions, namely a front image and a rear image, are labeled as the same object, and training is carried out using the SNN. The SNN is a known model suited to cases where there are many classes to distinguish and too few training images.

The problem to be solved in the present invention is to determine whether objects are the same according to vehicle model and color, so there are many classes, while the training images available for each vehicle model and color are insufficient. The SNN is therefore the most suitable model for this problem.

FIG. 9 is a flowchart illustrating a multi-infrastructure camera virtualization method according to an embodiment of the present invention.

The multi-infrastructure camera virtualization method according to an embodiment of the present invention may be carried out in substantially the same configuration as the processor 160 of FIG. 1. Accordingly, components identical to those of the processor 160 of FIG. 1 are given the same reference numerals, and repeated descriptions are omitted.

In addition, the multi-infrastructure camera virtualization method according to this embodiment may be executed by software (an application).

First, referring to FIG. 9, the IoU (Intersection over Union) may be calculated between the MBB (Minimum Bounding Box) of an object detected by the current infrastructure camera at a first time point and the MBB of an object at the position predicted for the first time point using the previous infrastructure camera and the trajectory prediction model (S110).

Specifically, step S110 may include: selecting the previous infrastructure camera that, at an earlier time point, covered the nearest area lying along the same line as the direction in which the object detected by the current infrastructure camera is moving or along the direction in which the object's movement continues; selecting, from among the objects that moved from the selected previous infrastructure camera toward the detection area of the current infrastructure camera, objects that have not yet been recognized as the same object through the current infrastructure camera; inputting, for each selected object, N pieces of position information from earlier time points into the trajectory prediction model to predict the position of the object at the first time point at which the current infrastructure camera detected the object; obtaining the MBB of each object at its predicted position and the MBB of the object detected by the current infrastructure camera; and calculating each IoU between the MBB of each object at its predicted position and the MBB of the object detected by the current infrastructure camera.

Next, based on the SNN (Siamese Neural Network), the similarity may be calculated between the MBB of the object last detected by the previous infrastructure camera immediately before the first time point and the MBB of the object when it is first detected by the current infrastructure camera at the first time point (S120).

Specifically, step S120 may include: training the SNN (Siamese Neural Network) using, as training data, the MBB of an object last detected by the previous infrastructure camera and the MBB of the same object when it is first detected by the current infrastructure camera; using the trained SNN to extract features of the MBB of the object last detected by the previous infrastructure camera and features of the MBB of the object first detected by the current infrastructure camera; and calculating the similarity by comparing the embeddings of those two sets of features.

Next, the weights given to the IoU and the similarity may be determined according to the separation distance between the previous infrastructure camera and the current infrastructure camera (S130).

Specifically, in step S130, with the sum of the weights given to the IoU and the similarity held constant, the weight given to the IoU may be made relatively higher when the separation distance between the previous infrastructure camera and the current infrastructure camera is within a preset distance. This is because, when the detection areas of the two cameras are close together, the accuracy of the trajectory prediction model can be regarded as sufficiently high.

Conversely, when the separation distance exceeds the preset distance, the weight given to the similarity may be made relatively higher. This is because, when the detection areas of the two cameras are far apart, the accuracy of the trajectory prediction model is expected to be relatively lower.
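The weight selection in step S130 can be sketched as a simple threshold rule. The description only states the direction of the adjustment, so the concrete weights (0.7 and 0.3) and the function name `choose_weights` are hypothetical, and the two weights are assumed to sum to 1.

```python
from typing import Tuple


def choose_weights(separation: float, preset_distance: float,
                   near_iou_weight: float = 0.7,
                   far_iou_weight: float = 0.3) -> Tuple[float, float]:
    """Return (iou_weight, similarity_weight), which always sum to 1.

    Within the preset distance, trajectory prediction is trusted more,
    so the IoU gets the larger weight; beyond it, the similarity does.
    The default weight values are illustrative placeholders.
    """
    iou_weight = near_iou_weight if separation <= preset_distance else far_iou_weight
    return iou_weight, 1.0 - iou_weight
```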

Next, based on the IoU and the similarity with their respective weights determined, it may be determined whether the object detected by the previous infrastructure camera and the object detected by the current infrastructure camera are the same object (S140).

Specifically, in step S140, the objects may be determined to be the same object when the sum of the weighted IoU and the weighted similarity is greater than or equal to a preset value.

When the objects are determined to be the same object, the movement of that object can continue to be tracked; that is, steps S110 to S140 may be repeated, with only the previous and current infrastructure cameras being replaced as the object moves.
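Taken together, the hand-over decision for one newly detected object might look like the sketch below. It assumes that the predicted MBB and the SNN similarity have already been computed for each not-yet-matched candidate from the previous camera. Picking the highest-scoring candidate when several pass the threshold is an assumption of this sketch, since the description only states the threshold test; the names `Candidate` and `match_new_detection` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed layout


@dataclass
class Candidate:
    track_id: int        # identifier assigned by the previous camera
    predicted_box: Box   # MBB at the position predicted for the first time point
    similarity: float    # SNN similarity to the new detection, in [0, 1]


def _iou(a: Box, b: Box) -> float:
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_new_detection(detected_box: Box, candidates: Iterable[Candidate],
                        w: float, threshold: float) -> Optional[int]:
    """Return the track_id judged to be the same object, or None if no candidate qualifies."""
    best_id, best_score = None, threshold
    for cand in candidates:
        score = w * _iou(cand.predicted_box, detected_box) + (1.0 - w) * cand.similarity
        if score >= best_score:
            best_id, best_score = cand.track_id, score
    return best_id
```

When a match is returned, the new detection would inherit that track identifier, and the current camera then plays the role of the previous camera at the next hand-over.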

The following describes an experiment conducted to evaluate the effectiveness of the multi-infrastructure camera virtualization device according to an embodiment of the present invention.

For the experiment, LSTM-based trajectory prediction, together with the MBB IoU measurement that follows from it, is combined with SNN-based image similarity measurement.

The weights of the IoU and the similarity, whose sum is constant, are adjusted according to the gap between the detection areas of the installed infrastructure cameras.

In the present invention, when the detection areas of the infrastructure cameras are sufficiently close, that is, within a preset distance, the same object can be identified by applying only the IoU measurement based on trajectory prediction.

As an example, an experiment was performed using a data set of vehicle trajectories extracted at a highway merging section. In this data set, the trajectories of vehicles in the merging section were extracted from a single camera.

In this experiment, the area covered by the data set was divided into two areas, and the same object was identified according to the proposed method.

A total of 511 trajectories were used in the experiment: 411 to train the LSTM-based trajectory prediction model and the remaining 100 for testing.

In addition, as shown in FIG. 8, the detection area of the camera with which the data set was collected was divided on the assumption that it comprised two camera detection areas, one treated as the detection area of the previous infrastructure camera and the other as the detection area of the current infrastructure camera.

In FIG. 8, the Infra Camera 1 area on the left is the detection area of the previous infrastructure camera, and the Infra Camera 2 area on the right is the detection area of the current infrastructure camera.

The LSTM trajectory prediction model was trained on the trajectories passing through these two areas. For each test trajectory, the position at the time the object first appears in the current infrastructure camera's detection area was predicted from its last position in the previous infrastructure camera's detection area, and the IoU was calculated. An IoU of 0.5 or more was judged to indicate the same object.

As a result, it was confirmed that in 94 cases the same object was correctly judged to be the same object.
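Read against the 100 test trajectories mentioned above, and assuming that each test trajectory yields exactly one same-object decision (the text does not state this explicitly), the 94 correct cases correspond to the following matching accuracy under the IoU threshold of 0.5:

```latex
\text{accuracy} = \frac{94}{100} = 0.94 \quad (94\%)
```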

An embodiment of the present invention proposes a method for continuously tracking objects detected by multiple infrastructure cameras in cooperative autonomous driving.

The proposed method improves accuracy by combining an LSTM-based trajectory prediction method with a method of measuring the similarity between object images.

The multi-infrastructure camera virtualization device that takes the above-described object relationship features into account may be implemented by a computing device that includes at least some of a processor, a memory, a user input device, and a presentation device. The memory is a medium that stores computer-readable software, applications, program modules, routines, instructions, and/or data coded to perform specific tasks when executed by the processor. The processor may read and execute the computer-readable software, applications, program modules, routines, instructions, and/or data stored in the memory.

The computing device may be any of various devices such as a smartphone, tablet, laptop, desktop, server, or client. The computing device may be a single stand-alone device, or may comprise multiple computing devices operating in a distributed environment in which they cooperate with one another over a communication network.

In addition, the above-described multi-infrastructure camera virtualization method may be executed by a computing device that has a processor and a memory storing computer-readable software, applications, program modules, routines, instructions, and/or data structures coded to perform the multi-infrastructure camera virtualization method using an artificial intelligence model when executed by the processor.

The embodiments described above may be implemented through various means. For example, the embodiments may be implemented by hardware, firmware, software, or a combination thereof.

In the case of implementation by hardware, the image diagnosis method using an artificial intelligence model according to the embodiments may be implemented by one or more ASICs (Application Specific Integrated Circuits), DSPs (Digital Signal Processors), DSPDs (Digital Signal Processing Devices), PLDs (Programmable Logic Devices), FPGAs (Field Programmable Gate Arrays), processors, controllers, microcontrollers, or microprocessors.

For example, the multi-infrastructure camera virtualization method according to the embodiments may be implemented using an artificial intelligence semiconductor device in which the neurons and synapses of a deep neural network are implemented with semiconductor elements. The semiconductor elements may be those in current use, such as SRAM, DRAM, or NAND; next-generation elements such as RRAM, STT MRAM, or PRAM; or a combination of these.

When the multi-infrastructure camera virtualization method according to the embodiments is implemented using an artificial intelligence semiconductor device, the result (weights) of training the artificial intelligence model in software may be transferred to synapse-mimicking elements arranged in an array, or training may be carried out on the artificial intelligence semiconductor device itself.

In the case of implementation by firmware or software, the multi-infrastructure camera virtualization method according to the embodiments may be implemented in the form of a device, procedure, or function that performs the functions or operations described above. Software code may be stored in a memory unit and run by a processor. The memory unit may be located inside or outside the processor and may exchange data with the processor by various known means.

In addition, terms such as "system", "processor", "controller", "component", "module", "interface", "model", and "unit" used above may generally refer to computer-related entity hardware, a combination of hardware and software, software, or software in execution. For example, such a component may be, but is not limited to, a process run by a processor, a processor, a controller, a control processor, an object, a thread of execution, a program, and/or a computer. For example, both an application running on a controller or processor and the controller or processor itself may be components. One or more components may reside within a process and/or thread of execution, and components may be located on one device (e.g., a system or computing device) or distributed across two or more devices.

Meanwhile, another embodiment provides a computer program, stored on a computer recording medium, that performs the above-described multi-infrastructure camera virtualization method. Yet another embodiment provides a computer-readable recording medium on which a program for implementing the above-described method is recorded.

The program recorded on the recording medium can execute the above-described steps by being read, installed, and executed on a computer. For the computer to read the program recorded on the recording medium and execute the functions implemented by the program, the program may include code written in a computer language such as C, C++, JAVA, or machine language that the computer's processor (CPU) can read through the computer's device interface.

Such code may include functional code related to the functions that define the above-described functionality, and may also include execution-procedure control code needed for the computer's processor to execute that functionality according to a predetermined procedure.

Such code may further include memory-reference code indicating at which location (address) in the computer's internal or external memory any additional information or media needed for the processor to execute the above-described functions should be referenced.

In addition, if the computer's processor needs to communicate with another remote computer or server in order to execute the above-described functions, the code may further include communication-related code specifying how the processor should communicate with the remote computer or server using the computer's communication module, and what information or media should be transmitted and received during communication.

Examples of computer-readable recording media on which the above-described program is recorded include ROM, RAM, CD-ROM, magnetic tape, floppy disk, and optical media storage devices.

Computer-readable recording media may also be distributed across network-connected computer systems, so that computer-readable code is stored and executed in a distributed manner.

Furthermore, the functional program for implementing the present invention, and the code and code segments related to it, may be readily inferred or modified by programmers in the technical field to which the present invention belongs, taking into account the system environment of the computer that reads the recording medium and executes the program.

The multi-infrastructure camera virtualization method may also be implemented in the form of a recording medium containing computer-executable instructions, such as an application or program module executed by a computer. A computer-readable medium may be any available medium that can be accessed by a computer, and includes volatile and non-volatile media as well as removable and non-removable media. Computer-readable media may also include computer storage media. Computer storage media include volatile and non-volatile, removable and non-removable media implemented by any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data.

The above-described multi-infrastructure camera virtualization method may be executed by an application installed by default on a terminal (which may include a program included in a platform or operating system installed by default on the terminal), or by an application (i.e., a program) that the user has installed directly on the master terminal through an application providing server, such as an application store server or a web server related to the application or the corresponding service. In this sense, the above-described method may be implemented as an application (i.e., a program) installed by default on the terminal or installed directly by the user, and may be recorded on a computer-readable recording medium of the terminal or the like.

Specific embodiments of the present invention have been described above. However, those of ordinary skill in the art to which the present invention pertains will understand that the spirit and scope of the present invention are not limited to these specific embodiments, and that various modifications and variations are possible without departing from the gist of the present invention.

Therefore, the embodiments described above are provided to fully convey the scope of the invention to those of ordinary skill in the art to which the present invention pertains. They should be understood as illustrative in all respects and not restrictive, and the present invention is defined only by the scope of the claims.

Claims

A multi-infrastructure camera virtualization method in which a multi-infrastructure camera virtualization device continuously tracks the same object, the method comprising:
calculating an Intersection over Union (IoU) between the Minimum Bounding Box (MBB) of an object detected by a current infrastructure camera at a first time point and the MBB of an object at a position predicted for the first time point based on a previous infrastructure camera and a trajectory prediction model;
calculating, based on a Siamese Neural Network (SNN), a similarity between the MBB of the object last detected through the previous infrastructure camera before the first time point and the MBB of the object when it is first detected by the current infrastructure camera at the first time point;
determining respective weights to be assigned to the IoU and the similarity according to the separation distance between the previous infrastructure camera and the current infrastructure camera; and
determining, based on the IoU and the similarity to which the respective weights have been assigned, whether the object detected by the previous infrastructure camera and the object detected by the current infrastructure camera are the same object.
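
As an illustration of the IoU term used in the claim above, the following is a minimal sketch that assumes each MBB is an axis-aligned rectangle expressed in a coordinate frame shared by both cameras. The `Box` type and the `iou` function are hypothetical names introduced here; the source does not specify a box representation.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box (assumed representation, not from the source)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def iou(a: Box, b: Box) -> float:
    """Return the intersection-over-union of two axis-aligned boxes."""
    # Overlap extent along each axis, clamped at zero when the boxes are disjoint.
    inter_w = max(0.0, min(a.x_max, b.x_max) - max(a.x_min, b.x_min))
    inter_h = max(0.0, min(a.y_max, b.y_max) - max(a.y_min, b.y_min))
    inter = inter_w * inter_h

    area_a = (a.x_max - a.x_min) * (a.y_max - a.y_min)
    area_b = (b.x_max - b.x_min) * (b.y_max - b.y_min)
    union = area_a + area_b - inter

    # Degenerate (zero-area) boxes give a union of zero; treat that as no overlap.
    return inter / union if union > 0 else 0.0
```

For two 2x2 boxes offset by one unit along each axis, the intersection is 1 and the union is 7, so the IoU is 1/7.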

The method of claim 1, wherein calculating the IoU comprises:
selecting, as the previous infrastructure camera, the camera that at a previous time point detects the nearest area in a direction collinear with the direction in which the object detected by the current infrastructure camera is moving, or in the direction along which the object's movement continues;
selecting, from among objects that have moved from the selected previous infrastructure camera toward the detection area of the current infrastructure camera, objects that have not yet been recognized as the same object through the current infrastructure camera;
for each selected object, inputting N items of position information from previous time points into the trajectory prediction model to predict the position of the object at the first time point at which the current infrastructure camera detected the object;
obtaining the MBB of each object at the predicted position and the MBB of the object detected by the current infrastructure camera; and
calculating each IoU between the MBB of each object at the predicted position and the MBB of the object detected by the current infrastructure camera.
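
The prediction step above can be sketched as follows. The claim relies on a learned trajectory prediction model whose form is not given here, so this sketch substitutes a plain constant-velocity extrapolation as a stand-in. The function names, the `(t, x, y)` sample layout, and the reuse of the object's last known box size are all assumptions made for illustration.

```python
from typing import List, Tuple

Sample = Tuple[float, float, float]  # (time, x, y), an assumed layout


def predict_position(history: List[Sample], t_query: float) -> Tuple[float, float]:
    """Extrapolate the position at t_query from the N most recent samples.

    Stand-in for the learned trajectory prediction model: it uses the
    average velocity over the history window, nothing more.
    """
    if len(history) < 2:
        raise ValueError("need at least two samples to estimate velocity")
    t_first, x_first, y_first = history[0]
    t_last, x_last, y_last = history[-1]
    dt = t_last - t_first
    if dt <= 0:
        raise ValueError("samples must be ordered by strictly increasing time")
    vx = (x_last - x_first) / dt
    vy = (y_last - y_first) / dt
    lead = t_query - t_last  # time elapsed since the last observation
    return x_last + vx * lead, y_last + vy * lead


def box_at(center: Tuple[float, float], width: float, height: float):
    """Place a box of the given size around a predicted center point."""
    cx, cy = center
    return (cx - width / 2, cy - height / 2, cx + width / 2, cy + height / 2)
```

The box returned by `box_at` would then be compared, by IoU, with the MBB that the current infrastructure camera reports at the same time point.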

The method of claim 1, wherein calculating the similarity comprises:
training a Siamese Neural Network (SNN) using, as training data, the MBB of an object last detected by the previous infrastructure camera and the MBB of the same object when it is first detected by the current infrastructure camera;
extracting, using the trained SNN, features of the MBB of the object last detected by the previous infrastructure camera and features of the MBB of the object first detected by the current infrastructure camera; and
calculating the similarity by comparing the embedded values of the features of the MBB of the object last detected by the previous infrastructure camera and of the features of the MBB of the object first detected by the current infrastructure camera.
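
The claim above states only that the two embeddings are compared, without naming a metric. One common choice for Siamese networks is cosine similarity, and the sketch below assumes it, rescaling the result to the range [0, 1] so that it can be combined with an IoU value. The embeddings are taken as plain lists of floats already produced by the trained SNN; the function names are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors, in [-1, 1]."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an all-zero embedding carries no direction
    return dot / (norm_u * norm_v)


def similarity_score(u: Sequence[float], v: Sequence[float]) -> float:
    """Map cosine similarity to [0, 1] (assumed scaling, not from the source)."""
    return (cosine_similarity(u, v) + 1.0) / 2.0
```

Identical embeddings score 1.0, orthogonal embeddings 0.5, and opposite embeddings 0.0 under this scaling.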

The method of claim 1, wherein determining the respective weights assigned to the IoU and the similarity comprises, with the sum of the weights assigned to the IoU and the similarity held constant:
relatively increasing the weight assigned to the IoU when the separation distance between the previous infrastructure camera and the current infrastructure camera is within a preset distance; and
relatively increasing the weight assigned to the similarity when the separation distance exceeds the preset distance.
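
A minimal sketch of the weighting rule above, assuming the two weights sum to 1 and that a single fixed split is used on either side of the preset distance. The function name, the default value 0.7, and the use of meters are illustrative assumptions; the source gives no concrete numbers.

```python
from typing import Tuple


def choose_weights(
    distance_m: float, preset_distance_m: float, high: float = 0.7
) -> Tuple[float, float]:
    """Return (w_iou, w_sim) with w_iou + w_sim == 1.

    Cameras within the preset distance favor IoU, since the predicted
    position is more reliable over a short gap. Cameras farther apart
    favor the appearance similarity.
    """
    if not 0.5 < high < 1.0:
        raise ValueError("high must lie strictly between 0.5 and 1.0")
    low = 1.0 - high
    if distance_m <= preset_distance_m:
        return high, low
    return low, high
```

With a preset distance of 50 m, cameras 30 m apart would yield weights of roughly (0.7, 0.3), and cameras 80 m apart roughly (0.3, 0.7).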

The method of claim 1, wherein determining whether the object detected by the previous infrastructure camera and the object detected by the current infrastructure camera are the same object comprises:
determining that they are the same object when the sum of the weighted IoU and the weighted similarity is greater than or equal to a preset value; and
continuing to track the movement of the same object when they are determined to be the same object.
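
The decision rule above amounts to thresholding a weighted sum. The sketch below shows that test, plus a hypothetical helper that picks the best-scoring candidate when several objects from the previous camera are still unmatched. The helper, the candidate dictionary layout, and all numeric values are assumptions for illustration only.

```python
from typing import Dict, Optional, Tuple


def is_same_object(
    iou_value: float, similarity: float, w_iou: float, w_sim: float, threshold: float
) -> bool:
    """Same object if the weighted sum reaches the preset threshold."""
    return w_iou * iou_value + w_sim * similarity >= threshold


def best_match(
    candidates: Dict[str, Tuple[float, float]],
    w_iou: float,
    w_sim: float,
    threshold: float,
) -> Optional[str]:
    """Return the id of the highest-scoring candidate that passes the threshold.

    candidates maps an object id from the previous camera to its
    (iou, similarity) pair against the newly detected object.
    """
    best_id, best_score = None, threshold
    for obj_id, (iou_value, similarity) in candidates.items():
        score = w_iou * iou_value + w_sim * similarity
        if score >= best_score:
            best_id, best_score = obj_id, score
    return best_id
```

A matched id would be carried over to the track in the current camera, so that the object keeps a single identity across cameras. A result of `None` would mean the detection starts a new track.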

A multi-infrastructure camera virtualization device comprising:
a memory unit that stores information about a pre-trained artificial intelligence model;
a database that stores a trained trajectory prediction model and a trained Siamese Neural Network (SNN); and
a processor that executes:
an operation of calculating an Intersection over Union (IoU) between the Minimum Bounding Box (MBB) of an object detected by a current infrastructure camera at a first time point and the MBB of an object at a position predicted for the first time point based on a previous infrastructure camera and the trajectory prediction model,
an operation of calculating, based on the SNN, a similarity between the MBB of the object last detected through the previous infrastructure camera before the first time point and the MBB of the object when it is first detected by the current infrastructure camera at the first time point,
an operation of determining respective weights to be assigned to the IoU and the similarity according to the separation distance between the previous infrastructure camera and the current infrastructure camera, and
an operation of determining, based on the IoU and the similarity to which the respective weights have been assigned, whether the object detected by the previous infrastructure camera and the object detected by the current infrastructure camera are the same object.

The device of claim 6, wherein the processor, as the operation of calculating the IoU, executes:
an operation of selecting, as the previous infrastructure camera, the camera that at a previous time point detects the nearest area in a direction collinear with the direction in which the object detected by the current infrastructure camera is moving, or in the direction along which the object's movement continues;
an operation of selecting, from among objects that have moved from the selected previous infrastructure camera toward the detection area of the current infrastructure camera, objects that have not yet been recognized as the same object through the current infrastructure camera;
an operation of inputting, for each selected object, N items of position information from previous time points into the trajectory prediction model to predict the position of the object at the first time point at which the current infrastructure camera detected the object;
an operation of obtaining the MBB of each object at the predicted position and the MBB of the object detected by the current infrastructure camera; and
an operation of calculating each IoU between the MBB of each object at the predicted position and the MBB of the object detected by the current infrastructure camera.

The device of claim 6, wherein the processor, as the operation of calculating the similarity, executes:
an operation of training a Siamese Neural Network (SNN) using, as training data, the MBB of an object last detected by the previous infrastructure camera and the MBB of the same object when it is first detected by the current infrastructure camera;
an operation of extracting, using the trained SNN, features of the MBB of the object last detected by the previous infrastructure camera and features of the MBB of the object first detected by the current infrastructure camera; and
an operation of calculating the similarity by comparing the embedded values of the features of the MBB of the object last detected by the previous infrastructure camera and of the features of the MBB of the object first detected by the current infrastructure camera.

The device of claim 6, wherein the processor, as the operation of determining the respective weights assigned to the IoU and the similarity, with the sum of the weights assigned to the IoU and the similarity held constant:
relatively increases the weight assigned to the IoU when the separation distance between the previous infrastructure camera and the current infrastructure camera is within a preset distance; and
relatively increases the weight assigned to the similarity when the separation distance exceeds the preset distance.

The device of claim 6, wherein the processor, as the operation of determining whether the objects are the same object:
determines that they are the same object when the sum of the weighted IoU and the weighted similarity is greater than or equal to a preset value; and
continues to track the movement of the same object when they are determined to be the same object.

A computer-readable recording medium storing a computer program, wherein the computer program includes instructions that, when executed by a processor, cause the processor to perform a method comprising:
calculating an Intersection over Union (IoU) between the Minimum Bounding Box (MBB) of an object detected by a current infrastructure camera at a first time point and the MBB of an object at a position predicted for the first time point based on a previous infrastructure camera and a trajectory prediction model;
calculating, based on a Siamese Neural Network (SNN), a similarity between the MBB of the object last detected through the previous infrastructure camera before the first time point and the MBB of the object when it is first detected by the current infrastructure camera at the first time point;
determining respective weights to be assigned to the IoU and the similarity according to the separation distance between the previous infrastructure camera and the current infrastructure camera; and
determining, based on the IoU and the similarity to which the respective weights have been assigned, whether the object detected by the previous infrastructure camera and the object detected by the current infrastructure camera are the same object.