KR102352242B1

KR102352242B1 - Apparatus and method for recognizing objects using complex number based convolution neural network

Info

Publication number: KR102352242B1
Application number: KR1020210085762A
Authority: KR
Inventors: 황인수; 신수진; 김준희; 김영중; 김성호
Original assignee: 국방과학연구소
Priority date: 2021-06-30
Filing date: 2021-06-30
Publication date: 2022-01-17

Abstract

The present disclosure relates to a device for recognizing an object using a convolutional neural network (CNN). According to the present disclosure, the device for recognizing an object comprises: a transceiver that acquires a synthetic aperture radar (SAR) image based on complex data; a complex number-based CNN that generates a feature map by performing a complex number-based convolution operation; and an object recognition information generation device that generates object recognition information based on the feature map.

Description

Apparatus and method for recognizing objects using a complex number-based CNN {APPARATUS AND METHOD FOR RECOGNIZING OBJECTS USING COMPLEX NUMBER BASED CONVOLUTION NEURAL NETWORK}

The present disclosure relates generally to a method for recognizing an object in an image using a neural network, and more specifically to an apparatus and method for recognizing an object in a complex number-based image using a complex number-based CNN.

As artificial intelligence technology advances, deep learning techniques that recognize objects present in an image and determine what those objects are have been actively studied. In particular, the convolutional neural network (CNN) is the method most commonly used to recognize objects in images.

A CNN applies convolution filters, by way of convolutional layers, to an input image and repeatedly generates feature maps that extract features of the image. A CNN-based learning apparatus then processes the repeatedly generated feature maps, feeds them to a fully-connected (FC) layer, and performs a probabilistic computation of what the object in the image is. Such a learning apparatus continuously learns the weights of the convolution filters so that the loss derived from the computed result is minimized.

A synthetic aperture radar (SAR) is a radar that sequentially transmits radar signals from the air toward the ground and processes the minute time differences with which the signals return after being reflected from uneven surfaces, in order to produce topographic maps or observe the ground surface. SAR is widely used in military operations because it allows topographic maps to be updated in real time and allows targets such as enemy aircraft carriers to be recognized reliably.

Because a SAR image observed with a SAR is a form of radar signal reflected from the ground, it is represented as complex-valued data having an amplitude and a phase. In a SAR image, the amplitude carries information about the radar reflectivity of terrain or objects, and the phase may carry information about the distance between the radar and the object.

Because phase data is difficult to process, conventional object recognition apparatuses recognized objects in SAR images using only the amplitude. That is, because the CNN in a conventional object recognition apparatus was trained on real numbers, it could not process complex-valued operations. The CNN therefore used the amplitude information as real-valued data and learned from that data alone, which introduced errors attributable to the phase. Accordingly, there is a growing need for techniques that reduce errors when recognizing objects in SAR images containing complex data.
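The information that an amplitude-only pipeline discards can be illustrated with a short sketch. The Python below is an illustration only and is not part of the disclosure; the function names `amplitude_phase` and `from_amplitude_phase` and the sample pixel values are hypothetical.

```python
import cmath

def amplitude_phase(pixel: complex) -> tuple[float, float]:
    """Split one complex-valued pixel into (amplitude, phase in radians)."""
    return abs(pixel), cmath.phase(pixel)

def from_amplitude_phase(amplitude: float, phase: float) -> complex:
    """Rebuild the complex pixel from its amplitude and phase."""
    return cmath.rect(amplitude, phase)

# Two hypothetical pixels with the same amplitude but different phase.
pixel_a = complex(3.0, 4.0)
pixel_b = complex(-4.0, 3.0)

amp_a, phase_a = amplitude_phase(pixel_a)
amp_b, phase_b = amplitude_phase(pixel_b)

# An amplitude-only model receives identical inputs for both pixels...
same_amplitude = abs(amp_a - amp_b) < 1e-12
# ...while the phase still tells them apart.
different_phase = abs(phase_a - phase_b) > 1e-6
```

Under these assumptions, a real-valued network fed only `amp_a` and `amp_b` cannot distinguish the two pixels, whereas a network that consumes the complex values directly retains the phase.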

Based on the above discussion, the present disclosure provides an apparatus and method for recognizing an object in a complex number-based image using a complex number-based CNN.

In addition, the present disclosure provides an apparatus and method for modifying a CNN trained on real-valued data into a complex number-based CNN.

In addition, the present disclosure provides an apparatus and method for determining the initialization parameters of a CNN in order to modify a CNN trained on real-valued data into a complex number-based CNN.

In addition, the present disclosure provides an apparatus and method for reducing object recognition errors by recognizing objects in a complex number-based SAR image using a complex number-based CNN.

In addition, the present disclosure provides an apparatus and method for improving object recognition speed by recognizing objects in a complex number-based SAR image using a complex number-based CNN.

According to various embodiments of the present disclosure, an apparatus for recognizing an object using a convolutional neural network (CNN) may include a transceiver that acquires a synthetic aperture radar (SAR) image based on complex data, a complex number-based CNN that generates a feature map by performing a complex number-based convolution operation, and an object recognition information generator that generates object recognition information based on the feature map.

According to another embodiment, the complex number-based CNN may include a network composed of a plurality of layers, and the plurality of layers may include at least one of a complex number-based convolutional layer, a complex number-based batch normalization (BN) layer, a complex number-based rectified linear unit (ReLU) layer, and a complex number-based max pooling layer.

According to another embodiment, before performing the complex number-based convolution operation, the complex number-based CNN may learn at least one real parameter through a real number-based convolution operation and may initialize the plurality of layers based on the at least one real parameter.

According to another embodiment, the at least one real parameter may include a real convolution weight used in the complex number-based convolutional layer.

According to another embodiment, the complex number-based CNN may identify a real convolution weight based on a color value of at least one of the red, green, and blue (RGB) colors, determine a complex convolution weight based on the real convolution weight, and initialize the convolutional layer based on the complex convolution weight.

According to another embodiment, the complex number-based CNN may set the real part of the complex convolution weight equal to the real convolution weight and set the imaginary part of the complex convolution weight equal to the real convolution weight.
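The initialization rule in this embodiment (real part and imaginary part both copied from the pre-trained real weight) can be sketched as follows. This is a minimal illustration, not the disclosed implementation; the function name `init_complex_kernel` and the nested-list kernel layout are assumptions made for the example.

```python
def init_complex_kernel(real_kernel: list[list[float]]) -> list[list[complex]]:
    """Build a complex kernel whose real and imaginary parts both equal
    the corresponding pre-trained real weight, i.e. w -> w + i*w."""
    return [[complex(w, w) for w in row] for row in real_kernel]

# Hypothetical 2x2 kernel taken from a real-valued, pre-trained layer.
real_kernel = [[0.5, -1.0],
               [0.25, 2.0]]
complex_kernel = init_complex_kernel(real_kernel)
```

After this step the complex kernel would be refined by training on complex-valued images, as the following embodiment describes.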

According to another embodiment, after initializing the convolutional layer, the complex number-based CNN may learn the complex convolution weight based on at least one training image.

According to another embodiment, the at least one real parameter may include a real mean, a real variance, a real bias, and a real BN weight used in the complex number-based BN layer.

According to another embodiment, the complex number-based CNN may determine a complex mean, a complex variance, a complex bias, and a complex BN weight corresponding respectively to the real mean, the real variance, the real bias, and the real BN weight, and may initialize the BN layer based on the complex mean, the complex variance, the complex bias, and the complex BN weight.

According to another embodiment, the complex number-based CNN may set the real part of the complex mean equal to the real mean, set the imaginary part of the complex mean equal to the real mean, set the real part of the complex variance equal to the real variance, and set the imaginary part of the complex variance equal to the real variance.
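The same copy rule applied to the BN statistics can be sketched as below. This is a hedged illustration only: the names `ComplexBNStats` and `init_complex_bn_stats` are hypothetical, and the sketch covers only the mean and variance, because this passage does not state how the complex bias and complex BN weight are derived.

```python
from dataclasses import dataclass

@dataclass
class ComplexBNStats:
    """Complex mean and variance for one BN channel (illustrative container)."""
    mean: complex
    variance: complex

def init_complex_bn_stats(real_mean: float, real_variance: float) -> ComplexBNStats:
    """Copy each pre-trained real statistic into both the real and the
    imaginary part of its complex counterpart."""
    return ComplexBNStats(
        mean=complex(real_mean, real_mean),
        variance=complex(real_variance, real_variance),
    )

# Hypothetical statistics from a pre-trained real-valued BN layer.
stats = init_complex_bn_stats(real_mean=0.25, real_variance=1.5)
```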

According to an embodiment of the present disclosure, a method of operating an apparatus that recognizes an object based on a complex number-based convolutional neural network (CNN) may include acquiring a synthetic aperture radar (SAR) image based on complex data, generating a feature map by performing a complex number-based convolution operation, and generating object recognition information based on the feature map.

According to another embodiment, the method of operating the object recognition apparatus may further include, before performing the complex number-based convolution operation, learning at least one real parameter through a real number-based convolution operation, and initializing the plurality of layers based on the at least one real parameter.

According to another embodiment, the at least one real parameter may include a real convolution weight used in the complex number-based convolutional layer, and the initializing may include identifying a real convolution weight based on a color value of at least one of the red, green, and blue (RGB) colors, determining a complex convolution weight based on the real convolution weight, and initializing the convolutional layer based on the complex convolution weight.

According to another embodiment, the at least one real parameter may include a real mean, a real variance, a real bias, and a real BN weight used in the complex number-based BN layer, and the initializing may include determining a complex mean, a complex variance, a complex bias, and a complex BN weight corresponding respectively to the real mean, the real variance, the real bias, and the real BN weight, and initializing the BN layer based on the complex mean, the complex variance, the complex bias, and the complex BN weight.

Various aspects and features of the present invention are defined in the appended claims. Combinations of features of the dependent claims may be combined with features of the independent claims as appropriate, and not merely as explicitly set out in the claims.

In addition, one or more features selected from any one embodiment described in the present disclosure may be combined with one or more features selected from any other embodiment described in the present disclosure, provided that the alternative combination of features at least partially alleviates one or more technical problems discussed in the present disclosure, or at least partially alleviates a technical problem discernible by a person skilled in the art from the present disclosure, and further provided that the particular combination or permutation of embodiment features so formed would not be understood by a person skilled in the art to be incompatible.

In any example implementation described in the present disclosure, two or more physically distinct components may alternatively be integrated into a single component where such integration is possible, provided that the same function is performed by the single component so formed. Conversely, a single component of any embodiment described in the present disclosure may alternatively be implemented, where appropriate, as two or more distinct components that achieve the same function.

It is an object of certain embodiments of the present invention to solve, mitigate, or eliminate, at least in part, at least one of the problems and/or disadvantages associated with the prior art. Certain embodiments aim to provide at least one of the advantages described below.

The apparatus and method according to various embodiments of the present disclosure recognize objects by using a CNN based on complex data.

In addition, the apparatus and method according to various embodiments of the present disclosure allow a CNN trained on real-valued data to be modified into a complex number-based CNN.

In addition, the apparatus and method according to various embodiments of the present disclosure allow the initialization parameters of a CNN to be determined in order to modify a CNN trained on real-valued data into a complex number-based CNN.

In addition, the apparatus and method according to various embodiments of the present disclosure apply a complex number-based CNN to a SAR image based on complex data, so that objects in the SAR image can be recognized accurately.

In addition, the apparatus and method according to various embodiments of the present disclosure apply a complex number-based CNN to a SAR image based on complex data, so that objects in the SAR image can be recognized quickly.

The effects obtainable from the present disclosure are not limited to those mentioned above, and other effects not mentioned will be clearly understood from the description below by those of ordinary skill in the art to which the present disclosure belongs.

FIG. 1 illustrates an object recognition apparatus according to various embodiments of the present disclosure.
FIG. 2 illustrates the structure of a complex number-based CNN according to various embodiments of the present disclosure.
FIG. 3 illustrates an example of a complex residual block included in a complex number-based CNN according to various embodiments of the present disclosure.
FIG. 4 is a schematic diagram of a method by which a complex number-based CNN generates a feature map according to various embodiments of the present disclosure.
FIG. 5 is a schematic diagram of an object recognition method of an object recognition apparatus according to various embodiments of the present disclosure.
FIG. 6 is a flowchart of a method of operating an object recognition apparatus according to various embodiments of the present disclosure.
FIG. 7 illustrates an example of an object recognition result obtained using an object recognition apparatus according to various embodiments of the present disclosure.
FIG. 8 illustrates another example of an object recognition result obtained using an object recognition apparatus according to various embodiments of the present disclosure.

The terms used in the present disclosure are used only to describe specific embodiments and may not be intended to limit the scope of other embodiments. A singular expression may include the plural unless the context clearly indicates otherwise. Terms used herein, including technical and scientific terms, may have the same meaning as commonly understood by one of ordinary skill in the art described in the present disclosure. Terms used in the present disclosure that are defined in general dictionaries may be interpreted as having the same or a similar meaning to their meaning in the context of the related art and, unless explicitly defined in the present disclosure, are not to be interpreted in an idealized or excessively formal sense. In some cases, even terms defined in the present disclosure cannot be interpreted so as to exclude embodiments of the present disclosure.

The various embodiments of the present disclosure described below use a hardware-based approach as an example. However, because the various embodiments include techniques that use both hardware and software, they do not exclude a software-based approach.

The present disclosure relates to an apparatus and method for recognizing an object using a complex number-based CNN. Specifically, the present disclosure describes a technique for recognizing an object in a complex number-based image using a complex number-based CNN.

Various embodiments are described in detail below with reference to the accompanying drawings so that those of ordinary skill in the art to which the present disclosure pertains can readily practice them. However, because the technical idea of the present disclosure may be implemented in various modified forms, it is not limited to the embodiments described herein. In describing the embodiments disclosed herein, a detailed description of related known technology is omitted where it is judged that it could obscure the gist of the technical idea of the present disclosure. Identical or similar components are given the same reference numerals, and redundant descriptions of them are omitted.

In the present specification, when an element is described as being "connected" to another element, this includes not only the case in which it is "directly connected" but also the case in which it is "indirectly connected" with another element interposed between them. When an element is said to "include" another element, this means that, unless specifically stated otherwise, it may further include other elements rather than excluding them.

Some embodiments may be described in terms of functional block configurations and various processing steps. Some or all of these functional blocks may be implemented with any number of hardware and/or software configurations that perform specific functions. For example, the functional blocks of the present disclosure may be implemented by one or more microprocessors or by circuit configurations for a given function. The functional blocks of the present disclosure may be implemented in various programming or scripting languages, and may be implemented as algorithms running on one or more processors. A function performed by one functional block of the present disclosure may be performed by a plurality of functional blocks, and functions performed by a plurality of functional blocks may be performed by a single functional block. In addition, the present disclosure may employ conventional techniques for electronic configuration, signal processing, and/or data processing.

In addition, in the present disclosure, the expressions "greater than" and "less than" are used to determine whether a specific condition is satisfied or fulfilled, but these are only examples and do not exclude "greater than or equal to" or "less than or equal to". A condition described as "greater than or equal to" may be replaced with "greater than", a condition described as "less than or equal to" may be replaced with "less than", and a condition described as "greater than or equal to and less than" may be replaced with "greater than and less than or equal to".

Terms such as "... unit" and "...er" (Korean suffixes '…부' and '…기') used below refer to a unit that processes at least one function or operation, which may be implemented as hardware, software, or a combination of hardware and software. The object recognition apparatus may include a communication unit, a storage unit, and a control unit. At least one processor may be functionally coupled to the control unit of the object recognition apparatus to perform the operations of the object recognition apparatus.

FIG. 1 illustrates an object recognition apparatus 100 according to various embodiments of the present disclosure.

The object recognition apparatus 100 recognizes objects captured in an image. The object recognition apparatus 100 may generate information about the objects recognized in the image, and may recognize an object by determining the size and type of the objects recognized in the image. The object recognition apparatus 100 may receive an image in which each pixel has a real value or a complex value, and may recognize objects by analyzing the image. Here, an image having complex values may include a synthetic aperture radar (SAR) image containing complex data on amplitude and phase. According to an embodiment of the present disclosure, the object recognition apparatus 100 may include a transceiver 101, a complex number-based CNN 103, and an object recognition information generator 105.

The transceiver 101 receives images. The transceiver 101 may receive a real number-based image signal or a complex number-based image signal from an external source over a wireless or wired channel. According to an embodiment of the present disclosure, the transceiver 101 may receive a SAR image captured by a SAR, and may pass the received SAR image to the complex number-based CNN 103.

The complex number-based CNN 103 generates a feature map from the received SAR image. On receiving a complex number-based image, the complex number-based CNN 103 may perform a complex number-based convolution operation. Hereinafter, "convolution operation" may refer to the operations performed in the convolutional layer, the batch normalization (BN) layer, the rectified linear unit (ReLU) layer, and the max pooling layer of the CNN.

According to another embodiment of the present disclosure, the complex number-based CNN 103 may already have been trained on real number-based data before it is trained on complex data. If it has been pre-trained on real-valued data, the complex number-based CNN 103 may perform complex initialization so that it can use complex data, and may then be trained on complex number-based data. The structure of the complex number-based CNN 103 is described in detail with reference to FIGS. 2 to 5.

The object recognition information generator 105 uses the feature map to generate information about the objects recognized in the image. The object recognition information generator 105 may generate object recognition information based on the feature map generated by the complex number-based CNN 103. According to an embodiment of the present disclosure, the object recognition information generator 105 may detect a region of interest (RoI) in the feature map using a region proposal network (RPN), apply RoI pooling based on the feature map and the region of interest, and apply a classification operation to the pooled map using an FC layer to generate the information for recognizing the object. According to an embodiment of the present disclosure, the object recognition information may include at least one of information about the size of an object and information about the type of an object. The generation process performed by the object recognition information generator 105 is described in detail with reference to FIG. 5.

FIG. 2 illustrates a structure 200 of a complex number-based CNN according to various embodiments of the present disclosure. FIG. 2 exemplifies the network architecture of the complex number-based CNN 103 of FIG. 1.

The structure 200 of the complex number-based CNN may include a general ResNet (residual network) structure. That is, the complex number-based CNN 103 may use a skip connection structure that connects the input of each block directly to its output. In other words, the complex number-based CNN 103 may use an operation in which the input value is added to the output value, and may learn a residual that drives the output function toward 0. Each of the blocks included in the complex number-based CNN 103 performs a convolution operation using a plurality of layers. Here, the plurality of layers may include at least one of a convolutional layer, a BN layer, a ReLU layer, and a max pooling layer. Unlike a conventional real number-based CNN, the complex number-based CNN 103 according to the present disclosure may perform complex number-based convolution operations in the plurality of layers. Because the complex number-based CNN 103 is trained on complex values that combine a real part and an imaginary part, rather than on real values, the complex number-based convolution operation is performed differently from the conventional real number-based convolution operation. The complex number-based convolution operation may be performed using the methods described below.

i) Complex number-based convolution

Convolution may be used to extract features from an image. In a real number-based convolution operation, the result of applying the convolution operation to an input value X and a weight W may be expressed as X * W. In the complex number-based convolution operation according to the present disclosure, the convolution result may be determined as in Equation 1.

Referring to Equation 1, W denotes the weight, X denotes the input value, W_r denotes the real part of the weight, W_i denotes the imaginary part of the weight, X_r denotes the real part of the input value, and X_i denotes the imaginary part of the input value. That is, when complex number-based convolution is performed, the result separates into a real part W_r * X_r - W_i * X_i and an imaginary part W_r * X_i + W_i * X_r, so that $W * X = (W_r * X_r - W_i * X_i) + i\,(W_r * X_i + W_i * X_r)$.
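The decomposition into real and imaginary parts can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the implementation of the disclosure: the helper `conv1d_valid` (a 1-D sliding-window product without padding) and the function name `complex_conv1d` are hypothetical, and an actual network would apply 2-D kernels inside a deep learning framework.

```python
def conv1d_valid(x, w):
    """Sliding-window product of x with kernel w, no padding (hypothetical helper)."""
    n = len(x) - len(w) + 1
    return [sum(x[i + k] * w[k] for k in range(len(w))) for i in range(n)]


def complex_conv1d(x_r, x_i, w_r, w_i):
    """Complex convolution assembled from four real-valued convolutions."""
    rr = conv1d_valid(x_r, w_r)  # W_r * X_r
    ii = conv1d_valid(x_i, w_i)  # W_i * X_i
    ri = conv1d_valid(x_i, w_r)  # W_r * X_i
    ir = conv1d_valid(x_r, w_i)  # W_i * X_r
    real = [a - b for a, b in zip(rr, ii)]  # real part: W_r*X_r - W_i*X_i
    imag = [a + b for a, b in zip(ri, ir)]  # imaginary part: W_r*X_i + W_i*X_r
    return real, imag


# Illustrative values only.
x_r, x_i = [1.0, 2.0, 3.0], [0.5, -1.0, 2.0]
w_r, w_i = [1.0, -1.0], [2.0, 0.5]
real, imag = complex_conv1d(x_r, x_i, w_r, w_i)
```

Building the result from four real convolutions means existing real-valued convolution routines can be reused unchanged; only the way their outputs are combined differs.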

ii) Complex number-based BN

BN may be used to normalize the data at each layer and thereby prevent the data distribution from shifting. In real number-based BN with mean E and variance V, the input value X is normalized as $\tilde{X} = (X - E)/\sqrt{V}$, and the real number-based BN result is obtained by applying a scale parameter γ and a shift parameter β, that is, $\gamma \tilde{X} + \beta$. Accordingly, real number-based BN outputs a real value.

In contrast, for the complex number-based BN according to the present disclosure, the mean, variance, bias, and weight may be expressed as in Equation 2.

Referring to Equation 2, E_r denotes the real part of the mean and E_i denotes the imaginary part of the mean. The variance V is expressed as a covariance: V_rr denotes the variance of the real part, V_ri denotes the covariance between the real part and the imaginary part, V_ir denotes the covariance between the imaginary part and the real part, and V_ii denotes the variance of the imaginary part. B_r denotes the real part of the bias and B_i denotes the imaginary part of the bias. W_rr, W_ri, W_ir, and W_ii denote the elements of the BN weight vector.

In the complex number-based BN operation according to the present disclosure, the input value X is normalized using the mean E and the variance V, that is, by subtracting E from X and scaling the result by the square root of V. Here, V^(1/2) may be expressed as in Equation 3.

Referring to Equation 3, V denotes the variance, V_rr denotes the variance of the real part, V_ri denotes the covariance between the real part and the imaginary part, V_ir denotes the covariance between the imaginary part and the real part, and V_ii denotes the variance of the imaginary part. S and T denote variables determined from these variance values.

The complex number-based BN result is then obtained by applying the scale parameter γ and the shift parameter β to the normalized value $\tilde{X}$, that is, $\gamma \tilde{X} + \beta$. Because the operation is complex number-based, the scale parameter may also be expressed in complex form, as in Equation 4.

Referring to Equation 4, γ_rr denotes the scale parameter between the real part and the real part, γ_ri denotes the scale parameter between the real part and the imaginary part, γ_ir denotes the scale parameter between the imaginary part and the real part, and γ_ii denotes the scale parameter between the imaginary part and the imaginary part. In addition, β_r denotes the real part of the shift parameter and β_i denotes the imaginary part of the shift parameter.

Referring to Equations 2 to 4, complex number-based BN, unlike real number-based BN, may output a complex number separated into a real part and an imaginary part.
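The complex BN steps (centering with E, whitening with the 2 x 2 covariance V, then applying γ and β) can be sketched as follows. The equations of the disclosure are not reproduced in this text, so the closed form used here for the inverse square root of a 2 x 2 symmetric matrix, with intermediate values named `s` and `t`, is an assumption; so are the function name `complex_batch_norm`, the small stabilizer `eps`, and the sample values.

```python
import math


def complex_batch_norm(xs, gamma=(1.0, 0.0, 0.0, 1.0), beta=0j, eps=1e-5):
    """Whiten complex samples with a 2x2 covariance, then scale and shift.

    gamma is (g_rr, g_ri, g_ir, g_ii); beta is a complex shift.
    """
    n = len(xs)
    e_r = sum(x.real for x in xs) / n  # E_r
    e_i = sum(x.imag for x in xs) / n  # E_i
    c_r = [x.real - e_r for x in xs]
    c_i = [x.imag - e_i for x in xs]
    v_rr = sum(a * a for a in c_r) / n + eps
    v_ii = sum(b * b for b in c_i) / n + eps
    v_ri = sum(a * b for a, b in zip(c_r, c_i)) / n  # equals v_ir
    # Assumed closed form for V^(-1/2) of a 2x2 symmetric positive-definite matrix.
    s = math.sqrt(v_rr * v_ii - v_ri * v_ri)
    t = math.sqrt(v_rr + v_ii + 2.0 * s)
    k = 1.0 / (s * t)
    m_rr, m_ri, m_ii = k * (v_ii + s), -k * v_ri, k * (v_rr + s)
    g_rr, g_ri, g_ir, g_ii = gamma
    out = []
    for a, b in zip(c_r, c_i):
        n_r = m_rr * a + m_ri * b  # whitened real part
        n_i = m_ri * a + m_ii * b  # whitened imaginary part
        out.append(complex(g_rr * n_r + g_ri * n_i, g_ir * n_r + g_ii * n_i) + beta)
    return out


# Illustrative samples only.
samples = [1 + 2j, 3 - 1j, -2 + 0.5j, 0.5 + 4j, 2 + 1j]
normalized = complex_batch_norm(samples)
```

With the default identity γ and zero β, the output has approximately zero mean, unit variance in both parts, and no correlation between the real and imaginary parts, which is what distinguishes this from normalizing each part separately.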

iii) Complex number-based ReLU

When the ReLU operation is applied to an input value X, the result of the real number-based ReLU operation may be expressed as ReLU(X). The result of the complex number-based ReLU operation according to the present disclosure may be determined as in Equation 5.

Referring to Equation 5, X is the input value and may be expressed as the sum of a real part X_r and an imaginary part X_i. That is, when the complex number-based ReLU operation is performed, the result separates into a real part ReLU(X_r) and an imaginary part ReLU(X_i), so that the output is $\mathrm{ReLU}(X_r) + i\,\mathrm{ReLU}(X_i)$.
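A short sketch of this part-wise activation, using Python's built-in complex type. The function name `complex_relu` and the sample inputs are illustrative assumptions.

```python
def complex_relu(z):
    """Apply ReLU to the real part and the imaginary part independently."""
    return complex(max(z.real, 0.0), max(z.imag, 0.0))


# One sample per quadrant of the complex plane.
outputs = [complex_relu(z) for z in [1 + 2j, -1 + 2j, 1 - 2j, -1 - 2j]]
# outputs == [(1+2j), 2j, (1+0j), 0j]
```

Only inputs in the first quadrant pass through unchanged; a negative real or imaginary part is clipped to zero on its own axis.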

iv) Complex number-based max pooling

In the real number-based max pooling operation, the real data is divided into a plurality of groups, and max pooling selects the largest real value within each group. In contrast, complex number-based max pooling separates each complex number into its real part and its imaginary part and, within each group, selects the largest real part and the largest imaginary part. That is, complex number-based max pooling performs the operation on the real parts and on the imaginary parts separately.
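As a minimal sketch of pooling the two parts separately, the following uses 1-D, non-overlapping groups. The function name `complex_max_pool1d`, the group size, and the sample values are assumptions for illustration; image data would use 2-D windows.

```python
def complex_max_pool1d(zs, size=2):
    """Max-pool real parts and imaginary parts independently over each group."""
    pooled = []
    for start in range(0, len(zs) - size + 1, size):
        group = zs[start:start + size]
        pooled.append(complex(max(z.real for z in group),
                              max(z.imag for z in group)))
    return pooled


pooled = complex_max_pool1d([1 + 5j, 3 - 2j, -4 + 0j, -1 - 7j])
# pooled == [(3+5j), (-1+0j)]
```

Note that a pooled value such as 3+5j need not equal any single input: its real part and imaginary part can come from different elements of the group.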

Referring to FIG. 2, the complex number-based CNN 103 may consist of one complex pre-residual block 201 and a plurality of complex residual blocks 205. Here, the plurality of complex residual blocks 205 may be divided into first to fourth stages 210, 230, 250, and 270. Although FIG. 2 shows the case in which the complex number-based CNN 103 includes four stages, the number of stages and the number of complex residual blocks may be changed according to the user's settings.

The complex pre-residual block 201 filters the data before the data is input to the complex residual block 203. According to an embodiment of the present disclosure, the complex pre-residual block 201 may include a complex number-based convolutional layer, a complex number-based BN layer, a complex number-based ReLU layer, and a complex number-based max pooling layer. Complex data delivered to the complex number-based CNN 103 may pass through the complex pre-residual block 201 and then be input to the complex residual block 203 in sequence.

The complex residual block 203 filters the input data in order to generate a feature map from it. The complex residual block 203 may generate the feature map by applying a convolution operation to the input data. According to an embodiment of the present disclosure, the complex residual block 203 may include a complex number-based convolutional layer, a complex number-based BN layer, and a complex number-based ReLU layer. Complex data input to the complex residual block 203 passes through the complex residual blocks of the plurality of stages, and a feature map may be generated as a result.

FIG. 3 illustrates an example 300 of a complex residual block included in a complex number-based CNN according to various embodiments of the present disclosure. FIG. 3 shows an example of the network architecture of the complex residual block 203 of FIG. 2.

FIG. 3 is a schematic diagram comparing a conventional real number-based residual block 310 with a complex number-based residual block 360. Although FIG. 3 shows a structure in which three convolutional layers and three BN layers alternate within one residual block, the number of convolutional layers and BN layers may be changed according to the user's settings.

The real number-based residual block 310 consists of convolutional layers, BN layers, and a ReLU layer. According to an embodiment of the present disclosure, the real number-based residual block 310 may have a structure in which convolutional layers and BN layers alternate, followed by a final ReLU layer. According to an embodiment of the present disclosure, the input value X_I may be filtered by passing through at least one convolutional layer and at least one BN layer and finally through a ReLU layer that applies the activation function. The real number-based residual block 310 may then generate the output value X_{I+1} based on the input value X_I and the data filtered by the ReLU. The generated output value X_{I+1} may be passed to the next real number-based residual block in the sequence.

The complex number-based residual block 360 may have a structure in which complex number-based convolutional layers and complex number-based BN layers alternate, followed by a final complex number-based ReLU layer. According to an embodiment of the present disclosure, the input value X_I may be filtered by passing through at least one complex number-based convolutional layer and at least one complex number-based BN layer and finally through a complex number-based ReLU layer that applies the activation function. The complex number-based residual block 360 may then generate the output value X_{I+1} based on the input value X_I and the data filtered by the complex number-based ReLU. The generated output value X_{I+1} may be passed to the next complex number-based residual block in the sequence.

Conventionally, a real number-based CNN built from real number-based residual blocks has been used to recognize objects in an image. Consequently, to recognize an object in an image composed of complex data, such as a SAR image, a real number-based CNN had no choice but to use only one of the real part or the imaginary part of the complex data. In contrast, the present disclosure may recognize an object using a complex number-based CNN built from complex number-based residual blocks. However, when a CNN pre-trained on real data is converted into a complex number-based CNN, the convolutional layers and BN layers of the complex number-based CNN must be initialized so that they can accept complex data. Conventionally, because the parameters pre-trained in a real number-based neural network differ in size from the complex number-based parameters, Xavier initialization has been used as the complex initialization method. Xavier initialization is simple, but it yields lower detection performance than reusing the parameters already learned on real data.

Instead of Xavier initialization, the complex number-based CNN according to the present disclosure may initialize the plurality of layers for complex operations based on parameters pre-trained on real data. The pre-trained parameters may include at least one of a real convolution weight, a real mean, a real variance, a real bias, and a real BN weight.

To initialize a complex number-based convolutional layer, the complex number-based CNN 103 may determine a real convolution weight based on the color value corresponding to at least one of the RGB (red, green, blue) colors. The complex number-based CNN 103 may then determine a complex convolution weight based on the real convolution weight and initialize the convolutional layer.

According to an embodiment of the present disclosure, the pre-trained weight of the first convolutional layer may be provided in the form [64, 3, 7, 7]. Here, 64, 7, and 7 indicate the size of the weight (64 filters with a 7 x 7 kernel), and 3 indicates the number of channels corresponding to RGB. The first convolutional layer of the complex number-based CNN can accept only one of the RGB color values. That is, so that it can perform convolution, the first convolutional layer may take the weight in the form [64, 1, 7, 7] and copy it identically into both the real part and the imaginary part. According to an embodiment of the present disclosure, the real part and the imaginary part of the complex convolution weight may each be set equal to the real convolution weight. In the convolutional layers after the first convolutional layer, the kernel size is the same as in the real convolution, so the pre-trained real values can be used for the complex initialization as they are. That is, the pre-trained real kernel values may be copied identically into the real part and the imaginary part.
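The weight-copying scheme for the convolutional layers can be sketched with nested lists shaped [out, in, kh, kw]. This is an illustration under assumptions: the text does not say which RGB channel is kept, so the `channel` argument (defaulting to 0) is hypothetical, as are the function names and the placeholder weight values standing in for a pre-trained tensor.

```python
import copy


def init_first_complex_conv(real_weight, channel=0):
    """Reduce [out, 3, kh, kw] to [out, 1, kh, kw] and copy into both parts.

    `channel` selects the single RGB channel to keep (an assumption).
    """
    single = [[copy.deepcopy(filt[channel])] for filt in real_weight]
    return copy.deepcopy(single), copy.deepcopy(single)  # (W_r, W_i)


def init_complex_conv(real_weight):
    """Later layers: shapes already match, so copy the real weight as is."""
    return copy.deepcopy(real_weight), copy.deepcopy(real_weight)


# Placeholder standing in for a pre-trained [64, 3, 7, 7] weight.
real_weight = [[[[o + 0.1 * c for _ in range(7)] for _ in range(7)]
                for c in range(3)] for o in range(64)]
w_r, w_i = init_first_complex_conv(real_weight, channel=0)
later_r, later_i = init_complex_conv([[[[0.5]]]])
```

Deep copies keep the real and imaginary weights independent, so that subsequent training on complex data can update them separately.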

To initialize a complex number-based BN layer, the complex number-based CNN 103 may determine the complex mean, complex variance, complex bias, and complex BN weight corresponding to the real mean, real variance, real bias, and real BN weight, respectively, and initialize the BN layer with them.

According to an embodiment of the present disclosure, the complex bias B, the complex mean E, the complex BN weight, and the complex variance may each be expressed as a data vector with separate real and imaginary parts, as in Equation 2. The BN layer may copy the pre-trained real bias identically into the real part and the imaginary part of the complex bias B, and may copy the pre-trained real mean identically into the real part and the imaginary part of the complex mean E. Because the complex BN weight W and the complex variance V are expressed as 2 x 2 data vectors, the BN layer may copy the real BN weight into W_rr and W_ii and set W_ri and W_ir to 0. In the same way, the BN layer may copy the real variance into V_rr and V_ii and set V_ri and V_ir to 0.
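The BN initialization described above maps four real scalars onto the complex parameters. A minimal sketch for a single channel follows; the function name `init_complex_bn`, the dictionary layout, and the numeric values are assumptions for illustration.

```python
def init_complex_bn(real_mean, real_var, real_weight, real_bias):
    """Build complex BN parameters for one channel from pre-trained real ones."""
    return {
        "E": (real_mean, real_mean),                    # (E_r, E_i)
        "B": (real_bias, real_bias),                    # (B_r, B_i)
        "W": ((real_weight, 0.0), (0.0, real_weight)),  # ((W_rr, W_ri), (W_ir, W_ii))
        "V": ((real_var, 0.0), (0.0, real_var)),        # ((V_rr, V_ri), (V_ir, V_ii))
    }


# Placeholder values standing in for pre-trained real BN statistics.
params = init_complex_bn(real_mean=0.2, real_var=1.5, real_weight=0.9, real_bias=-0.1)
```

Setting the off-diagonal entries to zero means the initialized layer treats the real and imaginary parts exactly as the real network treated its single value, with no cross-coupling until training introduces it.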

According to an embodiment of the present disclosure, the complex number-based CNN 103 initializes the parameters for complex operations based on the real parameters pre-trained in a real number-based network. Here, the real number-based network is fine-tuned by learning the BN weight and bias, and does not learn the convolution weight, mean, or variance. In contrast, the complex number-based CNN according to the present disclosure is fine-tuned by learning the convolution weight in addition to the BN weight and bias, and does not learn the mean or variance after initialization. The parameters learned by the complex number-based CNN are summarized in Table 1.

Referring to Table 1, when recognizing an object, the existing real number-based network learned only the BN weight and BN bias, without learning the convolution weight of the real ResNet, the BN mean, or the BN variance. The complex number-based network likewise does not learn the BN mean or BN variance when recognizing an object, but it can learn the convolution weight, the BN weight, and the BN bias.

Because the complex number-based CNN according to the present disclosure reuses the pre-trained real parameters when the real number-based network is converted into a complex number-based network, network optimization can be performed faster. Because the network is initialized directly from real parameters learned on real data, object recognition performance can be improved.

FIG. 4 is a schematic diagram 400 of a method by which a complex number-based CNN generates a feature map according to various embodiments of the present disclosure. FIG. 4 also exemplifies the network structure of the complex number-based CNN 103 of FIG. 1.

Referring to FIG. 4, when an image 410 is input to the complex number-based CNN 103, the complex number-based CNN 103 may generate a feature map (460) using the complex number-based layers included in the complex pre-residual block 430 and the first to n-th complex residual blocks 440-1 to 440-n. The image 410 may be a SAR image that contains complex data on amplitude and phase for each pixel. The complex number-based CNN 103 may be a convolutional neural network that has been trained on real data and then complex-initialized based on the resulting real parameters.

Referring to FIG. 4, the complex number-based CNN 103 may include the complex pre-residual block 430 and the first to n-th complex residual blocks 440-1 to 440-n. The complex residual blocks of FIG. 4 are a more specific form of the complex residual blocks of FIG. 2 and may perform the same functions. The number of complex residual blocks may be changed according to the user's settings.

The complex pre-residual block 430 may include complex number-based layers for receiving complex data and learning from it. According to an embodiment of the present disclosure, the complex pre-residual block 430 may include a complex convolutional layer, a complex BN layer, a complex ReLU layer, and a complex max pooling layer, arranged in that order.

The first to n-th complex residual blocks 440-1 to 440-n may include complex number-based layers for learning from complex data by performing a plurality of convolution operations. The first to n-th complex residual blocks 440-1 to 440-n may include a complex convolutional layer, a complex BN layer, and a complex ReLU layer. The number of complex convolutional layers and complex BN layers, within a complex residual block and/or from one complex residual block to another, may be changed according to the user's settings. That is, the numbers of convolutional layers and BN layers included in the first complex residual block 440-1 may differ from the numbers of convolutional layers and BN layers included in the n-th complex residual block 440-n.

FIG. 5 is a schematic diagram 500 of an object recognition method of an object recognition apparatus according to various embodiments of the present disclosure. FIG. 5 exemplifies an operation method of the object recognition apparatus 100 of FIG. 1. Referring to FIG. 5, the object recognition apparatus 100 may be configured by combining the complex number-based CNN 103 with the object recognition information generator 105, which is based on an ROI transformer 560.

Referring to FIG. 5, the object recognition apparatus 100 receives an image 501 using the transceiver 101. Here, the image 501 may include a SAR image based on complex data.

The object recognition apparatus 100 then generates a feature map from the received image using the complex number-based CNN 103. The complex number-based CNN 103 receives the I/Q (in-phase/quadrature) signal as two channels. The complex number-based CNN then combines the two sets of data, the real part and the imaginary part, in root mean square (RMS) form and converts them into one channel of real data.
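The final conversion from two channels to one real channel can be sketched as follows. The text says only that the real and imaginary parts are combined "in RMS form", so the exact formula is an assumption: this sketch uses sqrt((r^2 + i^2) / 2), which differs from the complex magnitude only by a constant factor of sqrt(2). The function name `rms_combine` and the sample values are hypothetical.

```python
import math


def rms_combine(z):
    """Collapse a complex feature into one real value (assumed RMS formula)."""
    return math.sqrt((z.real ** 2 + z.imag ** 2) / 2.0)


# Illustrative complex features.
features = [3 + 4j, 0j, -1 + 1j]
real_features = [rms_combine(z) for z in features]
```

Because the result is an ordinary real-valued map, the detection stages that follow can keep the structure of a real-valued network.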

The object recognition apparatus 100 may then recognize an object from the resulting real data. The object recognition apparatus 100 may recognize an object in the image based on a feature pyramid network (FPN) 505 and the ROI transformer 560; the FPN 505 and the ROI transformer 560 may have the same structure as in an ordinary real-valued network. According to an embodiment of the present disclosure, the ROI transformer 560 may include an FC-5 515, a decoder 517, and an FC-800 521. During the operation of the ROI transformer 560, a first partial feature map 513 and a second partial feature map 519, each extracted from part of a feature map, are generated, and classification 523 and regression 525 procedures may be performed.

According to an embodiment of the present disclosure, five feature maps 507 are generated by the FPN 505, and ROIs 509 are selected for each feature map. The selected ROIs 509 are input to the ROI transformer 560 as horizontal ROIs (HROIs). By learning the offset between each HROI and the rotated ground truth (RGT), a rotated ROI (RROI) is generated, and feature extraction may be performed again on the generated RROI. By repeating this process, the object recognition apparatus may recognize an object from the SAR image and generate object recognition information.

FIG. 6 is a flowchart 600 of a method of operating an object recognition apparatus according to various embodiments of the present disclosure. FIG. 6 exemplifies an operation method of the object recognition apparatus 100 of FIG. 1.

Referring to FIG. 6, in step 601, the object recognition apparatus 100 acquires a SAR image based on complex data. According to an embodiment of the present disclosure, the object recognition apparatus 100 may use the transceiver to acquire a SAR image that contains complex data on amplitude and phase for each pixel.
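For illustration, per-pixel amplitude and phase can be converted into the real and imaginary (I/Q) values that a complex network consumes. This is a sketch using Python's standard `cmath` module; the function name `to_iq` and the sample pixel are assumptions, and the disclosure does not specify how the acquired data is stored.

```python
import cmath
import math


def to_iq(amplitude, phase):
    """Convert polar form (amplitude, phase in radians) to (I, Q) components."""
    z = cmath.rect(amplitude, phase)
    return z.real, z.imag


# A hypothetical pixel with amplitude 2 and a phase of 90 degrees.
i_value, q_value = to_iq(2.0, math.pi / 2)
```

A phase of 90 degrees places all of the energy in the quadrature component, so `i_value` is approximately 0 and `q_value` is approximately 2.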

In step 603, the object recognition apparatus 100 generates a feature map by performing a complex number-based convolution operation. The object recognition apparatus 100 may generate the feature map by performing the complex number-based convolution operation using the complex number-based CNN. Here, the complex number-based convolution operation may refer to the operations performed in the convolution layer, the BN layer, the ReLU layer, and the max pooling layer. The complex number-based CNN includes a network composed of a plurality of layers, and the plurality of layers may include at least one of a complex number-based convolution layer, a complex number-based BN layer, a complex number-based ReLU layer, and a complex number-based max pooling layer.
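This passage does not spell out the arithmetic of the complex number-based convolution. A common formulation, shown here as an assumption rather than as the disclosed implementation, expands the product (x_r + i·x_i)(w_r + i·w_i) into four real-valued operations: the real part is x_r∗w_r − x_i∗w_i and the imaginary part is x_r∗w_i + x_i∗w_r. The helper names below are hypothetical, and the sketch covers a single channel without padding or stride.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def correlate2d_valid(x: np.ndarray, k: np.ndarray) -> np.ndarray:
    """'Valid' 2-D cross-correlation (the usual CNN 'convolution'), no padding."""
    windows = sliding_window_view(x, k.shape)  # (H-kh+1, W-kw+1, kh, kw)
    return np.einsum("ijkl,kl->ij", windows, k)


def complex_conv2d(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Complex convolution assembled from four real-valued correlations."""
    real = correlate2d_valid(x.real, w.real) - correlate2d_valid(x.imag, w.imag)
    imag = correlate2d_valid(x.real, w.imag) + correlate2d_valid(x.imag, w.real)
    return real + 1j * imag


rng = np.random.default_rng(0)
x = rng.normal(size=(6, 6)) + 1j * rng.normal(size=(6, 6))  # complex input patch
w = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))  # complex kernel
feature_map = complex_conv2d(x, w)  # complex array of shape (4, 4)
```

Writing the operation as four real correlations is convenient because it can reuse real-valued convolution routines while still mixing the real and imaginary parts of the input.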

According to an embodiment of the present disclosure, before performing the complex number-based convolution operation, the complex number-based CNN may learn at least one real parameter through a real number-based convolution operation and initialize the plurality of layers based on the at least one real parameter.

According to an embodiment of the present disclosure, the at least one real parameter may include a real convolution weight used in the convolution layer. The complex number-based CNN may identify the real convolution weight based on the color value of at least one of the RGB colors, and determine a complex convolution weight based on the real convolution weight. The complex number-based CNN may then initialize the convolution layer based on the determined complex convolution weight. According to an embodiment of the present disclosure, the complex number-based CNN may set the real part of the complex convolution weight equal to the real convolution weight, and set the imaginary part of the complex convolution weight equal to the real convolution weight as well. According to an embodiment of the present disclosure, after initializing the convolution layer, the complex number-based CNN may learn the complex convolution weight based on at least one training image.
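The initialization rule for the convolution layer can be sketched in a few lines of NumPy. The sketch assumes that the pretrained real weights have the layout (out_channels, 3, kh, kw) with one input slice per RGB color, and that a single color slice is kept because the SAR input has one complex channel. Both points are illustrative assumptions, the function names are hypothetical, and random values stand in for learned weights.

```python
import numpy as np


def select_color_channel(pretrained: np.ndarray, channel: int) -> np.ndarray:
    """Keep the kernels of one RGB input channel: (out, 3, kh, kw) -> (out, 1, kh, kw)."""
    return pretrained[:, channel:channel + 1]


def init_complex_conv_weight(real_weight: np.ndarray) -> np.ndarray:
    """Set both the real part and the imaginary part equal to the real weight."""
    return real_weight + 1j * real_weight


rng = np.random.default_rng(0)
pretrained_rgb = rng.normal(size=(64, 3, 7, 7))  # stand-in for learned real weights

real_w = select_color_channel(pretrained_rgb, channel=0)  # channel 0 chosen arbitrarily
complex_w = init_complex_conv_weight(real_w)
```

After this step `complex_w` only serves as a starting point; per the text, the complex weights are then trained further on training images.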

According to an embodiment of the present disclosure, the at least one real parameter may include a real mean, a real variance, a real bias, and a real BN weight used in the complex number-based BN layer. The complex number-based CNN may determine a complex mean, a complex variance, a complex bias, and a complex BN weight corresponding to the real mean, the real variance, the real bias, and the real BN weight, respectively, and initialize the BN layer based on the complex mean, the complex variance, the complex bias, and the complex BN weight. According to an embodiment of the present disclosure, the complex number-based CNN may set both the real part and the imaginary part of the complex mean equal to the real mean, and set both the real part and the imaginary part of the complex variance equal to the real variance.
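The BN initialization can be sketched in the same way. The text states the rule only for the mean and the variance; applying the same duplication to the bias and the BN weight is an assumption made here for illustration. The class and function names are hypothetical.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class ComplexBNParams:
    mean: np.ndarray
    var: np.ndarray
    bias: np.ndarray
    weight: np.ndarray


def duplicate_to_complex(p: np.ndarray) -> np.ndarray:
    """Build a complex array whose real and imaginary parts both equal p."""
    return p + 1j * p


def init_complex_bn(mean, var, bias, weight) -> ComplexBNParams:
    return ComplexBNParams(
        mean=duplicate_to_complex(mean),      # rule stated in the text
        var=duplicate_to_complex(var),        # rule stated in the text
        bias=duplicate_to_complex(bias),      # assumption: same rule reused
        weight=duplicate_to_complex(weight),  # assumption: same rule reused
    )


# Made-up per-channel statistics of a pretrained real-valued BN layer.
real_mean = np.array([0.1, -0.2, 0.3])
real_var = np.array([1.0, 0.5, 2.0])
real_bias = np.zeros(3)
real_weight = np.ones(3)

bn = init_complex_bn(real_mean, real_var, real_bias, real_weight)
```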

In step 605, the object recognition apparatus 100 may generate object recognition information based on the feature map. According to an embodiment of the present disclosure, the object recognition apparatus 100 may detect features by applying an ROI operation to the feature map generated by the complex number-based CNN. The object recognition apparatus 100 may identify an object included in the SAR image based on the ROI operation and generate information about the object. Thereafter, the object recognition apparatus 100 may output the generated object recognition information.

FIG. 7 shows an example 700 of an object recognition result obtained using an object recognition apparatus according to various embodiments of the present disclosure. FIG. 7 illustrates result images of recognizing an object using a CNN.

Referring to FIG. 7, for a case in which the object recognition apparatus detects a ship from a SAR image, a first result image 710 obtained by recognizing the object with a real number-based CNN and a second result image 760 obtained by recognizing the object with a complex number-based CNN are shown. In general, when the intersection over union (IoU) between an RROI and the RGT is 0.5 or more, the detection is judged to be a true positive (tp), and when the IoU is less than 0.5, it is judged to be a false positive (fp).
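The IoU criterion can be illustrated with a short sketch. For brevity it uses axis-aligned boxes given as (x1, y1, x2, y2); the rotated boxes (RROI and RGT) in the text would need a polygon intersection instead, so this is a simplification and not the disclosed procedure. The function names are hypothetical.

```python
def iou(box_a, box_b) -> float:
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def classify_detection(predicted, ground_truth, threshold: float = 0.5) -> str:
    """Return 'tp' when IoU >= threshold, otherwise 'fp'."""
    return "tp" if iou(predicted, ground_truth) >= threshold else "fp"


ground_truth = (0.0, 0.0, 4.0, 2.0)
good = classify_detection((1.0, 0.0, 4.0, 2.0), ground_truth)  # IoU = 6 / 8 = 0.75
poor = classify_detection((3.0, 0.0, 7.0, 2.0), ground_truth)  # IoU = 2 / 14
```

Here `good` evaluates to `"tp"` and `poor` to `"fp"`.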

Referring to the first result image 710, when a ship is detected using the real number-based CNN, the object recognition apparatus recognizes the ship image 711 and a predicted(tp) is produced. However, a large error is observed between the ship image 711 and the recognized ship image 713. In addition, the real number-based CNN detects objects that are not ships, or are not the main ship to be observed, and predicted(fp) results 715-1 to 715-3 are produced. With the real number-based CNN, objects unrelated to the purpose are recognized, so object recognition performance is low.

In contrast, referring to the second result image 760, when a ship is detected using the complex number-based CNN, the object recognition apparatus recognizes the ship image 761 and a predicted(tp) is produced. The error between the ship image 761 and the recognized ship image 763 is also observed to be smaller than the error produced with the real number-based CNN. In addition, the complex number-based CNN does not recognize objects other than the main ship to be observed, so object recognition performance is high.

FIG. 8 shows another example 800 of an object recognition result obtained using an object recognition apparatus according to various embodiments of the present disclosure. FIG. 8 illustrates result images of recognizing an object using a CNN.

Referring to FIG. 8, for a case in which the object recognition apparatus detects a ship from a SAR image, a third result image 810 obtained by recognizing the object with a real number-based CNN and a fourth result image 860 obtained by recognizing the object with a complex number-based CNN are shown.

Referring to the third result image 810, when a ship is detected using the real number-based CNN, the object recognition apparatus recognizes the ship image 811 and a predicted(tp) is produced. However, a large error is observed between the ship image 811 and the recognized ship image 813. In addition, because the real number-based CNN uses only the amplitude information in the SAR image, it may fail to recognize another ship image 815, producing a false negative. With the real number-based CNN, objects are recognized from limited information, so object recognition performance is low.

In contrast, referring to the fourth result image 860, when a ship is detected using the complex number-based CNN, the object recognition apparatus detects the ship image 861 and a predicted(tp) is produced. The error between the ship image 861 and the recognized ship image 863 is also observed to be smaller than the error produced with the real number-based CNN. In addition, because the complex number-based CNN considers both amplitude and phase in the SAR image, it may recognize the other ship 865 and produce a predicted(tp). That is, by using the complex number-based CNN, the object recognition apparatus can recognize objects more precisely than when the real number-based CNN is used.

Methods according to the embodiments described in the claims or specification of the present disclosure may be implemented in hardware, software, or a combination of hardware and software.

When implemented in software, a computer-readable storage medium storing one or more programs (software modules) may be provided. The one or more programs stored in the computer-readable storage medium are configured for execution by one or more processors in an electronic device. The one or more programs include instructions that cause the electronic device to execute the methods according to the embodiments described in the claims or specification of the present disclosure.

Such programs (software modules, software) may be stored in random access memory, non-volatile memory including flash memory, read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), a magnetic disc storage device, compact disc ROM (CD-ROM), digital versatile discs (DVDs) or other forms of optical storage, or a magnetic cassette. Alternatively, they may be stored in a memory composed of a combination of some or all of these. In addition, a plurality of each constituent memory may be included.

In addition, the program may be stored in an attachable storage device that can be accessed through a communication network such as the Internet, an intranet, a local area network (LAN), a wide area network (WAN), or a storage area network (SAN), or through a communication network composed of a combination thereof. Such a storage device may connect, through an external port, to a device that carries out an embodiment of the present disclosure. A separate storage device on the communication network may also connect to the device that carries out an embodiment of the present disclosure.

In the specific embodiments of the present disclosure described above, the elements included in the disclosure are expressed in the singular or the plural according to the specific embodiment presented. However, the singular or plural expression is chosen to suit the situation presented, for convenience of description, and the present disclosure is not limited to singular or plural elements. An element expressed in the plural may consist of a single element, and an element expressed in the singular may consist of a plurality of elements.

Although specific embodiments have been described in the detailed description of the present disclosure, various modifications are of course possible without departing from the scope of the present disclosure. Therefore, the scope of the present disclosure should not be limited to the described embodiments, and should be defined by the claims set out below and by their equivalents.

101: Transceiver
103: Complex number-based CNN
105: Object recognition information generator
201: Pre-complex residual block
203: Complex residual block
410: Image
406: Feature map
505: FPN
509: ROIs
515: FC-5
517: Decoder
521: FC-800
523: Classification
525: Regression

Claims

1. An apparatus for recognizing an object using a convolutional neural network (CNN), the apparatus comprising:
a transceiver configured to acquire a synthetic aperture radar (SAR) image based on complex data;
a complex number-based CNN configured to generate a feature map by performing a complex number-based convolution operation; and
an object recognition information generator configured to generate object recognition information based on the feature map,
wherein the complex number-based CNN includes a network composed of a plurality of layers,
wherein the complex number-based CNN:
learns, based on a real number-based convolution operation, at least one real parameter for generating the feature map, and
applies initialization to the plurality of layers based on the at least one real parameter, and
wherein the at least one real parameter includes a weight value learned in advance in a channel corresponding to one of the red, green, and blue (RGB) colors.

2. The object recognition apparatus according to claim 1,
wherein the plurality of layers include at least one of a complex number-based convolution layer, a complex number-based batch normalization (BN) layer, a complex number-based rectified linear unit (ReLU) layer, and a complex number-based max pooling layer.

3. (Deleted)

4. The object recognition apparatus according to claim 1,
wherein the at least one real parameter includes a real convolution weight used in the complex number-based convolution layer.

5. The object recognition apparatus according to claim 4,
wherein the complex number-based CNN:
identifies the real convolution weight based on a color value of one of the RGB colors,
determines a complex convolution weight based on the real convolution weight, and
initializes the convolution layer based on the complex convolution weight.

6. The object recognition apparatus according to claim 5,
wherein the complex number-based CNN:
sets the real part of the complex convolution weight equal to the real convolution weight, and
sets the imaginary part of the complex convolution weight equal to the real convolution weight.

7. The object recognition apparatus according to claim 5,
wherein the complex number-based CNN,
after initializing the convolution layer, learns the complex convolution weight based on at least one training image.

8. The object recognition apparatus according to claim 1,
wherein the at least one real parameter includes a real mean, a real variance, a real bias, and a real BN weight used in the complex number-based BN layer.

9. The object recognition apparatus according to claim 8,
wherein the complex number-based CNN:
determines a complex mean, a complex variance, a complex bias, and a complex BN weight corresponding to the real mean, the real variance, the real bias, and the real BN weight, respectively, and
initializes the BN layer based on the complex mean, the complex variance, the complex bias, and the complex BN weight.

10. The object recognition apparatus according to claim 9,
wherein the complex number-based CNN:
sets the real part of the complex mean equal to the real mean,
sets the imaginary part of the complex mean equal to the real mean,
sets the real part of the complex variance equal to the real variance, and
sets the imaginary part of the complex variance equal to the real variance.

11. A method of operating an apparatus that recognizes an object based on a complex number-based convolutional neural network (CNN), the method comprising:
acquiring a synthetic aperture radar (SAR) image based on complex data;
generating a feature map by performing a complex number-based convolution operation; and
generating object recognition information based on the feature map,
wherein the complex number-based CNN includes a network composed of a plurality of layers,
the method further comprising:
before performing the complex number-based convolution operation, learning, based on a real number-based convolution operation, at least one real parameter for generating the feature map; and
applying initialization to the plurality of layers based on the at least one real parameter,
wherein the at least one real parameter includes a weight value learned in advance in a channel corresponding to one of the red, green, and blue (RGB) colors.

12. The method according to claim 11,
wherein the plurality of layers include at least one of a complex number-based convolution layer, a complex number-based batch normalization (BN) layer, a complex number-based rectified linear unit (ReLU) layer, and a complex number-based max pooling layer.

13. (Deleted)

14. The method according to claim 11,
wherein the at least one real parameter includes a real convolution weight used in the complex number-based convolution layer, and
wherein performing the initialization comprises:
identifying the real convolution weight based on a color value of one of the RGB colors;
determining a complex convolution weight based on the real convolution weight; and
initializing the convolution layer based on the complex convolution weight.

15. The method according to claim 14,
wherein the at least one real parameter includes a real mean, a real variance, a real bias, and a real BN weight used in the complex number-based BN layer, and
wherein performing the initialization comprises:
determining a complex mean, a complex variance, a complex bias, and a complex BN weight corresponding to the real mean, the real variance, the real bias, and the real BN weight, respectively; and
initializing the BN layer based on the complex mean, the complex variance, the complex bias, and the complex BN weight.