KR102298175B1

KR102298175B1 - Image out-painting apparatus and method on deep-learning

Info

Publication number: KR102298175B1
Application number: KR1020200064544A
Authority: KR
Inventors: 이승완; 김번영
Original assignee: 건양대학교 산학협력단
Priority date: 2020-05-28
Filing date: 2020-05-28
Publication date: 2021-09-03

Abstract

The present invention relates to a deep learning-based image outpainting device and method. More specifically, the device comprises: an image generating unit that extracts a feature map of an input image containing a defect and generates an output image in which the defect is corrected by identifying correlations between pixels of the extracted feature map; an authenticity determining unit that determines whether the output image is authentic; and a loss calculating unit that, according to the authenticity determination, updates the image generating unit and the authenticity determining unit with weights that minimize their loss functions.

Description

Deep learning-based image outpainting apparatus and method {IMAGE OUT-PAINTING APPARATUS AND METHOD ON DEEP-LEARNING}

본 발명은 딥러닝기반의 영상 아웃페인팅 장치 및 그 방법에 관한 것이다.The present invention relates to a deep learning-based image outpainting apparatus and method.

More specifically, the invention relates to a deep learning-based outpainting apparatus comprising: an image generating unit that extracts a feature map of an input image containing a defect and generates an output image in which the defect is corrected by identifying correlations between pixels of the extracted feature map; an authenticity determining unit that determines whether the output image is authentic; and a loss calculating unit that, according to the authenticity determination, updates the image generating unit and the authenticity determining unit with weights that minimize their loss functions.

아웃페인팅은 주어진 영상의 외부 영역을 기존 영상의 데이터를 기반으로 예측하여 외삽하는 기술이다. Outpainting is a technology that predicts and extrapolates an external area of a given image based on data of an existing image.

일반적으로, 아웃페인팅은 바이리니어(bilinear), 바이큐빅(bicubic) 등과 같은 보외법(extrapolation)을 이용하여 영상 외부의 결손 데이터를 예측한다. 하지만, 이러한 아웃페인팅은 인페인팅에 비해 불확실성이 크고, 무의미한 결과를 초래할 가능성이 높아 영상의 외부 영역 예측 정확도가 매우 낮다는 문제점이 있다. In general, outpainting predicts missing data outside of an image by using an extrapolation method such as bilinear or bicubic. However, such outpainting has a problem in that the accuracy of prediction of the outer region of the image is very low because uncertainty is greater than inpainting and there is a high possibility of causing meaningless results.

한편, 딥러닝은 영상 분류, 음성 인식 등 인공 지능의 다양한 분야에서 사용되고 있다. 이러한 발전은 심층 신경망(Deep neural Network)이 역전파(back-propagation)을 통해 효과적으로 복잡한 확률 분포를 학습할 수 있기 때문이다.Meanwhile, deep learning is being used in various fields of artificial intelligence, such as image classification and voice recognition. This advance is because deep neural networks can effectively learn complex probability distributions through back-propagation.

특히, 적대적 생성 신경망(Generative Adversarial Networks;GAN)의 등장으로 인해 좀 더 효과적으로 학습데이터의 확률분포를 정교하게 학습할 수 있게 되었다. 즉, 생성모델들로 인해 좀 더 고차원의 데이터 분포들을 모방 및 재생산하는 것이 가능해졌다. 이는, 영상, 인조 음성, 복원 등 여러 분야에 널리 응용되고 있다. In particular, due to the advent of Generative Adversarial Networks (GANs), it is possible to more effectively learn the probability distribution of the training data precisely. In other words, generative models make it possible to imitate and reproduce higher-order data distributions. This is widely applied in various fields such as video, artificial sound, and restoration.

본 출원인은 이러한 적대적 생성 신경망을 활용하여 아웃페인팅 학습모델을 생성하되, 아웃페인팅에 적합하게 네트워크 구조를 변경함으로써 영상 예측 정확도를 향상시키고자 한다.The present applicant creates an outpainting learning model by utilizing such an adversarial generative neural network, but intends to improve image prediction accuracy by changing the network structure suitable for outpainting.

An object of the present invention is to provide a deep learning-based image outpainting apparatus that generates an outpainting learning model based on a generative adversarial network, with the network structure modified to suit outpainting so that image prediction accuracy can be improved.

To achieve the above object, a deep learning-based image outpainting apparatus according to an embodiment of the present invention may include: an image generating unit that extracts a feature map of an input image containing a defect and generates an output image in which the defect is corrected by identifying correlations between pixels of the extracted feature map; an authenticity determining unit that determines whether the output image is authentic; and a loss calculating unit that, according to the authenticity determination, updates the image generating unit and the authenticity determining unit with weights that minimize their loss functions.

또한, 상기 영상 생성부는 상기 추출된 특징맵에 커널을 적용하여 픽셀간 상관관계를 파악하며, 상기 커널의 픽셀간 간격 비율은 결손영역이 포함되도록 설정될 수 있다.In addition, the image generator may determine a correlation between pixels by applying a kernel to the extracted feature map, and the inter-pixel spacing ratio of the kernel may be set to include a missing region.

또한, 상기 영상 생성부는, 상기 업데이트된 가중치가 적용된 아웃페인팅 학습모델이 된다.In addition, the image generator becomes an outpainting learning model to which the updated weight is applied.

또한, 상기 진위 판단부는, 상기 생성된 출력영상을 보정영역과 비 보정영역으로 구분하여 각각에 대한 학습모델을 생성할 수 있다.Also, the authenticity determining unit may divide the generated output image into a corrected region and a non-corrected region to generate a learning model for each.

In addition, a deep learning-based image outpainting method according to an embodiment of the present invention may include: extracting a feature map of an input image containing a defect and generating an output image in which the defect is corrected by identifying correlations between pixels of the extracted feature map; determining whether the output image is authentic; calculating, according to the authenticity determination, weights that minimize the losses incurred in generating the output image and in determining its authenticity; and generating an outpainting learning model by applying the calculated weights to the output image generation and the authenticity determination.

As described above, the deep learning-based image outpainting apparatus and method of the present invention generate an outpainting learning model based on a generative adversarial network and can improve image prediction accuracy by modifying the network structure to suit outpainting.

In particular, the outpainting learning model can improve prediction accuracy because, through extended (dilated) convolution, the spacing between the pixels that make up the filter is set to a range that covers the size of the region to be outpainted, that is, the missing region, when image characteristics are analyzed.

In addition, the output image from the image generating unit is divided into a corrected region and a non-corrected region and a learning model is generated for each, which improves the learning efficiency of the authenticity determination and thereby the performance efficiency of the network.

FIG. 1 is a block diagram showing a schematic configuration of a deep learning-based image outpainting apparatus according to an embodiment of the present invention.
FIG. 2 is a diagram for explaining the image generating unit of FIG. 1.
FIG. 3 is a diagram for explaining the extended convolution unit of FIG. 2.
FIG. 4 is a diagram for explaining the authenticity determining unit of FIG. 1.
FIG. 5 is a flowchart illustrating a deep learning-based image outpainting learning method according to an embodiment of the present invention.
FIG. 6 is a diagram for explaining the deep learning-based image outpainting learning of FIG. 5.
FIG. 7 is a diagram for confirming the performance of a deep learning-based image outpainting learning model according to an embodiment of the present invention.

이하에서는 본 발명에 따른 딥러닝 기반의 영상 아웃페인팅 학습장치 및 그 방법에 관하여 첨부된 도면과 함께 더불어 상세히 설명하기로 한다.Hereinafter, a deep learning-based image outpainting learning apparatus and method according to the present invention will be described in detail together with the accompanying drawings.

도 1은 본 발명의 일 실시 예에 따른 딥러닝기반 영상 아웃페인팅 장치의 개략적인 구성을 나타내는 블록도이다. 도 1을 참고하면, 본 발명의 일 실시 예에 따른 딥러닝기반 영상 아웃페인팅 장치는 영상 생성부(110), 진위 판단부(120) 및 손실 산출부(130)를 포함한다. 1 is a block diagram showing a schematic configuration of a deep learning-based image outpainting apparatus according to an embodiment of the present invention. Referring to FIG. 1 , an apparatus for outpainting an image based on deep learning according to an embodiment of the present invention includes an image generating unit 110 , an authenticity determining unit 120 , and a loss calculating unit 130 .

The deep learning-based image outpainting apparatus of the present invention can generate an outpainting learning model by modifying the network structure of a generative adversarial network to suit outpainting.

이때, 아웃페인팅 학습모델은 딥러닝기반 영상 아웃페인팅 장치의 손실 최소화 가중치적용에 따라 생성되며, 최종 가중치가 적용된 영상 생성부(110)는 아웃페인팅 학습모델로 이용될 수 있다. 이는 도 2 내지 6을 참고하여 설명할 수 있다.In this case, the outpainting learning model is generated according to the application of the loss minimization weight of the deep learning-based image outpainting apparatus, and the image generator 110 to which the final weight is applied may be used as the outpainting learning model. This can be explained with reference to FIGS. 2 to 6 .

도 2는 도 1의 영상 생성부를 설명하기 위한 도면이다. 영상 생성부(110)는 딥러닝 기반으로 입력 영상의 특징맵을 추출하고, 특징맵의 픽셀간 상관관계를 파악하여 결손이 보정된 출력영상을 생성할 수 있다.FIG. 2 is a diagram for explaining the image generator of FIG. 1 . The image generator 110 may extract a feature map of an input image based on deep learning and generate an output image in which a defect is corrected by identifying a correlation between pixels of the feature map.

Referring to FIG. 2, the image generating unit 110 according to an embodiment of the present invention may include an encoder 111, an extended convolution unit 112, and a decoder 113. The deep learning-based image outpainting apparatus of the present invention is for generating an image outpainting learning model; when an image with a defect is input to the generated model, the model can output an image in which the defect has been corrected through outpainting. To this end, the apparatus is trained on input images.

여기서, 학습모델 생성을 위한 입력영상은 잘림아티팩트 등과 같은 외측영역에 결손이 발생된 데이터 영상과, 결손영역을 구분할 수 있는 참고영상이 필요하다. Here, the input image for generating the learning model requires a data image in which a defect is generated in an outer region such as a truncation artifact, and a reference image capable of distinguishing the defective region.

이를 위해, 본 발명의 실시 예에서는 도 6과 같이 기준영상(In)에 마스크 영상(M)을 합성하여 아웃페인팅이 수행된 영역이 제로 패딩된 즉, 결손이 발생된 데이터 영상(Im)을 생성하고, 결손이 발생된 데이터 영상(Im)과 마스크 영상(M)을 입력영상으로 이용할 수 있다.To this end, in an embodiment of the present invention, as shown in FIG. 6 , a mask image M is synthesized with a reference image In to generate a data image Im in which an outpainting area is zero-padded, that is, a defect is generated. In addition, the data image Im and the mask image M in which the defect is generated may be used as input images.

여기서, 기준영상(In)은 결손이 없는 원본영상이 될 수 있고, 마스크 영상(M)은 결손영역 생성을 위해 0(검은색 영역)과 1(흰색 영역)의 값을 가지는 영상이 될 수 있다.Here, the reference image In may be an original image without a defect, and the mask image M may be an image having values of 0 (black region) and 1 (white region) for generating a defect region. .

이때, 입력영상으로 마스크 영상을 함께 입력함으로써 신경망이 아웃페인팅을 수행할 영역에 대한 정보를 효과적으로 학습할 수 있다.In this case, by inputting the mask image as the input image together, the neural network can effectively learn information on the area to be outpainted.
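The construction of the training input described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: it assumes that a mask value of 1 keeps the original pixel and 0 marks the missing outer region, that compositing is an element-wise product, and that the defective image and the mask are stacked as two input channels. The function names and the toy image size are hypothetical.

```python
import numpy as np

def make_outer_mask(height, width, border):
    """Return a mask that is 1 in the kept centre and 0 in the outer border.

    Assumption: 1 marks data to keep, 0 marks the missing region.
    """
    mask = np.zeros((height, width), dtype=np.float32)
    mask[border:height - border, border:width - border] = 1.0
    return mask

def apply_mask(reference, mask):
    """Composite the mask onto the reference image: Im = In * M (element-wise)."""
    return reference * mask

# Toy 6x6 reference image In with no zero-valued pixels.
reference = np.arange(1, 37, dtype=np.float32).reshape(6, 6)
mask = make_outer_mask(6, 6, border=2)
damaged = apply_mask(reference, mask)  # Im: outer region is zero-padded

# Assumption: the network receives Im and M stacked as two channels.
network_input = np.stack([damaged, mask], axis=0)
```

Passing the mask as its own channel lets the network distinguish a genuinely dark pixel from a zero-padded missing pixel, which is the benefit the description attributes to inputting the mask together with the image.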

또한, 도 2에서 "Conv"는 콘볼루션 레이어를 나타내며, "ReLU" 및 "Sigmoid"는 활성함수를 나타낸다. 또한, f는 필터의 크기, η는 필터의 확장 비율, s는 필터의 이동범위가 될 수 있으며, 각 수치정보는 일 실시 예일 뿐이며, 이로 한정되지는 않는다. f, η 및 s는 입력영상의 특징 및 입력영상의 결손영역 크기 등에 따라 설계자에 의해 용이하게 변경될 수 있다.Also, in FIG. 2, “Conv” denotes a convolutional layer, and “ReLU” and “Sigmoid” denote activation functions. In addition, f may be the size of the filter, η may be the expansion ratio of the filter, and s may be the movement range of the filter, and each numerical information is only an example, and is not limited thereto. f, η, and s can be easily changed by a designer according to the characteristics of the input image and the size of the missing region of the input image.

When the encoder 111 receives the data image 11 (Im) and the mask image 12 (M) as the input image 10, it can extract a feature map of the defective input image through a plurality of convolutional layers. In an embodiment of the present invention, the following values were applied to the plurality of convolutional layers.

확장 콘볼루션부(112)는 추출된 특징맵에 커널을 적용하여 픽셀간 상관관계를 파악할 수 있다. 이때, 커널은 결손영역이 포함되도록 픽셀간 간격 비율을 설정함으로써 결손영역의 일부 및 데이터 영역의 일부를 동시에 필터링하여 픽셀 상관관계를 파악할 수 있다. 이에, 데이터 영역의 영상특성을 고려하여 결손영역을 예측하는 정확도를 향상시킬 수 있다.The extended convolution unit 112 may determine the correlation between pixels by applying a kernel to the extracted feature map. In this case, the kernel may determine the pixel correlation by simultaneously filtering a part of the missing area and a part of the data area by setting an interval ratio between pixels to include the missing area. Accordingly, it is possible to improve the accuracy of predicting the missing region in consideration of the image characteristics of the data region.

본 발명의 일 실시 예에서는 확장 콘볼루션부에 다음의 값을 적용하였다.In an embodiment of the present invention, the following values are applied to the extended convolution unit.

도 3은 도 2의 확장 콘볼루션부를 설명하기 위한 도면이다. 도 3을 참고하면, 커널의 크기는 3*3으로 동일하나 확장비율이 (a)는 1, (b)는 2, (c)는 3일 때를 예시하였다. 이때, 인코더(111)의 최종 레이어에서 출력된 특징맵(10')에 커널(50)을 적용하여 콘볼루션 연산이 수행되는 과정이 도시되었다.FIG. 3 is a diagram for explaining the extended convolution unit of FIG. 2 . Referring to FIG. 3 , it is exemplified when the size of the kernel is the same as 3*3, but the expansion ratio is 1 in (a), 2 in (b), and 3 in (c). In this case, a process in which a convolution operation is performed by applying the kernel 50 to the feature map 10 ′ output from the final layer of the encoder 111 is illustrated.

이때, 윈도우(window;70)는 특징맵(10')의 좌측 상단으로부터 우측 하단으로 한 칸씩 이동하며 콘볼루션 연산을 수행(71)할 수 있다. 이와 같이, 본 발명의 일 실시 예에 따른 확장 컨볼루션부(112)는 커널의 확장비율을 비례적으로 확장하여 영상의 픽셀간 상관관계를 파악할 수 있다. In this case, the window 70 may move from the upper left to the lower right of the feature map 10' one by one and perform a convolution operation (71). As described above, the extension convolution unit 112 according to an embodiment of the present invention can determine the correlation between pixels of an image by proportionally extending the extension ratio of the kernel.
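The effect of the expansion ratio on a 3*3 kernel can be illustrated with a small sketch. It assumes a single-channel feature map, a stride of 1, and no padding; none of these choices is stated in the description, and the function names are hypothetical. A kernel of size f with expansion ratio η spans f + (f - 1)(η - 1) pixels per side, so the three cases of FIG. 3 span 3, 5, and 7 pixels.

```python
import numpy as np

def effective_kernel_size(f, rate):
    """Side length covered by an f x f kernel with the given expansion ratio."""
    return f + (f - 1) * (rate - 1)

def dilated_conv2d(feature_map, kernel, rate=1):
    """Dilated cross-correlation with stride 1 and no padding (assumptions)."""
    f = kernel.shape[0]
    span = effective_kernel_size(f, rate)
    h, w = feature_map.shape
    out = np.zeros((h - span + 1, w - span + 1), dtype=feature_map.dtype)
    # The window moves one cell at a time from the upper left to the lower right.
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Sample every `rate`-th pixel inside the window.
            window = feature_map[i:i + span:rate, j:j + span:rate]
            out[i, j] = np.sum(window * kernel)
    return out

feature_map = np.arange(49, dtype=np.float64).reshape(7, 7)
kernel = np.ones((3, 3))
sizes = [effective_kernel_size(3, r) for r in (1, 2, 3)]
outputs = {r: dilated_conv2d(feature_map, kernel, rate=r) for r in (1, 2, 3)}
```

The number of kernel weights stays at nine in all three cases; only the spacing between sampled pixels grows, which is how a kernel can reach across both the data region and the missing region at once.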

디코더(113)는 확장 컨볼루션부(112)에서 출력되는 특징맵을 역변환하여 보정된 출력영상을 출력할 수 있다.The decoder 113 may output a corrected output image by inversely transforming the feature map output from the extended convolution unit 112 .

Here, because the image generating unit 110 must generate an output image that is similar to the reference image In and that the authenticity determining unit 120 cannot distinguish from a real image, it learns its network parameters in the direction that minimizes the value of its loss function.

본 발명의 일 실시 예에 따른 딥러닝기반 영상 아웃페인팅 학습장치의 학습 및 손실함수 산출은 도 6을 참고하여 설명할 수 있다. 여기서, 영상 생성부(110)는 하기의 수학식1(도 6의 Phase1)을 이용하여 기설정횟수(R1번) 학습될 수 있다. The learning and loss function calculation of the deep learning-based image outpainting learning apparatus according to an embodiment of the present invention may be described with reference to FIG. 6 . Here, the image generator 110 may be trained a preset number of times (R1 times) using Equation 1 below (Phase1 in FIG. 6 ).

또한, 진위 판단부(120)는 하기의 수학식2(도 6의 Phase2)를 이용하여 기설정횟수(R2번) 학습될 수 있다. In addition, the authenticity determination unit 120 may be learned a preset number of times (R2 times) using the following Equation 2 (Phase 2 of FIG. 6 ).

또한, 영상 생성부(110) 및 진위 판단부(120)는 수학식3(도 6의 Phase3)을 이용하여 경쟁적 학습을 수행하며, 기설정횟수(R3번) 학습될 수 있다. In addition, the image generating unit 110 and the authenticity determining unit 120 perform competitive learning using Equation 3 (Phase 3 in FIG. 6 ), and may be learned a preset number of times (R3 times).

이때, 각 단계별로 학습을 수행하면 1번(n=1) 전체학습이 수행된다. 본 발명의 딥러닝기반 영상 아웃페인팅 학습장치는 전체학습을 N번(n=N) 수행하여 최종 가중치가 적용된 영상 생성부(110)를 아웃페인팅 학습모델로 이용할 수 있다. At this time, if learning is performed in each step, total learning is performed once (n=1). The deep learning-based image outpainting learning apparatus of the present invention may perform the entire learning N times (n=N) and use the image generator 110 to which the final weight is applied as an outpainting learning model.
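The three-phase schedule can be sketched as nested loops. This is only an outline of the ordering described above: the phase labels are hypothetical names, the counts passed in are arbitrary example values, and the actual weight updates are omitted.

```python
def build_schedule(n_outer, r1, r2, r3):
    """Return the ordered list of (round, phase, step) training steps.

    One round runs Phase 1 r1 times, then Phase 2 r2 times, then
    Phase 3 r3 times; the whole round is repeated n_outer times.
    """
    schedule = []
    for n in range(1, n_outer + 1):
        for step in range(1, r1 + 1):
            schedule.append((n, "phase1_generator", step))
        for step in range(1, r2 + 1):
            schedule.append((n, "phase2_discriminator", step))
        for step in range(1, r3 + 1):
            schedule.append((n, "phase3_adversarial", step))
    return schedule

# Example values only; the description does not fix N, R1, R2, or R3.
schedule = build_schedule(n_outer=2, r1=3, r2=2, r3=1)
```

With these example values each round has 3 + 2 + 1 = 6 steps, so two rounds give 12 steps in total.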

구체적으로, 영상 생성부(110)는 손실 산출부(130)에서 다음의 수학식1을 이용하여 산출된 손실함수를 최소화하는 가중치로 영상 생성부(110)의 각 레이어의 가중치값을 업데이트할 수 있으며, 기설정된 횟수(R1)만큼 학습된 후 가중치의 업데이트가 수행될 수 있다. Specifically, the image generator 110 may update the weight value of each layer of the image generator 110 with a weight that minimizes the loss function calculated by the loss calculator 130 using Equation 1 below. In addition, after learning for a preset number of times R1, the weights may be updated.

Here, [equation symbol missing] is the root mean square error loss function of the reference image In and the input image Ip, M denotes the mask image, and G denotes the image generating unit.
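Because Equation 1 itself is not reproduced in this text, the following is only one plausible reading of a mask-weighted root mean square error, offered as a sketch. It assumes that the error is measured only over the missing region (mask value 0) and normalized by the number of missing pixels; the patent's actual formula may differ, and the function name is hypothetical.

```python
import numpy as np

def masked_rmse(reference, generated, mask):
    """RMSE restricted to pixels where mask == 0 (assumed missing region)."""
    missing = 1.0 - mask
    squared_error = missing * (generated - reference) ** 2
    n_missing = missing.sum()
    if n_missing == 0:
        return 0.0
    return float(np.sqrt(squared_error.sum() / n_missing))

reference = np.zeros((4, 4))          # stand-in for In
generated = np.full((4, 4), 2.0)      # stand-in for the output of G
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0                  # centre 2x2 block is original data
loss = masked_rmse(reference, generated, mask)
```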

또한, 영상 생성부(110)에서 출력된 출력영상은 진위 판단부(120)의 입력영상으로 입력될 수 있으며, 진위 판단부(120)에 의해 진위여부가 판별될 수 있다.Also, the output image output from the image generating unit 110 may be input as an input image of the authenticity determining unit 120 , and authenticity may be determined by the authenticity determining unit 120 .

도 4는 도 1의 진위 판단부를 설명하기 위한 도면이다. 도 4를 참고하면, 진위 판단부(120)는 영역 구분부(121), 글로벌 판단부(122), 로컬 판단부(123) 및 진위부(124)를 포함할 수 있다.FIG. 4 is a view for explaining the authenticity determining unit of FIG. 1 . Referring to FIG. 4 , the authenticity determination unit 120 may include a region division unit 121 , a global determination unit 122 , a local determination unit 123 , and an authenticity unit 124 .

한편, 영상 생성부(110)에서 출력된 출력영상(보정영역 및 비보정영역을 미구분한 전체영상)에 대해, 진위 판단부(120)에서 진위여부를 판단할 경우 보정영역에 의해 초기 학습손실이 발생되어 학습정확도가 낮아진다. On the other hand, when the authenticity determination unit 120 determines the authenticity of the output image output from the image generating unit 110 (the entire image in which the corrected region and the non-corrected region are not differentiated), the initial learning loss is caused by the correction region. This results in lower learning accuracy.

이에, 본 발명에서는 영상생성부(110)에서 출력된 출력영상을 비 보정영역(31)과 보정영역(32,33)으로 분류하여 각각 학습을 진행할 수 있다. Accordingly, in the present invention, the output image output from the image generator 110 may be classified into the non-correction region 31 and the correction region 32 and 33 to perform learning, respectively.

이때, 비 보정영역(31)을 학습하는 글로벌 판단부(122)는 입력영상(10)을 참고하여 생성된 영상을 학습함으로써 손실이 저감되며, 이로 인해 진위 판단부(120)의 손실도 저감되어 진위 판단부(120)의 학습정확도 및 전반적인 네트워크 학습성능이 향상될 수 있다.At this time, the global determination unit 122 learning the non-correction region 31 reduces the loss by learning the image generated by referring to the input image 10, and thus the loss of the authenticity determination unit 120 is also reduced. Learning accuracy and overall network learning performance of the authenticity determination unit 120 may be improved.

구체적으로, 영역 구분부(121)는 입력영상(30)을 보정영역(32,33)과 비 보정영역(31)으로 구분하여 비 보정영역(31)의 영상을 글로벌 판단부(122)로 출력하고, 보정영역(32,33)의 영상을 로컬 판단부(123)로 출력할 수 있다. 이때, 영역 구분부(121)는 이미 알고 있는 마스크 영상(M)과 기준영상(In)을 이용하여 영역을 구분할 수 있다.Specifically, the region division unit 121 divides the input image 30 into correction regions 32 and 33 and a non-correction region 31 and outputs the image of the non-correction region 31 to the global determination unit 122 . and the images of the correction regions 32 and 33 may be output to the local determination unit 123 . In this case, the region dividing unit 121 may divide the region by using the already known mask image M and the reference image In.
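The region division can be sketched with the mask alone. This is an illustration under assumptions: it treats mask value 1 as non-corrected (original) pixels and 0 as corrected (outpainted) pixels, and it uses a hypothetical layout with two missing side bands. The description also mentions the reference image In as an input to the region division unit 121, which this sketch does not use.

```python
import numpy as np

def split_regions(output_image, mask):
    """Split an output image into non-corrected and corrected parts.

    Assumption: mask == 1 marks original pixels, mask == 0 marks pixels
    filled in by outpainting.
    """
    non_corrected = output_image * mask        # would go to the global determination unit 122
    corrected = output_image * (1.0 - mask)    # would go to the local determination unit 123
    return non_corrected, corrected

output_image = np.full((4, 6), 5.0)
mask = np.zeros((4, 6))
mask[:, 2:4] = 1.0   # centre columns are original data, side bands were outpainted
non_corrected, corrected = split_regions(output_image, mask)
```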

글로벌 판단부(122) 및 로컬 판단부(123)는 각각 입력되는 영상에 대하여 복수의 콘볼루션 레이어를 적용하여 학습할 수 있으며, 진위부(124)는 각각 학습된 보정영역(32,33)의 영상과 비 보정영역(31)의 영상에 대해 종합적으로 진위여부를 판단할 수 있다. The global determination unit 122 and the local determination unit 123 may learn by applying a plurality of convolutional layers to an input image, respectively, and the authenticity unit 124 may be configured to perform learning of the learned correction regions 32 and 33 , respectively. The authenticity of the image and the image of the non-correction region 31 may be determined comprehensively.

이때, 본 발명의 일 실시 예에서는 진위 판단부(120)의 콘볼루션 레이어에 다음의 값을 적용하였다.In this case, in an embodiment of the present invention, the following values are applied to the convolutional layer of the authenticity determining unit 120 .

또한, 진위 판단부(120)는 영상 생성부(110)에서 생성된 출력영상의 진위판단을 위해 손실 함수의 값을 최소화하는 방향으로 네트워크 파라미터를 학습할 수 있다. In addition, the authenticity determining unit 120 may learn the network parameter in a direction of minimizing the value of the loss function to determine the authenticity of the output image generated by the image generating unit 110 .

이때, 진위 판단부(120)의 손실함수는 손실 산출부(130)에서 다음의 수학식2를 이용하여 산출할 수 있으며, 산출된 손실함수를 최소화하는 가중치로 진위 판단부(120)의 각 레이어의 가중치를 업데이트할 수 있다. 이때, 진위 판단부(120)는 기설정된 횟수(R2)만큼 학습 및 가중치의 업데이트를 수행하며, 영상 생성부(110)의 가중치는 수학식1에 의해 업데이트된 가중치로 교정될 수 있다. At this time, the loss function of the authenticity determining unit 120 can be calculated by the loss calculating unit 130 using Equation 2 below, and each layer of the authenticity determining unit 120 is a weight that minimizes the calculated loss function. You can update the weights of . In this case, the authenticity determining unit 120 performs learning and weight updates a preset number of times R2, and the weight of the image generator 110 may be corrected to the updated weight by Equation 1.

Here, [equation symbol missing] is the loss function of the authenticity determining unit 120, and D denotes the authenticity determining unit 120.

또한, 손실 산출부(130)는 진위여부에 따라 영상 생성부(110) 및 진위 판단부(120)의 손실함수를 경쟁적으로 최소화하는 가중치를 산출하여 최종적으로 영상 생성부(110) 및 진위 판단부(120)에 적용시킬 수 있다. 여기서, 진위 판단부(120)의 손실함수는 수학식2가 되고, 영상 생성부(110)의 손실함수는 다음의 수학식 3이 된다. 이때, 경쟁적 학습은 기설정된 횟수(R3)까지 반복학습된다.In addition, the loss calculator 130 calculates a weight for competitively minimizing the loss function of the image generator 110 and the authenticity determiner 120 according to the authenticity, and finally the image generator 110 and the authenticity determiner (120) can be applied. Here, the loss function of the authenticity determining unit 120 becomes Equation 2, and the loss function of the image generating unit 110 becomes the following Equation 3. At this time, the competitive learning is repeated learning up to a preset number of times (R3).

Here, [equation symbol missing] is the loss function of the image generating unit, and [equation symbol missing] is a constant that determines the relationship between the loss function of Equation 1 and the loss function of Equation 3.

영상 생성부(110)는 최종 가중치를 적용함으로써 아웃페인팅 학습모델로 이용할 수 있다. 이때, 진위 판단부(120)의 진위 판단 능력 및 영상 생성부(110)의 속임 영상 생성 능력이 경쟁적으로 향상됨으로써, 최종 생성된 아웃페인팅 학습모델은 정확도가 높은 아웃페인팅을 수행할 수 있다.The image generator 110 may be used as an outpainting learning model by applying a final weight. In this case, since the authenticity determining unit 120's ability to determine the authenticity and the image generating unit 110's ability to generate a deceptive image are competitively improved, the finally generated outpainting learning model can perform outpainting with high accuracy.

도 5는 본 발명의 일 실시 예에 따른 딥러닝기반 영상 아웃페인팅 학습방법을 설명하기 위한 흐름도이다. 도 6은 도 5의 딥러닝기반 영상 아웃페인팅 학습을 설명하기 위한 도면이다. 도 1 내지 도 4를 참고하여 설명할 수 있다.5 is a flowchart illustrating a deep learning-based image outpainting learning method according to an embodiment of the present invention. FIG. 6 is a diagram for explaining deep learning-based image outpainting learning of FIG. 5 . It can be described with reference to FIGS. 1 to 4 .

본 발명의 일 실시 예에 따른 딥러닝기반 영상 아웃페인팅 학습방법은, 도 6과 같이 기준영상(In)에 마스크 영상(M)을 합성하여 아웃페인팅이 수행된 영역이 제로 패딩된 즉, 결손이 발생된 데이터 영상(Im)을 생성하고, 결손이 발생된 데이터 영상(Im)과 마스크 영상(M)을 입력영상으로 이용할 수 있다.In the deep learning-based image outpainting learning method according to an embodiment of the present invention, as shown in FIG. 6 , the area on which the outpainting is performed by synthesizing the mask image M with the reference image In is zero-padded, that is, there is no defect. The generated data image Im may be generated, and the data image Im and the mask image M in which the defect is generated may be used as input images.

영상 생성부(110)가, 복수의 콘볼루션 레이어를 통해 결손이 발생된 입력영상의 특징맵을 추출하고(S510), 추출된 특징맵에 커널을 적용하여 픽셀간 상관관계를 파악하며(S520), 파악된 상관관계를 디코딩하여 결손이 보정된 출력영상(Io)를 출력할 수 있다(S530). 이때, 커널의 픽셀간 간격 비율은 결손영역이 포함되는 비율로 설정할 수 있다.The image generator 110 extracts a feature map of an input image in which a defect is generated through a plurality of convolutional layers (S510), and applies a kernel to the extracted feature map to determine a correlation between pixels (S520) , it is possible to output the output image (Io) in which the deficit is corrected by decoding the identified correlation (S530). In this case, the inter-pixel spacing ratio of the kernel may be set as a ratio including the defective region.

이때, 영상 생성부(110)는 S510 내지 S530 단계를 도 6과 같이 기설정된 횟수(R1)까지 반복학습을 수행하되, 손실함수(수학식1 : Phase1)를 최소화하는 가중치로 업데이트되며 학습을 수행할 수 있다.At this time, the image generator 110 repeats the learning in steps S510 to S530 up to a preset number of times (R1) as shown in FIG. 6, but is updated with a weight that minimizes the loss function (Equation 1: Phase1) and performs learning can do.

다음으로, 진위 판단부(120)가, 영상 생성부(110)에서 출력된 출력영상에 대한 진위여부를 판단할 수 있다(S540). 이때, 출력영상은 진위판단부(120)의 입력영상(30)이 되며, 진위 판단부(120)는 비보정영역(31)과 보정영역(32,33)을 구분하여 각각 학습을 수행하고, 진위 판단시에는 비보정영역(31)과 보정영역(32,33)에 대해 종합적으로 진위를 판단할 수 있다.Next, the authenticity determining unit 120 may determine the authenticity of the output image output from the image generating unit 110 (S540). At this time, the output image becomes the input image 30 of the authenticity determination unit 120, and the authenticity determination unit 120 divides the non-correction area 31 and the correction area 32 and 33 and performs learning, respectively, When determining the authenticity, it is possible to comprehensively determine the authenticity of the non-corrected area 31 and the corrected areas 32 and 33 .

이때, 진위 판단부(120)는 S540단계를 도 6과 같이 기설정된 횟수(R2)까지 반복학습을 수행하되, 진위 판단부(120)의 손실함수(수학식2 : Phase2)를 최소화하는 가중치로 업데이트되며 학습을 수행할 수 있다. At this time, the authenticity determination unit 120 repeats the learning in step S540 up to a preset number of times (R2) as shown in FIG. It is updated and learning can be performed.

다음으로, 손실 산출부(130)가, 진위 판단부(120)의 진위 여부에 따라 출력영상 생성 및 진위 여부 판단시 손실을 경쟁적으로 최소화시키는 가중치를 산출할 수 있다(S550). 이때, 경쟁적 학습은 기설정된 횟수(R3)까지 반복 학습된다. Next, the loss calculating unit 130 may calculate a weight for competitively minimizing the loss when generating the output image and determining the authenticity according to the authenticity of the authenticity determining unit 120 (S550). In this case, the competitive learning is repeatedly learned up to a preset number of times (R3).

S550는 수학식2(Phase2)에 따른 진위 판단부(120)의 손실함수와 수학식3(Phase3)에 따른 영상 생성부(110)의 손실함수를 경쟁적으로 최소화시키는 가중치를 산출할 수 있다.S550 may calculate a weight for competitively minimizing the loss function of the authenticity determining unit 120 according to Equation 2 (Phase2) and the loss function of the image generating unit 110 according to Equation 3 (Phase3).

다음으로, S510 내지 S550 단계가 기설정횟수(N번) 반복수행되면, 최종 산출된 가중치를 영상 생성부(110)에 적용하여 아웃페인팅 학습모델을 생성할 수 있다(S560).Next, when steps S510 to S550 are repeated a preset number of times (N times), an outpainting learning model may be generated by applying the final calculated weight to the image generator 110 (S560).

The performance of the outpainted output image, obtained by inputting an image with a defect into the outpainting learning model generated according to an embodiment of the present invention, can be checked in FIG. 7.

도 7은 본 발명의 일 실시 예에 따른 딥러닝기반 영상 아웃페인팅 학습모델의 성능을 확인하기 위한 도면이다. 도 7에서, (a)는 기준영상(In), (b)는 아웃페인팅을 적용할 입력영상(Im), (c)는 종래의 보외법을 적용하여 생성된 영상, (d)는 본 발명의 딥러닝 기반의 아웃페인팅 방법에 따라 생성된 영상이다.7 is a diagram for confirming the performance of a deep learning-based image outpainting learning model according to an embodiment of the present invention. 7, (a) is a reference image (In), (b) is an input image to which outpainting is applied (Im), (c) is an image generated by applying a conventional extrapolation method, (d) is an image of the present invention It is an image generated according to the deep learning-based outpainting method of

도 7의 (c)와 (d)를 비교시 본 발명의 딥러닝 기반의 아웃페인팅 방법을 적용하였을 때, 아웃페인팅의 정확도가 향상된 것을 확인할 수 있다.When comparing (c) and (d) of FIG. 7 , when the deep learning-based outpainting method of the present invention is applied, it can be seen that the outpainting accuracy is improved.

본 명세서에 기재된 실시예와 도면에 도시된 구성은 본 발명의 가장 바람직한 일 실시예에 불과할 뿐이고 본 발명의 기술적 사상을 모두 대변하는 것은 아니므로, 본 출원시점에 있어서 이들을 대체할 수 있는 다양한 균등물과 변형 예들이 있을 수 있음을 이해하여야 한다.The embodiments described in this specification and the configurations shown in the drawings are only the most preferred embodiment of the present invention and do not represent all the technical spirit of the present invention, so various equivalents that can be substituted for them at the time of the present application It should be understood that there may be variations and examples.

110: image generating unit 111: encoder 112: extended convolution unit
113: decoder
120: authenticity determining unit 121: region division unit 122: global determination unit
123: local determination unit 124: authenticity unit
130: loss calculating unit

Claims

A deep learning-based outpainting apparatus, comprising:
an image generating unit (110) including an encoder (111) that receives a defective data image (Im), generated by compositing a mask image (M) having black and white region values onto a reference image (In) that is a defect-free original image, together with the mask image (M), and extracts a feature map of the defective input image through a plurality of convolution layers; an extended (dilated) convolution unit (112) that identifies correlations between pixels by proportionally expanding the dilation rate of a kernel over the extracted feature map, wherein the inter-pixel spacing ratio of the kernel is set so that the defect region is included, and that performs the convolution operation while moving one cell at a time from the upper left to the lower right of the feature map; and a decoder (113) that inversely transforms the feature map output from the extended convolution unit (112) and outputs an output image in which the defect is corrected;
an authenticity determining unit (120) including a region dividing unit (121) that divides the output image into a corrected region and an uncorrected region using the mask image (M) and the reference image (In); a global determining unit (122) that learns the image of the uncorrected region by applying a plurality of convolution layers; a local determining unit (123) that learns the image of the corrected region by applying a plurality of convolution layers; and an authenticity unit (124) that determines the authenticity of the learned corrected-region image and uncorrected-region image; and
a loss calculating unit (130) that updates the image generating unit (110) and the authenticity determining unit (120) with weights that minimize the loss functions of the image generating unit (110) and the authenticity determining unit (120) according to the authenticity determination;
wherein the image generating unit (110) repeatedly learns, up to a preset number of times, the process of extracting the feature map of the defective input image, identifying correlations between pixels of the extracted feature map, and generating the output image in which the defect is corrected,
the authenticity determining unit (120) repeatedly learns, up to a preset number of times, the authenticity determination for the output image output from the image generating unit (110), and the loss calculating unit (130) repeats, up to a preset number of times, competitive learning that calculates, according to the authenticity result of the authenticity determining unit (120), weights that competitively minimize the losses in generating the output image and in determining authenticity, and
after the preset number of repetitions, the finally calculated weights are applied to the image generating unit (110) to generate an outpainting learning model.

(Deleted)

A deep learning-based outpainting method, comprising:
receiving a defective data image (Im), generated by compositing a mask image (M) having black and white region values onto a reference image (In) that is a defect-free original image, together with the mask image (M), and extracting a feature map of the defective input image through a plurality of convolution layers (S510);
identifying correlations between pixels by proportionally expanding the dilation rate of a kernel over the extracted feature map (S520);
setting the inter-pixel spacing ratio of the kernel so that the defect region is included, performing the convolution operation while moving one cell at a time from the upper left to the lower right of the feature map, and inversely transforming the resulting feature map to generate an output image in which the defect is corrected (S530);
dividing the output image into a corrected region and an uncorrected region using the mask image (M) and the reference image (In), learning each region by applying a plurality of convolution layers, and determining the authenticity of the learned corrected-region image and uncorrected-region image (S540);
calculating, according to the authenticity determination, weights that minimize the losses in generating the output image and in determining authenticity (S550); and
generating an outpainting learning model by applying the calculated weights to the output image generation and the authenticity determination (S560);
wherein the steps from extracting the feature map of the defective input image (S510) through generating the output image in which the defect is corrected (S530) are repeatedly learned up to a preset number of times while being updated with weights that minimize the loss function, and
the step of determining the authenticity of the output image (S540) is repeatedly learned up to a preset number of times, and, after the authenticity determination, competitive learning that calculates weights competitively minimizing the losses in generating the output image and in determining authenticity is repeated up to a preset number of times, after which the finally calculated weights are applied to generate the outpainting learning model.
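The dilated convolution recited in the claims (unit 112, step S520) can be illustrated with a minimal pure-Python sketch. The function name `dilated_conv2d` is hypothetical, and several details are assumptions not fixed by the claims: a single-channel feature map, a single fixed dilation rate, no padding, and a stride of one cell scanned from the upper left to the lower right.

```python
from typing import List

Matrix = List[List[float]]


def dilated_conv2d(feature: Matrix, kernel: Matrix, dilation: int) -> Matrix:
    """Slide a kernel whose taps are `dilation` cells apart over the feature map."""
    k_h, k_w = len(kernel), len(kernel[0])
    # Area of the feature map covered by the spread-out kernel.
    span_h = (k_h - 1) * dilation + 1
    span_w = (k_w - 1) * dilation + 1
    out_h = len(feature) - span_h + 1
    out_w = len(feature[0]) - span_w + 1

    output: Matrix = []
    for i in range(out_h):          # top to bottom, one cell at a time
        row = []
        for j in range(out_w):      # left to right, one cell at a time
            acc = 0.0
            for a in range(k_h):
                for b in range(k_w):
                    acc += kernel[a][b] * feature[i + a * dilation][j + b * dilation]
            row.append(acc)
        output.append(row)
    return output


# 5x5 feature map with values 0..24 and a 3x3 kernel of ones.
feature_map = [[float(r * 5 + c) for c in range(5)] for r in range(5)]
ones_kernel = [[1.0] * 3 for _ in range(3)]

dense = dilated_conv2d(feature_map, ones_kernel, dilation=1)
spread = dilated_conv2d(feature_map, ones_kernel, dilation=2)
print(dense[0][0], spread[0][0])
```

With a dilation of 2 the same nine kernel taps cover the whole 5x5 map, which shows how a larger spacing lets each output value draw on pixels that lie farther apart, including pixels on both sides of a missing region.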