KR20230101353A

KR20230101353A - Method and apparatus for extracting situation change sections of sports game videos using deep learning

Info

Publication number: KR20230101353A
Application number: KR1020210191387A
Authority: KR
Inventors: 박현희; 김형빈; 우기문; 김용호; 김지하
Original assignee: 명지대학교 산학협력단
Priority date: 2021-12-29
Filing date: 2021-12-29
Publication date: 2023-07-06

Abstract

The present invention relates to a method and apparatus for extracting situation change sections of sports game videos using deep learning, in which a deep learning model is used to remove noise images from a sports game video and the relay board (the scoreboard graphic shown in the broadcast) is used to extract a situation change section video. The method includes extracting image frames from a sports game video; detecting objects in the image frames using a deep learning model; selecting main images, based on the object detection result, using the color change of the relay board shown in the image frames; and generating a situation change section video by combining the selected main images. As a result, only the game scenes of high interest are selected from the sports game video, and a situation change section video is generated automatically.

Description

Method and apparatus for extracting situation change sections of sports game videos using deep learning

The present invention relates to a method and apparatus for extracting situation change sections of sports game videos using deep learning, and more particularly to a method and apparatus that remove noise images from a sports game video using a deep learning model and extract the situation change sections of the video using the relay board shown in the video.

In general, a situation change section video of a sports game is produced manually: a user or video expert watches the video and designates the start and end points at which the frames are to be cut.

Consequently, producing situation change section videos from a long video or from a large number of videos consumes considerable time and cost.

For example, Patent Document 1 below discloses a highlight extraction method and a highlight extraction device that include receiving an original game video, sampling images contained in the original game video, grouping the sampled images into scenes, and extracting a highlight video based on the grouped scenes.

With this approach, the highlight extraction method and device of Patent Document 1 can extract a highlight video from the original game video using machine learning.

They can also extract a highlight video suited to viewers' interests by analyzing the content expressed visually in the game video.

However, because Patent Document 1 groups the sampled images into scenes based on the similarity between adjacent images, it is not easy to remove noise images from the original game video.

In addition, Patent Document 1 cannot generate a highlight video by selecting only the game scenes of high interest from a sports game video.

Patent Document 2 below discloses a highlight extraction device and method that include a keyword generation unit and a situation change section generation unit. The keyword generation unit separates video information into an audio signal and a video signal, divides the audio signal into a plurality of audio sections, and generates keywords through character analysis of keywords in the video signal and phoneme-string matching of keywords in the audio sections. The situation change section generation unit extracts highlight sections by analyzing the sound in the audio sections input to the keyword generation unit, and generates a highlight section for each keyword by combining the highlight sections in which a specific keyword appears frequently.

As a result, Patent Document 2 allows the user to select and view not only the highlights of a specific video but also the highlights that match a keyword the user wants.

However, Patent Document 2 cannot extract a highlight video for games with low viewership or for sports games that are not accompanied by spectators' keyword text and cheering.

(Patent Document 1) Korean Registered Patent No. 10-2151668 (published September 3, 2020)
(Patent Document 2) Korean Registered Patent No. 10-1265960 (published May 22, 2013)

An object of the present invention is to provide a method and apparatus for extracting situation change sections of a sports game video that can automatically extract the situation change section videos of sports games that have so far been produced manually.

Another object of the present invention is to provide a method and apparatus for extracting situation change sections of a sports game video that can reduce the time and cost of producing situation change section videos by removing noise images from the sports game video using a deep learning model.

A further object of the present invention is to provide a method and apparatus for extracting situation change sections of a sports game video that can generate a situation change section video by selecting only the game scenes of high interest from the sports game video, regardless of the level of spectator participation.

To achieve these technical objects, a method for extracting situation change sections of a sports game video using deep learning according to one aspect of the present invention includes (a) extracting image frames from a sports game video and (b) detecting objects in the image frames using a deep learning model.

The method further includes (d) selecting main images, based on the object detection result, using the color change of the relay board shown in the image frames, and (e) generating a situation change section video by combining the selected main images.

After step (b), the method may further include (c) removing noise images from the sports game video based on the object detection result.
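As a rough illustration of step (c), the sketch below keeps only the frames in which a relay board was detected. This passage does not define which frames count as noise, so treating "no relay board detected" as noise is an assumption, as are the class name `relay_board`, the confidence threshold, and the `(label, confidence)` detection format.

```python
def remove_noise_frames(detections_per_frame, required_class="relay_board", min_conf=0.5):
    """Return the indices of frames that contain a confident relay-board detection.

    detections_per_frame: one list per frame of (label, confidence) tuples.
    Frames without such a detection are treated as noise (an assumption).
    """
    kept = []
    for frame_index, detections in enumerate(detections_per_frame):
        if any(label == required_class and conf >= min_conf
               for label, conf in detections):
            kept.append(frame_index)
    return kept
```

For example, given four frames of which only the first and last contain a relay board with confidence of at least 0.5, the function returns `[0, 3]`.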

In the present invention, step (b) includes (b1) setting classes for recognizing game scenes of high interest in the image frames and (b2) performing data labeling of the image frames based on the set classes.

Step (b) further includes (b3) training on the labeled image frames using deep learning and (b4) predicting the objects in the image frames based on the training result.

The classes include a relay board class for detecting image frames that contain each broadcaster's relay board, or a base-area class for detecting the base area within each broadcaster's relay board.

Step (d) includes (d1) detecting the base area of the relay board in the image frame and (d2) dividing the base area into two halves, either left and right or top and bottom.
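Step (d2) can be sketched as a simple split of the cropped base area. The example assumes the crop is available as a list of pixel rows; the function name and the `direction` values are hypothetical.

```python
def split_base_area(region, direction="left_right"):
    """Split a cropped base area (a list of pixel rows) into two halves.

    direction="left_right" returns (left_half, right_half);
    direction="top_bottom" returns (top_half, bottom_half).
    """
    if direction == "left_right":
        mid = len(region[0]) // 2
        return [row[:mid] for row in region], [row[mid:] for row in region]
    if direction == "top_bottom":
        mid = len(region) // 2
        return region[:mid], region[mid:]
    raise ValueError("direction must be 'left_right' or 'top_bottom'")
```

Each half can then be compared across frames on its own, as in step (d3).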

Step (d) further includes (d3) detecting a color change in the base area by comparing, for each divided image, the previous frame with the next frame, and (d4) selecting the image frame as a main image when a color change in the base area is determined.

In the present invention, step (d3) includes (d31) classifying the pixels of each image frame as black or white, (d32) representing the numbers of black and white pixels as a histogram, and (d33) detecting the difference in the base area between the previous frame and the next frame by comparing the black-to-white size difference in their histograms.
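Steps (d31) to (d33) can be sketched as follows. The binarization threshold of 128 and the use of the white-pixel ratio as the compared quantity are assumptions; the text only says that the black-to-white difference of the two histograms is compared.

```python
def black_white_counts(gray_pixels, threshold=128):
    """(d31)/(d32): classify 0-255 gray values and return (black_count, white_count)."""
    white = sum(1 for p in gray_pixels if p >= threshold)
    return len(gray_pixels) - white, white


def white_ratio_change(prev_pixels, next_pixels, threshold=128):
    """(d33): absolute change in the share of white pixels between two frames."""
    if not prev_pixels or not next_pixels:
        raise ValueError("both pixel lists must be non-empty")
    _, prev_white = black_white_counts(prev_pixels, threshold)
    _, next_white = black_white_counts(next_pixels, threshold)
    return abs(next_white / len(next_pixels) - prev_white / len(prev_pixels))
```

A large value suggests that the base indicator changed between the two frames; where to place the decision threshold is left open here.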

In the present invention, step (d3) includes (d311) classifying and scoring the pixels of each image frame according to brightness level, (d312) representing the scored pixel values as a histogram over the brightness levels, and (d313) detecting the difference in the base area by comparing, for each divided image, the histogram of the previous frame with the histogram of the next frame.

Step (d4) is characterized in that the chi-square distribution is used to determine the change in the base area between the previous-frame histogram and the next-frame histogram of each divided image.
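The brightness-histogram comparison can be sketched as below. The text names the chi-square distribution but gives no formula, so the symmetric chi-square distance used here is one common variant chosen as an assumption; the number of brightness levels and the decision threshold are likewise hypothetical.

```python
def brightness_histogram(gray_pixels, levels=8):
    """(d311)/(d312): count 0-255 gray values per brightness level."""
    hist = [0] * levels
    for p in gray_pixels:
        hist[min(p * levels // 256, levels - 1)] += 1
    return hist


def chi_square_distance(hist_a, hist_b):
    """Symmetric chi-square distance: sum of (a - b)^2 / (a + b) over non-empty bins."""
    total = 0.0
    for a, b in zip(hist_a, hist_b):
        if a + b > 0:
            total += (a - b) ** 2 / (a + b)
    return total


def base_area_changed(prev_pixels, next_pixels, threshold=10.0, levels=8):
    """(d313)/(d4): flag a change when the histogram distance exceeds the threshold."""
    distance = chi_square_distance(
        brightness_histogram(prev_pixels, levels),
        brightness_histogram(next_pixels, levels),
    )
    return distance > threshold
```

In practice the threshold would need to be tuned per broadcaster, since each relay board uses its own colors.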

The present invention further includes deleting duplicate image frames based on the main images selected in step (d4). In the present invention, the deep learning model is characterized in that it is YOLOv5.
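A minimal sketch of the duplicate-removal step follows. It assumes that duplicates arise when the same frame index is selected more than once, for example once for each half of the base area; the text does not say how duplicates are identified, and the function name is hypothetical.

```python
def drop_duplicate_frames(selected_indices):
    """Remove repeated frame indices while keeping the order of first appearance."""
    return list(dict.fromkeys(selected_indices))
```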

An apparatus for extracting situation change sections of a sports game video using deep learning according to another aspect of the present invention includes an input unit, a frame extraction unit, an object detection unit, a filtering unit, an image discrimination unit, a situation change section video generation unit, a storage unit, and a control unit.

The input unit may receive a sports game video. The frame extraction unit extracts image frames from the video received through the input unit. The object detection unit detects objects in the image frames using a deep learning model.

The image discrimination unit selects main images, based on the object detection result from the object detection unit, using the color change of the relay board shown in the image frames. The situation change section video generation unit generates a situation change section video by combining the selected main images.

The present invention further includes a filtering unit that removes noise images from the sports game video based on the object detection result from the object detection unit.

In the present invention, the image discrimination unit includes a relay board detection module that detects the base area of the relay board shown in the image frame, and a division module that divides the base area detected by the relay board detection module into two halves, either left and right or top and bottom.

The image discrimination unit further includes a change detection module that detects a color change in the base area by comparing, for each divided image, the previous frame with the next frame, and a determination module that selects the image frame as a main image when a color change in the base area is determined.

As described above, the method and apparatus for extracting situation change sections of a sports game video using deep learning according to the present invention can automatically extract the situation change section videos of sports games that have so far been produced manually.

In addition, the method and apparatus can reduce the time and cost of producing situation change section videos by removing noise images from the sports game video using a deep learning model.

Furthermore, the method and apparatus can generate a situation change section video by selecting only the game scenes of high interest from the sports game video, regardless of the level of spectator participation.

FIG. 1 is a block diagram of an apparatus for extracting situation change sections of a sports game video according to an embodiment of the present invention.
FIG. 2 shows anchor boxes of the YOLO algorithm according to an embodiment of the present invention.
FIG. 3 shows the grid used for object detection in the YOLO algorithm according to an embodiment of the present invention.
FIG. 4 shows the structure of the YOLOv5 algorithm according to an embodiment of the present invention.
FIG. 5 is a detailed block diagram of the object detection unit according to an embodiment of the present invention.
FIG. 6 illustrates the relay board class according to one embodiment of the present invention.
FIG. 7 shows a dataset generated through data labeling according to an embodiment of the present invention.
FIG. 8 is a detailed block diagram of the image discrimination unit according to an embodiment of the present invention.
FIG. 9 illustrates a change in the base area of the relay board according to an embodiment of the present invention.
FIG. 10 illustrates the scoring of the pixels of an image frame according to an embodiment of the present invention.
FIG. 11 shows a histogram of the pixels scored in FIG. 10.
FIGS. 12A and 12B illustrate relay board classes according to another embodiment of the present invention.
FIG. 13 is a flowchart of a method for extracting situation change sections of a sports game video according to one embodiment of the present invention.
FIG. 14 is a detailed flowchart of the object detection step of FIG. 13.
FIG. 15 is a detailed flowchart of the main image selection step of FIG. 13.
FIGS. 16 and 17 are detailed flowcharts of the color change detection step of FIG. 15.
FIG. 18 is a flowchart of a method for extracting situation change sections of a sports game video according to another embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art can readily practice the invention. The invention may, however, be implemented in many different forms and is not limited to the embodiments described here. To describe the invention clearly, parts unrelated to the description are omitted from the drawings, and similar parts are given similar reference numerals throughout the specification.

Throughout the specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise.

The present invention is described in detail below by describing preferred embodiments with reference to the accompanying drawings.

The same reference numerals in the drawings denote the same elements.

FIG. 1 is a block diagram of an apparatus for extracting situation change sections of a sports game video according to an embodiment of the present invention, and FIG. 2 shows the anchor boxes 301 of the YOLO algorithm according to an embodiment of the present invention.

FIG. 3 shows the grid 302 used for object detection in the YOLO algorithm according to an embodiment of the present invention, and FIG. 4 shows the structure of the YOLOv5 algorithm according to an embodiment of the present invention.

Here, the sports game video means any video that contains a sports game, such as a recording of a sports game or a broadcast of one.

As shown in FIG. 1, the apparatus for extracting situation change sections of a sports game video using deep learning according to an embodiment of the present invention may include an input unit 100, a frame extraction unit 200, an object detection unit 300, a filtering unit 400, an image discrimination unit 500, a situation change section video generation unit 600, a storage unit 700, and a control unit 800.

The input unit 100 receives a video of a sports game. The input unit 100 may also receive log information in which the events that occurred in the sports game are recorded sequentially.

The frame extraction unit 200 extracts image frames from the sports game video received through the input unit 100.

The frame extraction unit 200 extracts image frames from the video in chronological order at a preset frame rate (frames per second) and generates a dataset consisting of the extracted image frames. The dataset generated in this way is stored in the storage unit 700.
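Sampling at a preset frame rate can be sketched as picking frame indices from the source video. The function below is a hypothetical helper; it assumes the video's native frame rate and total frame count are known and that the preset rate does not exceed the native rate.

```python
def sample_frame_indices(total_frames, video_fps, target_fps):
    """Return the indices of the frames to extract, in chronological order."""
    if video_fps <= 0 or target_fps <= 0:
        raise ValueError("frame rates must be positive")
    if target_fps > video_fps:
        raise ValueError("target_fps must not exceed video_fps")
    indices = []
    i = 0
    while True:
        # The i-th sampled frame is the source frame nearest below time i / target_fps.
        index = int(i * video_fps / target_fps)
        if index >= total_frames:
            break
        indices.append(index)
        i += 1
    return indices
```

Sampling a 30 fps video at 10 fps, for instance, selects every third frame.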

The object detection unit 300 recognizes poses or detects objects in the image frames using a deep learning model. That is, the object detection unit 300 performs data labeling and training using deep learning on the dataset stored in the storage unit 700, and performs pose recognition or object detection on the image frames based on the training result.

In an embodiment of the present invention, YOLO may be applied as the deep learning model for object detection in the image frames. The YOLO algorithm is a representative single-stage object detection algorithm that detects objects by dividing the original image into a grid of equal-sized cells.

For each grid cell 302, the YOLO algorithm predicts a number of bounding boxes of predefined shape centered on the center of the cell, and calculates a confidence score on that basis.

The YOLO algorithm can also detect whether the image contains an object or only background. This allows it to select locations with high object confidence and identify the object category.

These bounding boxes of predefined shape are called anchor boxes 301.

As shown in FIG. 2, the anchor boxes 301 are generated from the data by the K-means clustering algorithm, which provides prior information about the sizes and shapes of the objects in the training dataset.

The K-means algorithm groups the given data into k clusters and operates by minimizing the variance of the distances within each cluster. In other words, it is a clustering technique that picks points at random and selects the data closest to each point.

In the K-means algorithm, each cluster center, initially chosen at random, is moved to the mean of the data first assigned to it. This process is repeated, and training ends when the cluster centers no longer move.
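The K-means procedure described above can be sketched on (width, height) pairs of labeled boxes. This is a simplified illustration: it uses squared Euclidean distance, whereas YOLO implementations often cluster anchors with an IoU-based distance, and the function name, seed, and iteration cap are assumptions.

```python
import random


def kmeans_boxes(points, k, max_iter=100, seed=0):
    """Cluster (width, height) pairs into k centers with plain K-means."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers chosen at random
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for w, h in points:
            # Assign each point to its nearest center.
            nearest = min(
                range(k),
                key=lambda i: (w - centers[i][0]) ** 2 + (h - centers[i][1]) ** 2,
            )
            clusters[nearest].append((w, h))
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        new_centers = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:  # centers stopped moving
            break
        centers = new_centers
    return centers
```

The resulting centers can be read as k representative box shapes, which is the role the anchor boxes play.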

As shown in FIG. 2, each anchor box 301 is designed to detect objects of a different size and shape. For example, three anchor boxes 301 overlap at one location in FIG. 2, and of these the red anchor box 301 detects the person in the middle.

As shown in FIG. 3, YOLO divides the image frame into a grid 302, evaluates the classes described below in a single pass, and combines the results to identify the final objects. This makes it fast enough to run in real time even when applied to video.
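To illustrate the grid, the sketch below maps the center of an object to the grid cell that contains it. The 7 x 7 grid size is an assumption taken from the 7x7x30 feature map mentioned for FIG. 4, and the function name is hypothetical.

```python
def grid_cell(center_x, center_y, image_width, image_height, grid_size=7):
    """Return the (row, column) of the grid cell containing a point in pixel coordinates."""
    col = min(int(center_x / image_width * grid_size), grid_size - 1)
    row = min(int(center_y / image_height * grid_size), grid_size - 1)
    return row, col
```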

In an embodiment of the present invention, the YOLOv5 framework, a recently developed member of the YOLO series, may be used as the deep learning model for object detection.

Compared with earlier versions in the YOLO series, YOLOv5 makes the data labeling process more convenient and has the advantage that the environment can be set up easily on embedded devices using PyTorch and Torchvision.

YOLOv5 also offers s (small), m (medium), l (large), and x (xlarge) models to choose from according to the size of the data, so it can be matched to the user's environment.

As shown in FIG. 4, after passing through the structure of the YOLOv5 algorithm, the final class prediction is made on the 7x7x30 feature map, the last output layer of the network. This 7x7x30 feature map contains the boxes and confidence scores for each grid cell and the predicted values for each class.
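For reference, a 7x7x30 output matches the layout of the original YOLO formulation, in which an S x S grid predicts B boxes per cell (each with four coordinates and one confidence score) plus C class probabilities. The values S = 7, B = 2, and C = 20 below come from that original formulation and are an assumption here; the patent does not state them.

```latex
S \times S \times (5B + C) = 7 \times 7 \times (5 \cdot 2 + 20) = 7 \times 7 \times 30
```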

FIG. 5 is a detailed block diagram of the object detection unit 300 according to an embodiment of the present invention. As shown in FIG. 5, the object detection unit 300 may include a class setting module 310, a data labeling module 320, a learning module 330, and an object prediction module 340.

The class setting module 310 sets classes for recognizing game scenes of high interest in the image frames. The data labeling module 320 performs data labeling on the image frames based on the classes set by the class setting module 310.

The following description uses a baseball broadcast video as an example.

FIG. 6 illustrates the relay board class according to one embodiment of the present invention, and FIG. 7 shows a dataset generated through data labeling according to an embodiment of the present invention.

In the present invention, the class setting module 310 may set various classes depending on the purpose for which game scenes are to be recognized in the image frames.

For example, as shown in FIG. 6, the class setting module 310 may create a relay board class for detecting image frames that contain each broadcaster's relay board 311, or a base-area class for detecting the base area 312 within each broadcaster's relay board 311.

앞서 YOLO 알고리즘에 대한 정의에서 상술한 바와 같이 YLOL 알고리즘은 학습 데이터의 객체 크기와 형태에 대한 사전 정보를 확보하고, 프레임(이미지)을 격자 그리드(302)로 나누어 한 번에 클래스를 판단한 이후에 이를 통합하여 최종 객체를 구분할 수 있다.As described above in the definition of the YOLO algorithm, the YLOL algorithm obtains prior information on the object size and shape of the training data, divides the frame (image) into a lattice grid 302, determines the class at once, and then Integrate to distinguish the final object.

In addition, the class setting module 310 may set the relay board class in order to construct training data for the relay board 311 of each broadcaster in the image frames.

In addition, as shown in FIG. 7, the data labeling module 320 classifies the image frames according to the classes set by the class setting module 310 and performs data labeling on each of the classified image frames.

For example, the data labeling module 320 may classify the image frames according to the relay board class and assign each image frame a data label corresponding to its class.

When data labeling by the data labeling module 320 is complete, a location information file (.txt) is generated for each labeled image (image frame) and may be stored in the storage unit 700.

Here, the location information file (.txt) stores, for each class, the location information of the labeled box: x (x coordinate), y (y coordinate), w (width), and h (height).
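The content of one line of such a location information file can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the usual YOLO text convention in which x and y are the box center and all four values are normalized by the image size, and the function name `to_label_line` is hypothetical.

```python
def to_label_line(class_id, box, image_size):
    """Format one labeled box as 'class x y w h' with values normalized to [0, 1].

    box is (left, top, width, height) in pixels; image_size is (width, height).
    Assumption: x and y denote the box center, as in the usual YOLO text format.
    """
    left, top, box_w, box_h = box
    img_w, img_h = image_size
    x = (left + box_w / 2) / img_w   # normalized center x
    y = (top + box_h / 2) / img_h    # normalized center y
    w = box_w / img_w                # normalized width
    h = box_h / img_h                # normalized height
    return f"{class_id} {x:.6f} {y:.6f} {w:.6f} {h:.6f}"


# Hypothetical example: class 0 box of 200x100 pixels at (100, 50) in a 1000x500 frame.
line = to_label_line(0, (100, 50, 200, 100), (1000, 500))
print(line)
```

One such line would be written per labeled box, so a frame with both a relay board and a base area would produce a two-line file.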

In addition, the data labeling module 320 creates a train.txt file and a val.txt file, which store the file paths of the image frame images (.jpg) and of the location information files (.txt).

In addition, to store the information and paths of all files needed for the training described below, the data labeling module 320 may generate a .yaml file that stores names (class names), nc (number of classes), train (path to the train.txt file), and val (path to the val.txt file).
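As a rough sketch of what such a .yaml file might contain, the helper below assembles the four keys named above into text. The class names `relay_board` and `base` and the paths are illustrative assumptions, not values taken from the patent, and the text is built by hand so that no YAML library is required.

```python
def build_dataset_yaml(names, train_list, val_list):
    """Return dataset description text with the train, val, nc and names keys."""
    quoted = ", ".join(f"'{name}'" for name in names)
    return (
        f"train: {train_list}\n"      # path to train.txt
        f"val: {val_list}\n"          # path to val.txt
        f"nc: {len(names)}\n"         # number of classes
        f"names: [{quoted}]\n"        # class names
    )


# Hypothetical class names and paths, for illustration only.
yaml_text = build_dataset_yaml(["relay_board", "base"], "data/train.txt", "data/val.txt")
print(yaml_text)
```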

In addition, in the object detection unit 300 according to the present invention, the learning module 330 uses deep learning to train on the image frames on which data labeling has been performed. The object prediction module 340 then predicts the objects in the image frames based on the result of the training performed by the learning module 330.

When training on the image frames by the learning module 330 is complete, the most recently trained weight file (last.pt) and the best-performing weight file among the training results (best.pt) are stored in the storage unit 700.

Accordingly, the object prediction module 340 may predict the objects in the image frames using the best weight file (best.pt).

In addition, in the present invention, the filtering unit 400 extracts a filtered basic image by removing noise images from the sports game video based on the object detection result obtained by the object detection unit 300.

That is, the filtering unit 400 may remove unnecessary images such as advertisements by filtering out noise images using the image frames classified according to the classes.
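The filtering step can be sketched as below. The sketch assumes that a frame in which the relay board class is not detected is treated as noise (for example, an advertisement); the text does not spell out this rule, and the class name `relay_board` and the per-frame set representation are hypothetical.

```python
def filter_noise_frames(detections_per_frame, required_class="relay_board"):
    """Return the indices of frames whose detected classes include required_class."""
    return [
        index
        for index, classes in enumerate(detections_per_frame)
        if required_class in classes
    ]


# One set of detected class names per frame; empty sets stand for noise frames.
detections = [{"relay_board", "base"}, set(), {"relay_board"}, set()]
kept = filter_noise_frames(detections)
print(kept)
```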

FIG. 8 is a detailed block diagram of the image determination unit 500 according to an embodiment of the present invention.

The image determination unit 500 according to the present invention selects main images by using a color change of the base area 312 in the relay board 311 shown in the image frames, based on the basic image or on the object detection result obtained by the object detection unit 300.

That is, when the relay board class is detected in the basic image from which noise images have been removed, or in an image frame detected by the object detection unit 300, the image determination unit 500 may select main images by determining whether the bases 312 have changed.

In addition, as shown in FIG. 8, the image determination unit 500 may include a relay board detection module 510, a division module 520, a change detection module 530, and a determination module 540.

The relay board detection module 510 detects the relay board 311 or the base area 312 from each image frame of the basic image or from the image frames detected by the object detection unit 300.

In addition, the division module 520 bisects the base area 312 detected by the relay board detection module 510 in the left-right direction or in the up-down direction.
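The bisection performed by the division module can be illustrated with a small sketch. It assumes the cropped base area is available as a non-empty 2-D grid of grayscale values (a list of rows); the function names are hypothetical.

```python
def split_left_right(region):
    """Split a 2-D pixel grid (list of rows) into its left and right halves."""
    mid = len(region[0]) // 2
    left = [row[:mid] for row in region]
    right = [row[mid:] for row in region]
    return left, right


def split_top_bottom(region):
    """Split a 2-D pixel grid (list of rows) into its top and bottom halves."""
    mid = len(region) // 2
    return region[:mid], region[mid:]


region = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
]
left, right = split_left_right(region)
top, bottom = split_top_bottom(region)
```

When the width or height is odd, the extra column or row goes to the right or bottom half in this sketch.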

In addition, the change detection module 530 detects a color change of the base area 312 by comparing, for each divided image of the base area 312 produced by the division module 520, the previous frame with the next frame.

In addition, the determination module 540 may select the corresponding image frame as a main image when, based on the detection result of the change detection module 530, a color change of the base area 312 is determined to have occurred.

In addition, the change detection module 530 according to an embodiment of the present invention detects a change in the base area 312 by detecting a difference in the black-to-white histogram.

That is, the change detection module 530 binarizes each image frame by classifying its pixels as black or white, and represents the numbers of black and white pixels as a histogram.

It then compares the amount of white relative to black expressed by the histogram of the previous frame with that of the next frame, and thereby detects a difference in the base area 312 between the previous frame and the next frame.
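A minimal sketch of this black-and-white comparison follows. It assumes grayscale pixel values from 0 to 255, a fixed binarization threshold of 128, and a non-empty region; the threshold and the use of the white-pixel share as the compared quantity are illustrative assumptions, since the text does not fix them.

```python
def black_white_counts(region, threshold=128):
    """Binarize a grayscale grid and return (black_count, white_count)."""
    white = sum(1 for row in region for value in row if value >= threshold)
    total = sum(len(row) for row in region)
    return total - white, white


def white_share_difference(prev_region, next_region, threshold=128):
    """Absolute change in the share of white pixels between two frames."""
    def share(region):
        black, white = black_white_counts(region, threshold)
        return white / (black + white)
    return abs(share(next_region) - share(prev_region))


prev_region = [[0, 0], [0, 0]]        # empty base: all dark
next_region = [[255, 255], [0, 0]]    # occupied base: half of the pixels lit
diff = white_share_difference(prev_region, next_region)
```

A frame would then be flagged when `diff` exceeds some chosen tolerance, which has to be tuned for the broadcaster's graphics.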

Here, the change detection module 530 may also score the amount of white relative to black expressed by the histogram in order to determine accurately the positions of the runners on the bases 312.

In baseball, a change in the bases 312 indicates that an important event such as a hit or a walk has occurred, so using this change makes it possible to extract an optimal situation change section video.

FIG. 9 is a diagram for explaining a change in the bases 312 of the relay board 311 according to an embodiment of the present invention. In FIG. 9, drawing (a) shows the base area 312 of the relay board 311 detected from an image frame, and drawing (b) shows the base area 312 bisected in the left-right direction.

As shown in drawing (b) of FIG. 9, the following description assumes that the base area 312 of the relay board 311 is bisected in the left-right direction, with the left image taken as the first divided image 521 and the right image as the second divided image 522.

Here, according to an embodiment of the present invention, in each bisected image a base that appears whole may be scored as 10 points and a base that appears as a bisected half may be scored as 5 points.

In addition, when the base scores are calculated for the various cases of each divided image, the positions of the runners on the bases 312 can be determined from the calculated scores as follows.

[Runner positions according to base scores]

- First divided image 521: 0 points, second divided image 522: 10 points
  => Runner position: 1st base

- First divided image 521: 5 points, second divided image 522: 5 points
  => Runner position: 2nd base

- First divided image 521: 10 points, second divided image 522: 0 points
  => Runner position: 3rd base

- First divided image 521: 5 points, second divided image 522: 15 points
  => Runner positions: 1st and 2nd base

- First divided image 521: 15 points, second divided image 522: 5 points
  => Runner positions: 2nd and 3rd base

- First divided image 521: 15 points, second divided image 522: 15 points
  => Runner positions: 1st, 2nd, and 3rd base
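The score combinations listed above amount to a lookup table, which can be sketched as follows. Only the six combinations given in the description are included; the function name is hypothetical, and any other pair of scores returns `None` rather than a guessed position.

```python
# (first divided image 521 score, second divided image 522 score) -> occupied bases
RUNNERS_BY_SCORE = {
    (0, 10): ("1st",),
    (5, 5): ("2nd",),
    (10, 0): ("3rd",),
    (5, 15): ("1st", "2nd"),
    (15, 5): ("2nd", "3rd"),
    (15, 15): ("1st", "2nd", "3rd"),
}


def runner_positions(first_score, second_score):
    """Return the occupied bases for a score pair, or None if the pair is not listed."""
    return RUNNERS_BY_SCORE.get((first_score, second_score))


print(runner_positions(5, 15))
```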

As described above, the apparatus for extracting situation change sections of a sports game video according to an embodiment of the present invention has the advantage of being able to determine accurately not only a change in the bases 312 in the image frames but also the positions of the runners on the bases 312.

In addition, the change detection module 530 according to another embodiment of the present invention may detect a change in the base area 312 using a histogram classified by the brightness levels of the image pixels.

FIG. 10 is a diagram for explaining the scoring of the pixels of an image frame according to an embodiment of the present invention, and FIG. 11 is a diagram showing a histogram of the pixels scored in FIG. 10.

As shown in FIGS. 10 and 11, the change detection module 530 may classify and score the pixels of each image frame according to brightness level and represent the scored pixel values as a histogram over the brightness levels.

As shown in FIG. 10, the image of an image frame can be represented as a set of pixels arranged horizontally and vertically. Each of these pixels has its own color and brightness, and the color or brightness level of each pixel can be expressed numerically.

For example, an image of 10x10 pixels as shown in drawing (a) of FIG. 10 can be expressed as shown in drawing (b) of FIG. 10 when 10 brightness levels are assumed.

That is, with the brightness value of the darkest pixel set to 0 and that of the brightest pixel set to 9, the brightness level of each pixel can be expressed as a value from 0 to 9.

In addition, as shown in FIG. 11, the pixels quantified in FIG. 10 can be counted by brightness level and represented as a histogram. A difference in the base area 312 can then be detected by comparing, for each divided image, the histogram of the previous frame with the histogram of the next frame.
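The brightness-level histogram can be sketched as follows, assuming grayscale input values from 0 to 255 that are mapped onto 10 equally wide levels (0 darkest, 9 brightest). The mapping rule and the function name are illustrative assumptions; the description only states that the levels run from 0 to 9.

```python
def brightness_histogram(region, levels=10, max_value=255):
    """Count pixels per brightness level, from 0 (darkest) to levels - 1 (brightest)."""
    histogram = [0] * levels
    for row in region:
        for value in row:
            # Map 0..max_value onto 0..levels-1 using equally wide bins.
            level = min(value * levels // (max_value + 1), levels - 1)
            histogram[level] += 1
    return histogram


region = [
    [0, 255],
    [128, 26],
]
histogram = brightness_histogram(region)
print(histogram)
```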

As described above, the present invention may detect a change in the base area 312 by detecting a difference in the black-to-white histogram of each divided image, or may classify the pixels by brightness level, represent them as a histogram, and detect a change in the bases 312 by comparing, for each divided image, the histogram of the previous frame with the histogram of the next frame.

Here, the determination module 540 may use the chi-square distribution to determine whether the base area 312 has changed between the histogram of the previous frame and the histogram of the next frame for each divided image, and may select main images according to the result of that determination.

The chi-square distribution is the distribution of the sum of the squares of k mutually independent standard normal random variables. Here, k is called the degrees of freedom and is the parameter of the chi-square distribution.
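The description names the chi-square distribution but does not give the statistic used to compare two histograms. One common choice, shown here purely as an assumption, is a symmetric chi-square distance summed over the histogram bins; the function name is hypothetical.

```python
def chi_square_distance(prev_hist, next_hist):
    """Sum of (a - b)^2 / (a + b) over all bins, skipping bins empty in both histograms."""
    total = 0.0
    for a, b in zip(prev_hist, next_hist):
        if a + b > 0:
            total += (a - b) ** 2 / (a + b)
    return total


prev_hist = [4, 0]   # previous frame: all pixels in the dark bin
next_hist = [2, 2]   # next frame: half of the pixels moved to the bright bin
distance = chi_square_distance(prev_hist, next_hist)
```

Identical histograms give a distance of 0, and the value grows as pixels move between bins, so it can be compared against a threshold to decide whether the base area changed.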

In addition, the image determination unit 500 according to the present invention may also select main images by using a change in the score area of the relay board 311 shown in the image frames, based on the basic image or on the object detection result obtained by the object detection unit 300.

Here, the image determination unit 500 may recognize the digits in the score area of the relay board 311 using Pytesseract, which is used in Python for optical character recognition on various operating systems.

In addition, the image determination unit 500 detects a score change based on the recognized digits and, when a score change is determined to have occurred, selects the corresponding image frame as a main image.
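The score-change check on top of the recognized text can be sketched as below. The sketch assumes the per-frame strings have already been produced by an OCR step (for instance Pytesseract's `image_to_string` applied to the cropped score area), which is not called here; the function names and the rule of skipping unreadable frames are illustrative assumptions.

```python
import re


def parse_score(text):
    """Return the digit groups found in OCR text as a tuple of ints, or None if none."""
    digits = re.findall(r"\d+", text)
    return tuple(int(d) for d in digits) if digits else None


def score_change_frames(ocr_texts):
    """Return indices of frames whose parsed score differs from the last readable score."""
    changed = []
    last = None
    for index, text in enumerate(ocr_texts):
        score = parse_score(text)
        if score is None:
            continue  # unreadable frame: keep the previous score
        if last is not None and score != last:
            changed.append(index)
        last = score
    return changed


# Hypothetical OCR output per frame; the empty string stands for a failed read.
texts = ["2 1", "2 1", "", "3 1", "3 1", "3 2"]
frames = score_change_frames(texts)
print(frames)
```

Skipping unreadable frames avoids reporting a change merely because the OCR step failed on a single frame.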

In addition, in the present invention, the situation change section video generation unit 600 generates a situation change section video by combining the main images selected by the image determination unit 500.

That is, the situation change section video generation unit 600 may group the selected main images in chronological order to summarize the entire video of the sports game, or may generate a situation change section video consisting of game scenes of high interest.
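Grouping the selected frames in chronological order can be sketched as follows. The sketch assumes the main images are identified by frame index and that frames closer together than a chosen gap belong to the same segment; the `max_gap` parameter and the function name are hypothetical.

```python
def group_into_segments(frame_indices, max_gap=1):
    """Group frame indices into (start, end) segments, merging gaps of at most max_gap."""
    segments = []
    for index in sorted(set(frame_indices)):
        if segments and index - segments[-1][1] <= max_gap:
            # Close enough to the current segment: extend its end.
            segments[-1] = (segments[-1][0], index)
        else:
            segments.append((index, index))
    return segments


segments = group_into_segments([10, 11, 12, 40, 41, 90])
print(segments)
```

Each resulting segment could then be cut from the source video and the cuts concatenated in order.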

In addition, in the present invention, the control unit 800 may control the input unit 100, the frame extraction unit 200, the object detection unit 300, the filtering unit 400, the image determination unit 500, and the situation change section video generation unit 600.

FIGS. 12A and 12B are diagrams for explaining relay board classes according to another embodiment of the present invention. FIG. 12A is a diagram for explaining the relay board 311 in an image frame extracted from a fencing match video, and FIG. 12B is a diagram for explaining the relay board 311 in an image frame extracted from a volleyball match video.

The apparatus for extracting situation change sections of a sports game video using deep learning according to an embodiment of the present invention can generate a situation change section video not only for baseball but also for sports such as judo, taekwondo, fencing, volleyball, tennis, badminton, ice sports (short track, speed skating), and swimming, by using a color change of the relay board 311 in the image frames of the sports game video.

For example, for videos of sports such as judo and taekwondo, a situation change section video can be generated using a mark on the relay board 311 that indicates the presence or absence of a warning card.

That is, main images can be selected using the color change that depends on the presence or absence of a warning card on the relay board 311, and the selected main images can be combined to generate the desired situation change section video.

In addition, as shown in FIG. 12A, for a fencing match video, a situation change section video can be generated using the green and red marks on the relay board 311 that indicate the scoring state of each fencer.

That is, main images can be selected using the color change of the green and red marks on the relay board 311, which varies with each fencer's scoring, and the selected main images can be combined to generate a situation change section video.

In addition, as shown in FIG. 12B, for videos of sports such as volleyball, tennis, and badminton, a situation change section video can be generated using a mark on the relay board 311 that indicates which side holds the serve.

That is, main images can be selected using the color change on the relay board 311 that depends on the position of the team or player holding the serve, and the selected main images can be combined to generate a situation change section video.

In addition, for videos of sports in which a plurality of countries or teams compete for ranking, such as ice sports (short track, speed skating), swimming, and athletics, the present invention can generate a situation change section video using the national flags that represent each country or the marks that distinguish each team on the relay board 311.

That is, main images can be selected using the color change that occurs when a flag or mark within a preset range of ranks changes on the relay board 311, and the selected main images can be combined to generate a situation change section video.

FIG. 13 is a flowchart showing a method for extracting situation change sections of a sports game video according to an embodiment of the present invention, FIG. 14 is a flowchart showing in detail the step of detecting an object (S30) in FIG. 13, and FIG. 15 is a flowchart showing in detail the step of selecting main images (S50) in FIG. 13.

As shown in FIG. 13, the method for extracting situation change sections of a sports game video using deep learning according to an embodiment of the present invention includes receiving a sports game video (S10), extracting image frames from the video (S20), and detecting an object from the image frames using a deep learning model (S30).

In addition, as shown in FIG. 14, the step of detecting the object (S30) includes setting classes for recognizing game scenes of high interest from the image frames (S31) and performing data labeling on the image frames based on the set classes (S32).

In addition, the step of detecting the object (S30) further includes training on the data-labeled image frames using deep learning (S33) and predicting the objects in the image frames based on the training result (S34).

Here, the classes may include a relay board class for detecting image frames that contain the relay board 311 of each broadcaster and a base class for detecting the base area 312 within the relay board 311 of each broadcaster.

In addition, as shown in FIG. 13, the method for extracting situation change sections of a sports game video using deep learning according to an embodiment of the present invention further includes extracting a basic image by removing noise images from the sports game video based on the object detection result (S40), selecting main images by using a change in the relay board 311 shown in the image frames based on the basic image or the object detection result (S50), and generating a situation change section video by combining the selected main images (S60).

Here, in the step of selecting main images (S50), the change in the relay board 311 may include a color change of the relay board 311 and a score change on the relay board 311.

In addition, as shown in FIG. 15, the step of selecting main images (S50) includes detecting the base area 312 of the relay board 311 from the image frames (S51) and bisecting the base area 312 in the left-right or up-down direction (S52).

In addition, the step of selecting main images (S50) further includes detecting a color change of the base area 312 by comparing, for each divided image of the base area 312, the previous frame with the next frame (S53), and selecting the corresponding image frame as a main image when a change in the base area 312 is determined to have occurred (S54).

FIGS. 16 and 17 are flowcharts showing in detail the step of detecting a color change (S53) in FIG. 15. FIG. 16 shows the case in which the pixels of an image frame are binarized into black and white, and FIG. 17 shows the case in which the pixels of an image frame are classified into two or more brightness levels (for example, 10 levels).

As shown in FIG. 16, the step of detecting a color change of the base area 312 (S53) may include classifying the pixels of each image frame as black or white (S531), representing the numbers of black and white pixels as a histogram (S532), and detecting a difference in the base area 312 between the previous frame and the next frame by comparing the amount of white relative to black in the histograms of the two frames (S533).

In addition, as shown in FIG. 17, the step of detecting a color change of the base area 312 (S53) may include classifying and scoring the pixels of each image frame according to brightness level (S5311), representing the scored pixel values as a histogram over the brightness levels (S5312), and detecting a difference in the base area 312 by comparing, for each divided image, the histogram of the previous frame with the histogram of the next frame (S5313).

Here, the step of selecting the frame as a main image (S54) uses the chi-square distribution to determine whether the base area 312 has changed between the histogram of the previous frame and the histogram of the next frame for each divided image.

FIG. 18 is a flowchart showing a method for extracting situation change sections of a sports game video according to another embodiment of the present invention.

As shown in FIG. 18, the method for extracting situation change sections of a sports game video using deep learning according to this embodiment may include extracting image frames from a sports game video (S20a) and detecting image frames that contain a pitching scene using a deep learning model (S30a).

In addition, the method may include binarizing the pixels of the detected image frames into black and white (S40a) and detecting main images based on the binarized image frames (S50a).

In addition, the step of detecting main images (S50a) may include detecting the score area of the relay board 311 based on the image frames binarized in step S40a (S51a) and recognizing the digits in the score area using a deep learning model (S52a).

Here, the step of recognizing the digits (S52a) may recognize the digits in the score area of the relay board 311 using Pytesseract, which is used in Python for optical character recognition on various operating systems.

In addition, the step of detecting main images (S50a) may further include detecting a score change based on the digits recognized in step S52a (S53a) and, when a score change is determined to have occurred, selecting and storing the corresponding image frame as a main image (S54a).

In addition, the step of detecting main images (S50a) may include detecting the base area 312 of the relay board 311 based on the image frames binarized in step S40a (S51b) and bisecting the detected image of the base area 312 in the left-right or up-down direction (S52b).

In addition, the step of detecting main images (S50a) may further include comparing the previous frame and the next frame of each image divided in step S52b using the chi-square distribution (S53b), detecting a color change of the base area 312 based on the comparison result (S54b), and selecting and storing the image frames in which a color change of the base area 312 is detected (S55b).

For example, the step of detecting the color change (S54b) may determine, based on the chi-square value calculated in step S53b, that a color change of the base area 312 has occurred when the chi-square value is 1 or more.

In addition, the method may further include deleting duplicate image frames from the main images selected in the step of selecting and storing main images (S54a) and in the step of selecting and storing the image frames in which a color change of the base area 312 is detected (S55b) (S56), and generating a situation change section video by combining, in chronological order, the main images from which the duplicate image frames have been deleted (S60a).
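Steps S56 and S60a can be sketched as a merge of the two selections. The sketch assumes each selected main image is identified by its frame index, so that a frame chosen by both the score branch (S54a) and the base area branch (S55b) counts as a duplicate; the function name is hypothetical.

```python
def merge_selected_frames(score_frames, base_frames):
    """Union of both selections with duplicates removed, in chronological order."""
    return sorted(set(score_frames) | set(base_frames))


# Hypothetical frame indices selected by the score branch and the base area branch.
merged = merge_selected_frames([30, 120, 45], [45, 60, 120])
print(merged)
```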

As described above, the method and apparatus for extracting a situation change section of a sports game video using deep learning according to an embodiment of the present invention select main images by using the color change of the relay board 311 shown in the image frames or the change in the score area. They can therefore generate situation change section videos for sports whose broadcasts display a relay board 311 in the video, such as baseball, judo, taekwondo, fencing, volleyball, tennis, badminton, ice skating, and swimming.

Although preferred embodiments of the present invention have been described above, the present invention is not limited to these embodiments. It includes all modifications, within the scope recognized as equivalent, that a person of ordinary skill in the art to which the invention pertains could easily make from the embodiments.

100: input unit 200: frame extraction unit
300: object detection unit 310: class setting module
320: data labeling module 330: learning module
340: object prediction module 400: filtering unit
500: image determination unit 510: relay board detection module
520: division module 530: change detection module
540: judgment module 600: situation change section video generation unit
700: storage unit 800: control unit

Claims

A method of extracting a situation change section of a sports game video, the method comprising:
(a) extracting image frames from a sports game video;
(b) detecting an object from the image frames using a deep learning model;
(d) selecting main images, based on the object detection result, by using a color change of a relay board shown in the image frames; and
(e) generating a situation change section video by combining the selected main images.

The method of claim 1, further comprising, after step (b):
(c) removing noise images from the sports game video based on the object detection result.

The method of claim 1, wherein step (b) comprises:
(b1) setting classes for recognizing game scenes of high interest from the image frames;
(b2) performing data labeling on the image frames based on the set classes;
(b3) training on the data-labeled image frames using deep learning; and
(b4) predicting objects in the image frames based on the training result.

The method of claim 3, wherein the classes include a relay board class for detecting image frames that contain each broadcaster's relay board, or a base class for detecting the base area in each broadcaster's relay board.

The method of claim 1, wherein step (d) comprises:
(d1) detecting the base area of the relay board from the image frame;
(d2) dividing the base area into two halves in the left-right or up-down direction;
(d3) detecting a color change of the base area by comparing, for each divided image of the base area, the previous frame with the next frame; and
(d4) selecting the image frame as a main image when a color change of the base area is determined.

The method of claim 5, wherein step (d3) comprises:
(d31) classifying the pixels of each image frame as black or white;
(d32) representing the numbers of classified black and white pixels as a histogram; and
(d33) detecting a difference in the base area between the previous frame and the next frame by comparing the difference in the size of white relative to black in the histograms of the previous frame and the next frame.

The method of claim 5, wherein step (d3) comprises:
(d311) classifying the pixels of each image frame by brightness level and scoring them;
(d312) representing the scored pixel values as a histogram over the brightness levels; and
(d313) detecting a difference in the base area by comparing, for each divided image, the histogram of the previous frame with the histogram of the next frame.

The method of claim 6 or 7, wherein step (d4) determines, using a chi-square distribution, the change in the base area between the histogram of the previous frame and the histogram of the next frame for each divided image.

The method of claim 5, further comprising deleting duplicate image frames from the main images selected in step (d4).

The method of claim 1, wherein the deep learning model is YOLOv5.

An apparatus for extracting a situation change section of a sports game video, the apparatus comprising:
an input unit that receives a sports game video;
a frame extraction unit that extracts image frames from the video received through the input unit;
an object detection unit that detects an object from the image frames using a deep learning model;
an image determination unit that selects main images, based on the object detection result from the object detection unit, by using a color change of a relay board shown in the image frames; and
a situation change section video generation unit that generates a situation change section video by combining the selected main images.

The apparatus of claim 11, further comprising a filtering unit that removes noise images from the sports game video based on the object detection result from the object detection unit.

The apparatus of claim 11, wherein the image determination unit comprises:
a relay board detection module that detects the base area of the relay board shown in the image frame;
a division module that divides the base area detected by the relay board detection module into two halves in the left-right or up-down direction;
a change detection module that detects a color change of the base area by comparing, for each divided image of the base area, the previous frame with the next frame; and
a judgment module that selects the image frame as a main image when a color change of the base area is determined.