KR20210128163A

KR20210128163A - Apparatus, method and computer program for generating data for synthesis

Info

Publication number: KR20210128163A
Application number: KR1020200046016A
Authority: KR
Inventors: 김세라
Original assignee: 주식회사 케이티
Priority date: 2020-04-16
Filing date: 2020-04-16
Publication date: 2021-10-26

Abstract

Provided is a data generation device for generating data for synthesis, which includes: a reception unit receiving multimedia data including image data and voice data from one or more slave clients; a weight deriving unit deriving weights for each of the image data and the voice data; a generation unit generating data for synthesis from the multimedia data based on the derived weight; and a transmission unit transmitting the generated data for synthesis to a video conference server.

Description

Data generating apparatus, method and computer program for generating data for synthesis

The present invention relates to a data generating apparatus, method and computer program for generating data for synthesis.

A video conference is held among several geographically distant conference rooms, each equipped with a display screen, camera, monitor, microphone, speaker, and the like, by transmitting and receiving image data and audio data in real time over a network.

In general, a video conferencing system installs one camera at a fixed position in each location. A small meeting can be conducted smoothly with a single camera, but a large meeting with many participants in a large space is difficult to conduct with only one camera.

When several cameras are used at each location in a large meeting, transmitting the image data of every camera to the video conference server sends meaningless data and wastes resources. Manually selecting which image data to transmit to the video conference server, on the other hand, requires additional manpower and cost.

Korean Patent Application Publication No. 2017-0060023 (published May 31, 2017)

The present invention is intended to solve the problems of the prior art described above. It aims to receive multimedia data including image data and audio data from one or more slave clients, derive a weight for each of the image data and the audio data, generate data for synthesis from the multimedia data based on the weights, and transmit the data for synthesis to a video conference server.

However, the technical problems to be solved by the present embodiment are not limited to those described above, and other technical problems may exist.

As a means for achieving the above technical problem, an embodiment of the present invention provides a data generating apparatus for generating data for synthesis, which may include: a receiving unit that receives multimedia data including image data and audio data from one or more slave clients; a weight deriving unit that derives a weight for each of the image data and the audio data; a generating unit that generates data for synthesis from the multimedia data based on the derived weights; and a transmitting unit that transmits the generated data for synthesis to a video conference server.

In an embodiment, the weight deriving unit may include an image weight deriving unit that derives an image weight for the image data based on a behavior score of a meeting participant appearing in the image corresponding to the image data, and an audio weight deriving unit that derives an audio weight for each piece of audio data based on the amplitude of the audio signal included in the audio data.

In an embodiment, the image weight deriving unit may generate a decision tree based on the behavior of the meeting participant.

In an embodiment, the image weight deriving unit may calculate the behavior score from a first score based on the decision tree and a second score based on the number of meeting participants appearing in the image.

In an embodiment, the generating unit may select image data for synthesis from among the one or more pieces of image data based on the image weight, select audio data for synthesis from the one or more pieces of audio data based on the audio weight, and generate the data for synthesis by combining the image data for synthesis and the audio data for synthesis.

In an embodiment, the transmitting unit may further transmit a weight for the data for synthesis to the video conference server.

In an embodiment, the video conference server may receive the data for synthesis and the weight for the data for synthesis from each of a plurality of data generating apparatuses located in respective conference rooms, and may generate video conference data by combining the data for synthesis based on the plurality of weights.

Another embodiment of the present invention provides a data generating method for generating data for synthesis, which may include: receiving multimedia data including image data and audio data from one or more slave clients; deriving a weight for each of the image data and the audio data; and generating data for synthesis from the multimedia data based on the derived weights.

Yet another embodiment of the present invention provides a computer program stored in a computer-readable recording medium and including a sequence of instructions for generating data for synthesis. When executed by a computing device, the computer program may include a sequence of instructions that causes the device to receive multimedia data including image data and audio data from one or more slave clients, derive a weight for each of the image data and the audio data, and generate data for synthesis from the multimedia data based on the derived weights.

The means for solving the problem described above are merely exemplary and should not be construed as limiting the present invention. In addition to the exemplary embodiments described above, there may be additional embodiments described in the drawings and the detailed description.

According to any one of the above means for solving the problem, multimedia data including image data and audio data can be received from one or more slave clients, a weight can be derived for each of the image data and the audio data, data for synthesis can be generated from the multimedia data based on the weights, and the data for synthesis can be transmitted to a video conference server.

In addition, by transmitting data for synthesis generated from the results of analyzing the image data and the audio data separately, a video conferencing service that effectively conveys the meeting situation can be provided.

FIG. 1 is a block diagram of a system for providing a video conferencing service according to an embodiment of the present invention.
FIG. 2 is a block diagram of a data generating apparatus according to an embodiment of the present invention.
FIG. 3 is a diagram for explaining a method of generating data for synthesis according to an embodiment of the present invention.
FIG. 4 is a diagram for explaining a method of generating video conference data according to an embodiment of the present invention.
FIG. 5 is a flowchart of a data generating method according to an embodiment of the present invention.
FIG. 6 is a flowchart of a data generating method according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art can readily practice them. The present invention may, however, be embodied in many different forms and is not limited to the embodiments described herein. In the drawings, parts unrelated to the description are omitted for clarity, and similar parts are given similar reference numerals throughout the specification.

Throughout the specification, when a part is said to be "connected" to another part, this includes not only being "directly connected" but also being "electrically connected" with another element in between. When a part is said to "include" a component, this means that, unless specifically stated otherwise, it may further include other components rather than excluding them, and it should be understood that the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof is not precluded.

In this specification, a "unit" includes a unit realized by hardware, a unit realized by software, and a unit realized using both. One unit may be realized using two or more pieces of hardware, and two or more units may be realized by one piece of hardware. A "unit" is not limited to software or hardware; it may be configured to reside in an addressable storage medium or configured to run on one or more processors. Thus, as an example, a "unit" includes components such as software components, object-oriented software components, class components, and task components, as well as processes, functions, attributes, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuitry, data, databases, data structures, tables, arrays, and variables. The functions provided within components and "units" may be combined into a smaller number of components and "units" or further separated into additional components and "units". In addition, components and "units" may be implemented to run on one or more CPUs in a device or a secure multimedia card.

The "network" referred to below means a connection structure through which nodes such as terminals and servers can exchange information with one another, and includes a local area network (LAN), a wide area network (WAN), the Internet (WWW: World Wide Web), wired and wireless data communication networks, telephone networks, wired and wireless television networks, and the like. Examples of wireless data communication networks include, but are not limited to, 3G, 4G, 5G, 3GPP (3rd Generation Partnership Project), LTE (Long Term Evolution), WiMAX (Worldwide Interoperability for Microwave Access), Wi-Fi, Bluetooth communication, infrared communication, ultrasonic communication, visible light communication (VLC), and LiFi.

Some of the operations or functions described in this specification as being performed by a terminal or device may instead be performed by a server connected to that terminal or device. Likewise, some of the operations or functions described as being performed by a server may be performed by a terminal or device connected to that server.

Hereinafter, an embodiment of the present invention is described in detail with reference to the accompanying drawings.

FIG. 1 is a block diagram of a system for providing a video conferencing service according to an embodiment of the present invention. Referring to FIG. 1, the video conferencing service providing system 100 may include a data generating apparatus 110, one or more slave clients 120, and a video conference server 130.

The video conferencing service providing system 100 may provide a video conferencing service between several geographically separated conference rooms. For example, the system 100 may provide a service for transmitting and receiving multimedia data, including image data and audio data, in real time.

In one example, the data generating apparatus 110 and the slave client 120 may each include a camera and a microphone. In another example, the data generating apparatus 110 and the slave client 120 may be devices separate from the camera and the microphone, and may acquire image data and audio data from the camera and the microphone.

The data generating apparatus 110 may form a master-slave relationship with one or more slave clients 120. For example, the data generating apparatus 110 may operate as the master and receive data from a slave client 120 operating as a slave.

The data generating apparatus 110 and the slave client 120 may send each other connection requests or disconnection requests. In an embodiment, the data generating apparatus 110 and the slave client 120 may exchange these connection or disconnection requests through a connection authority control server.

The data generating apparatus 110 and the video conference server 130 may exchange data over a network. For example, the data generating apparatus 110 may transmit data for synthesis to the video conference server 130.

The video conference server 130 may transmit video conference data to one or more external devices. For example, the video conference server 130 may transmit video conference data to a user terminal that receives the video conferencing service.

FIG. 2 is a block diagram of a data generating apparatus according to an embodiment of the present invention. Referring to FIG. 2, the data generating apparatus 110 may include a receiving unit 111, a weight deriving unit 112, a generating unit 113, and a transmitting unit 114.

The data generating apparatus 110 may generate data for synthesis. For example, the data generating apparatus 110 may generate the data for synthesis based on one or more pieces of multimedia data.

The receiving unit 111 may receive multimedia data from one or more slave clients 120. The multimedia data may include image data and audio data.

In the multimedia data received from the one or more slave clients 120, the content of the image corresponding to the image data and the audio signal included in the audio data may differ depending on, for example, where the camera or microphone corresponding to each slave client 120 is installed.

The weight deriving unit 112 may derive a weight for each of the image data and the audio data. Referring again to FIG. 2, the weight deriving unit 112 may include an image weight deriving unit 112a and an audio weight deriving unit 112b.

Referring to FIG. 3, a plurality of slave clients 120_1 and 120_2 may be installed for a meeting participant 301. In FIG. 3, the slave clients 120_1 and 120_2 are shown as having a camera and a microphone, but they are not limited to this. That is, the slave clients denoted by reference numerals 120_1 and 120_2 may have no camera or microphone of their own and may instead receive image data and audio data from a separate camera and microphone.

In this case, the audio data acquired by the first slave client 120_1, which is closer to the meeting participant, may capture the participant's speech better than the audio data acquired by the second slave client 120_2. Conversely, depending on the arrangement of the slave clients 120_1 and 120_2, the image data acquired by the second slave client 120_2 may capture the participant's appearance better than the image data acquired by the first slave client 120_1.

Accordingly, the data generating apparatus 110 may derive a weight for each of the image data and the audio data and, based on these weights, generate data for synthesis that effectively conveys the meeting situation in each conference room. The video conferencing service providing system 100 may then use the data for synthesis to provide a service in which video conferences between the conference rooms run smoothly.

The image weight deriving unit 112a may derive an image weight for the image data based on the behavior score of the meeting participants appearing in the image corresponding to the image data.

The image weight deriving unit 112a may generate a decision tree based on the behavior of the meeting participants. For example, the image weight deriving unit 112a may generate the decision tree by placing the first-priority behavior at the top of the tree, the second-priority behavior below it, and the third-priority behavior below that.

The image weight deriving unit 112a may calculate a first score based on the decision tree. For example, the image weight deriving unit 112a may calculate the first score from the cumulative score over the behaviors that make up the decision tree: whether a meeting participant is speaking (first priority), whether a meeting participant is standing (second priority), and whether a meeting participant is writing on a board (third priority). The first score may be, for example, an integer greater than or equal to 0.
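The cumulative first score described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the source names the three behaviors and their priority order but gives no point values, so the `ACTION_POINTS` table and the function name `first_score` are assumptions chosen only so that a higher-priority behavior contributes more.

```python
# Hypothetical points per behavior. The source specifies only the priority
# order (speaking > standing > writing on a board), not numeric values.
ACTION_POINTS = {
    "speaking": 4,          # first-priority behavior
    "standing": 2,          # second-priority behavior
    "writing_on_board": 1,  # third-priority behavior
}


def first_score(observed: dict[str, bool]) -> int:
    """Sum the points of every behavior detected in the image.

    The result is a non-negative integer, matching the range the
    description gives for the first score.
    """
    return sum(
        points
        for action, points in ACTION_POINTS.items()
        if observed.get(action, False)
    )


speaker_at_board = first_score({"speaking": True, "writing_on_board": True})
idle_room = first_score({})
```

With these assumed values, a participant who is speaking while writing on a board accumulates more points than one who is only standing.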

The image weight deriving unit 112a may calculate a second score based on the number of meeting participants appearing in the image. The second score may be, for example, a value greater than or equal to 0 and less than 1.

For example, the image weight deriving unit 112a may calculate the second score using Equation 1.

Here, when the number of meeting participants appearing in the image data acquired by the first slave client is x, the number appearing in the image data acquired by the second slave client is y, and the number appearing in the image data acquired by the third slave client is z, the second score of the image data acquired by the first slave client may be S_2x.

The image weight deriving unit 112a may calculate the behavior score based on the first score and the second score. For example, the image weight deriving unit 112a may derive the image weight as the sum of the first score and the second score.

When the image weight deriving unit 112a derives the image weight by adding a first score that is an integer greater than or equal to 0 and a second score that is greater than or equal to 0 and less than 1, the ordering of the image weights of the pieces of image data is determined primarily by the first score, and the second score decides the ordering only when the first scores are equal.
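The ordering property above follows from the ranges of the two scores: a difference of at least 1 in the integer part cannot be overturned by a fractional part below 1. The sketch below checks this on made-up numbers. The function name `image_weight` and the stream names are illustrative assumptions, and the second scores are taken as given inputs rather than computed.

```python
def image_weight(first_score: int, second_score: float) -> float:
    """Image weight as the sum of the two scores.

    first_score must be a non-negative integer and second_score must lie
    in [0, 1), as stated in the description.
    """
    if first_score < 0 or not 0.0 <= second_score < 1.0:
        raise ValueError("scores are outside the ranges given in the text")
    return first_score + second_score


# Illustrative (first score, second score) pairs for three streams.
streams = {"cam_a": (2, 0.9), "cam_b": (3, 0.1), "cam_c": (3, 0.4)}

# Sort from highest to lowest image weight.
ranked = sorted(
    streams, key=lambda name: image_weight(*streams[name]), reverse=True
)
```

Here `cam_b` outranks `cam_a` despite its much smaller second score, and the second score separates `cam_c` from `cam_b` only because their first scores tie.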

The audio weight deriving unit 112b may derive an audio weight for each piece of audio data based on the amplitude of the audio signal included in the audio data. In an embodiment, the audio signals included in the audio data acquired by the one or more slave clients 120 may have the same waveform.
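One way to turn amplitude into a weight is sketched below. The source says only that the audio weight is based on the amplitude of the audio signal; using the root-mean-square (RMS) amplitude, and the name `audio_weight`, are assumptions made for illustration. Peak amplitude would be an equally plausible choice.

```python
import math


def audio_weight(samples: list[float]) -> float:
    """RMS amplitude of a block of samples (assumed amplitude measure).

    Returns 0.0 for an empty block so that silent or missing input never
    outranks a real signal.
    """
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


# Same waveform captured at two distances: the nearer microphone sees a
# larger amplitude, so its audio data receives the larger weight.
near_mic = [0.8, -0.8, 0.8, -0.8]
far_mic = [0.2, -0.2, 0.2, -0.2]
```

Because the streams are assumed to carry the same waveform, comparing amplitudes amounts to comparing how close each microphone is to the speaker.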

The generating unit 113 may generate data for synthesis from the multimedia data based on the derived weights. The generating unit 113 may generate the data for synthesis based on image data for synthesis and audio data for synthesis.

The generating unit 113 may select image data for synthesis from among the one or more pieces of image data based on the image weight, and may select audio data for synthesis from the one or more pieces of audio data based on the audio weight. For example, the generating unit 113 may select the image data with the highest image weight as the image data for synthesis and the audio data with the highest audio weight as the audio data for synthesis.
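The selection step can be sketched as two independent maximum searches, so the chosen image and the chosen audio may come from different slave clients. The `Stream` record and the function name `select_for_synthesis` are hypothetical, and the weights are made-up values mirroring the FIG. 3 scenario.

```python
from typing import NamedTuple


class Stream(NamedTuple):
    """Weights derived for the multimedia data of one slave client."""
    client_id: str
    image_weight: float
    audio_weight: float


def select_for_synthesis(streams: list[Stream]) -> tuple[str, str]:
    """Return (image source, audio source), each chosen by its own weight."""
    if not streams:
        raise ValueError("at least one stream is required")
    best_image = max(streams, key=lambda s: s.image_weight)
    best_audio = max(streams, key=lambda s: s.audio_weight)
    return best_image.client_id, best_audio.client_id


# Client 120_1 is closer to the speaker (better audio);
# client 120_2 has the better view (better image).
streams = [
    Stream("120_1", image_weight=1.2, audio_weight=0.8),
    Stream("120_2", image_weight=3.4, audio_weight=0.2),
]
image_source, audio_source = select_for_synthesis(streams)
```

In this example the image for synthesis comes from client 120_2 and the audio from client 120_1.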

The generating unit 113 may generate the data for synthesis by combining the image data for synthesis and the audio data for synthesis.

The video conference server 130 may receive the data for synthesis and the weights for the data for synthesis from each of a plurality of data generating apparatuses 110 located in respective conference rooms. The video conference server 130 may generate video conference data by combining the data for synthesis based on the plurality of weights.

The video conference server 130 may select focus image data from among the plurality of pieces of image data for synthesis based on the image weights.

For example, the video conference server 130 may sort the image weights of N pieces of image data for synthesis in ascending order and derive their median. When N is odd, the median is the ((N+1)/2)-th highest of the N image weights. When N is even, the median is the average of the (N/2)-th highest and the ((N/2)+1)-th highest of the N image weights. The video conference server 130 may derive a reference value for the image weight by multiplying the median by a preset constant.
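The reference value can be sketched with Python's `statistics.median`, which sorts its input and averages the two middle values when the count is even, matching the odd and even cases described above. The function name `reference_value` is hypothetical, and the constants used in the example are arbitrary because the source does not state the preset constant.

```python
from statistics import median


def reference_value(image_weights: list[float], constant: float) -> float:
    """Median of the image weights multiplied by a preset constant."""
    if not image_weights:
        raise ValueError("at least one image weight is required")
    return median(image_weights) * constant


# N = 3 (odd): the median is the 2nd highest value, 2.0.
odd_case = reference_value([1.0, 3.0, 2.0], constant=1.5)

# N = 4 (even): the median is the mean of the 2nd and 3rd highest, 2.5.
even_case = reference_value([1.0, 2.0, 3.0, 4.0], constant=2.0)
```

Scaling the median rather than using a fixed threshold keeps the reference value relative to how active the rooms are overall.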

The video conference server 130 may select the focus image data from among the image data for synthesis whose image weight is greater than or equal to the reference value. When no image data for synthesis has an image weight at or above the reference value, there is no focus image data. When exactly one piece of image data for synthesis has an image weight at or above the reference value, that piece is selected as the focus image data. When two or more pieces have an image weight at or above the reference value, the video conference server 130 may select the two pieces with the highest image weights as the focus image data.
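The three cases above reduce to filtering by the reference value and keeping at most the two highest weights. A minimal sketch follows; the function name `select_focus`, the room labels, and the weights are illustrative assumptions.

```python
def select_focus(
    weights: dict[str, float], reference: float, max_focus: int = 2
) -> list[str]:
    """Return up to max_focus rooms whose image weight meets the reference.

    The result is ordered from highest to lowest image weight and may be
    empty when no room reaches the reference value.
    """
    eligible = [room for room, w in weights.items() if w >= reference]
    eligible.sort(key=lambda room: weights[room], reverse=True)
    return eligible[:max_focus]


weights = {"room_a": 1.0, "room_b": 5.0, "room_c": 2.0, "room_d": 6.0}

none_selected = select_focus(weights, reference=10.0)  # no room qualifies
one_selected = select_focus(weights, reference=5.5)    # only room_d
two_selected = select_focus(weights, reference=1.5)    # three qualify, top two kept
```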

The video conference server 130 may generate the video conference data based on the focus image data.

FIG. 4 illustrates an example in which the video conference server generates video conference data from four pieces of image data for synthesis. When there is no focus image data, the video conference server 130 may generate the video conference data by giving the four pieces of image data equal prominence, as shown in FIG. 4(a). When there is one piece of focus image data, the video conference server 130 may give it the greatest prominence, as shown in FIG. 4(b). When there are two pieces of focus image data, the video conference server 130 may give them the greatest prominence, as shown in FIG. 4(c).
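The three layouts of FIG. 4 can be approximated by assigning each stream a share of the screen. This is a rough sketch under stated assumptions: the source says only that focus image data receives the greatest prominence, so expressing prominence as an area share, the default `focus_share` of 0.8, and the name `layout_shares` are all hypothetical.

```python
def layout_shares(
    streams: list[str], focus: list[str], focus_share: float = 0.8
) -> dict[str, float]:
    """Split the screen between focus and non-focus streams.

    With no focus stream every stream gets an equal share. Otherwise the
    focus streams split focus_share equally and the rest split the remainder.
    """
    if not streams:
        raise ValueError("at least one stream is required")
    if not focus:
        return {s: 1.0 / len(streams) for s in streams}
    others = [s for s in streams if s not in focus]
    if not others:
        return {s: 1.0 / len(focus) for s in focus}
    shares = {s: focus_share / len(focus) for s in focus}
    shares.update({s: (1.0 - focus_share) / len(others) for s in others})
    return shares


rooms = ["a", "b", "c", "d"]
equal_layout = layout_shares(rooms, [])             # as in FIG. 4(a)
one_focus_layout = layout_shares(rooms, ["a"])      # as in FIG. 4(b)
two_focus_layout = layout_shares(rooms, ["a", "b"])  # as in FIG. 4(c)
```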

The video conference server 130 may generate video conference data from the voice data for synthesis based on the voice weight.

The transmission unit 114 may transmit the generated data for synthesis to the video conference server 130. The transmission unit 114 may further transmit the weight for the data for synthesis to the video conference server 130. As described above, the video conference server 130 may generate video conference data based on the data for synthesis and the weight for the data for synthesis.

FIG. 5 is a flowchart of a data generation method according to an embodiment of the present invention. The method 500 for generating data for synthesis, shown in FIG. 5 and performed by the data generating device 110, includes steps processed in time series by the data generating device 110 according to the embodiment shown in FIG. 2. Accordingly, matters described above with respect to the data generating device 110 of the embodiment shown in FIG. 2 also apply to the method 500, even where they are omitted below.

In step S510, the data generating device 110 may receive multimedia data including image data and voice data from one or more slave clients 120.

In step S520, the data generating device 110 may derive a weight for each of the image data and the voice data.

In step S530, the data generating device 110 may generate data for synthesis from the multimedia data based on the derived weights.

In step S540, the data generating device 110 may transmit the generated data for synthesis to the video conference server.

In the above description, steps S510 to S540 may be further divided into additional steps or combined into fewer steps, depending on the implementation of the present invention. In addition, some steps may be omitted as necessary, and the order of the steps may be changed.
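Steps S510 to S540 can be read as a small pipeline, sketched below under stated assumptions. The class names `Multimedia` and `SynthesisData`, the use of peak amplitude as the voice weight, the choice of the single highest-weighted item in step S530, and the `image_weight_fn` and `send` callables are all hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Multimedia:
    client_id: str           # slave client that sent the data
    image: bytes             # image data (opaque in this sketch)
    audio: Sequence[float]   # voice data as amplitude samples


@dataclass
class SynthesisData:
    image_client: str
    audio_client: str
    image_weight: float
    audio_weight: float


def peak_amplitude(samples: Sequence[float]) -> float:
    """Hypothetical voice weight: largest absolute sample value."""
    return max((abs(x) for x in samples), default=0.0)


def run_pipeline(
    received: List[Multimedia],                 # S510: data already received
    image_weight_fn: Callable[[bytes], float],  # assumed image weighting function
    send: Callable[[SynthesisData], None],      # assumed transport to the server
) -> SynthesisData:
    # S520: derive a weight for each piece of image data and voice data.
    image_w = {m.client_id: image_weight_fn(m.image) for m in received}
    audio_w = {m.client_id: peak_amplitude(m.audio) for m in received}
    # S530: pick the highest-weighted image and voice data (an assumption;
    # requires at least one client, as in "one or more slave clients").
    best_image = max(image_w, key=image_w.get)
    best_audio = max(audio_w, key=audio_w.get)
    data = SynthesisData(best_image, best_audio,
                         image_w[best_image], audio_w[best_audio])
    # S540: transmit the data for synthesis, together with its weights.
    send(data)
    return data
```

Passing `send` as a callable keeps the sketch independent of any particular network transport; in a test it can simply be the `append` method of a list.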

FIG. 6 is a flowchart of a data generation method according to an embodiment of the present invention. The method 600 for generating image data for synthesis, shown in FIG. 6 and performed by the data generating device 110, includes steps processed in time series by the data generating device 110 according to the embodiment shown in FIG. 2. Accordingly, matters described above with respect to the data generating device 110 of the embodiment shown in FIG. 2 also apply to the method 600, even where they are omitted below.

In step S610, the data generating device 110 may generate a decision tree based on the behavior of a conference participant.

In step S620, the data generating device 110 may calculate a first score based on the decision tree.

In step S630, the data generating device 110 may calculate a second score based on the number of conference participants appearing in the image.

In step S640, the data generating device 110 may calculate a behavior score of the conference participant based on the first score and the second score.

In step S650, the data generating device 110 may derive an image weight for the image data.

In step S660, the data generating device 110 may select image data for synthesis from among the image data.

In the above description, steps S610 to S660 may be further divided into additional steps or combined into fewer steps, depending on the implementation of the present invention. In addition, some steps may be omitted as necessary, and the order of the steps may be changed.
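The flow of steps S610 to S660 can be illustrated with the sketch below. It is deliberately simplified and rests on several assumptions: the decision tree is a fixed, hand-written tree over two hypothetical behavior features (`speaking`, `gesturing`) rather than a generated one, the leaf scores and the `max_participants` normalization are invented for illustration, and the behavior score is taken to be the plain sum of the two scores.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    client_id: str
    speaking: bool      # hypothetical behavior feature
    gesturing: bool     # hypothetical behavior feature
    participants: int   # number of conference participants in the image


def first_score(f: Frame) -> float:
    """S610/S620: score from a tiny fixed decision tree (hypothetical leaves)."""
    if f.speaking:
        return 1.0 if f.gesturing else 0.8
    return 0.3 if f.gesturing else 0.0


def second_score(f: Frame, max_participants: int = 10) -> float:
    """S630: score from the participant count, capped and scaled to [0, 1]."""
    return min(f.participants, max_participants) / max_participants


def image_weight(f: Frame) -> float:
    """S640/S650: behavior score used directly as the image weight (assumption)."""
    return first_score(f) + second_score(f)


def select_for_synthesis(frames: List[Frame]) -> Frame:
    """S660: choose the image data with the highest image weight."""
    return max(frames, key=image_weight)
```

Separating the two scores mirrors the description, in which the first score depends on what a participant is doing and the second only on how many participants appear in the image.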

The method of generating data in the data generating device described with reference to FIGS. 1 to 6 may also be implemented in the form of a computer program stored in a medium and executed by a computer, or in the form of a recording medium containing instructions executable by a computer.

A computer-readable medium may be any available medium that can be accessed by a computer, and includes volatile and nonvolatile media as well as removable and non-removable media. A computer-readable medium may also include a computer storage medium. Computer storage media include volatile and nonvolatile, removable and non-removable media implemented by any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data.

The foregoing description of the present invention is illustrative, and those of ordinary skill in the art to which the present invention pertains will understand that it can readily be modified into other specific forms without changing the technical spirit or essential features of the present invention. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. For example, each component described as a single unit may be implemented in a distributed manner, and likewise components described as distributed may be implemented in a combined form.

The scope of the present invention is indicated by the claims that follow rather than by the above detailed description, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present invention.

100: video conference service providing system
110: data generating device
111: reception unit
112: weight deriving unit
112a: image weight deriving unit
112b: voice weight deriving unit
113: generation unit
114: transmission unit

Claims

1. A data generating device for generating data for synthesis, comprising:
a reception unit that receives multimedia data including image data and voice data from one or more slave clients;
a weight deriving unit that derives a weight for each of the image data and the voice data;
a generation unit that generates data for synthesis from the multimedia data based on the derived weights; and
a transmission unit that transmits the generated data for synthesis to a video conference server.

2. The data generating device of claim 1, wherein the weight deriving unit comprises:
an image weight deriving unit that derives an image weight for the image data based on a behavior score of a conference participant appearing in an image corresponding to the image data; and
a voice weight deriving unit that derives a voice weight for each piece of the voice data based on an amplitude of a voice signal included in the voice data.

3. The data generating device of claim 2, wherein the image weight deriving unit generates a decision tree based on the behavior of the conference participant.

4. The data generating device of claim 3, wherein the image weight deriving unit calculates the behavior score based on a first score based on the decision tree and a second score based on the number of conference participants appearing in the image.

5. The data generating device of claim 2, wherein the generation unit selects image data for synthesis from among the one or more pieces of image data based on the image weight, selects voice data for synthesis from the one or more pieces of voice data based on the voice weight, and generates the data for synthesis by synthesizing the image data for synthesis and the voice data for synthesis.

6. The data generating device of claim 1, wherein the transmission unit further transmits a weight for the data for synthesis to the video conference server.

7. The data generating device of claim 1, wherein the video conference server:
receives the data for synthesis and a weight for the data for synthesis from each of a plurality of data generating devices located in respective conference rooms; and
generates video conference data by combining the data for synthesis based on the plurality of weights.

8. A data generation method for generating data for synthesis, comprising:
receiving multimedia data including image data and voice data from one or more slave clients;
deriving a weight for each of the image data and the voice data; and
generating data for synthesis from the multimedia data based on the derived weights.

9. The data generation method of claim 8, wherein deriving the weight comprises:
deriving an image weight for the image data based on a behavior score of a conference participant appearing in an image corresponding to the image data; and
deriving a voice weight for each piece of the voice data based on an amplitude of a voice signal included in the voice data.

10. The data generation method of claim 9, wherein deriving the image weight comprises generating a decision tree based on the behavior of the conference participant.

11. The data generation method of claim 10, wherein deriving the image weight further comprises calculating the behavior score based on a first score based on the decision tree and a second score based on the number of conference participants appearing in the image.

12. The data generation method of claim 9, wherein generating the data for synthesis comprises:
selecting image data for synthesis from among the one or more pieces of image data based on the image weight;
selecting voice data for synthesis from the one or more pieces of voice data based on the voice weight; and
generating the data for synthesis by synthesizing the image data for synthesis and the voice data for synthesis.

13. The data generation method of claim 8, further comprising transmitting the data for synthesis and a weight for the data for synthesis to a video conference server.

14. The data generation method of claim 13, wherein the video conference server:
receives the data for synthesis and the weight for the data for synthesis from each of a plurality of data generating devices located in respective conference rooms; and
generates video conference data by combining the data for synthesis based on the plurality of weights.

15. A computer program stored in a computer-readable recording medium and comprising a sequence of instructions for generating data for synthesis, wherein the sequence of instructions, when executed by a computing device, causes the computing device to:
receive multimedia data including image data and voice data from one or more slave clients;
derive a weight for each of the image data and the voice data; and
generate data for synthesis from the multimedia data based on the derived weights.