JP7065545B1

JP7065545B1 - Live distribution system, live distribution method, and live distribution program

Info

Publication number: JP7065545B1
Application number: JP2021185206A
Authority: JP
Inventors: 秀幸市橋
Original assignee: Frontier Co Ltd
Current assignee: Frontier Co Ltd
Priority date: 2021-11-12
Filing date: 2021-11-12
Publication date: 2022-05-12
Anticipated expiration: 2041-11-12
Also published as: JP2023072567A

Abstract

PROBLEM TO BE SOLVED: To provide a live distribution system, a live distribution method, and a live distribution program with which an image processing method can be determined in real time while a distributor and viewers communicate with each other. SOLUTION: A live distribution system comprises a camera, a microphone, a video adjustment device, a video conversion device, and a distribution server that transmits moving image data to viewer terminals. The video conversion device includes a video editing unit, an input/output unit, a viewer information management unit, and a viewer information storage unit. The video editing unit includes: an image data storage unit that stores a learning model constructed using face image data of a specific person; a face image determination unit that extracts face images of persons appearing in the moving image data and determines which of them match the face image of the specific person in the learning model; an image processing unit that performs image processing on the face images and then determines the image processing method based on data; and an image display unit that performs image processing based on data transmitted from the viewer terminals. [Selected drawing] FIG. 5

Description

The present invention relates to a live distribution system, a live distribution method, and a live distribution program.

When sports events, music performances, and the like are distributed live, the audience may be filmed in addition to the performers. When the audience seats are filmed and the performers and the audience seats are far apart, measures are taken such as filming at a resolution too low to identify individual audience members. When performers and audience members appear in the video at the same time, for example because they are close to each other, the face images of persons other than a specific person in the video can be processed so that individual audience members cannot be identified, as disclosed in Patent Document 1 and Patent Document 2.

Patent Document 1: Japanese Unexamined Patent Publication No. 2002-191044
Patent Document 2: Japanese Unexamined Patent Publication No. 2001-256496

According to the methods disclosed in Patent Document 1 and Patent Document 2, the distributor decides which specific persons are exempt from image processing and which image processing method is used, and viewers can only watch the distributed video.

In view of the above problem, an object of the present invention is to provide a live distribution system, a live distribution method, and a live distribution program with which an image processing method can be determined in real time while a distributor and viewers communicate with each other.

A first aspect of the present invention is a live distribution system that distributes a live video, comprising: one or more cameras that film a live event and generate one or more pieces of video data; one or more microphones that record the sound of the live event and generate one or more pieces of audio data; a video adjustment device that generates moving image data from the one or more pieces of video data and the one or more pieces of audio data; a video conversion device that edits the moving image data; and a distribution server that transmits the edited moving image data to viewer terminals. The video conversion device includes: a video editing unit; an input/output unit that exchanges, with the viewer terminals, comment data containing information for presenting comments on the live video within the live video and request data containing information requesting a method of image processing for face images; a viewer information management unit that registers and authenticates one or more viewers; and a viewer information storage unit that stores viewer information, viewer IDs, and passwords of the one or more viewers. The video editing unit includes: an image data storage unit that stores a learning model constructed using face image data of one or more specific persons, the face image data being taken from previously captured image data and tagged with respect to the one or more specific persons; a face image determination unit that extracts, by image recognition, face images of one or more persons appearing in images of the moving image data and determines whether the extracted face images include an image matching the face image of any of the one or more specific persons in the learning model; an image processing unit that first performs image processing on those face images, among the face images judged by the face image determination unit, that do not match the face image of any of the one or more specific persons, so that the individual is difficult to identify, and then performs image processing based on the request data transmitted from the viewer terminals; and an image display unit that performs image processing to display the comments in the live video based on the comment data transmitted from the viewer terminals.

In the first aspect of the present invention, the image processing may be mosaic processing of face images, and the image processing unit may apply the mosaic processing only to those of the extracted one or more face images that do not match the face image of any of the one or more specific persons.

In the first aspect of the present invention, the image processing unit may apply different image processing to those of the extracted one or more face images that match the face image of any of the one or more specific persons and to those that do not.

In the first aspect of the present invention, the image processing may be mosaic processing of face images, and the image processing unit may determine, based on the request data, the coarseness of the mosaic applied to the extracted one or more face images.

In the first aspect of the present invention, the image processing may be mosaic processing of face images, and the image processing unit may determine, based on the request data, the area covered by the mosaic applied to the extracted one or more face images.

In the first aspect of the present invention, the image processing unit may determine, based on the request data, which of the extracted one or more face images are to undergo image processing.

In the first aspect of the present invention, the method of image processing performed by the image processing unit based on the request data may be made selectable by a viewer in return for payment.

In the first aspect of the present invention, the method of image processing performed by the image processing unit based on the request data may be decided by a majority vote of the viewers.

In the first aspect of the present invention, the options for the image processing method to be decided by the majority vote of the viewers may be proposed in the comments displayed in the live video by the image display unit, and the proposed options may be displayed in the live video once the one or more specific persons have approved the options proposed in the comments.

In the first aspect of the present invention, the image processing unit may determine, based on the request data, which face image among the extracted one or more face images is to undergo image processing.

In the first aspect of the present invention, the data storage unit may further store a marker learning model constructed using previously captured image data of a marker, and the image processing unit may, based on the data, perform image processing on the face image of a person wearing the marker.

In the first aspect of the present invention, the determination of the image processing method by the image processing unit based on the request data may be enabled when a viewer satisfies a predetermined condition.

A second aspect of the present invention is a live distribution method for distributing a live video to viewer terminals, comprising: a video data generation step of filming a live event and generating one or more pieces of video data; an audio data generation step of recording the sound of the live event and generating one or more pieces of audio data; a moving image data generation step of generating moving image data from the one or more pieces of video data and the one or more pieces of audio data; a video editing step of editing the moving image data; a distribution step of transmitting the edited moving image data to the viewer terminals; a viewer information storage step of storing viewer information, viewer IDs, and passwords of one or more viewers; a viewer information management step of registering and authenticating the one or more viewers; and an input/output step of exchanging, with the viewer terminals, comment data containing information for presenting comments on the live video within the live video and request data containing information requesting a method of image processing for face images. The video editing step includes: an extraction step of extracting, by image recognition, face images of one or more persons appearing in images of the moving image data; a face image determination step of determining whether the extracted one or more face images include an image matching the face image of any of one or more specific persons in a learning model constructed using face image data of the one or more specific persons, the face image data being taken from previously captured image data and tagged with respect to the one or more specific persons; an image processing step of first performing image processing on those face images, among the face images judged in the face image determination step, that do not match the face image of any of the one or more specific persons, so that the individual is difficult to identify, and then performing image processing based on the request data transmitted from the viewer terminals; and an image display step of performing image processing to display the comments in the live video based on the comment data transmitted from the viewer terminals.

A third aspect of the present invention is a live distribution program that causes a computer to implement: a video data generation function of filming a live event and generating one or more pieces of video data; an audio data generation function of recording the sound of the live event and generating one or more pieces of audio data; a moving image data generation function of generating moving image data from the one or more pieces of video data and the one or more pieces of audio data; a video editing function of editing the moving image data; a distribution function of transmitting the edited moving image data to viewer terminals; a viewer information storage function of storing viewer information, viewer IDs, and passwords of one or more viewers; a viewer information management function of registering and authenticating the one or more viewers; and an input/output function of exchanging, with the viewer terminals, comment data containing information for presenting comments on the live video within the live video and request data containing information requesting a method of image processing for face images. The video editing function includes: an extraction function of extracting, by image recognition, face images of one or more persons appearing in images of the moving image data; a face image determination function of determining whether the extracted one or more face images include an image matching the face image of any of one or more specific persons in a learning model constructed using face image data of the one or more specific persons, the face image data being taken from previously captured image data and tagged with respect to the one or more specific persons; an image processing function of first performing image processing on those face images, among the face images judged by the face image determination function, that do not match the face image of any of the one or more specific persons, so that the individual is difficult to identify, and then performing image processing based on the request data transmitted from the viewer terminals; and an image display function of performing image processing to display the comments in the live video based on the comment data transmitted from the viewer terminals.

According to the present invention, it is possible to provide a live distribution system, a live distribution method, and a live distribution program with which an image processing method can be determined in real time while a distributor and viewers communicate with each other.

FIG. 1 is a schematic diagram showing an example of the overall configuration of a live distribution system according to an embodiment of the present invention.
FIG. 2 is a block diagram showing an example of the configuration of the video adjustment device of the live distribution system according to the present embodiment.
FIG. 3 is a block diagram showing an example of the configuration of the video conversion device of the live distribution system according to the present embodiment.
FIG. 4 is a block diagram showing an example of the configuration of the video editing unit of the live distribution system according to the present embodiment.
FIG. 5 is a flowchart for explaining the operation of the live distribution system according to the present embodiment.
FIG. 6 is a flowchart for explaining the operation of the live distribution system according to the present embodiment.

Next, an embodiment of the present invention is described with reference to the drawings. In the drawings of the embodiment, identical or similar parts are given identical or similar reference numerals. Note, however, that the drawings are schematic.

The embodiment illustrates devices and methods for embodying the technical idea of the present invention, and the technical idea of the present invention does not limit the configuration, arrangement, layout, and the like of each component to those described below. Various modifications can be made to the technical idea of the present invention within the technical scope defined by the claims.

(Embodiment)
A live distribution system according to a first embodiment of the present invention is described below. FIG. 1 shows an example of the configuration of the live distribution system 10 according to the present embodiment. The live distribution system 10 shown in FIG. 1 comprises a first camera 101a, a second camera 101b, a third camera 101c, a first microphone 102a, a second microphone 102b, a third microphone 102c, a video adjustment device 103, a video conversion device 104, a distribution server 105, a network 106, and viewer terminals 107a, 107b, and 107c.

Each of the one or more viewer terminals 107a, 107b, and 107c is a terminal, such as a mobile terminal or a personal computer, with which a viewer can receive and view moving image data from the live distribution system via the network 106 and exchange data with the video conversion device 104.

The video conversion device 104 may be any of various electronic computers (computing resources), such as a personal computer (PC), a mainframe, a workstation, or a cloud computing system, and is connected to a display and to input devices such as a keyboard and a mouse. The network 106 may be any communication network, such as the Internet, an optical network, or a telephone network.

FIG. 2 shows an example of the configuration of the video adjustment device 103. The video adjustment device 103 shown in FIG. 2 comprises a first video data converter 201a, a second video data converter 201b, a third video data converter 201c, a first audio data converter 202a, a second audio data converter 202b, a third audio data converter 202c, a first synthesizer 203a, a second synthesizer 203b, a third synthesizer 203c, a switcher 204, an encoder 205, and a streaming server 206.

FIG. 3 shows an example of the configuration of the video conversion device 104. The video conversion device 104 shown in FIG. 3 comprises a video editing unit 301, an input/output unit 302, a viewer information management unit 303, and a viewer information storage unit 304.

At the live venue, the performers give a live performance, which is filmed by the first camera 101a, the second camera 101b, and the third camera 101c; the cameras generate first video data, second video data, and third video data, respectively. Similarly, the sound is recorded by the first microphone 102a, the second microphone 102b, and the third microphone 102c, which generate first audio data, second audio data, and third audio data, respectively.

The first, second, and third video data obtained by the first camera 101a, the second camera 101b, and the third camera 101c are converted from analog data to digital data by the first video data converter 201a, the second video data converter 201b, and the third video data converter 201c, respectively, and are then compressed. Likewise, the first, second, and third audio data obtained by the first microphone 102a, the second microphone 102b, and the third microphone 102c are converted from analog data to digital data by the first audio data converter 202a, the second audio data converter 202b, and the third audio data converter 202c, respectively, and are then compressed.

The first to third video data and the first to third audio data are synchronized and combined in the first synthesizer 203a, the second synthesizer 203b, and the third synthesizer 203c, respectively, to generate first moving image data, second moving image data, and third moving image data.

The first, second, and third moving image data are each input to the switcher 204, where the moving image to be distributed is selected. The moving image data selected for distribution from among the first, second, and third moving image data is converted into streaming-format data by the encoder 205 and the streaming server 206.
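As a rough illustration of the selection performed by the switcher 204, the following Python sketch holds several named feeds and outputs whichever one is currently selected. The class name `Switcher`, its methods, and the feed names are hypothetical and are not taken from the specification, which does not describe how the selection is implemented.

```python
class Switcher:
    """Hypothetical stand-in for the switcher 204: one of several feeds is live."""

    def __init__(self, feeds):
        if not feeds:
            raise ValueError("at least one feed is required")
        self._feeds = dict(feeds)
        # Assumption: the first feed is live until another one is selected.
        self._selected = next(iter(self._feeds))

    def select(self, name):
        """Choose which feed is passed on for encoding and distribution."""
        if name not in self._feeds:
            raise KeyError(name)
        self._selected = name

    def output(self):
        """Return the currently selected feed."""
        return self._feeds[self._selected]
```

In this sketch only the selected feed would be handed to the encoder; the other feeds stay available so that the selection can change during the broadcast.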

The moving image data converted into the streaming format is edited by the video editing unit 301 and then distributed from the distribution server 105 to the viewer terminals 107a, 107b, and 107c via the network 106.

FIG. 4 shows an example of the configuration of the video editing unit 301. The video editing unit 301 shown in FIG. 4 comprises a face image determination unit 401, an image data storage unit 402, an image processing unit 403, and an image display unit 404.

The face image determination unit 401 extracts, by image recognition, the face images of all persons appearing in the images of the moving image data, and determines whether the extracted face images include an image matching the face image of any of the one or more specific persons stored in the image data storage unit 402.
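The matching step can be pictured with the following Python sketch. It assumes that each face has already been reduced to a numeric feature vector and that a match means a cosine similarity at or above a threshold; the specification only states that a learning model is used, so the vector representation, the function names, and the threshold value of 0.8 are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_specific_person(face_vec, registered, threshold=0.8):
    """Return the tag of the best-matching registered person, or None.

    `registered` maps a person tag (e.g. "performer") to a reference vector.
    """
    best_tag, best_score = None, threshold
    for tag, ref_vec in registered.items():
        score = cosine_similarity(face_vec, ref_vec)
        if score >= best_score:
            best_tag, best_score = tag, score
    return best_tag
```

A face that returns `None` here corresponds to a face that does not match any specific person, such as an audience member.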

The image data storage unit 402 stores the machine learning model used when the face image determination unit 401 performs image recognition. This machine learning model is trained in advance; in the present embodiment, it is constructed using face image data of one or more specific persons, the face image data being taken from previously captured image data and tagged with respect to the one or more specific persons. In the present embodiment, the one or more specific persons are, for example, the performers giving the live performance, the moderator, and the like.

The image processing unit 403 performs image processing on the face images extracted by the face image determination unit 401. In the present embodiment, the image processing performed by the image processing unit 403 means altering at least part of a face image extracted by the face image determination unit 401 so that the individual is difficult to identify, for example by applying a mosaic, blurring it, or covering it with another image.

In the present embodiment, at the start of video distribution, the face image determination unit 401 is set so that the extracted face images determined to match the face image of any of the one or more specific persons are handled differently from the face images of all other persons, for example by not being subjected to mosaic processing.
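The default setting at the start of distribution can be sketched as follows in Python. The dictionary keys `box` and `matched_tag` and the function name are hypothetical; the sketch only assumes that each extracted face carries the result of the matching step, with `None` meaning that no specific person was matched.

```python
def default_processing(faces):
    """Decide, per extracted face, whether the default mosaic is applied.

    Each face is a dict with a bounding box under "box" and the tag of the
    matched specific person under "matched_tag" (None if nobody matched).
    """
    return [
        {"box": face["box"], "mosaic": face["matched_tag"] is None}
        for face in faces
    ]
```

With this default, a matched performer or moderator is left unprocessed while every unmatched face is marked for mosaic processing.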

On the viewer terminals 107a, 107b, and 107c, viewers watch the live video distributed by the live distribution system according to the present embodiment, and can transmit comment data from the viewer terminals 107a, 107b, and 107c to the input/output unit 302. The comment data contains information for presenting comments, impressions, messages, and the like about the live video within the live video.

The comment data is transmitted from the viewer terminals 107a, 107b, and 107c via the network 106 to the input/output unit 302 of the video conversion device 104, and is then passed to the image display unit 404. Based on the comment data, the image display unit 404 performs image processing so that the comments, impressions, messages, and the like are displayed in the comment field of the live video screen.
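The comment path described above might be modeled as in the following Python sketch, in which the comment field keeps only the most recent lines. The class names `CommentData` and `CommentColumn`, the field names, and the three-line limit are assumptions for illustration; the specification does not define the layout of the comment data or of the comment field.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class CommentData:
    viewer_id: str  # hypothetical field: who sent the comment
    text: str       # hypothetical field: the comment body


class CommentColumn:
    """Hypothetical comment field that shows only the latest comments."""

    def __init__(self, max_lines=3):
        self._lines = deque(maxlen=max_lines)

    def receive(self, comment):
        """Accept comment data forwarded by the input/output unit."""
        self._lines.append(f"{comment.viewer_id}: {comment.text}")

    def render(self):
        """Return the lines to be overlaid on the live video, oldest first."""
        return list(self._lines)
```

Using a bounded queue keeps the overlay a fixed size, so older comments scroll away as new ones arrive.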

The viewers, and the persons appearing in the live video such as the moderator, performers, and audience members, can learn the content of the comments displayed in the comment field of the live video screen by the image display unit 404, for example by means of a display installed at the live venue that shows the live video screen. When a person appearing in the live video responds to a comment, the viewers and that person can communicate with each other.

While watching the live video according to the present embodiment on the viewer terminals 107a, 107b, and 107c, viewers can also transmit request data from the viewer terminals 107a, 107b, and 107c to the input/output unit 302. The request data contains information requesting the processing method to be used when the image processing unit 403 applies mosaic processing or the like to the face images extracted by the face image determination unit 401.

The request data is transmitted from the viewer terminals 107a, 107b, and 107c to the input/output unit 302 and is then passed to the image processing unit 403. Based on the information requesting a processing method contained in the request data, the image processing unit 403 applies mosaic processing or the like to the face images extracted by the face image determination unit 401.

One processing method that a viewer may request for the mosaic processing or the like performed by the image processing unit 403 on the face images extracted by the face image determination unit 401 is to designate the face images of the persons to be mosaicked. In the present embodiment, at the start of video distribution, mosaic processing is applied only to those extracted face images that do not match the face images of the one or more specific persons, such as performers and the moderator, stored in the image data storage unit 402. At a viewer's request, however, mosaic processing may also be applied to the face images of one or more specific persons such as performers and the moderator.
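A viewer request that designates whose face is mosaicked could be handled along the lines of the following Python sketch. The classes `RequestData` and `ImageProcessingUnit`, their fields, and the rule that unmatched faces always stay mosaicked are assumptions for illustration and are not a format or policy defined in the specification.

```python
from dataclasses import dataclass, field


@dataclass
class RequestData:
    viewer_id: str   # hypothetical field: who sent the request
    target_tag: str  # hypothetical field: tag of the specific person concerned
    mosaic: bool     # True to mosaic that person's face, False to show it


@dataclass
class ImageProcessingUnit:
    # Per-person overrides set by viewer requests; empty at the start.
    overrides: dict = field(default_factory=dict)

    def apply_request(self, request):
        """Record the latest request for the designated specific person."""
        self.overrides[request.target_tag] = request.mosaic

    def should_mosaic(self, matched_tag):
        """Decide whether a face is mosaicked, given its matching result."""
        if matched_tag is None:
            # Assumption: faces that match nobody always stay mosaicked.
            return True
        # Default for specific persons: no mosaic unless a request says so.
        return self.overrides.get(matched_tag, False)
```

Here a request naming the tag `performer` with `mosaic=True` switches the performer's face from the default (shown) to mosaicked, without affecting other persons.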

For example, when one or more specific persons, such as a live performer or the moderator, give a performance that a viewer dislikes, the viewer can request that the image processing unit 403 apply mosaic processing to the face image of the person who gave that performance. In this way, viewers' reactions and wishes can be reflected in real time.
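The target selection described above can be sketched as follows. This is a minimal illustration only, assuming that each extracted face carries the name of the matched specific person (or nothing when no match was found); the names `Face` and `faces_to_mosaic` are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Face:
    """One extracted face (hypothetical structure for illustration)."""
    face_id: int
    person: Optional[str]  # matched specific person, or None when no match


def faces_to_mosaic(faces: Iterable[Face],
                    requested: Iterable[str] = ()) -> List[int]:
    """Return the IDs of the faces that should receive mosaic processing.

    Default rule: only faces that match no specific person are mosaicked.
    A viewer request may add specific persons (e.g. a performer) by name.
    """
    requested_names = set(requested)
    return [
        face.face_id
        for face in faces
        if face.person is None or face.person in requested_names
    ]
```

With no request, only unmatched faces (for example audience members) are selected; passing a performer's name in `requested` adds that performer's face to the selection.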

Another processing method that a viewer can request for the mosaic processing or the like that the image processing unit 403 applies to the face images extracted by the face image determination unit 401 is, for example, the coarseness of the mosaic. The coarseness may be specified, for example, by giving the size of one mosaic unit (block) as a numerical value.
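As a minimal sketch of a mosaic whose coarseness is given as a numeric block size, the following pixelates a rectangular face region of a grayscale image by replacing each block with its mean value. The representation of the image as a list of rows, the `(left, top, right, bottom)` box convention, and the function name `mosaic` are assumptions made for illustration, not details taken from the specification.

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale image: rows of pixel values 0-255
Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right/bottom exclusive


def mosaic(image: Image, box: Box, block: int) -> Image:
    """Return a copy of `image` with `box` pixelated in block x block tiles.

    A larger `block` gives a coarser mosaic; block == 1 leaves the image as is.
    """
    if block < 1:
        raise ValueError("block must be >= 1")
    left, top, right, bottom = box
    out = [row[:] for row in image]  # do not modify the input frame
    for y0 in range(top, bottom, block):
        for x0 in range(left, right, block):
            y1 = min(y0 + block, bottom)
            x1 = min(x0 + block, right)
            tile = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            mean = sum(tile) // len(tile)
            for y in range(y0, y1):
                for x in range(x0, x1):
                    out[y][x] = mean
    return out
```

A request that specifies, say, `block=16` instead of `block=4` would then yield larger uniform tiles and make the face harder to identify.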

The smaller the coarseness of the mosaic applied to a face image, the easier it is to identify the individual; the larger the coarseness, the harder. For example, if the mosaic applied to the face of an audience member shown in the live video is not coarse enough and that audience member might be identified, a viewer can request a coarser mosaic so that the audience member cannot be identified. Conversely, when the image processing unit 403 applies mosaic processing to the face image of a specific person, the mosaic applied to that person can be made finer so that the person can be distinguished from the audience.

Another processing method for the mosaic processing or the like applied to the face images extracted by the face image determination unit 401 is the range over which the processing is applied. If a wide range is specified, mosaic processing covers not only the face image but also a wider area, for example the neck and clothing of the person being mosaicked. If a narrow range is specified, mosaic processing covers only part of the face image. The range may be specified by its size and position; for example, mosaic processing may be applied to the eyes only.

The smaller the range of the mosaic processing or the like applied to a face image, the easier it is to identify the individual; the larger the range, the harder. As with the coarseness, a viewer can request a larger mosaic range so that an audience member cannot be identified, and the mosaic range applied to a specific person can be made smaller so that the person can be distinguished from the audience.
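One possible way to realize a requested range is to scale the detected face rectangle about its center and clamp it to the frame. The sketch below assumes a `(left, top, right, bottom)` rectangle and a scale factor supplied in the request; the function name `scale_box` and the factor-based interface are hypothetical.

```python
import math
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right/bottom exclusive


def scale_box(box: Box, factor: float, width: int, height: int) -> Box:
    """Scale `box` about its center by `factor`, clamped to the frame size.

    factor > 1 widens the processed range (e.g. to include neck and clothing);
    factor < 1 narrows it to part of the face.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    left, top, right, bottom = box
    cx = (left + right) / 2
    cy = (top + bottom) / 2
    half_w = (right - left) * factor / 2
    half_h = (bottom - top) * factor / 2
    return (
        max(0, math.floor(cx - half_w)),
        max(0, math.floor(cy - half_h)),
        min(width, math.ceil(cx + half_w)),
        min(height, math.ceil(cy + half_h)),
    )
```

Specifying a position as well as a size (for example, the eyes only) would require an additional offset parameter, which is omitted here for brevity.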

Besides applying a mosaic to the designated portion, another processing method for the face images extracted by the face image determination unit 401 is, for example, to cover the face with another image. Examples of the other image include another person's face and an illustration, and the viewer may be allowed to designate the other image.

For example, the face image of a specific person, such as a performer in the live video, may be covered with an image suited to the performance that person gave. Reflecting viewers' wishes in the live event in real time in this way can make the live event more engaging.
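Covering a face with another image can be sketched as pasting a replacement image, resized to the face rectangle, over the frame. The nearest-neighbor resizing, the grayscale list-of-rows image representation, and the name `cover_with` are assumptions chosen to keep the example self-contained.

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale image: rows of pixel values
Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right/bottom exclusive


def cover_with(image: Image, box: Box, sticker: Image) -> Image:
    """Return a copy of `image` with `box` covered by `sticker`.

    The sticker (e.g. an illustration chosen by the viewer) is resized to the
    box with nearest-neighbor sampling.
    """
    left, top, right, bottom = box
    box_h, box_w = bottom - top, right - left
    src_h, src_w = len(sticker), len(sticker[0])
    out = [row[:] for row in image]
    for y in range(box_h):
        for x in range(box_w):
            # Map each target pixel back to the nearest source pixel.
            out[top + y][left + x] = sticker[y * src_h // box_h][x * src_w // box_w]
    return out
```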

As another processing method for the face images extracted by the face image determination unit 401, a blur may be applied to the designated portion instead of a mosaic.
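The specification does not state which kind of blur is used. As one simple possibility, the sketch below applies a box blur (mean filter) inside the face rectangle; the choice of a box blur, the `radius` parameter, and the name `box_blur` are assumptions.

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale image: rows of pixel values
Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right/bottom exclusive


def box_blur(image: Image, box: Box, radius: int) -> Image:
    """Return a copy of `image` with a box blur applied inside `box`.

    Each pixel becomes the mean of its (2*radius+1)-wide neighborhood,
    sampled only from pixels inside the box. radius == 0 changes nothing.
    """
    if radius < 0:
        raise ValueError("radius must be >= 0")
    left, top, right, bottom = box
    out = [row[:] for row in image]
    for y in range(top, bottom):
        for x in range(left, right):
            ys = range(max(top, y - radius), min(bottom, y + radius + 1))
            xs = range(max(left, x - radius), min(right, x + radius + 1))
            window = [image[j][i] for j in ys for i in xs]
            out[y][x] = sum(window) // len(window)
    return out
```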

To make the live event more engaging, the content of the request may also be decided by a majority vote of the viewers. For example, one or more options for the mosaic processing method are displayed on the live video screen, and each viewer may transmit the desired option from the viewer terminals 107a, 107b, 107c to the input/output unit 302.

When the content of the request is decided by a majority vote of the viewers as described above, the one or more options for the mosaic processing method may be chosen from comments sent by viewers and displayed in the comment field of the live video screen. For example, a viewer's comment proposing an option is displayed in the comment field; when the moderator approves the proposal, the proposed option is adopted and displayed on the live video screen.
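The vote described above can be sketched as a simple tally. The sketch assumes one vote per viewer ID, ignores votes for options that were not adopted, picks the option with the most votes (a plurality), and breaks ties by the order in which the options are displayed; the tie-break rule and the name `decide_by_majority` are assumptions not found in the specification.

```python
from collections import Counter
from typing import Dict, Optional, Sequence


def decide_by_majority(votes: Dict[str, str],
                       options: Sequence[str]) -> Optional[str]:
    """Pick the processing method with the most votes.

    `votes` maps a viewer ID to the option that viewer chose, so each viewer
    counts once. Returns None when no valid vote was cast.
    """
    valid = [choice for choice in votes.values() if choice in options]
    if not valid:
        return None
    counts = Counter(valid)
    best = max(counts.values())
    # Assumed tie-break: the option listed first on screen wins.
    for option in options:
        if counts[option] == best:
            return option
    return None
```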

In addition, the image processing method for a particular face image among persons other than the one or more specific persons, such as audience members, may be determined by a viewer's request. Because this processing targets a face image that is not stored in the image data storage unit 402, a marker such as a hat, a name tag, or a number tag may be used: a marker machine learning model constructed using previously captured image data of the marker is stored in the image data storage unit 402, audience members or the like wear the marker, and the person to be processed is identified based on the marker.

When a viewer transmits request data, including information requesting a processing method for the mosaic processing or the like, from the viewer terminals 107a, 107b, 107c to the input/output unit 302, the distributor may enable the request by charging the viewer who wishes to make it. The amount charged may be determined by the content of the request, and whether the request is accepted may be determined by the amount paid.
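A minimal sketch of such charge-based gating is shown below. It assumes a price table keyed by request type and accepts a request only when the amount paid meets the listed price; the table contents, the currency unit, and the name `accept_request` are purely illustrative assumptions.

```python
from typing import Dict

# Hypothetical price table: request type -> minimum charge.
EXAMPLE_PRICES: Dict[str, int] = {
    "change_coarseness": 100,
    "change_range": 100,
    "mosaic_performer": 500,
}


def accept_request(kind: str, paid: int, prices: Dict[str, int]) -> bool:
    """Return True when the request type is known and the charge is covered."""
    price = prices.get(kind)
    return price is not None and paid >= price
```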

A viewer's request for a processing method for the mosaic processing or the like may be enabled by payment of consideration. Alternatively, for example, the viewer may purchase goods related to a performer, and the request may be enabled according to the goods purchased, the purchase amount, or the like. As a further alternative, the request may be enabled when the viewer satisfies a specific condition while watching the distributed video, for example when a performer or the moderator communicates with viewers during distribution and a predetermined condition is satisfied based on the content of that communication. An example of such a case is one in which, in the live video, the moderator asks viewers a question, a viewer answers by message, and the moderator judges the answer to be appropriate and inputs that judgment into the live distribution system according to the present embodiment.

In the live distribution system according to the present embodiment, a viewer who wishes to request a processing method for the mosaic processing or the like registers as a viewer with the live distribution system in advance. To register, the viewer transmits viewer information for registration from the viewer terminals 107a, 107b, 107c to the viewer information management unit 303 via the network 106 and the input/output unit 302. Based on the received viewer information, the viewer information management unit 303 generates a viewer ID and a password, stores the viewer information, the viewer ID, and the password in the viewer information storage unit 304, and transmits the viewer ID and the password to the viewer terminals 107a, 107b, 107c. The viewer can request a processing method after being authenticated by the viewer information management unit 303 using the viewer ID and the password.
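The registration and authentication sequence can be sketched as follows. The class name `ViewerRegistry`, the ID and password formats, and the in-memory dictionary standing in for the viewer information storage unit 304 are assumptions. Note also one deliberate substitution: the specification says the password is stored, whereas this sketch stores a salted PBKDF2 hash of it, which is common practice.

```python
import hashlib
import hmac
import secrets
from typing import Dict, Tuple

_ITERATIONS = 100_000  # assumed work factor for PBKDF2


class ViewerRegistry:
    """Illustrative stand-in for units 303 (management) and 304 (storage)."""

    def __init__(self) -> None:
        # viewer_id -> (viewer_info, salt, password_hash)
        self._records: Dict[str, Tuple[dict, bytes, bytes]] = {}

    def register(self, viewer_info: dict) -> Tuple[str, str]:
        """Generate and store credentials; return (viewer_id, password)."""
        viewer_id = secrets.token_hex(8)
        password = secrets.token_urlsafe(12)
        salt = secrets.token_bytes(16)
        digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, _ITERATIONS)
        self._records[viewer_id] = (dict(viewer_info), salt, digest)
        return viewer_id, password

    def authenticate(self, viewer_id: str, password: str) -> bool:
        """Return True when the ID exists and the password matches."""
        record = self._records.get(viewer_id)
        if record is None:
            return False
        _, salt, digest = record
        candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, _ITERATIONS)
        return hmac.compare_digest(candidate, digest)
```

Only a viewer for whom `authenticate` returns `True` would then be allowed to submit request data.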

The live distribution method according to the present embodiment will be described with reference to the flowcharts of FIGS. 5 and 6. It is assumed that viewer registration and viewer authentication have already been completed when the flows of FIGS. 5 and 6 start.

The flow shown in FIG. 5 runs from the shooting of the live video to its distribution to the viewer terminals 107a, 107b, 107c.

In step S501, the first to third cameras 101a to 101c shoot video and the first to third microphones 102a to 102c record audio.

In step S502, the first to third video data converters 201a to 201c and the first to third audio data converters 202a to 202c generate first to third moving image data.

In step S503, the switcher 204 selects the moving image to be distributed from among the first to third moving image data, and the encoder 205 and the streaming server 206 convert the selected moving image data into streaming-format data.

In step S504, the face image determination unit 401 extracts, by image recognition, the face images of all persons appearing in the images of the moving image data.

In step S505, the face image determination unit 401 determines whether the extracted face images of all persons include an image that matches a face image of the one or more specific persons stored in the image data storage unit 402.

In step S506, the image processing unit 403 applies mosaic processing or the like to the face images of persons who satisfy the condition, among the faces of all persons extracted by the face image determination unit 401.

In step S507, the edited moving image data is distributed from the distribution server 105 to the viewer terminals 107a, 107b, 107c via the network 106.

The flow shown in FIG. 6 runs from the distribution of the live video to the viewer terminals 107a, 107b, 107c until the distribution of a video in which the mosaic processing method requested by a viewer has been applied to the face images.

In step S601, request data including information requesting the processing method that the image processing unit 403 uses when applying mosaic processing or the like to the face images extracted by the face image determination unit 401 is transmitted from the viewer terminals 107a, 107b, 107c to the input/output unit 302.

In step S602, upon receiving the request data, the input/output unit 302 transmits the request data to the image processing unit 403.

In step S603, based on the processing-method request contained in the request data, the image processing unit 403 applies mosaic processing or the like to the face images extracted by the face image determination unit 401.
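Steps S601 to S603 can be sketched as a request object passed from an input/output component to an image processing component. The field names of `RequestData`, the class names, and the authentication callback are hypothetical; the specification defines only that the request data carries information requesting a processing method.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class RequestData:
    """Request sent from a viewer terminal (field names are assumptions)."""
    viewer_id: str
    method: str                                    # e.g. "mosaic", "blur", "overlay"
    params: Dict[str, int] = field(default_factory=dict)


class ImageProcessingUnit:
    """Holds the processing method currently applied to face images (S603)."""

    def __init__(self) -> None:
        self.method = "mosaic"
        self.params: Dict[str, int] = {"block": 8}

    def apply_request(self, request: RequestData) -> None:
        self.method = request.method
        self.params = dict(request.params)


class InputOutputUnit:
    """Receives request data (S601) and forwards it (S602)."""

    def __init__(self, processor: ImageProcessingUnit,
                 is_authenticated: Callable[[str], bool]) -> None:
        self._processor = processor
        self._is_authenticated = is_authenticated

    def receive(self, request: RequestData) -> bool:
        """Forward the request if the viewer is authenticated."""
        if not self._is_authenticated(request.viewer_id):
            return False
        self._processor.apply_request(request)
        return True
```

Subsequent frames would then be processed with the updated `method` and `params` before being passed on for distribution.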

Needless to say, the present invention includes various embodiments and the like that are not described here. Accordingly, the technical scope of the present invention is defined solely by the invention-specifying matters recited in the claims that are reasonable in view of the above description.

10 Live distribution system
101a First camera
101b Second camera
101c Third camera
102a First microphone
102b Second microphone
102c Third microphone
103 Video adjustment device
104 Video conversion device
105 Distribution server
106 Network
107a, 107b, 107c Viewer terminals
201a First video data converter
201b Second video data converter
201c Third video data converter
202a First audio data converter
202b Second audio data converter
202c Third audio data converter
203a First synthesizer
203b Second synthesizer
203c Third synthesizer
204 Switcher
205 Encoder
206 Streaming server
301 Video editing unit
302 Input/output unit
303 Viewer information management unit
304 Viewer information storage unit
401 Face image determination unit
402 Image data storage unit
403 Image processing unit
404 Image display unit

Claims

A live distribution system that distributes a live video, comprising:
one or more cameras for shooting a live event and generating one or more pieces of video data;
one or more microphones for recording audio of the live event and generating one or more pieces of audio data;
a video adjustment device that generates moving image data from the one or more pieces of video data and the one or more pieces of audio data;
a video conversion device that edits the moving image data; and
a distribution server that transmits the edited moving image data to a viewer terminal,
wherein the video conversion device comprises:
a video editing unit;
an input/output unit that exchanges, with the viewer terminal, comment data including information for presenting a comment on the live video within the live video, and request data including information requesting a method of image processing for a face image;
a viewer information management unit that registers and authenticates one or more viewers; and
a viewer information storage unit that stores viewer information, viewer IDs, and passwords of the one or more viewers,
and wherein the video editing unit comprises:
an image data storage unit that stores a learning model constructed using face image data of one or more specific persons, the face image data being tagged, within previously captured image data, with tags relating to the one or more specific persons;
a face image determination unit that extracts, by image recognition, face images of one or more persons appearing in images of the moving image data, and determines whether the extracted one or more face images include an image matching a face image of the one or more specific persons in the learning model;
an image processing unit that performs image processing on those of the one or more face images determined by the face image determination unit that do not match a face image of the one or more specific persons, such that identifying the individual from the face becomes difficult, and thereafter performs image processing based on the request data transmitted from the viewer terminal; and
an image display unit that performs image processing for displaying the comment in the live video based on the comment data transmitted from the viewer terminal.

The live distribution system according to claim 1, wherein the image processing is mosaic processing of face images, and the image processing unit applies mosaic processing only to those of the extracted one or more face images that do not match a face image of the one or more specific persons.

The live distribution system according to claim 1, wherein the image processing unit performs mutually different image processing on, among the extracted one or more face images, images that match a face image of the one or more specific persons and images that do not.

The live distribution system according to claim 1, wherein the image processing is mosaic processing of face images, and the image processing unit determines, based on the request data, the coarseness of the mosaic in the mosaic processing applied to the extracted one or more face images.

The live distribution system according to claim 1, wherein the image processing is mosaic processing of face images, and the image processing unit determines, based on the request data, the range of the mosaic in the mosaic processing applied to the extracted one or more face images.

The live distribution system according to claim 1, wherein the image processing unit determines, based on the request data, the image on which image processing is to be performed from among the extracted one or more face images.

The live distribution system according to claim 1, wherein the method of image processing performed by the image processing unit based on the request data becomes determinable by a viewer upon payment of a charge.

The live distribution system according to claim 1, wherein the method of image processing performed by the image processing unit based on the request data is determined by a majority vote of the viewers.

The live distribution system according to claim 8, wherein the content of the options for the image processing method to be decided by the majority vote of the viewers is proposed in the comment displayed in the live video by the image display unit, and the proposed option is displayed in the live video when the one or more specific persons approve the content of the option proposed in the comment.

The live distribution system according to claim 1, wherein the image processing unit determines, based on the request data, on which of the extracted one or more face images image processing is to be performed.

The live distribution system according to claim 1, wherein the image data storage unit further stores a marker learning model constructed using previously captured image data of a marker, and
the image processing unit performs, based on the request data, image processing on the face image of a person wearing the marker.

The live distribution system according to claim 1, wherein determination of the method of image processing performed by the image processing unit based on the request data becomes possible when the viewer satisfies a predetermined condition.

A live distribution method for distributing a live video to a viewer terminal, comprising:
a video data generation step of shooting a live event and generating one or more pieces of video data;
an audio data generation step of recording audio of the live event and generating one or more pieces of audio data;
a moving image data generation step of generating moving image data from the one or more pieces of video data and the one or more pieces of audio data;
a video editing step of editing the moving image data;
a distribution step of transmitting the edited moving image data to the viewer terminal;
a viewer information storage step of storing viewer information, viewer IDs, and passwords of one or more viewers;
a viewer information management step of registering and authenticating the one or more viewers; and
an input/output step of exchanging, with the viewer terminal, comment data including information for presenting a comment on the live video within the live video, and request data including information requesting a method of image processing for a face image,
wherein the video editing step comprises:
an extraction step of extracting, by image recognition, face images of one or more persons appearing in images of the moving image data;
a face image determination step of determining whether the extracted one or more face images include an image matching a face image of one or more specific persons in a learning model constructed using face image data of the one or more specific persons, the face image data being tagged, within previously captured image data, with tags relating to the one or more specific persons;
an image processing step of performing image processing on those of the one or more face images determined in the face image determination step that do not match a face image of the one or more specific persons, such that identifying the individual from the face becomes difficult, and thereafter performing image processing based on the request data transmitted from the viewer terminal; and
an image display step of performing image processing for displaying the comment in the live video based on the comment data transmitted from the viewer terminal.

A live distribution program for causing a computer to implement:
a video data generation function of shooting a live event and generating one or more pieces of video data;
an audio data generation function of recording audio of the live event and generating one or more pieces of audio data;
a moving image data generation function of generating moving image data from the one or more pieces of video data and the one or more pieces of audio data;
a video editing function of editing the moving image data;
a distribution function of transmitting the edited moving image data to a viewer terminal;
a viewer information storage function of storing viewer information, viewer IDs, and passwords of one or more viewers;
a viewer information management function of registering and authenticating the one or more viewers; and
an input/output function of exchanging, with the viewer terminal, comment data including information for presenting a comment on a live video within the live video, and request data including information requesting a method of image processing for a face image,
wherein the video editing function comprises:
an extraction function of extracting, by image recognition, face images of one or more persons appearing in images of the moving image data;
a face image determination function of determining whether the extracted one or more face images include an image matching a face image of one or more specific persons in a learning model constructed using face image data of the one or more specific persons, the face image data being tagged, within previously captured image data, with tags relating to the one or more specific persons;
an image processing function of performing image processing on those of the one or more face images determined by the face image determination function that do not match a face image of the one or more specific persons, such that identifying the individual from the face becomes difficult, and thereafter performing image processing based on the request data transmitted from the viewer terminal; and
an image display function of performing image processing for displaying the comment in the live video based on the comment data transmitted from the viewer terminal.